[jira] [Updated] (HDFS-7966) New Data Transfer Protocol via HTTP/2

2015-07-20 Thread Duo Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HDFS-7966:

Attachment: TestHttp2ReadBlockInsideEventLoop.svg

The flame graph of a {{TestHttp2ReadBlockInsideEventLoop}} run.

 New Data Transfer Protocol via HTTP/2
 -

 Key: HDFS-7966
 URL: https://issues.apache.org/jira/browse/HDFS-7966
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Qianqian Shi
  Labels: gsoc, gsoc2015, mentor
 Attachments: GSoC2015_Proposal.pdf, 
 TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg, 
 TestHttp2ReadBlockInsideEventLoop.svg


 The current Data Transfer Protocol (DTP) implements a rich set of features 
 that span across multiple layers, including:
 * Connection pooling and authentication (session layer)
 * Encryption (presentation layer)
 * Data writing pipeline (application layer)
 All these features are HDFS-specific and defined by the implementation. As a 
 result it requires a non-trivial amount of work to implement HDFS clients and 
 servers.
 This jira explores delegating the responsibilities of the session and 
 presentation layers to the HTTP/2 protocol. Particularly, HTTP/2 handles 
 connection multiplexing, QoS, authentication and encryption, reducing the 
 scope of DTP to the application layer only. By leveraging the existing HTTP/2 
 library, it should simplify the implementation of both HDFS clients and 
 servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2

2015-07-20 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633089#comment-14633089
 ] 

Duo Zhang commented on HDFS-7966:
-

Wrote a single-threaded testcase that does all the test work inside the event loop:

https://github.com/Apache9/hadoop/blob/HDFS-7966-POC/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/web/dtp/TestHttp2ReadBlockInsideEventLoop.java

And on the server side, I removed the thread pool in {{ReadBlockHandler}}.

The result is
{noformat}
*** time based on tcp 17734ms
*** time based on http2 20019ms

*** time based on tcp 18878ms
*** time based on http2 21422ms

*** time based on tcp 17562ms
*** time based on http2 20568ms

*** time based on tcp 18726ms
*** time based on http2 20251ms

*** time based on tcp 18632ms
*** time based on http2 21227ms
{noformat}

The average time of the original TCP is 18306.4ms, and of HTTP/2 is 20697.4ms. 

20697.4 / 18306.4 = 1.13, so HTTP/2 is 13% slower than TCP. In the earlier test 
it was 30% slower, so I think context switching may be one of the reasons why 
HTTP/2 is much slower than TCP. I will run this test on a real cluster to get more data.

As for the one-{{EventLoop}}-per-datanode issue, I think it is a problem on a 
small cluster, so we should allow creating multiple HTTP/2 connections 
to one datanode. I will modify {{Http2ConnectionPool}} and run the test again.
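
For illustration, a rough sketch of the multiple-connections-per-datanode idea 
(the class and method names below are placeholders, not the actual 
{{Http2ConnectionPool}} API in the POC branch):

{code}
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch: keep a small fixed number of HTTP/2 connections per datanode and
 * hand them out round-robin, so streams are spread over several event loops
 * instead of all sharing a single one.
 */
class RoundRobinHttp2Pool<CONN> {

  /** Stands in for whatever actually opens an HTTP/2 connection. */
  interface Connector<C> {
    C connect(InetSocketAddress datanode);
  }

  private final int connectionsPerDatanode;
  private final Map<InetSocketAddress, List<CONN>> pool =
      new HashMap<InetSocketAddress, List<CONN>>();
  private int next = 0;

  RoundRobinHttp2Pool(int connectionsPerDatanode) {
    this.connectionsPerDatanode = connectionsPerDatanode;
  }

  synchronized CONN get(InetSocketAddress datanode, Connector<CONN> connector) {
    List<CONN> conns = pool.get(datanode);
    if (conns == null) {
      conns = new ArrayList<CONN>(connectionsPerDatanode);
      for (int i = 0; i < connectionsPerDatanode; i++) {
        conns.add(connector.connect(datanode));
      }
      pool.put(datanode, conns);
    }
    if (next < 0) {
      next = 0; // guard against int overflow after many calls
    }
    return conns.get(next++ % conns.size());
  }
}
{code}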

Thanks.

 New Data Transfer Protocol via HTTP/2
 -

 Key: HDFS-7966
 URL: https://issues.apache.org/jira/browse/HDFS-7966
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Qianqian Shi
  Labels: gsoc, gsoc2015, mentor
 Attachments: GSoC2015_Proposal.pdf, 
 TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg


 The current Data Transfer Protocol (DTP) implements a rich set of features 
 that span across multiple layers, including:
 * Connection pooling and authentication (session layer)
 * Encryption (presentation layer)
 * Data writing pipeline (application layer)
 All these features are HDFS-specific and defined by the implementation. As a 
 result it requires a non-trivial amount of work to implement HDFS clients and 
 servers.
 This jira explores delegating the responsibilities of the session and 
 presentation layers to the HTTP/2 protocol. Particularly, HTTP/2 handles 
 connection multiplexing, QoS, authentication and encryption, reducing the 
 scope of DTP to the application layer only. By leveraging the existing HTTP/2 
 library, it should simplify the implementation of both HDFS clients and 
 servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8750) FileSystem does not honor Configuration.getClassLoader() while loading FileSystem implementations

2015-07-20 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633385#comment-14633385
 ] 

Steve Loughran commented on HDFS-8750:
--

Don't worry about it: it's what the review process is for.

If there is one lesson all of us working on Hadoop eventually learn, it is that 
there are no simple changes: a one-liner may not add a major new feature, but it 
is at as much risk of breaking things as the bigger patches. Hence the obsession 
with tests.

 FileSystem does not honor Configuration.getClassLoader() while loading 
 FileSystem implementations
 -

 Key: HDFS-8750
 URL: https://issues.apache.org/jira/browse/HDFS-8750
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fs, HDFS
Reporter: Himanshu
Assignee: Himanshu
 Attachments: HDFS-8750.001.patch, HDFS-8750.002.patch


 In FileSystem.loadFileSystems(), at 
 https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L2652
 a scheme -> FileSystem implementation map is created from the jars 
 available on classpath. It uses Thread.currentThread().getClassLoader() via 
 ServiceLoader.load(FileSystem.class)
 Instead, loadFileSystems() should take a Configuration as an argument and 
 should first check whether a classloader is configured via 
 configuration.getClassLoader(); if so, 
 ServiceLoader.load(FileSystem.class, configuration.getClassLoader()) should 
 be used.
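
 A minimal sketch of the proposed direction, as a method meant to sit inside 
 FileSystem.java (the Configuration parameter and the null-check fallback are 
 assumptions based on this description, not the attached patches):

{code}
import java.util.ServiceLoader;
import org.apache.hadoop.conf.Configuration;

// Sketch: prefer the classloader configured on the Configuration, and fall
// back to the thread context classloader otherwise.
private static synchronized void loadFileSystems(Configuration conf) {
  ClassLoader loader = (conf != null && conf.getClassLoader() != null)
      ? conf.getClassLoader()
      : Thread.currentThread().getContextClassLoader();
  ServiceLoader<FileSystem> serviceLoader =
      ServiceLoader.load(FileSystem.class, loader);
  for (FileSystem fs : serviceLoader) {
    // SERVICE_FILE_SYSTEMS is the existing scheme -> implementation map.
    SERVICE_FILE_SYSTEMS.put(fs.getScheme(), fs.getClass());
  }
}
{code}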



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml & ozone-site.xml

2015-07-20 Thread kanaka kumar avvaru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kanaka kumar avvaru updated HDFS-8753:
--
Attachment: HDFS-8753-HDFS-7240.00.patch

 Ozone: Unify StorageContainerConfiguration with ozone-default.xml & 
 ozone-site.xml 
 ---

 Key: HDFS-8753
 URL: https://issues.apache.org/jira/browse/HDFS-8753
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: kanaka kumar avvaru
Assignee: kanaka kumar avvaru
 Attachments: HDFS-8753-HDFS-7240.00.patch


 This JIRA proposes adding ozone-default.xml to main resources & 
 ozone-site.xml to test resources with default known parameters as of now.
 Also, need to unify {{StorageContainerConfiguration}} to initialize conf with 
 both the files as at present there are two classes with this name.
 {code}
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java
  loads only ozone-site.xml
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java
  loads only storage-container-site.xml
 {code}
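
 For reference, a minimal sketch of what a single unified class could look like, 
 modeled on how HdfsConfiguration layers hdfs-default.xml/hdfs-site.xml (the 
 class name follows the {{OzoneConfiguration}} mentioned in the follow-up 
 comment; the body is an assumption, not the attached patch):

 {code}
package org.apache.hadoop.ozone;

import org.apache.hadoop.conf.Configuration;

/**
 * Sketch: one Configuration subtype that layers ozone-default.xml first and
 * lets ozone-site.xml override it, mirroring core-default/core-site handling.
 */
public class OzoneConfiguration extends Configuration {
  static {
    Configuration.addDefaultResource("ozone-default.xml");
    Configuration.addDefaultResource("ozone-site.xml");
  }
}
 {code}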



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml & ozone-site.xml

2015-07-20 Thread kanaka kumar avvaru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kanaka kumar avvaru updated HDFS-8753:
--
Status: Patch Available  (was: Open)

Attached a patch to refer to {{OzoneConfiguration}} and remove the old duplicate 
{{StorageContainerConfiguration}} files.

 Ozone: Unify StorageContainerConfiguration with ozone-default.xml & 
 ozone-site.xml 
 ---

 Key: HDFS-8753
 URL: https://issues.apache.org/jira/browse/HDFS-8753
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: kanaka kumar avvaru
Assignee: kanaka kumar avvaru
 Attachments: HDFS-8753-HDFS-7240.00.patch


 This JIRA proposes adding ozone-default.xml to main resources & 
 ozone-site.xml to test resources with default known parameters as of now.
 Also, need to unify {{StorageContainerConfiguration}} to initialize conf with 
 both the files as at present there are two classes with this name.
 {code}
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java
  loads only ozone-site.xml
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java
  loads only storage-container-site.xml
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml & ozone-site.xml

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633471#comment-14633471
 ] 

Hadoop QA commented on HDFS-8753:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m  2s | Findbugs (version ) appears to 
be broken on HDFS-7240. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 11s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 17s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 34s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 36s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  7s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |   0m 22s | Tests failed in hadoop-hdfs. |
| | |  43m  5s | |
\\
\\
|| Reason || Tests ||
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746081/HDFS-8753-HDFS-7240.00.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7240 / 8576861 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11749/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11749/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11749/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11749/console |


This message was automatically generated.

 Ozone: Unify StorageContainerConfiguration with ozone-default.xml & 
 ozone-site.xml 
 ---

 Key: HDFS-8753
 URL: https://issues.apache.org/jira/browse/HDFS-8753
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: kanaka kumar avvaru
Assignee: kanaka kumar avvaru
 Attachments: HDFS-8753-HDFS-7240.00.patch


 This JIRA proposes adding ozone-default.xml to main resources & 
 ozone-site.xml to test resources with default known parameters as of now.
 Also, need to unify {{StorageContainerConfiguration}} to initialize conf with 
 both the files as at present there are two classes with this name.
 {code}
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java
  loads only ozone-site.xml
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java
  loads only storage-container-site.xml
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8760) Erasure Coding: reuse BlockReader when reading the same block in pread

2015-07-20 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633492#comment-14633492
 ] 

Walter Su commented on HDFS-8760:
-

LGTM. +1 after addressing one minor issue (related):
updateReadStatistics(..) is called twice, in readCells(..) and 
ByteBufferStrategy.doRead(..).

I found some other issues while reviewing the patch (not related):
1. Some util functions are statically imported from StripedBlockUtil, while 
others are called as StripedBlockUtil.method(...).
2. DFSStripedInputStream.read(ByteBuffer) is identical to the one in the super 
class.
3. In StripeReader / readStripe(..), the stripe means an AlignedStripe, which 
may span many real stripes. Needs some javadoc.
4. Suppose {{buf}} is the buffer given by the user. Pread() makes the 
blockReader put data directly into {{buf}}. Stateful read() needs the 
blockReader to put data into curStripeBuf, then copy curStripeBuf to {{buf}}. 
curStripeBuf is useful when the user calls read()/read(small buf) frequently, 
especially when there are bad DNs. I think if buf.size >= curStripeBuf.size we 
can write data directly to buf without curStripeBuf.
Maybe the copy is fine. But why is it a DirectByteBuffer? I don't know how it 
helps decoding, but copying data from heap to native memory and then back from 
native memory to heap is bad if there's no need to decode.
This needs further digging.


 Erasure Coding: reuse BlockReader when reading the same block in pread
 --

 Key: HDFS-8760
 URL: https://issues.apache.org/jira/browse/HDFS-8760
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8760.000.patch


 Currently in pread, we create a new block reader for each aligned stripe even 
 though these stripes belong to the same block. It's better to reuse them to 
 avoid unnecessary block reader creation overhead. This can also avoid reading 
 from the same bad DataNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7337) Configurable and pluggable Erasure Codec and schema

2015-07-20 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633567#comment-14633567
 ] 

Kai Zheng commented on HDFS-7337:
-

Thanks [~andrew.wang] for the thoughts on moving this forward a bit. They sound 
good to me. Some points for further discussion:
bq. ...also the need to persist this information in the fsimage/editlog, ...
Did you mean the schema? If so, it looks like a point we all agree on. Per the 
discussion in HDFS-7859 and related issues, we planned to do it in a follow-on, 
along with support for multiple schemas. For now we only support one 
system-defined schema, RS(6,3).
bq. Codec enum (e.g. RS, LRC, etc), ...When we get to the point of 
fully-pluggable codecs, we can add a special wildcard enum value to support 
this
It is good to have the enum for built-in codecs for now, and the wildcard for 
customized additional ones in the future.
bq. In client's hdfs-site.xml, we can configure a codec implementation for 
every codec. This would look something like...
In the existing code we're using the following format for a similar purpose. 
Please confirm whether it looks good.
{noformat}
  /** Raw coder factory for the RS codec. */
  public static final String IO_ERASURECODE_CODEC_RS_RAWCODER_KEY =
      "io.erasurecode.codec.rs.rawcoder";

  /** Raw coder factory for the XOR codec. */
  public static final String IO_ERASURECODE_CODEC_XOR_RAWCODER_KEY =
      "io.erasurecode.codec.xor.rawcoder";
{noformat}
The related code resides in {{CodecUtil}}, which reads the above configurations. 
Please check it if necessary.
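
For example, pointing the RS codec at a custom raw coder would look roughly like 
this (the factory class name is made up for illustration):

{code}
Configuration conf = new Configuration();
// The key is the constant shown above; the value names a raw coder factory
// implementation (org.example.MyRSRawErasureCoderFactory is hypothetical).
conf.set("io.erasurecode.codec.rs.rawcoder",
    "org.example.MyRSRawErasureCoderFactory");
{code}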
Once we're clear on what needs to be done for this phase, I will open an issue 
to get it done separately. Thanks.


 Configurable and pluggable Erasure Codec and schema
 ---

 Key: HDFS-7337
 URL: https://issues.apache.org/jira/browse/HDFS-7337
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Kai Zheng
 Attachments: HDFS-7337-prototype-v1.patch, 
 HDFS-7337-prototype-v2.zip, HDFS-7337-prototype-v3.zip, 
 PluggableErasureCodec-v2.pdf, PluggableErasureCodec-v3.pdf, 
 PluggableErasureCodec.pdf


 According to HDFS-7285 and the design, this considers to support multiple 
 Erasure Codecs via pluggable approach. It allows to define and configure 
 multiple codec schemas with different coding algorithms and parameters. The 
 resultant codec schemas can be utilized and specified via command tool for 
 different file folders. While design and implement such pluggable framework, 
 it’s also to implement a concrete codec by default (Reed Solomon) to prove 
 the framework is useful and workable. Separate JIRA could be opened for the 
 RS codec implementation.
 Note HDFS-7353 will focus on the very low level codec API and implementation 
 to make concrete vendor libraries transparent to the upper layer. This JIRA 
 focuses on high level stuffs that interact with configuration, schema and etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8760) Erasure Coding: reuse BlockReader when reading the same block in pread

2015-07-20 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633501#comment-14633501
 ] 

Walter Su commented on HDFS-8760:
-

I just saw HADOOP-12060 fixes #4. Please ignore #4.

 Erasure Coding: reuse BlockReader when reading the same block in pread
 --

 Key: HDFS-8760
 URL: https://issues.apache.org/jira/browse/HDFS-8760
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8760.000.patch


 Currently in pread, we create a new block reader for each aligned stripe even 
 though these stripes belong to the same block. It's better to reuse them to 
 avoid unnecessary block reader creation overhead. This can also avoid reading 
 from the same bad DataNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8604) Erasure Coding: update invalidateBlock(..) logic for striped block

2015-07-20 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su resolved HDFS-8604.
-
Resolution: Duplicate

already fixed in HDFS-8619

 Erasure Coding: update invalidateBlock(..) logic for striped block
 --

 Key: HDFS-8604
 URL: https://issues.apache.org/jira/browse/HDFS-8604
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Walter Su
Assignee: Walter Su

 {code}  
   private boolean invalidateBlock(BlockToMarkCorrupt b, DatanodeInfo dn
   ) throws IOException {
   ..
 } else if (nr.liveReplicas() >= 1) { 
   // If we have at least one copy on a live node, then we can delete it.
   addToInvalidates(b.corrupted, dn); 
   removeStoredBlock(b.stored, node);
 {code}
 We don't delete a corrupted block if the corrupted copy is all we have left. We 
 give the user the decision, so the user has a chance to recover it manually.
 We should not compare liveReplicas() of a striped block with 1. The logic 
 needs updating.
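
 For illustration only, a rough sketch of the kind of check that would be needed 
 (the accessor names are assumptions; the actual fix went in with HDFS-8619):

 {code}
// Sketch: a replicated block is safe to invalidate once one live replica
// remains; a striped block needs at least its data-block count of healthy
// internal blocks so the data stays reconstructable.
int minLiveNeeded = 1;
if (b.stored.isStriped()) {
  // getRealDataBlockNum() is an assumed accessor on BlockInfoStriped.
  minLiveNeeded = ((BlockInfoStriped) b.stored).getRealDataBlockNum();
}
if (nr.liveReplicas() >= minLiveNeeded) {
  // Enough healthy copies/internal blocks remain, so we can delete this one.
  addToInvalidates(b.corrupted, dn);
  removeStoredBlock(b.stored, node);
}
 {code}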



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8483) Erasure coding: test DataNode reporting bad/corrupted blocks which belong to a striped block.

2015-07-20 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633612#comment-14633612
 ] 

Walter Su commented on HDFS-8483:
-

Sorry, the description says this jira tests DatanodeProtocol#reportBadBlocks, 
but the patch tests block reports.

A DN will report bad blocks using DatanodeProtocol#reportBadBlocks when:
1. The VolumeScanner finds a bad block (see DataNode.reportBadBlocks(..)).
2. The blockReceiver finds that an upstream DN in the pipeline has a corrupted 
block (see BlockReceiver.verifyChunks(..)).
3. The DN gets a DNA_TRANSFER command from the NN and tries to copy a replica 
from another DN (see DataNode.reportBadBlocks(..)).

EC striping doesn't have situations #2 and #3. And I think it's trivial to test 
#1 because NamenodeRpcServer.reportBadBlocks(..) has the same implementation for 
ClientProtocol and DatanodeProtocol.

It's more important to write a striping version of {{TestProcessCorruptBlocks}}.

 Erasure coding: test DataNode reporting bad/corrupted blocks which belong to 
 a striped block.
 --

 Key: HDFS-8483
 URL: https://issues.apache.org/jira/browse/HDFS-8483
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma
 Fix For: HDFS-7285

 Attachments: HDFS-8483.0.patch


 We can mimic one/several DataNode(s) reporting bad block(s) (which belong to 
 a striped block) to the NameNode (through the 
 DatanodeProtocol#reportBadBlocks call), and check if the 
 recovery/invalidation work can be correctly scheduled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml & ozone-site.xml

2015-07-20 Thread kanaka kumar avvaru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633611#comment-14633611
 ] 

kanaka kumar avvaru commented on HDFS-8753:
---

Looks like some .proto files have been moved to the {{hadoop-hdfs-client}} 
project. [~arpitagarwal], can you please check whether we need to update the 
build config for the branch?

 Ozone: Unify StorageContainerConfiguration with ozone-default.xml & 
 ozone-site.xml 
 ---

 Key: HDFS-8753
 URL: https://issues.apache.org/jira/browse/HDFS-8753
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: kanaka kumar avvaru
Assignee: kanaka kumar avvaru
 Attachments: HDFS-8753-HDFS-7240.00.patch


 This JIRA proposes adding ozone-default.xml to main resources & 
 ozone-site.xml to test resources with default known parameters as of now.
 Also, need to unify {{StorageContainerConfiguration}} to initialize conf with 
 both the files as at present there are two classes with this name.
 {code}
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java
  loads only ozone-site.xml
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java
  loads only storage-container-site.xml
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8799) Erasure Coding: add tests for process corrupt striped blocks

2015-07-20 Thread Walter Su (JIRA)
Walter Su created HDFS-8799:
---

 Summary: Erasure Coding: add tests for process corrupt striped 
blocks
 Key: HDFS-8799
 URL: https://issues.apache.org/jira/browse/HDFS-8799
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8791) block ID-based DN storage layout can be very slow for datanode on ext4

2015-07-20 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633669#comment-14633669
 ] 

Nathan Roberts commented on HDFS-8791:
--

Hi [~cmccabe]. Thanks for the idea. Yes, I had actually tried something like 
that. I actually just kept a loop of du's running on the node (outside of the 
datanode process for simplicity's sake). I thought this would prevent it from 
happening, but it turns out it still gets into this situation. I suspect the 
reason is that when there is memory pressure, it will start to seek a little, 
and once it starts to seek a little the system quickly degrades because 
buffers are being thrown away faster than the disks can seek. 

 block ID-based DN storage layout can be very slow for datanode on ext4
 --

 Key: HDFS-8791
 URL: https://issues.apache.org/jira/browse/HDFS-8791
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Nathan Roberts
Priority: Critical

 We are seeing cases where the new directory layout causes the datanode to 
 basically cause the disks to seek for 10s of minutes. This can be when the 
 datanode is running du, and it can also be when it is performing a 
 checkDirs(). Both of these operations currently scan all directories in the 
 block pool and that's very expensive in the new layout.
 The new layout creates 256 subdirs, each with 256 subdirs. Essentially 64K 
 leaf directories where block files are placed.
 So, what we have on disk is:
 - 256 inodes for the first level directories
 - 256 directory blocks for the first level directories
 - 256*256 inodes for the second level directories
 - 256*256 directory blocks for the second level directories
 - Then the inodes and blocks to store the HDFS blocks themselves.
 The main problem is the 256*256 directory blocks. 
 inodes and dentries will be cached by linux and one can configure how likely 
 the system is to prune those entries (vfs_cache_pressure). However, ext4 
 relies on the buffer cache to cache the directory blocks and I'm not aware of 
 any way to tell linux to favor buffer cache pages (even if it did I'm not 
 sure I would want it to in general).
 Also, ext4 tries hard to spread directories evenly across the entire volume, 
 this basically means the 64K directory blocks are probably randomly spread 
 across the entire disk. A du type scan will look at directories one at a 
 time, so the ioscheduler can't optimize the corresponding seeks, meaning the 
 seeks will be random and far. 
 In a system I was using to diagnose this, I had 60K blocks. A DU when things 
 are hot is less than 1 second. When things are cold, about 20 minutes.
 How do things get cold?
 - A large set of tasks run on the node. This pushes almost all of the buffer 
 cache out, causing the next DU to hit this situation. We are seeing cases 
 where a large job can cause a seek storm across the entire cluster.
 Why didn't the previous layout see this?
 - It might have but it wasn't nearly as pronounced. The previous layout would 
 be a few hundred directory blocks. Even when completely cold, these would 
 only take a few hundred seeks, which would mean single-digit seconds. 
 - With only a few hundred directories, the odds of the directory blocks 
 getting modified is quite high, this keeps those blocks hot and much less 
 likely to be evicted.
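
 A rough back-of-the-envelope check on those numbers, assuming on the order of 
 15-20 ms per cold random seek on a spinning disk:

 {noformat}
 new layout:  65,536 directory blocks x ~18 ms/seek ~= 1,180 s ~= 20 minutes
 old layout:     ~300 directory blocks x ~18 ms/seek ~= 5 s (single-digit seconds)
 {noformat}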



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8760) Erasure Coding: reuse BlockReader when reading the same block in pread

2015-07-20 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8760:

Attachment: HDFS-8760-HDFS-7285.001.patch

Thanks for the review, Walter! Updated the patch to address all your comments. 

In the meantime, {{testWriteReadUsingWebHdfs}} may fail after the change. The 
failure may be related to HDFS-8797, so for now I have temporarily disabled the 
pread test in {{testWriteReadUsingWebHdfs}}; we can add it back later.

 Erasure Coding: reuse BlockReader when reading the same block in pread
 --

 Key: HDFS-8760
 URL: https://issues.apache.org/jira/browse/HDFS-8760
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8760-HDFS-7285.001.patch, HDFS-8760.000.patch


 Currently in pread, we create a new block reader for each aligned stripe even 
 though these stripes belong to the same block. It's better to reuse them to 
 avoid unnecessary block reader creation overhead. This can also avoid reading 
 from the same bad DataNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-07-20 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633730#comment-14633730
 ] 

Chang Li commented on HDFS-6407:


[~benoyantony] is there any status on this issue?

 new namenode UI, lost ability to sort columns in datanode tab
 -

 Key: HDFS-6407
 URL: https://issues.apache.org/jira/browse/HDFS-6407
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Nathan Roberts
Assignee: Benoy Antony
Priority: Critical
  Labels: BB2015-05-TBR
 Attachments: 002-datanodes-sorted-capacityUsed.png, 
 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
 HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.patch, 
 browse_directory.png, datanodes.png, snapshots.png


 The old UI supported clicking on a column header to sort on that column. The 
 new UI seems to have dropped this very useful feature.
 There are a few tables in the Namenode UI to display datanode information, 
 directory listings and snapshots.
 When there are many items in the tables, it is useful to have the ability to 
 sort on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633741#comment-14633741
 ] 

Hadoop QA commented on HDFS-6407:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   0m  0s | Pre-patch trunk compilation is 
healthy. |
| {color:red}-1{color} | @author |   0m  0s | The patch appears to contain 2 
@author tags which the Hadoop  community has agreed to not allow in code 
contributions. |
| {color:red}-1{color} | release audit |   0m 14s | The applied patch generated 
3 release audit warnings. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   0m 17s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12730214/HDFS-6407-003.patch |
| Optional Tests |  |
| git revision | trunk / 98c2bc8 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11750/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11750/console |


This message was automatically generated.

 new namenode UI, lost ability to sort columns in datanode tab
 -

 Key: HDFS-6407
 URL: https://issues.apache.org/jira/browse/HDFS-6407
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Nathan Roberts
Assignee: Benoy Antony
Priority: Critical
  Labels: BB2015-05-TBR
 Attachments: 002-datanodes-sorted-capacityUsed.png, 
 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
 HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.patch, 
 browse_directory.png, datanodes.png, snapshots.png


 The old UI supported clicking on a column header to sort on that column. The 
 new UI seems to have dropped this very useful feature.
 There are a few tables in the Namenode UI to display datanode information, 
 directory listings and snapshots.
 When there are many items in the tables, it is useful to have the ability to 
 sort on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-2433) TestFileAppend4 fails intermittently

2015-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers resolved HDFS-2433.
--
Resolution: Cannot Reproduce

I don't think I've seen this fail in a long, long time. Going to close this 
out. Please reopen if you disagree.

 TestFileAppend4 fails intermittently
 

 Key: HDFS-2433
 URL: https://issues.apache.org/jira/browse/HDFS-2433
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, namenode, test
Affects Versions: 0.20.205.0, 1.0.0
Reporter: Robert Joseph Evans
Priority: Critical
 Attachments: failed.tar.bz2


 A Jenkins build we have running failed twice in a row with issues from 
 TestFileAppend4.testAppendSyncReplication1. In an attempt to reproduce the 
 error I ran TestFileAppend4 in a loop overnight, saving the results away. 
 (No clean was done in between test runs.)
 When TestFileAppend4 is run in a loop, the testAppendSyncReplication[012] 
 tests fail about 10% of the time (14 times out of 130 tries). They all fail 
 with something like the following. Often it is only one of the tests that 
 fails, but I have seen as many as two fail in one run.
 {noformat}
 Testcase: testAppendSyncReplication2 took 32.198 sec
 FAILED
 Should have 2 replicas for that block, not 1
 junit.framework.AssertionFailedError: Should have 2 replicas for that block, 
 not 1
 at 
 org.apache.hadoop.hdfs.TestFileAppend4.replicationTest(TestFileAppend4.java:477)
 at 
 org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncReplication2(TestFileAppend4.java:425)
 {noformat}
 I also saw several other tests that are a part of TestFileAppend4 fail during 
 this experiment. They may all be related to one another, so I am filing them 
 in the same JIRA. If it turns out that they are not related then they can be 
 split up later.
 testAppendSyncBlockPlusBbw failed 6 out of the 130 times or about 5% of the 
 time
 {noformat}
 Testcase: testAppendSyncBlockPlusBbw took 1.633 sec
 FAILED
 unexpected file size! received=0 , expected=1024
 junit.framework.AssertionFailedError: unexpected file size! received=0 , 
 expected=1024
 at 
 org.apache.hadoop.hdfs.TestFileAppend4.assertFileSize(TestFileAppend4.java:136)
 at 
 org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncBlockPlusBbw(TestFileAppend4.java:401)
 {noformat}
 testAppendSyncChecksum[012] failed 2 out of the 130 times or about 1.5% of 
 the time
 {noformat}
 Testcase: testAppendSyncChecksum1 took 32.385 sec
 FAILED
 Should have 1 replica for that block, not 2
 junit.framework.AssertionFailedError: Should have 1 replica for that block, 
 not 2
 at 
 org.apache.hadoop.hdfs.TestFileAppend4.checksumTest(TestFileAppend4.java:556)
 at 
 org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncChecksum1(TestFileAppend4.java:500)
 {noformat}
 I will attach logs for all of the failures.  Be aware that I did change some 
 of the logging messages in this test so I could better see when 
 testAppendSyncReplication started and ended. Other than that the code is 
 stock 0.20.205 RC2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-3660) TestDatanodeBlockScanner#testBlockCorruptionRecoveryPolicy2 times out

2015-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers resolved HDFS-3660.
--
  Resolution: Cannot Reproduce
Target Version/s:   (was: )

This is an ancient/stale flaky test JIRA. Resolving.

 TestDatanodeBlockScanner#testBlockCorruptionRecoveryPolicy2 times out   
 

 Key: HDFS-3660
 URL: https://issues.apache.org/jira/browse/HDFS-3660
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Priority: Minor

 Saw this on a recent jenkins run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8788) Implement unit tests for remote block reader in libhdfspp

2015-07-20 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633930#comment-14633930
 ] 

James Clampffer commented on HDFS-8788:
---

+1

 Implement unit tests for remote block reader in libhdfspp
 -

 Key: HDFS-8788
 URL: https://issues.apache.org/jira/browse/HDFS-8788
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-8788.000.patch


 This jira proposes to implement unit tests for the remote block reader in 
 gmock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8760) Erasure Coding: reuse BlockReader when reading the same block in pread

2015-07-20 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8760:

Status: Patch Available  (was: Open)

 Erasure Coding: reuse BlockReader when reading the same block in pread
 --

 Key: HDFS-8760
 URL: https://issues.apache.org/jira/browse/HDFS-8760
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8760-HDFS-7285.001.patch, HDFS-8760.000.patch


 Currently in pread, we create a new block reader for each aligned stripe even 
 though these stripes belong to the same block. It's better to reuse them to 
 avoid unnecessary block reader creation overhead. This can also avoid reading 
 from the same bad DataNode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8788) Implement unit tests for remote block reader in libhdfspp

2015-07-20 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633929#comment-14633929
 ] 

James Clampffer commented on HDFS-8788:
---

+1

 Implement unit tests for remote block reader in libhdfspp
 -

 Key: HDFS-8788
 URL: https://issues.apache.org/jira/browse/HDFS-8788
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-8788.000.patch


 This jira proposes to implement unit tests for the remote block reader in 
 gmock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-3811) TestPersistBlocks#TestRestartDfsWithFlush appears to be flaky

2015-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers resolved HDFS-3811.
--
Resolution: Cannot Reproduce

I don't think I've seen this fail in a very long time. Going to resolve this. 
Please reopen if you disagree.

 TestPersistBlocks#TestRestartDfsWithFlush appears to be flaky
 -

 Key: HDFS-3811
 URL: https://issues.apache.org/jira/browse/HDFS-3811
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.2-alpha
Reporter: Andrew Wang
Assignee: Todd Lipcon
 Attachments: stacktrace, testfail-editlog.log, testfail.log, 
 testpersistblocks.txt


 This test failed on a recent Jenkins build, but passes for me locally. Seems 
 flaky.
 See:
 https://builds.apache.org/job/PreCommit-HDFS-Build/3021//testReport/org.apache.hadoop.hdfs/TestPersistBlocks/TestRestartDfsWithFlush/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633946#comment-14633946
 ] 

Allen Wittenauer commented on HDFS-8344:


{code}
dfs.block.uc.max.recovery.attemps
{code}

Typo on the configuration entry.  Also, should probably be in hdfs-default.xml. 

 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch


 I found another(?) instance in which the lease is not recovered. This is 
 easily reproducible on a pseudo-distributed single-node cluster.
 # Before you start, it helps if you set the following. This is not necessary, 
 but simply reduces how long you have to wait:
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (It could be less than 1 block, but it is 
 hflushed so some of the data has landed on the datanodes.) (I'm copying the 
 client code I am using. I generate a jar and run it using $ hadoop jar 
 TestHadoop.jar.)
 # Client crashes. (I simulate this by kill -9 of the $(hadoop jar 
 TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1.)
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-4001) TestSafeMode#testInitializeReplQueuesEarly may time out

2015-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers resolved HDFS-4001.
--
Resolution: Fixed

Haven't seen this fail in a very long time. Closing this out. Feel free to 
reopen if you disagree.

 TestSafeMode#testInitializeReplQueuesEarly may time out
 ---

 Key: HDFS-4001
 URL: https://issues.apache.org/jira/browse/HDFS-4001
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
 Attachments: timeout.txt.gz


 Saw this failure on a recent branch-2 jenkins run, has also been seen on 
 trunk.
 {noformat}
 java.util.concurrent.TimeoutException: Timed out waiting for condition
   at 
 org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:107)
   at 
 org.apache.hadoop.hdfs.TestSafeMode.testInitializeReplQueuesEarly(TestSafeMode.java:191)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8673) HDFS reports file already exists if there is a file/dir name end with ._COPYING_

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633859#comment-14633859
 ] 

Hadoop QA commented on HDFS-8673:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m  9s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 42s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 39s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 53s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m  9s | Tests passed in 
hadoop-common. |
| | |  60m 26s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746128/HDFS-8673.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 98c2bc8 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11751/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11751/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11751/console |


This message was automatically generated.

 HDFS reports file already exists if there is a file/dir name end with 
 ._COPYING_
 

 Key: HDFS-8673
 URL: https://issues.apache.org/jira/browse/HDFS-8673
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
 Attachments: HDFS-8673.000-WIP.patch, HDFS-8673.000.patch, 
 HDFS-8673.001.patch, HDFS-8673.002.patch, HDFS-8673.003.patch, 
 HDFS-8673.003.patch


 Because the CLI uses CommandWithDestination.java, which adds "._COPYING_" to 
 the tail of the file name when it does the copy, it will cause a problem if 
 there is a file/dir already called *._COPYING_ on HDFS.
 For file:
 -bash-4.1$ hadoop fs -put 5M /user/occ/
 -bash-4.1$ hadoop fs -mv /user/occ/5M /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 -rw-r--r--   1 occ supergroup5242880 2015-06-26 05:16 
 /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -put 128K /user/occ/5M
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 -rw-r--r--   1 occ supergroup 131072 2015-06-26 05:19 /user/occ/5M
 For dir:
 -bash-4.1$ hadoop fs -mkdir /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 drwxr-xr-x   - occ supergroup  0 2015-06-26 05:24 
 /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -put 128K /user/occ/5M
 put: /user/occ/5M._COPYING_ already exists as a directory
 -bash-4.1$ hadoop fs -ls /user/occ/
 (/user/occ/5M._COPYING_ is gone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-8673) HDFS reports file already exists if there is a file/dir name end with ._COPYING_

2015-07-20 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He reassigned HDFS-8673:
-

Assignee: Chen He

 HDFS reports file already exists if there is a file/dir name end with 
 ._COPYING_
 

 Key: HDFS-8673
 URL: https://issues.apache.org/jira/browse/HDFS-8673
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
 Attachments: HDFS-8673.000-WIP.patch, HDFS-8673.000.patch, 
 HDFS-8673.001.patch, HDFS-8673.002.patch, HDFS-8673.003.patch


 Because the CLI uses CommandWithDestination.java, which adds "._COPYING_" to 
 the tail of the file name when it does the copy, it will cause a problem if 
 there is a file/dir already called *._COPYING_ on HDFS.
 For file:
 -bash-4.1$ hadoop fs -put 5M /user/occ/
 -bash-4.1$ hadoop fs -mv /user/occ/5M /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 -rw-r--r--   1 occ supergroup5242880 2015-06-26 05:16 
 /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -put 128K /user/occ/5M
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 -rw-r--r--   1 occ supergroup 131072 2015-06-26 05:19 /user/occ/5M
 For dir:
 -bash-4.1$ hadoop fs -mkdir /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 drwxr-xr-x   - occ supergroup  0 2015-06-26 05:24 
 /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -put 128K /user/occ/5M
 put: /user/occ/5M._COPYING_ already exists as a directory
 -bash-4.1$ hadoop fs -ls /user/occ/
 (/user/occ/5M._COPYING_ is gone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8791) block ID-based DN storage layout can be very slow for datanode on ext4

2015-07-20 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633804#comment-14633804
 ] 

Nathan Roberts commented on HDFS-8791:
--

I agree we should optimize all the potential scans (du, checkDirs, 
directoryScanner, etc.).

I also think we need to do something more general because I feel like people 
will trip on this in all sorts of ways. Even tools outside of the DN process 
that do periodic scans will be affected and will in turn adversely affect the 
datanode's performance. Also, it's hard to see this problem until you're 
running at scale, so it will be difficult to catch jiras that introduce yet 
another scan, because they run really fast when everything is in memory.

I'm wondering if we shouldn't move to a hashing scheme that is more dynamic and 
grows/shrinks based on the number of blocks in the volume. A consistent hash to 
minimize renames, plus some logic that knows how to look in two places (old 
hash, new hash), seems like it might work. We could set a threshold of an 
average of 100 blocks per directory; when we cross that threshold we add enough 
subdirs to bring the average down to 95. 
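
A very rough sketch of the look-in-two-places idea (names and layout details 
here are hypothetical, not a worked-out proposal):

{code}
import java.io.File;

/**
 * Sketch: place a block in a leaf directory derived from blockId modulo the
 * current directory count; after the directory count grows, look in the new
 * location first and fall back to the old one, so renames can happen lazily.
 */
class GrowableBlockLayout {

  static File blockFile(File volumeDir, long blockId, int dirCount) {
    long idx = Math.abs(blockId % dirCount);
    return new File(new File(volumeDir, "subdir" + idx), "blk_" + blockId);
  }

  static File findBlockFile(File volumeDir, long blockId,
      int newDirCount, int oldDirCount) {
    File candidate = blockFile(volumeDir, blockId, newDirCount);
    if (candidate.exists()) {
      return candidate;
    }
    // Not migrated yet: look where the old directory count would have put it.
    return blockFile(volumeDir, blockId, oldDirCount);
  }
}
{code}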

I think ext2 and ext3 will see a similar problem. Are you seeing something 
different? I'll admit that my understanding of the differences isn't 
exhaustive, but it sure seems like all of them rely on the buffer cache to 
maintain directory blocks and all of them try to spread directories across the 
disk, so they'd all be subject to the same sort of thing. 


 block ID-based DN storage layout can be very slow for datanode on ext4
 --

 Key: HDFS-8791
 URL: https://issues.apache.org/jira/browse/HDFS-8791
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Nathan Roberts
Priority: Critical

 We are seeing cases where the new directory layout causes the datanode to 
 basically cause the disks to seek for 10s of minutes. This can be when the 
 datanode is running du, and it can also be when it is performing a 
 checkDirs(). Both of these operations currently scan all directories in the 
 block pool and that's very expensive in the new layout.
 The new layout creates 256 subdirs, each with 256 subdirs. Essentially 64K 
 leaf directories where block files are placed.
 So, what we have on disk is:
 - 256 inodes for the first level directories
 - 256 directory blocks for the first level directories
 - 256*256 inodes for the second level directories
 - 256*256 directory blocks for the second level directories
 - Then the inodes and blocks to store the HDFS blocks themselves.
 The main problem is the 256*256 directory blocks. 
 inodes and dentries will be cached by linux and one can configure how likely 
 the system is to prune those entries (vfs_cache_pressure). However, ext4 
 relies on the buffer cache to cache the directory blocks and I'm not aware of 
 any way to tell linux to favor buffer cache pages (even if it did I'm not 
 sure I would want it to in general).
 Also, ext4 tries hard to spread directories evenly across the entire volume, 
 this basically means the 64K directory blocks are probably randomly spread 
 across the entire disk. A du type scan will look at directories one at a 
 time, so the ioscheduler can't optimize the corresponding seeks, meaning the 
 seeks will be random and far. 
 In a system I was using to diagnose this, I had 60K blocks. A DU when things 
 are hot is less than 1 second. When things are cold, about 20 minutes.
 How do things get cold?
 - A large set of tasks run on the node. This pushes almost all of the buffer 
 cache out, causing the next DU to hit this situation. We are seeing cases 
 where a large job can cause a seek storm across the entire cluster.
 Why didn't the previous layout see this?
 - It might have but it wasn't nearly as pronounced. The previous layout would 
 be a few hundred directory blocks. Even when completely cold, these would 
 only take a few hundred seeks, which would mean single-digit seconds. 
 - With only a few hundred directories, the odds of the directory blocks 
 getting modified is quite high, this keeps those blocks hot and much less 
 likely to be evicted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml & ozone-site.xml

2015-07-20 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633875#comment-14633875
 ] 

Anu Engineer commented on HDFS-8753:


Hi [~kanaka],

Thanks for the patch.  I am able to build this patch on my local machine, and 
from the build logs it looks like it failed due to 
{code}
Exception in thread "main" java.io.FileNotFoundException: 
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/patchprocess/HDFS-7240FindbugsWarningshadoop-hdfs.xml
 (No such file or directory)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at 
edu.umd.cs.findbugs.SortedBugCollection.progessMonitoredInputStream(SortedBugCollection.java:1231)
at 
edu.umd.cs.findbugs.SortedBugCollection.readXML(SortedBugCollection.java:308)
at 
edu.umd.cs.findbugs.SortedBugCollection.readXML(SortedBugCollection.java:295)
at edu.umd.cs.findbugs.workflow.Filter.main(Filter.java:712)
Pre-patch HDFS-7240 findbugs is broken?
{code}

It was not able to run findbugs on the pre-patch build. I have re-submitted the 
build to see if we can repro this issue:

https://builds.apache.org/job/PreCommit-HDFS-Build/11752/console

Thanks
Anu


 Ozone: Unify StorageContainerConfiguration with ozone-default.xml & 
 ozone-site.xml 
 ---

 Key: HDFS-8753
 URL: https://issues.apache.org/jira/browse/HDFS-8753
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: kanaka kumar avvaru
Assignee: kanaka kumar avvaru
 Attachments: HDFS-8753-HDFS-7240.00.patch


 This JIRA proposes adding ozone-default.xml to main resources & 
 ozone-site.xml to test resources with default known parameters as of now.
 Also, need to unify {{StorageContainerConfiguration}} to initialize conf with 
 both the files as at present there are two classes with this name.
 {code}
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java
  loads only ozone-site.xml
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java
  loads only storage-container-site.xml
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-07-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated HDFS-6407:
---
Priority: Critical  (was: Minor)

 new namenode UI, lost ability to sort columns in datanode tab
 -

 Key: HDFS-6407
 URL: https://issues.apache.org/jira/browse/HDFS-6407
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Nathan Roberts
Assignee: Benoy Antony
Priority: Critical
  Labels: BB2015-05-TBR
 Attachments: 002-datanodes-sorted-capacityUsed.png, 
 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
 HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.patch, 
 browse_directory.png, datanodes.png, snapshots.png


 The old UI supported clicking on a column header to sort on that column. The 
 new UI seems to have dropped this very useful feature.
 There are a few tables in the Namenode UI to display datanode information, 
 directory listings and snapshots.
 When there are many items in the tables, it is useful to have the ability to 
 sort on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8791) block ID-based DN storage layout can be very slow for datanode on ext4

2015-07-20 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633675#comment-14633675
 ] 

Nathan Roberts commented on HDFS-8791:
--

I forgot to mention that I'm pretty confident it's not the inodes, but rather 
the directory blocks. inodes have their own cache that I can control with 
vfs_cache_pressure. directory blocks however are just cached via the buffer 
cache (afaik), and the buffer cache is much more difficult to have any control 
over.

 block ID-based DN storage layout can be very slow for datanode on ext4
 --

 Key: HDFS-8791
 URL: https://issues.apache.org/jira/browse/HDFS-8791
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: Nathan Roberts
Priority: Critical

 We are seeing cases where the new directory layout causes the datanode to 
 basically cause the disks to seek for 10s of minutes. This can be when the 
 datanode is running du, and it can also be when it is performing a 
 checkDirs(). Both of these operations currently scan all directories in the 
 block pool and that's very expensive in the new layout.
 The new layout creates 256 subdirs, each with 256 subdirs. Essentially 64K 
 leaf directories where block files are placed.
 So, what we have on disk is:
 - 256 inodes for the first level directories
 - 256 directory blocks for the first level directories
 - 256*256 inodes for the second level directories
 - 256*256 directory blocks for the second level directories
 - Then the inodes and blocks to store the HDFS blocks themselves.
 The main problem is the 256*256 directory blocks. 
 inodes and dentries will be cached by linux and one can configure how likely 
 the system is to prune those entries (vfs_cache_pressure). However, ext4 
 relies on the buffer cache to cache the directory blocks and I'm not aware of 
 any way to tell linux to favor buffer cache pages (even if it did I'm not 
 sure I would want it to in general).
 Also, ext4 tries hard to spread directories evenly across the entire volume, 
 this basically means the 64K directory blocks are probably randomly spread 
 across the entire disk. A du type scan will look at directories one at a 
 time, so the ioscheduler can't optimize the corresponding seeks, meaning the 
 seeks will be random and far. 
 In a system I was using to diagnose this, I had 60K blocks. A DU when things 
 are hot is less than 1 second. When things are cold, about 20 minutes.
 How do things get cold?
 - A large set of tasks run on the node. This pushes almost all of the buffer 
 cache out, causing the next DU to hit this situation. We are seeing cases 
 where a large job can cause a seek storm across the entire cluster.
 Why didn't the previous layout see this?
 - It might have, but it wasn't nearly as pronounced. The previous layout would 
 be a few hundred directory blocks. Even when completely cold, these would 
 only take a few hundred seeks, which would mean single-digit seconds.  
 - With only a few hundred directories, the odds of the directory blocks 
 getting modified are quite high, which keeps those blocks hot and much less 
 likely to be evicted.
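 As an illustration of the layout above, here is a minimal sketch (a hypothetical 
 helper, not the actual DatanodeUtil code; the bit masks are assumptions matching 
 the 256x256 fan-out) of how a block-ID-based layout maps a block ID to its 
 two-level leaf directory:
 {code}
 import java.io.File;

 // Hypothetical sketch of a 256x256 block-ID-based layout; not the real code.
 public class BlockIdLayoutSketch {
   static File idToBlockDir(File finalizedRoot, long blockId) {
     int d1 = (int) ((blockId >> 16) & 0xFF); // first-level subdir (0-255)
     int d2 = (int) ((blockId >> 8) & 0xFF);  // second-level subdir (0-255)
     return new File(finalizedRoot, "subdir" + d1 + File.separator + "subdir" + d2);
   }

   public static void main(String[] args) {
     // every block file lands in one of the 64K leaf directories
     System.out.println(idToBlockDir(new File("/data/dfs/current/finalized"), 1073742025L));
   }
 }
 {code}
 A du or checkDirs() style scan has to visit all 64K of these leaf directories, 
 which is what makes the cold case so seek-heavy.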



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8673) HDFS reports file already exists if there is a file/dir name end with ._COPYING_

2015-07-20 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He updated HDFS-8673:
--
Attachment: HDFS-8673.003.patch

reattach path to trigger Hadoop QA

 HDFS reports file already exists if there is a file/dir name end with 
 ._COPYING_
 

 Key: HDFS-8673
 URL: https://issues.apache.org/jira/browse/HDFS-8673
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
 Attachments: HDFS-8673.000-WIP.patch, HDFS-8673.000.patch, 
 HDFS-8673.001.patch, HDFS-8673.002.patch, HDFS-8673.003.patch, 
 HDFS-8673.003.patch


 Because the CLI uses CommandWithDestination.java, which adds "._COPYING_" to 
 the tail of the file name while it does the copy, it will cause a problem if there 
 is a file/dir already called *._COPYING_ on HDFS.
 For file:
 -bash-4.1$ hadoop fs -put 5M /user/occ/
 -bash-4.1$ hadoop fs -mv /user/occ/5M /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 -rw-r--r--   1 occ supergroup5242880 2015-06-26 05:16 
 /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -put 128K /user/occ/5M
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 -rw-r--r--   1 occ supergroup 131072 2015-06-26 05:19 /user/occ/5M
 For dir:
 -bash-4.1$ hadoop fs -mkdir /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 drwxr-xr-x   - occ supergroup  0 2015-06-26 05:24 
 /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -put 128K /user/occ/5M
 put: /user/occ/5M._COPYING_ already exists as a directory
 -bash-4.1$ hadoop fs -ls /user/occ/
 (/user/occ/5M._COPYING_ is gone)
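 For illustration, a minimal sketch (assumed class and variable names, not the 
 actual CommandWithDestination code) of the copy-via-temporary-suffix pattern 
 that causes the collision:
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;

 // Sketch only: data is first written to <target>._COPYING_ and then renamed
 // onto the target, so a pre-existing <target>._COPYING_ gets in the way.
 public class CopyingSuffixSketch {
   public static void main(String[] args) throws Exception {
     FileSystem fs = FileSystem.get(new Configuration());
     Path target = new Path("/user/occ/5M");
     Path tmp = target.suffix("._COPYING_");   // /user/occ/5M._COPYING_
     // ... bytes would be streamed into tmp here ...
     fs.rename(tmp, target);  // an existing 5M._COPYING_ file or dir collides here
   }
 }
 {code}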



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8673) HDFS reports file already exists if there is a file/dir name end with ._COPYING_

2015-07-20 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633767#comment-14633767
 ] 

Chen He commented on HDFS-8673:
---

PATCH, :)

 HDFS reports file already exists if there is a file/dir name end with 
 ._COPYING_
 

 Key: HDFS-8673
 URL: https://issues.apache.org/jira/browse/HDFS-8673
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fs
Affects Versions: 2.7.0
Reporter: Chen He
Assignee: Chen He
 Attachments: HDFS-8673.000-WIP.patch, HDFS-8673.000.patch, 
 HDFS-8673.001.patch, HDFS-8673.002.patch, HDFS-8673.003.patch, 
 HDFS-8673.003.patch


 Because the CLI uses CommandWithDestination.java, which adds "._COPYING_" to 
 the tail of the file name while it does the copy, it will cause a problem if there 
 is a file/dir already called *._COPYING_ on HDFS.
 For file:
 -bash-4.1$ hadoop fs -put 5M /user/occ/
 -bash-4.1$ hadoop fs -mv /user/occ/5M /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 -rw-r--r--   1 occ supergroup5242880 2015-06-26 05:16 
 /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -put 128K /user/occ/5M
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 -rw-r--r--   1 occ supergroup 131072 2015-06-26 05:19 /user/occ/5M
 For dir:
 -bash-4.1$ hadoop fs -mkdir /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -ls /user/occ/
 Found 1 items
 drwxr-xr-x   - occ supergroup  0 2015-06-26 05:24 
 /user/occ/5M._COPYING_
 -bash-4.1$ hadoop fs -put 128K /user/occ/5M
 put: /user/occ/5M._COPYING_ already exists as a directory
 -bash-4.1$ hadoop fs -ls /user/occ/
 (/user/occ/5M._COPYING_ is gone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8486) DN startup may cause severe data loss

2015-07-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633828#comment-14633828
 ] 

Daryn Sharp commented on HDFS-8486:
---

Public service notice:
* _Every restart of a 2.6.x or 2.7.0 DN incurs a risk of unwanted block 
deletion_.
* Apply this patch if you are running a pre-2.7.1 release.

I previously attributed this to an ancient bug, but it's new to 2.6.  HDFS-2560 
did start the scanner too early but the race caused a benign log warning.  In 
2.6, HDFS-6931 made an unrelated change that introduced the faulty (mass) 
deletion logic.

 DN startup may cause severe data loss
 -

 Key: HDFS-8486
 URL: https://issues.apache.org/jira/browse/HDFS-8486
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 0.23.1, 2.0.0-alpha
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 2.7.1

 Attachments: HDFS-8486.patch, HDFS-8486.patch


 A race condition between block pool initialization and the directory scanner 
 may cause a mass deletion of blocks in multiple storages.
 If block pool initialization finds a block on disk that is already in the 
 replica map, it deletes one of the blocks based on size, GS, etc.  
 Unfortunately it _always_ deletes one of the blocks even if identical, thus 
 the replica map _must_ be empty when the pool is initialized.
 The directory scanner starts at a random time within its periodic interval 
 (default 6h).  If the scanner starts very early it races to populate the 
 replica map, causing the block pool init to erroneously delete blocks.
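 As a rough illustration of the hazard (hypothetical names, not the actual 
 DataNode code): if the scanner has already populated the map, the duplicate 
 branch fires even for identical replicas and deletes a live block file.
 {code}
 import java.io.File;
 import java.util.HashMap;
 import java.util.Map;

 // Illustrative sketch only: why the replica map must be empty during
 // block pool initialization.
 public class ReplicaMapRaceSketch {
   private final Map<Long, File> replicaMap = new HashMap<>();

   void addReplicaFromDisk(long blockId, File blockFile) {
     File existing = replicaMap.putIfAbsent(blockId, blockFile);
     if (existing != null) {
       // a "duplicate" was found: one copy is always deleted, even if both
       // entries refer to the same healthy on-disk replica
       blockFile.delete();
     }
   }

   public static void main(String[] args) {
     new ReplicaMapRaceSketch().addReplicaFromDisk(1L, new File("blk_1"));
   }
 }
 {code}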



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634056#comment-14634056
 ] 

Haohui Mai commented on HDFS-8344:
--

Then the question becomes: what would be a good default value for the configuration? 
Why does it require retrying on UC blocks a number of times instead of just 
marking the block as missing when the hard limit expires?

 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, 
 HDFS-8344.06.patch, HDFS-8344.07.patch


 I found another\(?) instance in which the lease is not recovered. This is 
 reproducible easily on a pseudo-distributed single node cluster
 # Before you start, it helps if you set the following. This is not necessary, but simply 
 reduces how long you have to wait:
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (could be less than 1 block, but it hflushed 
 so some of the data has landed on the datanodes) (I'm copying the client code 
 I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar)
 # Client crashes. (I simulate this by running kill -9 on the $(hadoop jar 
 TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1)
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8775) SASL support for data transfer protocol in libhdfspp

2015-07-20 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8775:
-
Attachment: HDFS-8775.000.patch

 SASL support for data transfer protocol in libhdfspp
 

 Key: HDFS-8775
 URL: https://issues.apache.org/jira/browse/HDFS-8775
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-8775.000.patch


 This jira proposes to implement basic SASL support for the data transfer 
 protocol which allows libhdfspp to talk to secure clusters.
 Support for encryption is deferred to subsequent jiras.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8499) Refactor BlockInfo class hierarchy with static helper class

2015-07-20 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634120#comment-14634120
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8499:
---

 ..., but if you could share some thoughts on the comparison it'd be nice. ...

BlockInfoContiguous and BlockInfoStriped could be implemented using completely 
different data structures.  They also can be constructed in a completely 
different way.  Indeed, BlockInfoContiguous is constructed by a write pipeline 
and BlockInfoStriped is constructed by parallel writes.  Therefore, 
BlockInfoContiguousUC and BlockInfoStripedUC may not share a lot of common 
code.  However, Design #1 assumes both BlockInfoContiguous and BlockInfoStriped 
can be constructed in a similar way.

Also, if BlockInfoContiguousUC/BlockInfoStripedUC does not extend 
BlockInfoContiguous/BlockInfoStriped, their data structures cannot be made 
private to the classes.  HDFS-8499 adds ContiguousBlockStorageOp for 
BlockInfoContiguous and BlockInfoUnderConstructionContiguous so that the actual 
logic for contiguous BlockInfo lives in the static methods in 
ContiguousBlockStorageOp.  It is a procedural language approach rather than an OO 
approach.  BlockInfoContiguous/BlockInfoUnderConstructionContiguous become 
adapter-style classes -- they simply call the methods in 
ContiguousBlockStorageOp, and there is a lot of code duplication between these 
two classes.  The same thing is going to happen to BlockInfoStriped and 
BlockInfoStripedUC in Design #1.
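To illustrate the point with made-up minimal classes (these are not the real 
BlockInfo classes): with the static-helper style, each concrete class becomes a 
thin adapter that forwards to the helper, and that forwarding code is duplicated.
{code}
// Made-up minimal example of the "static helper + adapter classes" style
// discussed above; not the actual HDFS-8499 code.
public class StaticHelperSketch {
  static class ContiguousStorageOp {
    static Object getStorageInfo(Object[] triplets, int index) {
      return triplets[3 * index];
    }
  }

  static class BlockInfoContiguous {
    final Object[] triplets = new Object[3 * 3];
    Object getStorageInfo(int index) {           // forwarding code ...
      return ContiguousStorageOp.getStorageInfo(triplets, index);
    }
  }

  static class BlockInfoContiguousUC {           // does not extend BlockInfoContiguous
    final Object[] triplets = new Object[3 * 3];
    Object getStorageInfo(int index) {           // ... duplicated here
      return ContiguousStorageOp.getStorageInfo(triplets, index);
    }
  }

  public static void main(String[] args) {
    System.out.println(new BlockInfoContiguous().getStorageInfo(0));
  }
}
{code}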

 ... I chose option A to avoid breaking the existing is-a relationship in 
 trunk. ...

Do you mean breaking the trunk code before HDFS-8499? If yes, could you 
explain how Design #2 breaks the existing is-a relationship?


 Refactor BlockInfo class hierarchy with static helper class
 ---

 Key: HDFS-8499
 URL: https://issues.apache.org/jira/browse/HDFS-8499
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 2.7.0
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Fix For: 2.8.0

 Attachments: HDFS-8499.00.patch, HDFS-8499.01.patch, 
 HDFS-8499.02.patch, HDFS-8499.03.patch, HDFS-8499.04.patch, 
 HDFS-8499.05.patch, HDFS-8499.06.patch, HDFS-8499.07.patch, 
 HDFS-8499.UCFeature.patch, HDFS-bistriped.patch


 In HDFS-7285 branch, the {{BlockInfoUnderConstruction}} interface provides a 
 common abstraction for striped and contiguous UC blocks. This JIRA aims to 
 merge it to trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634094#comment-14634094
 ] 

Hudson commented on HDFS-8344:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #8186 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8186/])
HDFS-8344. NameNode doesn't recover lease for files with missing blocks 
(raviprak) (raviprak: rev e4f756260f16156179ba4adad974ec92279c2fac)
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 2.8.0

 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, 
 HDFS-8344.06.patch, HDFS-8344.07.patch


 I found another\(?) instance in which the lease is not recovered. This is 
 reproducible easily on a pseudo-distributed single node cluster
 # Before you start, it helps if you set the following. This is not necessary, but simply 
 reduces how long you have to wait:
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (could be less than 1 block, but it hflushed 
 so some of the data has landed on the datanodes) (I'm copying the client code 
 I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar)
 # Client crashes. (I simulate this by running kill -9 on the $(hadoop jar 
 TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1)
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8764) Generate Hadoop RPC stubs from protobuf definitions

2015-07-20 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved HDFS-8764.
--
   Resolution: Fixed
Fix Version/s: HDFS-8707

Committed to the HDFS-8707 branch. Thanks Jing and James for the reviews.

 Generate Hadoop RPC stubs from protobuf definitions
 ---

 Key: HDFS-8764
 URL: https://issues.apache.org/jira/browse/HDFS-8764
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: HDFS-8707

 Attachments: HDFS-8764.000.patch


 It would be nice to have the RPC stubs generated from the protobuf 
 definitions, similar to what HADOOP-10388 has achieved.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-8344:
---
   Resolution: Fixed
Fix Version/s: 2.8.0
 Release Note: Allow a configuration to specify the maximum number of 
recovery attempts for blocks under construction.
   Status: Resolved  (was: Patch Available)

 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 2.8.0

 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, 
 HDFS-8344.06.patch, HDFS-8344.07.patch


 I found another\(?) instance in which the lease is not recovered. This is 
 reproducible easily on a pseudo-distributed single node cluster
 # Before you start, it helps if you set the following. This is not necessary, but simply 
 reduces how long you have to wait:
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (could be less than 1 block, but it hflushed 
 so some of the data has landed on the datanodes) (I'm copying the client code 
 I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar)
 # Client crashes. (I simulate this by running kill -9 on the $(hadoop jar 
 TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1)
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634088#comment-14634088
 ] 

Ravi Prakash commented on HDFS-8344:


Thanks for the review Allen, Kihwal, Masatake and Haohui. I've committed this 
to trunk and branch-2.

I just saw your comment Haohui. The datanode might be busy and recovery may 
fail the first time. I thought it best to try recovery a few times before 
giving up.

 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 2.8.0

 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, 
 HDFS-8344.06.patch, HDFS-8344.07.patch


 I found another\(?) instance in which the lease is not recovered. This is 
 reproducible easily on a pseudo-distributed single node cluster
 # Before you start, it helps if you set the following. This is not necessary, but simply 
 reduces how long you have to wait:
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (could be less than 1 block, but it hflushed 
 so some of the data has landed on the datanodes) (I'm copying the client code 
 I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar)
 # Client crashes. (I simulate this by running kill -9 on the $(hadoop jar 
 TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1)
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs

2015-07-20 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-8306:

Attachment: HDFS-8306.008.patch

Updated the patch to use a UTF-8 {{InputSource}}.

 Generate ACL and Xattr outputs in OIV XML outputs
 -

 Key: HDFS-8306
 URL: https://issues.apache.org/jira/browse/HDFS-8306
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 2.7.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HDFS-8306.000.patch, HDFS-8306.001.patch, 
 HDFS-8306.002.patch, HDFS-8306.003.patch, HDFS-8306.004.patch, 
 HDFS-8306.005.patch, HDFS-8306.006.patch, HDFS-8306.007.patch, 
 HDFS-8306.008.patch, HDFS-8306.debug0.patch, HDFS-8306.debug1.patch


 Currently, in the {{hdfs oiv}} XML output, not all fields of the fsimage are 
 included. This makes inspecting an {{fsimage}} via the XML output less practical. 
 It also prevents recovering an fsimage from the XML file.
 This JIRA is adding ACL and XAttrs in the XML outputs as the first step to 
 achieve the goal described in HDFS-8061.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8788) Implement unit tests for remote block reader in libhdfspp

2015-07-20 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved HDFS-8788.
--
   Resolution: Fixed
Fix Version/s: HDFS-8707

Committed to the HDFS-8707 branch. Thanks James for the reviews.

 Implement unit tests for remote block reader in libhdfspp
 -

 Key: HDFS-8788
 URL: https://issues.apache.org/jira/browse/HDFS-8788
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: HDFS-8707

 Attachments: HDFS-8788.000.patch


 This jira proposes to implement unit tests for the remote block reader in 
 gmock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3455) Add docs for NameNode initializeSharedEdits and bootstrapStandby commands

2015-07-20 Thread Anthony Rojas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Rojas updated HDFS-3455:

Assignee: (was: Anthony Rojas)

 Add docs for NameNode initializeSharedEdits and bootstrapStandby commands
 -

 Key: HDFS-3455
 URL: https://issues.apache.org/jira/browse/HDFS-3455
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.0.0-alpha
Reporter: Todd Lipcon
  Labels: newbie

 We've made the HA setup easier by adding new flags to the namenode to 
 automatically set up the standby. But, we didn't document them yet. We should 
 amend the HDFSHighAvailability.apt.vm docs to include this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-8344:
---
Attachment: HDFS-8344.07.patch

Thanks a lot for the careful review Allen! Here's another with the fixes.

 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, 
 HDFS-8344.06.patch, HDFS-8344.07.patch


 I found another\(?) instance in which the lease is not recovered. This is 
 reproducible easily on a pseudo-distributed single node cluster
 # Before you start it helps if you set. This is not necessary, but simply 
 reduces how long you have to wait
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (could be less than 1 block, but it hflushed 
 so some of the data has landed on the datanodes) (I'm copying the client code 
 I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar)
 # Client crashes. (I simulate this by kill -9 the $(hadoop jar 
 TestHadoop.jar) process after it has printed Wrote to the bufferedWriter
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1)
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8753) Ozone: Unify StorageContainerConfiguration with ozone-default.xml & ozone-site.xml

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634044#comment-14634044
 ] 

Hadoop QA commented on HDFS-8753:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 23s | Findbugs (version ) appears to 
be broken on HDFS-7240. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 43s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 47s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 19s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 32s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 33s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  3s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 161m  6s | Tests failed in hadoop-hdfs. |
| | | 202m 35s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746081/HDFS-8753-HDFS-7240.00.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7240 / 8576861 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11752/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11752/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11752/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11752/console |


This message was automatically generated.

 Ozone: Unify StorageContainerConfiguration with ozone-default.xml & 
 ozone-site.xml 
 ---

 Key: HDFS-8753
 URL: https://issues.apache.org/jira/browse/HDFS-8753
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: kanaka kumar avvaru
Assignee: kanaka kumar avvaru
 Attachments: HDFS-8753-HDFS-7240.00.patch


 This JIRA proposes adding ozone-default.xml to main resources & 
 ozone-site.xml to test resources, with the default known parameters as of now.
 Also, we need to unify {{StorageContainerConfiguration}} to initialize the conf with 
 both files, as at present there are two classes with this name.
 {code}
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\ozone\StorageContainerConfiguration.java
  loads only ozone-site.xml
 hadoop-hdfs-project\hadoop-hdfs\src\main\java\org\apache\hadoop\storagecontainer\StorageContainerConfiguration.java
  loads only storage-container-site.xml
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634047#comment-14634047
 ] 

Allen Wittenauer commented on HDFS-8344:


+1 lgtm

 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, 
 HDFS-8344.06.patch, HDFS-8344.07.patch


 I found another\(?) instance in which the lease is not recovered. This is 
 reproducible easily on a pseudo-distributed single node cluster
 # Before you start, it helps if you set the following. This is not necessary, but simply 
 reduces how long you have to wait:
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (could be less than 1 block, but it hflushed 
 so some of the data has landed on the datanodes) (I'm copying the client code 
 I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar)
 # Client crashes. (I simulate this by running kill -9 on the $(hadoop jar 
 TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter".)
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1)
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8486) DN startup may cause severe data loss

2015-07-20 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-8486:
--
Release Note: 
Public service notice:
- Every restart of a 2.6.x or 2.7.0 DN incurs a risk of unwanted block deletion.
- Apply this patch if you are running a pre-2.7.1 release.

(Promoting the comment into the release-notes area of JIRA just so it's more visible)

 DN startup may cause severe data loss
 -

 Key: HDFS-8486
 URL: https://issues.apache.org/jira/browse/HDFS-8486
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 0.23.1, 2.0.0-alpha
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 2.7.1

 Attachments: HDFS-8486.patch, HDFS-8486.patch


 A race condition between block pool initialization and the directory scanner 
 may cause a mass deletion of blocks in multiple storages.
 If block pool initialization finds a block on disk that is already in the 
 replica map, it deletes one of the blocks based on size, GS, etc.  
 Unfortunately it _always_ deletes one of the blocks even if identical, thus 
 the replica map _must_ be empty when the pool is initialized.
 The directory scanner starts at a random time within its periodic interval 
 (default 6h).  If the scanner starts very early it races to populate the 
 replica map, causing the block pool init to erroneously delete blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8760) Erasure Coding: reuse BlockReader when reading the same block in pread

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634011#comment-14634011
 ] 

Hadoop QA commented on HDFS-8760:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 41s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m  9s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 19s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 14s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 42s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 45s | The patch appears to introduce 5 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 24s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  18m  7s | Tests failed in hadoop-hdfs. |
| | |  62m 44s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestReservedRawPaths |
|   | hadoop.hdfs.server.blockmanagement.TestDatanodeManager |
|   | hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots |
|   | hadoop.hdfs.TestSetrepIncreasing |
|   | hadoop.hdfs.TestModTime |
|   | hadoop.fs.TestUrlStreamHandler |
|   | hadoop.hdfs.security.TestDelegationToken |
|   | hadoop.hdfs.server.namenode.TestBlockPlacementPolicyRackFaultTolarent |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.server.namenode.TestFileLimit |
|   | hadoop.hdfs.TestParallelShortCircuitRead |
|   | hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot |
|   | hadoop.hdfs.TestDisableConnCache |
|   | hadoop.hdfs.server.blockmanagement.TestBlockInfoStriped |
|   | hadoop.hdfs.web.TestWebHdfsWithAuthenticationFilter |
|   | hadoop.hdfs.server.namenode.TestEditLogAutoroll |
|   | hadoop.TestRefreshCallQueue |
|   | hadoop.hdfs.protocolPB.TestPBHelper |
|   | hadoop.hdfs.web.TestWebHdfsUrl |
|   | hadoop.hdfs.TestECSchemas |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | hadoop.hdfs.TestConnCache |
|   | hadoop.cli.TestCryptoAdminCLI |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestSetrepDecreasing |
|   | hadoop.hdfs.server.datanode.TestDiskError |
|   | hadoop.fs.viewfs.TestViewFsWithAcls |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   | hadoop.hdfs.server.namenode.TestHostsFiles |
|   | hadoop.hdfs.server.datanode.TestTransferRbw |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy |
|   | hadoop.fs.contract.hdfs.TestHDFSContractDelete |
|   | hadoop.hdfs.server.namenode.TestFileContextAcl |
|   | hadoop.hdfs.TestSafeModeWithStripedFile |
|   | hadoop.fs.TestFcHdfsSetUMask |
|   | hadoop.fs.TestUnbuffer |
|   | hadoop.hdfs.server.namenode.TestClusterId |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.server.namenode.TestFSDirectory |
|   | hadoop.hdfs.server.namenode.TestLeaseManager |
|   | hadoop.fs.contract.hdfs.TestHDFSContractOpen |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotListing |
|   | hadoop.hdfs.server.datanode.TestStorageReport |
|   | hadoop.hdfs.server.datanode.TestBlockRecovery |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.TestReadWhileWriting |
|   | hadoop.fs.contract.hdfs.TestHDFSContractMkdir |
|   | hadoop.fs.contract.hdfs.TestHDFSContractAppend |
|   | hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
|   | hadoop.hdfs.server.namenode.ha.TestGetGroupsWithHA |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.TestSecondaryWebUi |
|   | hadoop.hdfs.server.namenode.TestMalformedURLs |
|   | hadoop.hdfs.server.namenode.TestAuditLogger |
|   | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |

[jira] [Updated] (HDFS-8799) Erasure Coding: add tests for namenode processing corrupt striped blocks

2015-07-20 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-8799:

Summary: Erasure Coding: add tests for namenode processing corrupt striped 
blocks  (was: Erasure Coding: add tests for process corrupt striped blocks)

 Erasure Coding: add tests for namenode processing corrupt striped blocks
 

 Key: HDFS-8799
 URL: https://issues.apache.org/jira/browse/HDFS-8799
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8799) Erasure Coding: add tests for namenode processing corrupt striped blocks

2015-07-20 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633718#comment-14633718
 ] 

Walter Su commented on HDFS-8799:
-

{{TestAddStripedBlocks#testCehckStripedReplicaCorrupt()}} tests the count of 
corruptReplicas.
This jira adds tests for whether and when the corruptReplicas should be deleted, just 
like {{TestProcessCorruptBlocks}}.

 Erasure Coding: add tests for namenode processing corrupt striped blocks
 

 Key: HDFS-8799
 URL: https://issues.apache.org/jira/browse/HDFS-8799
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-8335) FSNamesystem/FSDirStatAndListingOp getFileInfo and getListingInt construct FSPermissionChecker regardless of isPermissionEnabled()

2015-07-20 Thread Gabor Liptak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Liptak reassigned HDFS-8335:
--

Assignee: Gabor Liptak

 FSNamesystem/FSDirStatAndListingOp getFileInfo and getListingInt construct 
 FSPermissionChecker regardless of isPermissionEnabled()
 --

 Key: HDFS-8335
 URL: https://issues.apache.org/jira/browse/HDFS-8335
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0, 2.6.0, 2.7.0, 2.8.0
Reporter: David Bryson
Assignee: Gabor Liptak
 Attachments: HDFS-8335.2.patch, HDFS-8335.patch


 FSNamesystem (2.5.x)/FSDirStatAndListingOp(current trunk) getFileInfo and 
 getListingInt methods call getPermissionChecker() to construct a 
 FSPermissionChecker regardless of isPermissionEnabled(). When permission 
 checking is disabled, this leads to an unnecessary performance hit 
 constructing a UserGroupInformation object that is never used.
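 A minimal, self-contained sketch of the guard this suggests (illustrative 
 names only, not the actual FSNamesystem code):
 {code}
 public class PermissionGuardSketch {
   static final boolean PERMISSIONS_ENABLED = false;  // i.e. dfs.permissions.enabled

   static Object getPermissionChecker() {
     // stands in for the expensive FSPermissionChecker / UserGroupInformation setup
     return new Object();
   }

   static void getListing(String src) {
     // construct the checker only when it will actually be used
     Object pc = PERMISSIONS_ENABLED ? getPermissionChecker() : null;
     if (pc != null) {
       // ... permission checks on src using pc ...
     }
     // ... build and return the directory listing ...
   }

   public static void main(String[] args) {
     getListing("/user");
   }
 }
 {code}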
 For example, from a stack dump when driving concurrent requests, they all end 
 up blocking.
 Here's the thread holding the lock:
 IPC Server handler 9 on 9000 daemon prio=10 tid=0x7f78d8b9e800 
 nid=0x142f3 runnable [0x7f78c2ddc000]
java.lang.Thread.State: RUNNABLE
 at java.io.FileInputStream.readBytes(Native Method)
 at java.io.FileInputStream.read(FileInputStream.java:272)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
 - locked 0x0007d9b105c0 (a java.lang.UNIXProcess$ProcessPipeInputStream)
 at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
 at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
 at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
 - locked 0x0007d9b1a888 (a java.io.InputStreamReader)
 at java.io.InputStreamReader.read(InputStreamReader.java:184)
 at java.io.BufferedReader.fill(BufferedReader.java:154)
 at java.io.BufferedReader.read1(BufferedReader.java:205)
 at java.io.BufferedReader.read(BufferedReader.java:279)
 - locked 0x0007d9b1a888 (a java.io.InputStreamReader)
 at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:715)
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:524)
 at org.apache.hadoop.util.Shell.run(Shell.java:455)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:774)
 at 
 org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:84)
 at 
 org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
 at 
 org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50)
 at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
 at 
 org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1474)
 - locked 0x0007a6df75f8 (a 
 org.apache.hadoop.security.UserGroupInformation)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:82)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3534)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4489)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4478)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:898)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:602)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 Here is (one of the many) threads waiting on the lock:
 IPC Server handler 2 on 9000 daemon prio=10 tid=0x7f78d8c48800 
 nid=0x142ec waiting for monitor entry [0x7f78c34e3000]
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 

[jira] [Updated] (HDFS-8335) FSNamesystem/FSDirStatAndListingOp getFileInfo and getListingInt construct FSPermissionChecker regardless of isPermissionEnabled()

2015-07-20 Thread Gabor Liptak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Liptak updated HDFS-8335:
---
Attachment: HDFS-8335.2.patch

 FSNamesystem/FSDirStatAndListingOp getFileInfo and getListingInt construct 
 FSPermissionChecker regardless of isPermissionEnabled()
 --

 Key: HDFS-8335
 URL: https://issues.apache.org/jira/browse/HDFS-8335
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 2.5.0, 2.6.0, 2.7.0, 2.8.0
Reporter: David Bryson
 Attachments: HDFS-8335.2.patch, HDFS-8335.patch


 FSNamesystem (2.5.x)/FSDirStatAndListingOp(current trunk) getFileInfo and 
 getListingInt methods call getPermissionChecker() to construct a 
 FSPermissionChecker regardless of isPermissionEnabled(). When permission 
 checking is disabled, this leads to an unnecessary performance hit 
 constructing a UserGroupInformation object that is never used.
 For example, from a stack dump when driving concurrent requests, they all end 
 up blocking.
 Here's the thread holding the lock:
 IPC Server handler 9 on 9000 daemon prio=10 tid=0x7f78d8b9e800 
 nid=0x142f3 runnable [0x7f78c2ddc000]
java.lang.Thread.State: RUNNABLE
 at java.io.FileInputStream.readBytes(Native Method)
 at java.io.FileInputStream.read(FileInputStream.java:272)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
 - locked 0x0007d9b105c0 (a java.lang.UNIXProcess$ProcessPipeInputStream)
 at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
 at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
 at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
 - locked 0x0007d9b1a888 (a java.io.InputStreamReader)
 at java.io.InputStreamReader.read(InputStreamReader.java:184)
 at java.io.BufferedReader.fill(BufferedReader.java:154)
 at java.io.BufferedReader.read1(BufferedReader.java:205)
 at java.io.BufferedReader.read(BufferedReader.java:279)
 - locked 0x0007d9b1a888 (a java.io.InputStreamReader)
 at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.parseExecResult(Shell.java:715)
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:524)
 at org.apache.hadoop.util.Shell.run(Shell.java:455)
 at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:774)
 at 
 org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:84)
 at 
 org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
 at 
 org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50)
 at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
 at 
 org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1474)
 - locked 0x0007a6df75f8 (a 
 org.apache.hadoop.security.UserGroupInformation)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:82)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3534)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4489)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4478)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:898)
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:602)
 at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 Here is (one of the many) threads waiting on the lock:
 IPC Server handler 2 on 9000 daemon prio=10 tid=0x7f78d8c48800 
 nid=0x142ec waiting for monitor entry [0x7f78c34e3000]
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1472)
 - 

[jira] [Updated] (HDFS-8794) Improve CorruptReplicasMap#corruptReplicasMap

2015-07-20 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-8794:
-
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.8.0
Target Version/s: 2.8.0  (was: 2.7.1)
  Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks, [~arpitagarwal], for the review.

 Improve CorruptReplicasMap#corruptReplicasMap
 -

 Key: HDFS-8794
 URL: https://issues.apache.org/jira/browse/HDFS-8794
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 2.8.0

 Attachments: HDFS-8794.001.patch, HDFS-8794.002.patch


 Currently we use a {{TreeMap}} for {{corruptReplicasMap}}; actually the only 
 place that needs sorted order is {{getCorruptReplicaBlockIds}}, which is only used by tests.
 So we can use a {{HashMap}}.
 From a memory and performance point of view, {{HashMap}} is better than {{TreeMap}}; a 
 similar optimization was done in HDFS-7433. Of course we need to make a few changes to 
 {{getCorruptReplicaBlockIds}}.
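 A rough sketch of the idea (illustrative only, not the actual CorruptReplicasMap 
 code): keep a {{HashMap}} and sort the block IDs on demand in the one 
 test-only accessor.
 {code}
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;

 // Illustrative sketch: HashMap on the hot path, sorting only on demand.
 public class CorruptReplicasSketch {
   private final Map<Long, Set<String>> corruptReplicasMap = new HashMap<>();

   long[] getCorruptReplicaBlockIds() {
     // ordering is only needed here, so sort the keys on the fly
     List<Long> ids = new ArrayList<>(corruptReplicasMap.keySet());
     Collections.sort(ids);
     long[] result = new long[ids.size()];
     for (int i = 0; i < ids.size(); i++) {
       result[i] = ids.get(i);
     }
     return result;
   }

   public static void main(String[] args) {
     System.out.println(new CorruptReplicasSketch().getCorruptReplicaBlockIds().length);
   }
 }
 {code}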



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8495) Consolidate append() related implementation into a single class

2015-07-20 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8495:
---
Component/s: (was: namenode)

 Consolidate append() related implementation into a single class
 ---

 Key: HDFS-8495
 URL: https://issues.apache.org/jira/browse/HDFS-8495
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8495-000.patch, HDFS-8495-001.patch, 
 HDFS-8495-002.patch, HDFS-8495-003.patch, HDFS-8495-003.patch, 
 HDFS-8495-004.patch, HDFS-8495-005.patch, HDFS-8495-006.patch


 This jira proposes to consolidate {{FSNamesystem#append()}} related methods 
 into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8495) Consolidate append() related implementation into a single class

2015-07-20 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8495:
---
Component/s: namenode

 Consolidate append() related implementation into a single class
 ---

 Key: HDFS-8495
 URL: https://issues.apache.org/jira/browse/HDFS-8495
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8495-000.patch, HDFS-8495-001.patch, 
 HDFS-8495-002.patch, HDFS-8495-003.patch, HDFS-8495-003.patch, 
 HDFS-8495-004.patch, HDFS-8495-005.patch, HDFS-8495-006.patch


 This jira proposes to consolidate {{FSNamesystem#append()}} related methods 
 into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8562) HDFS Performance is impacted by FileInputStream Finalizer

2015-07-20 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634412#comment-14634412
 ] 

mingleizhang commented on HDFS-8562:


Thank you, Yanping. I'm working on solving these problems these days.

 HDFS Performance is impacted by FileInputStream Finalizer
 -

 Key: HDFS-8562
 URL: https://issues.apache.org/jira/browse/HDFS-8562
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, HDFS, performance
Affects Versions: 2.5.0
 Environment: Impact any application that uses HDFS
Reporter: Yanping Wang
Assignee: mingleizhang

 While running HBase on HDFS datanodes, we noticed excessively high GC 
 pause spikes. For example, with jdk8 update 40 and the G1 collector, we saw 
 datanode GC pauses spike toward 160 milliseconds while they should be around 
 20 milliseconds. 
 We tracked down to GC logs and found those long GC pauses were devoted to 
 process high number of final references. 
 For example, this Young GC:
 2715.501: [GC pause (G1 Evacuation Pause) (young) 0.1529017 secs]
 2715.572: [SoftReference, 0 refs, 0.0001034 secs]
 2715.572: [WeakReference, 0 refs, 0.123 secs]
 2715.572: [FinalReference, 8292 refs, 0.0748194 secs]
 2715.647: [PhantomReference, 0 refs, 160 refs, 0.0001333 secs]
 2715.647: [JNI Weak Reference, 0.140 secs]
 [Ref Proc: 122.3 ms]
 [Eden: 910.0M(910.0M)->0.0B(911.0M) Survivors: 11.0M->10.0M Heap: 
 951.1M(1536.0M)->40.2M(1536.0M)]
 [Times: user=0.47 sys=0.01, real=0.15 secs]
 This young GC took a 152.9 millisecond STW pause and spent 122.3 
 milliseconds in Ref Proc, which processed 8292 FinalReferences in 74.8 
 milliseconds plus some overhead.
 We used JFR and JMAP with Memory Analyzer to track this down and found those 
 FinalReferences were all from FileInputStream.  We checked the HDFS code and saw 
 the use of FileInputStream in the datanode:
 https://apache.googlesource.com/hadoop-common/+/refs/heads/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java
 {code}
 public static MappableBlock load(long length,
     FileInputStream blockIn, FileInputStream metaIn,
     String blockFileName) throws IOException {
   MappableBlock mappableBlock = null;
   MappedByteBuffer mmap = null;
   FileChannel blockChannel = null;
   try {
     blockChannel = blockIn.getChannel();
     if (blockChannel == null) {
       throw new IOException("Block InputStream has no FileChannel.");
     }
     mmap = blockChannel.map(MapMode.READ_ONLY, 0, length);
     NativeIO.POSIX.getCacheManipulator().mlock(blockFileName, mmap, length);
     verifyChecksum(length, metaIn, blockChannel, blockFileName);
     mappableBlock = new MappableBlock(mmap, length);
   } finally {
     IOUtils.closeQuietly(blockChannel);
     if (mappableBlock == null) {
       if (mmap != null) {
         NativeIO.POSIX.munmap(mmap); // unmapping also unlocks
       }
     }
   }
   return mappableBlock;
 }
 {code}
 We looked up 
 https://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html  and
 http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/23bdcede4e39/src/share/classes/java/io/FileInputStream.java
  and noticed FileInputStream relies on the Finalizer to release its resource. 
 When an instance of a class that has a finalizer is created, an entry for that 
 instance is put on a queue in the JVM, so the JVM knows it has a finalizer that 
 needs to be executed.
 The current issue is: even when programmers do call close() after using 
 FileInputStream, its finalize() method will still be called. In other words, we 
 still get the side effect of the FinalReference being registered at 
 FileInputStream allocation time, and also the reference processing needed to 
 reclaim the FinalReference during GC (any GC solution has to deal with this). 
 We can imagine that when running an industrial deployment of HDFS, millions of 
 files could be opened and closed, resulting in a very large number of finalizers 
 being registered and subsequently executed. That can cause very long 
 GC pause times.
 We tried to use Files.newInputStream() to replace FileInputStream, but it was 
 clear we could not replace FileInputStream in 
 hdfs/server/datanode/fsdataset/impl/MappableBlock.java 
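 For illustration, a minimal sketch (not HDFS code, and not directly applicable to 
 MappableBlock, which is handed a FileInputStream by its callers) of how NIO opens 
 the same resources without a finalizer:
 {code}
 // Sketch only: FileChannel.open and Files.newInputStream return objects
 // without a finalize() method (on JDK 7/8), unlike new FileInputStream(f),
 // so no FinalReference is queued for the GC to process.
 import java.io.IOException;
 import java.io.InputStream;
 import java.nio.channels.FileChannel;
 import java.nio.file.Files;
 import java.nio.file.Paths;
 import java.nio.file.StandardOpenOption;

 class FinalizerFreeOpen {
   static void open(String blockFileName, String metaFileName) throws IOException {
     try (FileChannel blockChannel = FileChannel.open(
              Paths.get(blockFileName), StandardOpenOption.READ);
          InputStream metaIn = Files.newInputStream(Paths.get(metaFileName))) {
       // blockChannel.map(...) and checksum verification would go here
     }
   }
 }
 {code}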
 We notified the Oracle JVM team of this performance issue, which impacts all Big 
 Data applications using HDFS. We recommended the proper fix be made in Java SE 
 FileInputStream, because (1) there is really nothing wrong with using 
 FileInputStream in the above datanode code, and (2) as the object with a finalizer 
 is registered with the finalizer list within the JVM at object allocation time, if 
 someone makes an explicit call to close or free the resources that are to be 
 done in the finalizer, then 

[jira] [Commented] (HDFS-7483) Display information per tier on the Namenode UI

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634419#comment-14634419
 ] 

Hadoop QA commented on HDFS-7483:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |   6m  0s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 53s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 36s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 41s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   1m  9s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 168m 44s | Tests failed in hadoop-hdfs. |
| | | 189m 30s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.TestDistributedFileSystem |
| Timed out tests | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746206/HDFS-7483.003.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / e4f7562 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11757/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11757/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11757/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11757/console |


This message was automatically generated.

 Display information per tier on the Namenode UI
 ---

 Key: HDFS-7483
 URL: https://issues.apache.org/jira/browse/HDFS-7483
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HDFS-7483-001.patch, HDFS-7483-002.patch, 
 HDFS-7483.003.patch, overview.png, storagetypes.png, 
 storagetypes_withnostorage.png, withOneStorageType.png, withTwoStorageType.png


 If cluster has different types of storage, it is useful to display the 
 storage information per type. 
 The information will be available via JMX (HDFS-7390)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8794) Improve CorruptReplicasMap#corruptReplicasMap

2015-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634433#comment-14634433
 ] 

Hudson commented on HDFS-8794:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8188 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8188/])
HDFS-8794. Improve CorruptReplicasMap#corruptReplicasMap. (yliu) (yliu: rev 
d6d58606b8adf94b208aed5fc2d054b9dd081db1)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CorruptReplicasMap.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestCorruptReplicaInfo.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Improve CorruptReplicasMap#corruptReplicasMap
 -

 Key: HDFS-8794
 URL: https://issues.apache.org/jira/browse/HDFS-8794
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Yi Liu
Assignee: Yi Liu
 Fix For: 2.8.0

 Attachments: HDFS-8794.001.patch, HDFS-8794.002.patch


 Currently we use {{TreeMap}} for {{corruptReplicasMap}}; the only place that 
 needs sorted order is {{getCorruptReplicaBlockIds}}, which is only used by tests.
 So we can use {{HashMap}}.
 From a memory and performance point of view, {{HashMap}} is better than 
 {{TreeMap}}; a similar optimization was done in HDFS-7433. Of course we need to 
 make a few changes to {{getCorruptReplicaBlockIds}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-07-20 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-6407:
-
Priority: Minor  (was: Critical)

 new namenode UI, lost ability to sort columns in datanode tab
 -

 Key: HDFS-6407
 URL: https://issues.apache.org/jira/browse/HDFS-6407
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Nathan Roberts
Assignee: Benoy Antony
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: 002-datanodes-sorted-capacityUsed.png, 
 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
 HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.patch, 
 browse_directory.png, datanodes.png, snapshots.png


 The old UI supported clicking on a column header to sort on that column. The new 
 UI seems to have dropped this very useful feature.
 There are a few tables in the Namenode UI to display datanode information, 
 directory listings and snapshots.
 When there are many items in the tables, it is useful to have the ability to 
 sort on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7582) Enforce maximum number of ACL entries separately per access and default.

2015-07-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634219#comment-14634219
 ] 

Chris Nauroth commented on HDFS-7582:
-

Hi [~vinayrpet].  Thank you again for your patience.  The patch looks good.  I 
found just one thing that needs to be corrected.

{code}
if (defaultEntries.size() > MAX_ENTRIES) {
  throw new AclException("Invalid ACL: ACL has " + accessEntries.size()
      + " default entries, which exceeds maximum of " + MAX_ENTRIES + ".");
}
{code}

The text of this exception needs to use {{defaultEntries.size()}} instead of 
{{accessEntries.size()}}.
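For reference, the corrected block would read roughly:

{code}
if (defaultEntries.size() > MAX_ENTRIES) {
  throw new AclException("Invalid ACL: ACL has " + defaultEntries.size()
      + " default entries, which exceeds maximum of " + MAX_ENTRIES + ".");
}
{code}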

 Enforce maximum number of ACL entries separately per access and default.
 

 Key: HDFS-7582
 URL: https://issues.apache.org/jira/browse/HDFS-7582
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Attachments: HDFS-7582-001.patch, HDFS-7582-01.patch


 Current ACL limits are only on the total number of entries.
 But there can be a situation where the number of default entries for a directory 
 is more than half of the maximum entries, i.e. more than 16.
 In such a case, only files can be created under this parent directory; they 
 will have ACLs inherited from the parent's default entries.
 But when directories are created, the total number of entries will be more than 
 the maximum allowed, because sub-directories copy both the inherited ACLs and 
 the default entries.
 Since there is currently no check while copying ACLs from the default ACLs, 
 directory creation succeeds, but any subsequent modification of the same ACL 
 (even of the permission on a single entry) will fail.
 It would be better to enforce the maximum of 32 entries separately per access 
 and default.  This would be consistent with our observations testing ACLs on 
 other file systems, such as XFS and ext3.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7483) Display information per tier on the Namenode UI

2015-07-20 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7483:
-
Attachment: HDFS-7483.003.patch

 Display information per tier on the Namenode UI
 ---

 Key: HDFS-7483
 URL: https://issues.apache.org/jira/browse/HDFS-7483
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HDFS-7483-001.patch, HDFS-7483-002.patch, 
 HDFS-7483.003.patch, overview.png, storagetypes.png, 
 storagetypes_withnostorage.png, withOneStorageType.png, withTwoStorageType.png


 If cluster has different types of storage, it is useful to display the 
 storage information per type. 
 The information will be available via JMX (HDFS-7390)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7483) Display information per tier on the Namenode UI

2015-07-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634237#comment-14634237
 ] 

Haohui Mai commented on HDFS-7483:
--

Sorry for the delay.

[~benoyantony], I uploaded the v3 patch to demonstrate the approach. The basic 
idea is to calculate the percentage using JavaScript and leave the templates to 
deal with the formatting only.

 Display information per tier on the Namenode UI
 ---

 Key: HDFS-7483
 URL: https://issues.apache.org/jira/browse/HDFS-7483
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HDFS-7483-001.patch, HDFS-7483-002.patch, 
 HDFS-7483.003.patch, overview.png, storagetypes.png, 
 storagetypes_withnostorage.png, withOneStorageType.png, withTwoStorageType.png


 If cluster has different types of storage, it is useful to display the 
 storage information per type. 
 The information will be available via JMX (HDFS-7390)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8779) WebUI can't display randomly generated block ID

2015-07-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634300#comment-14634300
 ] 

Haohui Mai commented on HDFS-8779:
--

Is it possible to bring in a dedicated library like 
https://github.com/sidorares/json-bigint instead of putting hacks into the JSON 
string? It looks much cleaner.

bq. I'm guessing that the Java WebHdfsFileSystem implementation somehow already 
avoids the JS MAX_SAFE_INTEGER issue...

I don't see why they are related. As pointed out in the description, in Java 
MAX_LONG equals 2^63 - 1 while in JavaScript MAX_SAFE_INTEGER is only 2^53 - 
1. JavaScript can represent numbers that are larger than 2^53 - 1, but most 
likely with a loss of precision.

 WebUI can't display randomly generated block ID
 ---

 Key: HDFS-8779
 URL: https://issues.apache.org/jira/browse/HDFS-8779
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor
 Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, 
 HDFS-8779.03.patch, after-02-patch.png, before.png


 Old releases use randomly generated block IDs (HDFS-4645).
 The max value of a Long in Java is 2^63-1.
 The max safe integer value of a number in Javascript is 2^53-1. ( See 
 [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER])
 This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER.
 An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in Javascript.
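 For illustration, a minimal sketch (not HDFS code) of the precision loss: a block 
 ID above 2^53 - 1 survives as a Java {{long}} but not as a JavaScript number, 
 which is an IEEE-754 double.
 {code}
 // Sketch only: converting a long above 2^53 - 1 to a double mirrors roughly
 // what a JavaScript JSON parser would do to the block ID.
 public class BlockIdPrecision {
   public static void main(String[] args) {
     long blockId = (1L << 60) + 1;         // hypothetical randomly generated ID
     double asJsNumber = (double) blockId;  // what the JS number ends up holding
     System.out.println(blockId);           // 1152921504606846977
     System.out.println((long) asJsNumber); // 1152921504606846976 (last digit lost)
   }
 }
 {code}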



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8800) shutdown has bugs

2015-07-20 Thread John Smith (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Smith updated HDFS-8800:
-
Description: namenode stop creates stack traces and extra gc logs.

 shutdown has bugs
 -

 Key: HDFS-8800
 URL: https://issues.apache.org/jira/browse/HDFS-8800
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: John Smith

 namenode stop creates stack traces and extra gc logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8059) Erasure coding: revisit how to store EC schema and cellSize in NameNode

2015-07-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634323#comment-14634323
 ] 

Andrew Wang commented on HDFS-8059:
---

Hey Jing,

bq. Could you please provide more details about how putting EC schema in 
INodeFile can solve the rename problem...?

The schema will travel with the file when it gets renamed. We can read the 
schema information off the file directly rather than going up to the zone, so 
the zone's schema is only used at write-time.

bq. Can you direct me to this list?

I just refer to the phase 1 umbrella JIRA at HDFS-7285, this subtask probably 
should move over there since it's related to persisting schema info to NN 
on-disk metadata, which I think we should figure out before merging.

 Erasure coding: revisit how to store EC schema and cellSize in NameNode
 ---

 Key: HDFS-8059
 URL: https://issues.apache.org/jira/browse/HDFS-8059
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: HDFS-8059.001.patch


 Move {{dataBlockNum}} and {{parityBlockNum}} from BlockInfoStriped to 
 INodeFile, and store them in {{FileWithStripedBlocksFeature}}.
 Ideally these two numbers are the same for all striped blocks in a file, and 
 storing them in BlockInfoStriped will waste NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8657) Update docs for mSNN

2015-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634340#comment-14634340
 ] 

Hudson commented on HDFS-8657:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8187 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8187/])
HDFS-8657. Update docs for mSNN. Contributed by Jesse Yates. (atm: rev 
ed01dc70b2f4ff4bdcaf71c19acf244da0868a82)
* 
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithNFS.md


 Update docs for mSNN
 

 Key: HDFS-8657
 URL: https://issues.apache.org/jira/browse/HDFS-8657
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 3.0.0

 Attachments: hdfs-8657-v0.patch, hdfs-8657-v1.patch


 After the commit of HDFS-6440, some docs need to be updated to reflect the 
 new support for more than 2 NNs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8306) Generate ACL and Xattr outputs in OIV XML outputs

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634347#comment-14634347
 ] 

Hadoop QA commented on HDFS-8306:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m  2s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 20s | The applied patch generated  1 
new checkstyle issues (total was 61, now 62). |
| {color:green}+1{color} | whitespace |   0m  4s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 21s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 29s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  6s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 161m 37s | Tests failed in hadoop-hdfs. |
| | | 205m  3s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746185/HDFS-8306.008.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e4f7562 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11756/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11756/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11756/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11756/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11756/console |


This message was automatically generated.

 Generate ACL and Xattr outputs in OIV XML outputs
 -

 Key: HDFS-8306
 URL: https://issues.apache.org/jira/browse/HDFS-8306
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 2.7.0
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HDFS-8306.000.patch, HDFS-8306.001.patch, 
 HDFS-8306.002.patch, HDFS-8306.003.patch, HDFS-8306.004.patch, 
 HDFS-8306.005.patch, HDFS-8306.006.patch, HDFS-8306.007.patch, 
 HDFS-8306.008.patch, HDFS-8306.debug0.patch, HDFS-8306.debug1.patch


 Currently, in the {{hdfs oiv}} XML outputs, not all fields of the fsimage are 
 output. This makes inspecting the {{fsimage}} from its XML output less practical. 
 It also prevents recovering an fsimage from the XML file.
 This JIRA is adding ACL and XAttrs in the XML outputs as the first step to 
 achieve the goal described in HDFS-8061.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8797) WebHdfsFileSystem creates too many connections for pread

2015-07-20 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634362#comment-14634362
 ] 

Yi Liu commented on HDFS-8797:
--

Thanks Jing for working on this. I think it's a good approach to override 
{{readFully}} and create a new InputStream for pread. A minor comment: we 
should also override {{int read(long position, ...)}}.

 WebHdfsFileSystem creates too many connections for pread
 

 Key: HDFS-8797
 URL: https://issues.apache.org/jira/browse/HDFS-8797
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8797.000.patch


 While running a test we found that WebHdfsFileSystem can create several 
 thousand connections when doing a position read of a 200MB file. For each 
 connection the client will connect to the DataNode again and the DataNode 
 will create a new DFSClient instance to handle the read request. This also 
 leads to several thousand {{getBlockLocations}} calls to the NameNode.
 The cause of the issue is that in {{FSInputStream#read(long, byte[], int, 
 int)}}, each time the input stream reads some data, it seeks back to the old 
 position and resets its state to SEEK. Thus the next read will regenerate the 
 connection.
 {code}
   public int read(long position, byte[] buffer, int offset, int length)
       throws IOException {
     synchronized (this) {
       long oldPos = getPos();
       int nread = -1;
       try {
         seek(position);
         nread = read(buffer, offset, length);
       } finally {
         seek(oldPos);
       }
       return nread;
     }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8695) OzoneHandler : Add Bucket REST Interface

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634373#comment-14634373
 ] 

Hadoop QA commented on HDFS-8695:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m 57s | Findbugs (version ) appears to 
be broken on HDFS-7240. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |  10m  5s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  12m 58s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 22s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 36s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 46s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 10s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 50s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  76m 37s | Tests failed in hadoop-hdfs. |
| | | 129m  7s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestFileAppend4 |
|   | hadoop.hdfs.TestRead |
|   | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshot |
|   | hadoop.hdfs.server.namenode.TestSecondaryWebUi |
|   | hadoop.hdfs.server.namenode.TestSaveNamespace |
|   | hadoop.hdfs.server.namenode.TestNNStorageRetentionFunctional |
|   | hadoop.hdfs.tools.TestGetGroups |
|   | hadoop.hdfs.server.namenode.TestFavoredNodesEndToEnd |
|   | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions |
|   | hadoop.hdfs.server.datanode.TestRefreshNamenodes |
|   | hadoop.hdfs.TestHdfsAdmin |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.datanode.TestBlockHasMultipleReplicasOnSameDN |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength |
|   | hadoop.hdfs.server.namenode.ha.TestLossyRetryInvocationHandler |
|   | hadoop.hdfs.TestClientReportBadBlock |
|   | hadoop.hdfs.TestSafeMode |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   | hadoop.hdfs.server.namenode.TestAuditLogger |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.tools.TestDebugAdmin |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.TestDatanodeReport |
|   | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |
|   | hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup |
|   | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement |
|   | hadoop.hdfs.server.namenode.TestNameNodeRpcServer |
|   | hadoop.hdfs.TestFileAppendRestart |
|   | hadoop.hdfs.server.namenode.TestSecondaryNameNodeUpgrade |
|   | hadoop.hdfs.TestMultiThreadedHflush |
|   | hadoop.hdfs.TestParallelRead |
|   | hadoop.hdfs.server.namenode.snapshot.TestSetQuotaWithSnapshot |
|   | hadoop.hdfs.server.namenode.ha.TestHAFsck |
|   | hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots |
|   | hadoop.hdfs.tools.TestStoragePolicyCommands |
|   | hadoop.hdfs.protocol.TestBlockListAsLongs |
|   | hadoop.hdfs.server.namenode.TestBlockPlacementPolicyRackFaultTolerant |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
|   | hadoop.hdfs.server.namenode.TestHDFSConcat |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.TestRefreshCallQueue |
|   | hadoop.hdfs.TestSetTimes |
|   | hadoop.hdfs.TestListFilesInDFS |
|   | hadoop.hdfs.server.namenode.TestAddBlock |
|   | hadoop.hdfs.server.namenode.TestMalformedURLs |
|   | hadoop.hdfs.server.datanode.TestDnRespectsBlockReportSplitThreshold |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.hdfs.server.namenode.TestNameEditsConfigs |
|   | hadoop.hdfs.TestMiniDFSCluster |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.mover.TestMover |
|   | hadoop.security.TestPermissionSymlinks |
|   | hadoop.hdfs.TestDFSRollback |
|   | 

[jira] [Commented] (HDFS-6300) Prevent multiple balancers from running simultaneously

2015-07-20 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634192#comment-14634192
 ] 

Aaron T. Myers commented on HDFS-6300:
--

Given these recent fixes, do we think that HDFS-4505 is now obsolete and should 
therefore be closed?

 Prevent multiple balancers from running simultaneously
 --

 Key: HDFS-6300
 URL: https://issues.apache.org/jira/browse/HDFS-6300
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer  mover
Reporter: Rakesh R
Assignee: Rakesh R
Priority: Critical
 Fix For: 2.7.1

 Attachments: HDFS-6300-001.patch, HDFS-6300-002.patch, 
 HDFS-6300-003.patch, HDFS-6300-004.patch, HDFS-6300-005.patch, 
 HDFS-6300-006.patch, HDFS-6300.patch


 The Javadoc of Balancer.java says it will not allow a second balancer to run if 
 the first one is in progress. But I've noticed that multiple balancers can run 
 together, and the balancer.id implementation does not safeguard against this.
 {code}
  * <li>Another balancer is running. Exiting...
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8059) Erasure coding: revisit how to store EC schema and cellSize in NameNode

2015-07-20 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634198#comment-14634198
 ] 

Jing Zhao commented on HDFS-8059:
-

bq. Breaking rename as I stated above is a huge limitation.

Could you please provide more details about how putting the EC schema in INodeFile 
can solve the rename problem (assuming the rename is across two EC zones with 
different EC schemas)? Please note that with the wrong schema an EC file cannot be 
correctly read.

bq. This JIRA is also on the shortlist of remaining issues for the EC branch

Can you direct me to this list?

 Erasure coding: revisit how to store EC schema and cellSize in NameNode
 ---

 Key: HDFS-8059
 URL: https://issues.apache.org/jira/browse/HDFS-8059
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: HDFS-8059.001.patch


 Move {{dataBlockNum}} and {{parityBlockNum}} from BlockInfoStriped to 
 INodeFile, and store them in {{FileWithStripedBlocksFeature}}.
 Ideally these two numbers are the same for all striped blocks in a file, and 
 storing them in BlockInfoStriped will waste NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8695) OzoneHandler : Add Bucket REST Interface

2015-07-20 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-8695:
---
Status: Patch Available  (was: Open)

 OzoneHandler : Add Bucket REST Interface
 

 Key: HDFS-8695
 URL: https://issues.apache.org/jira/browse/HDFS-8695
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Reporter: Anu Engineer
Assignee: Anu Engineer
 Attachments: hdfs-8695-HDFS-7240.001.patch


 Add Bucket REST interface into Ozone server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8760) Erasure Coding: reuse BlockReader when reading the same block in pread

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634269#comment-14634269
 ] 

Hadoop QA commented on HDFS-8760:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m  3s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |  11m 57s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  12m 45s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 16s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 53s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 59s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   4m 26s | The patch appears to introduce 5 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 57s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  82m 13s | Tests failed in hadoop-hdfs. |
| | | 139m 16s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | 
hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots |
|   | hadoop.hdfs.server.namenode.TestBlockPlacementPolicyRackFaultTolarent |
|   | hadoop.hdfs.server.namenode.TestFileLimit |
|   | hadoop.hdfs.server.namenode.snapshot.TestFileContextSnapshot |
|   | hadoop.hdfs.server.namenode.TestEditLogAutoroll |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   | hadoop.hdfs.server.namenode.TestHostsFiles |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy |
|   | hadoop.hdfs.server.namenode.TestFileContextAcl |
|   | hadoop.hdfs.server.namenode.TestClusterId |
|   | hadoop.hdfs.server.namenode.TestDeleteRace |
|   | hadoop.hdfs.server.namenode.TestFSDirectory |
|   | hadoop.hdfs.server.namenode.TestLeaseManager |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotListing |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation |
|   | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
|   | hadoop.hdfs.server.namenode.ha.TestGetGroupsWithHA |
|   | hadoop.hdfs.server.namenode.TestSecondaryWebUi |
|   | hadoop.hdfs.server.namenode.TestMalformedURLs |
|   | hadoop.hdfs.server.namenode.TestAuditLogger |
|   | hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |
|   | hadoop.hdfs.server.namenode.TestHDFSConcat |
|   | hadoop.hdfs.server.namenode.TestAddBlockRetry |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandby |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeProtocol |
|   | hadoop.hdfs.server.namenode.ha.TestHAMetrics |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap |
|   | hadoop.hdfs.server.namenode.TestFsLimits |
|   | hadoop.hdfs.server.namenode.TestNNStorageRetentionFunctional |
|   | hadoop.hdfs.server.datanode.TestHSync |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter |
|   | hadoop.hdfs.server.namenode.TestNameNodeRpcServer |
|   | hadoop.hdfs.server.namenode.TestFileContextXAttr |
|   | hadoop.hdfs.server.namenode.TestAclConfigFlag |
|   | hadoop.hdfs.server.namenode.TestFSImageWithXAttr |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.hdfs.server.namenode.snapshot.TestXAttrWithSnapshot |
|   | hadoop.hdfs.server.namenode.TestFSImageWithAcl |
|   | hadoop.hdfs.server.namenode.TestQuotaWithStripedBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyBlockManagement |
|   | hadoop.hdfs.server.namenode.TestListCorruptFileBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.namenode.TestParallelImageWrite |
|   | hadoop.hdfs.server.namenode.snapshot.TestAclWithSnapshot |
|   | hadoop.hdfs.server.namenode.TestNameNodeResourceChecker |
|   | hadoop.hdfs.server.namenode.TestGenericJournalConf |
|   | hadoop.hdfs.server.namenode.TestEditLogJournalFailures |
|   | 

[jira] [Commented] (HDFS-8748) ACL permission check does not union groups to determine effective permissions

2015-07-20 Thread Scott Opell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634276#comment-14634276
 ] 

Scott Opell commented on HDFS-8748:
---

Hi Chris.

I was actually just able to get my Hadoop environment set up and got a patch 
together.
However, in my explorations, I found that while the behavior outlined in the 
design document is mentioned in the POSIX spec, it's not actually what they 
decided to go with. Check out page 272 of the PDF here for more details.
http://users.suse.com/~agruen/acl/posix/Posix_1003.1e-990310.pdf

So basically, the code matches POSIX and the design document doesn't.
I submitted my patch, which should work if the project decides to go with the 
non-POSIX behavior.

I guess your OS doesn't fully comply with the 1003.1e draft.
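
For reference, a minimal sketch of the unioning behavior the design document 
describes (illustrative only, not the submitted patch; {{namedGroupEntries}} is a 
hypothetical collection of the ACL's named group entries):

{code}
// Sketch only: union every matching group entry instead of returning on the
// first group that grants access.
FsAction effective = FsAction.NONE;
if (getGroups().contains(inode.getGroupName())) {
  effective = effective.or(mode.getGroupAction());
}
for (AclEntry entry : namedGroupEntries) {  // hypothetical collection
  if (getGroups().contains(entry.getName())) {
    effective = effective.or(
        entry.getPermission().and(mode.getGroupAction()));
  }
}
if (effective.implies(access)) {
  return;
}
{code}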

 ACL permission check does not union groups to determine effective permissions
 -

 Key: HDFS-8748
 URL: https://issues.apache.org/jira/browse/HDFS-8748
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1
Reporter: Scott Opell
  Labels: acl, permission

 In the ACL permission checking routine, the implemented named group section 
 does not match the design document.
 In the design document, it's shown in the pseudo-code that if the requester is 
 not the owner or a named user, then the applicable groups are unioned 
 together to form effective permissions for the requester.
 Instead, the current implementation will search for the first group that 
 grants access and will use that. It will not union the permissions together.
 Here is the design document's description of the desired behavior
 {quote}
 If the user is a member of the file's group or at least one group for which 
 there is a
 named group entry in the ACL, then effective permissions are calculated from 
 groups.
 This is the union of the file group permissions (if the user is a member of 
 the file group)
 and all named group entries matching the user's groups. For example, consider 
 a user
 that is a member of 2 groups: sales and execs. The user is not the file 
 owner, and the
 ACL contains no named user entries. The ACL contains named group entries for 
 both
 groups as follows: group:sales:r\-\-, group:execs:\-w\-. In this case, 
 the user's effective
 permissions are rw-.
 {quote}
  
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 10??
 The design document's algorithm matches that description:
 *Design Document Algorithm*
 {code:title=DesignDocument}
 if (user == fileOwner) {
 effectivePermissions = aclEntries.getOwnerPermissions()
 } else if (user ∈ aclEntries.getNamedUsers()) {
 effectivePermissions = aclEntries.getNamedUserPermissions(user)
 } else if (userGroupsInAcl != ∅) {
 effectivePermissions = ∅
 if (fileGroup ∈ userGroupsInAcl) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getGroupPermissions()
 }
 for ({group | group ∈ userGroupsInAcl}) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getNamedGroupPermissions(group)
 }
 } else {
 effectivePermissions = aclEntries.getOthersPermissions()
 }
 {code}
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 9??
 The current implementation does NOT match the description.
 *Current Trunk*
 {code:title=FSPermissionChecker.java}
 // Use owner entry from permission bits if user is owner.
 if (getUser().equals(inode.getUserName())) {
   if (mode.getUserAction().implies(access)) {
 return;
   }
   foundMatch = true;
 }
 // Check named user and group entries if user was not denied by owner 
 entry.
 if (!foundMatch) {
   for (int pos = 0, entry; pos < aclFeature.getEntriesSize(); pos++) {
 entry = aclFeature.getEntryAt(pos);
 if (AclEntryStatusFormat.getScope(entry) == AclEntryScope.DEFAULT) {
   break;
 }
 AclEntryType type = AclEntryStatusFormat.getType(entry);
 String name = AclEntryStatusFormat.getName(entry);
 if (type == AclEntryType.USER) {
   // Use named user entry with mask from permission bits applied if 
 user
   // matches name.
   if (getUser().equals(name)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());
 if (masked.implies(access)) {
   return;
 }
 foundMatch = true;
 break;
   }
 } else if (type == AclEntryType.GROUP) {
   // Use group entry (unnamed or named) with mask from permission bits
   // applied if user is a member and entry grants access.  If user is 
 a
   // 

[jira] [Updated] (HDFS-8657) Update docs for mSNN

2015-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-8657:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

I've just committed this change to trunk.

Thanks very much for the contribution, Jesse.

 Update docs for mSNN
 

 Key: HDFS-8657
 URL: https://issues.apache.org/jira/browse/HDFS-8657
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Fix For: 3.0.0

 Attachments: hdfs-8657-v0.patch, hdfs-8657-v1.patch


 After the commit of HDFS-6440, some docs need to be updated to reflect the 
 new support for more than 2 NNs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8748) ACL permission check does not union groups to determine effective permissions

2015-07-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634288#comment-14634288
 ] 

Chris Nauroth commented on HDFS-8748:
-

Thank you for the pointer to 1003.1e.  That's a very interesting find.  The 
high-level goal always has been POSIX adherence, which makes me inclined to 
leave the current code as is.

 ACL permission check does not union groups to determine effective permissions
 -

 Key: HDFS-8748
 URL: https://issues.apache.org/jira/browse/HDFS-8748
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1
Reporter: Scott Opell
  Labels: acl, permission
 Attachments: HDFS_8748.patch


 In the ACL permission checking routine, the implemented named group section 
 does not match the design document.
 In the design document, it's shown in the pseudo-code that if the requester is 
 not the owner or a named user, then the applicable groups are unioned 
 together to form effective permissions for the requester.
 Instead, the current implementation will search for the first group that 
 grants access and will use that. It will not union the permissions together.
 Here is the design document's description of the desired behavior
 {quote}
 If the user is a member of the file's group or at least one group for which 
 there is a
 named group entry in the ACL, then effective permissions are calculated from 
 groups.
 This is the union of the file group permissions (if the user is a member of 
 the file group)
 and all named group entries matching the user's groups. For example, consider 
 a user
 that is a member of 2 groups: sales and execs. The user is not the file 
 owner, and the
 ACL contains no named user entries. The ACL contains named group entries for 
 both
 groups as follows: group:sales:r\-\-, group:execs:\-w\-. In this case, 
 the user's effective
 permissions are rw-.
 {quote}
  
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 10??
 The design document's algorithm matches that description:
 *Design Document Algorithm*
 {code:title=DesignDocument}
 if (user == fileOwner) {
 effectivePermissions = aclEntries.getOwnerPermissions()
 } else if (user ∈ aclEntries.getNamedUsers()) {
 effectivePermissions = aclEntries.getNamedUserPermissions(user)
 } else if (userGroupsInAcl != ∅) {
 effectivePermissions = ∅
 if (fileGroup ∈ userGroupsInAcl) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getGroupPermissions()
 }
 for ({group | group ∈ userGroupsInAcl}) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getNamedGroupPermissions(group)
 }
 } else {
 effectivePermissions = aclEntries.getOthersPermissions()
 }
 {code}
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 9??
 The current implementation does NOT match the description.
 *Current Trunk*
 {code:title=FSPermissionChecker.java}
 // Use owner entry from permission bits if user is owner.
 if (getUser().equals(inode.getUserName())) {
   if (mode.getUserAction().implies(access)) {
 return;
   }
   foundMatch = true;
 }
 // Check named user and group entries if user was not denied by owner 
 entry.
 if (!foundMatch) {
   for (int pos = 0, entry; pos < aclFeature.getEntriesSize(); pos++) {
 entry = aclFeature.getEntryAt(pos);
 if (AclEntryStatusFormat.getScope(entry) == AclEntryScope.DEFAULT) {
   break;
 }
 AclEntryType type = AclEntryStatusFormat.getType(entry);
 String name = AclEntryStatusFormat.getName(entry);
 if (type == AclEntryType.USER) {
   // Use named user entry with mask from permission bits applied if 
 user
   // matches name.
   if (getUser().equals(name)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());
 if (masked.implies(access)) {
   return;
 }
 foundMatch = true;
 break;
   }
 } else if (type == AclEntryType.GROUP) {
   // Use group entry (unnamed or named) with mask from permission bits
   // applied if user is a member and entry grants access.  If user is 
 a
   // member of multiple groups that have entries that grant access, 
 then
   // it doesn't matter which is chosen, so exit early after first 
 match.
   String group = name == null ? inode.getGroupName() : name;
   if (getGroups().contains(group)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());

[jira] [Updated] (HDFS-8748) ACL permission check does not union groups to determine effective permissions

2015-07-20 Thread Scott Opell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Opell updated HDFS-8748:
--
Flags: Patch

 ACL permission check does not union groups to determine effective permissions
 -

 Key: HDFS-8748
 URL: https://issues.apache.org/jira/browse/HDFS-8748
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1
Reporter: Scott Opell
  Labels: acl, permission
 Attachments: HDFS_8748.patch


 In the ACL permission checking routine, the implemented named group section 
 does not match the design document.
 In the design document, it's shown in the pseudo-code that if the requester is 
 not the owner or a named user, then the applicable groups are unioned 
 together to form effective permissions for the requester.
 Instead, the current implementation will search for the first group that 
 grants access and will use that. It will not union the permissions together.
 Here is the design document's description of the desired behavior
 {quote}
 If the user is a member of the file's group or at least one group for which 
 there is a
 named group entry in the ACL, then effective permissions are calculated from 
 groups.
 This is the union of the file group permissions (if the user is a member of 
 the file group)
 and all named group entries matching the user's groups. For example, consider 
 a user
 that is a member of 2 groups: sales and execs. The user is not the file 
 owner, and the
 ACL contains no named user entries. The ACL contains named group entries for 
 both
 groups as follows: group:sales:r\-\-, group:execs:\-w\-. In this case, 
 the user's effective
 permissions are rw-.
 {quote}
  
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 10??
 The design document's algorithm matches that description:
 *Design Document Algorithm*
 {code:title=DesignDocument}
 if (user == fileOwner) {
 effectivePermissions = aclEntries.getOwnerPermissions()
 } else if (user ∈ aclEntries.getNamedUsers()) {
 effectivePermissions = aclEntries.getNamedUserPermissions(user)
 } else if (userGroupsInAcl != ∅) {
 effectivePermissions = ∅
 if (fileGroup ∈ userGroupsInAcl) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getGroupPermissions()
 }
 for ({group | group ∈ userGroupsInAcl}) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getNamedGroupPermissions(group)
 }
 } else {
 effectivePermissions = aclEntries.getOthersPermissions()
 }
 {code}
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 9??
 The current implementation does NOT match the description.
 *Current Trunk*
 {code:title=FSPermissionChecker.java}
 // Use owner entry from permission bits if user is owner.
 if (getUser().equals(inode.getUserName())) {
   if (mode.getUserAction().implies(access)) {
 return;
   }
   foundMatch = true;
 }
 // Check named user and group entries if user was not denied by owner 
 entry.
 if (!foundMatch) {
   for (int pos = 0, entry; pos < aclFeature.getEntriesSize(); pos++) {
 entry = aclFeature.getEntryAt(pos);
 if (AclEntryStatusFormat.getScope(entry) == AclEntryScope.DEFAULT) {
   break;
 }
 AclEntryType type = AclEntryStatusFormat.getType(entry);
 String name = AclEntryStatusFormat.getName(entry);
 if (type == AclEntryType.USER) {
   // Use named user entry with mask from permission bits applied if 
 user
   // matches name.
   if (getUser().equals(name)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());
 if (masked.implies(access)) {
   return;
 }
 foundMatch = true;
 break;
   }
 } else if (type == AclEntryType.GROUP) {
   // Use group entry (unnamed or named) with mask from permission bits
   // applied if user is a member and entry grants access.  If user is 
 a
   // member of multiple groups that have entries that grant access, 
 then
   // it doesn't matter which is chosen, so exit early after first 
 match.
   String group = name == null ? inode.getGroupName() : name;
   if (getGroups().contains(group)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());
 if (masked.implies(access)) {
   return;
 }
 foundMatch = true;
   }
 }
   }
 }
 {code}
 As seen in the GROUP section, the permissions check will succeed if and 

[jira] [Updated] (HDFS-8797) WebHdfsFileSystem creates too many connections for pread

2015-07-20 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8797:

Attachment: HDFS-8797.000.patch

One possible way to fix this is to override the {{readFully}} method in 
{{ByteRangeInputStream}} so that it uses a newly created InputStream and we do 
not need to touch the internal state. Uploaded a patch to demo the idea.
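
A rough sketch of that idea ({{openInputStream}} is a hypothetical helper that 
opens a fresh range-scoped stream; this is not the actual patch):

{code}
@Override
public void readFully(long position, byte[] buffer, int offset, int length)
    throws IOException {
  // Sketch only: read the whole range from a one-off stream so the shared
  // stream's position and state are left untouched.
  try (InputStream in = openInputStream(position, length)) {  // hypothetical helper
    int n = 0;
    while (n < length) {
      int nread = in.read(buffer, offset + n, length - n);
      if (nread < 0) {
        throw new EOFException("Reached EOF before reading fully");
      }
      n += nread;
    }
  }
}
{code}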

 WebHdfsFileSystem creates too many connections for pread
 

 Key: HDFS-8797
 URL: https://issues.apache.org/jira/browse/HDFS-8797
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8797.000.patch


 While running a test we found that WebHdfsFileSystem can create several 
 thousand connections when doing a position read of a 200MB file. For each 
 connection the client will connect to the DataNode again and the DataNode 
 will create a new DFSClient instance to handle the read request. This also 
 leads to several thousand {{getBlockLocations}} calls to the NameNode.
 The cause of the issue is that in {{FSInputStream#read(long, byte[], int, 
 int)}}, each time the input stream reads some data, it seeks back to the old 
 position and resets its state to SEEK. Thus the next read will regenerate the 
 connection.
 {code}
   public int read(long position, byte[] buffer, int offset, int length)
       throws IOException {
     synchronized (this) {
       long oldPos = getPos();
       int nread = -1;
       try {
         seek(position);
         nread = read(buffer, offset, length);
       } finally {
         seek(oldPos);
       }
       return nread;
     }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8800) shutdown has bugs

2015-07-20 Thread John Smith (JIRA)
John Smith created HDFS-8800:


 Summary: shutdown has bugs
 Key: HDFS-8800
 URL: https://issues.apache.org/jira/browse/HDFS-8800
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: John Smith






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8797) WebHdfsFileSystem creates too many connections for pread

2015-07-20 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8797:

Status: Patch Available  (was: Open)

 WebHdfsFileSystem creates too many connections for pread
 

 Key: HDFS-8797
 URL: https://issues.apache.org/jira/browse/HDFS-8797
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8797.000.patch


 While running a test we found that WebHdfsFileSystem can create several 
 thousand connections when doing a position read of a 200MB file. For each 
 connection the client will connect to the DataNode again and the DataNode 
 will create a new DFSClient instance to handle the read request. This also 
 leads to several thousand {{getBlockLocations}} calls to the NameNode.
 The cause of the issue is that in {{FSInputStream#read(long, byte[], int, 
 int)}}, each time the input stream reads some data, it seeks back to the old 
 position and resets its state to SEEK. Thus the next read will regenerate the 
 connection.
 {code}
   public int read(long position, byte[] buffer, int offset, int length)
       throws IOException {
     synchronized (this) {
       long oldPos = getPos();
       int nread = -1;
       try {
         seek(position);
         nread = read(buffer, offset, length);
       } finally {
         seek(oldPos);
       }
       return nread;
     }
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634302#comment-14634302
 ] 

Hadoop QA commented on HDFS-8344:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m  1s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 39s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 20s | The applied patch generated  5 
new checkstyle issues (total was 854, now 855). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 35s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  4s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 161m 17s | Tests failed in hadoop-hdfs. |
| | | 204m 48s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.TestDistributedFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746170/HDFS-8344.07.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 98c2bc8 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11754/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11754/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11754/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11754/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11754/console |


This message was automatically generated.

 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 2.8.0

 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, 
 HDFS-8344.06.patch, HDFS-8344.07.patch


 I found another\(?) instance in which the lease is not recovered. This is 
 reproducible easily on a pseudo-distributed single node cluster
 # Before you start, it helps if you set the following limits. This is not 
 necessary, but simply reduces how long you have to wait:
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (It could be less than 1 block, but it is 
 hflushed so some of the data has landed on the datanodes.) (I'm copying the 
 client code I am using; a minimal sketch is shown after these steps. I 
 generate a jar and run it using $ hadoop jar TestHadoop.jar)
 # Client crashes. (I simulate this by kill -9 of the $(hadoop jar 
 TestHadoop.jar) process after it has printed Wrote to the bufferedWriter.)
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1)
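 For reference, a minimal client along these lines reproduces the scenario. It 
 is a sketch only, not the exact TestHadoop code; the path and message are 
 illustrative:
 {code}
 import java.io.BufferedWriter;
 import java.io.OutputStreamWriter;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FSDataOutputStream;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;

 public class TestHadoop {
   public static void main(String[] args) throws Exception {
     FileSystem fs = FileSystem.get(new Configuration());
     FSDataOutputStream out = fs.create(new Path("/tmp/lease-test"));
     BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(out));
     writer.write("some data, less than one block");
     writer.flush();
     out.hflush();                   // data reaches the datanode, lease stays open
     System.out.println("Wrote to the bufferedWriter");
     Thread.sleep(Long.MAX_VALUE);   // park here so the process can be killed with kill -9
   }
 }
 {code}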
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks

2015-07-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634172#comment-14634172
 ] 

Haohui Mai commented on HDFS-8344:
--

-1. Can you please revert the commit?

I'm concerned with the complexity associated with the commit as well as the 
difficulty for users to choose the right configuration. It's an internal 
implementation detail and should not be exposed to users whenever possible. We 
intentionally keep the soft and hard limits non-configurable to avoid users 
shooting themselves in the foot.

bq. The datanode might be busy and recovery may fail the first time.

That's exactly what the hard limit / lease retries are designed for. Again, 
this is only one internal implementation approach to the solution. The detail 
should not be exposed to the users.

 NameNode doesn't recover lease for files with missing blocks
 

 Key: HDFS-8344
 URL: https://issues.apache.org/jira/browse/HDFS-8344
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.0
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Fix For: 2.8.0

 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, 
 HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, 
 HDFS-8344.06.patch, HDFS-8344.07.patch


 I found another\(?) instance in which the lease is not recovered. This is 
 reproducible easily on a pseudo-distributed single node cluster
 # Before you start, it helps if you set the following limits. This is not 
 necessary, but simply reduces how long you have to wait:
 {code}
   public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000;
   public static final long LEASE_HARDLIMIT_PERIOD = 2 * 
 LEASE_SOFTLIMIT_PERIOD;
 {code}
 # Client starts to write a file. (It could be less than 1 block, but it is 
 hflushed so some of the data has landed on the datanodes.) (I'm copying the 
 client code I am using. I generate a jar and run it using $ hadoop jar 
 TestHadoop.jar)
 # Client crashes. (I simulate this by kill -9 of the $(hadoop jar 
 TestHadoop.jar) process after it has printed Wrote to the bufferedWriter.)
 # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was 
 only 1)
 I believe the lease should be recovered and the block should be marked 
 missing. However this is not happening. The lease is never recovered.
 The effect of this bug for us was that nodes could not be decommissioned 
 cleanly. Although we knew that the client had crashed, the Namenode never 
 released the leases (even after restarting the Namenode) (even months 
 afterwards). There are actually several other cases too where we don't 
 consider what happens if ALL the datanodes die while the file is being 
 written, but I am going to punt on that for another time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2015-07-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634178#comment-14634178
 ] 

Haohui Mai commented on HDFS-6407:
--

Though it's nice to fix, it is not core HDFS functionality. Changing 
the priority back to Minor. Please feel free to bump the priority if you feel 
differently. Contributions are appreciated.

 new namenode UI, lost ability to sort columns in datanode tab
 -

 Key: HDFS-6407
 URL: https://issues.apache.org/jira/browse/HDFS-6407
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Nathan Roberts
Assignee: Benoy Antony
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: 002-datanodes-sorted-capacityUsed.png, 
 002-datanodes.png, 002-filebrowser.png, 002-snapshots.png, 
 HDFS-6407-002.patch, HDFS-6407-003.patch, HDFS-6407.patch, 
 browse_directory.png, datanodes.png, snapshots.png


 old ui supported clicking on column header to sort on that column. The new ui 
 seems to have dropped this very useful feature.
 There are a few tables in the Namenode UI to display  datanodes information, 
 directory listings and snapshots.
 When there are many items in the tables, it is useful to have ability to sort 
 on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8059) Erasure coding: revisit how to store EC schema and cellSize in NameNode

2015-07-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634181#comment-14634181
 ] 

Andrew Wang commented on HDFS-8059:
---

[~wheat9] if you could also address the other points I made in my above 
comment, that would also add to the discussion. Breaking rename as I stated 
above is a huge limitation.

This JIRA is also on the shortlist of remaining issues for the EC branch, so 
we'd like to make progress on it quickly.

 Erasure coding: revisit how to store EC schema and cellSize in NameNode
 ---

 Key: HDFS-8059
 URL: https://issues.apache.org/jira/browse/HDFS-8059
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Yi Liu
Assignee: Yi Liu
 Attachments: HDFS-8059.001.patch


 Move {{dataBlockNum}} and {{parityBlockNum}} from BlockInfoStriped to 
 INodeFile, and store them in {{FileWithStripedBlocksFeature}}.
 Ideally these two numbers are the same for all striped blocks in a file, so 
 storing them in BlockInfoStriped will waste NN memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8695) OzoneHandler : Add Bucket REST Interface

2015-07-20 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-8695:
---
Attachment: hdfs-8695-HDFS-7240.001.patch

* Adds REST interface for buckets and corresponding handler

 OzoneHandler : Add Bucket REST Interface
 

 Key: HDFS-8695
 URL: https://issues.apache.org/jira/browse/HDFS-8695
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone
Reporter: Anu Engineer
Assignee: Anu Engineer
 Attachments: hdfs-8695-HDFS-7240.001.patch


 Add Bucket REST interface into Ozone server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8748) ACL permission check does not union groups to determine effective permissions

2015-07-20 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-8748:

Assignee: Chris Nauroth

Hello [~scott_o].  Thank you for reporting this.  I can confirm that this is a 
bug.  (The design doc is correct, and the current code has a bug.)

To confirm this, I ran a test case on an ext4 file system with ACLs enabled.  
See below for a transcript of my test case.  Executing a file requires both 
read and execute permissions (r-x).  In my test case, I defined the read 
permission on one named group entry and the execute permission on a second 
named group entry.  My user was able to execute the file.  This proves that on 
ext4, permissions can be defined on separate named group ACL entries, and the 
permission checks will treat the union of those entries as the effective 
permissions.

Scott, are you interested in coding a patch?  If not, then I'll assign this to 
myself for the fix.

{code}
> whoami
cnauroth

> groups
cnauroth sudo testgroup1

> getfacl test_HDFS-8748
# file: test_HDFS-8748
# owner: root
# group: root
user::rwx
group::---
group:sudo:r--
group:testgroup1:--x
mask::r-x
other::---

> ./test_HDFS-8748

> echo $?
0

> sudo setfacl -m group:sudo:r--,group:testgroup1:--- test_HDFS-8748

> ./test_HDFS-8748
-bash: ./test_HDFS-8748: Permission denied

> echo $?
126

> sudo setfacl -m group:sudo:---,group:testgroup1:--x test_HDFS-8748

> ./test_HDFS-8748
bash: ./test_HDFS-8748: Permission denied

> echo $?
126

> sudo setfacl -m group:sudo:r--,group:testgroup1:--x test_HDFS-8748

> ./test_HDFS-8748

> echo $?
0
{code}
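
For HDFS, the fix would presumably change the GROUP branch quoted in the issue 
description so that all matching group entries are unioned before the access 
check, instead of exiting on the first match. A rough sketch of that idea, 
using the names from the trunk snippet in the description (this is a sketch, 
not a reviewed patch):
{code}
// Sketch only: accumulate the union of all matching group entries (unnamed and
// named) and test the union once, rather than returning on the first match.
FsAction unionPerms = FsAction.NONE;
boolean foundGroupEntry = false;
for (int pos = 0; pos < aclFeature.getEntriesSize(); pos++) {
  int entry = aclFeature.getEntryAt(pos);
  if (AclEntryStatusFormat.getScope(entry) == AclEntryScope.DEFAULT) {
    break;
  }
  if (AclEntryStatusFormat.getType(entry) != AclEntryType.GROUP) {
    continue;
  }
  String name = AclEntryStatusFormat.getName(entry);
  String group = name == null ? inode.getGroupName() : name;
  if (getGroups().contains(group)) {
    foundGroupEntry = true;
    unionPerms = unionPerms.or(AclEntryStatusFormat.getPermission(entry));
  }
}
if (foundGroupEntry && unionPerms.and(mode.getGroupAction()).implies(access)) {
  return;
}
{code}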


 ACL permission check does not union groups to determine effective permissions
 -

 Key: HDFS-8748
 URL: https://issues.apache.org/jira/browse/HDFS-8748
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1
Reporter: Scott Opell
Assignee: Chris Nauroth
  Labels: acl, permission

 In the ACL permission checking routine, the implemented named group section 
 does not match the design document.
 In the design document, it's shown in the pseudo-code that if the requester is 
 not the owner or a named user, then the applicable groups are unioned 
 together to form effective permissions for the requester.
 Instead, the current implementation will search for the first group that 
 grants access and will use that. It will not union the permissions together.
 Here is the design document's description of the desired behavior
 {quote}
 If the user is a member of the file's group or at least one group for which 
 there is a
 named group entry in the ACL, then effective permissions are calculated from 
 groups.
 This is the union of the file group permissions (if the user is a member of 
 the file group)
 and all named group entries matching the user's groups. For example, consider 
 a user
 that is a member of 2 groups: sales and execs. The user is not the file 
 owner, and the
 ACL contains no named user entries. The ACL contains named group entries for 
 both
 groups as follows: group:sales:r--, group:execs:-w-. In this case, the user's 
 effective permissions are rw-.
 {quote}
  
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 10??
 The design document's algorithm matches that description:
 *Design Document Algorithm*
 {code:title=DesignDocument}
 if (user == fileOwner) {
 effectivePermissions = aclEntries.getOwnerPermissions()
 } else if (user ∈ aclEntries.getNamedUsers()) {
 effectivePermissions = aclEntries.getNamedUserPermissions(user)
 } else if (userGroupsInAcl != ∅) {
 effectivePermissions = ∅
 if (fileGroup ∈ userGroupsInAcl) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getGroupPermissions()
 }
 for ({group | group ∈ userGroupsInAcl}) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getNamedGroupPermissions(group)
 }
 } else {
 effectivePermissions = aclEntries.getOthersPermissions()
 }
 {code}
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 9??
 The current implementation does NOT match the description.
 *Current Trunk*
 {code:title=FSPermissionChecker.java}
 // Use owner entry from permission bits if user is owner.
 if (getUser().equals(inode.getUserName())) {
   if (mode.getUserAction().implies(access)) {
 return;
   }
   foundMatch = true;
 }
 // Check named user and group entries if user was not denied by owner 
 entry.
 if (!foundMatch) {
    for (int pos = 0, entry; pos < aclFeature.getEntriesSize(); pos++) {
 entry = aclFeature.getEntryAt(pos);
 if (AclEntryStatusFormat.getScope(entry) == AclEntryScope.DEFAULT) {

[jira] [Updated] (HDFS-8748) ACL permission check does not union groups to determine effective permissions

2015-07-20 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-8748:

Assignee: (was: Chris Nauroth)

 ACL permission check does not union groups to determine effective permissions
 -

 Key: HDFS-8748
 URL: https://issues.apache.org/jira/browse/HDFS-8748
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1
Reporter: Scott Opell
  Labels: acl, permission

 In the ACL permission checking routine, the implemented named group section 
 does not match the design document.
 In the design document, it's shown in the pseudo-code that if the requester is 
 not the owner or a named user, then the applicable groups are unioned 
 together to form effective permissions for the requester.
 Instead, the current implementation will search for the first group that 
 grants access and will use that. It will not union the permissions together.
 Here is the design document's description of the desired behavior
 {quote}
 If the user is a member of the file's group or at least one group for which 
 there is a
 named group entry in the ACL, then effective permissions are calculated from 
 groups.
 This is the union of the file group permissions (if the user is a member of 
 the file group)
 and all named group entries matching the user's groups. For example, consider 
 a user
 that is a member of 2 groups: sales and execs. The user is not the file 
 owner, and the
 ACL contains no named user entries. The ACL contains named group entries for 
 both
 groups as follows: group:sales:r--, group:execs:-w-. In this case, the user's 
 effective permissions are rw-.
 {quote}
  
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 10??
 The design document's algorithm matches that description:
 *Design Document Algorithm*
 {code:title=DesignDocument}
 if (user == fileOwner) {
 effectivePermissions = aclEntries.getOwnerPermissions()
 } else if (user ∈ aclEntries.getNamedUsers()) {
 effectivePermissions = aclEntries.getNamedUserPermissions(user)
 } else if (userGroupsInAcl != ∅) {
 effectivePermissions = ∅
 if (fileGroup ∈ userGroupsInAcl) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getGroupPermissions()
 }
 for ({group | group ∈ userGroupsInAcl}) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getNamedGroupPermissions(group)
 }
 } else {
 effectivePermissions = aclEntries.getOthersPermissions()
 }
 {code}
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 9??
 The current implementation does NOT match the description.
 *Current Trunk*
 {code:title=FSPermissionChecker.java}
 // Use owner entry from permission bits if user is owner.
 if (getUser().equals(inode.getUserName())) {
   if (mode.getUserAction().implies(access)) {
 return;
   }
   foundMatch = true;
 }
 // Check named user and group entries if user was not denied by owner 
 entry.
 if (!foundMatch) {
    for (int pos = 0, entry; pos < aclFeature.getEntriesSize(); pos++) {
 entry = aclFeature.getEntryAt(pos);
 if (AclEntryStatusFormat.getScope(entry) == AclEntryScope.DEFAULT) {
   break;
 }
 AclEntryType type = AclEntryStatusFormat.getType(entry);
 String name = AclEntryStatusFormat.getName(entry);
 if (type == AclEntryType.USER) {
   // Use named user entry with mask from permission bits applied if 
 user
   // matches name.
   if (getUser().equals(name)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());
 if (masked.implies(access)) {
   return;
 }
 foundMatch = true;
 break;
   }
 } else if (type == AclEntryType.GROUP) {
   // Use group entry (unnamed or named) with mask from permission bits
   // applied if user is a member and entry grants access.  If user is 
 a
   // member of multiple groups that have entries that grant access, 
 then
   // it doesn't matter which is chosen, so exit early after first 
 match.
   String group = name == null ? inode.getGroupName() : name;
   if (getGroups().contains(group)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());
 if (masked.implies(access)) {
   return;
 }
 foundMatch = true;
   }
 }
   }
 }
 {code}
 As seen in the GROUP section, the permissions check will succeed if and only 
 if a 

[jira] [Commented] (HDFS-7483) Display information per tier on the Namenode UI

2015-07-20 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634243#comment-14634243
 ] 

Benoy Antony commented on HDFS-7483:


Nice. Thanks for the snippet. I like this approach. 
+1 pending jenkins. 


 Display information per tier on the Namenode UI
 ---

 Key: HDFS-7483
 URL: https://issues.apache.org/jira/browse/HDFS-7483
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Benoy Antony
Assignee: Benoy Antony
 Attachments: HDFS-7483-001.patch, HDFS-7483-002.patch, 
 HDFS-7483.003.patch, overview.png, storagetypes.png, 
 storagetypes_withnostorage.png, withOneStorageType.png, withTwoStorageType.png


 If cluster has different types of storage, it is useful to display the 
 storage information per type. 
 The information will be available via JMX (HDFS-7390)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8657) Update docs for mSNN

2015-07-20 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634275#comment-14634275
 ] 

Aaron T. Myers commented on HDFS-8657:
--

+1, latest patch looks good to me. I'm going to commit this momentarily.

 Update docs for mSNN
 

 Key: HDFS-8657
 URL: https://issues.apache.org/jira/browse/HDFS-8657
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Minor
 Attachments: hdfs-8657-v0.patch, hdfs-8657-v1.patch


 After the commit of HDFS-6440, some docs need to be updated to reflect the 
 new support for more than 2 NNs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8748) ACL permission check does not union groups to determine effective permissions

2015-07-20 Thread Scott Opell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Opell updated HDFS-8748:
--
Attachment: HDFS_8748.patch

 ACL permission check does not union groups to determine effective permissions
 -

 Key: HDFS-8748
 URL: https://issues.apache.org/jira/browse/HDFS-8748
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1
Reporter: Scott Opell
  Labels: acl, permission
 Attachments: HDFS_8748.patch


 In the ACL permission checking routine, the implemented named group section 
 does not match the design document.
 In the design document, it's shown in the pseudo-code that if the requester is 
 not the owner or a named user, then the applicable groups are unioned 
 together to form effective permissions for the requester.
 Instead, the current implementation will search for the first group that 
 grants access and will use that. It will not union the permissions together.
 Here is the design document's description of the desired behavior
 {quote}
 If the user is a member of the file's group or at least one group for which 
 there is a
 named group entry in the ACL, then effective permissions are calculated from 
 groups.
 This is the union of the file group permissions (if the user is a member of 
 the file group)
 and all named group entries matching the user's groups. For example, consider 
 a user
 that is a member of 2 groups: sales and execs. The user is not the file 
 owner, and the
 ACL contains no named user entries. The ACL contains named group entries for 
 both
 groups as follows: group:sales:r--, group:execs:-w-. In this case, the user's 
 effective permissions are rw-.
 {quote}
  
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 10??
 The design document's algorithm matches that description:
 *Design Document Algorithm*
 {code:title=DesignDocument}
 if (user == fileOwner) {
 effectivePermissions = aclEntries.getOwnerPermissions()
 } else if (user ∈ aclEntries.getNamedUsers()) {
 effectivePermissions = aclEntries.getNamedUserPermissions(user)
 } else if (userGroupsInAcl != ∅) {
 effectivePermissions = ∅
 if (fileGroup ∈ userGroupsInAcl) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getGroupPermissions()
 }
 for ({group | group ∈ userGroupsInAcl}) {
 effectivePermissions = effectivePermissions ∪
 aclEntries.getNamedGroupPermissions(group)
 }
 } else {
 effectivePermissions = aclEntries.getOthersPermissions()
 }
 {code}
 ??https://issues.apache.org/jira/secure/attachment/12627729/HDFS-ACLs-Design-3.pdf
  page 9??
 The current implementation does NOT match the description.
 *Current Trunk*
 {code:title=FSPermissionChecker.java}
 // Use owner entry from permission bits if user is owner.
 if (getUser().equals(inode.getUserName())) {
   if (mode.getUserAction().implies(access)) {
 return;
   }
   foundMatch = true;
 }
 // Check named user and group entries if user was not denied by owner 
 entry.
 if (!foundMatch) {
    for (int pos = 0, entry; pos < aclFeature.getEntriesSize(); pos++) {
 entry = aclFeature.getEntryAt(pos);
 if (AclEntryStatusFormat.getScope(entry) == AclEntryScope.DEFAULT) {
   break;
 }
 AclEntryType type = AclEntryStatusFormat.getType(entry);
 String name = AclEntryStatusFormat.getName(entry);
 if (type == AclEntryType.USER) {
   // Use named user entry with mask from permission bits applied if 
 user
   // matches name.
   if (getUser().equals(name)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());
 if (masked.implies(access)) {
   return;
 }
 foundMatch = true;
 break;
   }
 } else if (type == AclEntryType.GROUP) {
   // Use group entry (unnamed or named) with mask from permission bits
   // applied if user is a member and entry grants access.  If user is 
 a
   // member of multiple groups that have entries that grant access, 
 then
   // it doesn't matter which is chosen, so exit early after first 
 match.
   String group = name == null ? inode.getGroupName() : name;
   if (getGroups().contains(group)) {
 FsAction masked = AclEntryStatusFormat.getPermission(entry).and(
 mode.getGroupAction());
 if (masked.implies(access)) {
   return;
 }
 foundMatch = true;
   }
 }
   }
 }
 {code}
 As seen in the GROUP section, the permissions check will 

[jira] [Commented] (HDFS-8779) WebUI can't display randomly generated block ID

2015-07-20 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634318#comment-14634318
 ] 

Andrew Wang commented on HDFS-8779:
---

Yea, so like I said WebHdfsFileSystem's JSON parser does not have the 2^53-1 
limitation...that was an aside to my concern about compatibility, which I think 
is accurate.

 WebUI can't display randomly generated block ID
 ---

 Key: HDFS-8779
 URL: https://issues.apache.org/jira/browse/HDFS-8779
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor
 Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, 
 HDFS-8779.03.patch, after-02-patch.png, before.png


 Old releases use randomly generated block IDs (HDFS-4645).
 The max value of a Long in Java is 2^63-1.
 The max value of a number in JavaScript is 2^53-1. (See 
 [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER])
 This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER.
 An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed

2015-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634458#comment-14634458
 ] 

Hudson commented on HDFS-6945:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8189 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8189/])
Move HDFS-6945 to 2.7.2 section in CHANGES.txt. (aajisaka: rev 
a628f675900d2533ddf86fb3d3e601238ecd68c3)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 BlockManager should remove a block from excessReplicateMap and decrement 
 ExcessBlocks metric when the block is removed
 --

 Key: HDFS-6945
 URL: https://issues.apache.org/jira/browse/HDFS-6945
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.5.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Critical
  Labels: metrics
 Fix For: 2.8.0

 Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, 
 HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch


 I'm seeing the ExcessBlocks metric increase to more than 300K in some clusters, 
 however, there are no over-replicated blocks (confirmed by fsck).
 After further research, I noticed that when deleting a block, BlockManager does 
 not remove the block from excessReplicateMap or decrement excessBlocksCount.
 Usually the metric is decremented when processing a block report, however, if 
 the block has already been deleted, BlockManager does not remove the block from 
 excessReplicateMap or decrement the metric.
 That way the metric and excessReplicateMap can grow without bound (i.e. a 
 memory leak can occur).
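 A hypothetical sketch of the kind of bookkeeping the fix needs (the field names 
 follow the description above; the method name is illustrative and this is not 
 the actual patch):
 {code}
 // Sketch only: when a block is removed, also drop it from the excess-replica
 // bookkeeping so the ExcessBlocks metric cannot grow without bound.
 private void removeFromExcessReplicateMap(Block block) {
   for (LightWeightLinkedSet<Block> excessBlocks : excessReplicateMap.values()) {
     if (excessBlocks.remove(block)) {
       excessBlocksCount.decrementAndGet();
     }
   }
 }
 {code}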



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed

2015-07-20 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-6945:

Fix Version/s: 2.7.2

 BlockManager should remove a block from excessReplicateMap and decrement 
 ExcessBlocks metric when the block is removed
 --

 Key: HDFS-6945
 URL: https://issues.apache.org/jira/browse/HDFS-6945
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.5.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Critical
  Labels: metrics
 Fix For: 2.8.0, 2.7.2

 Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, 
 HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch


 I'm seeing the ExcessBlocks metric increase to more than 300K in some clusters, 
 however, there are no over-replicated blocks (confirmed by fsck).
 After further research, I noticed that when deleting a block, BlockManager does 
 not remove the block from excessReplicateMap or decrement excessBlocksCount.
 Usually the metric is decremented when processing a block report, however, if 
 the block has already been deleted, BlockManager does not remove the block from 
 excessReplicateMap or decrement the metric.
 That way the metric and excessReplicateMap can grow without bound (i.e. a 
 memory leak can occur).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7483) Display information per tier on the Namenode UI

2015-07-20 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7483:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~benoyantony] for the 
contribution.

 Display information per tier on the Namenode UI
 ---

 Key: HDFS-7483
 URL: https://issues.apache.org/jira/browse/HDFS-7483
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Benoy Antony
Assignee: Benoy Antony
 Fix For: 2.8.0

 Attachments: HDFS-7483-001.patch, HDFS-7483-002.patch, 
 HDFS-7483.003.patch, overview.png, storagetypes.png, 
 storagetypes_withnostorage.png, withOneStorageType.png, withTwoStorageType.png


 If cluster has different types of storage, it is useful to display the 
 storage information per type. 
 The information will be available via JMX (HDFS-7390)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8779) WebUI can't display randomly generated block ID

2015-07-20 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-8779:

Attachment: HDFS-8779.04.patch
patch-to-json-parse.txt

Thanks [~wheat9] for the idea.
json-bigint doesn't have a front-end version. The author gives a browserify 
version 
[here|http://stackoverflow.com/questions/18755125/node-js-is-there-any-proper-way-to-parse-json-with-large-numbers-long-bigint].
The file is about 79kb.

Since both [BigNumber|https://github.com/MikeMcl/bignumber.js] and 
[JSON-js|https://github.com/douglascrockford/JSON-js] have front-end versions, 
I re-created the file using the idea of json-bigint.

I simply added 2 lines to the JSON-js library (that's what json-bigint does):
{code}
+if (string.length > 15)
+   return new BigNumber(string);
{code}
{{patch-to-json-parse.txt}} shows that. I didn't change anything in 
{{BigNumber}}.

Uploaded the 04 patch, tested in Chrome/IE11/Firefox. I still prefer the 03 
patch because it's simpler. Both work for me.
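
For context, the precision problem itself is easy to reproduce outside a 
browser, since JavaScript numbers are IEEE-754 doubles. A minimal sketch (the 
block ID below is just an example value):
{code}
public class BlockIdPrecision {
  public static void main(String[] args) {
    long blockId = 4053186355039524935L;   // an example randomly generated block ID
    double asJsNumber = (double) blockId;  // what a plain JSON number parse produces
    System.out.println("original block ID:       " + blockId);
    System.out.println("after double round-trip: " + (long) asJsNumber); // low bits differ
  }
}
{code}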

 WebUI can't display randomly generated block ID
 ---

 Key: HDFS-8779
 URL: https://issues.apache.org/jira/browse/HDFS-8779
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor
 Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, 
 HDFS-8779.03.patch, HDFS-8779.04.patch, after-02-patch.png, before.png, 
 patch-to-json-parse.txt


 Old releases use randomly generated block IDs (HDFS-4645).
 The max value of a Long in Java is 2^63-1.
 The max value of a number in JavaScript is 2^53-1. (See 
 [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER])
 This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER.
 An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8616) Cherry pick HDFS-6495 for excess block leak

2015-07-20 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA resolved HDFS-8616.
-
Resolution: Done

I've backported HDFS-6945 to 2.7.2. Please reopen this issue if you disagree.

 Cherry pick HDFS-6495 for excess block leak
 ---

 Key: HDFS-8616
 URL: https://issues.apache.org/jira/browse/HDFS-8616
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.0-alpha
Reporter: Daryn Sharp
Assignee: Akira AJISAKA

 Busy clusters quickly leak tens or hundreds of thousands of excess blocks 
 which slow BR processing.  HDFS-6495 should be cherry picked into 2.7.x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8616) Cherry pick HDFS-6945 for excess block leak

2015-07-20 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-8616:

Summary: Cherry pick HDFS-6945 for excess block leak  (was: Cherry pick 
HDFS-6495 for excess block leak)

 Cherry pick HDFS-6945 for excess block leak
 ---

 Key: HDFS-8616
 URL: https://issues.apache.org/jira/browse/HDFS-8616
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.0-alpha
Reporter: Daryn Sharp
Assignee: Akira AJISAKA

 Busy clusters quickly leak tens or hundreds of thousands of excess blocks 
 which slow BR processing.  HDFS-6495 should be cherry picked into 2.7.x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8779) WebUI can't display randomly generated block ID

2015-07-20 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-8779:

Description: 
Old releases use randomly generated block IDs (HDFS-4645).
The max value of a Long in Java is 2^63-1.
The max value of a -number-(*integer*) in JavaScript is 2^53-1. (See 
[Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER])

This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER.

An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript.

  was:
Old release use randomly generated block ID(HDFS-4645).
max value of Long in Java is 2^63-1
max value of number in Javascript is 2^53-1. ( See 
[Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER])

Which means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER.

A integer which exceeds MAX_SAFE_INTEGER cannot be represented in Javascript.


 WebUI can't display randomly generated block ID
 ---

 Key: HDFS-8779
 URL: https://issues.apache.org/jira/browse/HDFS-8779
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor
 Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, 
 HDFS-8779.03.patch, HDFS-8779.04.patch, after-02-patch.png, before.png, 
 patch-to-json-parse.txt


 Old releases use randomly generated block IDs (HDFS-4645).
 The max value of a Long in Java is 2^63-1.
 The max value of a -number-(*integer*) in JavaScript is 2^53-1. (See 
 [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER])
 This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER.
 An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8779) WebUI can't display randomly generated block ID

2015-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634469#comment-14634469
 ] 

Hadoop QA commented on HDFS-8779:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   0m  0s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | release audit |   0m 13s | The applied patch generated 
2 release audit warnings. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| | |   0m 16s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12746254/HDFS-8779.04.patch |
| Optional Tests |  |
| git revision | trunk / df1e8ce |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11761/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11761/console |


This message was automatically generated.

 WebUI can't display randomly generated block ID
 ---

 Key: HDFS-8779
 URL: https://issues.apache.org/jira/browse/HDFS-8779
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Reporter: Walter Su
Assignee: Walter Su
Priority: Minor
 Attachments: HDFS-8779.01.patch, HDFS-8779.02.patch, 
 HDFS-8779.03.patch, HDFS-8779.04.patch, after-02-patch.png, before.png, 
 patch-to-json-parse.txt


 Old releases use randomly generated block IDs (HDFS-4645).
 The max value of a Long in Java is 2^63-1.
 The max value of a -number-(*integer*) in JavaScript is 2^53-1. (See 
 [Link|https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER])
 This means almost every randomly generated block ID exceeds MAX_SAFE_INTEGER.
 An integer which exceeds MAX_SAFE_INTEGER cannot be represented exactly in JavaScript.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

