[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2

2015-08-03 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652579#comment-14652579
 ] 

Haohui Mai commented on HDFS-7966:
--

bq. What's the upside of this new implementation? 

Performance is definitely one important factor. One of the motivations is to 
improve the efficiency of the DN when there are hundreds of thousands of reads, 
by reducing the overhead of context switches. [~Apache9], do you have any 
performance numbers for this scenario?

HTTP/2-based DTP also serves as a building block for the next level of 
innovation; to quote the description of the jira:

{quote}
This jira explores delegating the responsibilities of the session and 
presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles 
connection multiplexing, QoS, authentication and encryption, reducing the scope 
of DTP to the application layer only. By leveraging an existing HTTP/2 library, 
it should simplify the implementation of both HDFS clients and servers.
{quote}

bq. If it were the same performance but had other redeeming qualities (e.g. 
less code) then it's still worth consideration.

This is designed as a new code path so that it remains compatible with older 
releases. You can still rely on the old DTP, depending on the application 
scenario.
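
As an aside for readers new to HTTP/2: below is a minimal sketch of what stream 
multiplexing looks like from a client, using the stock JDK 11 {{java.net.http}} 
client purely as an illustration (the actual work here is Netty-based, and the 
URL is made up):

{code}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class Http2MultiplexDemo {
  public static void main(String[] args) {
    // A single client connection; each request becomes its own HTTP/2 stream.
    HttpClient client = HttpClient.newBuilder()
        .version(HttpClient.Version.HTTP_2)
        .build();
    List<CompletableFuture<Void>> inFlight = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      HttpRequest req = HttpRequest.newBuilder(
          URI.create("https://example.org/block/" + i)).build();
      inFlight.add(client.sendAsync(req, HttpResponse.BodyHandlers.ofString())
          .thenAccept(r -> System.out.println("stream done: " + r.statusCode())));
    }
    inFlight.forEach(CompletableFuture::join);   // wait for all streams
  }
}
{code}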

 New Data Transfer Protocol via HTTP/2
 -

 Key: HDFS-7966
 URL: https://issues.apache.org/jira/browse/HDFS-7966
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Qianqian Shi
  Labels: gsoc, gsoc2015, mentor
 Attachments: GSoC2015_Proposal.pdf, 
 TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg, 
 TestHttp2ReadBlockInsideEventLoop.svg


 The current Data Transfer Protocol (DTP) implements a rich set of features 
 that span multiple layers, including:
 * Connection pooling and authentication (session layer)
 * Encryption (presentation layer)
 * Data writing pipeline (application layer)
 All these features are HDFS-specific and defined only by the implementation. 
 As a result, implementing HDFS clients and servers requires a non-trivial 
 amount of work.
 This jira explores delegating the responsibilities of the session and 
 presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles 
 connection multiplexing, QoS, authentication and encryption, reducing the 
 scope of DTP to the application layer only. By leveraging an existing HTTP/2 
 library, it should simplify the implementation of both HDFS clients and 
 servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8849) fsck should report number of missing blocks with replication factor 1

2015-08-03 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652695#comment-14652695
 ] 

Zhe Zhang commented on HDFS-8849:
-

Thanks Allen for the advice. I think we can report the *number of missing 
blocks with min replication* instead.

 fsck should report number of missing blocks with replication factor 1
 -

 Key: HDFS-8849
 URL: https://issues.apache.org/jira/browse/HDFS-8849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor

 HDFS-7165 supports reporting the number of blocks with replication factor 1 in 
 {{dfsadmin}} and NN metrics, but it didn't extend {{fsck}} with the same 
 support, which is the aim of this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2

2015-08-03 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652698#comment-14652698
 ] 

Duo Zhang commented on HDFS-7966:
-

I do not have enough machines to test that scenario... What I see when I create 
lots of threads to read from a datanode concurrently is that HTTP/2 starts the 
requests almost all at the same time, while TCP starts them one by one (maybe 
in batches of roughly the CPU count). So there won't be a situation where the 
DN really handles lots of concurrent reads from the client, and the 
context-switch overhead may be smaller than in the HTTP/2 implementation, since 
we also have a ThreadPool besides the EventLoopGroup in the HTTP/2 connection. 
What makes things worse is that our client is not event driven, so we cannot 
reduce the thread count on the client side...
Let me see if I can construct a scenario where HTTP/2 is faster than TCP...
Thanks.
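
For concreteness, the blocking thread-per-read pattern described above looks 
roughly like this plain-JDK sketch (not the HDFS client; the sleep stands in 
for a socket read):

{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPerReadDemo {
  public static void main(String[] args) throws InterruptedException {
    int readers = 256;                      // simulated concurrent block reads
    ExecutorService pool = Executors.newFixedThreadPool(readers);
    CountDownLatch done = new CountDownLatch(readers);
    for (int i = 0; i < readers; i++) {
      pool.execute(() -> {
        // A blocking read from the DN would go here; only about one
        // CPU's worth of threads can actually start at the same time.
        try { Thread.sleep(10); } catch (InterruptedException ignored) { }
        done.countDown();
      });
    }
    done.await();
    pool.shutdown();
  }
}
{code}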

 New Data Transfer Protocol via HTTP/2
 -

 Key: HDFS-7966
 URL: https://issues.apache.org/jira/browse/HDFS-7966
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Qianqian Shi
  Labels: gsoc, gsoc2015, mentor
 Attachments: GSoC2015_Proposal.pdf, 
 TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg, 
 TestHttp2ReadBlockInsideEventLoop.svg


 The current Data Transfer Protocol (DTP) implements a rich set of features 
 that span multiple layers, including:
 * Connection pooling and authentication (session layer)
 * Encryption (presentation layer)
 * Data writing pipeline (application layer)
 All these features are HDFS-specific and defined only by the implementation. 
 As a result, implementing HDFS clients and servers requires a non-trivial 
 amount of work.
 This jira explores delegating the responsibilities of the session and 
 presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles 
 connection multiplexing, QoS, authentication and encryption, reducing the 
 scope of DTP to the application layer only. By leveraging an existing HTTP/2 
 library, it should simplify the implementation of both HDFS clients and 
 servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8499) Refactor BlockInfo class hierarchy with static helper class

2015-08-03 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652684#comment-14652684
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8499:
---

Not yet.  Should be able to try it on Wednesday.

 Refactor BlockInfo class hierarchy with static helper class
 ---

 Key: HDFS-8499
 URL: https://issues.apache.org/jira/browse/HDFS-8499
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 2.7.0
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Fix For: 2.8.0

 Attachments: HDFS-8499.00.patch, HDFS-8499.01.patch, 
 HDFS-8499.02.patch, HDFS-8499.03.patch, HDFS-8499.04.patch, 
 HDFS-8499.05.patch, HDFS-8499.06.patch, HDFS-8499.07.patch, 
 HDFS-8499.UCFeature.patch, HDFS-bistriped.patch


 In HDFS-7285 branch, the {{BlockInfoUnderConstruction}} interface provides a 
 common abstraction for striped and contiguous UC blocks. This JIRA aims to 
 merge it to trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8804) Erasure Coding: use DirectBufferPool in DFSStripedInputStream for buffer allocation

2015-08-03 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652808#comment-14652808
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8804:
---

+1 the new patch looks good.

 Erasure Coding: use DirectBufferPool in DFSStripedInputStream for buffer 
 allocation
 ---

 Key: HDFS-8804
 URL: https://issues.apache.org/jira/browse/HDFS-8804
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8804.000.patch, HDFS-8804.001.patch


 Currently we directly allocate direct ByteBuffers in DFSStripedInputStream for 
 the stripe buffer and the buffers holding parity data. It's better to get 
 ByteBuffers from a DirectBufferPool.
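
 A minimal sketch of the proposed pattern, assuming Hadoop's existing 
 {{org.apache.hadoop.util.DirectBufferPool}} (the method placement and buffer 
 size are illustrative, not the attached patches):

 {code}
 import java.nio.ByteBuffer;
 import org.apache.hadoop.util.DirectBufferPool;

 public class StripeBufferSketch {
   private static final DirectBufferPool POOL = new DirectBufferPool();

   static void readStripe(int stripeSize) {
     // Reuses a pooled direct buffer of this size when one is available,
     // instead of allocating a fresh direct ByteBuffer per stripe.
     ByteBuffer buf = POOL.getBuffer(stripeSize);
     try {
       // ... fill buf with stripe / parity data ...
     } finally {
       POOL.returnBuffer(buf);   // hand the buffer back for reuse
     }
   }
 }
 {code}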



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8804) Erasure Coding: use DirectBufferPool in DFSStripedInputStream for buffer allocation

2015-08-03 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-8804.
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7285

I've committed this to the feature branch. Thank you guys for the review!

bq. we can at least assert alignedStripe.range.spanInBlock is no larger than 
cellSize

This is guaranteed by the logic in {{readOneStripe}}, so my feeling is that the 
assertion is unnecessary. We also don't have this assertion for the data block 
buffer.

 Erasure Coding: use DirectBufferPool in DFSStripedInputStream for buffer 
 allocation
 ---

 Key: HDFS-8804
 URL: https://issues.apache.org/jira/browse/HDFS-8804
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: HDFS-7285

 Attachments: HDFS-8804.000.patch, HDFS-8804.001.patch


 Currently we directly allocate direct ByteBuffers in DFSStripedInputStream for 
 the stripe buffer and the buffers holding parity data. It's better to get 
 ByteBuffers from a DirectBufferPool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-08-03 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-8850:
--

 Summary: VolumeScanner thread exits with exception if there is no 
block pool to be scanned but there are suspicious blocks
 Key: HDFS-8850
 URL: https://issues.apache.org/jira/browse/HDFS-8850
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


The VolumeScanner threads inside the BlockScanner exit with an exception if 
there is no block pool to be scanned but there are suspicious blocks.
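
To illustrate the shape of the bug (all names below are made up for the sketch; 
this is not the actual VolumeScanner code): a scan loop that services a 
suspicious-blocks queue must not assume a block pool is already registered.

{code}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ScannerLoopSketch {
  private final Deque<String> suspiciousBlocks = new ArrayDeque<>();
  private final List<String> blockPools = new ArrayList<>();  // may be empty

  String nextScanTarget() {
    if (!suspiciousBlocks.isEmpty()) {
      return suspiciousBlocks.poll();   // suspicious blocks take priority
    }
    if (blockPools.isEmpty()) {
      return null;                      // guard: no block pool registered yet
    }
    return blockPools.get(0);
  }
}
{code}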



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-08-03 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652932#comment-14652932
 ] 

Yi Liu commented on HDFS-8850:
--

Yes, you are right.  +1 pending Jenkins.

 VolumeScanner thread exits with exception if there is no block pool to be 
 scanned but there are suspicious blocks
 -

 Key: HDFS-8850
 URL: https://issues.apache.org/jira/browse/HDFS-8850
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-8850.001.patch


 The VolumeScanner threads inside the BlockScanner exit with an exception if 
 there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-08-03 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8850:
---
Status: Patch Available  (was: Open)

 VolumeScanner thread exits with exception if there is no block pool to be 
 scanned but there are suspicious blocks
 -

 Key: HDFS-8850
 URL: https://issues.apache.org/jira/browse/HDFS-8850
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-8850.001.patch


 The VolumeScanner threads inside the BlockScanner exit with an exception if 
 there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-488) Implement moveToLocal HDFS command

2015-08-03 Thread Steven Capo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Capo updated HDFS-488:
-
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.1
Affects Version/s: 2.7.1
 Target Version/s: 2.7.1
 Tags: MoveToLocal
   Status: Patch Available  (was: Open)

 Implement moveToLocal  HDFS command
 ---

 Key: HDFS-488
 URL: https://issues.apache.org/jira/browse/HDFS-488
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Ravi Phulari
Assignee: Steven Capo
  Labels: newbie
 Fix For: 2.7.1

 Attachments: Screen Shot 2014-07-23 at 12.28.23 PM 1.png


 Surprisingly, executing the HDFS FsShell command -moveToLocal outputs: Option 
 '-moveToLocal' is not implemented yet.
  
 {code}
 statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t
 Option '-moveToLocal' is not implemented yet.
 {code}
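
 A hypothetical sketch of the missing behaviour (not the attached 
 HDFS-488.patch): moveToLocal is essentially copyToLocal plus deletion of the 
 source, which {{FileSystem#copyToLocalFile}} already supports via 
 {{delSrc=true}}. Paths are taken from the example above.

 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;

 public class MoveToLocalSketch {
   public static void main(String[] args) throws Exception {
     FileSystem fs = FileSystem.get(new Configuration());
     // delSrc=true removes the HDFS copy once the local copy succeeds
     fs.copyToLocalFile(true, new Path("bt"), new Path("t"));
   }
 }
 {code}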



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-488) Implement moveToLocal HDFS command

2015-08-03 Thread Steven Capo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Capo updated HDFS-488:
-
Attachment: HDFS-488.patch

 Implement moveToLocal  HDFS command
 ---

 Key: HDFS-488
 URL: https://issues.apache.org/jira/browse/HDFS-488
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Ravi Phulari
Assignee: Steven Capo
  Labels: newbie
 Fix For: 2.7.1

 Attachments: HDFS-488.patch, Screen Shot 2014-07-23 at 12.28.23 PM 
 1.png


 Surprisingly, executing the HDFS FsShell command -moveToLocal outputs: Option 
 '-moveToLocal' is not implemented yet.
  
 {code}
 statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t
 Option '-moveToLocal' is not implemented yet.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8761) Windows HDFS daemon - datanode.DirectoryScanner: Error compiling report (...) XXX is not a prefix of YYY

2015-08-03 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652581#comment-14652581
 ] 

Chris Nauroth commented on HDFS-8761:
-

[~odelalleau], glad to hear this helped!

bq. I wonder how this is not a bug though, even if there exists a workaround. 
But not a big deal.

I agree that the configuration file can end up looking non-intuitive on 
Windows.  Unfortunately, I don't see a way to do any better while maintaining 
the feature that everything defaults to using {{hadoop.tmp.dir}} for quick dev 
deployments.  This is a side effect of the fact that a Windows file system path 
is not always valid as a URL.  On Linux, a file system path will always be a 
valid URL (assuming the individual path names stick to the characters that 
don't require escaping).  I typically advise using a full {{file:}} URL in 
production configurations to make everything clearer for operators.
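
As an illustration of that advice (the path is taken from the error in this 
issue's description; an untested snippet, not a recommended layout):

{code}
<!-- hdfs-site.xml: a full file: URL removes the drive-letter ambiguity -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///D:/tmp/hadoop-odelalleau/dfs/data</value>
</property>
{code}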

 Windows HDFS daemon - datanode.DirectoryScanner: Error compiling report (...) 
 XXX is not a prefix of YYY
 

 Key: HDFS-8761
 URL: https://issues.apache.org/jira/browse/HDFS-8761
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1
 Environment: Windows 7, Java SDK 1.8.0_45
Reporter: Olivier Delalleau
Priority: Minor

 I'm periodically seeing errors like the one below output by the HDFS daemon 
 (started with start-dfs.cmd). This is with the default settings for the data 
 location (i.e. not specified in my hdfs-site.xml). I assume it may be fixable 
 by specifying a path with the drive letter in the config file; however, I 
 haven't been able to do that (see 
 http://stackoverflow.com/questions/31353226/setting-hadoop-tmp-dir-on-windows-gives-error-uri-has-an-authority-component).
 15/07/11 17:29:57 ERROR datanode.DirectoryScanner: Error compiling report
 java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
 \tmp\hadoop-odelalleau\dfs\data is not a prefix of 
 D:\tmp\hadoop-odelalleau\dfs\data\current\BP-1474392971-10.128.22.110-1436634926842\current\finalized\subdir0\subdir0\blk_1073741825
 at java.util.concurrent.FutureTask.report(FutureTask.java:122)
 at java.util.concurrent.FutureTask.get(FutureTask.java:192)
 at 
 org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:566)
 at 
 org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:425)
 at 
 org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:406)
 at 
 org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:362)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

2015-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652645#comment-14652645
 ] 

Hadoop QA commented on HDFS-8220:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 36s | Findbugs (version ) appears to 
be broken on HDFS-7285. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 44s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 51s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   0m 15s | The applied patch generated 
1 release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 38s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 41s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 28s | The patch appears to introduce 5 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 21s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 177m 12s | Tests failed in hadoop-hdfs. |
| | | 220m 24s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestWriteStripedFileWithFailure |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748509/HDFS-8220-HDFS-7285-09.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7285 / ba90c02 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11889/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11889/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11889/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11889/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11889/console |


This message was automatically generated.

 Erasure Coding: StripedDataStreamer fails to handle the blocklocations which 
 doesn't satisfy BlockGroupSize
 ---

 Key: HDFS-8220
 URL: https://issues.apache.org/jira/browse/HDFS-8220
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, 
 HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285-09.patch, 
 HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch, 
 HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.007.patch, 
 HDFS-8220-HDFS-7285.008.patch


 During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to 
 validate the available datanodes against the {{BlockGroupSize}}. Please see 
 the exception for more detail:
 {code}
 2015-04-22 14:56:11,313 WARN  hdfs.DFSClient (DataStreamer.java:run(538)) - 
 DataStreamer Exception
 java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 2015-04-22 14:56:11,313 INFO  hdfs.MiniDFSCluster 
 (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient 
 (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
 java.io.IOException: DataStreamer Exception: 
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 Caused by: java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 

[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-03 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652789#comment-14652789
 ] 

Jing Zhao commented on HDFS-8828:
-

Thanks for the explanation, Yufei! Yes, you're right that our current code uses 
the file list to check if a file is in the source. In that sense, excluding 
-delete may be our only option here. But we may need to provide more details 
in the documentation about this behavior, as Yongjun also suggested.

 Utilize Snapshot diff report to build copy list in distcp
 -

 Key: HDFS-8828
 URL: https://issues.apache.org/jira/browse/HDFS-8828
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, snapshots
Reporter: Yufei Gu
Assignee: Yufei Gu
 Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
 HDFS-8828.003.patch


 Some users reported a huge time cost to build the file copy list in distcp (30 
 hours for 1.6M files). We can leverage the snapshot diff report to build a 
 file copy list that includes only the files/dirs changed between two snapshots 
 (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
 less copy-list building time; 2. fewer file-copy MR jobs.
 The HDFS snapshot diff report provides information about file/directory 
 creation, deletion, rename and modification between two snapshots or between a 
 snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
 then falls back to the default distcp, so it still relies on the default 
 distcp to build the complete list of files under the source dir. This patch 
 puts only created and modified files into the copy list, based on the snapshot 
 diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8849) fsck should report number of missing blocks with replication factor 1

2015-08-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652970#comment-14652970
 ] 

Allen Wittenauer commented on HDFS-8849:


This is one of those times where I feel that no matter what I say, it's pretty 
clear the dev is hell-bent on putting in some useless feature that doesn't 
actually benefit anyone.

That said, I'll also remind you that putting this into 2.x is a breaking change 
under the compatibility requirements, since changing the output of fsck isn't 
allowed.

 fsck should report number of missing blocks with replication factor 1
 -

 Key: HDFS-8849
 URL: https://issues.apache.org/jira/browse/HDFS-8849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor

 HDFS-7165 supports reporting the number of blocks with replication factor 1 in 
 {{dfsadmin}} and NN metrics, but it didn't extend {{fsck}} with the same 
 support, which is the aim of this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-488) Implement moveToLocal HDFS command

2015-08-03 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-488:
--
 Priority: Minor  (was: Major)
 Hadoop Flags:   (was: Reviewed)
Fix Version/s: (was: 2.7.1)

 Implement moveToLocal  HDFS command
 ---

 Key: HDFS-488
 URL: https://issues.apache.org/jira/browse/HDFS-488
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Ravi Phulari
Assignee: Steven Capo
Priority: Minor
  Labels: newbie
 Attachments: HDFS-488.patch, Screen Shot 2014-07-23 at 12.28.23 PM 
 1.png


 Surprisingly, executing the HDFS FsShell command -moveToLocal outputs: Option 
 '-moveToLocal' is not implemented yet.
  
 {code}
 statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t
 Option '-moveToLocal' is not implemented yet.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks

2015-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652639#comment-14652639
 ] 

Hadoop QA commented on HDFS-8823:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m  3s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 7 new or modified test files. |
| {color:green}+1{color} | javac |   7m 39s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 30s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 22s | The applied patch generated  5 
new checkstyle issues (total was 577, now 573). |
| {color:green}+1{color} | whitespace |   0m  6s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 26s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 34s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  3s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 159m 12s | Tests failed in hadoop-hdfs. |
| | | 202m 55s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748510/HDFS-8823.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 469cfcd |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/11890/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11890/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11890/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11890/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11890/console |


This message was automatically generated.

 Move replication factor into individual blocks
 --

 Key: HDFS-8823
 URL: https://issues.apache.org/jira/browse/HDFS-8823
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch


 This jira proposes to record the replication factor in the {{BlockInfo}} 
 class. The changes have two advantages:
 * Decoupling the namespace and the block management layer. It is a 
 prerequisite step to move block management off the heap or to a separate 
 process.
 * Increased flexibility on replicating blocks. Currently the replication 
 factors of all blocks in a file have to be the same, equal to the highest 
 replication factor across all snapshots. The changes will allow blocks in a 
 file to have different replication factors, potentially saving some space.
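
 A hypothetical sketch of what the proposal amounts to (field and method names 
 are illustrative, not the attached patches): the block itself carries its 
 replication factor, so the block manager no longer needs to consult the 
 namespace.

 {code}
 class BlockInfo {
   // Stored per block rather than derived from the owning INodeFile,
   // decoupling block management from the namespace.
   private short replication;

   short getReplication() { return replication; }

   void setReplication(short replication) { this.replication = replication; }
 }
 {code}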



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2

2015-08-03 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652657#comment-14652657
 ] 

Andrew Wang commented on HDFS-7966:
---

I agree there are potential performance advantages, but it looks like all the 
benchmarks thus far show worse performance. I'd be very happy to see positive 
results, since erasure coding will lead to a lot more remote reads and thus 
could hit this code path hard.

There has to be some upside, though, for this to be merged. The existing DTP 
already implements a number of the features mentioned, so I'm not sure how much 
we gain there. And if the perf isn't as good or better, then we're increasing 
our maintenance burden for something that won't get used.

 New Data Transfer Protocol via HTTP/2
 -

 Key: HDFS-7966
 URL: https://issues.apache.org/jira/browse/HDFS-7966
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Qianqian Shi
  Labels: gsoc, gsoc2015, mentor
 Attachments: GSoC2015_Proposal.pdf, 
 TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg, 
 TestHttp2ReadBlockInsideEventLoop.svg


 The current Data Transfer Protocol (DTP) implements a rich set of features 
 that span multiple layers, including:
 * Connection pooling and authentication (session layer)
 * Encryption (presentation layer)
 * Data writing pipeline (application layer)
 All these features are HDFS-specific and defined only by the implementation. 
 As a result, implementing HDFS clients and servers requires a non-trivial 
 amount of work.
 This jira explores delegating the responsibilities of the session and 
 presentation layers to the HTTP/2 protocol. In particular, HTTP/2 handles 
 connection multiplexing, QoS, authentication and encryption, reducing the 
 scope of DTP to the application layer only. By leveraging an existing HTTP/2 
 library, it should simplify the implementation of both HDFS clients and 
 servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-03 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652736#comment-14652736
 ] 

Yufei Gu commented on HDFS-8828:


Hi Jing Zhao,

Thank you for reviewing the code. We changed the option here for the following 
reason: this patch builds a diff file list instead of the complete file list. 
In other words, only files/directories that were changed or created will be in 
the copy file list. With the -delete option on, the MR jobs would delete every 
file/directory in the target that is not in the copy file list, so it would 
delete files we intend to keep.
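
For readers following along, a minimal sketch of where the diff-based copy list 
comes from, assuming the public snapshot-diff API 
({{DistributedFileSystem#getSnapshotDiffReport}}; the snapshot names are 
illustrative). Only CREATE and MODIFY entries feed the copy list, which is why 
{{-delete}} cannot be honoured from the diff alone:

{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSUtil;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffReportEntry;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffType;

public class DiffCopyListSketch {
  static void printCopyList(DistributedFileSystem dfs, Path dir)
      throws Exception {
    SnapshotDiffReport report = dfs.getSnapshotDiffReport(dir, "s1", "s2");
    for (DiffReportEntry entry : report.getDiffList()) {
      DiffType type = entry.getType();
      if (type == DiffType.CREATE || type == DiffType.MODIFY) {
        System.out.println("copy: "
            + DFSUtil.bytes2String(entry.getSourcePath()));
      }
      // DELETE and RENAME entries are synchronized separately (HDFS-7535)
    }
  }
}
{code}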

 Utilize Snapshot diff report to build copy list in distcp
 -

 Key: HDFS-8828
 URL: https://issues.apache.org/jira/browse/HDFS-8828
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, snapshots
Reporter: Yufei Gu
Assignee: Yufei Gu
 Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
 HDFS-8828.003.patch


 Some users reported a huge time cost to build the file copy list in distcp (30 
 hours for 1.6M files). We can leverage the snapshot diff report to build a 
 file copy list that includes only the files/dirs changed between two snapshots 
 (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
 less copy-list building time; 2. fewer file-copy MR jobs.
 The HDFS snapshot diff report provides information about file/directory 
 creation, deletion, rename and modification between two snapshots or between a 
 snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
 then falls back to the default distcp, so it still relies on the default 
 distcp to build the complete list of files under the source dir. This patch 
 puts only created and modified files into the copy list, based on the snapshot 
 diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652728#comment-14652728
 ] 

Hadoop QA commented on HDFS-8828:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 22s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |   9m 37s | The applied patch generated  2  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 25s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  4s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 47s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   6m 19s | Tests passed in 
hadoop-distcp. |
| | |  42m 32s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748535/HDFS-8828.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 469cfcd |
| javadoc | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11891/artifact/patchprocess/diffJavadocWarnings.txt
 |
| hadoop-distcp test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11891/artifact/patchprocess/testrun_hadoop-distcp.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11891/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11891/console |


This message was automatically generated.

 Utilize Snapshot diff report to build copy list in distcp
 -

 Key: HDFS-8828
 URL: https://issues.apache.org/jira/browse/HDFS-8828
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, snapshots
Reporter: Yufei Gu
Assignee: Yufei Gu
 Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
 HDFS-8828.003.patch


 Some users reported a huge time cost to build the file copy list in distcp (30 
 hours for 1.6M files). We can leverage the snapshot diff report to build a 
 file copy list that includes only the files/dirs changed between two snapshots 
 (or a snapshot and a normal dir). This speeds up the process in two ways: 1. 
 less copy-list building time; 2. fewer file-copy MR jobs.
 The HDFS snapshot diff report provides information about file/directory 
 creation, deletion, rename and modification between two snapshots or between a 
 snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
 then falls back to the default distcp, so it still relies on the default 
 distcp to build the complete list of files under the source dir. This patch 
 puts only created and modified files into the copy list, based on the snapshot 
 diff report, minimizing the number of files to copy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-08-03 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-8850:
---
Attachment: HDFS-8850.001.patch

 VolumeScanner thread exits with exception if there is no block pool to be 
 scanned but there are suspicious blocks
 -

 Key: HDFS-8850
 URL: https://issues.apache.org/jira/browse/HDFS-8850
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-8850.001.patch


 The VolumeScanner threads inside the BlockScanner exit with an exception if 
 there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-488) Implement moveToLocal HDFS command

2015-08-03 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-488:
--
Tags:   (was: MoveToLocal)

 Implement moveToLocal  HDFS command
 ---

 Key: HDFS-488
 URL: https://issues.apache.org/jira/browse/HDFS-488
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Ravi Phulari
Assignee: Steven Capo
Priority: Minor
  Labels: newbie
 Attachments: HDFS-488.patch, Screen Shot 2014-07-23 at 12.28.23 PM 
 1.png


 Surprisingly, executing the HDFS FsShell command -moveToLocal outputs: Option 
 '-moveToLocal' is not implemented yet.
  
 {code}
 statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t
 Option '-moveToLocal' is not implemented yet.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-488) Implement moveToLocal HDFS command

2015-08-03 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-488:
--
Target Version/s:   (was: 2.7.1)

 Implement moveToLocal  HDFS command
 ---

 Key: HDFS-488
 URL: https://issues.apache.org/jira/browse/HDFS-488
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Ravi Phulari
Assignee: Steven Capo
Priority: Minor
  Labels: newbie
 Attachments: HDFS-488.patch, Screen Shot 2014-07-23 at 12.28.23 PM 
 1.png


 Surprisingly, executing the HDFS FsShell command -moveToLocal outputs: Option 
 '-moveToLocal' is not implemented yet.
  
 {code}
 statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t
 Option '-moveToLocal' is not implemented yet.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-08-03 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653015#comment-14653015
 ] 

Ajith S commented on HDFS-8808:
---

Hi [~ggop]

Why not bootstrap the standby without that property, and once the bootstrap is 
complete, add dfs.image.transfer.bandwidthPerSec back before starting the 
standby?
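
A sketch of that sequence (hypothetical; the property and command names are as 
used in this issue):

{code}
# 1. remove or comment out dfs.image.transfer.bandwidthPerSec in hdfs-site.xml
# 2. bootstrap at full speed
hdfs namenode -bootstrapStandby
# 3. restore dfs.image.transfer.bandwidthPerSec in hdfs-site.xml
# 4. start the standby as usual
hadoop-daemon.sh start namenode
{code}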

 dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
 

 Key: HDFS-8808
 URL: https://issues.apache.org/jira/browse/HDFS-8808
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Gautam Gopalakrishnan

 The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
 speed with which the fsimage is copied between the namenodes during regular 
 use. However, as a side effect, this also limits transfers when the 
 {{-bootstrapStandby}} option is used. This option is often used during 
 upgrades, so the limit could potentially slow down the entire workflow. The 
 request here is to ensure {{-bootstrapStandby}} is unaffected by this 
 bandwidth setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

2015-08-03 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653034#comment-14653034
 ] 

Walter Su commented on HDFS-8220:
-

When I ran tests, I ran into some NPEs. Could you add an {{si.isFailed()}} 
guard to {{updateBlockForPipeline}} and {{updatePipeline}} as well?

 Erasure Coding: StripedDataStreamer fails to handle the blocklocations which 
 doesn't satisfy BlockGroupSize
 ---

 Key: HDFS-8220
 URL: https://issues.apache.org/jira/browse/HDFS-8220
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, 
 HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285-09.patch, 
 HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch, 
 HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.007.patch, 
 HDFS-8220-HDFS-7285.008.patch


 During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to 
 validate the available datanodes against the {{BlockGroupSize}}. Please see 
 the exception for more detail:
 {code}
 2015-04-22 14:56:11,313 WARN  hdfs.DFSClient (DataStreamer.java:run(538)) - 
 DataStreamer Exception
 java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 2015-04-22 14:56:11,313 INFO  hdfs.MiniDFSCluster 
 (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient 
 (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
 java.io.IOException: DataStreamer Exception: 
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 Caused by: java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   ... 1 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8851) datanode fails to start due to a bad disk

2015-08-03 Thread Wang Hao (JIRA)
Wang Hao created HDFS-8851:
--

 Summary: datanode fails to start due to a bad disk
 Key: HDFS-8851
 URL: https://issues.apache.org/jira/browse/HDFS-8851
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.5.1
Reporter: Wang Hao


The datanode cannot start due to a bad disk. I found that a similar issue, 
HDFS-6245, was reported, but our situation is different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8851) datanode fails to start due to a bad disk

2015-08-03 Thread Wang Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653048#comment-14653048
 ] 

Wang Hao commented on HDFS-8851:


{code}
15/08/04 12:01:24 INFO common.Storage: Analyzing storage directories for bpid 
BP-454299492-10.84.100.171-1416301904728
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Locking is disabled
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 INFO common.Storage: Restored 0 block files from trash.
15/08/04 12:01:24 FATAL datanode.DataNode: Initialization failed for Block pool 
registering (Datanode Uuid unassigned) service to 
hadoop001.dx.momo.com/10.84.100.171:8022. Exiting.
java.io.IOException: Input/output error
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:243)
at java.util.Properties$LineReader.readLine(Properties.java:434)
at java.util.Properties.load0(Properties.java:353)
at java.util.Properties.load(Properties.java:341)
at 
org.apache.hadoop.hdfs.server.common.StorageInfo.readPropertiesFile(StorageInfo.java:247)
at 
org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:227)
at 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:256)
at 
org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:155)
at 
org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:269)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:975)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:946)
at 
org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:278)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:812)
at java.lang.Thread.run(Thread.java:745)
15/08/04 12:01:24 WARN datanode.DataNode: Ending block pool service for: Block 
pool registering (Datanode Uuid unassigned) service to 
hadoop001.dx.momo.com/10.84.100.171:8022
15/08/04 12:01:24 INFO datanode.DataNode: Removed Block pool registering 
(Datanode Uuid unassigned)
15/08/04 12:01:26 WARN datanode.DataNode: Exiting Datanode
15/08/04 12:01:26 INFO util.ExitUtil: Exiting with status 0
15/08/04 12:01:26 INFO datanode.DataNode: SHUTDOWN_MSG:
{code}

 datanode fails to start due to a bad disk
 -

 Key: HDFS-8851
 URL: https://issues.apache.org/jira/browse/HDFS-8851
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.5.1
Reporter: Wang Hao

 The datanode cannot start due to a bad disk. I found that a similar issue, 
 HDFS-6245, was reported, but our situation is different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8704) Erasure Coding: client fails to write large file when one datanode fails

2015-08-03 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8704:

Attachment: HDFS-8704-HDFS-7285-004.patch

 Erasure Coding: client fails to write large file when one datanode fails
 

 Key: HDFS-8704
 URL: https://issues.apache.org/jira/browse/HDFS-8704
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-8704-000.patch, HDFS-8704-HDFS-7285-002.patch, 
 HDFS-8704-HDFS-7285-003.patch, HDFS-8704-HDFS-7285-004.patch


 I tested the current code on a 5-node cluster using RS(3,2). When a datanode 
 is corrupt, the client succeeds in writing a file smaller than a block group 
 but fails to write a larger one. {{TestDFSStripeOutputStreamWithFailure}} only 
 tests files smaller than a block group; this jira will add more test 
 situations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8827) Erasure Coding: When namenode processes over replicated striped block, NPE will be occur in ReplicationMonitor

2015-08-03 Thread Takuya Fukudome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Fukudome updated HDFS-8827:
--
Attachment: HDFS-8827.1.patch

Thanks for the comment, [~zhz]! I attached an initial patch which adds a unit 
test that reproduces this issue. It processes a small EC file which doesn't 
have full internal blocks and whose internal blocks are over-replicated.
If I understand correctly, when some indices of internal blocks are missing and 
the internal blocks are over-replicated, 
{{BlockPlacementPolicyDefault#chooseReplicaToDelete}} will return null. I think 
the cause is that the {{excessTypes}} in {{chooseExcessReplicasStriped}} is 
empty while processing such blocks.
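
A self-contained illustration of the failure shape (plain Java, not the HDFS 
placement-policy code): a chooser that can legitimately return null has to be 
guarded before use.

{code}
import java.util.Collections;
import java.util.List;

public class ChooseExcessSketch {
  static <T> T chooseOne(List<T> candidates) {
    return candidates.isEmpty() ? null : candidates.get(0);
  }

  public static void main(String[] args) {
    List<String> excessTypes = Collections.emptyList();  // empty, as above
    String victim = chooseOne(excessTypes);
    if (victim == null) {          // the missing guard that leads to the NPE
      System.out.println("no replica chosen; skip this block");
      return;
    }
    System.out.println("delete " + victim);
  }
}
{code}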

 Erasure Coding: When namenode processes over replicated striped block, NPE 
 will be occur in ReplicationMonitor
 --

 Key: HDFS-8827
 URL: https://issues.apache.org/jira/browse/HDFS-8827
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Takuya Fukudome
Assignee: Takuya Fukudome
 Attachments: HDFS-8827.1.patch, processing-over-replica-npe.log


 In our test cluster, when the namenode processed over-replicated striped 
 blocks, a null pointer exception (NPE) occurred. This happened in the 
 following situation: 1) some datanodes shut down; 2) the namenode recovers 
 block groups which lost internal blocks; 3) the stopped datanodes are 
 restarted; 4) the namenode processes over-replicated striped blocks; 5) the 
 NPE occurs.
 I think BlockPlacementPolicyDefault#chooseReplicaToDelete will return null in 
 this situation, which causes this NPE problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2015-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653076#comment-14653076
 ] 

Hadoop QA commented on HDFS-8850:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 11s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 40s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 20s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 31s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  2s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 158m 28s | Tests failed in hadoop-hdfs. |
| | | 202m 14s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748576/HDFS-8850.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c3364ca |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11892/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11892/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11892/console |


This message was automatically generated.

 VolumeScanner thread exits with exception if there is no block pool to be 
 scanned but there are suspicious blocks
 -

 Key: HDFS-8850
 URL: https://issues.apache.org/jira/browse/HDFS-8850
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-8850.001.patch


 The VolumeScanner threads inside the BlockScanner exit with an exception if 
 there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8851) datanode fails to start due to a bad disk

2015-08-03 Thread Wang Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653056#comment-14653056
 ] 

Wang Hao commented on HDFS-8851:


There is an IOException when reading the VERSION file because the disk is bad, 
which causes the datanode to fail to start. I think we should handle the 
exception during storage initialization.
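
A hypothetical sketch of the suggested handling (plain Java, not the actual 
DataStorage code): treat an unreadable VERSION file as a failed volume instead 
of a fatal startup error.

{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public class VersionReadSketch {
  // Returns null instead of propagating the IOException, so the caller can
  // drop the bad storage directory rather than abort datanode startup.
  static Properties tryReadVersion(File versionFile) {
    Properties props = new Properties();
    try (FileInputStream in = new FileInputStream(versionFile)) {
      props.load(in);
      return props;
    } catch (IOException e) {
      System.err.println("Skipping bad storage dir "
          + versionFile.getParent() + ": " + e);
      return null;
    }
  }
}
{code}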

 datanode fails to start due to a bad disk
 -

 Key: HDFS-8851
 URL: https://issues.apache.org/jira/browse/HDFS-8851
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.5.1
Reporter: Wang Hao

 The datanode cannot start due to a bad disk. I found that a similar issue, 
 HDFS-6245, was reported, but our situation is different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8852) Documentation of Hadoop 2.x is outdated about append write support

2015-08-03 Thread Hong Dai Thanh (JIRA)
Hong Dai Thanh created HDFS-8852:


 Summary: Documentation of Hadoop 2.x is outdated about append 
write support
 Key: HDFS-8852
 URL: https://issues.apache.org/jira/browse/HDFS-8852
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Hong Dai Thanh


In the [latest version of the 
documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model],
 and also documentation for all releases with version 2, it’s mentioned that “A 
file once created, written, and closed need not be changed. “ and “There is a 
plan to support appending-writes to files in the future.” 
 
However, as far as I know, HDFS has supported append write since 0.21, based on 
[HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old version 
of the documentation in 
2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs]

Various posts on the Internet also suggest that append write has been 
available in HDFS, and will always be available in the Hadoop version 2 branch.
 
Can we update the documentation to reflect the current status?

(Please also review whether the documentation should also be updated for 
version 0.21 and above, and the version 1.x branch)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-488) Implement moveToLocal HDFS command

2015-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652995#comment-14652995
 ] 

Hadoop QA commented on HDFS-488:


\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:red}-1{color} | javac |   0m 32s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748586/HDFS-488.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c3364ca |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11893/console |


This message was automatically generated.

 Implement moveToLocal  HDFS command
 ---

 Key: HDFS-488
 URL: https://issues.apache.org/jira/browse/HDFS-488
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Ravi Phulari
Assignee: Steven Capo
Priority: Minor
  Labels: newbie
 Attachments: HDFS-488.patch, Screen Shot 2014-07-23 at 12.28.23 PM 
 1.png


 Surprisingly, executing the HDFS FsShell command -moveToLocal outputs: Option 
 '-moveToLocal' is not implemented yet.
  
 {code}
 statepick-lm:Hadoop rphulari$ bin/hadoop fs -moveToLocal bt t
 Option '-moveToLocal' is not implemented yet.
 {code}
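 A sketch of the expected semantics using the public {{FileSystem}} API (copy to 
 the local filesystem, then delete the HDFS source); this is illustrative only, 
 not the actual FsShell implementation:
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;

 FileSystem fs = FileSystem.get(new Configuration());
 // delSrc=true removes the HDFS source after a successful local copy
 fs.copyToLocalFile(true, new Path("/user/rphulari/bt"), new Path("t"));
 {code}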



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8704) Erasure Coding: client fails to write large file when one datanode fails

2015-08-03 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8704:

Status: Patch Available  (was: Open)

 Erasure Coding: client fails to write large file when one datanode fails
 

 Key: HDFS-8704
 URL: https://issues.apache.org/jira/browse/HDFS-8704
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-8704-000.patch, HDFS-8704-HDFS-7285-002.patch, 
 HDFS-8704-HDFS-7285-003.patch, HDFS-8704-HDFS-7285-004.patch


 I tested the current code on a 5-node cluster using RS(3,2).  When a datanode is 
 corrupt, the client succeeds in writing a file smaller than a block group but 
 fails to write a large one. {{TestDFSStripeOutputStreamWithFailure}} only tests 
 files smaller than a block group; this jira will add more test situations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8852) HDFS architecture documentation of version 2.x is outdated about append write support

2015-08-03 Thread Hong Dai Thanh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Dai Thanh updated HDFS-8852:
-
Summary: HDFS architecture documentation of version 2.x is outdated about 
append write support  (was: Documentation of Hadoop 2.x is outdated about 
append write support)

 HDFS architecture documentation of version 2.x is outdated about append write 
 support
 -

 Key: HDFS-8852
 URL: https://issues.apache.org/jira/browse/HDFS-8852
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Hong Dai Thanh

 In the [latest version of the 
 documentation|http://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Simple_Coherency_Model],
  and also documentation for all releases with version 2, it’s mentioned that 
 “A file once created, written, and closed need not be changed. “ and “There 
 is a plan to support appending-writes to files in the future.” 
  
 However, as far as I know, HDFS has supported append write since 0.21, based 
 on [HDFS-265|https://issues.apache.org/jira/browse/HDFS-265] and [the old 
 version of the documentation in 
 2012|https://web.archive.org/web/20121221171824/http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html#Appending-Writes+and+File+Syncs]
 Various posts on the Internet also suggest that append write has been 
 available in HDFS, and will always be available in the Hadoop version 2 branch.
  
 Can we update the documentation to reflect the current status?
 (Please also review whether the documentation should also be updated for 
 version 0.21 and above, and the version 1.x branch)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8747) Provide Better Scratch Space and Soft Delete Support for HDFS Encryption Zones

2015-08-03 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652534#comment-14652534
 ] 

Andrew Wang commented on HDFS-8747:
---

From our side, we have some customers using encryption who want Trash as a 
safety mechanism. So simply using -skipTrash means they lose this safety. My 
advice has been to use snapshots, since snapshots provide similar (if not 
superior) properties to trash. That's also why I'm willing to accept some of 
the compromises regarding the proposed design; while not perfect, it's better 
than what we've got now.

I do think though that nested encryption zones would make this better yet (for 
reasons even besides trash), and would not be too difficult to implement.

 Provide Better Scratch Space and Soft Delete Support for HDFS Encryption 
 Zones
 --

 Key: HDFS-8747
 URL: https://issues.apache.org/jira/browse/HDFS-8747
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption
Affects Versions: 2.6.0
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-8747-07092015.pdf, HDFS-8747-07152015.pdf, 
 HDFS-8747-07292015.pdf


 HDFS Transparent Data Encryption At-Rest was introduced in Hadoop 2.6 to 
 allow creating an encryption zone on top of a single HDFS directory. Files under 
 the root directory of the encryption zone are encrypted/decrypted 
 transparently upon HDFS client write or read operations. 
 Generally, it does not support rename (without data copying) across encryption 
 zones or between an encryption zone and a non-encryption zone because of the 
 different security settings of encryption zones. However, there are certain use 
 cases where efficient rename support is desired. This JIRA proposes better 
 support of two such use cases, “Scratch Space” (a.k.a. staging area) and “Soft 
 Delete” (a.k.a. trash), with HDFS encryption zones.
 “Scratch Space” is widely used in Hadoop jobs and requires efficient 
 rename support. Temporary files from MR jobs are usually stored in a staging 
 area outside the encryption zone, such as the “/tmp” directory, and then 
 renamed to the target directories as specified once the data is ready to be 
 further processed. 
 Below is a summary of supported/unsupported cases in the latest Hadoop:
 * Rename within the encryption zone is supported.
 * Renaming the entire encryption zone by moving the root directory of the zone 
 is allowed.
 * Renaming a sub-directory/file from an encryption zone to a non-encryption 
 zone is not allowed.
 * Renaming a sub-directory/file from encryption zone A to encryption zone B is 
 not allowed.
 * Renaming from a non-encryption zone to an encryption zone is not allowed.
 “Soft delete” (a.k.a. trash) is a client-side “soft delete” feature that 
 helps prevent accidental deletion of files and directories. If trash is 
 enabled and a file or directory is deleted using the Hadoop shell, the file 
 is moved to the .Trash directory of the user's home directory instead of 
 being deleted.  Deleted files are initially moved (renamed) to the Current 
 sub-directory of the .Trash directory with original path being preserved. 
 Files and directories in the trash can be restored simply by moving them to a 
 location outside the .Trash directory.
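 For illustration, the same move-to-trash behaviour is available through the 
 public {{org.apache.hadoop.fs.Trash}} API (a sketch; the path is an example 
 only):
 {code}
 Configuration conf = new Configuration();
 FileSystem fs = FileSystem.get(conf);
 // Moves /user/alice/data to /user/alice/.Trash/Current/user/alice/data
 Trash.moveToAppropriateTrash(fs, new Path("/user/alice/data"), conf);
 {code}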
 Due to the limited rename support, deleting a sub-directory/file within an 
 encryption zone with the trash feature enabled is not allowed. The client has 
 to use the -skipTrash option to work around this. HADOOP-10902 and HDFS-6767 
 improved the error message but without a complete solution to the problem. 
 We propose to solve the problem by generalizing the mapping between an 
 encryption zone and its underlying HDFS directories from 1:1 today to 1:N. 
 The encryption zone should allow non-overlapping directories, such as scratch 
 space or soft-delete trash locations, to be added/removed dynamically after 
 creation. This way, rename for scratch space and soft delete can be 
 better supported without breaking the assumption that rename is only 
 supported within the zone. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8804) Erasure Coding: use DirectBufferPool in DFSStripedInputStream for buffer allocation

2015-08-03 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652536#comment-14652536
 ] 

Zhe Zhang commented on HDFS-8804:
-

Thanks Jing for the work! The patch looks good to me. The only minor comment is 
that the below section could use some assertions to avoid overlapped allocation 
in the {{parityBuf}}:
{code}
  ByteBuffer buf = getParityBuffer().duplicate();
  buf.position(cellSize * decodeIndex);
  buf.limit(cellSize * decodeIndex + (int) alignedStripe.range.spanInBlock);
  decodeInputs[decodeIndex] = buf.slice();
{code}

For example, since this is a stateful read, we can at least assert that 
{{alignedStripe.range.spanInBlock}} is no larger than {{cellSize}}. Ideally we 
should also assert that {{decodeIndex}} has not been allocated yet, but that 
doesn't seem easy.
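
A sketch of that first assertion, reusing the names from the snippet above 
(illustrative only):
{code}
// Guard against overlapped allocation inside parityBuf during a
// stateful read: a stripe range must fit within a single cell.
assert alignedStripe.range.spanInBlock <= cellSize :
    "stripe span " + alignedStripe.range.spanInBlock
        + " exceeds cell size " + cellSize;
{code}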

As a follow-on, we can think about how to do the same for pread.

 Erasure Coding: use DirectBufferPool in DFSStripedInputStream for buffer 
 allocation
 ---

 Key: HDFS-8804
 URL: https://issues.apache.org/jira/browse/HDFS-8804
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8804.000.patch, HDFS-8804.001.patch


 Currently we directly allocate direct ByteBuffer in DFSStripedInputstream for 
 the stripe buffer and the buffers holding parity data. It's better to get 
 ByteBuffer from DirectBufferPool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7966) New Data Transfer Protocol via HTTP/2

2015-08-03 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652549#comment-14652549
 ] 

Andrew Wang commented on HDFS-7966:
---

I guess my question here is similar to what [~stack] and [~tlipcon] posed at 
the beginning. What's the upside of this new implementation? Seems like it's 
between 10 and 30% slower than the current implementation, which is not good. If 
it were the same performance but had other redeeming qualities (e.g. less code) 
then it's still worth consideration.

 New Data Transfer Protocol via HTTP/2
 -

 Key: HDFS-7966
 URL: https://issues.apache.org/jira/browse/HDFS-7966
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Qianqian Shi
  Labels: gsoc, gsoc2015, mentor
 Attachments: GSoC2015_Proposal.pdf, 
 TestHttp2LargeReadPerformance.svg, TestHttp2Performance.svg, 
 TestHttp2ReadBlockInsideEventLoop.svg


 The current Data Transfer Protocol (DTP) implements a rich set of features 
 that span across multiple layers, including:
 * Connection pooling and authentication (session layer)
 * Encryption (presentation layer)
 * Data writing pipeline (application layer)
 All these features are HDFS-specific and defined by implementation. As a 
 result it requires non-trivial amount of work to implement HDFS clients and 
 servers.
 This jira explores to delegate the responsibilities of the session and 
 presentation layers to the HTTP/2 protocol. Particularly, HTTP/2 handles 
 connection multiplexing, QoS, authentication and encryption, reducing the 
 scope of DTP to the application layer only. By leveraging the existing HTTP/2 
 library, it should simplify the implementation of both HDFS clients and 
 servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-03 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652559#comment-14652559
 ] 

Jing Zhao commented on HDFS-8828:
-

Thanks for working on this, Yufei! One quick comment is about the following 
change:
{code}
-if ((!syncFolder || !deleteMissing) && useDiff) {
+if ((!syncFolder || deleteMissing) && useDiff) {
   throw new IllegalArgumentException(
-  "Diff is valid only with update and delete options");
+  "Diff is valid only with update options");
 }
{code}

Currently we already delete files/directories according to the DELETE diff. That 
looks consistent with the deleteMissing option to me. Is there any specific 
reason we want to change the semantics here?

 Utilize Snapshot diff report to build copy list in distcp
 -

 Key: HDFS-8828
 URL: https://issues.apache.org/jira/browse/HDFS-8828
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, snapshots
Reporter: Yufei Gu
Assignee: Yufei Gu
 Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
 HDFS-8828.003.patch


 Some users reported a huge time cost to build the file copy list in distcp (30 
 hours for 1.6M files). We can leverage the snapshot diff report to build a file 
 copy list including only the files/dirs which changed between two snapshots 
 (or a snapshot and a normal dir). It speeds up the process in two ways: 1. 
 less copy-list building time; 2. fewer file-copy MR jobs.
 The HDFS snapshot diff report provides information about file/directory 
 creation, deletion, rename and modification between two snapshots, or between 
 a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
 then falls back to the default distcp, so it still relies on the default 
 distcp to build the complete list of files under the source dir. This patch 
 only puts created and modified files into the copy list based on the snapshot 
 diff report, so we can minimize the number of files to copy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8784) BlockInfo#numNodes should be numStorages

2015-08-03 Thread Jagadesh Kiran N (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651638#comment-14651638
 ] 

Jagadesh Kiran N commented on HDFS-8784:


The pre-patch failure is not related to the changes done in the patch.

 BlockInfo#numNodes should be numStorages
 

 Key: HDFS-8784
 URL: https://issues.apache.org/jira/browse/HDFS-8784
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Jagadesh Kiran N
 Attachments: HDFS-8784-00.patch, HDFS-8784-01.patch


 The method actually returns the number of storages holding a block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8839) Erasure Coding: client occasionally gets less block locations when some datanodes fail

2015-08-03 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651524#comment-14651524
 ] 

Walter Su commented on HDFS-8839:
-

bq. Otherwise, the client writing can't go on.
Yes, it hangs. It's a problem.

bq. the namenode should still allocate 9 locations even it knows one of them is 
invalid. 
It's not the best solution. Please check my last comment at HDFS-8220; we can 
continue the discussion there.

 Erasure Coding: client occasionally gets less block locations when some 
 datanodes fail 
 ---

 Key: HDFS-8839
 URL: https://issues.apache.org/jira/browse/HDFS-8839
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo

 9 datanodes, writing two block groups. A datanode dies when writing the first 
 block group. When the client retrieves the second block group from the 
 namenode, the returned block group occasionally contains only 8 locations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8784) BlockInfo#numNodes should be numStorages

2015-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651595#comment-14651595
 ] 

Hadoop QA commented on HDFS-8784:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 15s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 40s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 34s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 28s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  4s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests | 162m  7s | Tests passed in hadoop-hdfs. 
|
| | | 203m 16s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748391/HDFS-8784-01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 90b5104 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11887/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11887/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11887/console |


This message was automatically generated.

 BlockInfo#numNodes should be numStorages
 

 Key: HDFS-8784
 URL: https://issues.apache.org/jira/browse/HDFS-8784
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Jagadesh Kiran N
 Attachments: HDFS-8784-00.patch, HDFS-8784-01.patch


 The method actually returns the number of storages holding a block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8841) Catch throwable return null

2015-08-03 Thread Jagadesh Kiran N (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651694#comment-14651694
 ] 

Jagadesh Kiran N commented on HDFS-8841:


I fail to see the chance of an Error (like ClassNotFoundError) in the following 
code. Could you please point me to it? Thanks!

{code}
try {
  final Path tmp = new Path(job.get(TMP_DIR_LABEL), relativedst);
  if (destFileSys.delete(tmp, true))
    break;
} catch (Throwable ex) {
  // ignore, we are just cleaning up
  LOG.debug("Ignoring cleanup exception", ex);
}
{code}

 Catch throwable return null
 ---

 Key: HDFS-8841
 URL: https://issues.apache.org/jira/browse/HDFS-8841
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: songwanging
Assignee: Jagadesh Kiran N
Priority: Minor

 In the method map of the class 
 \hadoop-2.7.1-src\hadoop-tools\hadoop-extras\src\main\java\org\apache\hadoop\tools\DistCpV1.java,
 there is this code:
 {code}
 public void map(LongWritable key,
                 FilePair value,
                 OutputCollector<WritableComparable<?>, Text> out,
                 Reporter reporter) throws IOException {
   ...
   } catch (Throwable ex) {
     // ignore, we are just cleaning up
     LOG.debug("Ignoring cleanup exception", ex);
   }
   ...
 }
 {code}
 Throwable is the parent type of Exception and Error, so catching Throwable 
 means catching both Exceptions as well as Errors. An Exception is something 
 you can recover from (like IOException); an Error is something more serious 
 that you usually can't recover from easily (like ClassNotFoundError), so it 
 doesn't make much sense to catch an Error.
 We should convert the code to catch Exception instead.
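 A sketch of the proposed change, reusing the snippet above (only the caught 
 type changes, so Errors are no longer swallowed):
 {code}
 try {
   final Path tmp = new Path(job.get(TMP_DIR_LABEL), relativedst);
   if (destFileSys.delete(tmp, true))
     break;
 } catch (Exception ex) {
   // ignore, we are just cleaning up; Errors now propagate
   LOG.debug("Ignoring cleanup exception", ex);
 }
 {code}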



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7601) Operations(e.g. balance) failed due to deficient configuration parsing

2015-08-03 Thread Doris Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doris Gu updated HDFS-7601:
---
Attachment: 0001-for-hdfs-7601.patch

 Operations(e.g. balance) failed due to deficient configuration parsing
 --

 Key: HDFS-7601
 URL: https://issues.apache.org/jira/browse/HDFS-7601
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer  mover
Affects Versions: 2.3.0, 2.6.0
Reporter: Doris Gu
Assignee: Doris Gu
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: 0001-for-hdfs-7601.patch


 Some operations, for example balance, parse the configuration (from 
 core-site.xml, hdfs-site.xml) to get the NameService URIs to connect to.
 The current method considers URIs that end with or without a trailing "/" 
 to be two different URIs, so the following operations may fail.
 bq. [hdfs://haCluster, hdfs://haCluster/] are considered to be two different 
 URIs, which are actually the same.
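 A minimal sketch of the kind of normalization that would fix this (a 
 hypothetical helper, not the actual patch):
 {code}
 // Treat hdfs://haCluster and hdfs://haCluster/ as the same URI by
 // stripping a trailing "/" before comparing.
 static URI normalize(URI uri) {
   String s = uri.toString();
   return s.endsWith("/") ? URI.create(s.substring(0, s.length() - 1)) : uri;
 }
 {code}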



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8838) Tolerate datanode failures in DFSStripedOutputStream when the data length is small

2015-08-03 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651555#comment-14651555
 ] 

Li Bo commented on HDFS-8838:
-

The number of datanodes is set to 9 in the unit test. Due to the problem of 
HDFS-8220 or HDFS-8839, I think we should use at least 10 datanodes for testing 
one datanode failure.

 Tolerate datanode failures in DFSStripedOutputStream when the data length is 
 small
 --

 Key: HDFS-8838
 URL: https://issues.apache.org/jira/browse/HDFS-8838
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h8838_20150729.patch, h8838_20150731.patch


 Currently, DFSStripedOutputStream cannot tolerate datanode failures when the 
 data length is small.  We fix the bugs here and add more tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7601) Operations(e.g. balance) failed due to deficient configuration parsing

2015-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651561#comment-14651561
 ] 

Hadoop QA commented on HDFS-7601:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m 47s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:red}-1{color} | javac |   1m 44s | The patch appears to cause the 
build to fail. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12748405/0001-for-hdfs-7601.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 90b5104 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/11888/console |


This message was automatically generated.

 Operations(e.g. balance) failed due to deficient configuration parsing
 --

 Key: HDFS-7601
 URL: https://issues.apache.org/jira/browse/HDFS-7601
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer  mover
Affects Versions: 2.3.0, 2.6.0
Reporter: Doris Gu
Assignee: Doris Gu
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: 0001-for-hdfs-7601.patch


 Some operations, for example balance, parse the configuration (from 
 core-site.xml, hdfs-site.xml) to get the NameService URIs to connect to.
 The current method considers URIs that end with or without a trailing "/" 
 to be two different URIs, so the following operations may fail.
 bq. [hdfs://haCluster, hdfs://haCluster/] are considered to be two different 
 URIs, which are actually the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

2015-08-03 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651523#comment-14651523
 ] 

Walter Su commented on HDFS-8220:
-

 bq. If number of datanodes < NUM_DATA_BLOCKS then throw IOException("Failed to 
get datablocks number of datanodes!")
Yes.

I saw that if ( numOfDNs >= NUM_DATA_BLOCKS && numOfDNs < GROUP_SIZE ), the 
OutputStream hangs and stops writing, even if the file is smaller than a 
cellSize. We should fix that.
The writing should succeed because the user could add more DN nodes later; the 
ECWorker can recover the missing blocks.

The reason is that some streamers can't get their {{followingBlock}}, so they 
keep polling from the {{followingBlocks}} queue. We should stop these streamers 
and mark them {{failed}}, so the other streamers don't have to wait for them.
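
A rough sketch of the proposed handling (the method names are hypothetical):
{code}
// If a streamer cannot get its following block, stop it and mark it
// failed instead of letting it poll followingBlocks forever, so the
// remaining streamers do not wait for it.
if (followingBlock == null) {
  streamer.setFailed(true);
  streamer.close();
}
{code}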

 Erasure Coding: StripedDataStreamer fails to handle the blocklocations which 
 doesn't satisfy BlockGroupSize
 ---

 Key: HDFS-8220
 URL: https://issues.apache.org/jira/browse/HDFS-8220
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, 
 HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285.005.patch, 
 HDFS-8220-HDFS-7285.006.patch, HDFS-8220-HDFS-7285.007.patch, 
 HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.008.patch


 During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to 
 validate the available datanodes against the {{BlockGroupSize}}. Please see 
 the exception to understand more:
 {code}
 2015-04-22 14:56:11,313 WARN  hdfs.DFSClient (DataStreamer.java:run(538)) - 
 DataStreamer Exception
 java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 2015-04-22 14:56:11,313 INFO  hdfs.MiniDFSCluster 
 (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient 
 (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
 java.io.IOException: DataStreamer Exception: 
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 Caused by: java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   ... 1 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7601) Operations(e.g. balance) failed due to deficient configuration parsing

2015-08-03 Thread Doris Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doris Gu updated HDFS-7601:
---
Attachment: (was: 0001-for-hdfs-7601.patch)

 Operations(e.g. balance) failed due to deficient configuration parsing
 --

 Key: HDFS-7601
 URL: https://issues.apache.org/jira/browse/HDFS-7601
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer  mover
Affects Versions: 2.3.0, 2.6.0
Reporter: Doris Gu
Assignee: Doris Gu
Priority: Minor
  Labels: BB2015-05-TBR

 Some operations, for example balance, parse the configuration (from 
 core-site.xml, hdfs-site.xml) to get the NameService URIs to connect to.
 The current method considers URIs that end with or without a trailing "/" 
 to be two different URIs, so the following operations may fail.
 bq. [hdfs://haCluster, hdfs://haCluster/] are considered to be two different 
 URIs, which are actually the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8848) Support OAuth2 in libwebhdfs

2015-08-03 Thread Puneeth P (JIRA)
Puneeth P created HDFS-8848:
---

 Summary: Support OAuth2 in libwebhdfs
 Key: HDFS-8848
 URL: https://issues.apache.org/jira/browse/HDFS-8848
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Reporter: Puneeth P
Assignee: Puneeth P


As per Jira [https://issues.apache.org/jira/browse/HDFS-8155] there is a patch 
for the WebHDFS Java client. It would be good to bring libwebhdfs up to par as 
well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks

2015-08-03 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8823:
-
Attachment: HDFS-8823.001.patch

 Move replication factor into individual blocks
 --

 Key: HDFS-8823
 URL: https://issues.apache.org/jira/browse/HDFS-8823
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-8823.000.patch, HDFS-8823.001.patch


 This jira proposes to record the replication factor in the {{BlockInfo}} 
 class. The changes have two advantages:
 * Decoupling the namespace and the block management layer. It is a 
 prerequisite step to move block management off the heap or to a separate 
 process.
 * Increased flexibility in replicating blocks. Currently the replication 
 factors of all blocks in a file have to be the same, equal to the highest 
 replication factor across all snapshots. The changes will allow blocks in a 
 file to have different replication factors, potentially saving some space.
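 A sketch of the idea (not the actual patch):
 {code}
 // Record the replication factor on each block instead of only on the
 // INodeFile, so blocks of one file may differ.
 class BlockInfo {
   private short replication;   // per-block replication factor

   short getReplication() { return replication; }
   void setReplication(short r) { this.replication = r; }
   // ... existing block-management fields omitted
 }
 {code}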



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8499) Refactor BlockInfo class hierarchy with static helper class

2015-08-03 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652325#comment-14652325
 ] 

Zhe Zhang commented on HDFS-8499:
-

[~szetszwo] I wonder if you've had a chance to work on reverting or reworking 
this change? Thanks.

 Refactor BlockInfo class hierarchy with static helper class
 ---

 Key: HDFS-8499
 URL: https://issues.apache.org/jira/browse/HDFS-8499
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 2.7.0
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Fix For: 2.8.0

 Attachments: HDFS-8499.00.patch, HDFS-8499.01.patch, 
 HDFS-8499.02.patch, HDFS-8499.03.patch, HDFS-8499.04.patch, 
 HDFS-8499.05.patch, HDFS-8499.06.patch, HDFS-8499.07.patch, 
 HDFS-8499.UCFeature.patch, HDFS-bistriped.patch


 In HDFS-7285 branch, the {{BlockInfoUnderConstruction}} interface provides a 
 common abstraction for striped and contiguous UC blocks. This JIRA aims to 
 merge it to trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

2015-08-03 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8220:
---
Attachment: HDFS-8220-HDFS-7285-09.patch

 Erasure Coding: StripedDataStreamer fails to handle the blocklocations which 
 doesn't satisfy BlockGroupSize
 ---

 Key: HDFS-8220
 URL: https://issues.apache.org/jira/browse/HDFS-8220
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, 
 HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285-09.patch, 
 HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch, 
 HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.007.patch, 
 HDFS-8220-HDFS-7285.008.patch


 During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to 
 validate the available datanodes against the {{BlockGroupSize}}. Please see 
 the exception to understand more:
 {code}
 2015-04-22 14:56:11,313 WARN  hdfs.DFSClient (DataStreamer.java:run(538)) - 
 DataStreamer Exception
 java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 2015-04-22 14:56:11,313 INFO  hdfs.MiniDFSCluster 
 (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient 
 (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
 java.io.IOException: DataStreamer Exception: 
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 Caused by: java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   ... 1 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

2015-08-03 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8220:
---
Attachment: (was: HDFS-8220-HDFS-7285-09.patch)

 Erasure Coding: StripedDataStreamer fails to handle the blocklocations which 
 doesn't satisfy BlockGroupSize
 ---

 Key: HDFS-8220
 URL: https://issues.apache.org/jira/browse/HDFS-8220
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, 
 HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285.005.patch, 
 HDFS-8220-HDFS-7285.006.patch, HDFS-8220-HDFS-7285.007.patch, 
 HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.008.patch


 During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to 
 validate the available datanodes against the {{BlockGroupSize}}. Please see 
 the exception to understand more:
 {code}
 2015-04-22 14:56:11,313 WARN  hdfs.DFSClient (DataStreamer.java:run(538)) - 
 DataStreamer Exception
 java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 2015-04-22 14:56:11,313 INFO  hdfs.MiniDFSCluster 
 (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient 
 (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
 java.io.IOException: DataStreamer Exception: 
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 Caused by: java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   ... 1 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

2015-08-03 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652228#comment-14652228
 ] 

Rakesh R commented on HDFS-8220:


bq. I saw that if ( numOfDNs >= NUM_DATA_BLOCKS && numOfDNs < GROUP_SIZE ), the 
OutputStream hangs and stops writing, even if the file is smaller than a 
cellSize. We should fix that.
Good catch! I've added a testcase to simulate the same. Attached a patch where 
I close the streamers which don't have block locations available. 

After the execution of {{StripedDataStreamer.super.locateFollowingBlock()}}, it 
will validate the number of data blocks. Secondly, it checks for {{(blocks 
== null)}}. I could see that a {{LocatedBlock}} will be null when there is no 
datanode available for that index. Since we are checking for at least the 
data-blocks number of DNs, those {{LocatedBlock}}s will never be 
empty. If there are no block locations available for the parity blocks, those 
blocks will become null. I've tried an approach of closing the respective 
parity streamers; any thoughts?

 Erasure Coding: StripedDataStreamer fails to handle the blocklocations which 
 doesn't satisfy BlockGroupSize
 ---

 Key: HDFS-8220
 URL: https://issues.apache.org/jira/browse/HDFS-8220
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, 
 HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285-09.patch, 
 HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch, 
 HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.007.patch, 
 HDFS-8220-HDFS-7285.008.patch


 During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to 
 validate the available datanodes against the {{BlockGroupSize}}. Please see 
 the exception to understand more:
 {code}
 2015-04-22 14:56:11,313 WARN  hdfs.DFSClient (DataStreamer.java:run(538)) - 
 DataStreamer Exception
 java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 2015-04-22 14:56:11,313 INFO  hdfs.MiniDFSCluster 
 (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient 
 (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
 java.io.IOException: DataStreamer Exception: 
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 Caused by: java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   ... 1 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8848) Support OAuth2 in libwebhdfs

2015-08-03 Thread Puneeth P (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Puneeth P updated HDFS-8848:

Issue Type: New Feature  (was: Improvement)

 Support OAuth2 in libwebhdfs
 

 Key: HDFS-8848
 URL: https://issues.apache.org/jira/browse/HDFS-8848
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: webhdfs
Reporter: Puneeth P
Assignee: Puneeth P

 As per Jira [https://issues.apache.org/jira/browse/HDFS-8155] there is a 
 patch for the WebHDFS Java client. It would be good to bring libwebhdfs up to 
 par as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

2015-08-03 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8220:
---
Attachment: HDFS-8220-HDFS-7285-09.patch

 Erasure Coding: StripedDataStreamer fails to handle the blocklocations which 
 doesn't satisfy BlockGroupSize
 ---

 Key: HDFS-8220
 URL: https://issues.apache.org/jira/browse/HDFS-8220
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, 
 HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285-09.patch, 
 HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch, 
 HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.007.patch, 
 HDFS-8220-HDFS-7285.008.patch


 During write operations {{StripedDataStreamer#locateFollowingBlock}} fails to 
 validate the available datanodes against the {{BlockGroupSize}}. Please see 
 the exception to understand more:
 {code}
 2015-04-22 14:56:11,313 WARN  hdfs.DFSClient (DataStreamer.java:run(538)) - 
 DataStreamer Exception
 java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 2015-04-22 14:56:11,313 INFO  hdfs.MiniDFSCluster 
 (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient 
 (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
 java.io.IOException: DataStreamer Exception: 
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 Caused by: java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   ... 1 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7929) inotify unable fetch pre-upgrade edit log segments once upgrade starts

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7929:
--
Labels: 2.6.1-candidate  (was: )

 inotify unable fetch pre-upgrade edit log segments once upgrade starts
 --

 Key: HDFS-7929
 URL: https://issues.apache.org/jira/browse/HDFS-7929
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Zhe Zhang
Assignee: Zhe Zhang
  Labels: 2.6.1-candidate
 Fix For: 2.7.0

 Attachments: HDFS-7929-000.patch, HDFS-7929-001.patch, 
 HDFS-7929-002.patch, HDFS-7929-003.patch


 inotify is often used to periodically poll HDFS events. However, once an HDFS 
 upgrade has started, edit logs are moved to /previous on the NN, which is not 
 accessible. Moreover, once the upgrade is finalized, /previous is currently 
 lost forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8480) Fix performance and timeout issues in HDFS-7929 by using hard-links to preserve old edit logs instead of copying them

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8480:
--
Labels: 2.6.1-candidate  (was: )

 Fix performance and timeout issues in HDFS-7929 by using hard-links to 
 preserve old edit logs instead of copying them
 -

 Key: HDFS-8480
 URL: https://issues.apache.org/jira/browse/HDFS-8480
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Critical
  Labels: 2.6.1-candidate
 Fix For: 2.7.1

 Attachments: HDFS-8480.00.patch, HDFS-8480.01.patch, 
 HDFS-8480.02.patch, HDFS-8480.03.patch


 HDFS-7929 copies existing edit logs to the storage directory of the upgraded 
 {{NameNode}}. This slows down the upgrade process. This JIRA aims to use 
 hard-linking instead of per-op copying to achieve the same goal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8804) Erasure Coding: use DirectBufferPool in DFSStripedInputStream for buffer allocation

2015-08-03 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8804:

Attachment: HDFS-8804.001.patch

Thanks Nicholas for the review! I updated the patch to address the comments. I 
did not add synchronized to {{getParityBuffer}} because it is only used in 
StatefulStripeReader, which is already protected by the lock.

 Erasure Coding: use DirectBufferPool in DFSStripedInputStream for buffer 
 allocation
 ---

 Key: HDFS-8804
 URL: https://issues.apache.org/jira/browse/HDFS-8804
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8804.000.patch, HDFS-8804.001.patch


 Currently we directly allocate direct ByteBuffer in DFSStripedInputstream for 
 the stripe buffer and the buffers holding parity data. It's better to get 
 ByteBuffer from DirectBufferPool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8846) Create edit log files with old layout version for upgrade testing

2015-08-03 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652515#comment-14652515
 ] 

Zhe Zhang commented on HDFS-8846:
-

Thanks Ming for the feedback!

I was planning to only add edit log files. But I think creating an entire NN 
dir structure with old layout version is a good idea. It could support a 
broader range of upgrade tests.

 Create edit log files with old layout version for upgrade testing
 -

 Key: HDFS-8846
 URL: https://issues.apache.org/jira/browse/HDFS-8846
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang

 Per discussion under HDFS-8480, we should create some edit log files with old 
 layout version, to test whether they can be correctly handled in upgrades.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-03 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated HDFS-8828:
---
Attachment: HDFS-8828.003.patch

 Utilize Snapshot diff report to build copy list in distcp
 -

 Key: HDFS-8828
 URL: https://issues.apache.org/jira/browse/HDFS-8828
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, snapshots
Reporter: Yufei Gu
Assignee: Yufei Gu
 Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
 HDFS-8828.003.patch


 Some users reported a huge time cost to build the file copy list in distcp (30 
 hours for 1.6M files). We can leverage the snapshot diff report to build a file 
 copy list including only the files/dirs which changed between two snapshots 
 (or a snapshot and a normal dir). It speeds up the process in two ways: 1. 
 less copy-list building time; 2. fewer file-copy MR jobs.
 The HDFS snapshot diff report provides information about file/directory 
 creation, deletion, rename and modification between two snapshots, or between 
 a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
 then falls back to the default distcp, so it still relies on the default 
 distcp to build the complete list of files under the source dir. This patch 
 only puts created and modified files into the copy list based on the snapshot 
 diff report, so we can minimize the number of files to copy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp

2015-08-03 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652525#comment-14652525
 ] 

Yufei Gu commented on HDFS-8828:


Hi Yongjun,

Thank you very much for the detailed code review and all the nice suggestions. 
I've uploaded a new patch (HDFS-8828.003.patch) addressing the above comments.

 Utilize Snapshot diff report to build copy list in distcp
 -

 Key: HDFS-8828
 URL: https://issues.apache.org/jira/browse/HDFS-8828
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, snapshots
Reporter: Yufei Gu
Assignee: Yufei Gu
 Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, 
 HDFS-8828.003.patch


 Some users reported a huge time cost to build the file copy list in distcp (30 
 hours for 1.6M files). We can leverage the snapshot diff report to build a file 
 copy list including only the files/dirs which changed between two snapshots 
 (or a snapshot and a normal dir). It speeds up the process in two ways: 1. 
 less copy-list building time; 2. fewer file-copy MR jobs.
 The HDFS snapshot diff report provides information about file/directory 
 creation, deletion, rename and modification between two snapshots, or between 
 a snapshot and a normal directory. HDFS-7535 synchronizes deletion and rename, 
 then falls back to the default distcp, so it still relies on the default 
 distcp to build the complete list of files under the source dir. This patch 
 only puts created and modified files into the copy list based on the snapshot 
 diff report, so we can minimize the number of files to copy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8849) fsck should report number of missing blocks with replication factor 1

2015-08-03 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-8849:
---

 Summary: fsck should report number of missing blocks with 
replication factor 1
 Key: HDFS-8849
 URL: https://issues.apache.org/jira/browse/HDFS-8849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor


HDFS-7165 supports reporting number of blocks with replication factor 1 in 
{{dfsadmin}} and NN metrics. But it didn't extend {{fsck}} with the same 
support, which is the aim of this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8046) Allow better control of getContentSummary

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8046:
--
Labels: 2.6.1-candidate 2.7.2-candidate  (was: 2.6.1-candidate)

 Allow better control of getContentSummary
 -

 Key: HDFS-8046
 URL: https://issues.apache.org/jira/browse/HDFS-8046
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
  Labels: 2.6.1-candidate, 2.7.2-candidate
 Fix For: 2.8.0

 Attachments: HDFS-8046.v1.patch


 On busy clusters, users performing quota checks against a big directory 
 structure can affect namenode performance. It has become a lot better 
 after HDFS-4995, but as clusters get bigger and busier, it is apparent that 
 we need finer-grained control to avoid a long read lock causing throughput 
 drops.
 Even with the unfair namesystem lock setting, a long read lock (tens of 
 milliseconds) can starve many readers and especially writers. So the locking 
 duration should be reduced, which can be done by imposing a lower 
 count-per-iteration limit in the existing implementation.  But HDFS-4995 came 
 with a fixed amount of sleep between locks. This needs to be made 
 configurable, so that {{getContentSummary()}} doesn't get exceedingly slow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7894) Rolling upgrade readiness is not updated in jmx until query command is issued.

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7894:
--
Labels: 2.6.1-candidate  (was: )

 Rolling upgrade readiness is not updated in jmx until query command is issued.
 --

 Key: HDFS-7894
 URL: https://issues.apache.org/jira/browse/HDFS-7894
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Brahma Reddy Battula
Priority: Critical
  Labels: 2.6.1-candidate
 Fix For: 2.7.1

 Attachments: HDFS-7894-002.patch, HDFS-7894-003.patch, HDFS-7894.patch


 When an HDFS rolling upgrade is started and a rollback image is 
 created/uploaded, the active NN does not update its {{rollingUpgradeInfo}} 
 until it receives a query command via RPC. This results in inconsistent info 
 showing up in the web UI and its JMX page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7446) HDFS inotify should have the ability to determine what txid it has read up to

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7446:
--
Labels: 2.6.1-candidate  (was: )

 HDFS inotify should have the ability to determine what txid it has read up to
 -

 Key: HDFS-7446
 URL: https://issues.apache.org/jira/browse/HDFS-7446
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 2.6.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
  Labels: 2.6.1-candidate
 Fix For: 2.7.0

 Attachments: HDFS-7446.001.patch, HDFS-7446.002.patch, 
 HDFS-7446.003.patch


 HDFS inotify should have the ability to determine what txid it has read up 
 to.  This will allow users who want to avoid missing any events to record 
 this txid and use it to resume reading events at the spot they left off.
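
A minimal sketch of how a consumer might use this, assuming the 
{{HdfsAdmin}}/{{EventBatch}} API shape from the committed patch (the NN URI, 
loop bound, and checkpointing are placeholders):

{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.EventBatch;

public class InotifyResumeExample {
  public static void main(String[] args) throws Exception {
    HdfsAdmin admin =
        new HdfsAdmin(URI.create("hdfs://nn1:8020"), new Configuration());
    DFSInotifyEventInputStream stream = admin.getInotifyEventStream();
    long lastReadTxid = -1;
    for (int i = 0; i < 100; i++) {       // bounded loop for the example
      EventBatch batch = stream.take();   // blocks until events arrive
      // ... handle batch.getEvents() ...
      lastReadTxid = batch.getTxid();     // persist this somewhere durable
    }
    // After a restart, resume from the recorded txid so no events are missed:
    DFSInotifyEventInputStream resumed =
        admin.getInotifyEventStream(lastReadTxid);
  }
}
{code}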



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8838) Tolerate datanode failures in DFSStripedOutputStream when the data length is small

2015-08-03 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652403#comment-14652403
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8838:
---

[~walter.k.su], thanks for showing a detailed failure case.  It is a 
multiple-failure case.  I need to think about how to handle it.  Will work on 
it in HDFS-8383.  Or are you interested in working on HDFS-8383?

[~libo-intel], thanks for the suggestion.  A datanode is started in each test, 
so we already have 10 datanodes.

 Tolerate datanode failures in DFSStripedOutputStream when the data length is 
 small
 --

 Key: HDFS-8838
 URL: https://issues.apache.org/jira/browse/HDFS-8838
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
 Attachments: h8838_20150729.patch, h8838_20150731.patch


 Currently, DFSStripedOutputStream cannot tolerate datanode failures when the 
 data length is small.  We fix the bugs here and add more tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7182) JMX metrics aren't accessible when NN is busy

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7182:
--
Labels: 2.6.1-candidate  (was: )

 JMX metrics aren't accessible when NN is busy
 -

 Key: HDFS-7182
 URL: https://issues.apache.org/jira/browse/HDFS-7182
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
  Labels: 2.6.1-candidate
 Fix For: 2.7.0

 Attachments: HDFS-7182-2.patch, HDFS-7182-3.patch, HDFS-7182.patch


 HDFS-5693 addressed all NN JMX metrics in Hadoop 2.0.5. Since then, a couple 
 of new metrics have been added. It turns out that RollingUpgradeStatus 
 requires the FSNamesystem read lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7314) When the DFSClient lease cannot be renewed, abort open-for-write files rather than the entire DFSClient

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7314:
--
Labels: 2.6.1-candidate 2.7.2-candidate BB2015-05-TBR  (was: BB2015-05-TBR)

 When the DFSClient lease cannot be renewed, abort open-for-write files rather 
 than the entire DFSClient
 ---

 Key: HDFS-7314
 URL: https://issues.apache.org/jira/browse/HDFS-7314
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ming Ma
Assignee: Ming Ma
  Labels: 2.6.1-candidate, 2.7.2-candidate, BB2015-05-TBR
 Fix For: 2.8.0

 Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, 
 HDFS-7314-5.patch, HDFS-7314-6.patch, HDFS-7314-7.patch, HDFS-7314-8.patch, 
 HDFS-7314-9.patch, HDFS-7314.patch


 It happened in a YARN nodemanager scenario, but it could happen to any 
 long-running service that uses a cached instance of DistributedFileSystem.
 1. The active NN was under heavy load, so it became unavailable for 10 
 minutes; any DFSClient request got a ConnectTimeoutException.
 2. The YARN nodemanager uses DFSClient for certain write operations, such as 
 the log aggregator or the shared cache in YARN-1492. The renewLease RPC of 
 the DFSClient used by the YARN NM got a ConnectTimeoutException.
 {noformat}
 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to 
 renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds.  
 Aborting ...
 {noformat}
 3. After the DFSClient is in the Aborted state, the YARN NM can't use that 
 cached instance of DistributedFileSystem.
 {noformat}
 2014-10-29 20:26:23,991 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  Failed to download rsrc...
 java.io.IOException: Filesystem closed
 at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
 at 
 org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
 at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:237)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:340)
 at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:57)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 We can make YARN or DFSClient more tolerant of temporary NN unavailability. 
 Given that the call stack is YARN -> DistributedFileSystem -> DFSClient, this 
 can be addressed at different layers:
 * YARN closes the DistributedFileSystem object when it receives some 
 well-defined exception, so the next HDFS call creates a new instance of 
 DistributedFileSystem. We would have to fix all the places in YARN, and other 
 HDFS applications would need to address this as well (a sketch of this option 
 follows below).
 * DistributedFileSystem detects an Aborted DFSClient and creates a new 
 instance of DFSClient. We would need to fix all the places where 
 DistributedFileSystem calls DFSClient.
 * After DFSClient gets into the Aborted state, it doesn't have to reject all 
 requests; instead it can retry, and if the NN becomes available again it can 
 transition back to a healthy state.
 Comments?
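
A minimal sketch of the first option (a hypothetical helper, not YARN code; it 
assumes the aborted client surfaces as an IOException with message "Filesystem 
closed", as in the trace above):

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AbortedClientRecovery {
  /** Stat a path; if the cached client was aborted, retry once with a fresh FS. */
  public static FileStatus statWithRecovery(Configuration conf, Path p)
      throws IOException {
    FileSystem fs = FileSystem.get(conf);              // cached instance
    try {
      return fs.getFileStatus(p);
    } catch (IOException e) {
      if (!"Filesystem closed".equals(e.getMessage())) {
        throw e;                                       // not the aborted-client case
      }
      FileSystem fresh = FileSystem.newInstance(conf); // bypasses the FS cache
      try {
        return fresh.getFileStatus(p);
      } finally {
        fresh.close();
      }
    }
  }
}
{code}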



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8849) fsck should report number of missing blocks with replication factor 1

2015-08-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652421#comment-14652421
 ] 

Allen Wittenauer commented on HDFS-8849:


That's pretty much covered already.  fsck will already report the number of 
blocks that don't have the minimum replication (whether that be 1 or some 
higher number).

 fsck should report number of missing blocks with replication factor 1
 -

 Key: HDFS-8849
 URL: https://issues.apache.org/jira/browse/HDFS-8849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor

 HDFS-7165 supports reporting number of blocks with replication factor 1 in 
 {{dfsadmin}} and NN metrics. But it didn't extend {{fsck}} with the same 
 support, which is the aim of this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7916:
--
Labels:   (was: 2.6.1-candidate)

 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for 
 infinite loop
 --

 Key: HDFS-7916
 URL: https://issues.apache.org/jira/browse/HDFS-7916
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.0
Reporter: Vinayakumar B
Assignee: Rushabh S Shah
Priority: Critical
 Fix For: 2.7.1

 Attachments: HDFS-7916-01.patch, HDFS-7916-1.patch


 If any bad block is found, the BPServiceActor (BPSA) for the standby NameNode 
 retries reporting it an infinite number of times.
 {noformat}2015-03-11 19:43:41,528 WARN 
 org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to report bad block 
 BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: 
 stobdtserver3/10.224.54.70:18010
 org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed 
 to report bad block 
 BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode:
 at 
 org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
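
One possible mitigation, sketched here only to illustrate the failure mode 
(this is not the committed HDFS-7916 fix): cap how many times a queued action 
is retried before it is dropped.

{code}
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;

public class BoundedRetryQueue<T> {
  private static final int MAX_RETRIES = 5;   // hypothetical cap

  private static final class Entry<T> {
    final T action;
    int attempts;
    Entry(T action) { this.action = action; }
  }

  private final Queue<Entry<T>> queue = new ArrayDeque<>();

  public void add(T action) {
    queue.add(new Entry<>(action));
  }

  /** Try each queued action once; re-queue failures until the cap is hit. */
  public void processOnce(Predicate<T> tryAction) {
    int n = queue.size();
    for (int i = 0; i < n; i++) {
      Entry<T> e = queue.poll();
      if (tryAction.test(e.action)) {
        continue;                               // succeeded; drop it
      }
      if (++e.attempts < MAX_RETRIES) {
        queue.add(e);                           // retry on a later pass
      }                                         // else give up instead of looping
    }
  }
}
{code}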



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8849) fsck should report number of missing blocks with replication factor 1

2015-08-03 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652427#comment-14652427
 ] 

Zhe Zhang commented on HDFS-8849:
-

Thanks for the input Allen. I guess there's still a small gap: even when we 
know 1) the number of missing blocks and 2) the number of blocks below min 
replication, it's not always possible to calculate the number of blocks 
meeting both conditions. So, agreed that it's partially covered; this JIRA 
will just fill in the small gap.

 fsck should report number of missing blocks with replication factor 1
 -

 Key: HDFS-8849
 URL: https://issues.apache.org/jira/browse/HDFS-8849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor

 HDFS-7165 supports reporting number of blocks with replication factor 1 in 
 {{dfsadmin}} and NN metrics. But it didn't extend {{fsck}} with the same 
 support, which is the aim of this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8849) fsck should report number of missing blocks with replication factor 1

2015-08-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652439#comment-14652439
 ] 

Allen Wittenauer commented on HDFS-8849:


I'm not sure what benefit that number provides.  If I'm missing a block below 
min rep, I'm still going through the full fsck output to try and find it.

 fsck should report number of missing blocks with replication factor 1
 -

 Key: HDFS-8849
 URL: https://issues.apache.org/jira/browse/HDFS-8849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor

 HDFS-7165 supports reporting number of blocks with replication factor 1 in 
 {{dfsadmin}} and NN metrics. But it didn't extend {{fsck}} with the same 
 support, which is the aim of this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8849) fsck should report number of missing blocks with replication factor 1

2015-08-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652471#comment-14652471
 ] 

Allen Wittenauer commented on HDFS-8849:


bq. A replication factor of 1 indicates the data is disposable. So when 
running fsck on a directory the user might want to consider this metric 
separately (e.g., be less alarmed about the amount of disposable data that is 
missing).

Meanwhile, back in real life, users set a repl factor of 1 to avoid quota 
problems. I've seen it over and over and over. It's why a lot of us are 
starting to use a min repl of 2.  Special-casing 1 is a dangerous capitulation 
to a bad practice that should be outlawed on production systems.

 fsck should report number of missing blocks with replication factor 1
 -

 Key: HDFS-8849
 URL: https://issues.apache.org/jira/browse/HDFS-8849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor

 HDFS-7165 supports reporting number of blocks with replication factor 1 in 
 {{dfsadmin}} and NN metrics. But it didn't extend {{fsck}} with the same 
 support, which is the aim of this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8046) Allow better control of getContentSummary

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8046:
--
Labels: 2.6.1-candidate  (was: )

 Allow better control of getContentSummary
 -

 Key: HDFS-8046
 URL: https://issues.apache.org/jira/browse/HDFS-8046
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
  Labels: 2.6.1-candidate
 Fix For: 2.8.0

 Attachments: HDFS-8046.v1.patch


 On busy clusters, users performing quota checks against a big directory 
 structure can affect namenode performance. It has become a lot better 
 after HDFS-4995, but as clusters get bigger and busier, it is apparent that 
 we need finer-grained control to avoid a long read lock causing a throughput 
 drop. Even with the unfair namesystem lock setting, a long read lock (tens of 
 milliseconds) can starve many readers and especially writers. So the locking 
 duration should be reduced, which can be done by imposing a lower 
 count-per-iteration limit in the existing implementation. But HDFS-4995 came 
 with a fixed amount of sleep between locks. This needs to be made 
 configurable, so that {{getContentSummary()}} doesn't get exceedingly slow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop

2015-08-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7916:
--
Labels: 2.6.1-candidate  (was: )

 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for 
 infinite loop
 --

 Key: HDFS-7916
 URL: https://issues.apache.org/jira/browse/HDFS-7916
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.7.0
Reporter: Vinayakumar B
Assignee: Rushabh S Shah
Priority: Critical
  Labels: 2.6.1-candidate
 Fix For: 2.7.1

 Attachments: HDFS-7916-01.patch, HDFS-7916-1.patch


 If any bad block is found, the BPServiceActor (BPSA) for the standby NameNode 
 retries reporting it an infinite number of times.
 {noformat}2015-03-11 19:43:41,528 WARN 
 org.apache.hadoop.hdfs.server.datanode.DataNode: Failed to report bad block 
 BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode: 
 stobdtserver3/10.224.54.70:18010
 org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed 
 to report bad block 
 BP-1384821822-10.224.54.68-1422634566395:blk_1079544278_5812006 to namenode:
 at 
 org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8220) Erasure Coding: StripedDataStreamer fails to handle the blocklocations which doesn't satisfy BlockGroupSize

2015-08-03 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652430#comment-14652430
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8220:
---

A minor comment:
{code}
   if (!coordinator.getStripedDataStreamer(i).isFailed()) {
+    StripedDataStreamer curStreamer = coordinator
+        .getStripedDataStreamer(i);
{code}
Let's call getStripedDataStreamer before the if.  How about renaming 
curStreamer to si?  currentStreamer has a different meaning in 
DFSStripedOutputStream.
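
A hedged sketch of what the suggested restructuring might look like (the 
surrounding loop body is assumed, not quoted from the patch):

{code}
final StripedDataStreamer si = coordinator.getStripedDataStreamer(i);
if (!si.isFailed()) {
  // ... use si here instead of calling getStripedDataStreamer(i) again ...
}
{code}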

 Erasure Coding: StripedDataStreamer fails to handle the blocklocations which 
 doesn't satisfy BlockGroupSize
 ---

 Key: HDFS-8220
 URL: https://issues.apache.org/jira/browse/HDFS-8220
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: HDFS-8220-001.patch, HDFS-8220-002.patch, 
 HDFS-8220-003.patch, HDFS-8220-004.patch, HDFS-8220-HDFS-7285-09.patch, 
 HDFS-8220-HDFS-7285.005.patch, HDFS-8220-HDFS-7285.006.patch, 
 HDFS-8220-HDFS-7285.007.patch, HDFS-8220-HDFS-7285.007.patch, 
 HDFS-8220-HDFS-7285.008.patch


 During write operations, {{StripedDataStreamer#locateFollowingBlock}} fails to 
 validate the available datanodes against the {{BlockGroupSize}}. Please see 
 the following exception for more detail:
 {code}
 2015-04-22 14:56:11,313 WARN  hdfs.DFSClient (DataStreamer.java:run(538)) - 
 DataStreamer Exception
 java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 2015-04-22 14:56:11,313 INFO  hdfs.MiniDFSCluster 
 (MiniDFSCluster.java:shutdown(1718)) - Shutting down the Mini HDFS Cluster
 2015-04-22 14:56:11,313 ERROR hdfs.DFSClient 
 (DFSClient.java:closeAllFilesBeingWritten(608)) - Failed to close inode 16387
 java.io.IOException: DataStreamer Exception: 
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:544)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:1)
 Caused by: java.lang.NullPointerException
   at 
 java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:374)
   at 
 org.apache.hadoop.hdfs.StripedDataStreamer.locateFollowingBlock(StripedDataStreamer.java:157)
   at 
 org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1332)
   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:424)
   ... 1 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8849) fsck should report number of missing blocks with replication factor 1

2015-08-03 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652451#comment-14652451
 ] 

Zhe Zhang commented on HDFS-8849:
-

I guess the motivation is the same as HDFS-7165. A replication factor of 1 
indicates the data is disposable. So when running {{fsck}} on a directory the 
user might want to consider this metric separately (e.g., be less alarmed 
about the amount of disposable data that is missing).

 fsck should report number of missing blocks with replication factor 1
 -

 Key: HDFS-8849
 URL: https://issues.apache.org/jira/browse/HDFS-8849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 2.7.1
Reporter: Zhe Zhang
Assignee: Zhe Zhang
Priority: Minor

 HDFS-7165 supports reporting number of blocks with replication factor 1 in 
 {{dfsadmin}} and NN metrics. But it didn't extend {{fsck}} with the same 
 support, which is the aim of this JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)