[jira] [Assigned] (HDFS-7884) NullPointerException in BlockSender

2015-03-04 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned HDFS-7884:
--

Assignee: Brahma Reddy Battula

 NullPointerException in BlockSender
 ---

 Key: HDFS-7884
 URL: https://issues.apache.org/jira/browse/HDFS-7884
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Tsz Wo Nicholas Sze
Assignee: Brahma Reddy Battula
Priority: Blocker

 {noformat}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.init(BlockSender.java:264)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:249)
   at java.lang.Thread.run(Thread.java:745)
 {noformat}
 BlockSender.java:264 is shown below
 {code}
   this.volumeRef = datanode.data.getVolume(block).obtainReference();
 {code}
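For context, an NPE at that line means {{datanode.data.getVolume(block)}} returned null, e.g. if the replica's volume was removed between the block lookup and this call. A minimal defensive sketch (not a committed fix):
{code}
FsVolumeSpi volume = datanode.data.getVolume(block);
if (volume == null) {
  // The volume may have been removed (e.g. hot-swapped) concurrently;
  // fail with a descriptive error instead of an NPE.
  throw new IOException("Cannot find volume for " + block);
}
this.volumeRef = volume.obtainReference();
{code}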





[jira] [Commented] (HDFS-7884) NullPointerException in BlockSender

2015-03-04 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347080#comment-14347080
 ] 

Brahma Reddy Battula commented on HDFS-7884:


[~szetszwo] Please re-assign this to yourself if you have already started work 
on this JIRA. Thanks!

 NullPointerException in BlockSender
 ---

 Key: HDFS-7884
 URL: https://issues.apache.org/jira/browse/HDFS-7884
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Tsz Wo Nicholas Sze
Assignee: Brahma Reddy Battula
Priority: Blocker

 {noformat}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.init(BlockSender.java:264)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:249)
   at java.lang.Thread.run(Thread.java:745)
 {noformat}
 BlockSender.java:264 is shown below
 {code}
   this.volumeRef = datanode.data.getVolume(block).obtainReference();
 {code}





[jira] [Assigned] (HDFS-7881) TestHftpFileSystem#testSeek fails in branch-2

2015-03-04 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned HDFS-7881:
--

Assignee: Brahma Reddy Battula

 TestHftpFileSystem#testSeek fails in branch-2
 -

 Key: HDFS-7881
 URL: https://issues.apache.org/jira/browse/HDFS-7881
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula
Priority: Blocker

 TestHftpFileSystem#testSeek fails in branch-2.
 {code}
 ---
  T E S T S
 ---
 Running org.apache.hadoop.hdfs.web.TestHftpFileSystem
 Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.201 sec 
 <<< FAILURE! - in org.apache.hadoop.hdfs.web.TestHftpFileSystem
 testSeek(org.apache.hadoop.hdfs.web.TestHftpFileSystem)  Time elapsed: 0.054 
 sec <<< ERROR!
 java.io.IOException: Content-Length is missing: {null=[HTTP/1.1 206 Partial 
 Content], Date=[Wed, 04 Mar 2015 05:32:30 GMT, Wed, 04 Mar 2015 05:32:30 
 GMT], Expires=[Wed, 04 Mar 2015 05:32:30 GMT, Wed, 04 Mar 2015 05:32:30 GMT], 
 Connection=[close], Content-Type=[text/plain; charset=utf-8], 
 Server=[Jetty(6.1.26)], Content-Range=[bytes 7-9/10], Pragma=[no-cache, 
 no-cache], Cache-Control=[no-cache]}
   at 
 org.apache.hadoop.hdfs.web.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:132)
   at 
 org.apache.hadoop.hdfs.web.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:104)
   at 
 org.apache.hadoop.hdfs.web.ByteRangeInputStream.read(ByteRangeInputStream.java:181)
   at java.io.FilterInputStream.read(FilterInputStream.java:83)
   at 
 org.apache.hadoop.hdfs.web.TestHftpFileSystem.testSeek(TestHftpFileSystem.java:253)
 Results :
 Tests in error: 
 TestHftpFileSystem.testSeek:253 » IO Content-Length is missing: 
 {null=[HTTP/1
 Tests run: 14, Failures: 0, Errors: 1, Skipped: 0
 {code}
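Note that the response above does carry {{Content-Range=[bytes 7-9/10]}}, so the length is recoverable even without {{Content-Length}}. A hedged sketch of that fallback (hypothetical helper, not the committed fix):
{code}
// Derive the stream length for a 206 response: prefer Content-Length,
// otherwise parse the Content-Range header, e.g. "bytes 7-9/10" -> 3.
static long lengthFromHeaders(String contentLength, String contentRange)
    throws IOException {
  if (contentLength != null) {
    return Long.parseLong(contentLength);
  }
  if (contentRange != null && contentRange.startsWith("bytes ")) {
    String[] bounds = contentRange.substring(6).split("/")[0].split("-");
    return Long.parseLong(bounds[1]) - Long.parseLong(bounds[0]) + 1;
  }
  throw new IOException("Content-Length is missing");
}
{code}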





[jira] [Commented] (HDFS-7757) Misleading error messages in FSImage.java

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347002#comment-14347002
 ] 

Hudson commented on HDFS-7757:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #122 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/122/])
HDFS-7757. Misleading error messages in FSImage.java. (Contributed by Brahma 
Reddy Battula) (arp: rev 1004473aa612ee3703394943f25687aa5bef47ea)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java


 Misleading error messages in FSImage.java
 -

 Key: HDFS-7757
 URL: https://issues.apache.org/jira/browse/HDFS-7757
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.6.0
Reporter: Arpit Agarwal
Assignee: Brahma Reddy Battula
 Fix For: 2.7.0

 Attachments: HDFS-7757-002.patch, HDFS-7757.patch


 If a quota violation is detected while loading an image, the NameNode logs 
 scary error messages indicating a bug. However, the quota violation state is 
 very easy to get into, e.g.:
 # Copy a 2MB file to a directory.
 # Set a disk space quota of 1MB on the directory. We are in quota violation 
 state now.
 We should reword the error messages, ideally making them warnings and 
 suggesting the administrator needs to fix the quotas:
 Relevant code:
 {code}
 LOG.error("BUG: Diskspace quota violation in image for "
     + dir.getFullPathName()
     + " quota = " + dsQuota + " < consumed = " + diskspace);
 ...
   LOG.error("BUG Disk quota by storage type violation in image for "
       + dir.getFullPathName()
       + " type = " + t.toString() + " quota = "
       + typeQuota + " < consumed " + typeSpace);
 {code}
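For illustration, one possible rewording in the direction the description suggests (a sketch, not the committed patch):
{code}
LOG.warn("Quota violation in image for " + dir.getFullPathName()
    + ": quota = " + dsQuota + " < consumed = " + diskspace
    + ". Please check and update the quota for this directory.");
{code}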





[jira] [Commented] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347036#comment-14347036
 ] 

Hudson commented on HDFS-7682:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2072 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2072/])
HDFS-7682. {{DistributedFileSystem#getFileChecksum}} of a snapshotted file 
includes non-snapshotted content. Contributed by Charles Lamb. (atm: rev 
f2d7a67a2c1d9dde10ed3171fdec65dff885afcc)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotFileLength.java


 {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes 
 non-snapshotted content
 

 Key: HDFS-7682
 URL: https://issues.apache.org/jira/browse/HDFS-7682
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
 Fix For: 2.7.0

 Attachments: HDFS-7682.000.patch, HDFS-7682.001.patch, 
 HDFS-7682.002.patch, HDFS-7682.003.patch


 DistributedFileSystem#getFileChecksum of a snapshotted file includes 
 non-snapshotted content.
 This happens because DistributedFileSystem#getFileChecksum 
 simply calculates the checksum of all of the CRCs from the blocks in the 
 file. But, in the case of a snapshotted file, we don't want to include data 
 in the checksum that was appended to the last block in the file after the 
 snapshot was taken.
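Conceptually, the fix caps the checksum computation at the file length recorded in the snapshot; a hedged sketch of the idea (helper names are illustrative):
{code}
import java.security.MessageDigest;
import java.util.List;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;

// Stop checksumming at the length recorded for the snapshot, so bytes
// appended to the last block after the snapshot are excluded.
static void checksumUpTo(long snapshotFileLength, List<LocatedBlock> blocks,
    MessageDigest digest) {
  long remaining = snapshotFileLength;
  for (LocatedBlock lb : blocks) {
    long bytesToChecksum = Math.min(remaining, lb.getBlockSize());
    updateChecksum(digest, lb, bytesToChecksum);  // hypothetical helper
    remaining -= bytesToChecksum;
    if (remaining <= 0) {
      break;
    }
  }
}
{code}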





[jira] [Commented] (HDFS-7869) Inconsistency in the return information while performing rolling upgrade

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347031#comment-14347031
 ] 

Hudson commented on HDFS-7869:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2072 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2072/])
HDFS-7869. Inconsistency in the return information while performing rolling 
upgrade ( Contributed by J.Andreina ) (vinayakumarb: rev 
3560180b6e9926aa3ee1357da59b28a4b4689a0d)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


 Inconsistency in the return information while performing rolling upgrade
 

 Key: HDFS-7869
 URL: https://issues.apache.org/jira/browse/HDFS-7869
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina
 Fix For: 2.7.0

 Attachments: HDFS-7869.1.patch, HDFS-7869.2.patch


 Return information while performing a finalize of rolling upgrade is improper 
 (it does not indicate whether the current action was successful or not):
 {noformat}
 Rex@XXX:~/Hadoop_27/hadoop-3.0.0-SNAPSHOT/bin ./hdfs dfsadmin 
 -rollingUpgrade finalize
 FINALIZE rolling upgrade ...
 There is no rolling upgrade in progress or rolling upgrade has already been 
 finalized.
 {noformat}
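A hedged sketch of a client-side check that would make the output unambiguous; {{isFinalized}} and the exact RPC shape are assumptions:
{code}
import org.apache.hadoop.hdfs.protocol.HdfsConstants.RollingUpgradeAction;
import org.apache.hadoop.hdfs.protocol.RollingUpgradeInfo;

// Report success explicitly when finalize actually finalized an upgrade,
// instead of one ambiguous message for both outcomes.
RollingUpgradeInfo info = dfs.rollingUpgrade(RollingUpgradeAction.FINALIZE);
if (info != null && info.isFinalized()) {          // isFinalized() assumed
  System.out.println("Rolling upgrade is finalized.");
} else {
  System.out.println("There is no rolling upgrade in progress or rolling "
      + "upgrade has already been finalized.");
}
{code}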





[jira] [Commented] (HDFS-7757) Misleading error messages in FSImage.java

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347039#comment-14347039
 ] 

Hudson commented on HDFS-7757:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2072 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2072/])
HDFS-7757. Misleading error messages in FSImage.java. (Contributed by Brahma 
Reddy Battula) (arp: rev 1004473aa612ee3703394943f25687aa5bef47ea)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Misleading error messages in FSImage.java
 -

 Key: HDFS-7757
 URL: https://issues.apache.org/jira/browse/HDFS-7757
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.6.0
Reporter: Arpit Agarwal
Assignee: Brahma Reddy Battula
 Fix For: 2.7.0

 Attachments: HDFS-7757-002.patch, HDFS-7757.patch


 If a quota violation is detected while loading an image, the NameNode logs 
 scary error messages indicating a bug. However, the quota violation state is 
 very easy to get into, e.g.:
 # Copy a 2MB file to a directory.
 # Set a disk space quota of 1MB on the directory. We are in quota violation 
 state now.
 We should reword the error messages, ideally making them warnings and 
 suggesting the administrator needs to fix the quotas:
 Relevant code:
 {code}
 LOG.error("BUG: Diskspace quota violation in image for "
     + dir.getFullPathName()
     + " quota = " + dsQuota + " < consumed = " + diskspace);
 ...
   LOG.error("BUG Disk quota by storage type violation in image for "
       + dir.getFullPathName()
       + " type = " + t.toString() + " quota = "
       + typeQuota + " < consumed " + typeSpace);
 {code}





[jira] [Commented] (HDFS-6565) Use jackson instead jetty json in hdfs-client

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347038#comment-14347038
 ] 

Hudson commented on HDFS-6565:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2072 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2072/])
HDFS-6565. Use jackson instead jetty json in hdfs-client. Contributed by Akira 
AJISAKA. (wheat9: rev e2262d3d18c6d5c2aa20f96920104dc07271b869)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestJsonUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


 Use jackson instead jetty json in hdfs-client
 -

 Key: HDFS-6565
 URL: https://issues.apache.org/jira/browse/HDFS-6565
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Akira AJISAKA
 Fix For: 2.7.0

 Attachments: HDFS-6565-002.patch, HDFS-6565-003.patch, 
 HDFS-6565-004.patch, HDFS-6565-005.patch, HDFS-6565.patch


 hdfs-client should use Jackson instead of jetty to parse JSON.
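For reference, the change amounts to swapping {{org.mortbay.util.ajax.JSON}} parsing for a Jackson {{ObjectMapper}}; a minimal sketch using Jackson 1.x package names, which Hadoop bundled at the time:
{code}
import java.io.IOException;
import java.util.Map;
import org.codehaus.jackson.map.ObjectMapper;

class JsonUtilSketch {
  // Replaces org.mortbay.util.ajax.JSON.parse(json) in the WebHDFS client.
  static Map<?, ?> parse(String json) throws IOException {
    return new ObjectMapper().readValue(json, Map.class);
  }
}
{code}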





[jira] [Updated] (HDFS-7746) Add a test randomly mixing append, truncate and snapshot

2015-03-04 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7746:
--
Attachment: h7746_20150305.patch

h7746_20150305.patch: adds TestAppendSnapshotTruncate.

 Add a test randomly mixing append, truncate and snapshot
 

 Key: HDFS-7746
 URL: https://issues.apache.org/jira/browse/HDFS-7746
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7746_20150305.patch


 TestFileTruncate.testSnapshotWithAppendTruncate already does a good job for 
 covering many test cases.  Let's add a random test for mixing many append, 
 truncate and snapshot operations.
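The shape of such a randomized test might look like the following sketch (helpers are hypothetical, not the attached patch):
{code}
// Randomly interleave the three operations, then verify the file against a
// locally maintained expected state. Logging the seed makes failures
// replayable.
Random rand = new Random(seed);
for (int i = 0; i < numOperations; i++) {
  switch (rand.nextInt(3)) {
    case 0:  appendRandomBytes(fs, file, rand.nextInt(blockSize)); break;
    case 1:  truncateToRandomLength(fs, file, rand);               break;
    default: fs.createSnapshot(snapshottableDir, "s" + i);
  }
}
verifyAgainstExpectedState(fs, file);  // hypothetical verification helper
{code}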





[jira] [Commented] (HDFS-7879) hdfs.dll does not export functions of the public libhdfs API

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347196#comment-14347196
 ] 

Hudson commented on HDFS-7879:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7254 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7254/])
HDFS-7879. hdfs.dll does not export functions of the public libhdfs API. 
Contributed by Chris Nauroth. (wheat9: rev 
f717dc51b27d72ad02732a8da397e4a1cc270514)
* hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 hdfs.dll does not export functions of the public libhdfs API
 

 Key: HDFS-7879
 URL: https://issues.apache.org/jira/browse/HDFS-7879
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, libhdfs
Affects Versions: 2.6.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 2.7.0

 Attachments: HDFS-7879.001.patch, HDFS-7879.002.patch


 HDFS-573 enabled libhdfs to be built for Windows.  This did not include 
 marking the public API functions for export in hdfs.dll though, effectively 
 making dynamic linking scenarios impossible.





[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient

2015-03-04 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347165#comment-14347165
 ] 

Daryn Sharp commented on HDFS-7435:
---

Thanks Jing & Charles.  Addressing comments, will post later today.

 PB encoding of block reports is very inefficient
 

 Key: HDFS-7435
 URL: https://issues.apache.org/jira/browse/HDFS-7435
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, 
 HDFS-7435.002.patch, HDFS-7435.patch, HDFS-7435.patch, HDFS-7435.patch, 
 HDFS-7435.patch, HDFS-7435.patch


 Block reports are encoded as a PB repeating long.  Repeating fields use an 
 {{ArrayList}} with default capacity of 10.  A block report containing tens or 
 hundreds of thousands of longs (3 for each replica) is extremely expensive 
 since the {{ArrayList}} must realloc many times.  Also, decoding repeating 
 fields will box the primitive longs which must then be unboxed.
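To make the cost concrete: decoding a repeated {{int64}} yields a boxed {{List<Long>}} that reallocates as it grows, whereas a primitive buffer does neither. A self-contained illustration (not the patch itself):
{code}
import java.util.ArrayList;
import java.util.List;

// Illustration only: the boxed path mirrors what protobuf's repeated int64
// decoding produces; the primitive path is what the optimization targets.
static void compare() {
  List<Long> boxed = new ArrayList<>();   // default capacity 10
  for (long v = 0; v < 300_000; v++) {    // ~100k replicas x 3 longs each
    boxed.add(v);                         // autoboxes every value, reallocs
  }

  long[] packed = new long[300_000];      // one allocation, no boxing
  for (int i = 0; i < packed.length; i++) {
    packed[i] = i;
  }
}
{code}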





[jira] [Updated] (HDFS-7879) hdfs.dll does not export functions of the public libhdfs API

2015-03-04 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7879:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~cnauroth] for the 
contribution.

 hdfs.dll does not export functions of the public libhdfs API
 

 Key: HDFS-7879
 URL: https://issues.apache.org/jira/browse/HDFS-7879
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, libhdfs
Affects Versions: 2.6.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 2.7.0

 Attachments: HDFS-7879.001.patch, HDFS-7879.002.patch


 HDFS-573 enabled libhdfs to be built for Windows.  This did not include 
 marking the public API functions for export in hdfs.dll though, effectively 
 making dynamic linking scenarios impossible.





[jira] [Updated] (HDFS-7879) hdfs.dll does not export functions of the public libhdfs API

2015-03-04 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7879:
-
Summary: hdfs.dll does not export functions of the public libhdfs API  
(was: hdfs.dll does not export functions of the public libhdfs API.)

 hdfs.dll does not export functions of the public libhdfs API
 

 Key: HDFS-7879
 URL: https://issues.apache.org/jira/browse/HDFS-7879
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, libhdfs
Affects Versions: 2.6.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 2.7.0

 Attachments: HDFS-7879.001.patch, HDFS-7879.002.patch


 HDFS-573 enabled libhdfs to be built for Windows.  This did not include 
 marking the public API functions for export in hdfs.dll though, effectively 
 making dynamic linking scenarios impossible.





[jira] [Commented] (HDFS-7879) hdfs.dll does not export functions of the public libhdfs API

2015-03-04 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347182#comment-14347182
 ] 

Haohui Mai commented on HDFS-7879:
--

+1

 hdfs.dll does not export functions of the public libhdfs API
 

 Key: HDFS-7879
 URL: https://issues.apache.org/jira/browse/HDFS-7879
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, libhdfs
Affects Versions: 2.6.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 2.7.0

 Attachments: HDFS-7879.001.patch, HDFS-7879.002.patch


 HDFS-573 enabled libhdfs to be built for Windows.  This did not include 
 marking the public API functions for export in hdfs.dll though, effectively 
 making dynamic linking scenarios impossible.





[jira] [Updated] (HDFS-7434) DatanodeID hashCode should not be mutable

2015-03-04 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7434:
--
Attachment: HDFS-7434.patch

The UUID, and therefore the hashCode of DatanodeID, becomes immutable.  Updated 
the few tests that relied on mutating the uuid to create new instances instead.  
This actually makes the tests more accurate because real registrations do not 
use the exact same object reference.  DatanodeIDs are now safe for collections.

 DatanodeID hashCode should not be mutable
 -

 Key: HDFS-7434
 URL: https://issues.apache.org/jira/browse/HDFS-7434
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
 Attachments: HDFS-7434.patch


 Mutable hash codes may lead to orphaned instances in a collection.  Instances 
 must always be removed prior to modification of hash code values, and 
 re-inserted.  Although current code appears to do this, the mutable hash code 
 is a landmine.
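The hazard described here is easy to reproduce with any hash-based collection; a self-contained illustration (generic key class, not DatanodeID itself):
{code}
import java.util.HashSet;
import java.util.Set;

// A key whose hashCode depends on a mutable field -- the same landmine.
class MutableKey {
  String id;
  MutableKey(String id) { this.id = id; }
  @Override public int hashCode() { return id.hashCode(); }
  @Override public boolean equals(Object o) {
    return o instanceof MutableKey && ((MutableKey) o).id.equals(this.id);
  }

  public static void main(String[] args) {
    Set<MutableKey> set = new HashSet<>();
    MutableKey k = new MutableKey("uuid-1");
    set.add(k);
    k.id = "uuid-2";                       // hash code changes in place
    System.out.println(set.contains(k));   // false: orphaned in old bucket
  }
}
{code}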





[jira] [Updated] (HDFS-7434) DatanodeID hashCode should not be mutable

2015-03-04 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-7434:
--
Status: Patch Available  (was: Open)

 DatanodeID hashCode should not be mutable
 -

 Key: HDFS-7434
 URL: https://issues.apache.org/jira/browse/HDFS-7434
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: HDFS-7434.patch


 Mutable hash codes may lead to orphaned instances in a collection.  Instances 
 must always be removed prior to modification of hash code values, and 
 re-inserted.  Although current code appears to do this, the mutable hash code 
 is a landmine.





[jira] [Assigned] (HDFS-7434) DatanodeID hashCode should not be mutable

2015-03-04 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp reassigned HDFS-7434:
-

Assignee: Daryn Sharp

 DatanodeID hashCode should not be mutable
 -

 Key: HDFS-7434
 URL: https://issues.apache.org/jira/browse/HDFS-7434
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: HDFS-7434.patch


 Mutable hash codes may lead to orphaned instances in a collection.  Instances 
 must always be removed prior to modification of hash code values, and 
 re-inserted.  Although current code appears to do this, the mutable hash code 
 is a landmine.





[jira] [Commented] (HDFS-7879) hdfs.dll does not export functions of the public libhdfs API

2015-03-04 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347230#comment-14347230
 ] 

Chris Nauroth commented on HDFS-7879:
-

Haohui, thank you for the review and commit.

 hdfs.dll does not export functions of the public libhdfs API
 

 Key: HDFS-7879
 URL: https://issues.apache.org/jira/browse/HDFS-7879
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, libhdfs
Affects Versions: 2.6.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 2.7.0

 Attachments: HDFS-7879.001.patch, HDFS-7879.002.patch


 HDFS-573 enabled libhdfs to be built for Windows.  This did not include 
 marking the public API functions for export in hdfs.dll though, effectively 
 making dynamic linking scenarios impossible.





[jira] [Commented] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount

2015-03-04 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347414#comment-14347414
 ] 

Brandon Li commented on HDFS-6488:
--

The unit test failure is not introduced by this patch.

 HDFS superuser unable to access user's Trash files using NFSv3 mount
 

 Key: HDFS-6488
 URL: https://issues.apache.org/jira/browse/HDFS-6488
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Affects Versions: 2.3.0
Reporter: Stephen Chu
Assignee: Brandon Li
 Attachments: HDFS-6488.001.patch, HDFS-6488.002.patch, 
 HDFS-6488.003.patch


 As the hdfs superuser on the NFS mount, I cannot cd or ls the 
 /user/schu/.Trash directory:
 {code}
 bash-4.1$ cd .Trash/
 bash: cd: .Trash/: Permission denied
 bash-4.1$ ls -la
 total 2
 drwxr-xr-x 4 schu 2584148964 128 Jan  7 10:42 .
 drwxr-xr-x 4 hdfs 2584148964 128 Jan  6 16:59 ..
 drwx------ 2 schu 2584148964  64 Jan  7 10:45 .Trash
 drwxr-xr-x 2 hdfs hdfs        64 Jan  7 10:42 tt
 bash-4.1$ ls .Trash
 ls: cannot open directory .Trash: Permission denied
 bash-4.1$
 {code}
 When using FsShell as hdfs superuser, I have superuser permissions to schu's 
 .Trash contents:
 {code}
 bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash
 drwx------   - schu supergroup  0 2014-01-07 10:48 
 /user/schu/.Trash/Current
 drwx------   - schu supergroup  0 2014-01-07 10:48 
 /user/schu/.Trash/Current/user
 drwx------   - schu supergroup  0 2014-01-07 10:48 
 /user/schu/.Trash/Current/user/schu
 -rw-r--r--   1 schu supergroup  4 2014-01-07 10:48 
 /user/schu/.Trash/Current/user/schu/tf1
 {code}
 The NFSv3 logs don't produce any error when the superuser tries to access 
 schu's Trash contents. However, for other permission errors (e.g. schu tries 
 to delete a directory owned by hdfs), there will be a permission error in the 
 logs.
 I think this is probably not specific to the .Trash directory.
 I created a /user/schu/dir1 which has the same permissions as .Trash (700). 
 When I try cd'ing into the directory from the NFSv3 mount as hdfs superuser, 
 I get the same permission denied.
 {code}
 [schu@hdfs-nfs ~]$ hdfs dfs -ls
 Found 4 items
 drwx------   - schu supergroup  0 2014-01-07 10:57 .Trash
 drwx------   - schu supergroup  0 2014-01-07 11:05 dir1
 -rw-r--r--   1 schu supergroup  4 2014-01-07 11:05 tf1
 drwxr-xr-x   - hdfs hdfs0 2014-01-07 10:42 tt
 bash-4.1$ whoami
 hdfs
 bash-4.1$ pwd
 /hdfs_nfs_mount/user/schu
 bash-4.1$ cd dir1
 bash: cd: dir1: Permission denied
 bash-4.1$
 {code}
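The symptom suggests the NFS gateway applies raw mode-bit checks without the superuser bypass that the NameNode's own permission checker (and hence FsShell) gets. A simplified illustration of the missing case (hypothetical helper, not the actual gateway code):
{code}
// Mode-bit check roughly as an NFS gateway might do it; the first clause
// is the piece this report implies is missing.
static boolean canRead(String user, String superuser, String owner, int mode) {
  if (user.equals(superuser)) {
    return true;                // HDFS superuser bypasses permission checks
  }
  if (user.equals(owner)) {
    return (mode & 0400) != 0;  // owner read bit
  }
  return (mode & 0004) != 0;    // "other" read bit (group check omitted)
}
{code}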





[jira] [Updated] (HDFS-7535) Utilize Snapshot diff report for distcp

2015-03-04 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7535:

   Resolution: Fixed
Fix Version/s: 2.7.0
   Status: Resolved  (was: Patch Available)

Thanks again for the review, Nicholas! I've committed this to trunk and 
branch-2.

 Utilize Snapshot diff report for distcp
 ---

 Key: HDFS-7535
 URL: https://issues.apache.org/jira/browse/HDFS-7535
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, snapshots
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.7.0

 Attachments: HDFS-7535.000.patch, HDFS-7535.001.patch, 
 HDFS-7535.002.patch, HDFS-7535.003.patch, HDFS-7535.004.patch


 Currently HDFS snapshot diff report can identify file/directory creation, 
 deletion, rename and modification under a snapshottable directory. We can use 
 the diff report for distcp between the primary cluster and a backup cluster 
 to avoid unnecessary data copy. This is especially useful when there is a big 
 directory rename happening in the primary cluster: the current distcp cannot 
 detect the rename op, so a rename usually leads to a large amount of real 
 data copy.
 More details of the approach will come in the first comment.
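The client API already exposes the needed building block; a hedged sketch of a sync step that consumes the diff report and replays renames on the target instead of recopying data (the snapshot names, path handling and overall flow are illustrative):
{code}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffReportEntry;
import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport.DiffType;

// Diff two snapshots on the source cluster and replay each rename on the
// target instead of re-copying the renamed subtree.
static void replayRenames(DistributedFileSystem sourceFs, Path sourceDir,
    DistributedFileSystem targetFs, Path targetDir) throws Exception {
  SnapshotDiffReport report =
      sourceFs.getSnapshotDiffReport(sourceDir, "s1", "s2");
  for (DiffReportEntry entry : report.getDiffList()) {
    if (entry.getType() == DiffType.RENAME) {
      targetFs.rename(
          new Path(targetDir, new String(entry.getSourcePath())),
          new Path(targetDir, new String(entry.getTargetPath())));
    }
  }
}
{code}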





[jira] [Commented] (HDFS-7746) Add a test randomly mixing append, truncate and snapshot

2015-03-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347447#comment-14347447
 ] 

Hadoop QA commented on HDFS-7746:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12702490/h7746_20150305.patch
  against trunk revision 3560180.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9732//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9732//console

This message is automatically generated.

 Add a test randomly mixing append, truncate and snapshot
 

 Key: HDFS-7746
 URL: https://issues.apache.org/jira/browse/HDFS-7746
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h7746_20150305.patch


 TestFileTruncate.testSnapshotWithAppendTruncate already does a good job for 
 covering many test cases.  Let's add a random test for mixing many append, 
 truncate and snapshot operations.





[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files

2015-03-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347370#comment-14347370
 ] 

Zhe Zhang commented on HDFS-7853:
-

Thanks Jing for the patch and sorry for the delayed review.

The overall structure looks good to me. The following implementation details 
are worth more discussion:
# You might have already started optimizing {{BlockInfoStriped#indices}} based 
on the TODO message. Just a reminder that 
{{BlockInfoStripedUnderConstruction#blockIndices}} is only applicable in 
over-replication as well. But if we get rid of indices for non-over-replicated 
{{replicas}}, we need to update {{addReplicaIfNotPresent}} as well, to always 
insert in the right position.
# How about using a Map for the over-replicated replicas (in addition to 
triplets and the {{replicas}} array)? It's less efficient but should simplify 
the code, and it's a rare condition anyway (see the sketch after this list).
# If the non-over-replicated locations in {{LocatedBlockStriped}} are sorted, 
we can remove the required {{indices}} field in PB and add an optional field 
for the excess replicas and their indices.
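A sketch of suggestion #2 above (field names are illustrative):
{code}
// Common case: one replica per block index, kept sorted by index so no
// parallel indices array is needed.
private DatanodeStorageInfo[] replicas;  // replicas[i] holds block index i

// Rare over-replication case, tracked off to the side:
// block index -> excess locations.
private Map<Integer, List<DatanodeStorageInfo>> excessReplicas;
{code}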

 Erasure coding: extend LocatedBlocks to support reading from striped files
 --

 Key: HDFS-7853
 URL: https://issues.apache.org/jira/browse/HDFS-7853
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Jing Zhao
 Attachments: HDFS-7853.000.patch


 We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work 
 with striping layout (possibly an extra list specifying the index of each 
 location in the group)





[jira] [Updated] (HDFS-7746) Add a test randomly mixing append, truncate and snapshot

2015-03-04 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7746:

Hadoop Flags: Reviewed

 Add a test randomly mixing append, truncate and snapshot
 

 Key: HDFS-7746
 URL: https://issues.apache.org/jira/browse/HDFS-7746
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h7746_20150305.patch


 TestFileTruncate.testSnapshotWithAppendTruncate already does a good job for 
 covering many test cases.  Let's add a random test for mixing many append, 
 truncate and snapshot operations.





[jira] [Commented] (HDFS-7746) Add a test randomly mixing append, truncate and snapshot

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347595#comment-14347595
 ] 

Jing Zhao commented on HDFS-7746:
-

Thanks for working on this, Nicholas! The patch looks good to me and the failed 
test should be unrelated. +1

 Add a test randomly mixing append, truncate and snapshot
 

 Key: HDFS-7746
 URL: https://issues.apache.org/jira/browse/HDFS-7746
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h7746_20150305.patch


 TestFileTruncate.testSnapshotWithAppendTruncate already does a good job for 
 covering many test cases.  Let's add a random test for mixing many append, 
 truncate and snapshot operations.





[jira] [Updated] (HDFS-7782) Read a striping layout file from client side

2015-03-04 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-7782:

Attachment: HDFS-7782-000.patch

Initial patch to illustrate the idea. It uses hedged reading from the group of 
DNs in the stripe and only supports pread now.
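A minimal sketch of what hedged pread across the stripe's datanodes can look like, in generic Java concurrency terms (not the attached patch):
{code}
import java.nio.ByteBuffer;
import java.util.concurrent.*;

// Issue the same positional read to each DN holding a cell of the stripe
// and take whichever completes first; stragglers are cancelled.
static ByteBuffer hedgedPread(Iterable<String> stripeDatanodes,
    long offset, int length) throws Exception {
  ExecutorService pool = Executors.newCachedThreadPool();
  CompletionService<ByteBuffer> cs = new ExecutorCompletionService<>(pool);
  for (String dn : stripeDatanodes) {
    cs.submit(() -> preadFrom(dn, offset, length));  // hypothetical helper
  }
  try {
    return cs.take().get();   // first successful read wins
  } finally {
    pool.shutdownNow();       // cancel the slower reads
  }
}
{code}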

 Read a striping layout file from client side
 

 Key: HDFS-7782
 URL: https://issues.apache.org/jira/browse/HDFS-7782
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: HDFS-7782-000.patch


 A client reading a file should not need to know or handle the file's layout. 
 This sub-task adds logic to DFSInputStream to support reading files with a 
 striping layout.





[jira] [Updated] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-7885:
--
Assignee: (was: Suresh Srinivas)

 Datanode should not trust the generation stamp provided by client
 -

 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Priority: Critical

 Datanode should not trust the generation stamp provided by client, since it 
 is prefetched and buffered in client, and concurrent append may increase it.
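A hedged sketch of the datanode-side validation this implies (hypothetical names): treat the replica's on-disk generation stamp as authoritative rather than the client's buffered copy:
{code}
// The client's generation stamp was prefetched and buffered, so it can be
// stale after a concurrent append; the replica on disk is authoritative.
long diskGenStamp = replica.getGenerationStamp();
if (clientGenStamp != diskGenStamp) {
  throw new IOException("Generation stamp mismatch for " + block
      + ": client=" + clientGenStamp + " on-disk=" + diskGenStamp);
}
{code}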





[jira] [Updated] (HDFS-1522) Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one constant

2015-03-04 Thread Dongming Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongming Liang updated HDFS-1522:
-
Fix Version/s: 3.0.0
   Labels: patch  (was: )
 Target Version/s: 3.0.0  (was: 2.7.0)
Affects Version/s: (was: 0.21.0)
   3.0.0
 Release Note: 
This merges Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one 
constant. Hard-coded
literals of "blk_" in various files are also updated to use the same constant. 
 Hadoop Flags: Reviewed
   Status: Patch Available  (was: In Progress)

Existing test cases are used for testing. The code changes are reviewed by 
Konstantin Shvachko.

 Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one 
 constant
 -

 Key: HDFS-1522
 URL: https://issues.apache.org/jira/browse/HDFS-1522
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Dongming Liang
  Labels: patch
 Fix For: 3.0.0

 Attachments: HDFS-1522.002.patch, HDFS-1522.patch


 Two semantically identical constants {{Block.BLOCK_FILE_PREFIX}} and 
 {{DataStorage.BLOCK_FILE_PREFIX}} should be merged into one. Should be defined 
 in {{Block}}, imo.
 Also use cases of "blk_", like in {{DirectoryScanner}}, should be replaced by 
 this constant.
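The end state of the merge is small; a sketch (keeping a delegating constant in DataStorage for compatibility is an assumption):
{code}
// Single authoritative constant, defined in Block as suggested.
public class Block {
  public static final String BLOCK_FILE_PREFIX = "blk_";
}

// DataStorage (and DirectoryScanner's "blk_" literals) reuse it.
class DataStorage {
  static final String BLOCK_FILE_PREFIX = Block.BLOCK_FILE_PREFIX;
}
{code}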





[jira] [Created] (HDFS-7884) NullPointerException in BlockSender

2015-03-04 Thread Tsz Wo Nicholas Sze (JIRA)
Tsz Wo Nicholas Sze created HDFS-7884:
-

 Summary: NullPointerException in BlockSender
 Key: HDFS-7884
 URL: https://issues.apache.org/jira/browse/HDFS-7884
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Tsz Wo Nicholas Sze
Priority: Blocker


{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.init(BlockSender.java:264)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:249)
at java.lang.Thread.run(Thread.java:745)
{noformat}
BlockSender.java:264 is shown below
{code}
  this.volumeRef = datanode.data.getVolume(block).obtainReference();
{code}





[jira] [Updated] (HDFS-7746) Add a test randomly mixing append, truncate and snapshot

2015-03-04 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7746:
--
Fix Version/s: (was: 2.7.0)
   Status: Patch Available  (was: Open)

 Add a test randomly mixing append, truncate and snapshot
 

 Key: HDFS-7746
 URL: https://issues.apache.org/jira/browse/HDFS-7746
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Attachments: h7746_20150305.patch


 TestFileTruncate.testSnapshotWithAppendTruncate already does a good job for 
 covering many test cases.  Let's add a random test for mixing many append, 
 truncate and snapshot operations.





[jira] [Created] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread vitthal (Suhas) Gogate (JIRA)
vitthal (Suhas) Gogate created HDFS-7885:


 Summary: Datanode should not trust the generation stamp provided 
by client
 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Priority: Critical


Datanode should not trust the generation stamp provided by client, since it is 
prefetched and buffered in client, and concurrent append may increase it.





[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client

2015-03-04 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347533#comment-14347533
 ] 

Arun Suresh commented on HDFS-7858:
---

This test case failure seems unrelated.

 Improve HA Namenode Failover detection on the client
 

 Key: HDFS-7858
 URL: https://issues.apache.org/jira/browse/HDFS-7858
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Arun Suresh
Assignee: Arun Suresh
 Attachments: HDFS-7858.1.patch, HDFS-7858.2.patch, HDFS-7858.2.patch


 In an HA deployment, clients are configured with the hostnames of both the 
 Active and Standby Namenodes. Clients will first try one of the NNs 
 (non-deterministically), and if it is a standby NN, it will respond to the 
 client to retry the request on the other Namenode.
 If the client happens to talk to the Standby first, and the standby is 
 undergoing some GC / is busy, then those clients might not get a response 
 soon enough to try the other NN.
 Proposed approach to solve this (a sketch follows the list):
 1) Since Zookeeper is already used as the failover controller, the clients 
 could talk to ZK and find out which is the active namenode before contacting 
 it.
 2) Long-lived DFSClients would have a ZK watch configured which fires when 
 there is a failover, so they do not have to query ZK every time to find out 
 the active NN.
 3) Clients can also cache the last active NN in the user's home directory 
 (~/.lastNN) so that short-lived clients can try that Namenode first before 
 querying ZK.
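A sketch of proposal 1) with the plain ZooKeeper client; the znode path and the helpers are assumptions, since the ZKFC's actual layout may differ:
{code}
import org.apache.zookeeper.ZooKeeper;

// Look up the active NN in ZK before the first RPC, leaving a watch behind
// (proposal 2) that fires on failover so the cached answer can be refreshed.
static String lookupActiveNN() throws Exception {
  ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 5000, null);
  byte[] data = zk.getData(
      "/hadoop-ha/mycluster/ActiveBreadCrumb",  // assumed znode path
      event -> refreshActiveNamenode(),         // hypothetical callback
      null);
  return parseActiveAddress(data);              // hypothetical decoder
}
{code}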





[jira] [Commented] (HDFS-7884) NullPointerException in BlockSender

2015-03-04 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347545#comment-14347545
 ] 

Lei (Eddy) Xu commented on HDFS-7884:
-

[~szetszwo] would you mind if I take this JIRA, if you have not started on it 
yet?

Moreover, could you post the context that generates this 
{{NullPointerException}}? Thanks!

 NullPointerException in BlockSender
 ---

 Key: HDFS-7884
 URL: https://issues.apache.org/jira/browse/HDFS-7884
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Tsz Wo Nicholas Sze
Assignee: Brahma Reddy Battula
Priority: Blocker

 {noformat}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.server.datanode.BlockSender.init(BlockSender.java:264)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
   at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
   at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:249)
   at java.lang.Thread.run(Thread.java:745)
 {noformat}
 BlockSender.java:264 is shown below
 {code}
   this.volumeRef = datanode.data.getVolume(block).obtainReference();
 {code}





[jira] [Updated] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread vitthal (Suhas) Gogate (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vitthal (Suhas) Gogate updated HDFS-7885:
-
Assignee: Suresh Srinivas

 Datanode should not trust the generation stamp provided by client
 -

 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Assignee: Suresh Srinivas
Priority: Critical

 Datanode should not trust the generation stamp provided by client, since it 
 is prefetched and buffered in client, and concurrent append may increase it.





[jira] [Commented] (HDFS-1522) Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one constant

2015-03-04 Thread Dongming Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347527#comment-14347527
 ] 

Dongming Liang commented on HDFS-1522:
--

Hi Konstantin, Thank you for reviewing this fix. I will update as suggested and 
then submit a new patch.

 Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one 
 constant
 -

 Key: HDFS-1522
 URL: https://issues.apache.org/jira/browse/HDFS-1522
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 0.21.0
Reporter: Konstantin Shvachko
Assignee: Dongming Liang
 Attachments: HDFS-1522.patch


 Two semantically identical constants {{Block.BLOCK_FILE_PREFIX}} and 
 {{DataStorage.BLOCK_FILE_PREFIX}} should be merged into one. Should be defined 
 in {{Block}}, imo.
 Also use cases of "blk_", like in {{DirectoryScanner}}, should be replaced by 
 this constant.





[jira] [Commented] (HDFS-7434) DatanodeID hashCode should not be mutable

2015-03-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347546#comment-14347546
 ] 

Hadoop QA commented on HDFS-7434:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12702505/HDFS-7434.patch
  against trunk revision 3560180.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9733//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9733//console

This message is automatically generated.

 DatanodeID hashCode should not be mutable
 -

 Key: HDFS-7434
 URL: https://issues.apache.org/jira/browse/HDFS-7434
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: HDFS-7434.patch


 Mutable hash codes may lead to orphaned instances in a collection.  Instances 
 must always be removed prior to modification of hash code values, and 
 re-inserted.  Although current code appears to do this, the mutable hash code 
 is a landmine.





[jira] [Updated] (HDFS-1522) Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one constant

2015-03-04 Thread Dongming Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongming Liang updated HDFS-1522:
-
Attachment: HDFS-1522.002.patch

Updated to adjust column size and fix imports per review feedback. 

 Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one 
 constant
 -

 Key: HDFS-1522
 URL: https://issues.apache.org/jira/browse/HDFS-1522
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 0.21.0
Reporter: Konstantin Shvachko
Assignee: Dongming Liang
 Attachments: HDFS-1522.002.patch, HDFS-1522.patch


 Two semantically identical constants {{Block.BLOCK_FILE_PREFIX}} and 
 {{DataStorage.BLOCK_FILE_PREFIX}} should be merged into one. Should be defined 
 in {{Block}}, imo.
 Also use cases of "blk_", like in {{DirectoryScanner}}, should be replaced by 
 this constant.





[jira] [Commented] (HDFS-7535) Utilize Snapshot diff report for distcp

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347349#comment-14347349
 ] 

Hudson commented on HDFS-7535:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7256 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7256/])
HDFS-7535. Utilize Snapshot diff report for distcp. Contributed by Jing Zhao. 
(jing9: rev ed70fa142cabdbc1065e4dbbc95e99c8850c4751)
* 
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpSync.java
* 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
* 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptionSwitch.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DiffInfo.java
* hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java
* 
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestOptionsParser.java
* 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/CopyListing.java
* 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java
* 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpConstants.java
* 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java


 Utilize Snapshot diff report for distcp
 ---

 Key: HDFS-7535
 URL: https://issues.apache.org/jira/browse/HDFS-7535
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: distcp, snapshots
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: 2.7.0

 Attachments: HDFS-7535.000.patch, HDFS-7535.001.patch, 
 HDFS-7535.002.patch, HDFS-7535.003.patch, HDFS-7535.004.patch


 Currently HDFS snapshot diff report can identify file/directory creation, 
 deletion, rename and modification under a snapshottable directory. We can use 
 the diff report for distcp between the primary cluster and a backup cluster 
 to avoid unnecessary data copy. This is especially useful when there is a big 
 directory rename happening in the primary cluster: the current distcp cannot 
 detect the rename op, so a rename usually leads to a large amount of real 
 data copy.
 More details of the approach will come in the first comment.





[jira] [Commented] (HDFS-4929) [NNBench mark] Lease mismatch error when running with multiple mappers

2015-03-04 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347635#comment-14347635
 ] 

Brahma Reddy Battula commented on HDFS-4929:


Kindly review the patch !!!

 [NNBench mark] Lease mismatch error when running with multiple mappers
 --

 Key: HDFS-4929
 URL: https://issues.apache.org/jira/browse/HDFS-4929
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: benchmarks
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Critical
 Attachments: HDFS4929.patch


 Command :
 ./yarn jar 
 ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.1-tests.jar 
 nnbench -operation create_write -numberOfFiles 1000 -blockSize 268435456 
 -bytesToWrite 102400 -baseDir /benchmarks/NNBench`hostname -s` 
 -replicationFactorPerFile 3 -maps 100 -reduces 10
 Trace :
 2013-06-21 10:44:53,763 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
 7 on 9005, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
 192.168.105.214:36320: error: 
 org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch 
 on /benchmarks/NNBenchlinux-185/data/file_linux-214__0 owned by 
 DFSClient_attempt_1371782327901_0001_m_48_0_1383437860_1 but is accessed 
 by DFSClient_attempt_1371782327901_0001_m_84_0_1880545303_1
 org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch 
 on /benchmarks/NNBenchlinux-185/data/file_linux-214__0 owned by 
 DFSClient_attempt_1371782327901_0001_m_48_0_1383437860_1 but is accessed 
 by DFSClient_attempt_1371782327901_0001_m_84_0_1880545303_1
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2351)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2098)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2019)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:213)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:52012)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:435)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:925)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1710)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1706)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
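The trace shows two task attempts' DFSClients competing for the lease on the same output path. One common remedy is to make per-mapper file names unique by task attempt ID rather than hostname alone; a hedged sketch (the conf key and variables are illustrative):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Hostname alone collides when several mappers run on one host; the task
// attempt ID is unique per mapper, so lease owners never clash.
static Path uniqueFileName(Configuration conf, Path baseDir,
    String hostName, int fileIndex) {
  String attemptId = conf.get("mapreduce.task.attempt.id");  // assumed key
  return new Path(baseDir,
      "data/file_" + hostName + "_" + attemptId + "_" + fileIndex);
}
{code}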





[jira] [Commented] (HDFS-4929) [NNBench mark] Lease mismatch error when running with multiple mappers

2015-03-04 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347632#comment-14347632
 ] 

Brahma Reddy Battula commented on HDFS-4929:


Any thoughts on this issue..? Thanks..

 [NNBench mark] Lease mismatch error when running with multiple mappers
 --

 Key: HDFS-4929
 URL: https://issues.apache.org/jira/browse/HDFS-4929
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: benchmarks
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Critical
 Attachments: HDFS4929.patch


 Command :
 ./yarn jar 
 ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.1-tests.jar 
 nnbench -operation create_write -numberOfFiles 1000 -blockSize 268435456 
 -bytesToWrite 102400 -baseDir /benchmarks/NNBench`hostname -s` 
 -replicationFactorPerFile 3 -maps 100 -reduces 10
 Trace :
 2013-06-21 10:44:53,763 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
 7 on 9005, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
 192.168.105.214:36320: error: 
 org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch 
 on /benchmarks/NNBenchlinux-185/data/file_linux-214__0 owned by 
 DFSClient_attempt_1371782327901_0001_m_48_0_1383437860_1 but is accessed 
 by DFSClient_attempt_1371782327901_0001_m_84_0_1880545303_1
 org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: Lease mismatch 
 on /benchmarks/NNBenchlinux-185/data/file_linux-214__0 owned by 
 DFSClient_attempt_1371782327901_0001_m_48_0_1383437860_1 but is accessed 
 by DFSClient_attempt_1371782327901_0001_m_84_0_1880545303_1
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2351)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2098)
   at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2019)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501)
   at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:213)
   at 
 org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:52012)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:435)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:925)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1710)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1706)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream

2015-03-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347646#comment-14347646
 ] 

Zhe Zhang commented on HDFS-7855:
-

Thanks Bo for the patch and Jing for the review. From the Jenkins console it seems 
{{TestFileLengthOnClusterRestart}} was killed. Bo, maybe you want to test it 
locally?

The Javadoc warning seems to be caused by a blank {{@return}} statement.

 Separate class Packet from DFSOutputStream
 --

 Key: HDFS-7855
 URL: https://issues.apache.org/jira/browse/HDFS-7855
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, 
 HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch


 Class Packet is an inner class in DFSOutputStream and also used by 
 DataStreamer. This sub task separates Packet out of DFSOutputStream to aid 
 the separation in HDFS-7854.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HDFS-7878:
--

Assignee: Sergey Shelukhin

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 INode ID for the file should be easy to expose; alternatively ID could be 
 derived from block IDs, to account for appends...
 This is useful e.g. for cache key by file, to make sure cache stays correct 
 when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347637#comment-14347637
 ] 

Jing Zhao commented on HDFS-7855:
-

Thanks for working on this, Bo! The patch looks good to me in general. Some 
minor comments:
# The new DFSPacket class does not need to be public
# In {{DFSPacket#writeTo}}, {{assert checksumPos == dataStart;}} should be
  unnecessary since it's always true. We can use this chance to delete it.
# The getter methods for final fields (e.g., {{isHeartbeatPacket}} and 
{{getSeqno}}) do not need to acquire the object's monitor.
# Looks like we can also convert {{lastPacketInBlock}} to be final (see the 
sketch after this list), since its modification pattern is always like:
{code}
currentPacket = createPacket(0, 0, bytesCurBlock, currentSeqno++);
currentPacket.lastPacketInBlock = true;
{code}
# We have a javadoc warning reported by Jenkins.
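
A minimal sketch of the change suggested in point 4 (field and parameter names 
here are assumptions; the real DFSPacket constructor takes more arguments):
{code}
// Sketch only: pass the flag at construction time so the field can be final.
class DFSPacket {
  private final boolean lastPacketInBlock;

  DFSPacket(long offsetInBlock, long seqno, boolean lastPacketInBlock) {
    this.lastPacketInBlock = lastPacketInBlock;
    // ... initialize the other fields as before ...
  }
}

// The call site then collapses to a single line, assuming a createPacket
// overload that forwards the extra argument:
// currentPacket = createPacket(0, 0, bytesCurBlock, currentSeqno++, true);
{code}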

 Separate class Packet from DFSOutputStream
 --

 Key: HDFS-7855
 URL: https://issues.apache.org/jira/browse/HDFS-7855
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, 
 HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch


 Class Packet is an inner class in DFSOutputStream and also used by 
 DataStreamer. This sub task separates Packet out of DFSOutputStream to aid 
 the separation in HDFS-7854.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HDFS-7878:
---
Attachment: HDFS-7878.patch

This patch exposes fileId via the normal FileStatus.
I can add a separate API instead, which would be a smaller change, but then it 
adds a separate API... please advise

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
 Attachments: HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 INode ID for the file should be easy to expose; alternatively ID could be 
 derived from block IDs, to account for appends...
 This is useful e.g. for cache key by file, to make sure cache stays correct 
 when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347653#comment-14347653
 ] 

Sergey Shelukhin commented on HDFS-7878:


[~jingzhao] can you please review?

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
 Attachments: HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 INode ID for the file should be easy to expose; alternatively ID could be 
 derived from block IDs, to account for appends...
 This is useful e.g. for cache key by file, to make sure cache stays correct 
 when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HDFS-7878:
---
Status: Patch Available  (was: Open)

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
 Attachments: HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 INode ID for the file should be easy to expose; alternatively ID could be 
 derived from block IDs, to account for appends...
 This is useful e.g. for cache key by file, to make sure cache stays correct 
 when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347654#comment-14347654
 ] 

Jing Zhao commented on HDFS-7878:
-

Instead of defining a new DFSFileStatus class, can we define a new 
{{getFileId}} API inside DistributedFileSystem and use HdfsFileStatus there?
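
A rough sketch of what that suggestion could look like (hedged; this is not the 
committed change, and the helper names are assumptions):
{code}
// Sketch: a new method inside the existing DistributedFileSystem class.
public long getFileId(Path p) throws IOException {
  // HdfsFileStatus already carries the inode id returned by the NameNode
  HdfsFileStatus status = dfs.getFileInfo(getPathName(p));
  if (status == null) {
    throw new FileNotFoundException("File does not exist: " + p);
  }
  return status.getFileId();
}
{code}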

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 INode ID for the file should be easy to expose; alternatively ID could be 
 derived from block IDs, to account for appends...
 This is useful e.g. for cache key by file, to make sure cache stays correct 
 when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7872) Erasure Coding: INodeFile.dumpTreeRecursively() supports to print striped blocks

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347662#comment-14347662
 ] 

Jing Zhao commented on HDFS-7872:
-

Thanks for working on this, [~tfukudom]!

Instead of checking and retrieving FileWithStripedBlocksFeature, how about 
directly calling {{getBlocks}} (which can handle both contiguous and striped 
blocks) and printing out the first element of the result?
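
In other words, something along these lines (a sketch, assuming {{getBlocks}} 
returns the unified block array):
{code}
// Sketch of the suggested change inside dumpTreeRecursively:
BlockInfo[] blks = getBlocks();  // covers both contiguous and striped blocks
out.print(", blocks=");
out.print(blks == null || blks.length == 0 ? null : blks[0]);
{code}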

 Erasure Coding: INodeFile.dumpTreeRecursively() supports to print striped 
 blocks
 

 Key: HDFS-7872
 URL: https://issues.apache.org/jira/browse/HDFS-7872
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Takuya Fukudome
Assignee: Takuya Fukudome
 Attachments: HDFS-7872.1.patch


 We need to let dumpTreeRecursively be able to print striped blocks (or maybe 
 just the first striped block).
 {code}
   @Override
   public void dumpTreeRecursively(PrintWriter out, StringBuilder prefix,
       final int snapshotId) {
     super.dumpTreeRecursively(out, prefix, snapshotId);
     out.print(", fileSize=" + computeFileSize(snapshotId));
     // only compare the first block
     out.print(", blocks=");
     out.print(blocks == null || blocks.length == 0 ? null : blocks[0]);
     // TODO print striped blocks
     out.println();
   }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5215) dfs.datanode.du.reserved is not taking effect as it's not considered while getting the available space

2015-03-04 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347690#comment-14347690
 ] 

Brahma Reddy Battula commented on HDFS-5215:


Any thoughts on this jira?

 dfs.datanode.du.reserved is not taking effect as it's not considered while 
 getting the available space
 --

 Key: HDFS-5215
 URL: https://issues.apache.org/jira/browse/HDFS-5215
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Attachments: HDFS-5215.patch


 {code}
 public long getAvailable() throws IOException {
   long remaining = getCapacity() - getDfsUsed();
   long available = usage.getAvailable();
   if (remaining > available) {
     remaining = available;
   }
   return (remaining > 0) ? remaining : 0;
 }
 {code}
 Here we are not considering the reserved space while getting the Available 
 Space.
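 A hedged sketch of one possible direction (assuming a {{reserved}} field 
 populated from {{dfs.datanode.du.reserved}}, and that {{getCapacity()}} does 
 not already account for it; not necessarily the attached patch):
 {code}
 public long getAvailable() throws IOException {
   // subtract the configured reservation from both estimates
   long remaining = getCapacity() - getDfsUsed() - reserved;
   long available = usage.getAvailable() - reserved;
   if (remaining > available) {
     remaining = available;
   }
   return (remaining > 0) ? remaining : 0;
 }
 {code}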



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5215) dfs.datanode.du.reserved is not taking effect as it's not considered while getting the available space

2015-03-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347693#comment-14347693
 ] 

Hadoop QA commented on HDFS-5215:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12610538/HDFS-5215.patch
  against trunk revision ed70fa1.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9736//console

This message is automatically generated.

 dfs.datanode.du.reserved is not taking effect as it's not considered while 
 getting the available space
 --

 Key: HDFS-5215
 URL: https://issues.apache.org/jira/browse/HDFS-5215
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Attachments: HDFS-5215.patch


 {code}
 public long getAvailable() throws IOException {
   long remaining = getCapacity() - getDfsUsed();
   long available = usage.getAvailable();
   if (remaining > available) {
     remaining = available;
   }
   return (remaining > 0) ? remaining : 0;
 }
 {code}
 Here we are not considering the reserved space while getting the Available 
 Space.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7864) Erasure Coding: Update safemode calculation for striped blocks

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347700#comment-14347700
 ] 

Jing Zhao commented on HDFS-7864:
-

bq. I think striped blocks in this Jira's description actually denotes to 
BlockGroup in HDFSErasureCodingDesign-20150206.pdf , right?

Yes, you're right. We should update the doc later for this.

bq. so a BlockGroup should be counted as 9 received blocks, right?

A block group (or a striped block) consists of 9 blocks in this case. But for 
safemode calculation, I guess we can still count it as 1 block. This is because 
for a single striped block (suppose it's m+k), we need different logic to 
declare it safe (i.e., to increase the safe count by 1): the NN must have 
received m blocks belonging to it from block reports (since we need m blocks to 
recover all the data/parity blocks).
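
A sketch of the logic being described (names are hypothetical, not the actual 
patch):
{code}
// Sketch: a striped block group becomes "safe" once m (= numDataBlocks) of
// its member blocks have been reported; a contiguous block once it reaches
// the configured safe replication. Either way it adds 1 to blockSafe, so a
// block group still counts as one safe block.
private void incrementSafeBlockCount(BlockInfo storedBlock, int reported) {
  final int threshold = storedBlock.isStriped()
      ? ((BlockInfoStriped) storedBlock).getDataBlockNum()  // m
      : safeReplication;
  if (reported == threshold) {
    this.blockSafe++;
  }
}
{code}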

 Erasure Coding: Update safemode calculation for striped blocks
 --

 Key: HDFS-7864
 URL: https://issues.apache.org/jira/browse/HDFS-7864
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: GAO Rui

 We need to update the safemode calculation for striped blocks. Specifically, 
 each striped block now consists of multiple data/parity blocks stored in 
 corresponding DataNodes. The current code's calculation is thus inconsistent: 
 each striped block is only counted as 1 expected block, while each of its 
 member block may increase the number of received blocks by 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7826) Erasure Coding: Update INodeFile quota computation for striped blocks

2015-03-04 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7826:

Status: Open  (was: Patch Available)

Thanks for the work, Kai! Currently Jenkins can only run a patch against 
trunk, so we do not need to submit patches for a feature branch.

I will review the patch later.

 Erasure Coding: Update INodeFile quota computation for striped blocks
 -

 Key: HDFS-7826
 URL: https://issues.apache.org/jira/browse/HDFS-7826
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Kai Sasaki
 Attachments: HDFS-7826.1.patch


 Currently INodeFile's quota computation only considers contiguous blocks 
 (i.e., {{INodeFile#blocks}}). We need to update it to support striped blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HDFS-7878:
---
Attachment: HDFS-7878.01.patch

Updated to just have the API...

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HDFS-7878.01.patch, HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 INode ID for the file should be easy to expose; alternatively ID could be 
 derived from block IDs, to account for appends...
 This is useful e.g. for cache key by file, to make sure cache stays correct 
 when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7434) DatanodeID hashCode should not be mutable

2015-03-04 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347762#comment-14347762
 ] 

Kihwal Lee commented on HDFS-7434:
--

+1 looks good to me.

 DatanodeID hashCode should not be mutable
 -

 Key: HDFS-7434
 URL: https://issues.apache.org/jira/browse/HDFS-7434
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: HDFS-7434.patch


 Mutable hash codes may lead to orphaned instances in a collection.  Instances 
 must always be removed prior to modification of hash code values, and 
 re-inserted.  Although current code appears to do this, the mutable hash code 
 is a landmine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7671) hdfs user guide should point to the common rack awareness doc

2015-03-04 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347780#comment-14347780
 ] 

Kai Sasaki commented on HDFS-7671:
--

[~aw] Thank you for the comments, but I have a question about the first one and 
I cannot figure out the second and third points. 

 Links should be to relative paths when referring to other documentation in 
 the tree.
The link I want to add belongs to the common docs, so the target is not 
included under the document root of the HDFS docs. Is there any good way to 
make a relative path to other Hadoop project docs?

 The actual documentation you want to link to is the html file generated by 
 the RackAwareness.md file.
Do you mean that the actual link should be to the .html file, or something else? 

 The rewrite shouldn't include any technical details since the rack awareness 
 doc covers all of that.
Should all other current contents except the direct link to the common doc be 
removed? Or should I move the current contents to the common docs?

 hdfs user guide should point to the common rack awareness doc
 -

 Key: HDFS-7671
 URL: https://issues.apache.org/jira/browse/HDFS-7671
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Allen Wittenauer
Assignee: Kai Sasaki
 Attachments: HDFS-7671.1.patch


 HDFS user guide has a section on rack awareness that should really just be a 
 pointer to the common doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7740) Test truncate with DataNodes restarting

2015-03-04 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347921#comment-14347921
 ] 

Konstantin Shvachko commented on HDFS-7740:
---

Yi, do you have any more context on this? Should we create a jira?

 Test truncate with DataNodes restarting
 ---

 Key: HDFS-7740
 URL: https://issues.apache.org/jira/browse/HDFS-7740
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Affects Versions: 2.7.0
Reporter: Konstantin Shvachko
Assignee: Yi Liu
 Fix For: 2.7.0

 Attachments: HDFS-7740.001.patch, HDFS-7740.002.patch, 
 HDFS-7740.003.patch


 Add a test case, which ensures replica consistency when DNs are failing and 
 restarting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347938#comment-14347938
 ] 

Hadoop QA commented on HDFS-7878:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12702631/HDFS-7878.01.patch
  against trunk revision c66c3ac.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestFileTruncate

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.TestDecommission
org.apache.hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9737//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9737//console

This message is automatically generated.

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HDFS-7878.01.patch, HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 INode ID for the file should be easy to expose; alternatively ID could be 
 derived from block IDs, to account for appends...
 This is useful e.g. for cache key by file, to make sure cache stays correct 
 when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-1522) Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one constant

2015-03-04 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-1522:
--
  Resolution: Fixed
   Fix Version/s: (was: 3.0.0)
  2.7.0
Target Version/s: 2.7.0  (was: 3.0.0)
  Status: Resolved  (was: Patch Available)

I just committed this. Congratulations Dongming.

 Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one 
 constant
 -

 Key: HDFS-1522
 URL: https://issues.apache.org/jira/browse/HDFS-1522
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Dongming Liang
  Labels: patch
 Fix For: 2.7.0

 Attachments: HDFS-1522.002.patch, HDFS-1522.patch


 Two semantically identical constants {{Block.BLOCK_FILE_PREFIX}} and 
 {{DataStorage.BLOCK_FILE_PREFIX}} should be merged into one. Should be defined 
 in {{Block}}, imo.
 Also use cases of blk_, like in {{DirectoryScanner}}, should be replaced by 
 this constant.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7746) Add a test randomly mixing append, truncate and snapshot

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348017#comment-14348017
 ] 

Hudson commented on HDFS-7746:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7262 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7262/])
HDFS-7746. Add a test randomly mixing append, truncate and snapshot operations. 
(szetszwo: rev ded0200e9c98dea960db756bb208ff475d710e28)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestAppendSnapshotTruncate.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Add a test randomly mixing append, truncate and snapshot
 

 Key: HDFS-7746
 URL: https://issues.apache.org/jira/browse/HDFS-7746
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7746_20150305.patch


 TestFileTruncate.testSnapshotWithAppendTruncate already does a good job for 
 covering many test cases.  Let's add a random test for mixing many append, 
 truncate and snapshot operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348016#comment-14348016
 ] 

Jing Zhao commented on HDFS-7855:
-

bq. One question is that lastPacketInBlock is set to false when creating a 
DFSPacket object. And it is modified to true when DFSOutputStream wants it to 
be the last packet

For the last packet, it looks like we always create a new packet and 
immediately set lastPacketInBlock to true? If this is the case, we only need to 
modify DFSPacket's constructor and set lastPacketInBlock's value there. In this 
way we can convert it to final.

 Separate class Packet from DFSOutputStream
 --

 Key: HDFS-7855
 URL: https://issues.apache.org/jira/browse/HDFS-7855
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, 
 HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch


 Class Packet is an inner class in DFSOutputStream and also used by 
 DataStreamer. This sub task separates Packet out of DFSOutputStream to aid 
 the separation in HDFS-7854.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7434) DatanodeID hashCode should not be mutable

2015-03-04 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7434:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed this to branch-2 and trunk. Thanks for fixing this, Daryn.

 DatanodeID hashCode should not be mutable
 -

 Key: HDFS-7434
 URL: https://issues.apache.org/jira/browse/HDFS-7434
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Fix For: 2.7.0

 Attachments: HDFS-7434.patch


 Mutable hash codes may lead to orphaned instances in a collection.  Instances 
 must always be removed prior to modification of hash code values, and 
 re-inserted.  Although current code appears to do this, the mutable hash code 
 is a landmine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7826) Erasure Coding: Update INodeFile quota computation for striped blocks

2015-03-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347812#comment-14347812
 ] 

Zhe Zhang commented on HDFS-7826:
-

Oops, I didn't realize this JIRA handles {{computeFileSize}}. I was about to 
suggest the following code under HDFS-7749. 
{code}
  public final long computeFileSize(boolean includesLastUcBlock,
      boolean usePreferredBlockSize4LastUcBlock) {
    BlockInfo[] blocksInfile = isStriped() ?
        getStripedBlocksFeature().getBlocks() : blocks;
    if (blocksInfile == null || blocksInfile.length == 0) {
      return 0;
    }
    final int last = blocksInfile.length - 1;
    // check if the last block is BlockInfoUnderConstruction
    long size = blocksInfile[last].getNumBytes();
    if (blocksInfile[last] instanceof BlockInfoContiguousUnderConstruction ||
        blocksInfile[last] instanceof BlockInfoStripedUnderConstruction) {
      if (!includesLastUcBlock) {
        size = 0;
      } else if (usePreferredBlockSize4LastUcBlock) {
        size = getPreferredBlockSize();
      }
    }
    // sum other blocks
    for (int i = 0; i < last; i++) {
      size += blocksInfile[i].getNumBytes();
    }
    return size;
  }
{code}

 Erasure Coding: Update INodeFile quota computation for striped blocks
 -

 Key: HDFS-7826
 URL: https://issues.apache.org/jira/browse/HDFS-7826
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Kai Sasaki
 Attachments: HDFS-7826.1.patch


 Currently INodeFile's quota computation only considers contiguous blocks 
 (i.e., {{INodeFile#blocks}}). We need to update it to support striped blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files

2015-03-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347836#comment-14347836
 ] 

Zhe Zhang commented on HDFS-7853:
-

The extended {{DFSStripedInputStream}} uses {{LocatedStripedBlock}} like the 
following:
{code}
  private LocatedBlock[] parseStripedBlockGroup(LocatedBlock bg) {
    LocatedBlock[] lbs = new LocatedBlock[HdfsConstants.NUM_DATA_BLOCKS];
    for (short i = 0; i < HdfsConstants.NUM_DATA_BLOCKS; i++) {
      ExtendedBlock blk = new ExtendedBlock(bg.getBlock());
      short j = bg instanceof LocatedStripedBlock ?
          ((LocatedStripedBlock) bg).getBlockIndicies()[i] : i;
      blk.setBlockId(bg.getBlock().getBlockId() + i);

      lbs[j] = new LocatedBlock(blk, new DatanodeInfo[]{bg.getLocations()[i]},
          new String[]{bg.getStorageIDs()[i]},
          new StorageType[]{bg.getStorageTypes()[i]},
          bg.getStartOffset() + i * cellSize, bg.isCorrupt(), null);
    }
    return lbs;
  }
{code}

My 
[test|https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14347808page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14347808]
 writes a file with 2 blocks and reads it back. I found {{blockIndices}} is 
always sequential ({{getBlockIndicies()\[i\]}} is always equal to i). But the 
stored locations are not necessarily sorted based on indices in the group, so 
sometimes the test passes and sometimes it gets a ReplicaNotFound error.

My test manually applied HDFS-7729 though. Without it, I guess we should add 
synthetic block reports to unit-test the above point?

 Erasure coding: extend LocatedBlocks to support reading from striped files
 --

 Key: HDFS-7853
 URL: https://issues.apache.org/jira/browse/HDFS-7853
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Jing Zhao
 Attachments: HDFS-7853.000.patch


 We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work 
 with striping layout (possibly an extra list specifying the index of each 
 location in the group)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7844) Create an off-heap hash table implementation

2015-03-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347903#comment-14347903
 ] 

Colin Patrick McCabe commented on HDFS-7844:


{{CommonConfigurationKeys.java}}: add {{hadoop.memory.manager}} configuration 
key, which controls whether to use on-heap or off-heap (basically)

{{MemoryManager.java}}: interface which all memory managers implement.  Factory 
methods for creating one from a Hadoop {{Configuration}}.

{{ByteArrayMemoryManager.java}}: a safe on-heap memory manager.  Does 
extensive verification of memory accesses to ensure that they're valid.

{{NativeMemoryManager.java}}: off-heap memory manager which delegates to 
{{Unsafe}}, which basically calls {{malloc}}.

{{ProbingHashSet.java}}: an off-heap hash table implementation.  It uses 
probing rather than separate chaining (as suggested by the name), and doubles 
in size when it becomes more than half full, to maintain O(1) access.

{{ProbingHashSet#Adaptor}}: a class which deals with storing and loading 
entries from the hash table.  In the case of the block map, this just means 
putting a single 8-byte long (the address of the off-heap BlocksInfo data) into 
the correct place in the hash table.

{{ProbingHashSet#Key}}: these are keys that can be compared and so forth.  Used 
to search for an element, and when we have an element, these are used to 
determine what its hash code is and what else it is identical to.

There is also an iterator provided which can iterate over the whole hash 
table... very similar to HashTable#Iterator.  One twist is that it can still be 
used if the table is modified after the iterator is created.  It will return 
reduced-consistency results in that case, but still be useful for many cases.
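
A simplified sketch of the probing-and-doubling strategy described above (the 
real ProbingHashSet operates on off-heap memory via the MemoryManager; the names 
and the hash function here are illustrative):
{code}
// Linear probing over a power-of-two table; 0 marks an empty slot
// (off-heap entry addresses are never 0).
public class ProbingSketch {
  private long[] slots = new long[16];
  private int size = 0;

  void add(long entryAddr) {
    if ((size + 1) * 2 > slots.length) {
      grow();  // keep the table less than half full for O(1) probes
    }
    int i = hash(entryAddr) & (slots.length - 1);
    while (slots[i] != 0) {            // probe to the next free slot
      i = (i + 1) & (slots.length - 1);
    }
    slots[i] = entryAddr;
    size++;
  }

  private void grow() {
    long[] old = slots;
    slots = new long[old.length * 2];  // double in size when half full
    size = 0;
    for (long e : old) {
      if (e != 0) {
        add(e);                        // rehash every live entry
      }
    }
  }

  private static int hash(long addr) {
    return (int) (addr ^ (addr >>> 32));
  }
}
{code}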

 Create an off-heap hash table implementation
 

 Key: HDFS-7844
 URL: https://issues.apache.org/jira/browse/HDFS-7844
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7836
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7844-scl.001.patch


 Create an off-heap hash table implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7746) Add a test randomly mixing append, truncate and snapshot

2015-03-04 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-7746:
--
   Resolution: Fixed
Fix Version/s: 2.7.0
   Status: Resolved  (was: Patch Available)

Thanks Jing for reviewing the patch.

I have committed this.

 Add a test randomly mixing append, truncate and snapshot
 

 Key: HDFS-7746
 URL: https://issues.apache.org/jira/browse/HDFS-7746
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Tsz Wo Nicholas Sze
Assignee: Tsz Wo Nicholas Sze
Priority: Minor
 Fix For: 2.7.0

 Attachments: h7746_20150305.patch


 TestFileTruncate.testSnapshotWithAppendTruncate already does a good job for 
 covering many test cases.  Let's add a random test for mixing many append, 
 truncate and snapshot operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements

2015-03-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347999#comment-14347999
 ] 

Colin Patrick McCabe commented on HDFS-7836:


I think it makes sense to have the total heap size in a JMX counter, if it's not 
already there somewhere.  It's a pretty easy number to get from the OS, 
although there are confounding factors like shared libraries and shared memory 
segments.  But in general, those should be minor contributors.
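
For the JVM-side numbers, the standard JMX beans already get close (the 
OS-level resident size mentioned above would need /proc or similar; this 
snippet only shows what the JVM itself reports):
{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapProbe {
  public static void main(String[] args) {
    // Heap and off-heap usage as seen by the JVM; this excludes the
    // shared-library and shared-memory contributions mentioned above.
    MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
    System.out.println("heapUsed=" + mem.getHeapMemoryUsage().getUsed());
    System.out.println("nonHeapUsed=" + mem.getNonHeapMemoryUsage().getUsed());
  }
}
{code}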

 BlockManager Scalability Improvements
 -

 Key: HDFS-7836
 URL: https://issues.apache.org/jira/browse/HDFS-7836
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Charles Lamb
Assignee: Charles Lamb
 Attachments: BlockManagerScalabilityImprovementsDesign.pdf


 Improvements to BlockManager scalability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream

2015-03-04 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348012#comment-14348012
 ] 

Li Bo commented on HDFS-7855:
-

hi, Zhe. I ran {{TestFileLengthOnClusterRestart}} locally and it worked well, 
taking 17s. Maybe the server was busy when running the unit tests. I will 
upload a new patch and try again.

 Separate class Packet from DFSOutputStream
 --

 Key: HDFS-7855
 URL: https://issues.apache.org/jira/browse/HDFS-7855
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, 
 HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch


 Class Packet is an inner class in DFSOutputStream and also used by 
 DataStreamer. This sub task separates Packet out of DFSOutputStream to aid 
 the separation in HDFS-7854.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread Vinod Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347787#comment-14347787
 ] 

Vinod Nair commented on HDFS-7885:
--

Suresh, 

Wanted to give you some background and context here.

As you may know, we are working with Pivotal and they are going to white-label 
HDP as PHD starting with PHD 3.0.

They have raised the issue of 3 patches that were in their Hadoop distro 
that are critical for HAWQ to work.  We have asked them to create Apache 
JIRAs so our experts can evaluate them and consider them for inclusion in HDP.

Hopefully they will add some more detail soon.

 Datanode should not trust the generation stamp provided by client
 -

 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Priority: Critical

 Datanode should not trust the generation stamp provided by client, since it 
 is prefetched and buffered in client, and concurrent append may increase it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7729) Add logic to DFSOutputStream to support writing a file in striping layout

2015-03-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347819#comment-14347819
 ] 

Zhe Zhang commented on HDFS-7729:
-

In my tests I set {{cellSize}} to a smaller value (1024) and got an 
IndexOutOfBounds error in {{encode}}. [~libo-intel] could you take a look when 
you resume work on this JIRA (after HDFS-7793)?

 Add logic to DFSOutputStream to support writing a file in striping layout 
 --

 Key: HDFS-7729
 URL: https://issues.apache.org/jira/browse/HDFS-7729
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: Codec-tmp.patch, HDFS-7729-001.patch, 
 HDFS-7729-002.patch, HDFS-7729-003.patch, HDFS-7729-004.patch, 
 HDFS-7729-005.patch, HDFS-7729-006.patch, HDFS-7729-007.patch, 
 HDFS-7729-008.patch, HDFS-7729-009.patch


 If a client wants to directly write a file in striping layout, we need to add 
 some logic to DFSOutputStream.  DFSOutputStream needs multiple DataStreamers 
 to write each cell of a stripe to a remote datanode. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7782) Read a striping layout file from client side

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347831#comment-14347831
 ] 

Jing Zhao commented on HDFS-7782:
-

Thanks for the work, Zhe! I will review the patch soon.

 Read a striping layout file from client side
 

 Key: HDFS-7782
 URL: https://issues.apache.org/jira/browse/HDFS-7782
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Zhe Zhang
 Attachments: HDFS-7782-000.patch


 If a client wants to read a file, it does not need to know or handle what 
 layout the file has. This sub task adds logic to DFSInputStream to support 
 reading striping layout files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347850#comment-14347850
 ] 

Hadoop QA commented on HDFS-7878:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12702620/HDFS-7878.patch
  against trunk revision ed70fa1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.web.TestWebHDFS

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9735//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9735//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9735//console

This message is automatically generated.

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HDFS-7878.01.patch, HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 INode ID for the file should be easy to expose; alternatively ID could be 
 derived from block IDs, to account for appends...
 This is useful e.g. for cache key by file, to make sure cache stays correct 
 when file is overwritten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files

2015-03-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347876#comment-14347876
 ] 

Zhe Zhang commented on HDFS-7853:
-

bq. I guess here the code should be ... + j); ?
Good catch! It doesn't fix the test error though, since j is always equal to i 
right now.

 Erasure coding: extend LocatedBlocks to support reading from striped files
 --

 Key: HDFS-7853
 URL: https://issues.apache.org/jira/browse/HDFS-7853
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Jing Zhao
 Attachments: HDFS-7853.000.patch


 We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work 
 with striping layout (possibly an extra list specifying the index of each 
 location in the group)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1522) Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one constant

2015-03-04 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347922#comment-14347922
 ] 

Konstantin Shvachko commented on HDFS-1522:
---

+1 on the patch.
Test failure is unrelated. Will commit.

 Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one 
 constant
 -

 Key: HDFS-1522
 URL: https://issues.apache.org/jira/browse/HDFS-1522
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Dongming Liang
  Labels: patch
 Fix For: 3.0.0

 Attachments: HDFS-1522.002.patch, HDFS-1522.patch


 Two semantically identical constants {{Block.BLOCK_FILE_PREFIX}} and 
 {{DataStorage.BLOCK_FILE_PREFIX}} should be merged into one. Should be defined 
 in {{Block}}, imo.
 Also use cases of blk_, like in {{DirectoryScanner}}, should be replaced by 
 this constant.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7826) Erasure Coding: Update INodeFile quota computation for striped blocks

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347985#comment-14347985
 ] 

Jing Zhao commented on HDFS-7826:
-

Thanks Kai! 

bq. Current `getBlocks` can handle striped blocks. So its length is `m` + `k` 
(m data blocks and k parity blocks)

{{getBlocks}} returns a striped block array. Each element in that array is a 
striped block consisting of m data blocks and k parity blocks. The length of 
{{getBlocks}}'s result is not related to m and k.
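
So a quota-computation sketch would iterate over block groups, not individual 
member blocks (hypothetical names; {{spaceConsumed}} is assumed to account for 
the m data plus k parity cells internally):
{code}
// Sketch: each element of getBlocks() is one block group.
long storagespace = 0;
for (BlockInfo blk : getBlocks()) {
  if (blk.isStriped()) {
    // a striped entry accounts internally for its m + k member blocks
    storagespace += ((BlockInfoStriped) blk).spaceConsumed();
  } else {
    storagespace += blk.getNumBytes() * getBlockReplication();
  }
}
{code}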

 Erasure Coding: Update INodeFile quota computation for striped blocks
 -

 Key: HDFS-7826
 URL: https://issues.apache.org/jira/browse/HDFS-7826
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Kai Sasaki
 Attachments: HDFS-7826.1.patch


 Currently INodeFile's quota computation only considers contiguous blocks 
 (i.e., {{INodeFile#blocks}}). We need to update it to support striped blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7887) Asynchronous native RPC v9 client

2015-03-04 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348008#comment-14348008
 ] 

Haohui Mai commented on HDFS-7887:
--

Given the popularity of integrating Hadoop with other native applications, 
there have been several attempts to implement a native RPC library. The most 
recent attempts are HADOOP-10389 and HDFS-7013. This jira proposes to combine 
the previous efforts and create a unified library. The existing implementations 
in HADOOP-10389 and HDFS-7013 have several drawbacks:

* The implementation is tightly coupled with the native HDFS client, making it 
unavailable for YARN.
* Both HADOOP-10389 and HDFS-7013 only provide synchronous APIs, so they cannot 
serve as building blocks for an asynchronous higher-level client.
* HDFS-7013 uses C++ exceptions extensively throughout the code. The community 
has expressed concerns about C++ exceptions and would like to get them removed 
before the code is merged.

This jira proposes to implement the following components:

* A compiler that generates stubs from protobuf definitions, which can be taken 
from HADOOP-10389.
* An asynchronous runtime for the Hadoop RPC.
* Support for SASL, wire encryption, and Kerberos authentication, which can be 
taken from HDFS-7013.
* Unit tests, many of which can be taken from HADOOP-10389 and HDFS-7013.



 Asynchronous native RPC v9 client
 -

 Key: HDFS-7887
 URL: https://issues.apache.org/jira/browse/HDFS-7887
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Haohui Mai

 There are more and more integrations happening between Hadoop and applications 
 that are implemented in languages other than Java.
 To access Hadoop, applications either have to go through JNI (e.g. libhdfs), 
 or to reverse engineer the Hadoop RPC protocol (e.g. snakebite). 
 Unfortunately, neither of them is satisfactory:
 * Integrating with JNI requires running a JVM inside the application. Some 
 applications (e.g., real-time processing, MPP databases) do not want the 
 footprint and GC behavior of the JVM.
 * The Hadoop RPC protocol has a rich feature set including wire encryption, 
 SASL, and Kerberos authentication. Many 3rd-party implementations cannot fully 
 cover the feature set, thus they might only work in limited environments.
 This jira is to propose implementing a Hadoop RPC library in C++ that 
 provides a common ground to implement higher-level native clients for HDFS, 
 YARN, and MapReduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348024#comment-14348024
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7885:
---

I see.  The problem is only in legacy.blockreader.local, which uses the 
generation stamp passed by the client.  I have just checked the remote block 
reader; it does not have such a bug.

 Datanode should not trust the generation stamp provided by client
 -

 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Assignee: Tsz Wo Nicholas Sze
Priority: Critical

 Datanode should not trust the generation stamp provided by client, since it 
 is prefetched and buffered in client, and concurrent append may increase it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7855) Separate class Packet from DFSOutputStream

2015-03-04 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348026#comment-14348026
 ] 

Li Bo commented on HDFS-7855:
-

Right, if we modify DFSPacket's constructor we can make {{lastPacketInBlock}} 
final. I will change the constructor in the new patch.

 Separate class Packet from DFSOutputStream
 --

 Key: HDFS-7855
 URL: https://issues.apache.org/jira/browse/HDFS-7855
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, 
 HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch


 Class Packet is an inner class in DFSOutputStream and also used by 
 DataStreamer. This sub task separates Packet out of DFSOutputStream to aid 
 the separation in HDFS-7854.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7433) DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize lookup performance

2015-03-04 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347784#comment-14347784
 ] 

Kihwal Lee commented on HDFS-7433:
--

[~daryn] Do you think HDFS-7434 was the cause of the test failure?

 DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize 
 lookup performance
 --

 Key: HDFS-7433
 URL: https://issues.apache.org/jira/browse/HDFS-7433
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch


 The datanode map is currently a {{TreeMap}}.  For many thousands of 
 datanodes, tree lookups are ~10X more expensive than a {{HashMap}}.  
 Insertions and removals are up to 100X more expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7671) hdfs user guide should point to the common rack awareness doc

2015-03-04 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347789#comment-14347789
 ] 

Allen Wittenauer commented on HDFS-7671:


bq. The link where I want to attach belongs to common docs. So the target is 
not included under document root of HDFS docs. Is there any good way to make 
relative path to other hadoop project docs?  

bq.  Do you mean that actual link should be to the .html file or the other? 

You want something like:

{code}
 [Rack Awareness](../../hadoop-project-dist/hadoop-common/RackAwareness.html)
{code}

This will point to the rack awareness' html file after the mvn build.

bq. Should all current other contents except for the direct link to common doc 
be removed? Or should I move current contents to common docs?

I think the current content (well, the theory and practice, not word for word, 
obviously) is already covered in that doc.  So this section just needs some 
wordsmithing on why someone should go follow that link to that important topic. 
:)

 hdfs user guide should point to the common rack awareness doc
 -

 Key: HDFS-7671
 URL: https://issues.apache.org/jira/browse/HDFS-7671
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Allen Wittenauer
Assignee: Kai Sasaki
 Attachments: HDFS-7671.1.patch


 HDFS user guide has a section on rack awareness that should really just be a 
 pointer to the common doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7826) Erasure Coding: Update INodeFile quota computation for striped blocks

2015-03-04 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347970#comment-14347970
 ] 

Kai Sasaki commented on HDFS-7826:
--

Thank you for reviewing.
I understand point 1. I'll update.

I have a question about 2. The current `getBlocks` can handle striped blocks, 
so its length is `m` + `k` (m data blocks and k parity blocks), right?
I think I can simply calculate the storage space usage by adding up all the 
block sizes in the block list returned from `getBlocks`, because it includes 
data blocks and parity blocks. Is there any oversight or misunderstanding?

 Erasure Coding: Update INodeFile quota computation for striped blocks
 -

 Key: HDFS-7826
 URL: https://issues.apache.org/jira/browse/HDFS-7826
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Kai Sasaki
 Attachments: HDFS-7826.1.patch


 Currently INodeFile's quota computation only considers contiguous blocks 
 (i.e., {{INodeFile#blocks}}). We need to update it to support striped blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7853) Erasure coding: extend LocatedBlocks to support reading from striped files

2015-03-04 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347868#comment-14347868
 ] 

Jing Zhao commented on HDFS-7853:
-

Thanks for the review, Zhe!

bq. Just a reminder that BlockInfoStripedUnderConstruction#blockIndices is only 
applicable in over-replication as well.

This may not be true. For a newly created BlockInfoStripedUC, {{blockIndices}} 
is not useful since we simply assign blocks (in the group) to storages 
according to the target storage sequence. However, if the namenode restarts 
before the block is completed, the replicas array is later rebuilt based on 
the incoming block reports, and we need to record the corresponding block 
index somewhere (or keep some null elements in the ReplicaUC array).

bq. If the non-over-replicated locations in LocatedBlockStriped are sorted

Here a sorted array may not be enough, e.g., if we miss several blocks in the 
middle. Then we may still need an extra index array in the message.

bq. blk.setBlockId(bg.getBlock().getBlockId() + i);

I guess here the code should be ... + j); ?

 Erasure coding: extend LocatedBlocks to support reading from striped files
 --

 Key: HDFS-7853
 URL: https://issues.apache.org/jira/browse/HDFS-7853
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Jing Zhao
 Attachments: HDFS-7853.000.patch


 We should extend {{LocatedBlocks}} class so {{getBlockLocations}} can work 
 with striping layout (possibly an extra list specifying the index of each 
 location in the group)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HDFS-7844) Create an off-heap hash table implementation

2015-03-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-7844 started by Colin Patrick McCabe.
--
 Create an off-heap hash table implementation
 

 Key: HDFS-7844
 URL: https://issues.apache.org/jira/browse/HDFS-7844
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7836
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7844-scl.002.patch


 Create an off-heap hash table implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7878) API - expose an unique file identifier

2015-03-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347889#comment-14347889
 ] 

Sergey Shelukhin commented on HDFS-7878:


This QA run was for the old patch...

 API - expose an unique file identifier
 --

 Key: HDFS-7878
 URL: https://issues.apache.org/jira/browse/HDFS-7878
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HDFS-7878.01.patch, HDFS-7878.patch


 See HDFS-487.
 Even though that is resolved as a duplicate, the ID is actually not exposed by 
 the JIRA it supposedly duplicates.
 The INode ID for the file should be easy to expose; alternatively, the ID 
 could be derived from block IDs, to account for appends...
 This is useful, e.g., as a per-file cache key, to make sure the cache stays 
 correct when a file is overwritten.
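
As a rough illustration of the cache use case (purely hypothetical client 
code; {{getFileId()}} stands in for whatever accessor this JIRA ends up 
exposing):

{code}
// Hypothetical client-side cache keyed by file ID rather than by path.
// A path that is overwritten yields a new inode and hence a new ID, so
// entries cached for the old file can never be confused with the new one.
import java.util.HashMap;
import java.util.Map;

class FileIdCache {
  private final Map<Long, byte[]> cache = new HashMap<>();

  byte[] get(long fileId) {            // fileId from the assumed accessor
    return cache.get(fileId);
  }

  void put(long fileId, byte[] contents) {
    cache.put(fileId, contents);
  }
}
{code}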



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7844) Create an off-heap hash table implementation

2015-03-04 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7844:
---
Attachment: HDFS-7844-scl.001.patch

 Create an off-heap hash table implementation
 

 Key: HDFS-7844
 URL: https://issues.apache.org/jira/browse/HDFS-7844
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: HDFS-7836
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7844-scl.001.patch


 Create an off-heap hash table implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7887) Asynchronous native RPC v9 client

2015-03-04 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347992#comment-14347992
 ] 

Allen Wittenauer commented on HDFS-7887:


Is this really a good long-term strategy, given our use of protobuf, now that 
gRPC exists?

 Asynchronous native RPC v9 client
 -

 Key: HDFS-7887
 URL: https://issues.apache.org/jira/browse/HDFS-7887
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Haohui Mai

 There is more and more integration happening between Hadoop and applications 
 that are implemented in languages other than Java.
 To access Hadoop, applications either have to go through JNI (e.g. libhdfs) 
 or reverse engineer the Hadoop RPC protocol (e.g. snakebite). 
 Unfortunately, neither of them is satisfactory:
 * Integrating with JNI requires running a JVM inside the application. Some 
 applications (e.g., real-time processing, MPP databases) do not want the 
 footprint and GC behavior of the JVM.
 * The Hadoop RPC protocol has a rich feature set including wire encryption, 
 SASL, and Kerberos authentication. Few 3rd-party implementations can fully 
 cover the feature set, so they might only work in limited environments.
 This JIRA proposes implementing a Hadoop RPC library in C++ that provides 
 a common ground for implementing higher-level native clients for HDFS, 
 YARN, and MapReduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7729) Add logic to DFSOutputStream to support writing a file in striping layout

2015-03-04 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348066#comment-14348066
 ] 

Li Bo commented on HDFS-7729:
-

Hi Zhe, I will check this problem soon. There has been a lot of discussion on 
this JIRA about modifying {{DFSOutputStream}}; how about creating a new JIRA 
for subclassing {{DFSOutputStream}} based on HDFS-7793?

 Add logic to DFSOutputStream to support writing a file in striping layout 
 --

 Key: HDFS-7729
 URL: https://issues.apache.org/jira/browse/HDFS-7729
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: Codec-tmp.patch, HDFS-7729-001.patch, 
 HDFS-7729-002.patch, HDFS-7729-003.patch, HDFS-7729-004.patch, 
 HDFS-7729-005.patch, HDFS-7729-006.patch, HDFS-7729-007.patch, 
 HDFS-7729-008.patch, HDFS-7729-009.patch


 If a client wants to directly write a file in striping layout, we need to add 
 some logic to DFSOutputStream. DFSOutputStream needs multiple DataStreamers 
 to write each cell of a stripe to a remote datanode.
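
A toy sketch of the cell distribution described above (illustrative only; 
plain {{OutputStream}}s stand in for the real DataStreamers, and parity 
generation is omitted):

{code}
// Toy illustration of striping, not the patch: cells of cellSize bytes are
// handed to the streamers in round-robin order. The real DFSOutputStream
// logic must also generate parity cells and manage per-streamer packets.
import java.io.IOException;
import java.io.OutputStream;

class StripeWriter {
  static void writeStriped(byte[] buf, OutputStream[] streamers, int cellSize)
      throws IOException {
    int cell = 0;
    for (int off = 0; off < buf.length; off += cellSize) {
      int len = Math.min(cellSize, buf.length - off);
      streamers[cell % streamers.length].write(buf, off, len);
      cell++;
    }
  }
}
{code}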



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-03-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347808#comment-14347808
 ] 

Zhe Zhang commented on HDFS-7285:
-

To follow up on the PoC prototype plan, I created a very rough test by manually 
applying the following patches, and it seems to work -- based on the 
[description | 
https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14339006page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14339006]
 above :)
# HDFS-7729 (this one needs major refactor after HDFS-7793)
# HDFS-7853
# HDFS-7782

A few bugs have been found and I'll post them under individual JIRAs.

 Erasure Coding Support inside HDFS
 --

 Key: HDFS-7285
 URL: https://issues.apache.org/jira/browse/HDFS-7285
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Weihua Jiang
Assignee: Zhe Zhang
 Attachments: ECAnalyzer.py, ECParser.py, 
 HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, 
 HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, 
 fsimage-analysis-20150105.pdf


 Erasure Coding (EC) can greatly reduce the storage overhead without 
 sacrificing data reliability, compared to the existing HDFS 3-replica 
 approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate 
 the loss of 4 blocks, with a storage overhead of only 40%. This makes EC a 
 quite attractive alternative for big data storage, particularly for cold 
 data. 
 Facebook had a related open source project called HDFS-RAID. It used to be 
 one of the contrib packages in HDFS but has been removed since Hadoop 2.0 
 for maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and 
 depends on MapReduce to do encoding and decoding tasks; 2) it can only be 
 used for cold files that are not intended to be appended anymore; 3) the pure 
 Java EC coding implementation is extremely slow in practical use. Due to 
 these, it might not be a good idea to just bring HDFS-RAID back.
 We (Intel and Cloudera) are working on a design to build EC into HDFS that 
 gets rid of any external dependencies, making it self-contained and 
 independently maintained. This design layers the EC feature on the storage 
 type support and aims to be compatible with existing HDFS features like 
 caching, snapshots, encryption, and high availability. This design will also 
 support different EC coding schemes, implementations, and policies for 
 different deployment scenarios. By utilizing advanced libraries (e.g. the 
 Intel ISA-L library), an implementation can greatly improve the performance 
 of EC encoding/decoding and make the EC solution even more attractive. We 
 will post the design document soon. 
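
A quick sanity check of the overhead figures above (plain arithmetic, not 
project code):

{code}
// Storage overhead of a k+m erasure code is m/k; 3-way replication is 200%.
public class EcOverhead {
  public static void main(String[] args) {
    int k = 10, m = 4;                  // 10 data blocks + 4 parity blocks
    double ecOverhead = 100.0 * m / k;  // 40%: extra storage for RS(10,4)
    double repOverhead = 100.0 * 2;     // 200%: extra storage for 3 replicas
    System.out.println(ecOverhead + "% vs " + repOverhead + "%");
    // RS(10,4) also tolerates the loss of any 4 of the 14 blocks in a group.
  }
}
{code}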



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1522) Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one constant

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347962#comment-14347962
 ] 

Hudson commented on HDFS-1522:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7260 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7260/])
HDFS-1522. Combine two BLOCK_FILE_PREFIX constants into one. Contributed by 
Dongming Liang. (shv: rev 430b5371883e22abb65f37c3e3d4afc3f421fc89)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplication.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestCrcCorruption.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCorruption.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java


 Merge Block.BLOCK_FILE_PREFIX and DataStorage.BLOCK_FILE_PREFIX into one 
 constant
 -

 Key: HDFS-1522
 URL: https://issues.apache.org/jira/browse/HDFS-1522
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Konstantin Shvachko
Assignee: Dongming Liang
  Labels: patch
 Fix For: 2.7.0

 Attachments: HDFS-1522.002.patch, HDFS-1522.patch


 Two semantically identical constants, {{Block.BLOCK_FILE_PREFIX}} and 
 {{DataStorage.BLOCK_FILE_PREFIX}}, should be merged into one. It should be 
 defined in {{Block}}, imo.
 Also, uses of the {{blk_}} prefix, like in {{DirectoryScanner}}, should be 
 replaced by this constant.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7887) Asynchronous native RPC v9 client

2015-03-04 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-7887:


 Summary: Asynchronous native RPC v9 client
 Key: HDFS-7887
 URL: https://issues.apache.org/jira/browse/HDFS-7887
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Haohui Mai


There is more and more integration happening between Hadoop and applications 
that are implemented in languages other than Java.

To access Hadoop, applications either have to go through JNI (e.g. libhdfs) or 
reverse engineer the Hadoop RPC protocol (e.g. snakebite). Unfortunately, 
neither of them is satisfactory:

* Integrating with JNI requires running a JVM inside the application. Some 
applications (e.g., real-time processing, MPP databases) do not want the 
footprint and GC behavior of the JVM.
* The Hadoop RPC protocol has a rich feature set including wire encryption, 
SASL, and Kerberos authentication. Few 3rd-party implementations can fully 
cover the feature set, so they might only work in limited environments.

This JIRA proposes implementing a Hadoop RPC library in C++ that provides a 
common ground for implementing higher-level native clients for HDFS, YARN, and 
MapReduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7872) Erasure Coding: INodeFile.dumpTreeRecursively() supports to print striped blocks

2015-03-04 Thread Takuya Fukudome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Fukudome updated HDFS-7872:
--
Attachment: HDFS-7872.2.patch

Thank you for the review and your advice, [~jingzhao]! I have attached a new 
patch which directly calls {{getBlocks}}. Could you review it? Thank you.

 Erasure Coding: INodeFile.dumpTreeRecursively() supports to print striped 
 blocks
 

 Key: HDFS-7872
 URL: https://issues.apache.org/jira/browse/HDFS-7872
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Takuya Fukudome
Assignee: Takuya Fukudome
 Attachments: HDFS-7872.1.patch, HDFS-7872.2.patch


 We need to let dumpTreeRecursively be able to print striped blocks (or maybe 
 just the first striped block).
 {code}
  @Override
  public void dumpTreeRecursively(PrintWriter out, StringBuilder prefix,
      final int snapshotId) {
    super.dumpTreeRecursively(out, prefix, snapshotId);
    out.print(", fileSize=" + computeFileSize(snapshotId));
    // only compare the first block
    out.print(", blocks=");
    out.print(blocks == null || blocks.length == 0 ? null : blocks[0]);
    // TODO print striped blocks
    out.println();
  }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7855) Separate class Packet from DFSOutputStream

2015-03-04 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7855:

Status: Patch Available  (was: In Progress)

 Separate class Packet from DFSOutputStream
 --

 Key: HDFS-7855
 URL: https://issues.apache.org/jira/browse/HDFS-7855
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, 
 HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, 
 HDFS-7855-006.patch


 Class Packet is an inner class of DFSOutputStream and is also used by 
 DataStreamer. This sub-task separates Packet out of DFSOutputStream to aid 
 the separation in HDFS-7854.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7855) Separate class Packet from DFSOutputStream

2015-03-04 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7855:

Attachment: HDFS-7855-006.patch

 Separate class Packet from DFSOutputStream
 --

 Key: HDFS-7855
 URL: https://issues.apache.org/jira/browse/HDFS-7855
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, 
 HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch, 
 HDFS-7855-006.patch


 Class Packet is an inner class of DFSOutputStream and is also used by 
 DataStreamer. This sub-task separates Packet out of DFSOutputStream to aid 
 the separation in HDFS-7854.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7855) Separate class Packet from DFSOutputStream

2015-03-04 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7855:

Status: In Progress  (was: Patch Available)

 Separate class Packet from DFSOutputStream
 --

 Key: HDFS-7855
 URL: https://issues.apache.org/jira/browse/HDFS-7855
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7855-001.patch, HDFS-7855-002.patch, 
 HDFS-7855-003.patch, HDFS-7855-004.patch, HDFS-7855-005.patch


 Class Packet is an inner class of DFSOutputStream and is also used by 
 DataStreamer. This sub-task separates Packet out of DFSOutputStream to aid 
 the separation in HDFS-7854.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347779#comment-14347779
 ] 

Suresh Srinivas commented on HDFS-7885:
---

[~vgogate], can you please add details about what test you are doing and what 
issues you are seeing?


 Datanode should not trust the generation stamp provided by client
 -

 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Priority: Critical

 The Datanode should not trust the generation stamp provided by the client, 
 since it is prefetched and buffered in the client, and a concurrent append 
 may increase it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-7885:
--
Comment: was deleted

(was: Suresh, 

Wanted to give you some background and context here

As you may know we are working with Pivotal and they are going to white
label HDP as PHD starting with PHD 3.0.

They have raised the issue of 3 patches that were in their Hadoop distro
that are critical for HAWQ to work.  We have asked them to create Apache
JIRAs so our experts can evaluate them and consider for inclusion in HDP.

Hopefully they will add some more detail soon

-- 
Regards,

Vinod K. Nair 


Partner Product Management | (650) 224-9741 | vn...@hortonworks.com

5470 Great America Parkway, Santa Clara, CA
)

 Datanode should not trust the generation stamp provided by client
 -

 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Priority: Critical

 The Datanode should not trust the generation stamp provided by the client, 
 since it is prefetched and buffered in the client, and a concurrent append 
 may increase it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes

2015-03-04 Thread Yi Liu (JIRA)
Yi Liu created HDFS-7886:


 Summary: TestFileTruncate#testTruncateWithDataNodesRestart runs 
timeout sometimes
 Key: HDFS-7886
 URL: https://issues.apache.org/jira/browse/HDFS-7886
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.0
Reporter: Yi Liu
Priority: Minor


https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes

2015-03-04 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu reassigned HDFS-7886:


Assignee: Yi Liu

 TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
 

 Key: HDFS-7886
 URL: https://issues.apache.org/jira/browse/HDFS-7886
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.0
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Minor

 https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7740) Test truncate with DataNodes restarting

2015-03-04 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347939#comment-14347939
 ] 

Yi Liu commented on HDFS-7740:
--

I created HDFS-7886, thanks.

 Test truncate with DataNodes restarting
 ---

 Key: HDFS-7740
 URL: https://issues.apache.org/jira/browse/HDFS-7740
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Affects Versions: 2.7.0
Reporter: Konstantin Shvachko
Assignee: Yi Liu
 Fix For: 2.7.0

 Attachments: HDFS-7740.001.patch, HDFS-7740.002.patch, 
 HDFS-7740.003.patch


 Add a test case, which ensures replica consistency when DNs are failing and 
 restarting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread Zhanwei Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347963#comment-14347963
 ] 

Zhanwei Wang commented on HDFS-7885:


In {{getBlockLocalPathInfo}}, the input parameter {{block}} is passed by the 
client. Since the client buffers the file's metadata, 
{{block.getGenerationStamp()}} may be older than the real generation stamp on 
the Datanode. The Datanode will then report that it cannot find the metadata 
file, and the client fails to read.

{code}
  @Override // FsDatasetSpi
  public BlockLocalPathInfo getBlockLocalPathInfo(ExtendedBlock block)
      throws IOException {
    File datafile = getBlockFile(block);
    File metafile = FsDatasetUtil.getMetaFile(datafile,
        block.getGenerationStamp());
    BlockLocalPathInfo info = new BlockLocalPathInfo(block,
        datafile.getAbsolutePath(), metafile.getAbsolutePath());
    return info;
  }
{code}

Test case (a rough sketch as a MiniDFSCluster test follows below):

Enable short-circuit read and set {{dfs.client.use.legacy.blockreader.local}} 
to true.
1) Create a file with two blocks.
2) Open it for read, but do not read yet. (The client fetches the block 
metadata.)
3) Append to it. (This increases the generation stamp of the last block.)
4) Continue to read. (This will fail.)
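
Untested sketch of the repro (the config keys are the real short-circuit 
ones, but sizes and paths are arbitrary, and extra short-circuit setup such 
as {{dfs.block.local-path-access.user}} may be needed):

{code}
// Untested sketch: read with a stale, client-buffered generation stamp
// after a concurrent append.
Configuration conf = new HdfsConfiguration();
conf.setBoolean("dfs.client.read.shortcircuit", true);
conf.setBoolean("dfs.client.use.legacy.blockreader.local", true);
conf.setLong("dfs.blocksize", 1024);                      // small blocks
conf.setInt("dfs.namenode.fs-limits.min-block-size", 0);  // allow them
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build();
try {
  DistributedFileSystem fs = cluster.getFileSystem();
  Path p = new Path("/test");
  // 1) create a file with two blocks
  DFSTestUtil.createFile(fs, p, 2048, (short) 1, 0L);
  // 2) open for read but read nothing yet: block metadata is fetched now
  FSDataInputStream in = fs.open(p);
  // 3) append: bumps the generation stamp of the last block
  FSDataOutputStream out = fs.append(p);
  out.write(1);
  out.close();
  // 4) continue reading: fails while the DN trusts the stale genstamp
  byte[] buf = new byte[2048];
  IOUtils.readFully(in, buf, 0, buf.length);
  in.close();
} finally {
  cluster.shutdown();
}
{code}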

 Datanode should not trust the generation stamp provided by client
 -

 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Priority: Critical

 The Datanode should not trust the generation stamp provided by the client, 
 since it is prefetched and buffered in the client, and a concurrent append 
 may increase it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7885) Datanode should not trust the generation stamp provided by client

2015-03-04 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze reassigned HDFS-7885:
-

Assignee: Tsz Wo Nicholas Sze

 Datanode should not trust the generation stamp provided by client
 -

 Key: HDFS-7885
 URL: https://issues.apache.org/jira/browse/HDFS-7885
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.2.0
Reporter: vitthal (Suhas) Gogate
Assignee: Tsz Wo Nicholas Sze
Priority: Critical

 The Datanode should not trust the generation stamp provided by the client, 
 since it is prefetched and buffered in the client, and a concurrent append 
 may increase it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7434) DatanodeID hashCode should not be mutable

2015-03-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347805#comment-14347805
 ] 

Hudson commented on HDFS-7434:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7258 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7258/])
HDFS-7434. DatanodeID hashCode should not be mutable. Contributed by Daryn 
Sharp. (kihwal: rev 722b4794693d8bad1dee0ca5c2f99030a08402f9)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/DatanodeRegistration.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestComputeInvalidateWork.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDatanodeProtocolRetryPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 DatanodeID hashCode should not be mutable
 -

 Key: HDFS-7434
 URL: https://issues.apache.org/jira/browse/HDFS-7434
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Fix For: 2.7.0

 Attachments: HDFS-7434.patch


 Mutable hash codes may lead to orphaned instances in a collection. Instances 
 must always be removed prior to any modification of the values feeding the 
 hash code, and re-inserted afterwards. Although the current code appears to 
 do this, the mutable hash code is a landmine.
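
The hazard generalizes beyond DatanodeID; a self-contained demonstration (not 
project code):

{code}
// Demonstrates how mutating a field that feeds hashCode() orphans an
// instance inside a HashSet: it still hashes to the old bucket, so it can
// no longer be found or even removed.
import java.util.HashSet;
import java.util.Set;

class MutableId {
  int id;
  MutableId(int id) { this.id = id; }
  @Override public int hashCode() { return id; }
  @Override public boolean equals(Object o) {
    return o instanceof MutableId && ((MutableId) o).id == id;
  }
}

public class Orphan {
  public static void main(String[] args) {
    Set<MutableId> set = new HashSet<>();
    MutableId m = new MutableId(1);
    set.add(m);
    m.id = 2;                             // hash code changes in place
    System.out.println(set.contains(m));  // false: instance is orphaned
    System.out.println(set.remove(m));    // false: cannot even remove it
  }
}
{code}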



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

