[jira] [Commented] (HDFS-4340) Update addBlock() to include inode id as additional argument

2013-01-21 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558885#comment-13558885
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-4340:
--

Some more comments below:

- Since startFileInternal(..) is not changed, appendFileInt(..) does not need to 
return the file status.

- The old ClientProtocol.addBlock(..) should be removed.

- checkLease(String src, String holder, INode file) is not needed.  Only 
getAdditionalBlock(..) calls it, and fileId is already in the parameter list 
(see the sketch below).
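
For reference, a hedged sketch of what the extended ClientProtocol method could 
look like; the existing parameters follow trunk at the time, and the name and 
position of the new fileId parameter are assumptions rather than the final patch:

{code}
// Sketch only -- parameter names/order are illustrative, not the committed signature.
LocatedBlock addBlock(String src, String clientName,
    ExtendedBlock previous, DatanodeInfo[] excludeNodes,
    long fileId) throws IOException;
{code}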




 Update addBlock() to include inode id as additional argument
 

 Key: HDFS-4340
 URL: https://issues.apache.org/jira/browse/HDFS-4340
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client, namenode
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch, 
 HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch






[jira] [Commented] (HDFS-4350) Make enabling of stale marking on read and write paths independent

2013-01-21 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559010#comment-13559010
 ] 

Andrew Wang commented on HDFS-4350:
---

Todd's patch looks good to me. I ran the failed tests a couple of times locally 
and they passed, and the earlier runs on this jira were fine.

 Make enabling of stale marking on read and write paths independent
 --

 Key: HDFS-4350
 URL: https://issues.apache.org/jira/browse/HDFS-4350
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-4350-1.patch, hdfs-4350-2.patch, hdfs-4350-3.patch, 
 hdfs-4350-4.patch, hdfs-4350.txt


 Marking of datanodes as stale for the read and write path was introduced in 
 HDFS-3703 and HDFS-3912 respectively. This is enabled using two new keys, 
 {{DFS_NAMENODE_CHECK_STALE_DATANODE_KEY}} and 
 {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY}}. However, there currently 
 exists a dependency, since you cannot enable write marking without also 
 enabling read marking, since the first key enables both checking of staleness 
 and read marking.
 I propose renaming the first key to 
 {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY}} and making checking enabled 
 if either of the keys is set. This will allow read and write marking to be 
 enabled independently.
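
 As an illustration of the proposal, a minimal sketch (the read-side constant is 
 the proposed name, the defaults are assumptions, and {{conf}} is an 
 org.apache.hadoop.conf.Configuration):

{code}
// Sketch of the proposed decoupling -- constant names and defaults are assumptions.
boolean avoidStaleForRead = conf.getBoolean(
    DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY, false);
boolean avoidStaleForWrite = conf.getBoolean(
    DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, false);
// Staleness checking is needed if either the read or the write path avoids stale nodes.
boolean checkStaleness = avoidStaleForRead || avoidStaleForWrite;
{code}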



[jira] [Updated] (HDFS-4131) Add a tool to print the diff between two snapshots and diff of a snapshot from the current tree

2013-01-21 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4131:


Attachment: HDFS-4131.002.patch

Update the patch based on HDFS-4414+4131.002.patch in HDFS-4414: fix the code 
for checking if the metadata of a directory has been changed between snapshots.

 Add a tool to print the diff between two snapshots and diff of a snapshot 
 from the current tree
 ---

 Key: HDFS-4131
 URL: https://issues.apache.org/jira/browse/HDFS-4131
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Affects Versions: Snapshot (HDFS-2802)
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
 Attachments: HDFS-4131.001.patch, HDFS-4131.002.patch


 This jira tracks a tool to print the diff between two snapshots at a given path. 
 The tool will also print the difference between the current directory and the 
 given snapshot.



[jira] [Assigned] (HDFS-4131) Add a tool to print the diff between two snapshots and diff of a snapshot from the current tree

2013-01-21 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao reassigned HDFS-4131:
---

Assignee: Jing Zhao  (was: Suresh Srinivas)

 Add a tool to print the diff between two snapshots and diff of a snapshot 
 from the current tree
 ---

 Key: HDFS-4131
 URL: https://issues.apache.org/jira/browse/HDFS-4131
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Affects Versions: Snapshot (HDFS-2802)
Reporter: Suresh Srinivas
Assignee: Jing Zhao
 Attachments: HDFS-4131.001.patch, HDFS-4131.002.patch


 This jira tracks a tool to print the diff between two snapshots at a given path. 
 The tool will also print the difference between the current directory and the 
 given snapshot.



[jira] [Commented] (HDFS-4416) change dfs.datanode.domain.socket.path to dfs.domain.socket.path

2013-01-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559013#comment-13559013
 ] 

Todd Lipcon commented on HDFS-4416:
---

+1, committing momentarily

 change dfs.datanode.domain.socket.path to dfs.domain.socket.path
 

 Key: HDFS-4416
 URL: https://issues.apache.org/jira/browse/HDFS-4416
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, hdfs-client, performance
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-4416.001.patch, HDFS-4416.002.patch, 
 HDFS-4416.003.patch, HDFS-4416.004.patch


 {{dfs.datanode.domain.socket.path}} is used by both clients and the DataNode, 
 so it might be best to avoid putting 'datanode' in the name.  Most of the 
 configuration keys that have 'datanode' in the name apply only to the DN.
 Also, should change __PORT__ to _PORT to be consistent with _HOST, etc.
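
 To illustrate (the key name is the one proposed here; the default path and the 
 _PORT substitution behavior are assumptions):

{code}
// Illustration only -- after the rename, both the client and the DataNode would
// read the same datanode-neutral key. The default value shown is an assumption.
Configuration conf = new Configuration();
String domainSocketPath =
    conf.get("dfs.domain.socket.path", "/var/run/hadoop-hdfs/dn._PORT");
// _PORT would be substituted with the DataNode's port, analogous to _HOST.
{code}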



[jira] [Resolved] (HDFS-4416) change dfs.datanode.domain.socket.path to dfs.domain.socket.path

2013-01-21 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-4416.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to branch. Thanks, Colin.

 change dfs.datanode.domain.socket.path to dfs.domain.socket.path
 

 Key: HDFS-4416
 URL: https://issues.apache.org/jira/browse/HDFS-4416
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, hdfs-client, performance
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-4416.001.patch, HDFS-4416.002.patch, 
 HDFS-4416.003.patch, HDFS-4416.004.patch


 {{dfs.datanode.domain.socket.path}} is used by both clients and the DataNode, 
 so it might be best to avoid putting 'datanode' in the name.  Most of the 
 configuration keys that have 'datanode' in the name apply only to the DN.
 Also, should change __PORT__ to _PORT to be consistent with _HOST, etc.



[jira] [Updated] (HDFS-2554) Add separate metrics for missing blocks with desired replication level 1

2013-01-21 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-2554:


Target Version/s:   (was: )
  Status: Open  (was: Patch Available)

 Add separate metrics for missing blocks with desired replication level 1
 

 Key: HDFS-2554
 URL: https://issues.apache.org/jira/browse/HDFS-2554
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha
Reporter: Todd Lipcon
Assignee: Andy Isaacson
Priority: Minor
 Attachments: hdfs-2554-1.txt, hdfs-2554.txt


 Some users set the replication level to 1 for datasets which are unimportant 
 and can be lost without worry (e.g. the output of terasort tests). But other 
 data on the cluster is important and should not be lost. It would be useful 
 to separate the metric for missing blocks by the desired replication level of 
 those blocks, so that one could ignore missing blocks at repl 1 while still 
 alerting on missing blocks with higher desired replication.
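
 To illustrate the proposed split (a sketch only; the type and method names below 
 are stand-ins, not the BlockManager code):

{code}
// Sketch: tally missing blocks separately when their desired replication is 1.
long missingBlocksWithReplOne = 0;
long missingBlocksOther = 0;
for (MissingBlock b : missingBlockList) {      // hypothetical collection and type
  if (b.getDesiredReplication() == 1) {
    missingBlocksWithReplOne++;                // e.g. terasort output, safe to ignore
  } else {
    missingBlocksOther++;                      // worth alerting on
  }
}
{code}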



[jira] [Commented] (HDFS-4414) Create a DiffReport class to represent the diff between snapshots to end users

2013-01-21 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559027#comment-13559027
 ] 

Aaron T. Myers commented on HDFS-4414:
--

This seems like a great feature to add a public-facing (unstable or evolving) 
programmatic API for. Given that, would you consider moving this API to the 
HdfsAdmin class instead of DistributedFileSystem, which is only marked 
LimitedPrivate for MR and HBase?

 Create a DiffReport class to represent the diff between snapshots to end users
 --

 Key: HDFS-4414
 URL: https://issues.apache.org/jira/browse/HDFS-4414
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-4414.001.patch, HDFS-4414+4131.002.patch


 HDFS-4131 computes the difference between two snapshots (or between a 
 snapshot and the current tree). In this jira we create a DiffReport class to 
 represent the diff to end users.
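
 A purely hypothetical sketch of what such a report class might contain (the 
 actual class in the attached patches may differ):

{code}
// Hypothetical shape -- not the attached patch.
public class DiffReport {
  public enum DiffType { CREATE, MODIFY, DELETE, RENAME }

  public static class DiffEntry {
    public final DiffType type;
    public final String path;
    public DiffEntry(DiffType type, String path) {
      this.type = type;
      this.path = path;
    }
  }
  // plus the snapshot root, the two snapshot names being compared,
  // and a List<DiffEntry> of changes
}
{code}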



[jira] [Commented] (HDFS-4417) HDFS-347: fix case where local reads get disabled incorrectly

2013-01-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559116#comment-13559116
 ] 

Todd Lipcon commented on HDFS-4417:
---

{code}
-  private Peer newPeer(InetSocketAddress addr) throws IOException {
+  private Peer newRemotePeer(InetSocketAddress addr) throws IOException {
{code}

How about {{newTcpPeer}}? Remote is kind of vague.



{code}
+  public static DomainSocket getClosedSocket() {
+return new DomainSocket(, -1);
+  }
{code}

This doesn't seem like a reasonable thing to expose. Instead, since it's just 
used from tests, could you just create a mock DomainSocket object which throws 
ClosedChannelException on write?
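
A minimal sketch of that suggestion, assuming Mockito is on the test classpath 
(org.mockito.Mockito, org.mockito.stubbing.Answer, 
org.mockito.invocation.InvocationOnMock) plus java.nio.channels.ClosedChannelException:

{code}
// Sketch only: every call on the mock fails as if the channel were already closed.
DomainSocket closedSock = Mockito.mock(DomainSocket.class,
    new Answer<Object>() {
      @Override
      public Object answer(InvocationOnMock invocation) throws Throwable {
        throw new ClosedChannelException();
      }
    });
{code}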



I think the changes to PeerCache are a little over-complicated... why not just 
have two separate PeerCaches, one for each type of peer?


 HDFS-347: fix case where local reads get disabled incorrectly
 -

 Key: HDFS-4417
 URL: https://issues.apache.org/jira/browse/HDFS-4417
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, hdfs-client, performance
Reporter: Todd Lipcon
Assignee: Colin Patrick McCabe
 Attachments: HDFS-4417.002.patch, hdfs-4417.txt


 In testing HDFS-347 against HBase (thanks [~jdcryans]) we ran into the 
 following case:
 - a workload is running which puts a bunch of local sockets in the PeerCache
 - the workload abates for a while, causing the sockets to go stale (ie the 
 DN side disconnects after the keepalive timeout)
 - the workload starts again
 In this case, the local socket retrieved from the cache failed the 
 newBlockReader call, and it incorrectly disabled local sockets on that host. 
 This is similar to an earlier bug HDFS-3376, but not quite the same.
 The next issue we ran into is that, once this happened, it never tried local 
 sockets again, because the cache held lots of TCP sockets. Since we always 
 managed to get a cached socket to the local node, it didn't bother trying 
 local read again.



[jira] [Updated] (HDFS-4237) Add unit tests for HTTP-based filesystems against secure MiniDFSCluster

2013-01-21 Thread Stephen Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Chu updated HDFS-4237:
--

Attachment: HDFS-4237.patch.007

Thank you for the review, Andy. I've uploaded a new patch. In it...

I've removed the tab characters.

I used 200 * 1024 * 1024 instead of the bitshift.

I converted FileSystemContractBaseTest (and the classes that extend it) to 
JUnit4. Previously, it was written in JUnit3 style (extends TestCase), but the 
JUnit3 TestCase and the JUnit4 Assume are incompatible, e.g. HDFS-3966.
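
For context, the JUnit4 idiom in question looks roughly like this 
(isSecurityEnabled() is a hypothetical helper; in a JUnit3 TestCase the failed 
assumption would not skip the test):

{code}
import static org.junit.Assume.assumeTrue;
import org.junit.Before;
import org.junit.Test;

public class TestSecureContractSketch {
  @Before
  public void checkPrerequisites() {
    // Skips (rather than fails) the tests when the secure prerequisites are absent.
    assumeTrue(isSecurityEnabled());
  }

  @Test
  public void testSomething() { /* ... */ }

  private boolean isSecurityEnabled() { return false; }  // hypothetical stub
}
{code}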

How does it sound if I add a section on running/developing secure unit tests to 
the Developer Documentation at http://wiki.apache.org/hadoop/? Is there a 
better place for documentation?

 Add unit tests for HTTP-based filesystems against secure MiniDFSCluster
 ---

 Key: HDFS-4237
 URL: https://issues.apache.org/jira/browse/HDFS-4237
 Project: Hadoop HDFS
  Issue Type: Test
  Components: security, test, webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-4237.patch.001, HDFS-4237.patch.007


 Now that we can start a secure MiniDFSCluster (HADOOP-9004), we need more 
 security unit tests.
 A good area to add secure tests is the HTTP-based filesystems (WebHDFS, 
 HttpFs).



[jira] [Commented] (HDFS-4237) Add unit tests for HTTP-based filesystems against secure MiniDFSCluster

2013-01-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559219#comment-13559219
 ] 

Hadoop QA commented on HDFS-4237:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12565859/HDFS-4237.patch.007
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 11 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestSecureWebHdfsFileSystemContract

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3861//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3861//console

This message is automatically generated.

 Add unit tests for HTTP-based filesystems against secure MiniDFSCluster
 ---

 Key: HDFS-4237
 URL: https://issues.apache.org/jira/browse/HDFS-4237
 Project: Hadoop HDFS
  Issue Type: Test
  Components: security, test, webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-4237.patch.001, HDFS-4237.patch.007


 Now that we can start a secure MiniDFSCluster (HADOOP-9004), we need more 
 security unit tests.
 A good area to add secure tests is the HTTP-based filesystems (WebHDFS, 
 HttpFs).



[jira] [Commented] (HDFS-4237) Add unit tests for HTTP-based filesystems against secure MiniDFSCluster

2013-01-21 Thread Stephen Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559226#comment-13559226
 ] 

Stephen Chu commented on HDFS-4237:
---

Woops, I forgot the Assume check in TestSecureWebHdfsFileSystemContract.

 Add unit tests for HTTP-based filesystems against secure MiniDFSCluster
 ---

 Key: HDFS-4237
 URL: https://issues.apache.org/jira/browse/HDFS-4237
 Project: Hadoop HDFS
  Issue Type: Test
  Components: security, test, webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-4237.patch.001, HDFS-4237.patch.007


 Now that we can start a secure MiniDFSCluster (HADOOP-9004), we need more 
 security unit tests.
 A good area to add secure tests is the HTTP-based filesystems (WebHDFS, 
 HttpFs).



[jira] [Commented] (HDFS-4237) Add unit tests for HTTP-based filesystems against secure MiniDFSCluster

2013-01-21 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559231#comment-13559231
 ] 

Andy Isaacson commented on HDFS-4237:
-

{noformat}
+  String address = "127.0.0.1:" + port;
{noformat}
this line grew some trailing whitespace.

{{SecureHdfsTestUtil.java}} license comment has trailing whitespace.

{noformat}
+ * Our unit tests use 127.0.0.1/localhost to address the host running
+ * the tests. However, WebHDFS secure authentication using localhost is
+ * not allowed (kerberos authentication will complain it can't find
+ * the server). The actual hostname must be used. Therefore, to run
+ * the secure WebHDFS tests in your test environment, make 127.0.0.1
+ * resolve to the actual hostname.
{noformat}

I'm not sure this is an acceptable requirement, but let's go ahead and get it 
checked in as is.  Worst case we just back out this code.
(It would be better to teach the tests how to run in a reasonable environment 
where the hostname resolves to the actual eth0 address or similar.  This may 
mean that it's impossible to do JUnit-style tests of Kerberized security.)

bq. How does it sound if I add a section on running/developing secure unit tests 
to the Developer Documentation at http://wiki.apache.org/hadoop/? Is there a 
better place for documentation?

A wiki page sounds like an excellent start.  I think it belongs on a new page 
but you can use your judgment if you find a page where it fits in.


 Add unit tests for HTTP-based filesystems against secure MiniDFSCluster
 ---

 Key: HDFS-4237
 URL: https://issues.apache.org/jira/browse/HDFS-4237
 Project: Hadoop HDFS
  Issue Type: Test
  Components: security, test, webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Chu
 Attachments: HDFS-4237.patch.001, HDFS-4237.patch.007


 Now that we can start a secure MiniDFSCluster (HADOOP-9004), we need more 
 security unit tests.
 A good area to add secure tests is the HTTP-based filesystems (WebHDFS, 
 HttpFs).



[jira] [Updated] (HDFS-4340) Update addBlock() to include inode id as additional argument

2013-01-21 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-4340:
-

Attachment: HDFS-4340.patch

 Update addBlock() to include inode id as additional argument
 

 Key: HDFS-4340
 URL: https://issues.apache.org/jira/browse/HDFS-4340
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client, namenode
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch, 
 HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch






[jira] [Updated] (HDFS-4417) HDFS-347: fix case where local reads get disabled incorrectly

2013-01-21 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4417:
---

Attachment: HDFS-4417.003.patch

 HDFS-347: fix case where local reads get disabled incorrectly
 -

 Key: HDFS-4417
 URL: https://issues.apache.org/jira/browse/HDFS-4417
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, hdfs-client, performance
Reporter: Todd Lipcon
Assignee: Colin Patrick McCabe
 Attachments: HDFS-4417.002.patch, HDFS-4417.003.patch, hdfs-4417.txt


 In testing HDFS-347 against HBase (thanks [~jdcryans]) we ran into the 
 following case:
 - a workload is running which puts a bunch of local sockets in the PeerCache
 - the workload abates for a while, causing the sockets to go stale (ie the 
 DN side disconnects after the keepalive timeout)
 - the workload starts again
 In this case, the local socket retrieved from the cache failed the 
 newBlockReader call, and it incorrectly disabled local sockets on that host. 
 This is similar to an earlier bug HDFS-3376, but not quite the same.
 The next issue we ran into is that, once this happened, it never tried local 
 sockets again, because the cache held lots of TCP sockets. Since we always 
 managed to get a cached socket to the local node, it didn't bother trying 
 local read again.



[jira] [Commented] (HDFS-4417) HDFS-347: fix case where local reads get disabled incorrectly

2013-01-21 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559277#comment-13559277
 ] 

Colin Patrick McCabe commented on HDFS-4417:


bq. How about newTcpPeer? Remote is kind of vague.

Agree.

Using a mock for DomainSocket also worked out well.

For PeerCache, I tried out the two-cache solution, but it started getting 
pretty complicated, since we refer to the cache in many places.  Instead, I 
just added a boolean to the cache key.
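
Roughly, the key change described above could look like this (class and field 
names here are illustrative, not the actual PeerCache code):

{code}
// Illustrative key: the same datanode gets distinct cache entries for
// domain-socket peers vs. TCP peers.
class PeerCacheKey {
  final DatanodeID dnId;          // org.apache.hadoop.hdfs.protocol.DatanodeID
  final boolean isDomainSocket;   // the extra boolean mentioned above

  PeerCacheKey(DatanodeID dnId, boolean isDomainSocket) {
    this.dnId = dnId;
    this.isDomainSocket = isDomainSocket;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof PeerCacheKey)) {
      return false;
    }
    PeerCacheKey other = (PeerCacheKey) o;
    return dnId.equals(other.dnId) && isDomainSocket == other.isDomainSocket;
  }

  @Override
  public int hashCode() {
    return dnId.hashCode() * 31 + (isDomainSocket ? 1 : 0);
  }
}
{code}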

In {{TestParallelShortCircuitReadUnCached}}, since this *is* a regression test 
for HDFS-4417, I figured I needed some way to make sure that we were not 
falling back on TCP sockets to read.  So I added 
{{DFSInputStream#tcpReadsDisabledForTesting}}.

I considered several other solutions.  Any solution that makes TCP sockets 
unusable, like setting a bad {{SocketFactory}}, runs into trouble because the 
first part of the test needs to create the files that we're reading.  Killing 
the {{DataNode#dataXceiverServer}} thread after doing the writes seemed like a 
promising approach, but it caused exceptions in the {{DFSOutputStream}} worker 
threads, which led to the (only) {{DataNode}} getting kicked out of the 
cluster.  Another approach is to create a subclass for {{DFSInputStream}} that 
overrides {{DFSInputStream#newTcpPeer}} to throw an exception.  However, 
getting a {{DFSClient}} to return this subclass is difficult.  Possibly 
Mockito's partial mocks could help here.

 HDFS-347: fix case where local reads get disabled incorrectly
 -

 Key: HDFS-4417
 URL: https://issues.apache.org/jira/browse/HDFS-4417
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, hdfs-client, performance
Reporter: Todd Lipcon
Assignee: Colin Patrick McCabe
 Attachments: HDFS-4417.002.patch, HDFS-4417.003.patch, hdfs-4417.txt


 In testing HDFS-347 against HBase (thanks [~jdcryans]) we ran into the 
 following case:
 - a workload is running which puts a bunch of local sockets in the PeerCache
 - the workload abates for a while, causing the sockets to go stale (ie the 
 DN side disconnects after the keepalive timeout)
 - the workload starts again
 In this case, the local socket retrieved from the cache failed the 
 newBlockReader call, and it incorrectly disabled local sockets on that host. 
 This is similar to an earlier bug HDFS-3376, but not quite the same.
 The next issue we ran into is that, once this happened, it never tried local 
 sockets again, because the cache held lots of TCP sockets. Since we always 
 managed to get a cached socket to the local node, it didn't bother trying 
 local read again.



[jira] [Commented] (HDFS-4340) Update addBlock() to include inode id as additional argument

2013-01-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559300#comment-13559300
 ] 

Hadoop QA commented on HDFS-4340:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12565880/HDFS-4340.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3862//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3862//console

This message is automatically generated.

 Update addBlock() to include inode id as additional argument
 

 Key: HDFS-4340
 URL: https://issues.apache.org/jira/browse/HDFS-4340
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client, namenode
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch, 
 HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch






[jira] [Updated] (HDFS-4403) DFSClient can infer checksum type when not provided by reading first byte

2013-01-21 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-4403:
--

   Resolution: Fixed
Fix Version/s: 2.0.3-alpha
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks for reviewing, Aaron.

 DFSClient can infer checksum type when not provided by reading first byte
 -

 Key: HDFS-4403
 URL: https://issues.apache.org/jira/browse/HDFS-4403
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.0.3-alpha

 Attachments: hdfs-4403.txt, hdfs-4403.txt


 HDFS-3177 added the checksum type to OpBlockChecksumResponseProto, but the 
 new protobuf field is optional, with a default of CRC32. This means that this 
 API, when used against an older cluster (like earlier 0.23 releases), will 
 falsely return CRC32 even if that cluster has written files with CRC32C. This 
 can cause issues for distcp, for example.
 Instead of defaulting the protobuf field to CRC32, we can leave it with no 
 default, and if the OpBlockChecksumResponseProto has no checksum type set, 
 the client can send OP_READ_BLOCK to read the first byte of the block, then 
 grab the checksum type out of that response (which has always been present).
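
 In outline, the fallback would look something like this (a sketch: 
 hasCrcType()/getCrcType() are the accessors generated for the optional protobuf 
 field, and inferChecksumTypeByReading(..) is a hypothetical helper that issues 
 the OP_READ_BLOCK for the first byte):

{code}
// Sketch of the fallback -- accessor and helper names are assumptions.
ChecksumTypeProto ctype;
if (reply.hasCrcType()) {
  ctype = reply.getCrcType();
} else {
  // Older DataNode: the response carries no checksum type, so read the
  // first byte of the block and take the type from the read response.
  ctype = inferChecksumTypeByReading(block, datanode);   // hypothetical helper
}
{code}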



[jira] [Commented] (HDFS-4403) DFSClient can infer checksum type when not provided by reading first byte

2013-01-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559329#comment-13559329
 ] 

Hudson commented on HDFS-4403:
--

Integrated in Hadoop-trunk-Commit #3265 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3265/])
HDFS-4403. DFSClient can infer checksum type when not provided by reading 
first byte. Contributed by Todd Lipcon. (Revision 1436730)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436730
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileChecksumServlets.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto


 DFSClient can infer checksum type when not provided by reading first byte
 -

 Key: HDFS-4403
 URL: https://issues.apache.org/jira/browse/HDFS-4403
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.0.3-alpha

 Attachments: hdfs-4403.txt, hdfs-4403.txt


 HDFS-3177 added the checksum type to OpBlockChecksumResponseProto, but the 
 new protobuf field is optional, with a default of CRC32. This means that this 
 API, when used against an older cluster (like earlier 0.23 releases), will 
 falsely return CRC32 even if that cluster has written files with CRC32C. This 
 can cause issues for distcp, for example.
 Instead of defaulting the protobuf field to CRC32, we can leave it with no 
 default, and if the OpBlockChecksumResponseProto has no checksum type set, 
 the client can send OP_READ_BLOCK to read the first byte of the block, then 
 grab the checksum type out of that response (which has always been present).



[jira] [Commented] (HDFS-4417) HDFS-347: fix case where local reads get disabled incorrectly

2013-01-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559333#comment-13559333
 ] 

Todd Lipcon commented on HDFS-4417:
---

{code}
+  @VisibleForTesting
+  public void killDataXceiverServer() {
+if (dataXceiverServer != null) {
+  ((DataXceiverServer) this.dataXceiverServer.getRunnable()).kill();
+  this.dataXceiverServer.interrupt();
+  dataXceiverServer = null;
+}
+  }
{code}

I think you forgot to delete this attempt that you didn't end up using. Also, the 
removal of the assert in {{kill}} shouldn't be in the patch anymore.



{code}
+  return Mockito.mock(DomainSocket.class,
+      new Answer<Object>() {
+        @Override
+        public Object answer(InvocationOnMock invocation) throws Throwable {
+          throw new RuntimeException("...");
+        }
+      });
{code}

Can you add a one-line comment explaining this, like 'Return a mock which 
always throws exceptions on any of its function calls'? Also, fill in the 
exception text with something like "Injected fault" instead of "...".



Looks like your patch might be missing the new test case? I don't see anything 
setting the {{tcpReadsDisabledForTesting}} flag, nor the 
{{TestParallelShortCircuitReadUnCached}} class you mentioned.

 HDFS-347: fix case where local reads get disabled incorrectly
 -

 Key: HDFS-4417
 URL: https://issues.apache.org/jira/browse/HDFS-4417
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, hdfs-client, performance
Reporter: Todd Lipcon
Assignee: Colin Patrick McCabe
 Attachments: HDFS-4417.002.patch, HDFS-4417.003.patch, hdfs-4417.txt


 In testing HDFS-347 against HBase (thanks [~jdcryans]) we ran into the 
 following case:
 - a workload is running which puts a bunch of local sockets in the PeerCache
 - the workload abates for a while, causing the sockets to go stale (ie the 
 DN side disconnects after the keepalive timeout)
 - the workload starts again
 In this case, the local socket retrieved from the cache failed the 
 newBlockReader call, and it incorrectly disabled local sockets on that host. 
 This is similar to an earlier bug HDFS-3376, but not quite the same.
 The next issue we ran into is that, once this happened, it never tried local 
 sockets again, because the cache held lots of TCP sockets. Since we always 
 managed to get a cached socket to the local node, it didn't bother trying 
 local read again.



[jira] [Comment Edited] (HDFS-4126) Add reading/writing snapshot information to FSImage

2013-01-21 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559393#comment-13559393
 ] 

Suresh Srinivas edited comment on HDFS-4126 at 1/22/13 4:42 AM:


# DFSUtil#byte2String - add javadoc
# FSImageFormat.java
#* In the javadoc, SnapshotID under FSImage should be snapshotCounter or 
nextSnapshotID. Should we change the SnapshotManager#snapshotID to 
SnapshotManager#snapshotCounter?
#* As per our conversation, the INodeFile FSImage ContainsBlock will change 
when we do the file level diff and simplify the FSImage. Hence I am okay with 
the current code.
#* ComputedFileSize in javadoc could be called snapshotFileSize. The 
corresponding variable name could also be updated accordingly.
#* Snapshot in javadoc is missing snapshot name?
#* javadoc could consolidate the snapshot-supported fields together
#* loadRoot should return void and numFiles-- should be used. Returning 1 
always just for decrement purposes does not seem intuitive.
#* Snapshot related methods should be moved to an inner class or separate 
class. This can be done in a separate jira.
# In the FileWithSnapshot implementation, the #insertBefore and #removeSelf code 
seems to be repeated?
# Add a summary of test information to the javadoc of test methods
# For the commented-out tests, can you please add a TODO and a brief description?


 Add reading/writing snapshot information to FSImage
 ---

 Key: HDFS-4126
 URL: https://issues.apache.org/jira/browse/HDFS-4126
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Affects Versions: Snapshot (HDFS-2802)
Reporter: Suresh Srinivas
Assignee: Jing Zhao
 Attachments: HDFS-4126.001.patch, HDFS-4126.002.patch, 
 HDFS-4126.002.patch


 After the changes proposed in HDFS-4125 are completed, reading and writing 
 snapshot-related information from the FSImage can be implemented. This jira 
 tracks changes required for:
 # Loading snapshot information from FSImage
 # Loading snapshot related operations from editlog
 # Writing snapshot information in FSImage
 # Unit tests related to this functionality



[jira] [Commented] (HDFS-4126) Add reading/writing snapshot information to FSImage

2013-01-21 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559393#comment-13559393
 ] 

Suresh Srinivas commented on HDFS-4126:
---

# DFSUtil#byte2String - add javadoc
# FSImageFormat.java
#* In the javadoc, SnapshotID under FSImage should be snapshotCounter or 
nextSnapshotID. Should we change the SnapshotManager#snapshotID to 
SnapshotManager#snapshotCounter?
#* As per our conversation, the INodeFile FSImage ContainsBlock will change 
when we do the file level diff and simplify the FSImage. Hence I am okay with 
the current code.
#* ComputedFileSize in javadoc could be called snapshotFileSize. The 
corresponding variable name could also be updated accordingly.
#* Snapshot in javadoc is missing snapshot name?
#* javadoc could consolidate the snapshot-supported fields together
#* loadRoot should return void and numFiles-- should be used. Returning 1 
always just for decrement purposes does not seem intuitive.
#* Snapshot related methods should be moved to an inner class or separate 
class. This can be done in a separate jira.
# In the FileWithSnapshot implementation, the #insertBefore and #removeSelf code 
seems to be repeated?
# Add a summary of test information to the javadoc of test methods
# For the commented-out tests, can you please add a TODO and a brief description?


 Add reading/writing snapshot information to FSImage
 ---

 Key: HDFS-4126
 URL: https://issues.apache.org/jira/browse/HDFS-4126
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Affects Versions: Snapshot (HDFS-2802)
Reporter: Suresh Srinivas
Assignee: Jing Zhao
 Attachments: HDFS-4126.001.patch, HDFS-4126.002.patch, 
 HDFS-4126.002.patch


 After the changes proposed in HDFS-4125 are completed, reading and writing 
 snapshot-related information from the FSImage can be implemented. This jira 
 tracks changes required for:
 # Loading snapshot information from FSImage
 # Loading snapshot related operations from editlog
 # Writing snapshot information in FSImage
 # Unit tests related to this functionality



[jira] [Commented] (HDFS-4403) DFSClient can infer checksum type when not provided by reading first byte

2013-01-21 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559397#comment-13559397
 ] 

Suresh Srinivas commented on HDFS-4403:
---

Todd, sorry got busy with other things. +1 for the change as well.

Consider adding a brief release note describing the issue with prior branches, 
to help users understand it.

 DFSClient can infer checksum type when not provided by reading first byte
 -

 Key: HDFS-4403
 URL: https://issues.apache.org/jira/browse/HDFS-4403
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.0.3-alpha

 Attachments: hdfs-4403.txt, hdfs-4403.txt


 HDFS-3177 added the checksum type to OpBlockChecksumResponseProto, but the 
 new protobuf field is optional, with a default of CRC32. This means that this 
 API, when used against an older cluster (like earlier 0.23 releases), will 
 falsely return CRC32 even if that cluster has written files with CRC32C. This 
 can cause issues for distcp, for example.
 Instead of defaulting the protobuf field to CRC32, we can leave it with no 
 default, and if the OpBlockChecksumResponseProto has no checksum type set, 
 the client can send OP_READ_BLOCK to read the first byte of the block, then 
 grab the checksum type out of that response (which has always been present).



[jira] [Updated] (HDFS-4403) DFSClient can infer checksum type when not provided by reading first byte

2013-01-21 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-4403:
--

Release Note: The HDFS implementation of getFileChecksum() can now operate 
correctly against earlier-version datanodes which do not include the checksum 
type information in their checksum response. The checksum type is automatically 
inferred by issuing a read of the first byte of each block.

 DFSClient can infer checksum type when not provided by reading first byte
 -

 Key: HDFS-4403
 URL: https://issues.apache.org/jira/browse/HDFS-4403
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.0.2-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.0.3-alpha

 Attachments: hdfs-4403.txt, hdfs-4403.txt


 HDFS-3177 added the checksum type to OpBlockChecksumResponseProto, but the 
 new protobuf field is optional, with a default of CRC32. This means that this 
 API, when used against an older cluster (like earlier 0.23 releases), will 
 falsely return CRC32 even if that cluster has written files with CRC32C. This 
 can cause issues for distcp, for example.
 Instead of defaulting the protobuf field to CRC32, we can leave it with no 
 default, and if the OpBlockChecksumResponseProto has no checksum type set, 
 the client can send OP_READ_BLOCK to read the first byte of the block, then 
 grab the checksum type out of that response (which has always been present).



[jira] [Commented] (HDFS-4340) Update addBlock() to include inode id as additional argument

2013-01-21 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559413#comment-13559413
 ] 

Brandon Li commented on HDFS-4340:
--

@Nicholas, the new patch addresses your comments. I synchronized 
streamer.start() to avoid the findbugs warnings. Please let me know if you 
think it's sort of overkill to do so. Thanks!

 Update addBlock() to include inode id as additional argument
 

 Key: HDFS-4340
 URL: https://issues.apache.org/jira/browse/HDFS-4340
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client, namenode
Affects Versions: 3.0.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch, 
 HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch, HDFS-4340.patch






[jira] [Commented] (HDFS-4366) Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks

2013-01-21 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559443#comment-13559443
 ] 

Todd Lipcon commented on HDFS-4366:
---

This looks good to me. +1. Nice patch, Derek.

I'll wait until tomorrow to commit in case anyone else wants to take a look - 
this is pretty important code so having a few eyes on it would be nice.

 Block Replication Policy Implementation May Skip Higher-Priority Blocks for 
 Lower-Priority Blocks
 -

 Key: HDFS-4366
 URL: https://issues.apache.org/jira/browse/HDFS-4366
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.5
Reporter: Derek Dagit
Assignee: Derek Dagit
 Attachments: HDFS-4366.patch, HDFS-4366.patch, HDFS-4366.patch, 
 hdfs-4366-unittest.patch


 In certain cases, higher-priority under-replicated blocks can be skipped by 
 the replication policy implementation.  The current implementation maintains, 
 for each priority level, an index into a list of blocks that are 
 under-replicated.  Together, the lists compose a priority queue (see note 
 later about branch-0.23).  In some cases when blocks are removed from a list, 
 the caller (BlockManager) properly handles the index into the list from which 
 it removed a block.  In some other cases, the index remains stationary while 
 the list changes.  Whenever this happens, and the removed block happened to 
 be at or before the index, the implementation will skip over a block when 
 selecting blocks for replication work.
 In situations when entire racks are decommissioned, leading to many 
 under-replicated blocks, loss of blocks can occur.
 Background: HDFS-1765
 This patch to trunk greatly improved the state of the replication policy 
 implementation.  Prior to the patch, the following details were true:
   * The block priority queue was no such thing: it was really a set of 
 trees that held blocks in their natural ordering, that being by block ID, 
 which resulted in iterator walks over the blocks in pseudo-random order.
   * There was only a single index into an iteration over all of the 
 blocks...
   * ... meaning the implementation was only successful in respecting 
 priority levels on the first pass.  Overall, the behavior was a 
 round-robin-type scheduling of blocks.
 After the patch
   * A proper priority queue is implemented, preserving log n operations 
 while iterating over blocks in the order added.
   * A separate index is kept for each priority level...
   * ... allowing for processing of the highest priority blocks first 
 regardless of which priority had last been processed.
 The change was suggested for branch-0.23 as well as trunk, but it does not 
 appear to have been pulled in.
 The problem:
 Although the indices are now tracked in a better way, there is a 
 synchronization issue, since the indices are managed outside of the methods 
 that modify the contents of the queue.
 Removal of a block from a priority level without adjusting the index can mean 
 that the index then points to the block after the block it originally pointed 
 to.  In the next round of scheduling for that priority level, the block 
 originally pointed to by the index is skipped.
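
 A stand-alone illustration of the skipping effect (not HDFS code): removing an 
 element at or before a saved index, without adjusting the index, makes the next 
 pass skip one element.

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexSkipDemo {
  public static void main(String[] args) {
    List<String> queue = new ArrayList<String>(Arrays.asList("b1", "b2", "b3", "b4"));
    int index = 2;                  // next scheduling pass should resume at "b3"
    queue.remove("b1");             // removed before the index; index not adjusted
    System.out.println(queue.get(index));   // prints "b4" -- "b3" was skipped
  }
}
{code}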

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira