[jira] Updated: (HDFS-1153) The navigation to /dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error

2010-08-18 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1153:
--

Status: Open  (was: Patch Available)

Canceling patch.  Ravi, can you please provide a trunk version?

> The navigation to /dfsnodelist.jsp  with invalid input parameters produces 
> NPE and HTTP 500 error
> -
>
> Key: HDFS-1153
> URL: https://issues.apache.org/jira/browse/HDFS-1153
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.2, 0.20.1
>Reporter: Ravi Phulari
>Assignee: Ravi Phulari
> Fix For: 0.20.3
>
> Attachments: HDFS-1153.patch
>
>
> Navigation to dfsnodelist.jsp  with invalid input parameters produces NPE and 
> HTTP 500 error. 
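
A minimal sketch of the kind of guard such a fix implies.  The parameter name 
(whatNodes) and the servlet-style handling are assumptions for illustration, 
not taken from HDFS-1153.patch:
{noformat}
// Hypothetical input validation for the JSP: reject bad parameters with an
// HTTP 400 instead of letting a null dereference surface as an HTTP 500.
String whatNodes = request.getParameter("whatNodes");   // assumed parameter name
if (whatNodes == null
    || !("LIVE".equalsIgnoreCase(whatNodes) || "DEAD".equalsIgnoreCase(whatNodes))) {
  response.sendError(HttpServletResponse.SC_BAD_REQUEST,
      "Invalid value for parameter whatNodes: " + whatNodes);
  return;
}
{noformat}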

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1347) TestDelegationToken uses mortbay.log for logging

2010-08-18 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1347:
--

 Hadoop Flags: [Reviewed]
 Assignee: Boris Shkolnik
Fix Version/s: 0.22.0
Affects Version/s: 0.22.0
  Component/s: test

+1.

> TestDelegationToken uses mortbay.log for logging
> 
>
> Key: HDFS-1347
> URL: https://issues.apache.org/jira/browse/HDFS-1347
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Boris Shkolnik
>Assignee: Boris Shkolnik
> Fix For: 0.22.0
>
> Attachments: HDFS-1347.patch
>
>
> needs to be changed to commons.log
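
For illustration, the swap being described amounts to something like the 
following sketch (not the attached patch; only the class name comes from the 
issue title):
{noformat}
// Before: Jetty's logger, pulled in by accident.
//   import org.mortbay.log.Log;
//   Log.info("...");

// After: commons-logging, as used elsewhere in Hadoop.
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class TestDelegationToken {
  private static final Log LOG = LogFactory.getLog(TestDelegationToken.class);
  // ... LOG.info(...), LOG.warn(...), etc.
}
{noformat}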

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1203) DataNode should sleep before reentering service loop after an exception

2010-08-18 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1203:
--

 Hadoop Flags: [Reviewed]
   Issue Type: Improvement  (was: Bug)
Fix Version/s: 0.22.0

+1. This sounds reasonable.

> DataNode should sleep before reentering service loop after an exception
> ---
>
> Key: HDFS-1203
> URL: https://issues.apache.org/jira/browse/HDFS-1203
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.0
>
> Attachments: hdfs-1203.txt
>
>
> When the DN gets an exception in response to a heartbeat, it logs it and 
> continues, but there is no sleep. I've occasionally seen bugs produce a case 
> where heartbeats continuously produce exceptions, and thus the DN floods the 
> NN with bad heartbeats. Adding a 1 second sleep at least throttles the error 
> messages for easier debugging and error isolation.
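
A minimal sketch of the proposed throttle; the loop flag, method, and logger 
names below are assumptions, not the actual DataNode code:
{noformat}
// Illustrative service loop: back off for one second after a failed
// heartbeat so a persistent error cannot flood the NameNode.
while (shouldRun) {                 // assumed loop flag
  try {
    offerService();                 // assumed stand-in for the DN's heartbeat work
  } catch (Exception e) {
    LOG.warn("Exception in service loop, sleeping 1s before retrying", e);
    try {
      Thread.sleep(1000);           // the 1-second throttle described above
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      break;
    }
  }
}
{noformat}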

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1203) DataNode should sleep before reentering service loop after an exception

2010-08-18 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1203:
--

Status: Open  (was: Patch Available)

> DataNode should sleep before reentering service loop after an exception
> ---
>
> Key: HDFS-1203
> URL: https://issues.apache.org/jira/browse/HDFS-1203
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.0
>
> Attachments: hdfs-1203.txt
>
>
> When the DN gets an exception in response to a heartbeat, it logs it and 
> continues, but there is no sleep. I've occasionally seen bugs produce a case 
> where heartbeats continuously produce exceptions, and thus the DN floods the 
> NN with bad heartbeats. Adding a 1 second sleep at least throttles the error 
> messages for easier debugging and error isolation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1203) DataNode should sleep before reentering service loop after an exception

2010-08-18 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1203:
--

Status: Patch Available  (was: Open)

Re-submitting to Hudson to get another run of the tests, just for completeness, 
since the original run has expired.  However, Hudson's not been around a lot 
lately, and so it may be more expedient for Todd to run the tests locally and 
report the results here, if he wishes.

> DataNode should sleep before reentering service loop after an exception
> ---
>
> Key: HDFS-1203
> URL: https://issues.apache.org/jira/browse/HDFS-1203
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.0
>
> Attachments: hdfs-1203.txt
>
>
> When the DN gets an exception in response to a heartbeat, it logs it and 
> continues, but there is no sleep. I've occasionally seen bugs produce a case 
> where heartbeats continuously produce exceptions, and thus the DN floods the 
> NN with bad heartbeats. Adding a 1 second sleep at least throttles the error 
> messages for easier debugging and error isolation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-535) TestFileCreation occasionally fails because of an exception in DataStreamer.

2010-08-19 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-535:
-

Status: Open  (was: Patch Available)

This patch does indeed cause testFsCloseAfterClusterShutdown to fail for me:
{noformat}
Testcase: testFsClose took 3.137 sec
Testcase: testFsCloseAfterClusterShutdown took 2.751 sec
  FAILED
Failed to close file after cluster shutdown
junit.framework.AssertionFailedError: Failed to close file after cluster shutdown
  at org.apache.hadoop.hdfs.TestFileCreation.testFsCloseAfterClusterShutdown(TestFileCreation.java:851)
{noformat}
Canceling patch for Konstantin to update, although I don't believe we've seen 
this problem for a while, so maybe we can just close this issue?

> TestFileCreation occasionally fails because of an exception in DataStreamer.
> 
>
> Key: HDFS-535
> URL: https://issues.apache.org/jira/browse/HDFS-535
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, test
>Affects Versions: 0.20.1
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Attachments: TestFileCreate.patch
>
>
> One of test cases, namely {{testFsCloseAfterClusterShutdown()}}, of 
> {{TestFileCreation}} fails occasionally.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2010-08-19 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900580#action_12900580
 ] 

Jakob Homan commented on HDFS-718:
--

I'm +1.  Once a Hadoop cluster is up and running in production it can 
potentially hold very critical and valuable information.  An extra, optional 
safeguard that saves one such cluster and doesn't add any serious complexity to 
the code is worth it.  A steady-state cluster is a very valuable thing...

> configuration parameter to prevent accidental formatting of HDFS filesystem
> ---
>
> Key: HDFS-718
> URL: https://issues.apache.org/jira/browse/HDFS-718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
> Environment: Any
>Reporter: Andrew Ryan
>Assignee: Andrew Ryan
>Priority: Minor
> Attachments: HDFS-718.patch-2.txt, HDFS-718.patch.txt
>
>
> Currently, any time the NameNode is not running, an HDFS filesystem will 
> accept the 'format' command, and will duly format itself. There are those of 
> us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
> this behavior. There is "Y/N" confirmation in the format command, but if the 
> formatter genuinely believes themselves to be doing the right thing, the 
> filesystem will be formatted.
> This patch adds a configuration parameter to the namenode, 
> dfs.namenode.support.allowformat, which defaults to "true," the current 
> behavior: always allow formatting if the NameNode is down or some other 
> process is not holding the namenode lock. But if 
> dfs.namenode.support.allowformat is set to "false," the NameNode will not 
> allow itself to be formatted until this config parameter is changed to "true".
> The general idea is that for production HDFS filesystems, the user would 
> format the HDFS once, then set dfs.namenode.support.allowformat to "false" 
> for all time.
> The attached patch was generated against trunk and +1's on my test machine. 
> We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2010-08-23 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901498#action_12901498
 ] 

Jakob Homan commented on HDFS-718:
--

One nit on the patch: since the patch was correctly changed to use the DFS 
constants - {{DFS_NAMENODE_SUPPORT_ALLOWFORMAT_KEY = 
"dfs.namenode.support.allowformat";}} - we should reference this key only 
through the constant, never directly.  Otherwise, +1 as it is.
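
For example, a format-time guard that references the key only through the 
constant might look roughly like this.  Only the constant's name comes from the 
patch; the surrounding code is an illustrative assumption:
{noformat}
// Hypothetical check in the NameNode format path.
boolean allowFormat = conf.getBoolean(
    DFSConfigKeys.DFS_NAMENODE_SUPPORT_ALLOWFORMAT_KEY, true); // default: allow
if (!allowFormat) {
  throw new IOException("The NameNode refuses to format this filesystem: "
      + DFSConfigKeys.DFS_NAMENODE_SUPPORT_ALLOWFORMAT_KEY + " is set to false");
}
{noformat}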


> configuration parameter to prevent accidental formatting of HDFS filesystem
> ---
>
> Key: HDFS-718
> URL: https://issues.apache.org/jira/browse/HDFS-718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
> Environment: Any
>Reporter: Andrew Ryan
>Assignee: Andrew Ryan
>Priority: Minor
> Attachments: HDFS-718.patch-2.txt, HDFS-718.patch.txt
>
>
> Currently, any time the NameNode is not running, an HDFS filesystem will 
> accept the 'format' command, and will duly format itself. There are those of 
> us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
> this behavior. There is "Y/N" confirmation in the format command, but if the 
> formatter genuinely believes themselves to be doing the right thing, the 
> filesystem will be formatted.
> This patch adds a configuration parameter to the namenode, 
> dfs.namenode.support.allowformat, which defaults to "true," the current 
> behavior: always allow formatting if the NameNode is down or some other 
> process is not holding the namenode lock. But if 
> dfs.namenode.support.allowformat is set to "false," the NameNode will not 
> allow itself to be formatted until this config parameter is changed to "true".
> The general idea is that for production HDFS filesystems, the user would 
> format the HDFS once, then set dfs.namenode.support.allowformat to "false" 
> for all time.
> The attached patch was generated against trunk and +1's on my test machine. 
> We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1352) Fix jsvc.location

2010-08-24 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1352:
--

Hadoop Flags: [Reviewed]

+1

> Fix jsvc.location
> -
>
> Key: HDFS-1352
> URL: https://issues.apache.org/jira/browse/HDFS-1352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.22.0
>
> Attachments: hdfs-1352-1.patch
>
>
> The jsvc specified in build.xml 404s, causing the build to fail, because 
> version 1.0.2 has been archived. Let's update the url; I'm not sure we want to 
> move to 1.0.3 and play the game where the build breaks with every jsvc dot 
> release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1353) Optimize number of block access tokens returned by getBlockLocations

2010-08-24 Thread Jakob Homan (JIRA)
Optimize number of block access tokens returned by getBlockLocations


 Key: HDFS-1353
 URL: https://issues.apache.org/jira/browse/HDFS-1353
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.22.0
Reporter: Jakob Homan
Assignee: Jakob Homan
 Fix For: 0.22.0


HDFS-1081 optimized the number of block access tokens (BATs) created in a 
single call to getBlockLocations, as this is an expensive operation.  However, 
that JIRA put off another optimization which was then made possible, which is 
to just send a single block access token across the wire (and maintain a single 
BAT on the client side).  This JIRA is for implementing that optimization.  
Since a single BAT is generated for all the blocks, we just write that single 
BAT to the wire, rather than writing n BATs for n blocks, as is currently done. 
 This turns out to be a useful optimization for files with very large numbers 
of blocks, as the new lone BAT is much larger than was a BAT previously.
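
In sketch form, the wire-side change amounts to the following; the method and 
variable names are assumed for illustration and are not the eventual patch:
{noformat}
// Before: one token serialized per block (n tokens for n blocks):
//   for (LocatedBlock b : blocks) { b.getBlockToken().write(out); }
//
// After: the single shared block access token, written once; the client
// reattaches it to every block it reads.
void writeBlockTokens(DataOutput out, Token<BlockTokenIdentifier> sharedBat)
    throws IOException {
  sharedBat.write(out);   // Token implements Writable
}
{noformat}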

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1352) Fix jsvc.location

2010-08-25 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1352:
--

Attachment: HDFS-1352-y20.patch

Patch for y20.  Not for commit.  I'd like to stick with 1.0.2, since that's the 
version we're using in production, and we've verified that it works.

> Fix jsvc.location
> -
>
> Key: HDFS-1352
> URL: https://issues.apache.org/jira/browse/HDFS-1352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.22.0
>
> Attachments: hdfs-1352-1.patch, HDFS-1352-y20.patch
>
>
> The jsvc specified in build.xml 404s, causing the build to fail, because 
> version 1.0.2 has been archived. Let's update the url; I'm not sure we want to 
> move to 1.0.3 and play the game where the build breaks with every jsvc dot 
> release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1352) Fix jsvc.location

2010-08-25 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902591#action_12902591
 ] 

Jakob Homan commented on HDFS-1352:
---

Eli, with Hudson missing at sea, can you run test-patch?  It should be fine, 
but we should double check.  Tests don't need to be run...  After that I'll 
commit it.

> Fix jsvc.location
> -
>
> Key: HDFS-1352
> URL: https://issues.apache.org/jira/browse/HDFS-1352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.22.0
>
> Attachments: hdfs-1352-1.patch, HDFS-1352-y20.patch
>
>
> The jsvc specified in build.xml 404s, causing the build to fail, because 
> version 1.0.2 has been archived. Let's update the url; I'm not sure we want to 
> move to 1.0.3 and play the game where the build breaks with every jsvc dot 
> release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1352) Fix jsvc.location

2010-08-25 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902626#action_12902626
 ] 

Jakob Homan commented on HDFS-1352:
---

HDFS uses svn:externals to fetch the test-patch script from Core, but doesn't 
support svn:externals.  So you either need to run tp from an svn checkout or 
(what I do) just manually copy the script over to ./src/test/bin/test-patch.sh


> Fix jsvc.location
> -
>
> Key: HDFS-1352
> URL: https://issues.apache.org/jira/browse/HDFS-1352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.22.0
>
> Attachments: hdfs-1352-1.patch, HDFS-1352-y20.patch
>
>
> The jsvc specified in build.xml 404s, causing the build to fail, because 
> version 1.0.2 has been archived. Let's update the url; I'm not sure we want to 
> move to 1.0.3 and play the game where the build breaks with every jsvc dot 
> release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1352) Fix jsvc.location

2010-08-25 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902631#action_12902631
 ] 

Jakob Homan commented on HDFS-1352:
---

^but doesn't support^but git doesn't support

... sigh.

> Fix jsvc.location
> -
>
> Key: HDFS-1352
> URL: https://issues.apache.org/jira/browse/HDFS-1352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.22.0
>
> Attachments: hdfs-1352-1.patch, HDFS-1352-y20.patch
>
>
> The jsvc specified in build.xml 404s, causing the build to fail, because 
> version 1.0.2 has been archived. Let's update the url; I'm not sure we want to 
> move to 1.0.3 and play the game where the build breaks with every jsvc dot 
> release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1352) Fix jsvc.location

2010-08-26 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902961#action_12902961
 ] 

Jakob Homan commented on HDFS-1352:
---

Verified manually.  Going to commit.

> Fix jsvc.location
> -
>
> Key: HDFS-1352
> URL: https://issues.apache.org/jira/browse/HDFS-1352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.22.0
>
> Attachments: hdfs-1352-1.patch, HDFS-1352-y20.patch
>
>
> The jsvc specified in build.xml 404s, causing the build to fail, because 
> version 1.0.2 has been archived. Let's update the url; I'm not sure we want to 
> move to 1.0.3 and play the game where the build breaks with every jsvc dot 
> release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1352) Fix jsvc.location

2010-08-26 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1352:
--

Status: Resolved  (was: Patch Available)
Resolution: Fixed

I've committed this.  Resolving as fixed.  Thanks, Eli.  Now that 1.0.2 has 
been archived, its location won't change again, so this shouldn't recur.

> Fix jsvc.location
> -
>
> Key: HDFS-1352
> URL: https://issues.apache.org/jira/browse/HDFS-1352
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 0.22.0
>
> Attachments: hdfs-1352-1.patch, HDFS-1352-y20.patch
>
>
> The jsvc specified in build.xml 404s, causing the build to fail, because 
> version 1.0.2 has been archived. Let's update the url; I'm not sure we want to 
> move to 1.0.3 and play the game where the build breaks with every jsvc dot 
> release.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1356) Provide information as to whether or not security is enabled on web interface for NameNode (part of HADOOP-6822)

2010-08-26 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1356:
--

Hadoop Flags: [Reviewed]

+1

> Provide information as to whether or not security is enabled on web interface 
> for NameNode (part of HADOOP-6822)
> 
>
> Key: HDFS-1356
> URL: https://issues.apache.org/jira/browse/HDFS-1356
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Boris Shkolnik
>Assignee: Boris Shkolnik
> Fix For: 0.22.0
>
> Attachments: HDFS-1356.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Optimize number of block access tokens returned by getBlockLocations

2010-08-30 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

Fix Version/s: 0.21.1
   (was: 0.22.0)
Affects Version/s: 0.21.0
   (was: 0.22.0)

> Optimize number of block access tokens returned by getBlockLocations
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1
>
>
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

Summary: Remove most of getBlockLocation optimization  (was: Optimize 
number of block access tokens returned by getBlockLocations)
Description: 

HDFS-1081 optimized the number of block access tokens (BATs) created in a 
single call to getBlockLocations, as this is an expensive operation.  However, 
that JIRA put off another optimization which was then made possible, which is 
to just send a single block access token across the wire (and maintain a single 
BAT on the client side).  This JIRA is for implementing that optimization.  
Since a single BAT is generated for all the blocks, we just write that single 
BAT to the wire, rather than writing n BATs for n blocks, as is currently done. 
 This turns out to be a useful optimization for files with very large numbers 
of blocks, as the new lone BAT is much larger than was a BAT previously.

  was:HDFS-1081 optimized the number of block access tokens (BATs) created in a 
single call to getBlockLocations, as this is an expensive operation.  However, 
that JIRA put off another optimization which was then made possible, which is 
to just send a single block access token across the wire (and maintain a single 
BAT on the client side).  This JIRA is for implementing that optimization.  
Since a single BAT is generated for all the blocks, we just write that single 
BAT to the wire, rather than writing n BATs for n blocks, as is currently done. 
 This turns out to be a useful optimization for files with very large numbers 
of blocks, as the new lone BAT is much larger than was a BAT previously.


While benchmarking this new patch, originally an addendum to HDFS-1081, we 
determined that 1081's original benchmarks were in error.  getBlockLocations 
was not the culprit in the performance degradation.  1081 didn't do any damage 
to speed, and with this addendum, actually does give some benefit for files 
with moderate numbers of blocks (see to-be-attached benchmarks).  However, 
since getBL isn't really a slow method, these gains aren't worth the extra 
complexity they introduce.  I'll upload the on-the-wire optimization patch, in 
case it becomes useful at some point, but I'm going to use this JIRA to roll 
back most of 1081, excluding some byte-array allocation that we can easily 
cache.  ...sigh.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1
>
> Attachments: Benchmarking results.xlsx, HDFS-1353-y20.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

Attachment: Benchmarking results.xlsx

Benchmarks of original patch, which optimized the on-the-wire combined block 
tokens.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1
>
> Attachments: Benchmarking results.xlsx, HDFS-1353-y20.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

Attachment: HDFS-1353-y20.patch

Patch for y20.  Not for commit here.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1
>
> Attachments: Benchmarking results.xlsx, HDFS-1353-y20.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

Attachment: HDFS-1353.patch

Patch for trunk and 21.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1, 0.22.0
>
> Attachments: Benchmarking results.xlsx, HDFS-1353-y20.patch, 
> HDFS-1353.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

   Status: Patch Available  (was: Open)
Fix Version/s: 0.22.0

Submitting patch.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1, 0.22.0
>
> Attachments: Benchmarking results.xlsx, HDFS-1353-y20.patch, 
> HDFS-1353.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1284) TestBlockToken fails

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1284:
--

Hadoop Flags: [Reviewed]

+1

> TestBlockToken fails
> 
>
> Key: HDFS-1284
> URL: https://issues.apache.org/jira/browse/HDFS-1284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Kan Zhang
> Attachments: h1284-01.patch, h1284-02.patch, h1284-03.patch
>
>
> Hudson runs fail several tests. {{TestBlockToken.testBlockTokenRpc}} is one 
> of them.
> [See 
> here|http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/423/testReport/junit/org.apache.hadoop.hdfs.security.token.block/TestBlockToken/testBlockTokenRpc/]

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1284) TestBlockToken fails

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1284:
--

   Status: Resolved  (was: Patch Available)
Fix Version/s: 0.22.0
   Resolution: Fixed

I've committed this.  Resolving as fixed.  Thanks, Kan!

> TestBlockToken fails
> 
>
> Key: HDFS-1284
> URL: https://issues.apache.org/jira/browse/HDFS-1284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Kan Zhang
> Fix For: 0.22.0
>
> Attachments: h1284-01.patch, h1284-02.patch, h1284-03.patch
>
>
> Hudson runs fail several tests. {{TestBlockToken.testBlockTokenRpc}} is one 
> of them.
> [See 
> here|http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/423/testReport/junit/org.apache.hadoop.hdfs.security.token.block/TestBlockToken/testBlockTokenRpc/]

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906101#action_12906101
 ] 

Jakob Homan commented on HDFS-1353:
---

bq. In BlockTokenIdentifier constructor, should we check for non-zero blockId? 
Earlier we checked for non-null blockIds array. 
The blockID is a random number and so might legitimately be 0; we don't need to 
check for that.  Thanks.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1, 0.22.0
>
> Attachments: Benchmarking results.xlsx, HDFS-1353-y20.patch, 
> HDFS-1353.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-1375) TestRefreshUserMappings fails w/o security enabled

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HDFS-1375.
---

Resolution: Not A Problem

As Kan reported, this has been fixed by another patch.  Closing.

> TestRefreshUserMappings fails w/o security enabled
> --
>
> Key: HDFS-1375
> URL: https://issues.apache.org/jira/browse/HDFS-1375
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 0.22.0
>Reporter: Eli Collins
> Fix For: 0.22.0
>
>
> {noformat}
> - Standard Error -
> refreshUserToGroupsMappings: Kerberos service principal name isn't configured 
> properly (should have 3 parts): 
> auth for userL1 failed
> auth for userL2 succeeded
> refreshSuperUserGroupsConfiguration: Kerberos service principal name isn't 
> configured properly (should have 3 parts): 
> -  ---
> Testcase: testGroupMappingRefresh took 9.352 sec
> FAILED
> Should be different group: eli1 and eli1
> junit.framework.AssertionFailedError: Should be different group: eli1 and eli1
> at 
> org.apache.hadoop.security.TestRefreshUserMappings.testGroupMappingRefresh(TestRefreshUserMappings.java:122)
> Testcase: testRefreshSuperUserGroupsConfiguration took 2.334 sec
> FAILED
> second auth for user2 should've failed 
> junit.framework.AssertionFailedError: second auth for user2 should've failed 
> at 
> org.apache.hadoop.security.TestRefreshUserMappings.testRefreshSuperUserGroupsConfiguration(TestRefreshUserMappings.java:199)
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1355) ant veryclean (clean-cache) doesn't clean enough

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1355:
--

Hadoop Flags: [Reviewed]

+1

> ant veryclean (clean-cache) doesn't clean enough
> 
>
> Key: HDFS-1355
> URL: https://issues.apache.org/jira/browse/HDFS-1355
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Luke Lu
>Assignee: Luke Lu
> Fix For: 0.22.0
>
> Attachments: hdfs-1355-trunk-v1.patch
>
>
> Looks like since HDFS-1159, ant veryclean no longer works as expected for the 
> case when hadoop common jars are changed. The proposed patch does a more 
> thorough cleaning of the hadoop jars.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1355) ant veryclean (clean-cache) doesn't clean enough

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1355:
--

Status: Resolved  (was: Patch Available)
Resolution: Fixed

I've committed this.  Resolving as fixed.  Thanks, Luke!

> ant veryclean (clean-cache) doesn't clean enough
> 
>
> Key: HDFS-1355
> URL: https://issues.apache.org/jira/browse/HDFS-1355
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Luke Lu
>Assignee: Luke Lu
> Fix For: 0.22.0
>
> Attachments: hdfs-1355-trunk-v1.patch
>
>
> Looks like since HDFS-1159, ant veryclean no longer works as expected for the 
> case when hadoop common jars are changed. The proposed patch does a more 
> thorough cleaning of the hadoop jars.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-881) Refactor DataNode Packet header into DataTransferProtocol

2010-09-03 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906144#action_12906144
 ] 

Jakob Homan commented on HDFS-881:
--

> I ran commit-tests locally and they passed. 
Does this mean just the commit test target or all of the tests?  Since we're 
relying on existing tests to verify this refactor, it'd be good to include the 
test-patch and results for the whole test suite.  Hudson is still quite flaky, 
so it's probably best not to rely on it...

> Refactor DataNode Packet header into DataTransferProtocol
> -
>
> Key: HDFS-881
> URL: https://issues.apache.org/jira/browse/HDFS-881
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-881.txt, hdfs-881.txt
>
>
> The Packet Header format is used ad-hoc in various places. This JIRA is to 
> refactor it into a class inside DataTransferProtocol (like was done with 
> PipelineAck)
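
As a sketch, the refactored header could be shaped roughly as below.  The field 
set is assumed from the usual HDFS packet layout rather than copied from the 
patch:
{noformat}
// Hypothetical PacketHeader nested in DataTransferProtocol; implementing
// Writable keeps the serialized form explicit in one place.
public static class PacketHeader implements Writable {
  private int packetLen;            // length of this packet
  private long offsetInBlock;       // where the data fits within the block
  private long seqno;               // packet sequence number
  private boolean lastPacketInBlock;
  private int dataLen;              // bytes of actual packet data

  public void write(DataOutput out) throws IOException {
    out.writeInt(packetLen);
    out.writeLong(offsetInBlock);
    out.writeLong(seqno);
    out.writeBoolean(lastPacketInBlock);
    out.writeInt(dataLen);
  }

  public void readFields(DataInput in) throws IOException {
    packetLen = in.readInt();
    offsetInBlock = in.readLong();
    seqno = in.readLong();
    lastPacketInBlock = in.readBoolean();
    dataLen = in.readInt();
  }
}
{noformat}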

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-881) Refactor DataNode Packet header into DataTransferProtocol

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-881:
-

Hadoop Flags: [Reviewed]

+1 otherwise.  Thanks for updating the patch.

> Refactor DataNode Packet header into DataTransferProtocol
> -
>
> Key: HDFS-881
> URL: https://issues.apache.org/jira/browse/HDFS-881
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-881.txt, hdfs-881.txt
>
>
> The Packet Header format is used ad-hoc in various places. This JIRA is to 
> refactor it into a class inside DataTransferProtocol (like was done with 
> PipelineAck)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1361) Add -fileStatus operation to NNThroughputBenchmark

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1361:
--

Hadoop Flags: [Reviewed]

+1, although this class is due for a refactor. 

> Add -fileStatus operation to NNThroughputBenchmark
> --
>
> Key: HDFS-1361
> URL: https://issues.apache.org/jira/browse/HDFS-1361
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.22.0
>
> Attachments: NNThroughput-fileStatus.patch
>
>
> getFileStatus() is a frequently used operation in HDFS. It is important to 
> benchmark the name-node throughput on it.
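
A rough sketch of what such an operation measures; the field names and setup 
are assumptions, not the attached patch:
{noformat}
// Hypothetical measurement loop: time repeated getFileInfo() calls (the
// NameNode-side implementation behind getFileStatus) against precreated paths.
long start = System.currentTimeMillis();
for (int i = 0; i < numOpsRequired; i++) {
  nameNode.getFileInfo(fileNames[i % fileNames.length]);
}
long elapsedMs = System.currentTimeMillis() - start;
System.out.println("getFileStatus ops/sec: "
    + (numOpsRequired * 1000.0 / elapsedMs));
{noformat}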

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906188#action_12906188
 ] 

Jakob Homan commented on HDFS-1353:
---

Ran tests manually.  All passed except the known-bad TestHDFSTrash and 
TestFileConcurrentReader.  Test-patch: 
{noformat}
 [exec] -1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 3 new or modified tests.
 [exec]
 [exec] -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec]
 [exec] +1 system tests framework.  The patch passed system tests framework compile.
{noformat}
The javadoc warning has been around for a while and is bogus.  I plan to commit this.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1, 0.22.0
>
> Attachments: Benchmarking results.xlsx, HDFS-1353-y20.patch, 
> HDFS-1353.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

Attachment: HDFS-1353-optmized-wire-not-to-be-committed.patch

For completeness' sake, here are the planned optimizations referenced in the 
spreadsheet... Not to be committed.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.1, 0.22.0
>
> Attachments: Benchmarking results.xlsx, 
> HDFS-1353-optmized-wire-not-to-be-committed.patch, HDFS-1353-y20.patch, 
> HDFS-1353.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Remove most of getBlockLocation optimization

2010-09-03 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

   Status: Resolved  (was: Patch Available)
 Hadoop Flags: [Reviewed]
Fix Version/s: (was: 0.21.1)
   Resolution: Fixed

I've committed this to trunk.  Resolving as fixed.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: Benchmarking results.xlsx, 
> HDFS-1353-optmized-wire-not-to-be-committed.patch, HDFS-1353-y20.patch, 
> HDFS-1353.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than was a BAT previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-881) Refactor DataNode Packet header into DataTransferProtocol

2010-09-07 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-881:
-

   Status: Resolved  (was: Patch Available)
Fix Version/s: 0.22.0
   Resolution: Fixed

I've committed this.  Resolving as fixed.  Thanks, Todd.

> Refactor DataNode Packet header into DataTransferProtocol
> -
>
> Key: HDFS-881
> URL: https://issues.apache.org/jira/browse/HDFS-881
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.0
>
> Attachments: hdfs-881.txt, hdfs-881.txt
>
>
> The Packet Header format is used ad-hoc in various places. This JIRA is to 
> refactor it into a class inside DataTransferProtocol (like was done with 
> PipelineAck)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-829) hdfsJniHelper.c: #include <error.h> is not portable

2010-09-07 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-829:
-

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
Assignee: Allen Wittenauer
  Resolution: Fixed

+1.  I verified that this patch allows libhdfs to compile on OSX (although the 
build was still not fully successful; I had to manually chmod +x 
./src/c++/libhdfs/install-sh to get a full build.  We should open a JIRA for 
that).  I've committed this.  Resolving as fixed.  Thanks, Allen.

> hdfsJniHelper.c: #include <error.h> is not portable
> ---
>
> Key: HDFS-829
> URL: https://issues.apache.org/jira/browse/HDFS-829
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
> Fix For: 0.22.0
>
> Attachments: HDFS-632.patch, hdfs-829.patch
>
>
> hdfsJniHelper.c includes <error.h> but this appears to be unnecessary, since 
> even under Linux none of the routines that are prototyped are used.  Worse 
> yet, error.h doesn't appear to be a standard header file so this breaks on 
> Mac OS X and Solaris and prevents libhdfs from being built.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1381) MiniDFSCluster documentation refers to out-of-date configuration parameters

2010-09-07 Thread Jakob Homan (JIRA)
MiniDFSCluster documentation refers to out-of-date configuration parameters
---

 Key: HDFS-1381
 URL: https://issues.apache.org/jira/browse/HDFS-1381
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.20.1
Reporter: Jakob Homan
 Fix For: 0.22.0


The javadoc for MiniDFSCluster makes repeated references to setting 
dfs.name.dir and dfs.data.dir.  These should be replaced with references to 
DFSConfigKeys' DFS_NAMENODE_NAME_DIR_KEY and DFS_DATANODE_DATA_DIR_KEY, 
respectively.  The old values are deprecated in DFSConfigKeys, but we should 
switch to the new values where ever we can.

Also, a quick search of the code shows that TestDFSStorageStateRecovery.java and 
UpgradeUtilities.java should be updated as well.
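
For illustration, the suggested direction in test setup code.  The constants 
come from DFSConfigKeys as noted above; the paths are placeholders:
{noformat}
Configuration conf = new HdfsConfiguration();
// Preferred: reference the keys through DFSConfigKeys...
conf.set(DFSConfigKeys.DFS_NAMENODE_NAME_DIR_KEY, "/tmp/dfs/name");
conf.set(DFSConfigKeys.DFS_DATANODE_DATA_DIR_KEY, "/tmp/dfs/data");
// ...rather than the deprecated literal strings:
//   conf.set("dfs.name.dir", ...);
//   conf.set("dfs.data.dir", ...);
{noformat}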

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2010-09-07 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-718:
-

Status: Open  (was: Patch Available)

> configuration parameter to prevent accidental formatting of HDFS filesystem
> ---
>
> Key: HDFS-718
> URL: https://issues.apache.org/jira/browse/HDFS-718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
> Environment: Any
>Reporter: Andrew Ryan
>Assignee: Andrew Ryan
>Priority: Minor
> Attachments: HDFS-718.patch-2.txt, HDFS-718.patch.txt
>
>
> Currently, any time the NameNode is not running, an HDFS filesystem will 
> accept the 'format' command, and will duly format itself. There are those of 
> us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
> this behavior. There is "Y/N" confirmation in the format command, but if the 
> formatter genuinely believes themselves to be doing the right thing, the 
> filesystem will be formatted.
> This patch adds a configuration parameter to the namenode, 
> dfs.namenode.support.allowformat, which defaults to "true," the current 
> behavior: always allow formatting if the NameNode is down or some other 
> process is not holding the namenode lock. But if 
> dfs.namenode.support.allowformat is set to "false," the NameNode will not 
> allow itself to be formatted until this config parameter is changed to "true".
> The general idea is that for production HDFS filesystems, the user would 
> format the HDFS once, then set dfs.namenode.support.allowformat to "false" 
> for all time.
> The attached patch was generated against trunk and +1's on my test machine. 
> We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2010-09-07 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-718:
-

Attachment: HDFS-718-3.patch

Went to review the patch and found it no longer applied.  Sync'ed with trunk 
and did some cleanup.  Mainly: switched to using DFSConfigKeys, fixed some 
values in the MiniDFSCluster setup, cleaned up logging, and changed the config 
value so 'allowformat' is not run together.  If a committer wants to +1 these 
changes, I feel this patch is ready to go, barring further objections.

> configuration parameter to prevent accidental formatting of HDFS filesystem
> ---
>
> Key: HDFS-718
> URL: https://issues.apache.org/jira/browse/HDFS-718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
> Environment: Any
>Reporter: Andrew Ryan
>Assignee: Andrew Ryan
>Priority: Minor
> Attachments: HDFS-718-3.patch, HDFS-718.patch-2.txt, 
> HDFS-718.patch.txt
>
>
> Currently, any time the NameNode is not running, an HDFS filesystem will 
> accept the 'format' command, and will duly format itself. There are those of 
> us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
> this behavior. There is "Y/N" confirmation in the format command, but if the 
> formatter genuinely believes themselves to be doing the right thing, the 
> filesystem will be formatted.
> This patch adds a configuration parameter to the namenode, 
> dfs.namenode.support.allowformat, which defaults to "true," the current 
> behavior: always allow formatting if the NameNode is down or some other 
> process is not holding the namenode lock. But if 
> dfs.namenode.support.allowformat is set to "false," the NameNode will not 
> allow itself to be formatted until this config parameter is changed to "true".
> The general idea is that for production HDFS filesystems, the user would 
> format the HDFS once, then set dfs.namenode.support.allowformat to "false" 
> for all time.
> The attached patch was generated against trunk and +1's on my test machine. 
> We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2010-09-07 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-718:
-

Status: Patch Available  (was: Open)

Submitting v3 of the patch to Hudson.

> configuration parameter to prevent accidental formatting of HDFS filesystem
> ---
>
> Key: HDFS-718
> URL: https://issues.apache.org/jira/browse/HDFS-718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
> Environment: Any
>Reporter: Andrew Ryan
>Assignee: Andrew Ryan
>Priority: Minor
> Attachments: HDFS-718-3.patch, HDFS-718.patch-2.txt, 
> HDFS-718.patch.txt
>
>
> Currently, any time the NameNode is not running, an HDFS filesystem will 
> accept the 'format' command, and will duly format itself. There are those of 
> us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
> this behavior. There is "Y/N" confirmation in the format command, but if the 
> formatter genuinely believes themselves to be doing the right thing, the 
> filesystem will be formatted.
> This patch adds a configuration parameter to the namenode, 
> dfs.namenode.support.allowformat, which defaults to "true," the current 
> behavior: always allow formatting if the NameNode is down or some other 
> process is not holding the namenode lock. But if 
> dfs.namenode.support.allowformat is set to "false," the NameNode will not 
> allow itself to be formatted until this config parameter is changed to "true".
> The general idea is that for production HDFS filesystems, the user would 
> format the HDFS once, then set dfs.namenode.support.allowformat to "false" 
> for all time.
> The attached patch was generated against trunk and +1's on my test machine. 
> We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1363) startFileInternal should return the last block of the file opened for append as an under-construction block

2010-09-10 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908173#action_12908173
 ] 

Jakob Homan commented on HDFS-1363:
---

Review:
 * BlockManager::createLocatedBlock().  Originally this method did more than 
just call the LocatedBlock constructor (back when it was part of FSNamesystem). 
 Now that this is all it does and it is called in just two places, maybe we can 
just remove it?
 * FSNamesystem.java:822.  The javadoc for setBlockTokens refers to the old, 
combined BATs, which we don't use any more.
 * Since the contract of startFileInternal has changed (it may now return 
either null or a LocatedBlock) and the method itself is quite convoluted, it 
would be good to spell this out in the method's javadoc.

Otherwise, looks good as a refactor.
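For illustration, a hedged sketch of the javadoc suggested above; the wording 
and the return convention are assumptions drawn from this issue's description, 
not the actual patch:
{noformat}/**
 * ... (existing javadoc) ...
 *
 * When the file is opened for append, converts its last block to an
 * under-construction block and returns it as a LocatedBlock; returns
 * null otherwise (e.g., when creating a new file).
 *
 * @return the last block as an under-construction block, or null
 */{noformat}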

> startFileInternal should return the last block of the file opened for append 
> as an under-construction block
> ---
>
> Key: HDFS-1363
> URL: https://issues.apache.org/jira/browse/HDFS-1363
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.21.1
>
> Attachments: appendFileSync.patch
>
>
> {{FSNamesystem.startFileInternal}} should convert the last block of the file 
> opened for append to an under-construction block and return it. This will let 
> remove the second synchronized section in {{FSNamesystem.appendFile()}} and 
> avoid redundant computations and potential inconsistencies as stated in 
> HDFS-1152.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1363) startFileInternal should return the last block of the file opened for append as an under-construction block

2010-09-10 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1363:
--

Status: Open  (was: Patch Available)

Canceling patch post review.  Also, looks like Hudson is AWOL again, so you may 
wish to run tests manually.

> startFileInternal should return the last block of the file opened for append 
> as an under-construction block
> ---
>
> Key: HDFS-1363
> URL: https://issues.apache.org/jira/browse/HDFS-1363
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.21.1
>
> Attachments: appendFileSync.patch
>
>
> {{FSNamesystem.startFileInternal}} should convert the last block of the file 
> opened for append to an under-construction block and return it. This will let 
> remove the second synchronized section in {{FSNamesystem.appendFile()}} and 
> avoid redundant computations and potential inconsistencies as stated in 
> HDFS-1152.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2010-09-10 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908197#action_12908197
 ] 

Jakob Homan commented on HDFS-718:
--

Hudson's AWOL: test-commit passes fine, so does test-patch: 
{noformat}[exec] +1 overall.  
[exec] 
[exec] +1 @author.  The patch does not contain any @author tags.
[exec] 
[exec] +1 tests included.  The patch appears to include 3 new or modified 
tests.
[exec] 
[exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
[exec] 
[exec] +1 javac.  The applied patch does not increase the total number of 
javac compiler warnings.
[exec] 
[exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
[exec] 
[exec] +1 release audit.  The applied patch does not increase the total 
number of release audit warnings.
[exec] 
[exec] +1 system tests framework.  The patch passed system tests framework 
compile.{noformat}


> configuration parameter to prevent accidental formatting of HDFS filesystem
> ---
>
> Key: HDFS-718
> URL: https://issues.apache.org/jira/browse/HDFS-718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
> Environment: Any
>Reporter: Andrew Ryan
>Assignee: Andrew Ryan
>Priority: Minor
> Attachments: HDFS-718-3.patch, HDFS-718.patch-2.txt, 
> HDFS-718.patch.txt
>
>
> Currently, any time the NameNode is not running, an HDFS filesystem will 
> accept the 'format' command, and will duly format itself. There are those of 
> us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
> this behavior. There is "Y/N" confirmation in the format command, but if the 
> formatter genuinely believes themselves to be doing the right thing, the 
> filesystem will be formatted.
> This patch adds a configuration parameter to the namenode, 
> dfs.namenode.support.allowformat, which defaults to "true," the current 
> behavior: always allow formatting if the NameNode is down or some other 
> process is not holding the namenode lock. But if 
> dfs.namenode.support.allowformat is set to "false," the NameNode will not 
> allow itself to be formatted until this config parameter is changed to "true".
> The general idea is that for production HDFS filesystems, the user would 
> format the HDFS once, then set dfs.namenode.support.allowformat to "false" 
> for all time.
> The attached patch was generated against trunk and +1's on my test machine. 
> We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1357) HFTP traffic served by DataNode shouldn't use service port on NameNode

2010-09-10 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908217#action_12908217
 ] 

Jakob Homan commented on HDFS-1357:
---

Hudson's AWOL and the previous run is no longer available.  Ran commit-tests 
and they passed.  test-patch: 
{noformat} [exec] -1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] -1 tests included.  The patch doesn't appear to include any new 
or modified tests.
 [exec] Please justify why no new tests are needed 
for this patch.
 [exec] Also please list what manual steps were 
performed to verify this patch.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 system tests framework.  The patch passed system tests 
framework compile.
{noformat}
Kan manually tested this.  +1; I'm going to commit.  

> HFTP traffic served by DataNode shouldn't use service port on NameNode 
> ---
>
> Key: HDFS-1357
> URL: https://issues.apache.org/jira/browse/HDFS-1357
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, security
>Reporter: Kan Zhang
>Assignee: Kan Zhang
> Attachments: h1357-01.patch, h1357-02.patch
>
>
> HDFS-599 introduced a new service port on NameNode to separate system traffic 
> (e.g., heartbeats/blockreports) from client file access requests so that they 
> can be prioritized.  All Datanode traffic now goes to the service port. 
> However, the datanode also serves as a proxy for HFTP requests from clients 
> (served by the StreamFile servlet). This HFTP traffic should continue to use 
> the client port on NameNode. Moreover, using the service port for HFTP is 
> incompatible with the existing way of selecting delegation tokens.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1357) HFTP traffic served by DataNode shouldn't use service port on NameNode

2010-09-10 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1357:
--

   Status: Resolved  (was: Patch Available)
 Hadoop Flags: [Reviewed]
Fix Version/s: 0.22.0
   Resolution: Fixed

I've committed this.  Resolving as fixed.  Thanks, Kan!

> HFTP traffic served by DataNode shouldn't use service port on NameNode 
> ---
>
> Key: HDFS-1357
> URL: https://issues.apache.org/jira/browse/HDFS-1357
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, security
>Reporter: Kan Zhang
>Assignee: Kan Zhang
> Fix For: 0.22.0
>
> Attachments: h1357-01.patch, h1357-02.patch
>
>
> HDFS-599 introduced a new service port on NameNode to separate system traffic 
> (e.g., heartbeats/blockreports) from client file access requests so that they 
> can be prioritized.  All Datanode traffic now goes to the service port. 
> However, the datanode also serves as a proxy for HFTP requests from clients 
> (served by the StreamFile servlet). This HFTP traffic should continue to use 
> the client port on NameNode. Moreover, using the service port for HFTP is 
> incompatible with the existing way of selecting delegation tokens.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1363) startFileInternal should return the last block of the file opened for append as an under-construction block

2010-09-14 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1363:
--

Hadoop Flags: [Reviewed]

+1.  

> startFileInternal should return the last block of the file opened for append 
> as an under-construction block
> ---
>
> Key: HDFS-1363
> URL: https://issues.apache.org/jira/browse/HDFS-1363
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.21.1
>
> Attachments: appendFileSync.patch, appendFileSync.patch
>
>
> {{FSNamesystem.startFileInternal}} should convert the last block of the file 
> opened for append to an under-construction block and return it. This will let 
> remove the second synchronized section in {{FSNamesystem.appendFile()}} and 
> avoid redundant computations and potential inconsistencies as stated in 
> HDFS-1152.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1401) TestFileConcurrentReader test case is still timing out / failing

2010-09-14 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1401:
--

Priority: Critical  (was: Minor)

> TestFileConcurrentReader test case is still timing out / failing
> 
>
> Key: HDFS-1401
> URL: https://issues.apache.org/jira/browse/HDFS-1401
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.22.0
>Reporter: Tanping Wang
>Priority: Critical
>
> The unit test case TestFileConcurrentReader, after its most recent fix in 
> HDFS-1310, still times out when using java 1.6.0_07; the test case simply 
> hangs.  On the apache Hudson build (which possibly uses a higher sub-version 
> of java) this test case has produced inconsistent results: it sometimes 
> passes and sometimes fails. For example, between the most recent builds 423, 
> 424, and 425 there was no effective change; however, the test case failed on 
> build 424 and passed on build 425.
> build 424 test failed
> https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk/424/testReport/org.apache.hadoop.hdfs/TestFileConcurrentReader/
> build 425 test passed
> https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk/425/testReport/org.apache.hadoop.hdfs/TestFileConcurrentReader/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1394) modify -format option for namenode to generated new blockpool id and accept newcluster

2010-09-21 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913253#action_12913253
 ] 

Jakob Homan commented on HDFS-1394:
---

Review (line numbers from patch):
* Before Line 138: Is there anything that needs to be done for simulated 
storage?
* DataStorage.java's only change is whitespace. This can be removed.
* Nit: FsImage::guessClusterId seems like an odd name.  determineClusterId?
* Reading through guessClusterId I was initially confused as to whether it was 
OK to return null to indicate failure.  Javadoc on the method would help.
* Line 279: {{if(!(System.in.read() == 'Y'))}} could be simplified to 
{{if(System.in.read() != 'Y')}}
* In the tests there are quite a few calls to {{NameNode.clusterIdStr = 
"TestClusterId";}}; these should be refactored into a static method on 
{{GenericTestUtils}} (see the sketch after this list).
* There are currently no tests that go over the {{-genclusterid}} paths.  This 
could be tested in {{TestHDFSCLI}}.
* This changes user interaction with the NameNode, so the Forrest docs need to 
be updated as well.
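A minimal sketch of the suggested test helper; its placement in 
GenericTestUtils and the method name are assumptions:
{noformat}// Hypothetical addition to GenericTestUtils (name is an assumption):
public static void setClusterId(String clusterId) {
  // One place for tests to set the id, instead of the repeated
  // NameNode.clusterIdStr = "TestClusterId"; assignments.
  NameNode.clusterIdStr = clusterId;
}{noformat}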

> modify -format option for namenode to generated new blockpool id and accept 
> newcluster
> --
>
> Key: HDFS-1394
> URL: https://issues.apache.org/jira/browse/HDFS-1394
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: Federation Branch
>Reporter: Boris Shkolnik
>Assignee: Boris Shkolnik
> Attachments: HDFS-1394-5.patch, HDFS-1394-6.patch, HDFS-1394-7.patch, 
> HDFS-1394-7.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1413) Broken links to HDFS Wiki in hdfs site and documentation.

2010-09-21 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913305#action_12913305
 ] 

Jakob Homan commented on HDFS-1413:
---

There are a couple of odd whitespace changes in CSS files in 
siteWikiLink.patch.  Otherwise, +1.

> Broken links to HDFS Wiki in hdfs site and documentation.
> -
>
> Key: HDFS-1413
> URL: https://issues.apache.org/jira/browse/HDFS-1413
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.21.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.21.1
>
> Attachments: siteWikiLink.patch, WikiLink.patch
>
>
> # hdfs/site wiki tab points to "http://wiki.apache.org/hadoop/DFS", should be 
> HDFS.
> # hdfs documentation wiki tab points to "http://wiki.apache.org/hadoop/hdfs", 
> should be HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1394) modify -format option for namenode to generated new blockpool id and accept newcluster

2010-09-21 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913319#action_12913319
 ] 

Jakob Homan commented on HDFS-1394:
---

+1

> modify -format option for namenode to generated new blockpool id and accept 
> newcluster
> --
>
> Key: HDFS-1394
> URL: https://issues.apache.org/jira/browse/HDFS-1394
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: Federation Branch
>Reporter: Boris Shkolnik
>Assignee: Boris Shkolnik
> Attachments: HDFS-1394-5.patch, HDFS-1394-6.patch, HDFS-1394-7.patch, 
> HDFS-1394-7.patch, HDFS-1394-8.patch, HDFS-1394-9.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1430) Don't print entire stack trace to NN log when denying file access

2010-09-29 Thread Jakob Homan (JIRA)
Don't print entire stack trace to NN log when denying file access
-

 Key: HDFS-1430
 URL: https://issues.apache.org/jira/browse/HDFS-1430
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.21.0
Reporter: Jakob Homan
 Fix For: 0.22.0


Currently when a user attempts to access a file/directory he/she doesn't have 
access to, we include the entire stack trace from the exception that is 
generated.  Denying access is a routine event and the stack trace is just noise.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1430) Don't print entire stack trace to NN log when denying file access

2010-09-29 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916294#action_12916294
 ] 

Jakob Homan commented on HDFS-1430:
---

An example:
{noformat}2010-09-01 19:30:52,760 INFO org.apache.hadoop.ipc.Server: IPC Server 
handler 20 on 8020, call
getListing(/the/swimming/pool/in/the/library, [...@94bf925) from 
123.456.789.10:60262: error:
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=themaster, access=READ_EXECUTE,
inode="20100115":thedoctor:thetardis:rwxr-x---
org.apache.hadoop.security.AccessControlException: Permission denied: 
user=themaster, access=READ_EXECUTE,
inode="20100115":thedoctor:thetardis:rwxr-x---
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:199)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:134)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4672)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:4636)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:2081)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.getListing(NameNode.java:651)
at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:519)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1285)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1281)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:978)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1279){noformat}
22 lines is a bit excessive.  A single line will suffice.
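A rough sketch of the one-line alternative; the method and the LOG field (the 
class's commons-logging Log) are assumptions, not the eventual patch:
{noformat}// Hypothetical: log denials in one line, keep the trace at debug level.
private void logAccessDenied(AccessControlException ace) {
  LOG.info("Permission denied: " + ace.getMessage());  // one line, no trace
  if (LOG.isDebugEnabled()) {
    LOG.debug("Stack trace for permission denial", ace);  // opt-in detail
  }
}{noformat}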

> Don't print entire stack trace to NN log when denying file access
> -
>
> Key: HDFS-1430
> URL: https://issues.apache.org/jira/browse/HDFS-1430
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
> Fix For: 0.22.0
>
>
> Currently when a user attempts to access a file/directory he/she doesn't have 
> access to, we include the entire stack trace from the exception that is 
> generated.  Denying access is a routine event and the stack trace is just 
> noise.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-855) namenode can save images in parallel to all directories in fs.name.dir

2010-09-30 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HDFS-855.
--

Resolution: Duplicate

> namenode can save images in parallel to all directories in fs.name.dir
> --
>
> Key: HDFS-855
> URL: https://issues.apache.org/jira/browse/HDFS-855
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>
> The namenode restart times can be reduced if the namenode can save its image 
> to multiple directories (specified in fs.name.dir) in parallel.
> The NN has a 6 GB fsimage and a 1 MB edits file. The NN needed 10 minutes to 
> load fsimage/edits into memory; it needs 7 minutes to read the 6 GB image. 
> There are two directories in fs.name.dir. It takes about 2 minutes to save 
> the image into the first directory and another 2 minutes to save the image to 
> the second directory. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1367) Add alternative search-provider to HDFS site

2010-10-04 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1367:
--

  Resolution: Fixed
Hadoop Flags:   (was: [Reviewed])
  Status: Resolved  (was: Patch Available)

Alex, if there are follow-up changes that need to be done, separate JIRAs will 
need to be filed; we don't normally accept patches on resolved issues.  Also, 
post-project split, there do need to be separate JIRAs for each of the 
projects.  Re-resolving as fixed.

> Add alternative search-provider to HDFS site
> 
>
> Key: HDFS-1367
> URL: https://issues.apache.org/jira/browse/HDFS-1367
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Alex Baranau
>Assignee: Alex Baranau
>Priority: Minor
> Attachments: HDFS-1367-common.patch, HDFS-1367-main.patch, 
> HDFS-1367-mapreduce.patch, HDFS-1367.patch
>
>
> Use search-hadoop.com service to make available search in HDFS sources, MLs, 
> wiki, etc.
> This was initially proposed on user mailing list. The search service was 
> already added in site's skin (common for all Hadoop related projects) before 
> so this issue is about enabling it for HDFS. The ultimate goal is to use it 
> at all Hadoop's sub-projects' sites.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1300) Decommissioning nodes does not increase replication priority

2010-10-04 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1300:
--

Status: Open  (was: Patch Available)

Dmytro- Looks like the updated patch dropped the unit test. Can you upload a 
new one with it re-attached?  Thanks.

> Decommissioning nodes does not increase replication priority
> 
>
> Key: HDFS-1300
> URL: https://issues.apache.org/jira/browse/HDFS-1300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0, 0.20.2, 0.20.1, 0.20-append, 0.20.3, 0.22.0
>Reporter: Dmytro Molkov
>Assignee: Dmytro Molkov
> Fix For: 0.22.0
>
> Attachments: HDFS-1300.2.patch, HDFS-1300.patch
>
>
> Currently when you decommission a node each block is only inserted into 
> neededReplications if it is not there yet. This causes a problem of a block 
> sitting in a low priority queue when all replicas sit on the nodes being 
> decommissioned.
> The common use case for decommissioning nodes for us is to proactively 
> exclude them before they go bad, so it would be great to get the blocks at 
> risk onto the live datanodes as quickly as possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1367) Add alternative search-provider to HDFS site

2010-10-04 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1367:
--

Hadoop Flags: [Reviewed]

> Add alternative search-provider to HDFS site
> 
>
> Key: HDFS-1367
> URL: https://issues.apache.org/jira/browse/HDFS-1367
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Alex Baranau
>Assignee: Alex Baranau
>Priority: Minor
> Attachments: HDFS-1367-common.patch, HDFS-1367-main.patch, 
> HDFS-1367-mapreduce.patch, HDFS-1367.patch
>
>
> Use search-hadoop.com service to make available search in HDFS sources, MLs, 
> wiki, etc.
> This was initially proposed on user mailing list. The search service was 
> already added in site's skin (common for all Hadoop related projects) before 
> so this issue is about enabling it for HDFS. The ultimate goal is to use it 
> at all Hadoop's sub-projects' sites.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1440) TestComputeInvalidateWork fails intermittently

2010-10-04 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1440:
--

Hadoop Flags: [Reviewed]

+1

> TestComputeInvalidateWork fails intermittently
> --
>
> Key: HDFS-1440
> URL: https://issues.apache.org/jira/browse/HDFS-1440
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.22.0
>
> Attachments: HDFS-1440.patch
>
>
> TestComputeInvalidateWork fails intermittently. This is due to incorrect 
> synchronization introduced by HDFS-1093. The test uses blocks synchronized 
> on the FSNamesystem monitor; however, with HDFS-1093, FSNamesystem uses an 
> explicit read/write lock for synchronization.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-908) TestDistributedFileSystem fails with Wrong FS on weird hosts

2010-10-04 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917855#action_12917855
 ] 

Jakob Homan commented on HDFS-908:
--

Hudson's AWOL.  Todd, please run tests and test-patch manually so this can be 
committed.

> TestDistributedFileSystem fails with Wrong FS on weird hosts
> 
>
> Key: HDFS-908
> URL: https://issues.apache.org/jira/browse/HDFS-908
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.20.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-908.txt, hdfs-908.txt
>
>
> On the same host where I experienced HDFS-874, I also experience this failure 
> for TestDistributedFileSystem:
> Testcase: testFileChecksum took 0.492 sec
>   Caused an ERROR
> Wrong FS: hftp://localhost.localdomain:59782/filechecksum/foo0, expected: 
> hftp://127.0.0.1:59782
> java.lang.IllegalArgumentException: Wrong FS: 
> hftp://localhost.localdomain:59782/filechecksum/foo0, expected: 
> hftp://127.0.0.1:59782
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
>   at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:222)
>   at 
> org.apache.hadoop.hdfs.HftpFileSystem.getFileChecksum(HftpFileSystem.java:318)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testFileChecksum(TestDistributedFileSystem.java:166)
> Doesn't appear to occur on trunk or branch-0.21.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-908) TestDistributedFileSystem fails with Wrong FS on weird hosts

2010-10-04 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-908:
-

Priority: Minor  (was: Major)

> TestDistributedFileSystem fails with Wrong FS on weird hosts
> 
>
> Key: HDFS-908
> URL: https://issues.apache.org/jira/browse/HDFS-908
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.20.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-908.txt, hdfs-908.txt
>
>
> On the same host where I experienced HDFS-874, I also experience this failure 
> for TestDistributedFileSystem:
> Testcase: testFileChecksum took 0.492 sec
>   Caused an ERROR
> Wrong FS: hftp://localhost.localdomain:59782/filechecksum/foo0, expected: 
> hftp://127.0.0.1:59782
> java.lang.IllegalArgumentException: Wrong FS: 
> hftp://localhost.localdomain:59782/filechecksum/foo0, expected: 
> hftp://127.0.0.1:59782
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
>   at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:222)
>   at 
> org.apache.hadoop.hdfs.HftpFileSystem.getFileChecksum(HftpFileSystem.java:318)
>   at 
> org.apache.hadoop.hdfs.TestDistributedFileSystem.testFileChecksum(TestDistributedFileSystem.java:166)
> Doesn't appear to occur on trunk or branch-0.21.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1339) NameNodeMetrics should use MetricsTimeVaryingLong

2010-10-05 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918361#action_12918361
 ] 

Jakob Homan commented on HDFS-1339:
---

Scott: I'm curious, have you guys run into a situation where an int was not 
sufficient?
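(For scale: a signed 32-bit counter tops out at 2^31 - 1, about 2.1 billion; a 
namenode averaging ~1,000 such operations per second would reach that in under 
a month, so overflow on a busy cluster is plausible.)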

> NameNodeMetrics should use MetricsTimeVaryingLong 
> --
>
> Key: HDFS-1339
> URL: https://issues.apache.org/jira/browse/HDFS-1339
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: Scott Chen
>Assignee: Scott Chen
>Priority: Minor
> Attachments: HDFS-1339.txt
>
>
> NameNodeMetrics uses MetricsTimeVaryingInt. We see that FileInfoOps and 
> GetBlockLocations overflow in our cluster.
> Using MetricsTimeVaryingLong will easily solve this problem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1353) Remove most of getBlockLocation optimization

2010-10-06 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1353:
--

Attachment: HDFS-1353-y20-2.patch

Uploading updated patch.  We changed the fix a bit to not bump the RPC protocol 
version since it's a minor fix.  Not for commit to Apache.

> Remove most of getBlockLocation optimization
> 
>
> Key: HDFS-1353
> URL: https://issues.apache.org/jira/browse/HDFS-1353
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: Benchmarking results.xlsx, 
> HDFS-1353-optmized-wire-not-to-be-committed.patch, HDFS-1353-y20-2.patch, 
> HDFS-1353-y20.patch, HDFS-1353.patch
>
>
> 
> HDFS-1081 optimized the number of block access tokens (BATs) created in a 
> single call to getBlockLocations, as this is an expensive operation.  
> However, that JIRA put off another optimization which was then made possible, 
> which is to just send a single block access token across the wire (and 
> maintain a single BAT on the client side).  This JIRA is for implementing 
> that optimization.  Since a single BAT is generated for all the blocks, we 
> just write that single BAT to the wire, rather than writing n BATs for n 
> blocks, as is currently done.  This turns out to be a useful optimization for 
> files with very large numbers of blocks, as the new lone BAT is much larger 
> than a BAT was previously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-270) DFS Upgrade should process dfs.data.dirs in parallel

2010-10-06 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-270:
-

  Component/s: data-node
 Priority: Major  (was: Minor)
Affects Version/s: 0.20.2
Fix Version/s: 0.22.0

> DFS Upgrade should process dfs.data.dirs in parallel
> 
>
> Key: HDFS-270
> URL: https://issues.apache.org/jira/browse/HDFS-270
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.20.2
>Reporter: Stu Hood
>Assignee: Matt Foley
> Fix For: 0.22.0
>
>
> I just upgraded from 0.14.2 to 0.15.0, and things went very smoothly, if a 
> little slowly.
> The main reason the upgrade took so long was the block upgrades on the 
> datanodes. Each of our datanodes has 3 drives listed for the dfs.data.dir 
> parameter. From looking at the logs, it is fairly clear that the upgrade 
> procedure does not attempt to upgrade all listed dfs.data.dir's in parallel.
> I think even if all of your dfs.data.dir's are on the same physical device, 
> there would still be an advantage to performing the upgrade process in 
> parallel. The less downtime, the better: especially if it is potentially 20 
> minutes versus 60 minutes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1150) Verify datanodes' identities to clients in secure clusters

2010-10-07 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1150:
--

Attachment: RequireSecurePorts.patch

Small follow-up patch.  Our Ops team had requested that the secure datanode 
not bail out, but rather give a warning, if non-privileged ports were 
specified during the transition to secure ports.  Now that the transition is 
done, this patch changes the secure datanode to throw an RTE if provided with 
non-privileged ports.  This is for Y!20 only; trunk already has this behavior.
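A rough sketch of the check this implies; the method, parameter names, and 
message are assumptions, not the Y!20 patch:
{noformat}// Hypothetical privileged-port check for the secure datanode:
static void checkSecurePorts(int streamingPort, int infoPort) {
  // Ports below 1024 can only be bound by root, which is what lets
  // clients trust the datanode's identity in a secure cluster.
  if (streamingPort >= 1024 || infoPort >= 1024) {
    throw new RuntimeException("Cannot start secure datanode on "
        + "non-privileged ports " + streamingPort + " and " + infoPort
        + "; both must be below 1024.");
  }
}{noformat}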

> Verify datanodes' identities to clients in secure clusters
> --
>
> Key: HDFS-1150
> URL: https://issues.apache.org/jira/browse/HDFS-1150
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: commons-daemon-1.0.2-src.tar.gz, 
> HDFS-1150-BF-Y20-LOG-DIRS-2.patch, HDFS-1150-BF-Y20-LOG-DIRS.patch, 
> HDFS-1150-BF1-Y20.patch, hdfs-1150-bugfix-1.1.patch, 
> hdfs-1150-bugfix-1.2.patch, hdfs-1150-bugfix-1.patch, 
> HDFS-1150-trunk-2.patch, HDFS-1150-trunk-3.patch, HDFS-1150-trunk.patch, 
> HDFS-1150-Y20-BetterJsvcHandling.patch, HDFS-1150-y20.build-script.patch, 
> HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, 
> HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, 
> HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, 
> HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt, 
> RequireSecurePorts.patch
>
>
> Currently we use block access tokens to allow datanodes to verify clients' 
> identities, however we don't have a way for clients to verify the 
> authenticity of the datanodes themselves.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1444) Test related code of build.xml is error-prone and needs to be re-aligned.

2010-10-08 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1444:
--

Hadoop Flags: [Reviewed]

+1

> Test related code of build.xml is error-prone and needs to be re-aligned.
> -
>
> Key: HDFS-1444
> URL: https://issues.apache.org/jira/browse/HDFS-1444
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.21.1
>Reporter: Konstantin Boudnik
>Assignee: Konstantin Boudnik
>Priority: Minor
> Attachments: HDFS-1444.patch
>
>
> The test-related parts of build.xml introduce at least two (effectively 
> different) destinations for compiled test classes.
> Then some extra logic is applied at, say, the test-jar creation step, where 
> the content of one is copied over to the other. Etc.
> This seems overcomplicated and had better be fixed to prevent possible 
> issues with future build modifications.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1452) ant compile-contrib is broken

2010-10-11 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920012#action_12920012
 ] 

Jakob Homan commented on HDFS-1452:
---

Looks like hdfsproxy can't find MiniDFSCluster...
{noformat}compile-test:
 [echo] contrib: hdfsproxy
[javac] Compiling 9 source files to 
/Users/jhoman/work/git/hadoop-hdfs/build/contrib/hdfsproxy/test
[javac] 
/Users/jhoman/work/git/hadoop-hdfs/src/contrib/hdfsproxy/src/test/org/apache/hadoop/hdfsproxy/TestHdfsProxy.java:39:
 cannot find symbol
[javac] symbol  : class MiniDFSCluster
[javac] location: package org.apache.hadoop.hdfs
[javac] import org.apache.hadoop.hdfs.MiniDFSCluster;
[javac]  ^
[javac] 
/Users/jhoman/work/git/hadoop-hdfs/src/contrib/hdfsproxy/src/test/org/apache/hadoop/hdfsproxy/TestHdfsProxy.java:203:
 cannot find symbol
[javac] symbol  : class MiniDFSCluster
[javac] location: class org.apache.hadoop.hdfsproxy.TestHdfsProxy
[javac] MiniDFSCluster cluster = null;
[javac] ^
[javac] 
/Users/jhoman/work/git/hadoop-hdfs/src/contrib/hdfsproxy/src/test/org/apache/hadoop/hdfsproxy/TestHdfsProxy.java:213:
 cannot find symbol
[javac] symbol  : class MiniDFSCluster
[javac] location: class org.apache.hadoop.hdfsproxy.TestHdfsProxy
[javac]   cluster = new MiniDFSCluster(dfsConf, 2, true, null);
[javac] ^
[javac] 3 errors{noformat}


> ant compile-contrib is broken
> -
>
> Key: HDFS-1452
> URL: https://issues.apache.org/jira/browse/HDFS-1452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: contrib/hdfsproxy
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
> Fix For: 0.22.0
>
>
> ant compile-contrib is broken, looks like commit 
> a0a62d971fb35de7f021ecbd6ceb8d08ef923ed5 HDFS-1444

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1452) ant compile-contrib is broken

2010-10-11 Thread Jakob Homan (JIRA)
ant compile-contrib is broken
-

 Key: HDFS-1452
 URL: https://issues.apache.org/jira/browse/HDFS-1452
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/hdfsproxy
Affects Versions: 0.22.0
Reporter: Jakob Homan
 Fix For: 0.22.0


ant compile-contrib is broken, looks like commit 
a0a62d971fb35de7f021ecbd6ceb8d08ef923ed5 HDFS-1444


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1452) ant compile-contrib is broken

2010-10-11 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920020#action_12920020
 ] 

Jakob Homan commented on HDFS-1452:
---

With Cos' patch, I can compile test-contrib.  However, TestHdfsProxy fails for 
me, as it did before this patch.  Even omitting that test, I can't get cleanly 
through the test run; instead I get an exception from test-cactus.  This corner 
of the code is a mess, but it's not HDFS-1444's fault.  +1 on the patch.

> ant compile-contrib is broken
> -
>
> Key: HDFS-1452
> URL: https://issues.apache.org/jira/browse/HDFS-1452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: contrib/hdfsproxy
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Konstantin Boudnik
> Fix For: 0.22.0
>
> Attachments: hdfs-1452.patch, hdfs-1452.patch
>
>
> ant compile-contrib is broken, looks like commit 
> a0a62d971fb35de7f021ecbd6ceb8d08ef923ed5 HDFS-1444

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1448) Create multi-format parser for edits logs file, support binary and XML formats initially

2010-10-13 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1448:
--

Attachment: Viewer hierarchy.pdf


Code review:
* General: 
** All classes should be categorized with audience and stability
** No need for all the brackets in messages.  Breaks with what passes for our 
style.
** Do we need to write to disk for tests? Just write to output stream
* editsStored.xml
** Indent/format to make more human readable
* TestOfflineImageViewer.java
** Convert getBuildDir and getCacheDir to fields, rather than re-evaluating the 
method each call
** Would it be better to split the single, large test into four smaller tests, 
with more descriptive names?
** The commented-out code should be removed.  If it's useful for manual 
testing, it can be included in a static main in the test
** Style: runOev method calls don't follow code convention, they can be all on 
one line
** The methods runOevXmlToBinary/runOevBinaryToXml can be refactored to remove 
common code, which is most of it.
** There is no need for a separate printToScreen variable in those methods
** fileEqualIgnoreTrailingZeroes: Since largeFilename is just aliased to 
filename1, there is no need for filename1. Just use that name as the method 
parameter.
** loadFile(): I'm surprised we don't have a utility method in the test package 
to do this. It's a general operation and this method may be better located 
there.
** A larger problem is that this test doesn't use asserts to verify 
correctness, which will make working with it difficult.  The exceptions should 
be converted to fully described JUnit asserts.
* OfflineEditsViewerHelper.java
** Class needs Javadoc
** Is it necessary to copy the edits file? Instead, can we just leave it in 
place and test it there? A better option, though I don't believe supported by 
MiniDFSCluster, would be if we could just write the edits to a memory buffer 
and avoid the disk altogether.
** Commented out code: fc = FileContext.getFileContext(cluster.getURI(), 
config);
* Tokenizer.java
** Tokenizer works specifically with EditsElements, so it may be good to give 
it a more specific name. Same comment for Token.
** I'm torn on the individual Token* classes.  I'd rather there were a way 
of directly integrating them into edits, but that's a bridge too far for this 
patch.  Scala case classes would be quite helpful here...
** Several references to the static encode/decodeBase64 methods via an 
instance variable
* EditsLoaderCurrent.java
** The duplicated edits enums should be refactored into a shared class rather 
than duplicated. 
** Style: case OP_CLOSE doesn't need to be surrounded by braces; neither do 
several other cases.
** The more involved cases should be refactored into separate classes to aid 
readability. This may be reasonable for all the cases to be consistent.
** OP_UPDATE_MASTER_KEY: this seems to be the only place we check for the 
possibility of working with an unsupported version. Is there a reason for this?
** The pattern {noformat}v.visit(t.read(new 
Tokenizer.Token{Whatever}(EditsElement.LENGTH)));{noformat} is repeated quite a 
lot.  Can this be refactored into a helper method to aid readability? (A 
sketch follows at the end of this review.)
** By doing a static import of the various Tokenizer classes (which can be made 
static) such as: {noformat}import static 
org.apache.hadoop.hdfs.tools.offlineEditsViewer.Tokenizer.TokenInt;{noformat} 
you can avoid the extra reference to Tokenizer in the visit calls.
** I'm not sure that the statistics functionality adds any value to this class. 
 It may be better to create a separate statistics viewer that provides this 
information. 
** Several unnecessary imports
* EditsVisitor.java
** The DepthCounter duplicates the same class in the oiv.  May as well create a 
common utility class and share it.
** Commented out code: {noformat} // abstract void visit(EditsElement element, 
String value) throws IOException;  {noformat}
** Unnecessary import of DeprecatedUTF8
* EditsVisitorXml.java
** Consistent naming with oiv would be XmlEditsVisitor
** I believe this class is quite ripe for a shared generic implementation with 
oiv's Xml viewer. This is discussed more below.
** unnecessary import of Base64 class
* OfflineEditsViewer.java
** Typo:  This class implements and offline edits viewer, tool that  (and -> an)
** No need to mention note about OfflineImageViewer.
** The command line parsing and options shares quite a bit of code with the oiv 
and may be easy to merge.
* EditsVisitorBinary.java
** The printToScreen option is ignored and doesn't make sense for this viewer.  
It may be fine to keep the option, but we should probably add documentation 
about it being ignored by some visitors
** No need for commented-out debugging code
* Tokenizers.java
** Since the class is a factory, perhaps TokenizerFactory is a better name?
** The file type determination can be simplified by checking for 
.endsWith(".xml") 
** Typo:
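To illustrate the helper-method suggestion above, a minimal sketch; the 
EditsVisitor, Tokenizer, and EditsElement names come from this review, but the 
helper itself is an assumption, not part of the patch:
{noformat}// Hypothetical helper collapsing the repeated visit pattern:
private void visitInt(EditsVisitor v, Tokenizer t, EditsElement e)
    throws IOException {
  v.visit(t.read(new Tokenizer.TokenInt(e)));
}
// Each occurrence of
//   v.visit(t.read(new Tokenizer.TokenInt(EditsElement.LENGTH)));
// then becomes
//   visitInt(v, t, EditsElement.LENGTH);{noformat}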

[jira] Commented: (HDFS-1455) Record DFS client/cli id with username/kerbros session token in audit log or hdfs client trace log

2010-10-13 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920759#action_12920759
 ] 

Jakob Homan commented on HDFS-1455:
---

So the point of this would be to build an offline MR tool to get a better 
picture of individuals' hdfs usage? Including the info in the log would not 
make it available for real-time analysis.

> Record DFS client/cli id with username/kerbros session token in audit log or 
> hdfs client trace log
> --
>
> Key: HDFS-1455
> URL: https://issues.apache.org/jira/browse/HDFS-1455
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Eric Yang
>
> HDFS usage is commonly calculated by running dfs -dus and grouping 
> directory usage by user at a fixed interval.  This approach does not show 
> accurate HDFS usage if a lot of read/write activity on an equivalent amount 
> of data happens within that interval.  In order to identify usage of such a 
> pattern, usage could instead be measured from the bytes read and bytes 
> written in the hdfs client trace log.  There is currently no association of 
> the DFSClient ID or CLI ID with the user or session token in the Hadoop hdfs 
> client trace log files.  This JIRA is to record the DFS Client ID/CLI ID with 
> the user name/session token in an appropriate place for more precise 
> measurement of HDFS usage.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1448) Create multi-format parser for edits logs file, support binary and XML formats initially

2010-10-13 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1448:
--

Priority: Major  (was: Minor)

> Create multi-format parser for edits logs file, support binary and XML 
> formats initially
> 
>
> Key: HDFS-1448
> URL: https://issues.apache.org/jira/browse/HDFS-1448
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: tools
>Affects Versions: 0.22.0
>Reporter: Erik Steffl
> Fix For: 0.22.0
>
> Attachments: editsStored, HDFS-1448-0.22.patch, Viewer hierarchy.pdf
>
>
> Create multi-format parser for edits logs file, support binary and XML 
> formats initially.
> Parsing should work from any supported format to any other supported format 
> (e.g. from binary to XML and from XML to binary).
> The binary format is the format used by FSEditLog class to read/write edits 
> file.
> The primary reason to develop this tool is to help with troubleshooting; the 
> binary format is hard to read and edit (for human troubleshooters).
> Longer term it could be used to clean up and minimize parsers for fsimage and 
> edits files. The edits parser, OfflineEditsViewer, is written in a very 
> similar fashion to OfflineImageViewer. The next step would be to merge 
> OfflineImageViewer and OfflineEditsViewer and use the result in both FSImage 
> and FSEditLog. This 
> is subject to change, specifically depending on adoption of avro (which would 
> completely change how objects are serialized as well as provide ways to 
> convert files to different formats).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1456) Provide builder for constructing instances of MiniDFSCluster

2010-10-13 Thread Jakob Homan (JIRA)
Provide builder for constructing instances of MiniDFSCluster


 Key: HDFS-1456
 URL: https://issues.apache.org/jira/browse/HDFS-1456
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Affects Versions: 0.22.0
Reporter: Jakob Homan
Assignee: Jakob Homan
 Fix For: 0.22.0


Time to fix a broken window. Of the 293 occurrences of "new MiniDFSCluster("... 
most look something like:
{noformat}cluster = new MiniDFSCluster(0, config, numDatanodes, true, false, 
true,  null, null, null, null);{noformat}
The largest constructor takes 10 parameters, and even the overloaded 
constructors can be difficult to read as their arguments are mainly nulls or 
booleans.

We should provide a Builder for constructing MiniDFSClusters to improve 
readability.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1456) Provide builder for constructing instances of MiniDFSCluster

2010-10-13 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1456:
--

Attachment: HDFS-1456.patch

Patch creates a new Builder class for constructing MiniDFSClusters.  What 
before was:
{noformat}cluster = new MiniDFSCluster(0, conf, NUM_DATA_NODES, true, 
false, true, null,
  null, null, null);{noformat}
can now be expressed as:
{noformat}cluster = new MiniDFSCluster.Builder(conf)
.numDataNodes(NUM_DATA_NODES)
.manageNameDfsDirs(false).build();{noformat}

I've converted a few instances to the new Builder.  If people like this idea, 
I'll convert the rest, mainly through automation to avoid human error, but I 
wanted an easy-to-read patch before one filled with auto-refactoring.  We can 
deprecate the MiniDFSCluster constructors as well.
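For context, a minimal sketch of what such a builder can look like; the fields 
and defaults shown are assumptions, not the attached patch:
{noformat}// Hypothetical skeleton of the Builder, nested in MiniDFSCluster.  Only
// two of the ten constructor parameters are shown; the rest follow the
// same shape.
public static class Builder {
  private final Configuration conf;
  private int numDataNodes = 1;              // assumed default
  private boolean manageNameDfsDirs = true;  // assumed default

  public Builder(Configuration conf) { this.conf = conf; }

  public Builder numDataNodes(int n) { numDataNodes = n; return this; }

  public Builder manageNameDfsDirs(boolean b) {
    manageNameDfsDirs = b; return this;
  }

  public MiniDFSCluster build() throws IOException {
    // Delegates to the old ten-parameter constructor, so behavior is
    // unchanged; only the call sites become readable.
    return new MiniDFSCluster(0, conf, numDataNodes, true,
        manageNameDfsDirs, true, null, null, null, null);
  }
}{noformat}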

> Provide builder for constructing instances of MiniDFSCluster
> 
>
> Key: HDFS-1456
> URL: https://issues.apache.org/jira/browse/HDFS-1456
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: HDFS-1456.patch
>
>
> Time to fix a broken window. Of the 293 occurrences of "new 
> MiniDFSCluster("... most look something like:
> {noformat}cluster = new MiniDFSCluster(0, config, numDatanodes, true, false, 
> true,  null, null, null, null);{noformat}
> The largest constructor takes 10 parameters, and even the overloaded 
> constructors can be difficult to read as their arguments are mainly nulls or 
> booleans.
> We should provide a Builder for constructing MiniDFSClusters to improve 
> readability.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1456) Provide builder for constructing instances of MiniDFSCluster

2010-10-17 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1456:
--

Attachment: HDFS-1456-2.patch

Here's a patch with a slightly refined builder and all the calls to the 
MiniDFSCluster constructors replaced with calls to the builder.  Most of the 
refactoring was automatic since most of the calls to MiniDFSCluster follow a 
standard pattern.  I noted that quite a large percentage of the calls weren't 
using the most efficient constructor available.  This is a lot of code churn, 
but as they stand, the MiniDFSCluster constructors are so hideous that I think 
it's worth the change to get rid of them.
All the tests pass, although TestBlockRecovery, TestBlockTokenWithDFS and 
TestPipelines are all pretty flaky both on OSX and Ubuntu.
Test-patch:
{noformat} [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 424 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 system tests framework.  The patch passed system tests 
framework compile.
{noformat}

> Provide builder for constructing instances of MiniDFSCluster
> 
>
> Key: HDFS-1456
> URL: https://issues.apache.org/jira/browse/HDFS-1456
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: HDFS-1456-2.patch, HDFS-1456.patch
>
>
> Time to fix a broken window. Of the 293 occurrences of "new 
> MiniDFSCluster("... most look something like:
> {noformat}cluster = new MiniDFSCluster(0, config, numDatanodes, true, false, 
> true,  null, null, null, null);{noformat}
> The largest constructor takes 10 parameters, and even the overloaded 
> constructors can be difficult to read, as they're mainly nulls or booleans.
> We should provide a Builder for constructing MiniDFSClusters to improve 
> readability.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-1456) Provide builder for constructing instances of MiniDFSCluster

2010-10-18 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HDFS-1456.
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]

The commented-out calls were in the original code and got converted during the 
autorefactor.  To minimize change - heh - I left them in.  Thanks for the 
review.  I've committed this.  Resolving as fixed.

> Provide builder for constructing instances of MiniDFSCluster
> 
>
> Key: HDFS-1456
> URL: https://issues.apache.org/jira/browse/HDFS-1456
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: HDFS-1456-2.patch, HDFS-1456.patch
>
>
> Time to fix a broken window. Of the 293 occurrences of "new 
> MiniDFSCluster("... most look something like:
> {noformat}cluster = new MiniDFSCluster(0, config, numDatanodes, true, false, 
> true,  null, null, null, null);{noformat}
> The largest constructor takes 10 parameters, and even the overloaded 
> constructors can be difficult to read, as they're mainly nulls or booleans.
> We should provide a Builder for constructing MiniDFSClusters to improve 
> readability.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1464) Fix reporting of 2NN address when dfs.secondary.http.address is default (wildcard)

2010-10-19 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922772#action_12922772
 ] 

Jakob Homan commented on HDFS-1464:
---

HDFS-1080 discussed that the change was backwards compatible and aimed at 
making more of the configs explicit rather than implicit, since implicitness 
can lead to confusion (and led to this bug).  Specifically, in the situation 
that caused HDFS-1080, where the SNN was IP-aliased and the config value not 
specified, this patch would leave us in the same state as before 1080: using 
the default value, which is incorrect under security, and unable to transfer 
the merged image.  For this reason, I am reluctant to let this go in its 
current form.  This distinction only makes a difference under security; maybe 
that should be considered.
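
For concreteness, the check Todd describes below might look roughly like this 
(a sketch assuming the 2NN reads its address from dfs.secondary.http.address 
with the usual 0.0.0.0:50090 default; this is not the attached patch):
{noformat}import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

// If the configured address is the wildcard (0.0.0.0), report the local
// hostname to the NN; an explicit config value wins otherwise.
static String inferReportedHostname(Configuration conf)
    throws UnknownHostException {
  InetSocketAddress addr = NetUtils.createSocketAddr(
      conf.get("dfs.secondary.http.address", "0.0.0.0:50090"));
  if (addr.getAddress() != null && addr.getAddress().isAnyLocalAddress()) {
    return InetAddress.getLocalHost().getHostName();  // wildcard bind
  }
  return addr.getHostName();                          // explicit config
}{noformat}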

> Fix reporting of 2NN address when dfs.secondary.http.address is default 
> (wildcard)
> --
>
> Key: HDFS-1464
> URL: https://issues.apache.org/jira/browse/HDFS-1464
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-1464.txt
>
>
> HDFS-1080 broke the way that the 2NN identifies its own hostname to the NN 
> during checkpoint upload. It used to use the local hostname, which as 
> HDFS-1080 pointed out was error prone if it had multiple interfaces, etc. But 
> now, with the default setting of dfs.secondary.http.address, the 2NN reports 
> "0.0.0.0", which won't work either.
> We should look for the wildcard bind address and use the local hostname in 
> that case, like we used to.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HDFS-1465) Eliminate FS image loading code duplication between OIV and FSImage

2010-10-20 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HDFS-1465:
-

Assignee: Jakob Homan

> Eliminate FS image loading code duplication between OIV and FSImage
> ---
>
> Key: HDFS-1465
> URL: https://issues.apache.org/jira/browse/HDFS-1465
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
>Reporter: Aaron T. Myers
>Assignee: Jakob Homan
>
> Konstantin wrote in HADOOP-5467:
> {quote}
> Ideally we should have the same source code reading the fsimage file and then 
> using different visitors to process deserialized data. I think we can achieve 
> that goal by implementing a LoadFSImageVisitor, which will call FSNamesystem 
> methods to add inodes to the directory tree and so on, making it a 
> replacement to FSImage.loadFSImage().
> The LoadFSImageVisitor can be passed to FSImageProcessor same as other 
> visitors Jakob implemented.
> We can do it in a separate Jira, but it should be done before the next 
> release so that we have uniform deserialization in the release.
> This approach will probably also require moving the FSImageProcessor code 
> inside the server.namenode package. The OfflineImageViewer itself should 
> remain in tools.
> {quote}
> This work never got done, and it caused a problem in HDFS-1435. This JIRA is 
> to track that work.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1468) TestBlockReport fails on trunk

2010-10-20 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923140#action_12923140
 ] 

Jakob Homan commented on HDFS-1468:
---

This failure is not consistent.  From just now:
{noformat}[junit] Running 
org.apache.hadoop.hdfs.server.datanode.TestBlockReport
[junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 67.186 
sec{noformat}

> TestBlockReport fails on trunk
> --
>
> Key: HDFS-1468
> URL: https://issues.apache.org/jira/browse/HDFS-1468
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, name-node
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Priority: Critical
> Attachments: failed-TestBlockReport.txt.gz
>
>
> TestBlockReport appears to be failing on trunk:
> Testcase: blockReport_08 took 4.68 sec
>   FAILED
> Wrong number of PendingReplication blocks expected:<2> but was:<1>
> junit.framework.AssertionFailedError: Wrong number of PendingReplication 
> blocks expected:<2> but was:<1>
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestBlockReport.blockReport_08(TestBlockReport.java:414)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1456) Provide builder for constructing instances of MiniDFSCluster

2010-10-20 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1456:
--

Attachment: HDFS-1456-for-1052-branch.patch

Patch merged with the 1052 branch to be committed there.  Same as trunk, just 
some lines didn't merge cleanly.

> Provide builder for constructing instances of MiniDFSCluster
> 
>
> Key: HDFS-1456
> URL: https://issues.apache.org/jira/browse/HDFS-1456
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: HDFS-1456-2.patch, HDFS-1456-for-1052-branch.patch, 
> HDFS-1456.patch
>
>
> Time to fix a broken window. Of the 293 occurrences of "new 
> MiniDFSCluster("... most look something like:
> {noformat}cluster = new MiniDFSCluster(0, config, numDatanodes, true, false, 
> true,  null, null, null, null);{noformat}
> The largest constructor takes 10 parameters, and even the overloaded 
> constructors can be difficult to read, as they're mainly nulls or booleans.
> We should provide a Builder for constructing MiniDFSClusters to improve 
> readability.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-10-20 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923261#action_12923261
 ] 

Jakob Homan commented on HDFS-1071:
---

bq. implementing this with one thread traversing the namespace tree and other 
threads writing to the disk is more relevant now.
This seems like a good way to go forward.  It would be good to get a patch that 
implements this approach.

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-209) Provide tool to view/change edits file

2010-10-20 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HDFS-209.
--

Resolution: Duplicate

Progress is being made on this feature in HDFS-1448.  Closing this one.

> Provide tool to view/change edits file
> --
>
> Key: HDFS-209
> URL: https://issues.apache.org/jira/browse/HDFS-209
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Lohit Vijayarenu
>
> At present, if the edits file is corrupt the namenode fails to start, 
> throwing EOFException, as seen in HADOOP-820. One way out in such cases was 
> to remove the offending entry from edits, but that is not straightforward as 
> edits is a binary file. So, the workaround was to ignore the EOFException and 
> continue with namenode startup. It would be good to have a tool to view edits 
> entries or even modify it to a recoverable state. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-312) FSEditLog dump/restore tool

2010-10-20 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan resolved HDFS-312.
--

Resolution: Duplicate

Progress is being made on this feature in HDFS-1448.  Closing this one.

> FSEditLog dump/restore tool
> ---
>
> Key: HDFS-312
> URL: https://issues.apache.org/jira/browse/HDFS-312
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Andrzej Bialecki 
>Priority: Minor
> Attachments: FSEditLogTool.java
>
>
> This tool is useful if your DFS has problems due to a corrupted edit log. You 
> can convert the binary "edits" file into a well-formatted text file, make any 
> corrections to edit commands that you think are likely to fix the problem, 
> and then convert the text file back to a binary "edits" file.
> NOTE: obviously if you are not careful you can damage your DFS beyond any 
> hope of repair.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-10-25 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924775#action_12924775
 ] 

Jakob Homan commented on HDFS-1071:
---

bq. Could you please verify this. If the images are the same I'm fine with the 
implementation.
In the patch, {{FSNameSystem::saveNamespace()}} acquires the write lock 
before calling {{FSImage::saveNamespace(renewCheckpointTime)}}.  The writing is 
done in parallel and each of the writer threads is joined (in 
{{waitForThreads}}) before the method returns and the write lock is 
surrendered.  So this should be safe.
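
In other words, the save has roughly this shape (an illustrative sketch of the 
locking just described; {{saveFSImage}} and {{storageDirs}} are hypothetical 
names, and this is not the patch's actual code):
{noformat}writeLock();                                 // no edits while saving
try {
  List<Thread> writers = new ArrayList<Thread>();
  for (final StorageDirectory sd : storageDirs) {
    Thread t = new Thread(new Runnable() {
      public void run() {
        saveFSImage(sd);                     // hypothetical per-dir writer
      }
    });
    writers.add(t);
    t.start();
  }
  for (Thread t : writers) {                 // waitForThreads, in effect
    try {
      t.join();
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
  }
} finally {
  writeUnlock();                             // held until all writers finish
}{noformat}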

There are other calls to {{saveNamespace}} that should be considered, though.   
{{FSImage::saveNamespace(renewCheckpointTime)}} is called from several other 
locations: In {{FSDirectory::loadFSImage}}, which is called by FSNameSystem's 
constructors, by {{BackupStorage::saveCheckpoint()}}, by 
{{CheckpointStorage::doMerge()}}, and by {{FSImage::doImportCheckpoint}}.  
Assuming no new operations are coming in, which they shouldn't be, the 
checkpoint and backupnode calls are safe.  The others are as well, assuming 
we're in safemode.  Does this sound reasonable?

I believe this addresses Konstantin's concerns.

A couple of nits with the current patch (6):
* Java's Collections documentation is pretty adamant about traversing 
synchronized collections only while holding a lock on the collection 
(http://download.oracle.com/javase/6/docs/api/java/util/Collections.html#synchronizedList(java.util.List)),
 which the patch doesn't currently do in {{processIOErrors}} for the {{sds}} 
parameter.  This isn't strictly necessary at the moment, as only one thread is 
guaranteed to be iterating, but it may be better to synchronize now to avoid 
problems in the future; see the sketch after this list.
* The MiniDFSCluster constructors have been deprecated since this patch was 
generated.  It should be updated to use the new Builder.
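
The guarded traversal the javadoc calls for looks like this (a sketch; 
{{process}} stands in for whatever per-directory work {{processIOErrors}} 
does):
{noformat}List<StorageDirectory> sds =
    Collections.synchronizedList(new ArrayList<StorageDirectory>());

// Iteration must hold the list's own monitor, or a concurrent structural
// change can throw ConcurrentModificationException mid-traversal.
synchronized (sds) {
  for (StorageDirectory sd : sds) {
    process(sd);
  }
}{noformat}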
 

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-10-25 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924800#action_12924800
 ] 

Jakob Homan commented on HDFS-1071:
---

Also, in TestParallelImageWrite.java, there is an unused import (URI) and an 
unused local variable (imageIndex) that should be cleaned up.

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1448) Create multi-format parser for edits logs file, support binary and XML formats initially

2010-11-11 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931182#action_12931182
 ] 

Jakob Homan commented on HDFS-1448:
---

Patch review for HDFS-1448-0.22-2.patch.

* BinaryTokenizer.java
** Providing a constructor that takes a stream rather than a String could aid 
in testing; my goal is for all testing to be doable via streams without going 
to the file system (first sketch after this review).
* EditsLoader.java
** printStatistics - does this method add value that couldn't otherwise be 
achieved as a separate viewer? I'm still not sold on providing this information 
for every run.  The vast majority of oev instances won't be interested in it, 
but will still have to pay the penalty of compiling the statistics. Conversely, 
the information could be of interest specifically (i.e., tell me about this 
edits log), and then the user will have to run some other viewer just to get 
it.  This same information can be gathered by a separate visitor, as mentioned 
in the first review.
* EditsLoaderCurrent.java
** Comments on 157-160 should be moved one line down so they apply to the check 
they're describing.
** The prior review had asked for the various switch statements to be moved 
into separate functions to aid in readability, testing and code maintainability.  
Does the new code, with its individual functors and the extra code necessary to 
implement this scheme provide any functionality not given by the original 
suggestion?  If not, it seems to be a large amount of extra code and wiring 
without any benefit.  The goal of the comment in the original review was to 
reduce complexity and improve readability, which I'm not sure this new approach 
accomplishes.
* EditsOpCode.java
** The content of this file duplicates the constants created in FSEditLog.java. 
 While it would be best to avoid all duplication, that may not be possible in 
this patch, as discussed above.  However, to minimize duplication, I suggest 
refactoring out the constants from FSEditsLog into a separate class and 
referring to those constants in the enum definition.  Bonus points would be to 
have the Enum in the same file as the constants.
* EditsVisitor.java
** What's the use case for the getTokenizer() method? It doesn't seem to be 
called anywhere.  Unused methods should be removed.
* EditsVisitorFactory.java
** Do the three lines of regular-expression parsing used to determine whether a 
file ends in .xml provide any benefit over simply using 
filename.endsWith(".xml"), as was proposed in the original review?  If not, we 
should prefer the shorter, simpler code (second sketch after this review).
** It may be good to support .XML, .Xml, etc., and therefore call toLowerCase 
on the string before checking for the file extension.
** Typo: "different implementatios"
* OfflineEditsViewer.java
** The only caller of the public setEditsLoader() in OfflineEditsViewer is 
go().  As such, it should be made private.
* Tokenizer.java
** Spacing between fields and methods does not follow our coding standard.
** Consensus on naming convention for tokens of Foo appears to be FooToken, 
rather than TokenFoo (see: 
[http://www.google.com/codesearch?hl=en&sa=N&q=file:^.*Token*.java]), as well 
as our own 'Token'y classes.  We should follow that here. 
* TokenizerFactory.java
** Same comments as for EditsVisitorFactory.java
* XmlTokenizer.java
** A quick survey of our exception handling shows that it is preferable to nest 
exceptions rather than taking the message from one, swallowing it, and throwing 
a new exception: http://dl.dropbox.com/u/565949/exceptions.txt  We should do 
the same here (third sketch after this review).
** Do we need to handle all the empty cases in the switch? At the very least, 
there are a lot of empty comments that should be removed.
** {{public Token read(Token t) throws IOException}} - this method returns the 
same Token that it accepts, which is a bit of a code smell.  I wonder if there's 
a way to avoid the confusion of mutating the parameter and then returning it, 
or, if not, to explicitly document this behavior.
* OfflineEditsViewerHelper.java
** As an aside: Whatever form elements of the edits file eventually take, it 
would be nice if they were self-testing and could provide this information 
automatically, rather than needing to call each one here, decoupled from the 
implementation.
** As noted in the first review, it is unnecessary and inefficient to shell out 
to copy the edits file.  This file is used as a source for the test; you can 
find where it is (as is done in order to accomplish the copying) and explicitly 
clean it up after the test has completed.
I suggest refactoring
  {noformat}public void generateEdits(String dfsDir, String 
editsFilename){noformat}
to 
  {noformat}public String generateEdits(String dfsDir) // return path to 
edits{noformat}
and providing a shutdown method on OfflineEditsViewer that cleans up the 
cluster when the unit test has completed.
** 137: No need for IOException 
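
First sketch, for the BinaryTokenizer point above - the kind of stream-based 
overload meant (both constructor shapes are assumptions, not the patch's 
actual signatures):
{noformat}import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class BinaryTokenizer {
  private final DataInputStream in;

  // Assumed existing form: open the named file.
  public BinaryTokenizer(String filename) throws IOException {
    this(new DataInputStream(
        new BufferedInputStream(new FileInputStream(filename))));
  }

  // Suggested overload: tests can feed a ByteArrayInputStream wrapped in a
  // DataInputStream, with no file system involved.
  public BinaryTokenizer(DataInputStream in) {
    this.in = in;
  }
}{noformat}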
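
Second sketch, for the EditsVisitorFactory point - the simpler extension check 
(the visitor class names here are hypothetical):
{noformat}// Case-insensitive extension check in place of regular-expression
// parsing; toLowerCase() makes it accept .XML, .Xml, etc.
static EditsVisitor getVisitor(String filename) throws IOException {
  if (filename.toLowerCase().endsWith(".xml")) {
    return new XmlEditsVisitor(filename);     // hypothetical XML visitor
  }
  return new BinaryEditsVisitor(filename);    // hypothetical binary visitor
}{noformat}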
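
Third sketch, for the XmlTokenizer exception-handling point - nest the 
original exception as the cause (the parse step and exception type are 
assumptions about what the tokenizer catches):
{noformat}try {
  parseNextToken();                // hypothetical step that can fail
} catch (XMLStreamException e) {
  // Nest the original as the cause rather than copying its message into a
  // fresh exception and losing the stack trace.
  throw new IOException("error reading XML token", e);
}{noformat}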

[jira] Updated: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-11-11 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1071:
--

Hadoop Flags: [Reviewed]

Thanks for the updates.  +1 on latest patch.  Barring any more objections and 
pending Hudson, I'll commit this tomorrow.

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.7.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-11-11 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1071:
--

Status: Open  (was: Patch Available)

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.7.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-11-11 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1071:
--

Status: Patch Available  (was: Open)

Re-triggering the mythical Hudson... 

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.7.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-11-11 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931298#action_12931298
 ] 

Jakob Homan commented on HDFS-1071:
---

Looks like Hudson didn't make it all the way through due to an unrelated error: 
https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/3/console  Dmytro, 
can you post test and test-patch results and then I'll commit.  Thanks.

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.7.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-811) Add metrics, failure reporting and additional tests for HDFS-457

2010-11-13 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931742#action_12931742
 ] 

Jakob Homan commented on HDFS-811:
--

bq. Forgot to mention I don't see any javac warnings with the patch either. 
The test-patch javac warning flag is probably from the call to the deprecated 
MiniDFSCluster constructor on line 411 of the patch...

> Add metrics, failure reporting and additional tests for HDFS-457
> 
>
> Key: HDFS-811
> URL: https://issues.apache.org/jira/browse/HDFS-811
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Ravi Phulari
>Assignee: Eli Collins
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: hdfs-811-1.patch, hdfs-811-2.patch, hdfs-811-3.patch, 
> hdfs-811-4.patch, hdfs-811-5.patch
>
>
>  HDFS-457 introduced an improvement which allows a datanode to continue 
> running if a volume for replica storage fails. Previously a datanode shut 
> down if any volume failed. 
> Description of HDFS-457
> {quote}
> Current implementation shuts DataNode down completely when one of the 
> configured volumes of the storage fails.
> This is rather wasteful behavior because it decreases utilization (good 
> storage becomes unavailable) and imposes extra load on the system 
> (replication of the blocks from the good volumes). These problems will become 
> even more prominent when we move to mixed (heterogeneous) clusters with many 
> more volumes per Data Node.
> {quote}
> I suggest the following additional tests for this improvement. 
> #1 Test successive volume failures (minimum 4 volumes)
> #2 Test that each volume failure reports a reduction in available DFS space 
> and remaining space.
> #3 Test that failure of all volumes on a data node leads to data node 
> failure.
> #4 Test that correcting a failed storage disk updates and increments 
> available DFS space. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-696) Java assertion failures triggered by tests

2010-11-13 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931767#action_12931767
 ] 

Jakob Homan commented on HDFS-696:
--

It's after the fact, but +1.  Eli, please update the components, versions, 
assignee, etc.

> Java assertion failures triggered by tests
> --
>
> Key: HDFS-696
> URL: https://issues.apache.org/jira/browse/HDFS-696
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Eli Collins
> Attachments: hadoop6309-hdfs.patch
>
>
> Re-purposing as catch-all ticket for assertion failures when running tests 
> with java asserts enabled. Running with the attached patch on tr...@823732 
> the following tests all trigger assertion failures:
>  
> TestAccessTokenWithDFS
> TestInterDatanodeProtocol
> TestBackupNode 
> TestBlockUnderConstruction
> TestCheckpoint  
> TestNameEditsConfigs
> TestStartup
> TestStorageRestore

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1187) Modify fetchdt to allow renewing and canceling token

2010-11-15 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1187:
--

Hadoop Flags: [Reviewed]

+1

> Modify fetchdt to allow renewing and canceling token
> 
>
> Key: HDFS-1187
> URL: https://issues.apache.org/jira/browse/HDFS-1187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: fetchdt.patch, h1187-14.patch, h1187-15.patch
>
>
> I would like to extend fetchdt to allow renewing and canceling tokens.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1187) Modify fetchdt to allow renewing and canceling token

2010-11-15 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1187:
--

Component/s: (was: tools)
 security

> Modify fetchdt to allow renewing and canceling token
> 
>
> Key: HDFS-1187
> URL: https://issues.apache.org/jira/browse/HDFS-1187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.22.0
>
> Attachments: fetchdt.patch, h1187-14.patch, h1187-15.patch
>
>
> I would like to extend fetchdt to allow renewing and canceling tokens.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1187) Modify fetchdt to allow renewing and canceling token

2010-11-15 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1187:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
   Status: Resolved  (was: Patch Available)

I've committed this.  Resolving as fixed.

> Modify fetchdt to allow renewing and canceling token
> 
>
> Key: HDFS-1187
> URL: https://issues.apache.org/jira/browse/HDFS-1187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.22.0
>
> Attachments: fetchdt.patch, h1187-14.patch, h1187-15.patch
>
>
> I would like to extend fetchdt to allow renewing and canceling tokens.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1500) TestOfflineImageViewer failing on trunk

2010-11-15 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932307#action_12932307
 ] 

Jakob Homan commented on HDFS-1500:
---

Why was the one-line fix needed to correct the test failure not the entirety of 
this patch?  The extra changes made to the tests were out of scope for this 
JIRA, which was opened and resolved within an hour.  By moving the fail 
statements out of the individual sub-tests, it is now more difficult to see 
which sub-test failed.  Had this been called out as part of this patch, I would 
have raised concerns about the change.

> TestOfflineImageViewer failing on trunk
> ---
>
> Key: HDFS-1500
> URL: https://issues.apache.org/jira/browse/HDFS-1500
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-1500.txt
>
>
> Testcase: testOIV took 22.679 sec
>   FAILED
> Failed reading valid file: No image processor to read version -26 is 
> available.
> junit.framework.AssertionFailedError: Failed reading valid file: No image 
> processor to read version -26 is available.
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.outputOfLSVisitor(TestOfflineImageViewer.java:171)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testOIV(TestOfflineImageViewer.java:86)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1500) TestOfflineImageViewer failing on trunk

2010-11-15 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932310#action_12932310
 ] 

Jakob Homan commented on HDFS-1500:
---

That's reasonable.  Since patches to fix failing tests are generally committed 
quickly, without much time for community input, in the future it will be best 
to save improvements to those tests for subsequent JIRAs, where the community 
can get a chance to see the changes.  This applies even to relatively small 
changes.

> TestOfflineImageViewer failing on trunk
> ---
>
> Key: HDFS-1500
> URL: https://issues.apache.org/jira/browse/HDFS-1500
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-1500.txt
>
>
> Testcase: testOIV took 22.679 sec
>   FAILED
> Failed reading valid file: No image processor to read version -26 is 
> available.
> junit.framework.AssertionFailedError: Failed reading valid file: No image 
> processor to read version -26 is available.
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.outputOfLSVisitor(TestOfflineImageViewer.java:171)
>   at 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer.testOIV(TestOfflineImageViewer.java:86)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-11-16 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932685#action_12932685
 ] 

Jakob Homan commented on HDFS-1071:
---

I'd like to get this one into 22, so I ran the tests myself.  All tests passed 
except the also-failing-on-trunk TestDatanodeBlockScanner, TestBlockRecovery, 
TestStorageRestore, and TestDatanodeDeath.
{noformat}[exec] -1 overall.  
[exec] 
[exec] +1 @author.  The patch does not contain any @author tags.
[exec] 
[exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
[exec] 
[exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
[exec] 
[exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
[exec] 
[exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
[exec] 
[exec] -1 release audit.  The applied patch generated 98 release audit 
warnings (more than the trunk's
current 1 warnings).
[exec] 
[exec] +1 system test framework.  The patch passed system test 
framework compile.{noformat}

I'm going to commit this.

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.7.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-11-16 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-1071:
--

   Resolution: Fixed
Fix Version/s: 0.22.0
   Status: Resolved  (was: Patch Available)

Forgot to mention, the audit issue from test-patch above is a known bug.  I've 
committed this.  Resolving as fixed.  Thanks, Dmytro!

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> 
>
> Key: HDFS-1071
> URL: https://issues.apache.org/jira/browse/HDFS-1071
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Dmytro Molkov
> Fix For: 0.22.0
>
> Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.7.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2010-11-16 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-718:
-

Attachment: HDFS-718-4.patch

Here's an updated patch addressing Boris' comment and using the new 
MiniDFSCluster builder.

Test-patch is fine (with known-bad audit warning)
{noformat} [exec] -1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] -1 release audit.  The applied patch generated 98 release audit 
warnings (more than the trunk's current 1 warnings).
 [exec] 
 [exec] +1 system test framework.  The patch passed system test 
framework compile.{noformat}
I'd like to get this into 22.  Since the change in this code only affected the 
unit test, which has been verified, I'm ready to go ahead and commit it.  How's 
that sound?
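
For anyone following along, locking down an already-formatted production 
cluster per the description below is a one-line config change (a sketch; the 
key name is taken from the issue description):
{noformat}<!-- hdfs-site.xml: after the one-time format, refuse further formats -->
<property>
  <name>dfs.namenode.support.allowformat</name>
  <value>false</value>
</property>{noformat}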

> configuration parameter to prevent accidental formatting of HDFS filesystem
> ---
>
> Key: HDFS-718
> URL: https://issues.apache.org/jira/browse/HDFS-718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
> Environment: Any
>Reporter: Andrew Ryan
>Assignee: Andrew Ryan
>Priority: Minor
> Attachments: HDFS-718-3.patch, HDFS-718-4.patch, 
> HDFS-718.patch-2.txt, HDFS-718.patch.txt
>
>
> Currently, any time the NameNode is not running, an HDFS filesystem will 
> accept the 'format' command, and will duly format itself. There are those of 
> us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
> this behavior. There is "Y/N" confirmation in the format command, but if the 
> formatter genuinely believes themselves to be doing the right thing, the 
> filesystem will be formatted.
> This patch adds a configuration parameter to the namenode, 
> dfs.namenode.support.allowformat, which defaults to "true," the current 
> behavior: always allow formatting if the NameNode is down or some other 
> process is not holding the namenode lock. But if 
> dfs.namenode.support.allowformat is set to "false," the NameNode will not 
> allow itself to be formatted until this config parameter is changed to "true".
> The general idea is that for production HDFS filesystems, the user would 
> format the HDFS once, then set dfs.namenode.support.allowformat to "false" 
> for all time.
> The attached patch was generated against trunk and +1's on my test machine. 
> We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2010-11-16 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-718:
-

Attachment: HDFS-718-5.patch

Great catch, Eli.  I've removed the try-catches around those statements; now 
exceptions will propagate straight up.  I think this is good to go.

> configuration parameter to prevent accidental formatting of HDFS filesystem
> ---
>
> Key: HDFS-718
> URL: https://issues.apache.org/jira/browse/HDFS-718
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.22.0
> Environment: Any
>Reporter: Andrew Ryan
>Assignee: Andrew Ryan
>Priority: Minor
> Attachments: HDFS-718-3.patch, HDFS-718-4.patch, HDFS-718-5.patch, 
> HDFS-718.patch-2.txt, HDFS-718.patch.txt
>
>
> Currently, any time the NameNode is not running, an HDFS filesystem will 
> accept the 'format' command, and will duly format itself. There are those of 
> us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
> this behavior. There is "Y/N" confirmation in the format command, but if the 
> formatter genuinely believes themselves to be doing the right thing, the 
> filesystem will be formatted.
> This patch adds a configuration parameter to the namenode, 
> dfs.namenode.support.allowformat, which defaults to "true," the current 
> behavior: always allow formatting if the NameNode is down or some other 
> process is not holding the namenode lock. But if 
> dfs.namenode.support.allowformat is set to "false," the NameNode will not 
> allow itself to be formatted until this config parameter is changed to "true".
> The general idea is that for production HDFS filesystems, the user would 
> format the HDFS once, then set dfs.namenode.support.allowformat to "false" 
> for all time.
> The attached patch was generated against trunk and +1's on my test machine. 
> We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


