[jira] [Commented] (HDFS-2018) 1073: Move all journal stream management code into one place

2011-08-30 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094299#comment-13094299
 ] 

Jitendra Nath Pandey commented on HDFS-2018:


The latest patch looks good to me.
Minor comment:
 countTransactionsInInprogress in FileJournalManager only does validation; 
can we rename it to something like validateEditLogFile?
 
+1. 

> 1073: Move all journal stream management code into one place
> 
>
> Key: HDFS-2018
> URL: https://issues.apache.org/jira/browse/HDFS-2018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.23.0
>
> Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, hdfs-2018-otherapi.txt, hdfs-2018.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in 
> FileJournalManager and the code for input streams is in the inspectors. This 
> change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deal with URIs when referring to edit logs.
>   - Recovery of inprogress logs is performed by counting the number of 
> transactions instead of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.
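The recovery change described above can be sketched in miniature. The class and method names below are hypothetical (this is not FileJournalManager's actual code); the point is that an in-progress log is validated by counting complete transaction records rather than trusting the file length:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: validate an in-progress edit log by counting
// complete transaction records instead of looking at the file length.
public class EditLogRecovery {

    // Counts complete (opcode, txid) records; a truncated trailing
    // record simply ends the scan, mimicking recovery of an
    // inprogress log after a crash mid-write.
    public static int countValidTransactions(DataInputStream in) {
        int count = 0;
        try {
            while (true) {
                in.readByte();  // opcode
                in.readLong();  // transaction id
                count++;
            }
        } catch (IOException e) {
            // EOFException (or any read error) ends the scan.
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        for (int txid = 1; txid <= 3; txid++) {
            out.writeByte(0);     // opcode
            out.writeLong(txid);  // txid
        }
        out.writeByte(0);         // a truncated fourth record
        System.out.println(countValidTransactions(
            new DataInputStream(new ByteArrayInputStream(buf.toByteArray()))));
        // prints 3: the partial fourth record is not counted
    }
}
```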

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1330) Make RPCs to DataNodes timeout

2011-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094254#comment-13094254
 ] 

Hudson commented on HDFS-1330:
--

Integrated in Hadoop-Hdfs-trunk #777 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/777/])
HDFS-1330 and HADOOP-6889. Added additional unit tests. Contributed by John 
George.

mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1163463
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java


> Make RPCs to DataNodes timeout
> --
>
> Key: HDFS-1330
> URL: https://issues.apache.org/jira/browse/HDFS-1330
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Hairong Kuang
>Assignee: John George
> Fix For: 0.22.0, 0.23.0, 0.24.0
>
> Attachments: HADOOP-6889-fortrunk-2.patch, hdfsRpcTimeout.patch
>
>
> This jira aims to make client/datanode and datanode/datanode RPCs have a 
> timeout of DataNode#socketTimeout.
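The underlying mechanism can be shown with plain java.net sockets rather than Hadoop's RPC classes (a standalone illustration, not the committed patch): a read timeout on the client-side socket makes a call to a silent DataNode fail with SocketTimeoutException instead of blocking forever.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Plain-socket illustration of an RPC read timeout: the "server"
// accepts the connection but never writes, so the client's read
// times out instead of hanging indefinitely.
public class RpcTimeoutDemo {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("localhost", server.getLocalPort())) {
            client.setSoTimeout(200);  // analogous to DataNode#socketTimeout
            server.accept();           // server side: accept, then stay silent
            try {
                client.getInputStream().read();  // blocks until timeout
                System.out.println("got data (unexpected)");
            } catch (SocketTimeoutException e) {
                System.out.println("RPC timed out");
            }
        }
    }
}
```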





[jira] [Commented] (HDFS-2303) jsvc needs to be recompilable

2011-08-30 Thread Roman Shaposhnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094250#comment-13094250
 ] 

Roman Shaposhnik commented on HDFS-2303:


I have a patch, based on Ant, that recompiles jsvc by fetching the 
commons-daemon-1.0.3-native-src tarball. Now that HDFS has transitioned to 
Maven, what would the preference be? We could go the Maven ant-plugin way, or 
perhaps create a Maven subproject hadoop-jsvc (under hadoop-hdfs-project) that 
uses the c-builds Maven plugin to fetch and recompile jsvc.

Please let me know.
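For the first option, a rough sketch of what the maven-antrun-plugin route could look like (the phase, paths, and URL here are illustrative assumptions, not a committed layout):

```xml
<!-- Hypothetical sketch of the maven-antrun-plugin route: fetch the
     commons-daemon native source and build jsvc during the build.
     Phase, paths, and URL are illustrative, not the actual change. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-antrun-plugin</artifactId>
  <executions>
    <execution>
      <id>build-jsvc</id>
      <phase>compile</phase>
      <goals><goal>run</goal></goals>
      <configuration>
        <target>
          <get src="http://archive.apache.org/dist/commons/daemon/source/commons-daemon-1.0.3-native-src.tar.gz"
               dest="${project.build.directory}/jsvc-src.tar.gz"
               skipexisting="true"/>
          <untar src="${project.build.directory}/jsvc-src.tar.gz"
                 dest="${project.build.directory}/jsvc-src"
                 compression="gzip"/>
          <exec executable="sh" failonerror="true"
                dir="${project.build.directory}/jsvc-src/unix">
            <arg line="-c './configure &amp;&amp; make'"/>
          </exec>
        </target>
      </configuration>
    </execution>
  </executions>
</plugin>
```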

> jsvc needs to be recompilable
> -
>
> Key: HDFS-2303
> URL: https://issues.apache.org/jira/browse/HDFS-2303
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Roman Shaposhnik
>Assignee: Roman Shaposhnik
>
> It would be nice to recompile jsvc as part of the native profile. This has a 
> number of benefits, including the ability to re-generate all binary 
> artifacts. Most of all, however, it would provide a way to generate jsvc on 
> Linux distributions that don't have a matching libc.





[jira] [Commented] (HDFS-2289) jsvc isn't part of the artifact

2011-08-30 Thread Roman Shaposhnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094249#comment-13094249
 ] 

Roman Shaposhnik commented on HDFS-2289:


@Arun -- HDFS-2303 opened. Please comment over there.

> jsvc isn't part of the artifact
> ---
>
> Key: HDFS-2289
> URL: https://issues.apache.org/jira/browse/HDFS-2289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Arun C Murthy
>Assignee: Alejandro Abdelnur
>Priority: Blocker
> Fix For: 0.23.0
>
> Attachments: HDFS-2289v1.patch, HDFS-2289v2.patch, HDFS-2289v3.patch, 
> HDFS-2289v4.patch, HDFS-2289v5.patch
>
>
> Apparently we had something like this in build.xml:
>  value="http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-i386.tar.gz";
>  />
> Also, when I manually add in jsvc binary I get this error:
> {noformat}
> 25/08/2011 23:47:18 29805 jsvc.exec error: Cannot find daemon loader 
> org/apache/commons/daemon/support/DaemonLoader
> 25/08/2011 23:47:18 29778 jsvc.exec error: Service exit with a return value 
> of 1
> {noformat}





[jira] [Created] (HDFS-2303) jsvc needs to be recompilable

2011-08-30 Thread Roman Shaposhnik (JIRA)
jsvc needs to be recompilable
-

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik


It would be nice to recompile jsvc as part of the native profile. This has a 
number of benefits, including the ability to re-generate all binary artifacts. 
Most of all, however, it would provide a way to generate jsvc on Linux 
distributions that don't have a matching libc.





[jira] [Updated] (HDFS-1330) Make RPCs to DataNodes timeout

2011-08-30 Thread Matt Foley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated HDFS-1330:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1 for code review. Thanks, John!
Committed to trunk.

Also asked Arun if he wanted this in branch-0.23, he said yes.
Committed to v23.

> Make RPCs to DataNodes timeout
> --
>
> Key: HDFS-1330
> URL: https://issues.apache.org/jira/browse/HDFS-1330
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Hairong Kuang
>Assignee: John George
> Fix For: 0.22.0, 0.23.0, 0.24.0
>
> Attachments: HADOOP-6889-fortrunk-2.patch, hdfsRpcTimeout.patch
>
>
> This jira aims to make client/datanode and datanode/datanode RPCs have a 
> timeout of DataNode#socketTimeout.





[jira] [Commented] (HDFS-1330) Make RPCs to DataNodes timeout

2011-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094227#comment-13094227
 ] 

Hudson commented on HDFS-1330:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #821 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/821/])
HDFS-1330 and HADOOP-6889. Added additional unit tests. Contributed by John 
George.

mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1163463
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java


> Make RPCs to DataNodes timeout
> --
>
> Key: HDFS-1330
> URL: https://issues.apache.org/jira/browse/HDFS-1330
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Hairong Kuang
>Assignee: John George
> Fix For: 0.22.0, 0.23.0, 0.24.0
>
> Attachments: HADOOP-6889-fortrunk-2.patch, hdfsRpcTimeout.patch
>
>
> This jira aims to make client/datanode and datanode/datanode RPCs have a 
> timeout of DataNode#socketTimeout.





[jira] [Commented] (HDFS-1330) Make RPCs to DataNodes timeout

2011-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094225#comment-13094225
 ] 

Hudson commented on HDFS-1330:
--

Integrated in Hadoop-Hdfs-trunk-Commit #888 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/888/])
HDFS-1330 and HADOOP-6889. Added additional unit tests. Contributed by John 
George.

mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1163463
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java


> Make RPCs to DataNodes timeout
> --
>
> Key: HDFS-1330
> URL: https://issues.apache.org/jira/browse/HDFS-1330
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Hairong Kuang
>Assignee: John George
> Fix For: 0.22.0, 0.23.0, 0.24.0
>
> Attachments: HADOOP-6889-fortrunk-2.patch, hdfsRpcTimeout.patch
>
>
> This jira aims to make client/datanode and datanode/datanode RPCs have a 
> timeout of DataNode#socketTimeout.





[jira] [Commented] (HDFS-1330) Make RPCs to DataNodes timeout

2011-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094222#comment-13094222
 ] 

Hudson commented on HDFS-1330:
--

Integrated in Hadoop-Common-trunk-Commit #811 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/811/])
HDFS-1330 and HADOOP-6889. Added additional unit tests. Contributed by John 
George.

mattf : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1163463
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestInterDatanodeProtocol.java


> Make RPCs to DataNodes timeout
> --
>
> Key: HDFS-1330
> URL: https://issues.apache.org/jira/browse/HDFS-1330
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Hairong Kuang
>Assignee: John George
> Fix For: 0.22.0, 0.23.0, 0.24.0
>
> Attachments: HADOOP-6889-fortrunk-2.patch, hdfsRpcTimeout.patch
>
>
> This jira aims to make client/datanode and datanode/datanode RPCs have a 
> timeout of DataNode#socketTimeout.





[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-08-30 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094138#comment-13094138
 ] 

Doug Cutting commented on HDFS-2298:


If we hope to eventually support wire-format interoperability (HADOOP-7347), 
then we ought to avoid overloading methods in our RPC protocols.  If we agree 
with that, then the fix is to rename one of the overloaded methods and let 
Avro serve as a test for these sorts of problems.  Another alternative might be 
to fix Avro Reflect to handle this, but I think that could be tricky, ugly, or 
both.  Or we might insert a non-reflected, manually maintained wrapper layer 
for each RPC protocol, as has been done with MR2.  This is good practice, but 
also a lot of work.
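The collision is easy to reproduce outside Hadoop. A reflect-based mapping that keys protocol messages by method name alone cannot represent two Java methods with the same name; renaming one sidesteps the problem. The interfaces below are hypothetical examples, not the actual ClientProtocol change:

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Why overloaded methods break name-keyed RPC mappings: two Java
// methods named "delete" collide on the single message name "delete".
public class OverloadDemo {

    public interface OverloadedProtocol {   // would break Avro reflect
        boolean delete(String src);
        boolean delete(String src, boolean recursive);
    }

    public interface RenamedProtocol {      // one name per message
        boolean delete(String src);
        boolean deleteRecursive(String src, boolean recursive);
    }

    // Counts how many distinct method names survive a name-keyed mapping.
    public static int distinctMessageNames(Class<?> protocol) {
        Map<String, Method> byName = new HashMap<>();
        for (Method m : protocol.getMethods()) {
            byName.put(m.getName(), m);  // later overloads overwrite earlier ones
        }
        return byName.size();
    }

    public static void main(String[] args) {
        // methods minus distinct names = number of colliding overloads
        System.out.println(OverloadedProtocol.class.getMethods().length
            - distinctMessageNames(OverloadedProtocol.class));  // prints 1
        System.out.println(RenamedProtocol.class.getMethods().length
            - distinctMessageNames(RenamedProtocol.class));     // prints 0
    }
}
```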

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}





[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094028#comment-13094028
 ] 

Uma Maheswara Rao G commented on HDFS-2298:
---

Just now I checked the 0.22 branch with the avro 1.3.2 jar. It works fine.

But in trunk, the Avro jars have been upgraded to version 1.5.2. I think this 
upgrade, picked up with the mavenization changes, might have caused the failure.

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}





[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-08-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094019#comment-13094019
 ] 

Aaron T. Myers commented on HDFS-2298:
--

So then I'll go back to my original question - what caused this test to start 
failing? Perhaps it was previously excluded from even being run, but that was 
undone by the recent mavenization changes?

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}





[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094012#comment-13094012
 ] 

Uma Maheswara Rao G commented on HDFS-2298:
---

Hi Aaron,

I think Doug Cutting already filed an issue in HDFS to handle this:

https://issues.apache.org/jira/browse/HDFS-1077

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}





[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094003#comment-13094003
 ] 

Uma Maheswara Rao G commented on HDFS-2298:
---

I think the most likely reason is the Avro changes; the related change is in 
AVRO-499.
I am not exactly sure when this test started failing.

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}





[jira] [Created] (HDFS-2302) HDFS logs not being rotated

2011-08-30 Thread Ravi Prakash (JIRA)
HDFS logs not being rotated
---

 Key: HDFS-2302
 URL: https://issues.apache.org/jira/browse/HDFS-2302
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravi Prakash


In commit c5edca2b15eca7c0bd568a0017f699ac91b8aebf, the logs for the namenode, 
datanode and secondarynamenode are written to .out files and are not rotated 
after one day. IMHO, rotation of logs is important.
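One conventional fix is to route daemon logging through log4j's DailyRollingFileAppender rather than a bare stdout redirect into .out files. The fragment below is a sketch of that approach, not the committed configuration:

```properties
# Hypothetical log4j.properties sketch: send daemon logs through a
# daily-rolling appender instead of an unrotated .out redirect.
log4j.rootLogger=INFO, DRFA
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Roll the file once per day, keeping dated backups.
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```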





[jira] [Updated] (HDFS-362) FSEditLog should not writes long and short as UTF8 and should not use ArrayWritable for writing non-array items

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-362:
-

Status: Open  (was: Patch Available)

> FSEditLog should not writes long and short as UTF8 and should not use 
> ArrayWritable for writing non-array items
> ---
>
> Key: HDFS-362
> URL: https://issues.apache.org/jira/browse/HDFS-362
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-362.1.patch, HDFS-362.2.patch, HDFS-362.2b.patch, 
> HDFS-362.2c.patch, HDFS-362.2d.patch, HDFS-362.2d.patch, HDFS-362.patch
>
>
> In FSEditLog, 
> - long and short values are first converted to String and then further 
> converted to UTF8
> - for some non-array items, it first creates an ArrayWritable object to hold 
> all the items and then writes that ArrayWritable object.
> These result in the creation of many intermediate objects, which affects 
> NameNode CPU performance and NameNode restart time.
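The cost difference is easy to see in miniature. The following standalone sketch (not FSEditLog's actual code) compares serializing a long through a String/UTF round trip, which allocates a temporary String and writes more bytes, against writing the raw 8-byte value:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of the inefficiency: a long serialized via String/UTF8
// versus the raw 8-byte encoding.
public class EditLogEncoding {

    // Size when the long goes through Long.toString + writeUTF:
    // allocates a temporary String and writes a 2-byte length prefix
    // plus one byte per digit.
    public static int utfEncodedSize(long value) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(Long.toString(value));
            return buf.size();
        } catch (IOException e) {  // in-memory streams never actually throw
            throw new RuntimeException(e);
        }
    }

    // Size with the raw encoding: always 8 bytes, no temporaries.
    public static int rawEncodedSize(long value) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeLong(value);
            return buf.size();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        long txid = 1234567890123L;  // 13 digits
        System.out.println("as UTF string: " + utfEncodedSize(txid) + " bytes");  // 15
        System.out.println("raw long:      " + rawEncodedSize(txid) + " bytes");  // 8
    }
}
```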





[jira] [Updated] (HDFS-362) FSEditLog should not writes long and short as UTF8 and should not use ArrayWritable for writing non-array items

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-362:
-

Status: Patch Available  (was: Open)

> FSEditLog should not writes long and short as UTF8 and should not use 
> ArrayWritable for writing non-array items
> ---
>
> Key: HDFS-362
> URL: https://issues.apache.org/jira/browse/HDFS-362
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-362.1.patch, HDFS-362.2.patch, HDFS-362.2b.patch, 
> HDFS-362.2c.patch, HDFS-362.2d.patch, HDFS-362.2d.patch, HDFS-362.patch
>
>
> In FSEditLog, 
> - long and short values are first converted to String and then further 
> converted to UTF8
> - for some non-array items, it first creates an ArrayWritable object to hold 
> all the items and then writes that ArrayWritable object.
> These result in the creation of many intermediate objects, which affects 
> NameNode CPU performance and NameNode restart time.





[jira] [Commented] (HDFS-1779) After NameNode restart , Clients can not read partial files even after client invokes Sync.

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093994#comment-13093994
 ] 

Uma Maheswara Rao G commented on HDFS-1779:
---

Hi Hairong,
Actually, I had also started looking into this. Since you have a patch ready, 
I will test your patch.
Thanks a lot, Hairong, for the help :-)


Thanks
Uma

> After NameNode restart , Clients can not read partial files even after client 
> invokes Sync.
> ---
>
> Key: HDFS-1779
> URL: https://issues.apache.org/jira/browse/HDFS-1779
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, name-node
>Affects Versions: 0.20-append
> Environment: Linux
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 0.20-append
>
> Attachments: HDFS-1779.1.patch, HDFS-1779.patch
>
>
> In the append (HDFS-200) issue:
> If a file has 10 blocks and the client invokes the sync method after writing 
> 5 blocks, the NN will persist the block information in the edits. 
> After this, if we restart the NN, all the DataNodes will re-register with 
> the NN, but they do not resend the blocks-being-written information at that 
> point; DNs send the blocksBeingWritten information only at DN startup. So 
> the NameNode cannot find which DataNodes the 5 persisted blocks belong to. 
> This information could be rebuilt from DN block reports; otherwise we will 
> lose the information for these 5 blocks even though the NN persisted it in 
> the edits.





[jira] [Commented] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093989#comment-13093989
 ] 

Hadoop QA commented on HDFS-2299:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12492281/HDFS-2299.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1180//console

This message is automatically generated.

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch, HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}





[jira] [Updated] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2299:
--

Attachment: HDFS-2299.patch

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch, HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}





[jira] [Commented] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093976#comment-13093976
 ] 

Aaron T. Myers commented on HDFS-2299:
--

Yes, that's fine. Just make sure that test-patch doesn't complain. I believe 
you do this by upping the number of OK_RELEASEAUDIT_WARNINGS in the relevant 
test-patch.properties file.

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}





[jira] [Commented] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093975#comment-13093975
 ] 

Hadoop QA commented on HDFS-2299:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12492275/HDFS-2299.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:



+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1179//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1179//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1179//console

This message is automatically generated.

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}





[jira] [Commented] (HDFS-2018) 1073: Move all journal stream management code into one place

2011-08-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093974#comment-13093974
 ] 

Hadoop QA commented on HDFS-2018:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12492273/HDFS-2018.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 28 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:

  org.apache.hadoop.hdfs.TestDfsOverAvroRpc
  
org.apache.hadoop.hdfs.server.blockmanagement.TestHost2NodesMap

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1178//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1178//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1178//console

This message is automatically generated.

> 1073: Move all journal stream management code into one place
> 
>
> Key: HDFS-2018
> URL: https://issues.apache.org/jira/browse/HDFS-2018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.23.0
>
> Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, hdfs-2018-otherapi.txt, hdfs-2018.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in 
> FileJournalManager and the code for input streams is in the inspectors. This 
> change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deals with URIs when referring to edit logs
>   - Recovery of inprogress logs is performed by counting the number of 
> transactions instead of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093970#comment-13093970
 ] 

Uma Maheswara Rao G commented on HDFS-2299:
---

But that will introduce a release audit warning, though it does no harm.
Is that fine?

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2299:
--

Status: Patch Available  (was: Open)

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093961#comment-13093961
 ] 

Aaron T. Myers commented on HDFS-2299:
--

Thanks for looking into this Uma. I think a better solution would be to just 
remove the offending header from editsStored.xml. There's no reason it needs to 
be there, and I don't think we should rework the test to account for it.

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093954#comment-13093954
 ] 

Uma Maheswara Rao G commented on HDFS-2299:
---

Hi Aaron,

I analyzed the root cause of this failure.
I think that, as part of some release audit fixes, the Apache license header 
was added to the editsStored.xml file.

But this test case generates an editsStoredParsedXml file, which does not 
contain that Apache header.
We do not actually need to compare that content, so the patch excludes it from 
the comparison.


Thanks
Uma
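
The comparison described above can be sketched in a minimal, self-contained 
form. This is a hypothetical helper, not the actual test code; it only 
illustrates stripping an XML comment header (such as the Apache license block) 
before comparing two documents:

```java
/** Minimal sketch: strip XML comment blocks before comparing documents,
 *  so a license header in one file does not cause a spurious mismatch.
 *  Hypothetical helper, not the real TestOfflineEditsViewer code. */
public class HeaderAgnosticCompare {
    // Remove any <!-- ... --> comment blocks (e.g. the Apache header).
    static String stripComments(String xml) {
        return xml.replaceAll("(?s)<!--.*?-->", "").trim();
    }

    static boolean sameIgnoringHeader(String a, String b) {
        return stripComments(a).equals(stripComments(b));
    }

    public static void main(String[] args) {
        String stored = "<!--\n Licensed to the ASF...\n-->\n<EDITS><RECORD/></EDITS>";
        String parsed = "<EDITS><RECORD/></EDITS>";
        if (!sameIgnoringHeader(stored, parsed)) throw new AssertionError("should match");
        if (sameIgnoringHeader(stored, "<EDITS/>")) throw new AssertionError("should differ");
        System.out.println("ok");
    }
}
```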

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2299:
--

Attachment: HDFS-2299.patch

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2299.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1779) After NameNode restart , Clients can not read partial files even after client invokes Sync.

2011-08-30 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093943#comment-13093943
 ] 

Hairong Kuang commented on HDFS-1779:
-

Sure, I can upload a patch for supporting bbw block report.

> After NameNode restart , Clients can not read partial files even after client 
> invokes Sync.
> ---
>
> Key: HDFS-1779
> URL: https://issues.apache.org/jira/browse/HDFS-1779
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, name-node
>Affects Versions: 0.20-append
> Environment: Linux
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 0.20-append
>
> Attachments: HDFS-1779.1.patch, HDFS-1779.patch
>
>
> In the append (HDFS-200) issue:
> If a file has 10 blocks and the client invokes sync after writing 5 blocks, 
> the NN will persist the block information in the edits log. 
> If we then restart the NN, all the DataNodes will re-register with the NN, 
> but the DataNodes do not send the blocks-being-written information to the NN 
> at that point; DNs only send the blocksBeingWritten information at DN 
> startup. So the NameNode cannot determine which DataNodes hold the 5 
> persisted blocks. This information could be built from block reports from 
> the DNs. Otherwise we will lose the information for these 5 blocks even 
> though the NN persisted it in the edits log. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-08-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093938#comment-13093938
 ] 

Aaron T. Myers commented on HDFS-2298:
--

Hey Uma, can you tell which commit introduced this issue? I don't have a 
work-around in mind for this problem.

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-2299) TestOfflineEditsViewer is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HDFS-2299:
-

Assignee: Uma Maheswara Rao G

> TestOfflineEditsViewer is failing on trunk
> --
>
> Key: HDFS-2299
> URL: https://issues.apache.org/jira/browse/HDFS-2299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: 
> org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
> ---
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.652 sec <<< 
> FAILURE!
> testStored(org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer)
>   Time elapsed: 0.038 sec  <<< FAILURE!
> java.lang.AssertionError: Reference XML edits and parsed to XML should be same
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2018) 1073: Move all journal stream management code into one place

2011-08-30 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-2018:
-

Attachment: HDFS-2018.diff

> 1073: Move all journal stream management code into one place
> 
>
> Key: HDFS-2018
> URL: https://issues.apache.org/jira/browse/HDFS-2018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.23.0
>
> Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, hdfs-2018-otherapi.txt, hdfs-2018.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in 
> FileJournalManager and the code for input streams is in the inspectors. This 
> change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deals with URIs when referring to edit logs
>   - Recovery of inprogress logs is performed by counting the number of 
> transactions instead of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2246) Shortcut a local client reads to a Datanodes files directly

2011-08-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093887#comment-13093887
 ] 

Andrew Purtell commented on HDFS-2246:
--

A revised commit without the accidentally included .orig files from patch: 
https://github.com/trendmicro/hadoop-common/commit/c19169f9c93f89ddafd6b6293a3e327a9741dfb0

> Shortcut a local client reads to a Datanodes files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2246) Shortcut a local client reads to a Datanodes files directly

2011-08-30 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HDFS-2246:
-

Attachment: 0001-HDFS-347.-Local-reads.patch

Attached is a 0.20-ish patch ported from our production tree to an ASF 
derivative: 
https://github.com/trendmicro/hadoop-common/tree/0.20-security-append-HDFS-347

(The 0.20-security-append tree, based on security-204, is here: 
https://github.com/trendmicro/hadoop-common/tree/0.20-security-append)

The attached patch is this change: 
https://github.com/trendmicro/hadoop-common/commit/18971523a9260acbe920c3da0c5a2623eacec1d2

This was initially Ryan Rawson's patch for 0.20-append on HDFS-347, later 
merged with the checksumming improvements from Facebook and with our 
enhancements and fixups for security and metrics v2. It includes a new metric 
for the number of times clients successfully get a local path to block data.

> Shortcut a local client reads to a Datanodes files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1217) Some methods in the NameNdoe should not be public

2011-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093797#comment-13093797
 ] 

Hudson commented on HDFS-1217:
--

Integrated in Hadoop-Mapreduce-trunk #801 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/801/])
HDFS-1217.  Change some NameNode methods from public to package private.  
Contributed by Laxman

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1163081
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMultipleRegistrations.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java


> Some methods in the NameNdoe should not be public
> -
>
> Key: HDFS-1217
> URL: https://issues.apache.org/jira/browse/HDFS-1217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Laxman
> Fix For: 0.23.0, 0.24.0
>
> Attachments: HDFS-1217.patch
>
>
> There are quite a few NameNode methods which are not required to be public.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2095) org.apache.hadoop.hdfs.server.datanode.DataNode#checkDiskError produces check storm making data node unavailable

2011-08-30 Thread Vitalii Tymchyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093793#comment-13093793
 ] 

Vitalii Tymchyshyn commented on HDFS-2095:
--

Today I got one more stack trace that forced a "disk check": 

2011-08-30 14:48:39,010 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
checkDiskError: exception: 
java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
at sun.nio.ch.IOUtil.write(IOUtil.java:75)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
at 
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:60)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:151)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:112)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:105)
at java.io.DataOutputStream.writeShort(DataOutputStream.java:150)
at 
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.write(DataTransferProtocol.java:543)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:916)
at java.lang.Thread.run(Thread.java:619)

It seems that BlockReceiver$PacketResponder.run does not distinguish between 
disk errors and network errors, and runs checkDiskError in either case. And 
checkDiskError is very time-consuming if you have a lot of directories. So 
there are two problems to be fixed:
 1) checkDiskError should not be called on network errors
 2) checkDiskError should either not lock the whole data dir or be much 
lighter.
In any case, why is the whole tree (dirs only!) analyzed? Is it because of 
possible multiple mount points in there? Checking every single directory looks 
like overkill to me. In my case, it simply makes the node "dead".
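
The first point above could be addressed along these lines. This is a 
hypothetical sketch, not the actual DataNode code; the class and method names 
are invented, and the error-classification heuristics are assumptions:

```java
import java.io.IOException;
import java.net.SocketException;
import java.net.SocketTimeoutException;

/** Sketch: only trigger the expensive disk check for exceptions that
 *  plausibly indicate a local storage fault, not for socket-level
 *  failures such as the "Broken pipe" in the stack trace above.
 *  Names and heuristics are illustrative, not the real DataNode API. */
public class DiskErrorFilter {
    static boolean looksLikeNetworkError(IOException e) {
        if (e instanceof SocketException || e instanceof SocketTimeoutException) return true;
        String msg = e.getMessage();
        return msg != null && (msg.contains("Broken pipe")
                || msg.contains("Connection reset")
                || msg.contains("Connection refused"));
    }

    /** Would replace an unconditional checkDiskError() call site. */
    static void maybeCheckDiskError(IOException e, Runnable checkDiskError) {
        if (looksLikeNetworkError(e)) return;  // skip the storm-prone scan
        checkDiskError.run();
    }

    public static void main(String[] args) {
        final boolean[] ran = {false};
        maybeCheckDiskError(new IOException("Broken pipe"), () -> ran[0] = true);
        if (ran[0]) throw new AssertionError("network error must not trigger disk check");
        maybeCheckDiskError(new IOException("Input/output error"), () -> ran[0] = true);
        if (!ran[0]) throw new AssertionError("disk error should trigger disk check");
        System.out.println("ok");
    }
}
```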

> org.apache.hadoop.hdfs.server.datanode.DataNode#checkDiskError produces check 
> storm making data node unavailable
> 
>
> Key: HDFS-2095
> URL: https://issues.apache.org/jira/browse/HDFS-2095
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.21.0
>Reporter: Vitalii Tymchyshyn
> Attachments: patch.diff, patch2.diff
>
>
> I can see that if a data node receives some IO error, this can cause a 
> checkDir storm.
> What I mean:
> 1) any error produces DataNode.checkDiskError call
> 2) this call locks volume:
>  java.lang.Thread.State: RUNNABLE
>at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
>at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
>at java.io.File.exists(File.java:733)
>at 
> org.apache.hadoop.util.DiskChecker.mkdirsWithExistsCheck(DiskChecker.java:65)
>at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:86)
>at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSDir.checkDirTree(FSDataset.java:228)
>at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSDir.checkDirTree(FSDataset.java:232)
>at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSDir.checkDirTree(FSDataset.java:232)
>at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSDir.checkDirTree(FSDataset.java:232)
>at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.checkDirs(FSDataset.java:414)
>at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.checkDirs(FSDataset.java:617)
>- locked <0x00080a8faec0> (a 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet)
>at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.checkDataDir(FSDataset.java:1681)
>at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:745)
>at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:735)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.close(BlockReceiver.java:202)
>at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:151)
>at org.apache.hadoop.io.IOUtils.closeStream(IOUtils.java:167)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:646)
>at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlock(DataXceiver.java:352)
>at 
> org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opWriteBlock(DataTransferProtocol.java:390)
>at 
> org.ap

[jira] [Commented] (HDFS-1108) Log newly allocated blocks

2011-08-30 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093792#comment-13093792
 ] 

Eli Collins commented on HDFS-1108:
---

The design doc allows for and discusses multiple approaches; specific 
implementations are happening in the JIRAs. See, e.g., HDFS-1975 for sharing 
NN state.

> Log newly allocated blocks
> --
>
> Key: HDFS-1108
> URL: https://issues.apache.org/jira/browse/HDFS-1108
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
> Fix For: HA branch (HDFS-1623)
>
> Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt, hdfs-1108.txt
>
>
> The current HDFS design says that newly allocated blocks for a file are not 
> persisted in the NN transaction log when the block is allocated. Instead, a 
> hflush() or a close() on the file persists the blocks into the transaction 
> log. It would be nice if we can immediately persist newly allocated blocks 
> (as soon as they are allocated) for specific files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN

2011-08-30 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093787#comment-13093787
 ] 

Eli Collins commented on HDFS-1623:
---

Hi Konstantin,

HDFS-1975 (sharing the namenode state from active to standby) was created back 
in May and the description clearly states that the "proposed solution in this 
jira is to use shared storage".  It's probably the most appropriate place for 
this discussion.

Thanks,
Eli 

> High Availability Framework for HDFS NN
> ---
>
> Key: HDFS-1623
> URL: https://issues.apache.org/jira/browse/HDFS-1623
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
> Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode 
> HA_v2_1.pdf, Namenode HA Framework.pdf
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1217) Some methods in the NameNdoe should not be public

2011-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093691#comment-13093691
 ] 

Hudson commented on HDFS-1217:
--

Integrated in Hadoop-Hdfs-trunk #776 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/776/])
HDFS-1217.  Change some NameNode methods from public to package private.  
Contributed by Laxman

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1163081
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMultipleRegistrations.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java


> Some methods in the NameNdoe should not be public
> -
>
> Key: HDFS-1217
> URL: https://issues.apache.org/jira/browse/HDFS-1217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Laxman
> Fix For: 0.23.0, 0.24.0
>
> Attachments: HDFS-1217.patch
>
>
> There are quite a few NameNode methods which are not required to be public.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093664#comment-13093664
 ] 

Uma Maheswara Rao G commented on HDFS-2298:
---

Hi Aaron,
 
 It looks like Avro does not support multiple methods with the same name:
http://web.archiveorange.com/archive/v/du1upKgQKN4KYfuW7tng

 But we have such cases in NamenodeProtocols:

{code}
public interface NamenodeProtocols
  extends ClientProtocol,
  DatanodeProtocol,
  NamenodeProtocol,
  RefreshAuthorizationPolicyProtocol,
  RefreshUserMappingsProtocol,
  GetUserMappingsProtocol {
{code}

One more observation: reportedBadBlocks, versionRequest, etc. are present in 
both ClientProtocol and DatanodeProtocol. I think this case will also not be 
allowed.


Since we have such scenarios to support, we may need to check with Avro.
Shall we file a JIRA in AVRO, or do you have a workaround for this?


Thanks
Uma
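
The name-collision constraint can be reproduced with plain reflection, without 
Avro itself: Avro's reflect RPC keys protocol messages by method name alone, 
so a merged interface carrying two methods named delete conflicts. A minimal 
sketch follows; the inner interfaces are invented stand-ins for the real 
protocols, not the actual Hadoop interfaces:

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

/** Sketch (plain reflection, no Avro dependency): detect method-name
 *  collisions of the kind Avro's reflect RPC rejects. The interfaces
 *  are illustrative stand-ins for ClientProtocol/NamenodeProtocol. */
public class DuplicateMethodCheck {
    interface ClientLike { boolean delete(String src, boolean recursive); }
    interface NamenodeLike { void delete(String blockPoolId); }
    interface Combined extends ClientLike, NamenodeLike {}

    /** Returns a method name declared more than once, or null if none. */
    static String firstDuplicate(Class<?> protocol) {
        Map<String, Integer> seen = new HashMap<>();
        for (Method m : protocol.getMethods()) {
            if (seen.merge(m.getName(), 1, Integer::sum) > 1) return m.getName();
        }
        return null;
    }

    public static void main(String[] args) {
        // Combined inherits two distinct methods both named "delete",
        // the situation that triggers AvroTypeException at runtime.
        String dup = firstDuplicate(Combined.class);
        if (!"delete".equals(dup)) throw new AssertionError("expected delete, got " + dup);
        System.out.println("duplicate method name: " + dup);
    }
}
```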



> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2281) NPE in checkpoint during processIOError()

2011-08-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093622#comment-13093622
 ] 

Uma Maheswara Rao G commented on HDFS-2281:
---

 Hi Konstantin,

   Thanks a lot for taking a look at this patch.
   Below is the test-patch results

 [exec] +1 @author. The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include  new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 system test framework.  The patch passed system test 
framework compile.

When I ran the tests, no failures were observed because of these changes.

Thanks
Uma

> NPE in checkpoint during processIOError()
> -
>
> Key: HDFS-2281
> URL: https://issues.apache.org/jira/browse/HDFS-2281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: BN-bug-NPE.txt, HDFS-2281.patch
>
>
> At the end of checkpoint BackupNode tries to convergeJournalSpool() and calls 
> revertFileStreams(). The latter closes each file stream, and tries to rename 
> the corresponding file to its permanent location current/edits. If for any 
> reason the rename fails processIOError() is called for failed streams. 
> processIOError() will try to close the stream again and will get NPE in 
> EditLogFileOutputStream.close() because bufCurrent was set to null by the 
> previous close.
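
A defensive pattern for the double-close scenario described above is to make 
close() idempotent, so a second call returns instead of dereferencing a nulled 
buffer. A hedged sketch, with an invented class; the real 
EditLogFileOutputStream fix may differ:

```java
import java.io.Closeable;
import java.io.IOException;

/** Sketch of an idempotent close, assuming a buffer field that the first
 *  close() flushes and nulls out, mirroring the bufCurrent NPE described
 *  above. Class and field names are illustrative, not the real HDFS code. */
public class IdempotentStream implements Closeable {
    private StringBuilder bufCurrent = new StringBuilder("pending edits");
    private boolean closed = false;
    int closeCalls = 0;

    @Override
    public void close() throws IOException {
        closeCalls++;
        if (closed) return;          // second close is a no-op, no NPE
        // flush and release the buffer exactly once
        bufCurrent.setLength(0);
        bufCurrent = null;
        closed = true;
    }

    public static void main(String[] args) throws IOException {
        IdempotentStream s = new IdempotentStream();
        s.close();
        s.close();                   // would have thrown NPE without the guard
        if (s.closeCalls != 2) throw new AssertionError();
        System.out.println("ok");
    }
}
```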

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1108) Log newly allocated blocks

2011-08-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093619#comment-13093619
 ] 

Konstantin Shvachko commented on HDFS-1108:
---

Eli, the latest design doc does not make any decisions on the matter; your 
excerpt is from the oldest doc.
Since the discussion is turning to a more general topic, I [posted my 
concerns|https://issues.apache.org/jira/browse/HDFS-1623?focusedCommentId=13093571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13093571]
 in the main jira.

> Log newly allocated blocks
> --
>
> Key: HDFS-1108
> URL: https://issues.apache.org/jira/browse/HDFS-1108
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
> Fix For: HA branch (HDFS-1623)
>
> Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt, hdfs-1108.txt
>
>
> The current HDFS design says that newly allocated blocks for a file are not 
> persisted in the NN transaction log when the block is allocated. Instead, a 
> hflush() or a close() on the file persists the blocks into the transaction 
> log. It would be nice if we can immediately persist newly allocated blocks 
> (as soon as they are allocated) for specific files.





[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN

2011-08-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093571#comment-13093571
 ] 

Konstantin Shvachko commented on HDFS-1623:
---

The discussion in HDFS-1108 revealed that Todd, Suresh and Eli (and probably 
others) are building an HA approach based on shared-storage (NFS filer) journal 
synchronization. The claimed motivation for this is the simplicity of the 
approach compared to streaming edits directly to the StandbyNode. I think 
there are 2 main questions that need to be addressed with respect to this:
# _Why do you introduce a dependency on enterprise hardware when you run a 
commodity hardware cluster?_ 
*People running a 20-node Hadoop cluster will probably have to spend the same 
amount again on a filer.*
# _How do you address the race condition between NN addBlock and DN 
blockReceived?_
Explanation: When an HDFS client needs to create a new block, it sends an 
addBlock() command to the NameNode. The NN (assuming HDFS-1108 is fixed) writes 
an addBlock transaction to the shared storage. The client writes data to the 
allocated DataNodes. Each DataNode confirms that it got the replica by sending 
a blockReceived() message to the NN and the SBN. If blockReceived() is sent to 
the StandbyNode before it has consumed the addBlock() transaction for this 
block from shared storage, blockReceived() will be rejected, since the SBN 
does not yet know the block exists. The SBN will eventually learn about that 
replica from the next block report, but that can be up to an hour later. 
*The SBN will be one hour behind the active NN, which is not hot.*
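One conventional way to close such a race is to queue, rather than reject, reports for blocks the standby has not yet learned about, and replay them once the corresponding edit is consumed. The sketch below illustrates that idea with entirely hypothetical names; it is not a mechanism proposed in this jira:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch: instead of rejecting blockReceived() for a block the
// StandbyNode has not yet read from the shared edit log, queue the report
// and replay it once the addBlock transaction is consumed.
class StandbyBlockTracker {
    private final Set<Long> knownBlocks = new HashSet<>();
    private final Map<Long, Queue<String>> pendingReports = new HashMap<>();
    private final Map<Long, Set<String>> replicas = new HashMap<>();

    // DataNode reports a replica (block id + datanode id).
    void blockReceived(long blockId, String datanode) {
        if (knownBlocks.contains(blockId)) {
            addReplica(blockId, datanode);
        } else {
            // Queue instead of reject: the addBlock edit may simply not
            // have been consumed from shared storage yet.
            pendingReports.computeIfAbsent(blockId, k -> new LinkedList<>())
                          .add(datanode);
        }
    }

    // Called when the addBlock transaction is read from shared storage.
    void onAddBlockEdit(long blockId) {
        knownBlocks.add(blockId);
        Queue<String> queued = pendingReports.remove(blockId);
        if (queued != null) {
            for (String dn : queued) {
                addReplica(blockId, dn);
            }
        }
    }

    private void addReplica(long blockId, String dn) {
        replicas.computeIfAbsent(blockId, k -> new HashSet<>()).add(dn);
    }

    int replicaCount(long blockId) {
        Set<String> s = replicas.get(blockId);
        return s == null ? 0 : s.size();
    }
}
```

An early blockReceived() is then applied as soon as the edit arrives, instead of being lost until the next block report.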

> High Availability Framework for HDFS NN
> ---
>
> Key: HDFS-1623
> URL: https://issues.apache.org/jira/browse/HDFS-1623
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
> Attachments: HDFS-High-Availability.pdf, NameNode HA_v2.pdf, NameNode 
> HA_v2_1.pdf, Namenode HA Framework.pdf
>
>






[jira] [Commented] (HDFS-2018) 1073: Move all journal stream management code into one place

2011-08-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093551#comment-13093551
 ] 

Hadoop QA commented on HDFS-2018:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12492209/HDFS-2018.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 28 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:

  org.apache.hadoop.hdfs.TestDfsOverAvroRpc
  
org.apache.hadoop.hdfs.server.blockmanagement.TestHost2NodesMap

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1177//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1177//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1177//console

This message is automatically generated.

> 1073: Move all journal stream management code into one place
> 
>
> Key: HDFS-2018
> URL: https://issues.apache.org/jira/browse/HDFS-2018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.23.0
>
> Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> hdfs-2018-otherapi.txt, hdfs-2018.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in 
> FileJournalManager and the code for input streams is in the inspectors. This 
> change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deals with URIs when referring to edit logs
>   - Recovery of inprogress logs is performed by counting the number of 
> transactions instead of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.
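The "recover an in-progress log by counting transactions" idea can be sketched as follows; the length-prefixed record layout here is an illustrative assumption, not the real edits file format:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Illustrative sketch of validation by counting transactions: scan the log
// record by record and count how many complete transactions can be read
// before a clean EOF or a torn tail. The length-prefixed record layout is
// an assumption for the sketch, not the actual edits format.
class EditLogValidator {
    static int countValidTransactions(InputStream in) {
        DataInputStream din = new DataInputStream(in);
        int count = 0;
        try {
            while (true) {
                int len = din.readInt();       // record length prefix
                byte[] payload = new byte[len];
                din.readFully(payload);        // record body
                count++;                       // complete record read
            }
        } catch (EOFException e) {
            // Clean end of log, or a torn final record: stop counting here.
        } catch (IOException e) {
            // Any other read error also ends the valid prefix.
        }
        return count;
    }
}
```

Counting complete records makes recovery independent of the raw file length, which may include a half-written final transaction.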





[jira] [Commented] (HDFS-2281) NPE in checkpoint during processIOError()

2011-08-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093533#comment-13093533
 ] 

Konstantin Shvachko commented on HDFS-2281:
---

Hey Uma. It looks good to me. Could you please run and post the results of 
test and test-patch here?

> NPE in checkpoint during processIOError()
> -
>
> Key: HDFS-2281
> URL: https://issues.apache.org/jira/browse/HDFS-2281
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: BN-bug-NPE.txt, HDFS-2281.patch
>
>
> At the end of checkpoint BackupNode tries to convergeJournalSpool() and calls 
> revertFileStreams(). The latter closes each file stream, and tries to rename 
> the corresponding file to its permanent location current/edits. If for any 
> reason the rename fails processIOError() is called for failed streams. 
> processIOError() will try to close the stream again and will get NPE in 
> EditLogFileOutputStream.close() because bufCurrent was set to null by the 
> previous close.





[jira] [Updated] (HDFS-2018) 1073: Move all journal stream management code into one place

2011-08-30 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-2018:
-

Attachment: HDFS-2018.diff

Had uploaded the wrong patch last time, with one fix missing. This should pass 
as well as trunk does.

On trunk the following tests have failures:
TestDfsOverAvroRpc
TestHost2NodesMap
TestReplicasMap
TestOfflineEditsViewer

> 1073: Move all journal stream management code into one place
> 
>
> Key: HDFS-2018
> URL: https://issues.apache.org/jira/browse/HDFS-2018
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.23.0
>
> Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, 
> hdfs-2018-otherapi.txt, hdfs-2018.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in 
> FileJournalManager and the code for input streams is in the inspectors. This 
> change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deals with URIs when referring to edit logs
>   - Recovery of inprogress logs is performed by counting the number of 
> transactions instead of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.





[jira] [Commented] (HDFS-962) Make DFSOutputStream MAX_PACKETS configurable

2011-08-30 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093504#comment-13093504
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-962:
-

Hi Justin,

Thanks for working on this.  Some comments on the patch:

- Change dfsMaxPackets to private and non-static.

- Define constants for dfs.max.packets in DFSConfigKeys.
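Roughly what the two review points suggest, using simplified, hypothetical stand-ins for Configuration, DFSConfigKeys and DFSOutputStream (the default value here is only illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for DFSConfigKeys: a named constant for the key plus a default.
class DFSConfigKeysSketch {
    static final String DFS_MAX_PACKETS_KEY = "dfs.max.packets";
    static final int DFS_MAX_PACKETS_DEFAULT = 80; // illustrative default only
}

// Minimal stand-in for Hadoop's Configuration.
class ConfSketch {
    private final Map<String, String> props = new HashMap<>();
    void set(String key, String value) { props.put(key, value); }
    int getInt(String key, int defaultValue) {
        String v = props.get(key);
        return v == null ? defaultValue : Integer.parseInt(v);
    }
}

// Stand-in for DFSOutputStream: the limit is a private, non-static field,
// per the review comment, so each stream reads its own value from
// configuration instead of sharing a static constant.
class DFSOutputStreamSketch {
    private final int dfsMaxPackets;

    DFSOutputStreamSketch(ConfSketch conf) {
        dfsMaxPackets = conf.getInt(DFSConfigKeysSketch.DFS_MAX_PACKETS_KEY,
                                    DFSConfigKeysSketch.DFS_MAX_PACKETS_DEFAULT);
    }

    int maxPackets() { return dfsMaxPackets; }
}
```

A per-instance field also lets two clients in the same JVM use different limits, which a static could not.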

> Make DFSOutputStream MAX_PACKETS configurable
> -
>
> Key: HDFS-962
> URL: https://issues.apache.org/jira/browse/HDFS-962
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Justin Joseph
>Priority: Minor
> Attachments: HDFS-962.patch
>
>
> HDFS-959 suggests that the MAX_PACKETS variable (which determines how many 
> outstanding data packets the DFSOutputStream will permit) may have an impact 
> on performance. If so, we should make it configurable to trade off between 
> memory and performance. I think it ought to be a secret/undocumented config 
> for now - this will make it easier to benchmark without confusing users.





[jira] [Updated] (HDFS-962) Make DFSOutputStream MAX_PACKETS configurable

2011-08-30 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-962:


Assignee: Justin Joseph

> Make DFSOutputStream MAX_PACKETS configurable
> -
>
> Key: HDFS-962
> URL: https://issues.apache.org/jira/browse/HDFS-962
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Justin Joseph
>Priority: Minor
> Attachments: HDFS-962.patch
>
>
> HDFS-959 suggests that the MAX_PACKETS variable (which determines how many 
> outstanding data packets the DFSOutputStream will permit) may have an impact 
> on performance. If so, we should make it configurable to trade off between 
> memory and performance. I think it ought to be a secret/undocumented config 
> for now - this will make it easier to benchmark without confusing users.
