[jira] [Commented] (HDFS-3916) libwebhdfs (C client) code cleanups

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487591#comment-13487591
 ] 

Hudson commented on HDFS-3916:
--

Integrated in Hadoop-trunk-Commit #2945 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/2945/])
HDFS-3916. libwebhdfs testing code cleanup. Contributed by Jing Zhao. 
(Revision 1403922)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403922
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_multi_write.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_read.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_threaded.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_write.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_read_bm.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/native_mini_dfs.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/native_mini_dfs.h


> libwebhdfs (C client) code cleanups
> ---
>
> Key: HDFS-3916
> URL: https://issues.apache.org/jira/browse/HDFS-3916
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 2.0.2-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.0.3-alpha
>
> Attachments: 0002-fix.patch, HDFS-3916.003.patch, 
> HDFS-3916.004.patch, HDFS-3916.005.patch, HDFS-3916.006.patch
>
>
> Code cleanups in libwebhdfs.
> * don't duplicate exception.c, exception.h, expect.h, jni_helper.c.  We have 
> one copy of these files; we don't need 2.
> * remember to set errno in all public library functions (this is part of the 
> API)
> * fix undefined symbols (if a function is not implemented, it should return 
> ENOTSUP, but still exist)
> * don't expose private data structures in the (end-user visible) public 
> headers
> * can't re-use hdfsBuilder as hdfsFS, because the strings in hdfsBuilder are 
> not dynamically allocated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4129) Add utility methods to dump NameNode in memory tree for testing

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487590#comment-13487590
 ] 

Hudson commented on HDFS-4129:
--

Integrated in Hadoop-trunk-Commit #2945 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/2945/])
HDFS-4129. Add utility methods to dump NameNode in memory tree for testing. 
Contributed by Tsz Wo (Nicholas), SZE. (Revision 1403956)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403956
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSDirectory.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsLimits.java


> Add utility methods to dump NameNode in memory tree for testing
> ---
>
> Key: HDFS-4129
> URL: https://issues.apache.org/jira/browse/HDFS-4129
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h4129_20121029b.patch, h4129_20121029.patch, 
> h4129_20121030.patch
>
>
> The output of the utility methods looks like below.
> {noformat}
> \- foo   (INodeDirectory)
>   \- sub1   (INodeDirectory)
> +- file1   (INodeFile)
> +- file2   (INodeFile)
> +- sub11   (INodeDirectory)
>   \- file3   (INodeFile)
> \- z_file4   (INodeFile)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4047) BPServiceActor has nested shouldRun loops

2012-10-30 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated HDFS-4047:
--

Status: Patch Available  (was: Open)

> BPServiceActor has nested shouldRun loops
> -
>
> Key: HDFS-4047
> URL: https://issues.apache.org/jira/browse/HDFS-4047
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Priority: Minor
> Attachments: HADOOP-4047.patch
>
>
> BPServiceActor#run and offerService both have while shouldRun loops. We only 
> need the outer one, i.e., we can hoist the info log from offerService out to 
> run and remove the inner while loop.
> {code}
> BPServiceActor#run:
> while (shouldRun()) {
>   try {
> offerService();
>   } catch (Exception ex) {
> ...
> offerService:
> while (shouldRun()) {
>   try {
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4047) BPServiceActor has nested shouldRun loops

2012-10-30 Thread Yanbo Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanbo Liang updated HDFS-4047:
--

Attachment: HADOOP-4047.patch

As described above, we hoist the info log from offerService() out to run() and 
remove the while loop in offerService(). The patch looks like a huge change 
because removing the loop re-indents the surrounding code.
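
For reference, a minimal sketch of the intended shape after the change (member 
names are approximate, not the exact BPServiceActor code):

{code}
public void run() {
  // Info log hoisted out of offerService(), logged once before the single loop.
  LOG.info("Starting to offer service to namenode " + nnAddr);
  while (shouldRun()) {
    try {
      offerService();  // now does one pass of work; no inner while (shouldRun()) loop
    } catch (Exception ex) {
      LOG.warn("Exception in BPServiceActor", ex);
      sleepAndLogInterrupts(5000, "offering service");  // back off, retry via the outer loop
    }
  }
}
{code}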

> BPServiceActor has nested shouldRun loops
> -
>
> Key: HDFS-4047
> URL: https://issues.apache.org/jira/browse/HDFS-4047
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Priority: Minor
> Attachments: HADOOP-4047.patch
>
>
> BPServiceActor#run and offerService both have while shouldRun loops. We only 
> need the outer one, i.e., we can hoist the info log from offerService out to 
> run and remove the inner while loop.
> {code}
> BPServiceActor#run:
> while (shouldRun()) {
>   try {
> offerService();
>   } catch (Exception ex) {
> ...
> offerService:
> while (shouldRun()) {
>   try {
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode

2012-10-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487583#comment-13487583
 ] 

Aaron T. Myers commented on HDFS-3507:
--

Hi Vinay, the config "dfs.ha.allow.stale.reads" is only used for tests. As 
such, I think it's OK to label these operations as I previously suggested.

> DFS#isInSafeMode needs to execute only on Active NameNode
> -
>
> Key: HDFS-3507
> URL: https://issues.apache.org/jira/browse/HDFS-3507
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Vinay
>Assignee: Vinay
>Priority: Critical
> Attachments: HDFS-3507.patch, HDFS-3507.patch
>
>
> Currently DFS#isInSafeMode does not check the NN state; it can be 
> executed on any of the NNs.
> But HBase will use this API to check for NN safemode before starting up 
> its services.
> If the first configured NN is in standby, then DFS#isInSafeMode will check the 
> standby NN's safemode, but HBase wants the state of the active NN.
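
For context, a minimal sketch of the HBase-style startup check this issue is 
about, using the public DistributedFileSystem API; the fix should make the 
underlying call reflect the active NN:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class WaitForActiveNN {
  public static void main(String[] args) throws Exception {
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    // Today this may report the safemode of whichever NN is contacted,
    // even a standby; the caller needs the answer from the active NN.
    while (dfs.isInSafeMode()) {
      Thread.sleep(1000);
    }
  }
}
{code}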

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-2882) DN continues to start up, even if block pool fails to initialize

2012-10-30 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-2882:
--

Affects Version/s: (was: 0.24.0)
   2.0.2-alpha

> DN continues to start up, even if block pool fails to initialize
> 
>
> Key: HDFS-2882
> URL: https://issues.apache.org/jira/browse/HDFS-2882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 2.0.2-alpha
>Reporter: Todd Lipcon
> Attachments: hdfs-2882.txt
>
>
> I started a DN on a machine that was completely out of space on one of its 
> drives. I saw the following:
> 2012-02-02 09:56:50,499 FATAL 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for 
> block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id 
> DS-507718931-172.29.5.194-11072-1297842002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
> java.io.IOException: Mkdirs failed to create 
> /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
> at org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.<init>(FSDataset.java:335)
> but the DN continued to run, spewing NPEs when it tried to do block reports, 
> etc. This was on the HDFS-1623 branch but may affect trunk as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-10-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487547#comment-13487547
 ] 

Konstantin Shvachko commented on HDFS-2802:
---

I propose to divide this discussion into three categories: design goals, API 
and semantics, algorithms and implementation. If people can agree on one we can 
move to the next.
# I see three main *design goals* proposed: snapshots should be (a) read-only, 
(b) directory-level, (c) multiple.
This should hopefully work for everybody.
# *API*. Seems to me the most important point now.
HDFSSnapshotsDesign.pdf doesn't talk much about APIs except a reference to WAFL.
Snapshots20121030.pdf has examples of shell commands, which look a bit 
convoluted. I mean using delimiter ".snapshot" to specify a snapshot means I 
cannot have entries with that name.
Wouldn't it be better to control access to snapshots via -version option:
{{rm -r -version 3 /user/shv/hbase/}}  remove snapshot with id 3.
{{ls -version 2 /user/shv/hbase/}}  listing of the snapshot #2.
{{ls -versions /user/shv/hbase/}}  listing of snapshot ids of the directory.
Non-versioned commands would deal with the "current" state of the file system, 
as today.
I like the idea of generating globally unique version ids, and assigning them 
to snapshots internally rather than letting people invent their own. One can 
always list available versions and read the desired one. So the -createSnapshot 
command does not need to pass an id, but will instead get one in return.
# *Algorithms*. I agree the length of an under-construction file in the 
snapshot should come directly from the namespace. And we provide means to 
update it with hflush before the snapshot is taken.
Creating duplicate INodes with a diff - this is a sort of COW technique, right? 
Sounds hard.
It is simpler for me to think of versioned files and directories in this case. 
Creating a snapshot assigns a new version to objects.
Deleting a file should remove the current version but leave other versions 
unchanged. This can be implemented by marking the file "deleted" until all 
versions disappear, at which point it can be physically removed.

My dumb question: can I create a snapshot of a subdirectory that is a part of a 
snapshot above it?

> Support for RW/RO snapshots in HDFS
> ---
>
> Key: HDFS-2802
> URL: https://issues.apache.org/jira/browse/HDFS-2802
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node, name-node
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFSSnapshotsDesign.pdf, snap.patch, 
> snapshot-one-pager.pdf, Snapshots20121018.pdf, Snapshots20121030.pdf
>
>
> Snapshots are point in time images of parts of the filesystem or the entire 
> filesystem. Snapshots can be a read-only or a read-write point in time copy 
> of the filesystem. There are several use cases for snapshots in HDFS. I will 
> post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-10-30 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487496#comment-13487496
 ] 

Suresh Srinivas commented on HDFS-2802:
---

bq. From my quick look, this seems to clarify much of the proposed design
I am happy to hear this.

bq. It seems that since every current file which is snapshotted is represented 
by an INodeFileWithLink, at some point all files which are to be snapshotted 
under a sub-directory would need to be converted from an INodeFile to an 
INodeFileWithLink.
When a snapshotted file is modified, the conversion to INodeFileWithLink takes 
place. So if all the snapshotted files under a directory are modified post 
snapshot, they will all need to be converted into INodeFileWithLink. BTW, we 
have flexibility here and can eliminate the need for this completely if we 
decide that a snapshot of a file does not preserve the replication factor at 
the time of the snapshot - see the replication factor related section for details.
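
A rough sketch of the copy-on-modify conversion described above (class shapes 
and helper names are illustrative, not the actual patch):

{code}
// Illustrative only: convert lazily, on first modification after a snapshot,
// so files that are never modified pay no extra memory cost.
private INodeFile prepareForModification(INodeFile file) {
  if (isInSnapshot(file) && !(file instanceof INodeFileWithLink)) {
    INodeFileWithLink withLink = new INodeFileWithLink(file); // copies metadata, links old state
    replaceChildInParent(file, withLink);                     // parent now references the new node
    return withLink;
  }
  return file;
}
{code}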

> Support for RW/RO snapshots in HDFS
> ---
>
> Key: HDFS-2802
> URL: https://issues.apache.org/jira/browse/HDFS-2802
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node, name-node
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFSSnapshotsDesign.pdf, snap.patch, 
> snapshot-one-pager.pdf, Snapshots20121018.pdf, Snapshots20121030.pdf
>
>
> Snapshots are point in time images of parts of the filesystem or the entire 
> filesystem. Snapshots can be a read-only or a read-write point in time copy 
> of the filesystem. There are several use cases for snapshots in HDFS. I will 
> post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4128) 2NN gets stuck in inconsistent state if edit log replay fails in the middle

2012-10-30 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487495#comment-13487495
 ] 

Todd Lipcon commented on HDFS-4128:
---

Good point, Colin. I definitely don't think that's a safe assumption to make, 
so I agree we should abort if there is an exception on the application of an 
edit.
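
In code terms, the agreement is roughly this (a sketch only, not the actual 
FSEditLogLoader change; ExitUtil is org.apache.hadoop.util.ExitUtil):

{code}
// Sketch: treat any failure while applying an op as fatal, since a
// half-applied op may have left the in-memory namespace inconsistent.
try {
  applyEditLogOp(op, fsDir, logVersion);
} catch (Throwable t) {
  String msg = "Failed to apply edit log operation " + op;
  LOG.fatal(msg, t);
  ExitUtil.terminate(1, msg);  // abort instead of checkpointing a bad namespace
}
{code}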

> 2NN gets stuck in inconsistent state if edit log replay fails in the middle
> ---
>
> Key: HDFS-4128
> URL: https://issues.apache.org/jira/browse/HDFS-4128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.2-alpha
>Reporter: Todd Lipcon
>
> We saw the following issue in a cluster:
> - The 2NN downloads an edit log segment:
> {code}
> 2012-10-29 12:30:57,433 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading /xxx/current/edits_00049136809-00049176162 
> expecting start txid #49136809
> {code}
> - It fails in the middle of replay due to an OOME:
> {code}
> 2012-10-29 12:31:21,021 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation AddOp [length=0, path=/
> java.lang.OutOfMemoryError: Java heap space
> {code}
> - Future checkpoints then fail because the prior edit log replay only got 
> halfway through the stream:
> {code}
> 2012-10-29 12:32:21,214 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading /x/current/edits_00049176163-00049177224 
> expecting start txid #49144432
> 2012-10-29 12:32:21,216 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in 
> doCheckpoint
> java.io.IOException: There appears to be a gap in the edit log.  We expected 
> txid 49144432, but got txid 49176163.
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-10-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487489#comment-13487489
 ] 

Aaron T. Myers commented on HDFS-2802:
--

Hi Suresh, thanks a lot for posting an updated design doc. From my quick look, 
this seems to clarify much of the proposed design.

I do have one question for you from my quick read of this updated doc: It seems 
that since every current file which is snapshotted is represented by an 
INodeFileWithLink, at some point all files which are to be snapshotted under a 
sub-directory would need to be converted from an INodeFile to an 
INodeFileWithLink. When would this conversion take place? Perhaps when the 
directory is marked as being snapshottable using the `dfsadmin -allowSnapshot' 
command? Or perhaps when a snapshot is first created of a snapshottable 
directory? (Not suggesting this is unacceptable - just trying to understand the 
design.)

> Support for RW/RO snapshots in HDFS
> ---
>
> Key: HDFS-2802
> URL: https://issues.apache.org/jira/browse/HDFS-2802
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node, name-node
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFSSnapshotsDesign.pdf, snap.patch, 
> snapshot-one-pager.pdf, Snapshots20121018.pdf, Snapshots20121030.pdf
>
>
> Snapshots are point in time images of parts of the filesystem or the entire 
> filesystem. Snapshots can be a read-only or a read-write point in time copy 
> of the filesystem. There are several use cases for snapshots in HDFS. I will 
> post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4118) Change INodeDirectory.getExistingPathINodes(..) to work with snapshots

2012-10-30 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-4118.
---

   Resolution: Fixed
Fix Version/s: Snapshot (HDFS-2802)
 Hadoop Flags: Reviewed

I committed the patch to HDFS-2802 branch. Thank you Jing.

> Change INodeDirectory.getExistingPathINodes(..) to work with snapshots
> --
>
> Key: HDFS-4118
> URL: https://issues.apache.org/jira/browse/HDFS-4118
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: Snapshot (HDFS-2802)
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Jing Zhao
> Fix For: Snapshot (HDFS-2802)
>
> Attachments: HDFS-4118.001.patch
>
>
> {code}
> int getExistingPathINodes(byte[][] components, INode[] existing, boolean 
> resolveLink)
> {code}
> The INodeDirectory above retrieves existing INodes from the given path 
> components.  It needs to be updated in order to understand snapshot paths.
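
For illustration, the kind of call that must start working once snapshot paths 
are understood (the ".snapshot" separator and the rootDir receiver are 
assumptions based on the design discussion, not the patch itself):

{code}
// A current path and a snapshot path resolve through the same method:
//   /foo/bar              -> bar as it is now
//   /foo/.snapshot/s1/bar -> bar as it was in snapshot s1
byte[][] components = INode.getPathComponents("/foo/.snapshot/s1/bar");
INode[] existing = new INode[components.length];
rootDir.getExistingPathINodes(components, existing, true);
{code}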

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4118) Change INodeDirectory.getExistingPathINodes(..) to work with snapshots

2012-10-30 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4118:
--

Affects Version/s: Snapshot (HDFS-2802)

> Change INodeDirectory.getExistingPathINodes(..) to work with snapshots
> --
>
> Key: HDFS-4118
> URL: https://issues.apache.org/jira/browse/HDFS-4118
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: Snapshot (HDFS-2802)
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Jing Zhao
> Attachments: HDFS-4118.001.patch
>
>
> {code}
> int getExistingPathINodes(byte[][] components, INode[] existing, boolean 
> resolveLink)
> {code}
> The INodeDirectory above retrieves existing INodes from the given path 
> components.  It needs to be updated in order to understand snapshot paths.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4118) Change INodeDirectory.getExistingPathINodes(..) to work with snapshots

2012-10-30 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487477#comment-13487477
 ] 

Suresh Srinivas commented on HDFS-4118:
---

+1 for the patch. Nice set of tests.

> Change INodeDirectory.getExistingPathINodes(..) to work with snapshots
> --
>
> Key: HDFS-4118
> URL: https://issues.apache.org/jira/browse/HDFS-4118
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: Snapshot (HDFS-2802)
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Jing Zhao
> Attachments: HDFS-4118.001.patch
>
>
> {code}
> int getExistingPathINodes(byte[][] components, INode[] existing, boolean 
> resolveLink)
> {code}
> The INodeDirectory above retrieves existing INodes from the given path 
> components.  It needs to be updated in order to understand snapshot paths.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4056) Always start the NN's SecretManager

2012-10-30 Thread Kan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487475#comment-13487475
 ] 

Kan Zhang commented on HDFS-4056:
-

bq. "Newer" clients can however request and use tokens, while "older" clients 
work the same as before.

I'm not sure why "newer" or "older" clients matter. To me, a cluster is 
configured to run in either token testing mode or production mode. There needs 
to be a conf flag to tell the cluster which mode it is in. That flag tells a 
job whether it needs to use tokens or not. The same flag can tell the NN 
whether it needs to start its SecretManager (regardless of whether the client 
is newer or older).
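
In other words, something as simple as this would do (hypothetical key name; a 
sketch of the idea, not proposed code):

{code}
// One cluster-wide switch consulted by both sides.
boolean tokenMode = conf.getBoolean("hadoop.security.use.tokens", false); // hypothetical key
if (tokenMode) {
  startSecretManager();  // NN side: issue and verify tokens
}
// Job side: the same flag decides whether to fetch delegation tokens at submission.
{code}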

bq. Even with or w/o SASL PLAIN auth, HADOOP-8733 and HADOOP-8784 should not be 
reverted. They both implement correct behavior in a more general fashion.

IMO, they make the Client and Server less intelligent, in the sense that they 
don't recognize situations they used to recognize. I'm not sure their new 
behavior is desirable. For example, the Client will always look for a token and 
try to use it if found, even if the configuration says otherwise. And the NN's 
RPC Server will always initialize SaslRpcServer even if SASL is not configured 
(currently on the NN, the SecretManager object is always instantiated).

> Always start the NN's SecretManager
> ---
>
> Key: HDFS-4056
> URL: https://issues.apache.org/jira/browse/HDFS-4056
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-4056.patch
>
>
> To support the ability to use tokens regardless of whether kerberos is 
> enabled, the NN's secret manager should always be started.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4129) Add utility methods to dump NameNode in memory tree for testing

2012-10-30 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4129:
--

   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I committed this patch to trunk. Thank you Nicholas.

> Add utility methods to dump NameNode in memory tree for testing
> ---
>
> Key: HDFS-4129
> URL: https://issues.apache.org/jira/browse/HDFS-4129
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: h4129_20121029b.patch, h4129_20121029.patch, 
> h4129_20121030.patch
>
>
> The output of the utility methods looks like below.
> {noformat}
> \- foo   (INodeDirectory)
>   \- sub1   (INodeDirectory)
> +- file1   (INodeFile)
> +- file2   (INodeFile)
> +- sub11   (INodeDirectory)
>   \- file3   (INodeFile)
> \- z_file4   (INodeFile)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4129) Add utility methods to dump NameNode in memory tree for testing

2012-10-30 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487461#comment-13487461
 ] 

Suresh Srinivas commented on HDFS-4129:
---

+1 for the patch.

> Add utility methods to dump NameNode in memory tree for testing
> ---
>
> Key: HDFS-4129
> URL: https://issues.apache.org/jira/browse/HDFS-4129
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Attachments: h4129_20121029b.patch, h4129_20121029.patch, 
> h4129_20121030.patch
>
>
> The output of the utility methods looks like below.
> {noformat}
> \- foo   (INodeDirectory)
>   \- sub1   (INodeDirectory)
> +- file1   (INodeFile)
> +- file2   (INodeFile)
> +- sub11   (INodeDirectory)
>   \- file3   (INodeFile)
> \- z_file4   (INodeFile)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4125) Use a persistent data structure for snapshots

2012-10-30 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487450#comment-13487450
 ] 

Suresh Srinivas commented on HDFS-4125:
---

Todd, please look at the latest design document in HDFS-2802 to understand the 
cost of modification.

> Use a persistent data structure for snapshots
> -
>
> Key: HDFS-4125
> URL: https://issues.apache.org/jira/browse/HDFS-4125
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Suresh Srinivas
>
> There is a well-known data structure supporting
> - O(1) snapshot creation,
> - O(1) access slowdown, and
> - O(1) modification space and time.
> See http://www.cs.cmu.edu/~sleator/papers/Persistence.htm
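
For readers who don't want to open the paper: the core trick is the "fat node", 
where each field keeps a small version history. A toy Java sketch of the idea 
follows (the paper additionally splits overfull nodes, which is what bounds 
access to O(1); this sketch omits that):

{code}
import java.util.ArrayList;
import java.util.List;

// A persistent field: writes append a (version, value) pair in O(1);
// reads scan back for the newest value at or before the requested version.
class FatField<T> {
  private final List<Long> versions = new ArrayList<Long>();
  private final List<T> values = new ArrayList<T>();

  void set(long version, T value) {  // O(1) space and time per modification
    versions.add(version);
    values.add(value);
  }

  T get(long version) {
    for (int i = versions.size() - 1; i >= 0; i--) {
      if (versions.get(i) <= version) {
        return values.get(i);
      }
    }
    return null;  // field did not exist at that version
  }
}
{code}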

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-10-30 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-2802:
--

Attachment: Snapshots20121030.pdf

Attaching the updated design document. Hopefully it addresses the issues that 
have been raised. This should give sufficient details about the implementation 
we are currently working on. To summarize:

# *Snapshot allowed only at the root vs snapshot at the subdirectories* - The 
ability to snapshot a sub-directory is a very important requirement for many 
Hadoop users. Please see the requirements in the posted document for more 
details. The alternate proposal to allow snapshots only at the root is a 
non-starter in this regard.
# *Efficiency of snapshot creation and management* - The current design 
addresses the concerns raised. To summarize, the creation of a snapshot is 
O(1). The design uses a copy-on-modify approach, so the cost of a snapshot is 
zero when there is no modification and is proportional to the modifications 
when they are made. Please provide feedback.
# *Snapshots of files being written, and consistency* - Our document describes 
several design choices - some easy, some complicated. Please see the proposed 
choice in the document. We could continue this discussion in HDFS-3960.

I took a look at the alternate proposal. It is too high level, without 
sufficient detail to evaluate. From my limited understanding of the alternate 
proposal, the design document we have posted here has several significant 
advantages over it:
# It supports sub-directory snapshots, an important use case for many Hadoop 
users.
# It supports on-demand and user managed snapshots.
# When snapshots are not created, there is no cost incurred in terms of memory. 
The alternate proposal has O(N) memory cost for storing tags.
# Our design can also be extended to do RW snapshots, if we feel a need for it.

Hopefully this clarifies the design better. We would like to continue focusing 
on implementing it. Any feedback provided will be incorporated into the design 
and the implementation.


> Support for RW/RO snapshots in HDFS
> ---
>
> Key: HDFS-2802
> URL: https://issues.apache.org/jira/browse/HDFS-2802
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node, name-node
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFSSnapshotsDesign.pdf, snap.patch, 
> snapshot-one-pager.pdf, Snapshots20121018.pdf, Snapshots20121030.pdf
>
>
> Snapshots are point in time images of parts of the filesystem or the entire 
> filesystem. Snapshots can be a read-only or a read-write point in time copy 
> of the filesystem. There are several use cases for snapshots in HDFS. I will 
> post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4129) Add utility methods to dump NameNode in memory tree for testing

2012-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487429#comment-13487429
 ] 

Hadoop QA commented on HDFS-4129:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551455/h4129_20121030.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3431//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3431//console

This message is automatically generated.

> Add utility methods to dump NameNode in memory tree for testing
> ---
>
> Key: HDFS-4129
> URL: https://issues.apache.org/jira/browse/HDFS-4129
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Attachments: h4129_20121029b.patch, h4129_20121029.patch, 
> h4129_20121030.patch
>
>
> The output of the utility methods looks like below.
> {noformat}
> \- foo   (INodeDirectory)
>   \- sub1   (INodeDirectory)
> +- file1   (INodeFile)
> +- file2   (INodeFile)
> +- sub11   (INodeDirectory)
>   \- file3   (INodeFile)
> \- z_file4   (INodeFile)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4128) 2NN gets stuck in inconsistent state if edit log replay fails in the middle

2012-10-30 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487392#comment-13487392
 ] 

Colin Patrick McCabe commented on HDFS-4128:


Aborting definitely seems like the safest thing to do. But do we know that all 
transactions are applied atomically (i.e., if one fails and throws an exception 
in the middle, is whatever it did to the FSImage rolled back)? I'm not clear on 
that point.

> 2NN gets stuck in inconsistent state if edit log replay fails in the middle
> ---
>
> Key: HDFS-4128
> URL: https://issues.apache.org/jira/browse/HDFS-4128
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.2-alpha
>Reporter: Todd Lipcon
>
> We saw the following issue in a cluster:
> - The 2NN downloads an edit log segment:
> {code}
> 2012-10-29 12:30:57,433 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading /xxx/current/edits_00049136809-00049176162 
> expecting start txid #49136809
> {code}
> - It fails in the middle of replay due to an OOME:
> {code}
> 2012-10-29 12:31:21,021 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation AddOp [length=0, path=/
> java.lang.OutOfMemoryError: Java heap space
> {code}
> - Future checkpoints then fail because the prior edit log replay only got 
> halfway through the stream:
> {code}
> 2012-10-29 12:32:21,214 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading /x/current/edits_00049176163-00049177224 
> expecting start txid #49144432
> 2012-10-29 12:32:21,216 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in 
> doCheckpoint
> java.io.IOException: There appears to be a gap in the edit log.  We expected 
> txid 49144432, but got txid 49176163.
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4129) Add utility methods to dump NameNode in memory tree for testing

2012-10-30 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-4129:
-

Attachment: h4129_20121030.patch

h4129_20121030.patch: add some format checks.

> Add utility methods to dump NameNode in memory tree for testing
> ---
>
> Key: HDFS-4129
> URL: https://issues.apache.org/jira/browse/HDFS-4129
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Attachments: h4129_20121029b.patch, h4129_20121029.patch, 
> h4129_20121030.patch
>
>
> The output of the utility methods looks like below.
> {noformat}
> \- foo   (INodeDirectory)
>   \- sub1   (INodeDirectory)
> +- file1   (INodeFile)
> +- file2   (INodeFile)
> +- sub11   (INodeDirectory)
>   \- file3   (INodeFile)
> \- z_file4   (INodeFile)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-3923) libwebhdfs testing code cleanup

2012-10-30 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-3923.
---

   Resolution: Fixed
Fix Version/s: 2.0.3-alpha
   3.0.0
 Hadoop Flags: Reviewed

I committed the patch to branch-2 as well.

Thank you Jing for the patch. Thank you Colin and Andy for the review.

> libwebhdfs testing code cleanup
> ---
>
> Key: HDFS-3923
> URL: https://issues.apache.org/jira/browse/HDFS-3923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-3923.001.patch, HDFS-3923.002.patch
>
>
> 1. Testing code cleanup for libwebhdfs
> 1.1 Tests should generate a test-specific filename and should use TMPDIR 
> appropriately.
> 2. Enabling automated testing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3923) libwebhdfs testing code cleanup

2012-10-30 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487358#comment-13487358
 ] 

Suresh Srinivas commented on HDFS-3923:
---

+1 for the patch. I committed it to trunk.

> libwebhdfs testing code cleanup
> ---
>
> Key: HDFS-3923
> URL: https://issues.apache.org/jira/browse/HDFS-3923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3923.001.patch, HDFS-3923.002.patch
>
>
> 1. Testing code cleanup for libwebhdfs
> 1.1 Tests should generate a test-specific filename and should use TMPDIR 
> appropriately.
> 2. Enabling automated testing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7

2012-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487304#comment-13487304
 ] 

Hadoop QA commented on HDFS-3804:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551412/HDFS-3804-3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3430//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3430//console

This message is automatically generated.

> TestHftpFileSystem fails intermittently with JDK7
> -
>
> Key: HDFS-3804
> URL: https://issues.apache.org/jira/browse/HDFS-3804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
> Environment: Apache Maven 3.0.4
> Maven home: /usr/share/maven
> Java version: 1.7.0_04, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/jdk1.7.0_04/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.2.0-25-generic", arch: "amd64", family: "unix"
>Reporter: Trevor Robinson
>Assignee: Trevor Robinson
>  Labels: java7
> Attachments: HDFS-3804-2.patch, HDFS-3804-3.patch, HDFS-3804.patch, 
> HDFS-3804.patch
>
>
> For example:
>   testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
>   testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
> This test case sets up a filesystem that is used by the first half of the 
> test methods (in declaration order), but the second half of the tests start 
> by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an 
> arbitrary order, so if any first-half method runs after any second-half 
> method, it fails.
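
The usual fix is to make every test method self-contained instead of relying on 
declaration order, e.g. (JUnit 4 sketch; the field names stand in for the 
test's existing fields):

{code}
import java.io.IOException;
import org.junit.Before;
import org.apache.hadoop.fs.FileSystem;

// Sketch: re-open the filesystem before every test so that an earlier
// FileSystem.closeAll() from another method cannot break it.
@Before
public void setUp() throws IOException {
  FileSystem.closeAll();
  hftpFs = FileSystem.get(hftpUri, config);  // hftpUri/config: existing test fields
}
{code}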

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1331) dfs -test should work like /bin/test

2012-10-30 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-1331:


Attachment: hdfs1331-2.txt

New patch:

Update usage message, remove debug println.

> dfs -test should work like /bin/test
> 
>
> Key: HDFS-1331
> URL: https://issues.apache.org/jira/browse/HDFS-1331
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.20.2, 3.0.0, 2.0.2-alpha
>Reporter: Allen Wittenauer
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hdfs1331-2.txt, hdfs1331.txt, 
> hdfs1331-with-hadoop8994.txt
>
>
> hadoop dfs -test doesn't act like its shell equivalent, making it difficult 
> to actually use if you are used to the real test command:
> hadoop:
> $ hadoop dfs -test -d /nonexist; echo $?
> test: File does not exist: /nonexist
> 255
> shell:
> $ test -d /nonexist; echo $?
> 1
> a) Why is it spitting out a message? Even so, why is it saying file instead 
> of directory when I used -d?
> b) Why is the return code 255? I realize this is documented as '0' if true.  
> But docs basically say the value is undefined if it isn't.
> c) where is -f?
> d) Why is empty -z instead of -s ?  Was it a misunderstanding of the man page?
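
For comparison, the /bin/test-like behavior being asked for boils down to this 
(a sketch using the public FileSystem API):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TestDir {
  // Like `test -d`: exit 0 if the path is a directory, 1 otherwise,
  // and print nothing in either case.
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path(args[0]);
    boolean isDir = fs.exists(p) && fs.getFileStatus(p).isDirectory();
    System.exit(isDir ? 0 : 1);
  }
}
{code}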

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3990) NN's health report has severe performance problems

2012-10-30 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487247#comment-13487247
 ] 

Eli Collins commented on HDFS-3990:
---

I missed that you switched to a List because we're conditionally adding items, 
so it's hard to use an ImmutableList. I think using a List is better than the 
latest patch, where you convert the List to an array, so +1 to the Oct 22nd 
patch.

> NN's health report has severe performance problems
> --
>
> Key: HDFS-3990
> URL: https://issues.apache.org/jira/browse/HDFS-3990
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-3990.branch-0.23.patch, 
> HDFS-3990.branch-0.23.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, hdfs-3990.txt, hdfs-3990.txt
>
>
> The dfshealth page will place a read lock on the namespace while it does a 
> dns lookup for every DN.  On a multi-thousand node cluster, this often 
> results in 10s+ load time for the health page.  10 concurrent requests were 
> found to cause 7m+ load times during which time write operations blocked.
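
The direction of the fix is to copy the DN list while holding the lock and do 
the slow DNS work after releasing it. A sketch (helper names are illustrative):

{code}
// Sketch: hold the namesystem read lock only long enough to copy the list;
// per-DN DNS lookups then run without blocking namespace operations.
List<DatanodeDescriptor> live;
namesystem.readLock();
try {
  live = new ArrayList<DatanodeDescriptor>(getLiveDatanodes());  // cheap copy
} finally {
  namesystem.readUnlock();
}
for (DatanodeDescriptor dn : live) {
  resolveHostName(dn);  // slow DNS lookup, now outside the lock
}
{code}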

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4131) Add a tool to print the diff between two snapshots and diff of a snapshot from the current tree

2012-10-30 Thread Suresh Srinivas (JIRA)
Suresh Srinivas created HDFS-4131:
-

 Summary: Add a tool to print the diff between two snapshots and 
diff of a snapshot from the current tree
 Key: HDFS-4131
 URL: https://issues.apache.org/jira/browse/HDFS-4131
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: Snapshot (HDFS-2802)
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas


This jira tracks a tool to print the diff between two snapshots at a given 
path. The tool will also print the difference between the current directory 
and a given snapshot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4105) the SPNEGO user for secondary namenode should use the web keytab

2012-10-30 Thread Arpit Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487237#comment-13487237
 ] 

Arpit Gupta commented on HDFS-4105:
---

Patched a secure Hadoop 1.1.0 deploy with the patch, and now the secondary 
namenode is able to log in.

Question: if the HTTP principal fails to log in, should we not stop the 
secondary namenode server? I think we should, as the image calls would fail if 
the HTTP principal is not available. Let me know and I can log a different 
jira for it.

> the SPNEGO user for secondary namenode should use the web keytab
> 
>
> Key: HDFS-4105
> URL: https://issues.apache.org/jira/browse/HDFS-4105
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.1.0, 2.0.2-alpha
>Reporter: Arpit Gupta
>Assignee: Arpit Gupta
> Attachments: HDFS-4105.branch-1.patch, HDFS-4105.patch
>
>
> This is similar to HDFS-3466 where we made sure the namenode checks for the 
> web keytab before it uses the namenode keytab.
> The same needs to be done for the secondary namenode as well.
> {code}
> String httpKeytab = 
>   conf.get(DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY);
> if (httpKeytab != null && !httpKeytab.isEmpty()) {
>   params.put("kerberos.keytab", httpKeytab);
> }
> {code}
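
Mirroring the HDFS-3466 change, the fix presumably looks something like this 
(a sketch; DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY is the existing SPNEGO 
keytab key in DFSConfigKeys):

{code}
// Prefer the SPNEGO/web keytab; fall back to the secondary namenode keytab
// only when the web keytab is not configured.
String httpKeytab = conf.get(
    DFSConfigKeys.DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY);
if (httpKeytab == null || httpKeytab.isEmpty()) {
  httpKeytab = conf.get(DFSConfigKeys.DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY);
}
if (httpKeytab != null && !httpKeytab.isEmpty()) {
  params.put("kerberos.keytab", httpKeytab);
}
{code}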

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487233#comment-13487233
 ] 

Hudson commented on HDFS-3809:
--

Integrated in Hadoop-trunk-Commit #2944 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/2944/])
Moved HDFS-3809 entry in CHANGES.txt from trunk to 2.0.3-alpha section 
(Revision 1403769)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403769
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, 
> HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking backward compatibility or requiring new parsing 
> code to be written. For this reason, we should use them in BKJM also.
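
For illustration, the compatibility property being relied on, shown with a 
hypothetical generated message class (LedgerMetadataProto and its field names 
are made up; parseFrom/hasX/getX are the standard generated protobuf Java API):

{code}
// Bytes written before a new optional field existed still parse cleanly;
// the reader just sees the field as absent and uses the old default.
byte[] oldBytes = readZNode(zkPath);  // payload from an old writer (hypothetical helper)
LedgerMetadataProto m = LedgerMetadataProto.parseFrom(oldBytes);
if (m.hasLastTxId()) {                // optional field added in a later version
  lastTxId = m.getLastTxId();
}
{code}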

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487232#comment-13487232
 ] 

Hudson commented on HDFS-3789:
--

Integrated in Hadoop-trunk-Commit #2944 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/2944/])
Moved HDFS-3789 entry in CHANGES.txt from trunk to 2.0.3-alpha section 
(Revision 1403765)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403765
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> JournalManager#format() should be able to throw IOException
> ---
>
> Key: HDFS-3789
> URL: https://issues.apache.org/jira/browse/HDFS-3789
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0003-HDFS-3789-for-branch-2.patch, HDFS-3789.diff
>
>
> Currently JournalManager#format cannot throw any exception. As format can 
> fail, we should be able to propagate this failure upwards. Otherwise, format 
> will fail silently, and the admin will start using the cluster with a 
> failed/unusable journal manager.
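
I.e., the interface change is simply to let the failure surface (sketch):

{code}
// Before: void format(NamespaceInfo ns);           // failures could only be swallowed
// After:
void format(NamespaceInfo ns) throws IOException;   // callers now see a failed format
{code}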

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3923) libwebhdfs testing code cleanup

2012-10-30 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487231#comment-13487231
 ] 

Andy Isaacson commented on HDFS-3923:
-

The .002.patch looks good to me.  In the spirit of continuous improvement let's 
get this checked in ASAP so we can continue to make progress on getting 
libwebhdfs to a reliable state.

> libwebhdfs testing code cleanup
> ---
>
> Key: HDFS-3923
> URL: https://issues.apache.org/jira/browse/HDFS-3923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3923.001.patch, HDFS-3923.002.patch
>
>
> 1. Testing code cleanup for libwebhdfs
> 1.1 Tests should generate a test-specific filename and should use TMPDIR 
> appropriately.
> 2. Enabling automated testing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3923) libwebhdfs testing code cleanup

2012-10-30 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487227#comment-13487227
 ] 

Andy Isaacson commented on HDFS-3923:
-

Sorry, user error on my part -- the files are different, I managed to 
mistakenly wget the same file twice.

> libwebhdfs testing code cleanup
> ---
>
> Key: HDFS-3923
> URL: https://issues.apache.org/jira/browse/HDFS-3923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3923.001.patch, HDFS-3923.002.patch
>
>
> 1. Testing code cleanup for libwebhdfs
> 1.1 Tests should generate a test-specific filename and should use TMPDIR 
> appropriately.
> 2. Enabling automated testing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3923) libwebhdfs testing code cleanup

2012-10-30 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487223#comment-13487223
 ] 

Andy Isaacson commented on HDFS-3923:
-

The new .002.patch seems to be the same file as the old .001.patch.

> libwebhdfs testing code cleanup
> ---
>
> Key: HDFS-3923
> URL: https://issues.apache.org/jira/browse/HDFS-3923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3923.001.patch, HDFS-3923.002.patch
>
>
> 1. Testing code cleanup for libwebhdfs
> 1.1 Tests should generate a test-specific filename and should use TMPDIR 
> appropriately.
> 2. Enabling automated testing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4104) dfs -test -d prints inappropriate error on nonexistent directory

2012-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487218#comment-13487218
 ] 

Hadoop QA commented on HDFS-4104:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551406/hdfs4104-3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3429//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3429//console

This message is automatically generated.

> dfs -test -d prints inappropriate error on nonexistent directory
> 
>
> Key: HDFS-4104
> URL: https://issues.apache.org/jira/browse/HDFS-4104
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hdfs4104-2.txt, hdfs4104-3.txt, hdfs-4104.txt
>
>
> Running {{hdfs dfs -test -d foo}} should return 0 or 1 as appropriate. It 
> should not generate any output due to missing files.  Alas, it prints an 
> error message when {{foo}} does not exist.
> {code}
> $ hdfs dfs -test -d foo; echo $?
> test: `foo': No such file or directory
> 1
> {code}
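
As an illustration of the intended /bin/test-like behavior, here is a hedged 
standalone sketch using the public FileSystem API -- not the actual FsShell 
change in the attached patches; the class name is hypothetical.

{code}
// Sketch: report 0/1 via the exit code for a -d style check and stay
// silent on nonexistent paths, mirroring /bin/test semantics.
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SilentTestDir {
    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.exit(2);   // usage error, distinct from test failure
        }
        FileSystem fs = FileSystem.get(new Configuration());
        int rc = 1;
        try {
            FileStatus st = fs.getFileStatus(new Path(args[0]));
            rc = st.isDirectory() ? 0 : 1;   // -d: directory check
        } catch (FileNotFoundException e) {
            rc = 1;   // missing path: fail quietly, no error message
        }
        System.exit(rc);
    }
}
{code}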

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-3809:
-

Attachment: 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch

Ah, I had generated it from what had been committed, so that's why it was 
there. Removed now.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, 
> HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-3822) TestWebHDFS fails intermittently with NullPointerException

2012-10-30 Thread Trevor Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Robinson resolved HDFS-3822.
---

  Resolution: Duplicate
Release Note: Yes, this probably is a duplicate of HDFS-3664. The stack 
traces look the same, and I haven't seen it occur since that issue was fixed.

> TestWebHDFS fails intermittently with NullPointerException
> --
>
> Key: HDFS-3822
> URL: https://issues.apache.org/jira/browse/HDFS-3822
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.2-alpha
> Environment: Apache Maven 3.0.4
> Maven home: /usr/share/maven
> Java version: 1.6.0_24, vendor: Sun Microsystems Inc.
> Java home: /usr/lib/jvm/java-6-openjdk-amd64/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.2.0-25-generic", arch: "amd64", family: "unix"
>Reporter: Trevor Robinson
>  Labels: test-fail
> Attachments: org.apache.hadoop.hdfs.web.TestWebHDFS-output.txt, 
> org.apache.hadoop.hdfs.web.TestWebHDFS.txt
>
>
> I've hit this test failure a few times in trunk:
> {noformat}
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 58.835 sec 
> <<< FAILURE!
> testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS)  Time elapsed: 
> 52.105 sec  <<< FAILURE!
> java.lang.AssertionError: There are 1 exception(s):
>   Exception 0: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): 
> java.lang.NullPointerExcept
> ion
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.getBlockCollection(BlocksMap.java:101)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getBlockCollection(BlockManager.java:2926)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isValidBlock(FSNamesystem.java:4474)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:2439)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2200)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtoco
> lServerSideTranslatorPB.java:295)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMetho
> d(ClientNamenodeProtocolProtos.java:43388)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:473)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
> {noformat}
> It appears that {{close}} has been called on the {{BlocksMap}} before 
> {{getBlockCollection}}.
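
For readers unfamiliar with this failure mode, a hedged toy reconstruction of 
the race described above; the names are hypothetical, not the real BlocksMap 
internals.

{code}
// Toy race: close() nulls the internal map while an in-flight RPC
// handler is still reading it, producing the NPE in the trace above.
import java.util.HashMap;
import java.util.Map;

class BlocksMapSketch {
    private volatile Map<Long, Object> blocks = new HashMap<Long, Object>();

    Object getBlockCollection(long blockId) {
        Map<Long, Object> m = blocks;   // snapshot the reference first
        return m == null ? null : m.get(blockId);  // guard avoids the NPE
    }

    void close() {
        blocks = null;   // without the guard above, racing readers NPE
    }
}
{code}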

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-3809:
--

  Resolution: Fixed
Target Version/s: 2.0.1-alpha, 3.0.0  (was: 3.0.0, 2.0.1-alpha)
  Status: Resolved  (was: Patch Available)

{noformat}
[INFO] 
[INFO] Apache Hadoop Main ................................ SUCCESS [1.784s]
[INFO] Apache Hadoop Project POM ......................... SUCCESS [1.483s]
[INFO] Apache Hadoop Annotations ......................... SUCCESS [1.712s]
[INFO] Apache Hadoop Project Dist POM .................... SUCCESS [0.520s]
[INFO] Apache Hadoop Assemblies .......................... SUCCESS [0.204s]
[INFO] Apache Hadoop Auth ................................ SUCCESS [2.090s]
[INFO] Apache Hadoop Auth Examples ....................... SUCCESS [0.957s]
[INFO] Apache Hadoop Common .............................. SUCCESS [47.699s]
[INFO] Apache Hadoop Common Project ...................... SUCCESS [0.083s]
[INFO] Apache Hadoop HDFS ................................ SUCCESS [1:03.694s]
[INFO] Apache Hadoop HttpFS .............................. SUCCESS [8.764s]
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS [5.288s]
[INFO] Apache Hadoop HDFS Project ........................ SUCCESS [0.137s]
{noformat}

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, 
> HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3960) Snapshot of Being Written Files

2012-10-30 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487171#comment-13487171
 ] 

Sanjay Radia commented on HDFS-3960:


I assumed that it was an optional parameter. 
In append-2 we avoided the extra call to the NN to reduce NN load and reduce 
latency (see the append-2 design doc). BTW, in append-2 new readers always 
see the hflushed data correctly -- a new reader determines the length of a 
file being written by contacting the DN.

> Snapshot of Being Written Files
> ---
>
> Key: HDFS-3960
> URL: https://issues.apache.org/jira/browse/HDFS-3960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> Here is a design question: Suppose there is a being written file when a 
> snapshot is being taken.  What should the length of the file be shown in the 
> snapshot?  In other words, how to determine the length of being written file 
> when a snapshot is being taken?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487160#comment-13487160
 ] 

Uma Maheswara Rao G commented on HDFS-3809:
---

OK, adding '+package hadoop.hdfs;' fixed the problem. I will commit the new 
patch. Generally we do not keep CHANGES.txt in patches -- do you mind removing 
it? For me it is fine to commit the above patch; I am asking just to keep the 
patches in JIRA consistent.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, 
> HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7

2012-10-30 Thread Trevor Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Robinson updated HDFS-3804:
--

Attachment: HDFS-3804-3.patch

Works for me, though I'd also make {{hdfs}} non-static.

> TestHftpFileSystem fails intermittently with JDK7
> -
>
> Key: HDFS-3804
> URL: https://issues.apache.org/jira/browse/HDFS-3804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
> Environment: Apache Maven 3.0.4
> Maven home: /usr/share/maven
> Java version: 1.7.0_04, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/jdk1.7.0_04/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.2.0-25-generic", arch: "amd64", family: "unix"
>Reporter: Trevor Robinson
>Assignee: Trevor Robinson
>  Labels: java7
> Attachments: HDFS-3804-2.patch, HDFS-3804-3.patch, HDFS-3804.patch, 
> HDFS-3804.patch
>
>
> For example:
>   testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
>   testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
> This test case sets up a filesystem that is used by the first half of the 
> test methods (in declaration order), but the second half of the tests start 
> by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an 
> arbitrary order, so if any first half methods run after any second half 
> methods, they fail.
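
A hedged sketch of the order-independent pattern being discussed -- per-test 
setup instead of shared static state -- assuming JUnit 4; this is not the 
attached patch, and the class name is illustrative.

{code}
// Sketch: each test gets its own FileSystem, so arbitrary method
// ordering under JDK7 can never hand a test a closed handle.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class OrderIndependentTest {
    private FileSystem fs;   // non-static: fresh instance per test

    @Before
    public void setUp() throws Exception {
        fs = FileSystem.get(new Configuration());
    }

    @After
    public void tearDown() throws Exception {
        FileSystem.closeAll();   // safe: the next test recreates its own
    }

    @Test
    public void testSomething() throws Exception {
        // relies only on state created in setUp()
    }
}
{code}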

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7

2012-10-30 Thread Trevor Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Robinson updated HDFS-3804:
--

Status: Patch Available  (was: Open)

> TestHftpFileSystem fails intermittently with JDK7
> -
>
> Key: HDFS-3804
> URL: https://issues.apache.org/jira/browse/HDFS-3804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
> Environment: Apache Maven 3.0.4
> Maven home: /usr/share/maven
> Java version: 1.7.0_04, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/jdk1.7.0_04/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.2.0-25-generic", arch: "amd64", family: "unix"
>Reporter: Trevor Robinson
>Assignee: Trevor Robinson
>  Labels: java7
> Attachments: HDFS-3804-2.patch, HDFS-3804-3.patch, HDFS-3804.patch, 
> HDFS-3804.patch
>
>
> For example:
>   testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
>   testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
> This test case sets up a filesystem that is used by the first half of the 
> test methods (in declaration order), but the second half of the tests start 
> by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an 
> arbitrary order, so if any first half methods run after any second half 
> methods, they fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3960) Snapshot of Being Written Files

2012-10-30 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487155#comment-13487155
 ] 

Todd Lipcon commented on HDFS-3960:
---

We can't pass the length (i.e., call fsync()) on every hflush. That would be 
too expensive -- e.g., a typical region server under load calls hflush ~300 
times per second, so on a cluster of even 100 HBase nodes that would mean 30k 
RPC/sec to the NN, which could easily overwhelm it. Hence the proposal to add 
a flag so that this is only done in the specific places where the client 
needs extra durability.

> Snapshot of Being Written Files
> ---
>
> Key: HDFS-3960
> URL: https://issues.apache.org/jira/browse/HDFS-3960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> Here is a design question: Suppose there is a being written file when a 
> snapshot is being taken.  What should the length of the file be shown in the 
> snapshot?  In other words, how to determine the length of being written file 
> when a snapshot is being taken?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3960) Snapshot of Being Written Files

2012-10-30 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487141#comment-13487141
 ] 

Sanjay Radia commented on HDFS-3960:


bq.  Would it be useful to pass file length from the client, when it calls 
namenode.fsync()?

Good idea, Konstantin. 
 


> Snapshot of Being Written Files
> ---
>
> Key: HDFS-3960
> URL: https://issues.apache.org/jira/browse/HDFS-3960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> Here is a design question: Suppose there is a being written file when a 
> snapshot is being taken.  What should the length of the file be shown in the 
> snapshot?  In other words, how to determine the length of being written file 
> when a snapshot is being taken?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487137#comment-13487137
 ] 

Ivan Kelly commented on HDFS-3809:
--

HDFS-4121 caused this. New patch attached.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, 
> HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-3809:
-

Target Version/s: 2.0.1-alpha, 3.0.0  (was: 3.0.0, 2.0.1-alpha)
  Status: Patch Available  (was: Reopened)

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, 
> HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-3809:
-

Attachment: 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 
> 0001-HDFS-3809.-Make-BKJM-use-protobufs-for-all-serializa.patch, 
> 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, HDFS-3809.diff, 
> HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3990) NN's health report has severe performance problems

2012-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487135#comment-13487135
 ] 

Hadoop QA commented on HDFS-3990:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551382/HDFS-3990.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3428//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3428//console

This message is automatically generated.

> NN's health report has severe performance problems
> --
>
> Key: HDFS-3990
> URL: https://issues.apache.org/jira/browse/HDFS-3990
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-3990.branch-0.23.patch, 
> HDFS-3990.branch-0.23.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, hdfs-3990.txt, hdfs-3990.txt
>
>
> The dfshealth page will place a read lock on the namespace while it does a 
> dns lookup for every DN.  On a multi-thousand node cluster, this often 
> results in 10s+ load time for the health page.  10 concurrent requests were 
> found to cause 7m+ load times during which time write operations blocked.
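
A hedged sketch of the general remedy for this class of problem -- snapshot 
the node list under the lock, then resolve DNS with the lock released -- with 
hypothetical names; it is not the attached patch.

{code}
// Pattern: copy cheap state under the namespace lock, then do the
// slow per-DN DNS lookups with no lock held, so writers never block.
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class HealthReportSketch {
    private final ReentrantReadWriteLock nsLock = new ReentrantReadWriteLock();
    private final List<String> datanodeAddrs = new ArrayList<String>();

    List<String> buildReport() {
        List<String> snapshot;
        nsLock.readLock().lock();
        try {
            snapshot = new ArrayList<String>(datanodeAddrs);  // cheap copy
        } finally {
            nsLock.readLock().unlock();
        }
        List<String> report = new ArrayList<String>(snapshot.size());
        for (String addr : snapshot) {   // slow DNS, lock released
            try {
                report.add(InetAddress.getByName(addr).getCanonicalHostName());
            } catch (UnknownHostException e) {
                report.add(addr);   // fall back to the raw address
            }
        }
        return report;
    }
}
{code}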

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4104) dfs -test -d prints inappropriate error on nonexistent directory

2012-10-30 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-4104:


Attachment: hdfs4104-3.txt

Update TestHDFSCLI for new -test output semantics.

> dfs -test -d prints inappropriate error on nonexistent directory
> 
>
> Key: HDFS-4104
> URL: https://issues.apache.org/jira/browse/HDFS-4104
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hdfs4104-2.txt, hdfs4104-3.txt, hdfs-4104.txt
>
>
> Running {{hdfs dfs -test -d foo}} should return 0 or 1 as appropriate. It 
> should not generate any output due to missing files.  Alas, it prints an 
> error message when {{foo}} does not exist.
> {code}
> $ hdfs dfs -test -d foo; echo $?
> test: `foo': No such file or directory
> 1
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487100#comment-13487100
 ] 

Uma Maheswara Rao G commented on HDFS-3809:
---

Yes, Robert, I just compared them and they both look almost identical. As you 
said, there must be some issue in the build scripts; let me or Ivan dig into it.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, 
> HDFS-3809.diff, HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487094#comment-13487094
 ] 

Robert Joseph Evans commented on HDFS-3809:
---

Thanks for doing that, Uma. It looks like something in the build scripts is 
causing it, because hdfs.proto, where NameSpaceInfoProto is defined, is more 
or less identical between trunk and branch-2.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, 
> HDFS-3809.diff, HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3960) Snapshot of Being Written Files

2012-10-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487085#comment-13487085
 ] 

Konstantin Shvachko commented on HDFS-3960:
---

More strongly, I consider it a bug that we do not update metadata in the NN 
when we hflush, regardless of snapshots. This probably addresses prior 
discussions about visible length in other JIRAs.

> Snapshot of Being Written Files
> ---
>
> Key: HDFS-3960
> URL: https://issues.apache.org/jira/browse/HDFS-3960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> Here is a design question: Suppose there is a being written file when a 
> snapshot is being taken.  What should the length of the file be shown in the 
> snapshot?  In other words, how to determine the length of being written file 
> when a snapshot is being taken?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487075#comment-13487075
 ] 

Uma Maheswara Rao G commented on HDFS-3809:
---

Thanks, Robert. I will check this for branch-2; since there were only offset 
changes from the trunk patch, I applied it directly. To unblock others, I 
have reverted it for now.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, 
> HDFS-3809.diff, HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3923) libwebhdfs testing code cleanup

2012-10-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-3923:


Attachment: HDFS-3923.002.patch

Updated the patch.

> libwebhdfs testing code cleanup
> ---
>
> Key: HDFS-3923
> URL: https://issues.apache.org/jira/browse/HDFS-3923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3923.001.patch, HDFS-3923.002.patch
>
>
> 1. Testing code cleanup for libwebhdfs
> 1.1 Tests should generate a test-specific filename and should use TMPDIR 
> appropriately.
> 2. Enabling automated testing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3960) Snapshot of Being Written Files

2012-10-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487060#comment-13487060
 ] 

Aaron T. Myers commented on HDFS-3960:
--

bq. The document proposes to add new functionality as far as I understand. I 
just propose to add a parameter to fsync(). Because sync can and should adjust 
file length, just like close. The client is the only authority that knows the 
exact file length.

I think we're agreeing here, just using different terms. The document is 
presently discussing how the DFSOutputStream API would be changed to add an 
optional parameter to hflush() and hsync() which would cause the client to 
update the length on the NN. What you're describing is how that would be 
implemented: by changing the ClientProtocol#fsync method to also take a length 
parameter. I'll update the document to make this clear.

bq. I don't think files that were not hflush-ed should be excluded from 
snapshots as the document proposes. Everything should be versioned.

That would be fine too. Excluding them seemed simpler, but I don't feel 
strongly about it. The length of that file in the snapshot would then be at the 
last block boundary before the snapshot was taken. I'll update the design 
document to reflect that.

Thanks a lot for the comments, Konstantin, and for taking a look at the doc. 
Your feedback is very valuable.
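
For concreteness, a sketch of the kind of signature change under discussion; 
the parameter name and javadoc are illustrative assumptions, not the 
committed ClientProtocol API.

{code}
// Illustrative only: an fsync that also carries the client's count of
// bytes in the last block, letting the NN record an exact length on
// hflush/hsync without extra RPCs elsewhere.
public interface ClientProtocolSketch {
    /**
     * @param src             the file being written
     * @param clientName      the lease holder
     * @param lastBlockLength bytes the client has written into the last
     *                        block, or -1 to keep today's behavior
     */
    void fsync(String src, String clientName, long lastBlockLength)
        throws java.io.IOException;
}
{code}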

> Snapshot of Being Written Files
> ---
>
> Key: HDFS-3960
> URL: https://issues.apache.org/jira/browse/HDFS-3960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> Here is a design question: Suppose there is a being written file when a 
> snapshot is being taken.  What should the length of the file be shown in the 
> snapshot?  In other words, how to determine the length of being written file 
> when a snapshot is being taken?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans reopened HDFS-3809:
---


Branch-2 is failing with 
{noformat}
main:
 [exec] bkjournal.proto:30:12: "NamespaceInfoProto" is not defined.
{noformat}

after this was merged in.  Please either fix it or revert the change.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, 
> HDFS-3809.diff, HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3990) NN's health report has severe performance problems

2012-10-30 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3990:
--

Attachment: HDFS-3990.patch

> NN's health report has severe performance problems
> --
>
> Key: HDFS-3990
> URL: https://issues.apache.org/jira/browse/HDFS-3990
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-3990.branch-0.23.patch, 
> HDFS-3990.branch-0.23.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, hdfs-3990.txt, hdfs-3990.txt
>
>
> The dfshealth page will place a read lock on the namespace while it does a 
> dns lookup for every DN.  On a multi-thousand node cluster, this often 
> results in 10s+ load time for the health page.  10 concurrent requests were 
> found to cause 7m+ load times during which time write operations blocked.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3990) NN's health report has severe performance problems

2012-10-30 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3990:
--

Status: Patch Available  (was: Open)

I reverted the method signature to return an array instead of a list.

> NN's health report has severe performance problems
> --
>
> Key: HDFS-3990
> URL: https://issues.apache.org/jira/browse/HDFS-3990
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-3990.branch-0.23.patch, 
> HDFS-3990.branch-0.23.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, hdfs-3990.txt, hdfs-3990.txt
>
>
> The dfshealth page will place a read lock on the namespace while it does a 
> dns lookup for every DN.  On a multi-thousand node cluster, this often 
> results in 10s+ load time for the health page.  10 concurrent requests were 
> found to cause 7m+ load times during which time write operations blocked.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3990) NN's health report has severe performance problems

2012-10-30 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3990:
--

Attachment: HDFS-3990.branch-0.23.patch

> NN's health report has severe performance problems
> --
>
> Key: HDFS-3990
> URL: https://issues.apache.org/jira/browse/HDFS-3990
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-3990.branch-0.23.patch, 
> HDFS-3990.branch-0.23.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, hdfs-3990.txt, hdfs-3990.txt
>
>
> The dfshealth page will place a read lock on the namespace while it does a 
> dns lookup for every DN.  On a multi-thousand node cluster, this often 
> results in 10s+ load time for the health page.  10 concurrent requests were 
> found to cause 7m+ load times during which time write operations blocked.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3960) Snapshot of Being Written Files

2012-10-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487033#comment-13487033
 ] 

Konstantin Shvachko commented on HDFS-3960:
---

The document proposes to add new functionality, as far as I understand. I 
just propose to add a parameter to fsync(), because sync can and should 
adjust the file length, just like close. The client is the only authority 
that knows the exact file length.
That way snapshots can rely on the length of a file held by the NameNode 
instead of obtaining it from the DataNodes.
I don't think files that were not hflush-ed should be excluded from snapshots, 
as the document proposes. Everything should be versioned.

> Snapshot of Being Written Files
> ---
>
> Key: HDFS-3960
> URL: https://issues.apache.org/jira/browse/HDFS-3960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> Here is a design question: Suppose there is a being written file when a 
> snapshot is being taken.  What should the length of the file be shown in the 
> snapshot?  In other words, how to determine the length of being written file 
> when a snapshot is being taken?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3990) NN's health report has severe performance problems

2012-10-30 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3990:
--

Status: Open  (was: Patch Available)

> NN's health report has severe performance problems
> --
>
> Key: HDFS-3990
> URL: https://issues.apache.org/jira/browse/HDFS-3990
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
> Attachments: HDFS-3990.branch-0.23.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, HDFS-3990.patch, 
> HDFS-3990.patch, HDFS-3990.patch, hdfs-3990.txt, hdfs-3990.txt
>
>
> The dfshealth page will place a read lock on the namespace while it does a 
> dns lookup for every DN.  On a multi-thousand node cluster, this often 
> results in 10s+ load time for the health page.  10 concurrent requests were 
> found to cause 7m+ load times during which time write operations blocked.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3810) Implement format() for BKJM

2012-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487027#comment-13487027
 ] 

Hadoop QA commented on HDFS-3810:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551379/HDFS-3810.diff
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3427//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3427//console

This message is automatically generated.

> Implement format() for BKJM
> ---
>
> Key: HDFS-3810
> URL: https://issues.apache.org/jira/browse/HDFS-3810
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-3810.diff, HDFS-3810.diff, HDFS-3810.diff
>
>
> At the moment, formatting for BKJM is done on initialization. Reinitializing 
> is a manual process. This JIRA is to implement the JournalManager#format API, 
> so that BKJM can be formatted along with all other storage methods.
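
A hedged outline of what plugging BKJM into a pluggable format() call might 
look like; the interface shape and the numbered steps are assumptions for 
illustration, not the exact HDFS API or the attached patch.

{code}
// Sketch: a journal manager whose format() resets its ZooKeeper-side
// metadata, so "namenode -format" can cover BKJM like other storage.
import java.io.IOException;

interface FormattableJournalSketch {
    void format(int namespaceId) throws IOException;
}

class BookKeeperJournalSketch implements FormattableJournalSketch {
    @Override
    public void format(int namespaceId) throws IOException {
        // 1. delete any existing ledger metadata under the ZK root
        // 2. write a fresh version znode stamped with namespaceId
        // 3. leave the manager ready to open its first log segment
    }
}
{code}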

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-10-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487023#comment-13487023
 ] 

Konstantin Shvachko commented on HDFS-2802:
---

By rolling versions on demand I just mean there is an API to create a new 
version of an object while keeping the old one.
My feeling is that thinking explicitly about versions, rather than snapshots, 
in the context of this design will simplify things and make the APIs more 
understandable for users -- especially since the design is already based on 
versioning, and the document has an implementation proposal for that.

> Support for RW/RO snapshots in HDFS
> ---
>
> Key: HDFS-2802
> URL: https://issues.apache.org/jira/browse/HDFS-2802
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node, name-node
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFSSnapshotsDesign.pdf, snap.patch, 
> snapshot-one-pager.pdf, Snapshots20121018.pdf
>
>
> Snapshots are point in time images of parts of the filesystem or the entire 
> filesystem. Snapshots can be a read-only or a read-write point in time copy 
> of the filesystem. There are several use cases for snapshots in HDFS. I will 
> post a detailed write-up soon with with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3810) Implement format() for BKJM

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-3810:
--

Attachment: HDFS-3810.diff

+1 for the patch. Re-attaching it for a clean Jenkins report.

> Implement format() for BKJM
> ---
>
> Key: HDFS-3810
> URL: https://issues.apache.org/jira/browse/HDFS-3810
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-3810.diff, HDFS-3810.diff, HDFS-3810.diff
>
>
> At the moment, formatting for BKJM is done on initialization. Reinitializing 
> is a manual process. This JIRA is to implement the JournalManager#format API, 
> so that BKJM can be formatted along with all other storage methods.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-3809:
--

  Resolution: Fixed
Target Version/s: 2.0.1-alpha, 3.0.0  (was: 3.0.0, 2.0.1-alpha)
  Status: Resolved  (was: Patch Available)

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, 
> HDFS-3809.diff, HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486998#comment-13486998
 ] 

Uma Maheswara Rao G commented on HDFS-3809:
---

I have just updated CHANGES.txt for all. Marking it as closed.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, 
> HDFS-3809.diff, HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking bc or requiring new parsing code to be written. 
> For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3695) Genericize format() to non-file JournalManagers

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486981#comment-13486981
 ] 

Hudson commented on HDFS-3695:
--

Integrated in Hadoop-trunk-Commit #2943 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/2943/])
Moved HDFS-3695 entry in CHANGES.txt from trunk to 2.0.3-alpha section 
(Revision 1403748)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403748
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Genericize format() to non-file JournalManagers
> ---
>
> Key: HDFS-3695
> URL: https://issues.apache.org/jira/browse/HDFS-3695
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0002-HDFS-3695-for-branch-2.patch, hdfs-3695.txt, 
> hdfs-3695.txt, hdfs-3695.txt
>
>
> Currently, the "namenode -format" and "namenode -initializeSharedEdits" 
> commands do not understand how to do anything with non-file-based shared 
> storage. This affects both BookKeeperJournalManager and QuorumJournalManager.
> This JIRA is to plumb through the formatting of edits directories using 
> pluggable journal manager implementations so that no separate step needs to 
> be taken to format them -- the same commands will work for NFS-based storage 
> or one of the alternate implementations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3573) Supply NamespaceInfo when instantiating JournalManagers

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486979#comment-13486979
 ] 

Hudson commented on HDFS-3573:
--

Integrated in Hadoop-trunk-Commit #2943 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/2943/])
Moved HDFS-3573 entry in CHANGES.txt from trunk to 2.0.3-alpha section 
(Revision 1403740)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403740
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Supply NamespaceInfo when instantiating JournalManagers
> ---
>
> Key: HDFS-3573
> URL: https://issues.apache.org/jira/browse/HDFS-3573
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0001-HDFS-3573-for-branch-2.patch, hdfs-3573.txt, 
> hdfs-3573.txt, hdfs-3573.txt, hdfs-3573.txt
>
>
> Currently, the JournalManagers are instantiated before the NamespaceInfo is 
> loaded from local storage directories. This is problematic since the JM may 
> want to verify that the storage info associated with the journal matches the 
> NN which is starting up (eg to prevent an operator accidentally configuring 
> two clusters against the same remote journal storage). This JIRA rejiggers 
> the initialization sequence so that the JMs receive NamespaceInfo as a 
> constructor argument.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4013) TestHftpURLTimeouts throws NPE

2012-10-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486941#comment-13486941
 ] 

Hadoop QA commented on HDFS-4013:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551345/hdfs-4013.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3426//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3426//console

This message is automatically generated.

> TestHftpURLTimeouts throws NPE
> --
>
> Key: HDFS-4013
> URL: https://issues.apache.org/jira/browse/HDFS-4013
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
> Environment: java version "1.7.0_06-icedtea"
> OpenJDK Runtime Environment (fedora-2.3.1.fc17.2-i386)
> OpenJDK Client VM (build 23.2-b09, mixed mode)
>Reporter: Chao Shi
>Assignee: Chao Shi
>Priority: Trivial
> Attachments: hdfs-4013.patch, hdfs-4013.patch, hdfs-4013.patch
>
>
> The case fails at line 116, where the message is null. I guess this may be an 
> OpenJDK-specific behavior, but it would be nice to have it fixed, although 
> OpenJDK is not officially supported.
> FYI: the exception is thrown with a null message at java.net.SocksSocketImpl.
> {code}
> private static int remainingMillis(long deadlineMillis) throws IOException {
>     if (deadlineMillis == 0L)
>         return 0;
>     final long remaining = deadlineMillis - System.currentTimeMillis();
>     if (remaining > 0)
>         return (int) remaining;
>     throw new SocketTimeoutException();
> }
> {code}
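
A hedged sketch of a null-tolerant check for this failure mode -- 
illustrative only, not the attached fix:

{code}
// Guard against a null exception message, which OpenJDK (e.g. via
// java.net.SocksSocketImpl, as quoted above) can produce on timeout.
import java.net.SocketTimeoutException;

public class NullMessageCheck {
    static boolean isConnectTimeout(SocketTimeoutException e) {
        String msg = e.getMessage();   // may legitimately be null
        return msg == null || msg.contains("connect timed out");
    }

    public static void main(String[] args) {
        System.out.println(isConnectTimeout(new SocketTimeoutException()));
    }
}
{code}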

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7

2012-10-30 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3804:
--

Attachment: HDFS-3804.patch

Could you please test this minimal patch?

> TestHftpFileSystem fails intermittently with JDK7
> -
>
> Key: HDFS-3804
> URL: https://issues.apache.org/jira/browse/HDFS-3804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
> Environment: Apache Maven 3.0.4
> Maven home: /usr/share/maven
> Java version: 1.7.0_04, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/jdk1.7.0_04/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.2.0-25-generic", arch: "amd64", family: "unix"
>Reporter: Trevor Robinson
>Assignee: Trevor Robinson
>  Labels: java7
> Attachments: HDFS-3804-2.patch, HDFS-3804.patch, HDFS-3804.patch
>
>
> For example:
>   testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
>   testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
> This test case sets up a filesystem that is used by the first half of the 
> test methods (in declaration order), but the second half of the tests start 
> by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an 
> arbitrary order, so if any first half methods run after any second half 
> methods, they fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7

2012-10-30 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3804:
--

Status: Open  (was: Patch Available)

> TestHftpFileSystem fails intermittently with JDK7
> -
>
> Key: HDFS-3804
> URL: https://issues.apache.org/jira/browse/HDFS-3804
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
> Environment: Apache Maven 3.0.4
> Maven home: /usr/share/maven
> Java version: 1.7.0_04, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/jdk1.7.0_04/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.2.0-25-generic", arch: "amd64", family: "unix"
>Reporter: Trevor Robinson
>Assignee: Trevor Robinson
>  Labels: java7
> Attachments: HDFS-3804-2.patch, HDFS-3804.patch
>
>
> For example:
>   testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
>   testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem 
> closed
> This test case sets up a filesystem that is used by the first half of the 
> test methods (in declaration order), but the second half of the tests start 
> by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an 
> arbitrary order, so if any first half methods run after any second half 
> methods, they fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1331) dfs -test should work like /bin/test

2012-10-30 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486907#comment-13486907
 ] 

Daryn Sharp commented on HDFS-1331:
---

Looks good, but please update the usage and remove the spurious 
{{System.out.println("isFile = " ...}}

> dfs -test should work like /bin/test
> 
>
> Key: HDFS-1331
> URL: https://issues.apache.org/jira/browse/HDFS-1331
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.20.2, 3.0.0, 2.0.2-alpha
>Reporter: Allen Wittenauer
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hdfs1331.txt, hdfs1331-with-hadoop8994.txt
>
>
> hadoop dfs -test doesn't act like its shell equivalent, making it difficult 
> to actually use if you are used to the real test command:
> hadoop:
> $ hadoop dfs -test -d /nonexist; echo $?
> test: File does not exist: /nonexist
> 255
> shell:
> $ test -d /nonexist; echo $?
> 1
> a) Why is it spitting out a message? Even so, why is it saying "file" instead 
> of "directory" when I used -d?
> b) Why is the return code 255? I realize this is documented as '0' if true, 
> but the docs basically say the value is undefined if it isn't.
> c) Where is -f?
> d) Why is empty -z instead of -s? Was it a misunderstanding of the man page?
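In other words, the behaviour requested in the description above would look something like this (an illustrative sketch of the desired semantics, not the command's current output):

{code}
$ hadoop dfs -test -d /nonexist; echo $?
1
$ hadoop dfs -test -f /nonexist; echo $?
1
{code}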

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4056) Always start the NN's SecretManager

2012-10-30 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486894#comment-13486894
 ] 

Daryn Sharp commented on HDFS-4056:
---

A few clarifications: starting the daemon thread does not automatically alter 
the behavior of clients or tasks.  They will not request or use tokens, so the 
secret manager will sit dormant.  "Newer" clients can, however, request and use 
tokens, while "older" clients work the same as before.

With or without SASL PLAIN auth, HADOOP-8733 and HADOOP-8784 should not be 
reverted.  They both implement correct behavior in a more general fashion.

> Always start the NN's SecretManager
> ---
>
> Key: HDFS-4056
> URL: https://issues.apache.org/jira/browse/HDFS-4056
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-4056.patch
>
>
> To support the ability to use tokens regardless of whether kerberos is 
> enabled, the NN's secret manager should always be started.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4127) Log message is not correct in case of short of replica

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486885#comment-13486885
 ] 

Hudson commented on HDFS-4127:
--

Integrated in Hadoop-Mapreduce-trunk #1241 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1241/])
HDFS-4127. Log message is not correct in case of short of replica. 
Contributed by Junping Du. (Revision 1403616)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403616
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java


> Log message is not correct in case of short of replica
> --
>
> Key: HDFS-4127
> URL: https://issues.apache.org/jira/browse/HDFS-4127
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 1.0.4, 2.0.2-alpha
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4127.patch, HDFS-4127.patch
>
>
> When, for some reason, a block cannot be placed with enough replicas (e.g. 
> not enough available datanodes), it will log a warning with the wrong number 
> of replicas that it is short by.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4122) Cleanup HDFS logs and reduce the size of logged messages

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486883#comment-13486883
 ] 

Hudson commented on HDFS-4122:
--

Integrated in Hadoop-Mapreduce-trunk #1241 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1241/])
Moved HDFS-4122 to Release 2.0.3 section. It is also moved to Incompatible 
Changes section. (Revision 1403570)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403570
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Cleanup HDFS logs and reduce the size of logged messages
> 
>
> Key: HDFS-4122
> URL: https://issues.apache.org/jira/browse/HDFS-4122
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client, name-node
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 1.2.0, 3.0.0, 2.0.3-alpha
>
> Attachments: log.chatty.cleanup.branch1.patch, 
> log.chatty.cleanup.branch1.patch, log.chatty.cleanup.branch1.patch, 
> log.chatty.cleanup.trunk.patch, log.chatty.cleanup.trunk.patch, 
> log.chatty.cleanup.trunk.patch
>
>
> I have attached a patch that removes unnecessary information from the log, 
> such as the "Namesystem." prefix, and the prefixing of file information with 
> "file" and block information with "block", etc.
> On a branch-1 log I saw it reduce the log size by ~10%.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4124) Refactor INodeDirectory#getExistingPathINodes() to enable returning more than INode array

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486884#comment-13486884
 ] 

Hudson commented on HDFS-4124:
--

Integrated in Hadoop-Mapreduce-trunk #1241 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1241/])
HDFS-4124. Refactor INodeDirectory#getExistingPathINodes() to enable 
returning more than INode array. Contributed by Jing Zhao. (Revision 1403304)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403304
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java


> Refactor INodeDirectory#getExistingPathINodes() to enable returning more than 
> INode array
> -
>
> Key: HDFS-4124
> URL: https://issues.apache.org/jira/browse/HDFS-4124
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-INodeDirecotry.trunk.001.patch, 
> HDFS-INodeDirecotry.trunk.002.patch, HDFS-INodeDirecotry.trunk.003.patch
>
>
> Currently INodeDirectory#getExistingPathINodes() uses an INode array to 
> return the INodes resolved from the given path. For snapshot we need the 
> function to be able to return more information when resolving a path for a 
> snapshot file/dir. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4013) TestHftpURLTimeouts throws NPE

2012-10-30 Thread Chao Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Shi updated HDFS-4013:
---

Assignee: Chao Shi
  Status: Patch Available  (was: Open)

> TestHftpURLTimeouts throws NPE
> --
>
> Key: HDFS-4013
> URL: https://issues.apache.org/jira/browse/HDFS-4013
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
> Environment: java version "1.7.0_06-icedtea"
> OpenJDK Runtime Environment (fedora-2.3.1.fc17.2-i386)
> OpenJDK Client VM (build 23.2-b09, mixed mode)
>Reporter: Chao Shi
>Assignee: Chao Shi
>Priority: Trivial
> Attachments: hdfs-4013.patch, hdfs-4013.patch, hdfs-4013.patch
>
>
> The case fails at line 116, where message is null. I guess this may be an 
> openjdk-specific behavior, but it would be nice to have it fixed although 
> openjdk is not officially supported.
> FYI: The exception is thrown with null message at java.net.SocksSocketImpl.
> {code}
> private static int remainingMillis(long deadlineMillis) throws IOException {
>     if (deadlineMillis == 0L)
>         return 0;
>     final long remaining = deadlineMillis - System.currentTimeMillis();
>     if (remaining > 0)
>         return (int) remaining;
>     throw new SocketTimeoutException();
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4013) TestHftpURLTimeouts throws NPE

2012-10-30 Thread Chao Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Shi updated HDFS-4013:
---

Attachment: hdfs-4013.patch

Added assertNotNull to prevent throwing an NPE on failure.
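A minimal, self-contained illustration of the pattern (hypothetical class and strings; not the attached patch):

{code}
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;

import java.net.SocketTimeoutException;

public class NullMessageGuardExample {
  // Assert the message is non-null (it can be null on some OpenJDK builds)
  // before calling String methods on it, so a missing message produces a
  // clean assertion failure instead of a NullPointerException.
  public static void checkTimeoutMessage(SocketTimeoutException e) {
    String message = e.getMessage();
    assertNotNull("timeout exception carried no message", message);
    assertTrue(message.contains("timed out"));
  }
}
{code}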

> TestHftpURLTimeouts throws NPE
> --
>
> Key: HDFS-4013
> URL: https://issues.apache.org/jira/browse/HDFS-4013
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
> Environment: java version "1.7.0_06-icedtea"
> OpenJDK Runtime Environment (fedora-2.3.1.fc17.2-i386)
> OpenJDK Client VM (build 23.2-b09, mixed mode)
>Reporter: Chao Shi
>Priority: Trivial
> Attachments: hdfs-4013.patch, hdfs-4013.patch, hdfs-4013.patch
>
>
> The case fails at line 116, where message is null. I guess this may be an 
> openjdk-specific behavior, but it would be nice to have it fixed although 
> openjdk is not officially supported.
> FYI: The exception is thrown with null message at java.net.SocksSocketImpl.
> {code}
> private static int remainingMillis(long deadlineMillis) throws IOException {
>     if (deadlineMillis == 0L)
>         return 0;
>     final long remaining = deadlineMillis - System.currentTimeMillis();
>     if (remaining > 0)
>         return (int) remaining;
>     throw new SocketTimeoutException();
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4127) Log message is not correct in case of short of replica

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486847#comment-13486847
 ] 

Hudson commented on HDFS-4127:
--

Integrated in Hadoop-Hdfs-trunk #1211 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1211/])
HDFS-4127. Log message is not correct in case of short of replica. 
Contributed by Junping Du. (Revision 1403616)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403616
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java


> Log message is not correct in case of short of replica
> --
>
> Key: HDFS-4127
> URL: https://issues.apache.org/jira/browse/HDFS-4127
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 1.0.4, 2.0.2-alpha
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4127.patch, HDFS-4127.patch
>
>
> When, for some reason, a block cannot be placed with enough replicas (e.g. 
> not enough available datanodes), it will log a warning with the wrong number 
> of replicas that it is short by.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4122) Cleanup HDFS logs and reduce the size of logged messages

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486845#comment-13486845
 ] 

Hudson commented on HDFS-4122:
--

Integrated in Hadoop-Hdfs-trunk #1211 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1211/])
Moved HDFS-4122 to Release 2.0.3 section. It is also moved to Incompatible 
Changes section. (Revision 1403570)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403570
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Cleanup HDFS logs and reduce the size of logged messages
> 
>
> Key: HDFS-4122
> URL: https://issues.apache.org/jira/browse/HDFS-4122
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client, name-node
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 1.2.0, 3.0.0, 2.0.3-alpha
>
> Attachments: log.chatty.cleanup.branch1.patch, 
> log.chatty.cleanup.branch1.patch, log.chatty.cleanup.branch1.patch, 
> log.chatty.cleanup.trunk.patch, log.chatty.cleanup.trunk.patch, 
> log.chatty.cleanup.trunk.patch
>
>
> I have attached a patch that removes unnecessary information from the log, 
> such as the "Namesystem." prefix, and the prefixing of file information with 
> "file" and block information with "block", etc.
> On a branch-1 log I saw it reduce the log size by ~10%.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4124) Refactor INodeDirectory#getExistingPathINodes() to enable returning more than INode array

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486846#comment-13486846
 ] 

Hudson commented on HDFS-4124:
--

Integrated in Hadoop-Hdfs-trunk #1211 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1211/])
HDFS-4124. Refactor INodeDirectory#getExistingPathINodes() to enable 
returning more than INode array. Contributed by Jing Zhao. (Revision 1403304)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403304
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java


> Refactor INodeDirectory#getExistingPathINodes() to enable returning more than 
> INode array
> -
>
> Key: HDFS-4124
> URL: https://issues.apache.org/jira/browse/HDFS-4124
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-INodeDirecotry.trunk.001.patch, 
> HDFS-INodeDirecotry.trunk.002.patch, HDFS-INodeDirecotry.trunk.003.patch
>
>
> Currently INodeDirectory#getExistingPathINodes() uses an INode array to 
> return the INodes resolved from the given path. For snapshot we need the 
> function to be able to return more information when resolving a path for a 
> snapshot file/dir. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4127) Log message is not correct in case of short of replica

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486794#comment-13486794
 ] 

Hudson commented on HDFS-4127:
--

Integrated in Hadoop-Yarn-trunk #21 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/21/])
HDFS-4127. Log message is not correct in case of short of replica. 
Contributed by Junping Du. (Revision 1403616)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403616
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java


> Log message is not correct in case of short of replica
> --
>
> Key: HDFS-4127
> URL: https://issues.apache.org/jira/browse/HDFS-4127
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 1.0.4, 2.0.2-alpha
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-4127.patch, HDFS-4127.patch
>
>
> When, for some reason, a block cannot be placed with enough replicas (e.g. 
> not enough available datanodes), it will log a warning with the wrong number 
> of replicas that it is short by.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4124) Refactor INodeDirectory#getExistingPathINodes() to enable returning more than INode array

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486793#comment-13486793
 ] 

Hudson commented on HDFS-4124:
--

Integrated in Hadoop-Yarn-trunk #21 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/21/])
HDFS-4124. Refactor INodeDirectory#getExistingPathINodes() to enable 
returning more than INode array. Contributed by Jing Zhao. (Revision 1403304)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403304
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java


> Refactor INodeDirectory#getExistingPathINodes() to enable returning more than 
> INode array
> -
>
> Key: HDFS-4124
> URL: https://issues.apache.org/jira/browse/HDFS-4124
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-INodeDirecotry.trunk.001.patch, 
> HDFS-INodeDirecotry.trunk.002.patch, HDFS-INodeDirecotry.trunk.003.patch
>
>
> Currently INodeDirectory#getExistingPathINodes() uses an INode array to 
> return the INodes resolved from the given path. For snapshot we need the 
> function to be able to return more information when resolving a path for a 
> snapshot file/dir. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4122) Cleanup HDFS logs and reduce the size of logged messages

2012-10-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486792#comment-13486792
 ] 

Hudson commented on HDFS-4122:
--

Integrated in Hadoop-Yarn-trunk #21 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/21/])
Moved HDFS-4122 to Release 2.0.3 section. It is also moved to Incompatible 
Changes section. (Revision 1403570)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403570
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Cleanup HDFS logs and reduce the size of logged messages
> 
>
> Key: HDFS-4122
> URL: https://issues.apache.org/jira/browse/HDFS-4122
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client, name-node
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 1.2.0, 3.0.0, 2.0.3-alpha
>
> Attachments: log.chatty.cleanup.branch1.patch, 
> log.chatty.cleanup.branch1.patch, log.chatty.cleanup.branch1.patch, 
> log.chatty.cleanup.trunk.patch, log.chatty.cleanup.trunk.patch, 
> log.chatty.cleanup.trunk.patch
>
>
> I have attached a patch that removes unnecessary information from the log, 
> such as the "Namesystem." prefix, and the prefixing of file information with 
> "file" and block information with "block", etc.
> On a branch-1 log I saw it reduce the log size by ~10%.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3908) In HA mode, when there is a ledger in BK missing, which is generated after the last checkpoint, NN can not restore it.

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486781#comment-13486781
 ] 

Uma Maheswara Rao G commented on HDFS-3908:
---

I think this is a general HA problem rather than a BKJM issue.

The problem as I see it is:
 
The active NN will continue even if there are some gaps in shared storage, as 
that is the general editlog loading behaviour. 

The standby, on the other hand, cannot read the complete edits; it may end up 
reading only up to the gap and will not do any tailing beyond it.
If this continues for long, the standby will accumulate a big gap of edits, and 
on a switch it will fail to become active because it will not have contiguous 
edits.

@Xiao, is this the case you are seeing?

I think in the HA case we may have to require the edits from shared storage, 
instead of falling back to local edits when edits are missing from shared 
storage while loading on startup, because the standby has only shared storage 
as its input for loading edits.

In the non-HA case, considering local edits from different name dirs should be 
fine. But in HA, that should not be the case, as explained above.

What are others' opinions on this proposal?

> In HA mode, when there is a ledger in BK missing, which is generated after 
> the last checkpoint, NN can not restore it.
> --
>
> Key: HDFS-3908
> URL: https://issues.apache.org/jira/browse/HDFS-3908
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.1-alpha
>Reporter: Han Xiao
>
> Without HA, when the number of edits dirs is larger than 1, the loss of one 
> editlog file in a dir will not cause a problem, because of the replica in the 
> other dir. However, in HA mode (using BK as shared storage), a missing ledger 
> will not be restored during the NN starting phase even if the related editlog 
> file exists in a local dir.
> The gap persists while the NN is still in standby state. When the NN enters 
> active state, it will read the local editlog file (related to the missing 
> ledger), but unfortunately the ledger after the missing one in BK can't be 
> read at that phase (because of the gap).
> Therefore, in the following situation, editlogs will not be restored even 
> though every editlog file exists either in BK or in the local dir:
> 1. fsimage file: fsimage_0005946.md5
> 2. ledgers in zk:
>   \[zk: localhost:2181(CONNECTED) 0\] ls /hdfsEdit/ledgers/edits_00594
>   edits_005941_005942
>   edits_005943_005944
>   edits_005945_005946
>   edits_005949_005949
>   (missing edits_005947_005948)
> 3. editlog in local editlog dir:
>   \-rw-r--r-- 1 root root      30 Sep  8 03:24 edits_0005947-0005948
>   \-rw-r--r-- 1 root root 1048576 Sep  8 03:35 edits_0005950-0005950
>   \-rw-r--r-- 1 root root 1048576 Sep  8 04:42 edits_0005951-0005951
>   (missing edits_0005949-0005949)
> 4. and the seen_txid:
>   vm2:/tmp/hadoop-root/dfs/name/current # cat seen_txid
>   5949
> Here, we want to restore the editlog from txid 5946 (image) to txid 5949 
> (seen_txid). 5947-5948 is missing in BK; 5949-5949 is missing in the local 
> dir.
> When starting the NN, the following exception is thrown:
> 2012-09-08 06:26:10,031 FATAL 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring 
> NN shutdown. Shutting down immediately.
> java.io.IOException: There appears to be a gap in the edit log.  We expected 
> txid 5949, but got txid 5950.
> at 
> org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:163)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:692)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:223)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.catchupDuringFailover(EditLogTailer.java:182)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:599)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1325)
> at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveSt

[jira] [Commented] (HDFS-4101) ZKFC should implement zookeeper.recovery.retry like HBase to connect to ZooKeeper

2012-10-30 Thread Damien Hardy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486767#comment-13486767
 ] 

Damien Hardy commented on HDFS-4101:


I tried something like that, but as a non-Java developer (usually a sysadmin) I 
wasn't able to get past the 'Call to super must be first statement in 
constructor' error, because I want to test the constructor itself ...
I can't find any workaround other than adding a new constructor with a 
parameter for the number of ZK failures, which makes no sense :) 
So I tried using the available parameters ...
Maybe you can help?
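For what it's worth, the usual workaround for that compiler error is to push the pre-super logic into a static helper that is evaluated inside the super(...) call itself. A generic, hypothetical sketch (plain Java, not ZKFC code):

{code}
class Base {
  private final int retries;
  Base(int retries) { this.retries = retries; }
}

class Derived extends Base {
  Derived(String retriesProperty) {
    // super(...) is still syntactically the first statement; the parsing and
    // validation happen inside the static helper below.
    super(parseRetries(retriesProperty));
  }

  private static int parseRetries(String property) {
    int retries = Integer.parseInt(property);
    if (retries < 0) {
      throw new IllegalArgumentException("retries must be >= 0");
    }
    return retries;
  }
}
{code}

A static helper works here because it cannot touch instance state, which is exactly what the compiler is protecting before super() completes.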

> ZKFC should implement zookeeper.recovery.retry like HBase to connect to 
> ZooKeeper
> -
>
> Key: HDFS-4101
> URL: https://issues.apache.org/jira/browse/HDFS-4101
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: auto-failover, ha
>Affects Versions: 2.0.0-alpha, 3.0.0
> Environment: running CDH4.1.1
>Reporter: Damien Hardy
>Assignee: Damien Hardy
>Priority: Minor
>  Labels: newbie
> Attachments: HDFS-4101-2.patch
>
>
> When ZKFC starts and ZooKeeper is not yet started, ZKFC fails and stops 
> immediately.
> Maybe ZKFC should allow some retries on the ZooKeeper service, as HBase does 
> with zookeeper.recovery.retry.
> This particularly happens when I start my whole cluster on VirtualBox, for 
> example (every component at nearly the same time); ZKFC is the only one that 
> fails and stops ... 
> All the others can wait for each other for some time independently of the 
> start order (NameNode/DataNode/JournalNode/ZooKeeper/HBaseMaster/HBaseRS), so 
> the system can settle and be stable in a few seconds.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4130) The reading for editlog at NN starting using bkjm is not efficient

2012-10-30 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-4130:
--

Issue Type: Sub-task  (was: Improvement)
Parent: HDFS-3399

> The reading for editlog at NN starting using bkjm  is not efficient
> ---
>
> Key: HDFS-4130
> URL: https://issues.apache.org/jira/browse/HDFS-4130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, performance
>Affects Versions: 2.0.2-alpha
>Reporter: Han Xiao
> Attachments: HDFS-4130.patch
>
>
> Now, the method BookKeeperJournalManager.selectInputStreams is written like:
> while (true) {
>   EditLogInputStream elis;
>   try {
>     elis = getInputStream(fromTxId, inProgressOk);
>   } catch (IOException e) {
>     LOG.error(e);
>     return;
>   }
>   if (elis == null) {
>     return;
>   }
>   streams.add(elis);
>   if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
>     return;
>   }
>   fromTxId = elis.getLastTxId() + 1;
> }
>  
> The EditLogInputStream is obtained from getInputStream(), which reads the 
> ledgers from zookeeper on each call.
> This becomes very costly as the number of ledgers grows; reading the ledgers 
> from zk is not necessary on every call to getInputStream().
> The log of the time lost here is as follows:
> 2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> Caching file names occuring more than 10 times
> 2012-10-30 16:49:24,643 INFO 
> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
> Successfully connected to bookie: /167.52.1.121:318
> The stack of the process while blocked between those two log lines is:
> "main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
> \[0x7fca020fe000\]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:485)
> at 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
> \- locked <0x0006fb8495a8> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> at 
> hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
> at 
> org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
> at 
> hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
> at 
> org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)
> 
> Between two different times, the diff of the stacks is:
> diff stack stack2
> 1c1
> < 2012-10-30 16:44:53
> ---
> > 2012-10-30 16:46:17
> 106c106
> <   - locked <0x0006fb8495a8> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> ---
> >   - locked <0x0006fae58468> (a 
> > hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> In our environment, the waiting time could even reach to tens of minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4130) The reading for editlog at NN starting using bkjm is not efficient

2012-10-30 Thread Han Xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Xiao updated HDFS-4130:
---

Description: 
Now, the method BookKeeperJournalManager.selectInputStreams is written like:

while (true) {
  EditLogInputStream elis;
  try {
    elis = getInputStream(fromTxId, inProgressOk);
  } catch (IOException e) {
    LOG.error(e);
    return;
  }
  if (elis == null) {
    return;
  }
  streams.add(elis);
  if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
    return;
  }
  fromTxId = elis.getLastTxId() + 1;
}
 
The EditLogInputStream is obtained from getInputStream(), which reads the 
ledgers from zookeeper on each call.
This becomes very costly as the number of ledgers grows; reading the ledgers 
from zk is not necessary on every call to getInputStream().

The log of the time lost here is as follows:
2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
Caching file names occuring more than 10 times
2012-10-30 16:49:24,643 INFO 
hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
Successfully connected to bookie: /167.52.1.121:318

The stack of the process while blocked between those two log lines is:
"main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
\[0x7fca020fe000\]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
\- locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
at 
org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
at 
org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)

Between two different times, the diff of the stacks is:
diff stack stack2
1c1
< 2012-10-30 16:44:53
---
> 2012-10-30 16:46:17
106c106
<   - locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
---
>   - locked <0x0006fae58468> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)

In our environment, the waiting time could even reach to tens of minutes.

  was:
Now, the method BookKeeperJournalManager.selectInputStreams is written like:

while (true) {
  EditLogInputStream elis;
  try {
    elis = getInputStream(fromTxId, inProgressOk);
  } catch (IOException e) {
    LOG.error(e);
    return;
  }
  if (elis == null) {
    return;
  }
  streams.add(elis);
  if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
    return;
  }
  fromTxId = elis.getLastTxId() + 1;
}
 
The EditLogInputStream is obtained from getInputStream(), which reads the 
ledgers from zookeeper on each call.
This becomes very costly as the number of ledgers grows; reading the ledgers 
from zk is not necessary on every call to getInputStream().

The log of the time lost here is as follows:
2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
Caching file names occuring more than 10 times
2012-10-30 16:49:24,643 INFO 
hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
Successfully connected to bookie: /167.52.1.121:318

The stack of the process while blocked between those two log lines is:
"main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
\[0x7fca020fe000\]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
- locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
at 
org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
at 
org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
   

[jira] [Updated] (HDFS-4130) The reading for editlog at NN starting using bkjm is not efficient

2012-10-30 Thread Han Xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Xiao updated HDFS-4130:
---

Description: 
Now, the method BookKeeperJournalManager.selectInputStreams is written like:

while (true) {
  EditLogInputStream elis;
  try {
    elis = getInputStream(fromTxId, inProgressOk);
  } catch (IOException e) {
    LOG.error(e);
    return;
  }
  if (elis == null) {
    return;
  }
  streams.add(elis);
  if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
    return;
  }
  fromTxId = elis.getLastTxId() + 1;
}
 
The EditLogInputStream is obtained from getInputStream(), which reads the 
ledgers from zookeeper on each call.
This becomes very costly as the number of ledgers grows; reading the ledgers 
from zk is not necessary on every call to getInputStream().

The log of the time lost here is as follows:
2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
Caching file names occuring more than 10 times
2012-10-30 16:49:24,643 INFO 
hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
Successfully connected to bookie: /167.52.1.121:318

The stack of the process while blocked between those two log lines is:
"main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
\[0x7fca020fe000\]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
- locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
at 
org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
at 
org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)

Between two different times, the diff of the stacks is:
diff stack stack2
1c1
< 2012-10-30 16:44:53
---
> 2012-10-30 16:46:17
106c106
<   - locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
---
>   - locked <0x0006fae58468> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)

In our environment, the waiting time could even reach to tens of minutes.

  was:
Now, the method BookKeeperJournalManager.selectInputStreams is written like:

while (true) {
  EditLogInputStream elis;
  try {
    elis = getInputStream(fromTxId, inProgressOk);
  } catch (IOException e) {
    LOG.error(e);
    return;
  }
  if (elis == null) {
    return;
  }
  streams.add(elis);
  if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
    return;
  }
  fromTxId = elis.getLastTxId() + 1;
}
 
The EditLogInputStream is obtained from getInputStream(), which reads the 
ledgers from zookeeper on each call.
This becomes very costly as the number of ledgers grows; reading the ledgers 
from zk is not necessary on every call to getInputStream().

The log of the time lost here is as follows:
2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
Caching file names occuring more than 10 times
2012-10-30 16:49:24,643 INFO 
hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
Successfully connected to bookie: /167.52.1.121:318

The stack of the process while blocked between those two log lines is:
"main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
[0x7fca020fe000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
- locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
at 
org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
at 
org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
at

[jira] [Updated] (HDFS-4130) The reading for editlog at NN starting using bkjm is not efficient

2012-10-30 Thread Han Xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Xiao updated HDFS-4130:
---

Attachment: HDFS-4130.patch

Patch available.
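For reviewers, a rough, self-contained sketch of the direction of the change (hypothetical stand-in types; this is not the patch itself): list the ledgers from ZooKeeper once, then derive every stream from that single snapshot, instead of re-listing inside each getInputStream() call.

{code}
import java.util.ArrayList;
import java.util.List;

class LedgerScanSketch {
  // Simplified stand-in for the ledger metadata kept in ZooKeeper.
  static class Ledger {
    final long firstTxId, lastTxId;
    Ledger(long firstTxId, long lastTxId) {
      this.firstTxId = firstTxId;
      this.lastTxId = lastTxId;
    }
  }

  // The single ZooKeeper read happens in the caller; this loop is pure
  // in-memory work over that one snapshot, so the cost no longer scales with
  // (number of ledgers) x (ZooKeeper round-trip time).
  static List<Ledger> selectFrom(List<Ledger> allLedgers, long fromTxId) {
    List<Ledger> selected = new ArrayList<Ledger>();
    for (Ledger ledger : allLedgers) {
      if (ledger.lastTxId < fromTxId) {
        continue; // ledger ends before the requested txid; skip it
      }
      selected.add(ledger);
      fromTxId = ledger.lastTxId + 1; // the next ledger must continue here
    }
    return selected;
  }
}
{code}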

> The reading for editlog at NN starting using bkjm  is not efficient
> ---
>
> Key: HDFS-4130
> URL: https://issues.apache.org/jira/browse/HDFS-4130
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, performance
>Affects Versions: 2.0.2-alpha
>Reporter: Han Xiao
> Attachments: HDFS-4130.patch
>
>
> Now, the method BookKeeperJournalManager.selectInputStreams is written like:
> while (true) {
>   EditLogInputStream elis;
>   try {
>     elis = getInputStream(fromTxId, inProgressOk);
>   } catch (IOException e) {
>     LOG.error(e);
>     return;
>   }
>   if (elis == null) {
>     return;
>   }
>   streams.add(elis);
>   if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
>     return;
>   }
>   fromTxId = elis.getLastTxId() + 1;
> }
>  
> The EditLogInputStream is obtained from getInputStream(), which reads the 
> ledgers from zookeeper on each call.
> This becomes very costly as the number of ledgers grows; reading the ledgers 
> from zk is not necessary on every call to getInputStream().
> The log of the time lost here is as follows:
> 2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> Caching file names occuring more than 10 times
> 2012-10-30 16:49:24,643 INFO 
> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
> Successfully connected to bookie: /167.52.1.121:318
> The stack of the process while blocked between those two log lines is:
> "main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
> [0x7fca020fe000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:485)
> at 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
> - locked <0x0006fb8495a8> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> at 
> hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
> at 
> org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
> at 
> hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
> at 
> org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)
> 
> Between two different times, the diff of the stacks is:
> diff stack stack2
> 1c1
> < 2012-10-30 16:44:53
> ---
> > 2012-10-30 16:46:17
> 106c106
> <   - locked <0x0006fb8495a8> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> ---
> >   - locked <0x0006fae58468> (a 
> > hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> In our environment, the waiting time could even reach to tens of minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4130) The reading for editlog at NN starting using bkjm is not efficient

2012-10-30 Thread Han Xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Xiao updated HDFS-4130:
---

Status: Open  (was: Patch Available)

I can't find the option to submit the patch?

> The reading for editlog at NN starting using bkjm  is not efficient
> ---
>
> Key: HDFS-4130
> URL: https://issues.apache.org/jira/browse/HDFS-4130
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, performance
>Affects Versions: 2.0.2-alpha
>Reporter: Han Xiao
> Attachments: HDFS-4130.patch
>
>
> Now, the method BookKeeperJournalManager.selectInputStreams is written like:
> while (true) {
>   EditLogInputStream elis;
>   try {
>     elis = getInputStream(fromTxId, inProgressOk);
>   } catch (IOException e) {
>     LOG.error(e);
>     return;
>   }
>   if (elis == null) {
>     return;
>   }
>   streams.add(elis);
>   if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
>     return;
>   }
>   fromTxId = elis.getLastTxId() + 1;
> }
>  
> The EditLogInputStream is obtained from getInputStream(), which reads the 
> ledgers from zookeeper on each call.
> This becomes very costly as the number of ledgers grows; reading the ledgers 
> from zk is not necessary on every call to getInputStream().
> The log of the time lost here is as follows:
> 2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> Caching file names occuring more than 10 times
> 2012-10-30 16:49:24,643 INFO 
> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
> Successfully connected to bookie: /167.52.1.121:318
> The stack of the process while blocked between those two log lines is:
> "main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
> [0x7fca020fe000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:485)
> at 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
> - locked <0x0006fb8495a8> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> at 
> hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
> at 
> org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
> at 
> hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
> at 
> org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)
> 
> Between two different times, the diff of the stacks is:
> diff stack stack2
> 1c1
> < 2012-10-30 16:44:53
> ---
> > 2012-10-30 16:46:17
> 106c106
> <   - locked <0x0006fb8495a8> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> ---
> >   - locked <0x0006fae58468> (a 
> > hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> In our environment, the waiting time could even reach to tens of minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4130) The reading for editlog at NN starting using bkjm is not efficient

2012-10-30 Thread Han Xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Xiao updated HDFS-4130:
---

Target Version/s: 2.0.2-alpha
  Status: Patch Available  (was: Open)

> The reading for editlog at NN starting using bkjm  is not efficient
> ---
>
> Key: HDFS-4130
> URL: https://issues.apache.org/jira/browse/HDFS-4130
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, performance
>Affects Versions: 2.0.2-alpha
>Reporter: Han Xiao
>
> Now, the method BookKeeperJournalManager.selectInputStreams is written like:
> while (true) {
>   EditLogInputStream elis;
>   try {
>     elis = getInputStream(fromTxId, inProgressOk);
>   } catch (IOException e) {
>     LOG.error(e);
>     return;
>   }
>   if (elis == null) {
>     return;
>   }
>   streams.add(elis);
>   if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
>     return;
>   }
>   fromTxId = elis.getLastTxId() + 1;
> }
>  
> The EditLogInputStream is obtained from getInputStream(), which reads the 
> ledgers from zookeeper on each call.
> This becomes very costly as the number of ledgers grows; reading the ledgers 
> from zk is not necessary on every call to getInputStream().
> The log of the time lost here is as follows:
> 2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> Caching file names occuring more than 10 times
> 2012-10-30 16:49:24,643 INFO 
> hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
> Successfully connected to bookie: /167.52.1.121:318
> The stack of the process while blocked between those two log lines is:
> "main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
> [0x7fca020fe000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:485)
> at 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
> - locked <0x0006fb8495a8> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> at 
> hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
> at 
> org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
> at 
> hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
> at 
> org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
> at 
> org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)
> 
> Between two different times, the diff of the stacks is:
> diff stack stack2
> 1c1
> < 2012-10-30 16:44:53
> ---
> > 2012-10-30 16:46:17
> 106c106
> <   - locked <0x0006fb8495a8> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> ---
> >   - locked <0x0006fae58468> (a 
> > hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
> In our environment, the waiting time could even reach to tens of minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4130) The reading for editlog at NN starting using bkjm is not efficient

2012-10-30 Thread Han Xiao (JIRA)
Han Xiao created HDFS-4130:
--

 Summary: The reading for editlog at NN starting using bkjm  is not 
efficient
 Key: HDFS-4130
 URL: https://issues.apache.org/jira/browse/HDFS-4130
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, performance
Affects Versions: 2.0.2-alpha
Reporter: Han Xiao


Now, the method BookKeeperJournalManager.selectInputStreams is written like:

while (true) {
  EditLogInputStream elis;
  try {
    elis = getInputStream(fromTxId, inProgressOk);
  } catch (IOException e) {
    LOG.error(e);
    return;
  }
  if (elis == null) {
    return;
  }
  streams.add(elis);
  if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
    return;
  }
  fromTxId = elis.getLastTxId() + 1;
}
 
The EditLogInputStream is obtained from getInputStream(), which reads the 
ledgers from zookeeper on each call.
This becomes very costly as the number of ledgers grows; reading the ledgers 
from zk is not necessary on every call to getInputStream().

The log of the time lost here is as follows:
2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
Caching file names occuring more than 10 times
2012-10-30 16:49:24,643 INFO 
hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
Successfully connected to bookie: /167.52.1.121:318

The stack of the process while blocked between those two log lines is:
"main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
[0x7fca020fe000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
- locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
at 
org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
at 
org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)

Between two different times, the diff of the stack dumps is:
diff stack stack2
1c1
< 2012-10-30 16:44:53
---
> 2012-10-30 16:46:17
106c106
<   - locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
---
>   - locked <0x0006fae58468> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)

In our environment, the waiting time can even reach tens of minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode

2012-10-30 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486751#comment-13486751
 ] 

Vinay commented on HDFS-3507:
-

Ah! I just realized that the intention of this Jira is to get the safemode status 
(i.e. SafeModeAction.SAFEMODE_GET) from the Active NN only.
I agree that SAFEMODE_GET is not a write operation, but if we consider 
SAFEMODE_GET as READ, and "dfs.ha.allow.stale.reads" is set, then SAFEMODE_GET 
will again be executed on the STANDBY NN.

{code}
+if (isChecked) {
+  if (action == SafeModeAction.SAFEMODE_GET) {
+    opCategory = OperationCategory.READ;
+  } else {
+    opCategory = OperationCategory.WRITE;
+  }
{code}

So I feel SAFEMODE_GET should also be categorized as WRITE (only to avoid 
execution on the STANDBY NN); otherwise the NN state needs to be checked 
explicitly.
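
A minimal sketch of that WRITE-categorization approach (assuming the 
checkOperation() idiom used by other NameNodeRpcServer handlers; the exact 
placement is hypothetical):

{code}
// Sketch only: categorize SAFEMODE_GET as WRITE purely so that the HA state
// check rejects the call on a standby NN, even when
// "dfs.ha.allow.stale.reads" is set; it does not become a real edit operation.
if (action == SafeModeAction.SAFEMODE_GET) {
  namesystem.checkOperation(OperationCategory.WRITE);
}
{code}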

> DFS#isInSafeMode needs to execute only on Active NameNode
> -
>
> Key: HDFS-3507
> URL: https://issues.apache.org/jira/browse/HDFS-3507
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Vinay
>Assignee: Vinay
>Priority: Critical
> Attachments: HDFS-3507.patch, HDFS-3507.patch
>
>
> Currently DFS#isInSafeMode does not check the NN state; it can be 
> executed on any of the NNs.
> But HBase will use this API to check for NN safemode before starting up 
> its service.
> If the first NN configured is in standby, then DFS#isInSafeMode will check the 
> standby NN's safemode, but HBase wants the state of the Active NN.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4114) Remove the BackupNode and CheckpointNode

2012-10-30 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486723#comment-13486723
 ] 

Konstantin Shvachko commented on HDFS-4114:
---

Yes I do use BackupNode.
-1 on this idea.

> Remove the BackupNode and CheckpointNode
> 
>
> Key: HDFS-4114
> URL: https://issues.apache.org/jira/browse/HDFS-4114
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Eli Collins
>Assignee: Eli Collins
>
> Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the 
> BackupNode and CheckpointNode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3960) Snapshot of Being Written Files

2012-10-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486705#comment-13486705
 ] 

Aaron T. Myers commented on HDFS-3960:
--

Hi Konstantin, I agree that a solution like the one you describe should be 
sufficient. It sounds very similar to what is described in the "Super Sync" 
option of the design document I posted on HDFS-2802. Would you mind taking a 
look at that section of the document to see if it is in line with what you're 
describing here?

> Snapshot of Being Written Files
> ---
>
> Key: HDFS-3960
> URL: https://issues.apache.org/jira/browse/HDFS-3960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>
> Here is a design question: suppose there is a file being written when a 
> snapshot is taken.  What should the length of that file be in the snapshot?  
> In other words, how do we determine the length of a file that is still being 
> written when a snapshot is taken?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-10-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486701#comment-13486701
 ] 

Aaron T. Myers commented on HDFS-2802:
--

Hi Konstantin, if I correctly understand what you're suggesting, it sounds very 
similar to what is described in the "Representing Snapshots at the NameNode" 
section of the design proposal I posted. Under this scheme, each file and 
directory is tagged with its start and end version numbers. The start version 
number represents the point in the file system history when the file/directory 
came into existence, and the end version number represents the point in the 
file system history where the file/directory was either modified or deleted. 
Taking a snapshot is, as you described, as simple as incrementing the current 
version number of the file system (or of a subtree). Would you mind taking a 
look at that portion of the design document to see if it is in line with what 
you're thinking about?
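
As a toy illustration of that [start, end) version-window idea (all names below 
are invented for illustration; they are not from the design document):

{code}
// Each file/directory carries the FS version at which it appeared, and the
// version at which it was modified or deleted (open-ended while unchanged).
class VersionedINode {
  final long startVersion;
  long endVersion = Long.MAX_VALUE;

  VersionedINode(long startVersion) {
    this.startVersion = startVersion;
  }

  // The inode belongs to snapshot s iff s falls in [startVersion, endVersion).
  boolean visibleInSnapshot(long s) {
    return startVersion <= s && s < endVersion;
  }
}
// Taking a snapshot then reduces to recording the current version number,
// e.g. long snapshotVersion = ++currentFsVersion;
{code}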

> Support for RW/RO snapshots in HDFS
> ---
>
> Key: HDFS-2802
> URL: https://issues.apache.org/jira/browse/HDFS-2802
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node, name-node
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFSSnapshotsDesign.pdf, snap.patch, 
> snapshot-one-pager.pdf, Snapshots20121018.pdf
>
>
> Snapshots are point-in-time images of parts of the filesystem or the entire 
> filesystem. Snapshots can be a read-only or a read-write point-in-time copy 
> of the filesystem. There are several use cases for snapshots in HDFS. I will 
> post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3980) NPE in HttpURLConnection.java while starting SecondaryNameNode.

2012-10-30 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486702#comment-13486702
 ] 

Aaron T. Myers commented on HDFS-3980:
--

bq. i) Even If we configure IP, It should be resolved rite.(as I mentioned in 
defect, host-name resolution should be done rite.).

I suppose we could perform reverse DNS on the configured fs.defaultFS, but I 
must admit that I don't understand the use case for configuring an explicit IP 
address when the node in question does indeed have an externally resolvable 
hostname that could be used.
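
If that route were ever taken, the reverse lookup itself is straightforward (a 
sketch only; where such a check would live in the SecondaryNameNode startup 
path is not specified here):

{code}
// Sketch: reverse-resolve the host in the configured fs.defaultFS so that an
// IP-only configuration can still yield a hostname for the Kerberos principal.
URI defaultFs = URI.create(conf.get("fs.defaultFS"));
InetAddress addr = InetAddress.getByName(defaultFs.getHost());
String hostname = addr.getCanonicalHostName();  // returns the IP string if no PTR record exists
{code}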

bq. NPE,can we address NPE ?

I'm pretty sure that the NPE itself is actually a bug in the JDK. We might be 
able to check for a specific Hadoop misconfiguration at a higher level so that 
we never reach the code that will cause the NPE, but doing so in such a way 
that would cover all possible cases of this NPE might prove difficult.

> NPE in HttpURLConnection.java while starting SecondaryNameNode.
> ---
>
> Key: HDFS-3980
> URL: https://issues.apache.org/jira/browse/HDFS-3980
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0, 2.0.1-alpha
>Reporter: Brahma Reddy Battula
>Priority: Critical
> Attachments: core-site.xml, hdfs-site.xml
>
>
> Scenario:
> 
> I started a secure cluster by going through the following:
> https://ccp.cloudera.com/display/CDHDOC/CDH3+Security+Guide
> Here the SecondaryNameNode is getting shut down by throwing an NPE.
> Please correct me if I am wrong.
> Will attach conf and logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4118) Change INodeDirectory.getExistingPathINodes(..) to work with snapshots

2012-10-30 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4118:


Attachment: HDFS-4118.001.patch

Patch uploaded. The current patch also contains several test cases covering 
getExistingPathINodes under different scenarios.

> Change INodeDirectory.getExistingPathINodes(..) to work with snapshots
> --
>
> Key: HDFS-4118
> URL: https://issues.apache.org/jira/browse/HDFS-4118
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Jing Zhao
> Attachments: HDFS-4118.001.patch
>
>
> {code}
> int getExistingPathINodes(byte[][] components, INode[] existing, boolean 
> resolveLink)
> {code}
> The INodeDirectory method above retrieves the existing INodes for the given 
> path components.  It needs to be updated in order to understand snapshot paths.
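
For context, a toy sketch of the existing-path resolution contract (rootDir, 
getChild, and the simplified signature below are illustrative stand-ins, not 
the actual HDFS implementation):

{code}
// Fill existing[] with the INodes that actually exist along the path,
// starting from the root; return how many slots were filled.
int getExistingPathINodes(byte[][] components, INode[] existing) {
  INode cur = rootDir;  // components[0] is the empty root component
  int count = 0;
  while (cur != null && count < components.length && count < existing.length) {
    existing[count++] = cur;
    if (count == components.length) {
      break;  // fully resolved
    }
    cur = (cur instanceof INodeDirectory)
        ? ((INodeDirectory) cur).getChild(components[count])  // simplified lookup
        : null;  // a file has no children, so the rest of the path cannot exist
  }
  return count;
}
{code}

Making this snapshot-aware would additionally require recognizing snapshot path 
components during the walk.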

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode

2012-10-30 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486697#comment-13486697
 ] 

Vinay commented on HDFS-3507:
-

Thanks Aaron, your suggestion looks good. I will post a patch for that soon.

> DFS#isInSafeMode needs to execute only on Active NameNode
> -
>
> Key: HDFS-3507
> URL: https://issues.apache.org/jira/browse/HDFS-3507
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Vinay
>Assignee: Vinay
>Priority: Critical
> Attachments: HDFS-3507.patch, HDFS-3507.patch
>
>
> Currently DFS#isInSafeMode does not check the NN state; it can be 
> executed on any of the NNs.
> But HBase will use this API to check for NN safemode before starting up 
> its service.
> If the first NN configured is in standby, then DFS#isInSafeMode will check the 
> standby NN's safemode, but HBase wants the state of the Active NN.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira