[jira] [Commented] (HDFS-4305) Add a configurable limit on number of blocks per file, and min block size

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627545#comment-13627545
 ] 

Suresh Srinivas commented on HDFS-4305:
---

Adding a maximum number of blocks per file, even though incompatible, is 
unlikely to be an issue in practice.

bq. What about for the case that just one or a few files have the small block 
size? You wouldn't want to put this on the NN web UI.
Setting the default block size smaller and having all files affected by it is a 
big issue. Is setting a small block size for just a few files really such a big 
issue? One could just copy the files, right?


> Add a configurable limit on number of blocks per file, and min block size
> -
>
> Key: HDFS-4305
> URL: https://issues.apache.org/jira/browse/HDFS-4305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very very low and 
> managed to create a single file with hundreds of thousands of blocks. This 
> caused problems with the edit log since the OP_ADD op was so large 
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To 
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are 
> rejected
> - introduce a configurable maximum number of blocks per file, above which 
> requests to add another block are rejected (with a suitably high default so as 
> not to prevent legitimate large files)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4660) Duplicated checksum on DN in a recovered pipeline

2013-04-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627524#comment-13627524
 ] 

Todd Lipcon commented on HDFS-4660:
---

Can you create a functional test which does something like this?
- Create a pipeline and write a number of bytes which isn't an exact multiple 
of the checksum chunk size (eg 800 bytes).
- Call hflush to ensure that all DNs have the full length
- Restart the second DN in the pipeline, to trigger adding DN4
- Write a bit more and close the file.
- Verify that all replicas have identical checksum files.
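
For illustration, a rough sketch of such a test using MiniDFSCluster follows (not 
from the patch; the final assertion on the on-disk .meta files is left as a 
hypothetical helper):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.Test;

public class TestPipelineRecoveryChecksum {
  @Test
  public void testNoDuplicateChecksumAfterPipelineRecovery() throws Exception {
    Configuration conf = new Configuration();
    // 4 DNs so a replacement node is available when one DN is restarted.
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(4).build();
    try {
      FileSystem fs = cluster.getFileSystem();
      Path file = new Path("/pipelineRecovery");
      FSDataOutputStream out = fs.create(file, (short) 3);

      // Write a byte count that is not a multiple of the 512-byte chunk size.
      out.write(new byte[800]);
      out.hflush();                 // all DNs in the pipeline now have the full length

      cluster.restartDataNode(1);   // restart the 2nd DN to trigger adding DN4

      out.write(new byte[800]);     // write a bit more and close the file
      out.close();

      // Hypothetical helper (not an existing API): compare the .meta files of
      // every replica and assert they are byte-for-byte identical.
      // assertReplicaMetaFilesIdentical(cluster, file);
    } finally {
      cluster.shutdown();
    }
  }
}
{code}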

> Duplicated checksum on DN in a recovered pipeline
> -
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: PengZhang
>Priority: Critical
> Attachments: HDFS-4660.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_10042013-04-01
>  21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> client send packet after recover pipeline
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be ceil(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking the meta file on DN4, I found that the checksum of chunk 262 is 
> duplicated, but the data is not.
> Later, after the block was finalized, DN4's scanner detected the bad block and 
> reported it to the NN. The NN sent a command to delete this block and 
> re-replicate it from another DN in the pipeline to satisfy the replication 
> factor.
> I think this is because BlockReceiver skips data bytes already written, but 
> does not skip checksum bytes already written. And the function 
> adjustCrcFilePosition is only used for the last incomplete chunk, not for this 
> situation.
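
As an aside, the expected meta length above can be computed as follows, assuming 
the usual layout of a 7-byte header plus one 4-byte CRC per 512-byte chunk 
(including the in-progress partial chunk):
{code}
// Expected .meta file length for a given block offset, assuming a 7-byte
// header plus one 4-byte CRC per chunk (partial trailing chunk included).
static long expectedMetaLength(long offsetInBlock, int bytesPerChecksum) {
  long chunks = (offsetInBlock + bytesPerChecksum - 1) / bytesPerChecksum; // ceil
  return 7 + 4 * chunks;
}
// expectedMetaLength(134640, 512) == 7 + 4 * 263 == 1059; DN4 instead shows
// 1063, i.e. exactly one extra 4-byte checksum for the duplicated chunk 262.
{code}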

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4305) Add a configurable limit on number of blocks per file, and min block size

2013-04-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627522#comment-13627522
 ] 

Todd Lipcon commented on HDFS-4305:
---

bq. I am not sure if minimum block size is really required. I would rather make 
it a namenode WebUI status to say, your block size is way too small.

What about for the case that just one or a few files have the small block size? 
You wouldn't want to put this on the NN web UI.

The issue I've seen in the past is that some well-meaning but naive user wants 
to get their MR job to generate more splits. They don't know how to do this 
properly within MR, so instead they create the file with a tiny block size like 
1KB, then are surprised when they have really bad performance, etc. Having some 
reasonable limit should help keep them from shooting themselves in the foot.
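
To make the proposal concrete, the namenode-side checks could look roughly like 
the sketch below (configuration key names, defaults, and variable names are 
illustrative only, not necessarily what the patch uses):
{code}
// Illustrative sketch only; key names and defaults are assumptions.
long minBlockSize = conf.getLong("dfs.namenode.fs-limits.min-block-size",
    1024 * 1024);        // e.g. a 1 MB floor
long maxBlocksPerFile = conf.getLong("dfs.namenode.fs-limits.max-blocks-per-file",
    1024 * 1024);        // suitably high default for legitimate large files

// In the create path: reject block sizes below the configured minimum.
if (blockSize < minBlockSize) {
  throw new IOException("Specified block size " + blockSize
      + " is less than the configured minimum " + minBlockSize);
}

// In the addBlock path: reject files that already have too many blocks.
if (pendingFile.getBlocks().length >= maxBlocksPerFile) {
  throw new IOException("File has reached the limit of "
      + maxBlocksPerFile + " blocks");
}
{code}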

> Add a configurable limit on number of blocks per file, and min block size
> -
>
> Key: HDFS-4305
> URL: https://issues.apache.org/jira/browse/HDFS-4305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very very low and 
> managed to create a single file with hundreds of thousands of blocks. This 
> caused problems with the edit log since the OP_ADD op was so large 
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To 
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are 
> rejected
> - introduce a configurable maximum number of blocks per file, above which 
> requests to add another block are rejected (with a suitably high default so as 
> not to prevent legitimate large files)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-2576) Namenode should have a favored nodes hint to enable clients to have control over block placement.

2013-04-09 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-2576:
--

Attachment: hdfs-2576-trunk-7.1.patch

Fixes the findbugs warning.

> Namenode should have a favored nodes hint to enable clients to have control 
> over block placement.
> -
>
> Key: HDFS-2576
> URL: https://issues.apache.org/jira/browse/HDFS-2576
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Pritam Damania
>Assignee: Devaraj Das
> Fix For: 2.0.5-beta
>
> Attachments: hdfs-2576-1.txt, hdfs-2576-trunk-1.patch, 
> hdfs-2576-trunk-2.patch, hdfs-2576-trunk-7.1.patch, hdfs-2576-trunk-7.patch
>
>
> Sometimes clients like HBase need to dynamically compute the datanodes on 
> which they wish to place the blocks of a file, for a higher level of locality. 
> For this purpose there needs to be a way to give the Namenode a hint, in the 
> form of a favoredNodes parameter, about the locations where the client wants 
> to put each block. The proposed solution is a favored-nodes parameter in the 
> addBlock() method and in the create() file method to enable clients to give 
> hints to the NameNode about the locations of each replica of the block. Note 
> that this would be just a hint; the NameNode would ultimately look at disk 
> usage, datanode load, etc. and decide whether it can respect the hints or not.
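
As a rough illustration of how a client such as HBase might use the proposed 
hint (the exact create()/addBlock() signatures are whatever the patch finally 
defines; the call below is hypothetical):
{code}
// Hypothetical client-side usage; the NameNode treats the addresses purely as
// a hint and may ignore them based on disk usage, datanode load, etc.
InetSocketAddress[] favoredNodes = new InetSocketAddress[] {
    new InetSocketAddress("datanode1.example.com", 50010),
    new InetSocketAddress("datanode2.example.com", 50010),
    new InetSocketAddress("datanode3.example.com", 50010)
};
FSDataOutputStream out = dfs.create(new Path("/hbase/region/hfile"),
    FsPermission.getFileDefault(), EnumSet.of(CreateFlag.CREATE),
    bufferSize, (short) 3, blockSize, null /* progress */, favoredNodes);
{code}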

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4590) Add more Unit Test Case for HDFS-3701 HDFS Miss Final Block Reading when File is Open for Write

2013-04-09 Thread Huned Lokhandwala (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huned Lokhandwala updated HDFS-4590:


Attachment: HDFS-4590.b1.002.patch

Hi,
I have edited the patch according to the suggestions given previously. Please 
let me know if this version looks better.

> Add more Unit Test Case for HDFS-3701 HDFS Miss Final Block Reading when File 
> is Open for Write
> ---
>
> Key: HDFS-4590
> URL: https://issues.apache.org/jira/browse/HDFS-4590
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs-client
>Affects Versions: 1.2.0
> Environment: Java Unit Test Case
>Reporter: Huned Lokhandwala
>Assignee: Huned Lokhandwala
>Priority: Minor
> Attachments: HDFS-4590.b1.002.patch
>
>
> Add more Java unit test coverage for HDFS-3701: when a file is opened for 
> writing, the DFSClient calls one of the datanodes owning the last block to get 
> its size; if this datanode is dead, test that a socket IOException is thrown. 
> Add a unit test case that writes to a file, shuts down the datanodes, and then 
> tries to read from the file, expecting an IOException. On branch-1 it should 
> throw the IOException as expected.
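
A rough outline of such a test follows (written against the trunk-style 
MiniDFSCluster builder for readability; the branch-1 APIs differ slightly, and 
the names are assumptions):
{code}
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
FileSystem fs = cluster.getFileSystem();
Path file = new Path("/missFinalBlock");

FSDataOutputStream out = fs.create(file);
out.write(new byte[1024]);
out.hflush();                  // file stays open for write; last block not finalized

cluster.shutdownDataNodes();   // the DN owning the last block is now dead

try {
  FSDataInputStream in = fs.open(file);
  in.read(new byte[1024]);     // DFSClient must ask the dead DN for the block length
  fail("Expected an IOException when the datanode owning the last block is dead");
} catch (IOException expected) {
  // expected on branch-1: a socket-level IOException while fetching the length
}
{code}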

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4590) Add more Unit Test Case for HDFS-3701 HDFS Miss Final Block Reading when File is Open for Write

2013-04-09 Thread Huned Lokhandwala (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huned Lokhandwala updated HDFS-4590:


Attachment: (was: HDFS-4590.b1.001.patch)

> Add more Unit Test Case for HDFS-3701 HDFS Miss Final Block Reading when File 
> is Open for Write
> ---
>
> Key: HDFS-4590
> URL: https://issues.apache.org/jira/browse/HDFS-4590
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs-client
>Affects Versions: 1.2.0
> Environment: Java Unit Test Case
>Reporter: Huned Lokhandwala
>Assignee: Huned Lokhandwala
>Priority: Minor
> Attachments: HDFS-4590.b1.002.patch
>
>
> Add more Java unit test coverage for HDFS-3701: when a file is opened for 
> writing, the DFSClient calls one of the datanodes owning the last block to get 
> its size; if this datanode is dead, test that a socket IOException is thrown. 
> Add a unit test case that writes to a file, shuts down the datanodes, and then 
> tries to read from the file, expecting an IOException. On branch-1 it should 
> throw the IOException as expected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4413) Secondary namenode won't start if HDFS isn't the default file system

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-4413.
---

   Resolution: Fixed
Fix Version/s: 1.2.0
 Hadoop Flags: Reviewed

I have committed this patch to branch-1, 1.2 and 1-win. Thank you Mostafa.

Thank you Chris for reviewing the patch.

> Secondary namenode won't start if HDFS isn't the default file system
> 
>
> Key: HDFS-4413
> URL: https://issues.apache.org/jira/browse/HDFS-4413
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1-win, 1.2.1
>Reporter: Mostafa Elhemali
>Assignee: Mostafa Elhemali
> Fix For: 1.2.0
>
> Attachments: HDFS-4413.branch-1.patch
>
>
> If HDFS is not the default file system (fs.default.name is something other 
> than hdfs://...), then the secondary namenode throws an exception early in its 
> initialization. This is a needless check as far as I can tell, and it blocks 
> scenarios where HDFS services are up but HDFS is not the default file system.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4413) Secondary namenode won't start if HDFS isn't the default file system

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627433#comment-13627433
 ] 

Suresh Srinivas commented on HDFS-4413:
---

+1 for the patch.

> Secondary namenode won't start if HDFS isn't the default file system
> 
>
> Key: HDFS-4413
> URL: https://issues.apache.org/jira/browse/HDFS-4413
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1-win, 1.2.1
>Reporter: Mostafa Elhemali
>Assignee: Mostafa Elhemali
> Attachments: HDFS-4413.branch-1.patch
>
>
> If HDFS is not the default file system (fs.default.name is something other 
> than hdfs://...), then the secondary namenode throws an exception early in its 
> initialization. This is a needless check as far as I can tell, and it blocks 
> scenarios where HDFS services are up but HDFS is not the default file system.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2576) Namenode should have a favored nodes hint to enable clients to have control over block placement.

2013-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627430#comment-13627430
 ] 

Hadoop QA commented on HDFS-2576:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12577936/hdfs-2576-trunk-7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4214//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4214//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4214//console

This message is automatically generated.

> Namenode should have a favored nodes hint to enable clients to have control 
> over block placement.
> -
>
> Key: HDFS-2576
> URL: https://issues.apache.org/jira/browse/HDFS-2576
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Pritam Damania
>Assignee: Devaraj Das
> Fix For: 2.0.5-beta
>
> Attachments: hdfs-2576-1.txt, hdfs-2576-trunk-1.patch, 
> hdfs-2576-trunk-2.patch, hdfs-2576-trunk-7.patch
>
>
> Sometimes clients like HBase need to dynamically compute the datanodes on 
> which they wish to place the blocks of a file, for a higher level of locality. 
> For this purpose there needs to be a way to give the Namenode a hint, in the 
> form of a favoredNodes parameter, about the locations where the client wants 
> to put each block. The proposed solution is a favored-nodes parameter in the 
> addBlock() method and in the create() file method to enable clients to give 
> hints to the NameNode about the locations of each replica of the block. Note 
> that this would be just a hint; the NameNode would ultimately look at disk 
> usage, datanode load, etc. and decide whether it can respect the hints or not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627420#comment-13627420
 ] 

Hadoop QA commented on HDFS-4679:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577937/HDFS-4679.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4213//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4213//console

This message is automatically generated.

> Namenode operation checks should be done in a consistent manner
> ---
>
> Key: HDFS-4679
> URL: https://issues.apache.org/jira/browse/HDFS-4679
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-4679.patch, HDFS-4679.patch
>
>
> Different operations perform checks in different orders. I propose 
> consistently checking the following in namenode operations:
> # Print a debug log message related to the operation
> # Validate the input parameters and file names
> # Grab the read or write lock
> #* check if the system is ready for the read or write operation
> #* check if the system is in safemode (for write operations)
> #* check permissions to see if the user is the owner, has access, or has 
> superuser privileges
> # Release the lock

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4675) Fix rename across snapshottable directories

2013-04-09 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4675:


Attachment: HDFS-4675.001.patch

Add more unit tests. Besides fixing the scenario described in the description, 
the patch also fixes two other issues:
1. Check whether the src is a snapshottable directory with snapshots before the 
rename.

2. Fix a bug in snapshot deletion: handling of a reference node when it is 1) in 
the deleted list of the to-be-deleted snapshot, and 2) also in the created list 
of the prior snapshot.
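
For reference, the scenario from the description translates roughly into the 
following test outline (assuming the DistributedFileSystem snapshot APIs 
allowSnapshot/createSnapshot from the snapshot branch; assertions omitted):
{code}
DistributedFileSystem hdfs = cluster.getFileSystem();
Path user1 = new Path("/user1");
Path user2 = new Path("/user2");
hdfs.mkdirs(user1);
hdfs.mkdirs(new Path(user2, "foo/bar"));
hdfs.allowSnapshot(user1);
hdfs.allowSnapshot(user2);

hdfs.createSnapshot(user1, "s1");   // t1
hdfs.createSnapshot(user2, "s2");   // t2
hdfs.createSnapshot(user1, "s3");   // t3

// Rename /user2/foo into /user1, then modify the renamed subtree.
hdfs.rename(new Path(user2, "foo"), new Path(user1, "foo"));
hdfs.delete(new Path(user1, "foo/bar"), true);
// The diff for bar should be recorded against s2 (the latest snapshot of the
// source subtree), not s3, since /user1/foo is still in s3's created list.
{code}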

> Fix rename across snapshottable directories
> ---
>
> Key: HDFS-4675
> URL: https://issues.apache.org/jira/browse/HDFS-4675
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-4675.000.patch, HDFS-4675.001.patch
>
>
> For rename across snapshottable directories, suppose there are two 
> snapshottable directories: /user1 and /user2 and we have the following steps:
> 1. Take snapshot s1 on /user1 at time t1.
> 2. Take snapshot s2 on /user2 at time t2.
> 3. Take snapshot s3 on /user1 at time t3.
> 4. Rename /user2/foo/ (an INodeDirectoryWithSnapshot instance) to /user1/foo/.
> After the rename, if we update the subtree of /user1/foo/ again (e.g., delete 
> /user1/foo/bar), we need to decide where to record the diff. The problem is 
> that the current implementation will identify s3 as the latest snapshot, thus 
> recording the snapshot copy of bar to s3. However, the parent of bar, 
> /user1/foo, is still in the created list of s3. Thus here we should record 
> the snapshot copy of bar to s2.
> If we further take snapshot s4 on /user1, and make some further change under 
> /user1/foo, these changes will be recorded in s4. Then if we delete 
> snapshot s4, similarly to the above, we should merge the changes into s2, not s3.
> Thus in general, we may need to record the latest snapshots of both the 
> src/dst subtree in the renamed inode and update the current 
> INodeDirectory#getExistingINodeInPath accordingly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4670) Style Hadoop HDFS web ui's with Twitter's bootstrap.

2013-04-09 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627399#comment-13627399
 ] 

Chris Nauroth commented on HDFS-4670:
-

Thank you for addressing my questions, Elliott.  One more follow-up:

{quote}
Yep right now there's no reliance on javascript.
{quote}

In that case, can you help me understand the inclusion of the js files in the 
patch?  I couldn't find anything linking to bootstrap.js, bootstrap.min.js, 
jquery.min.js, or tab.js.  I do see html5shiv.js linked from script tags in 
each of the pages.


> Style Hadoop HDFS web ui's with Twitter's bootstrap.
> 
>
> Key: HDFS-4670
> URL: https://issues.apache.org/jira/browse/HDFS-4670
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: ha2.PNG, Hadoop JournalNode.png, Hadoop NameNode.png, 
> HDFS-4670-0.patch, HDFS-4670-1.patch, hdfs_browser.png
>
>
> A users' first experience of Apache Hadoop is often looking at the web ui.  
> This should give the user confidence that the project is usable and 
> relatively current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4660) Duplicated checksum on DN in a recovered pipeline

2013-04-09 Thread PengZhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627388#comment-13627388
 ] 

PengZhang commented on HDFS-4660:
-

The scenario for this issue is simple: the client sends a packet, part of which 
needs to be written and part of which needs to be skipped. When the amount of 
data to skip reaches the chunk size, the receiver doesn't skip the corresponding 
checksum and ends up duplicating it.

But for a unit test, I found that receivePacket() has many dependencies and 
there was no test for it before. So I think it's not easy to add unit tests for 
it.

Any good ideas, Todd?
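
To make the fix direction concrete, the idea is roughly the following (a sketch 
with assumed variable names, not the actual patch):
{code}
// Sketch: when part of a packet's data is skipped because it is already on
// disk, the whole-chunk checksums covering that data must be skipped too.
long onDiskLen = replicaInfo.getBytesOnDisk();
long offsetInBlock = header.getOffsetInBlock();

if (onDiskLen > offsetInBlock) {
  long bytesToSkip = onDiskLen - offsetInBlock;              // data already written
  int chunksToSkip = (int) (bytesToSkip / bytesPerChecksum); // complete chunks
  dataBuf.position(dataBuf.position() + (int) bytesToSkip);
  // Skipping only the data, and not these checksum bytes, is what duplicates
  // the checksum of chunk 262 in the log above.
  checksumBuf.position(checksumBuf.position() + chunksToSkip * checksumSize);
}
{code}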

> Duplicated checksum on DN in a recovered pipeline
> -
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: PengZhang
>Priority: Critical
> Attachments: HDFS-4660.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_10042013-04-01
>  21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> client send packet after recover pipeline
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be ceil(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking the meta file on DN4, I found that the checksum of chunk 262 is 
> duplicated, but the data is not.
> Later, after the block was finalized, DN4's scanner detected the bad block and 
> reported it to the NN. The NN sent a command to delete this block and 
> re-replicate it from another DN in the pipeline to satisfy the replication 
> factor.
> I think this is because BlockReceiver skips data bytes already written, but 
> does not skip checksum bytes already written. And the function 
> adjustCrcFilePosition is only used for the last incomplete chunk, not for this 
> situation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4305) Add a configurable limit on number of blocks per file, and min block size

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627361#comment-13627361
 ] 

Suresh Srinivas commented on HDFS-4305:
---

bq. It would be an incompatible change, but perhaps a good one. What do folks 
think?
+1 for adding a limit on the max number of blocks per file, even though it is 
incompatible.

I am not sure if minimum block size is really required. I would rather make it 
a namenode WebUI status to say, your block size is way too small.

> Add a configurable limit on number of blocks per file, and min block size
> -
>
> Key: HDFS-4305
> URL: https://issues.apache.org/jira/browse/HDFS-4305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very very low and 
> managed to create a single file with hundreds of thousands of blocks. This 
> caused problems with the edit log since the OP_ADD op was so large 
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To 
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are 
> rejected
> - introduce a configurable maximum number of blocks per file, above which 
> requests to add another block are rejected (with a suitably high default so as 
> not to prevent legitimate large files)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-2576) Namenode should have a favored nodes hint to enable clients to have control over block placement.

2013-04-09 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-2576:
--

Fix Version/s: 2.0.5-beta

> Namenode should have a favored nodes hint to enable clients to have control 
> over block placement.
> -
>
> Key: HDFS-2576
> URL: https://issues.apache.org/jira/browse/HDFS-2576
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Pritam Damania
>Assignee: Devaraj Das
> Fix For: 2.0.5-beta
>
> Attachments: hdfs-2576-1.txt, hdfs-2576-trunk-1.patch, 
> hdfs-2576-trunk-2.patch, hdfs-2576-trunk-7.patch
>
>
> Sometimes clients like HBase need to dynamically compute the datanodes on 
> which they wish to place the blocks of a file, for a higher level of locality. 
> For this purpose there needs to be a way to give the Namenode a hint, in the 
> form of a favoredNodes parameter, about the locations where the client wants 
> to put each block. The proposed solution is a favored-nodes parameter in the 
> addBlock() method and in the create() file method to enable clients to give 
> hints to the NameNode about the locations of each replica of the block. Note 
> that this would be just a hint; the NameNode would ultimately look at disk 
> usage, datanode load, etc. and decide whether it can respect the hints or not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4679:
--

Attachment: HDFS-4679.patch

Updated patch with the unit test changed to expect SafeModeException from the 
setTimes operation.

> Namenode operation checks should be done in a consistent manner
> ---
>
> Key: HDFS-4679
> URL: https://issues.apache.org/jira/browse/HDFS-4679
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-4679.patch, HDFS-4679.patch
>
>
> Different operations perform checks in different orders. I propose 
> consistently checking the following in namenode operations:
> # Print a debug log message related to the operation
> # Validate the input parameters and file names
> # Grab the read or write lock
> #* check if the system is ready for the read or write operation
> #* check if the system is in safemode (for write operations)
> #* check permissions to see if the user is the owner, has access, or has 
> superuser privileges
> # Release the lock

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-2576) Namenode should have a favored nodes hint to enable clients to have control over block placement.

2013-04-09 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-2576:
--

Assignee: Devaraj Das
  Status: Patch Available  (was: Open)

> Namenode should have a favored nodes hint to enable clients to have control 
> over block placement.
> -
>
> Key: HDFS-2576
> URL: https://issues.apache.org/jira/browse/HDFS-2576
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Pritam Damania
>Assignee: Devaraj Das
> Attachments: hdfs-2576-1.txt, hdfs-2576-trunk-1.patch, 
> hdfs-2576-trunk-2.patch, hdfs-2576-trunk-7.patch
>
>
> Sometimes clients like HBase need to dynamically compute the datanodes on 
> which they wish to place the blocks of a file, for a higher level of locality. 
> For this purpose there needs to be a way to give the Namenode a hint, in the 
> form of a favoredNodes parameter, about the locations where the client wants 
> to put each block. The proposed solution is a favored-nodes parameter in the 
> addBlock() method and in the create() file method to enable clients to give 
> hints to the NameNode about the locations of each replica of the block. Note 
> that this would be just a hint; the NameNode would ultimately look at disk 
> usage, datanode load, etc. and decide whether it can respect the hints or not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627353#comment-13627353
 ] 

Suresh Srinivas edited comment on HDFS-4679 at 4/10/13 1:28 AM:


In this patch, along with the suggested changes in the description, I fixed the 
following issues I found:
# FSNamesystem#setTimes did not check for safemode and throw SafeModeException.
# FSNamesystem#createSymlinkInt did not check for safemode and throw 
SafeModeException.


The test failure is due to my patch adding a check for safemode in 
FSNamesystem#setTimes.

Was there any reason why the FSNamesystem#setTimes() write operation did not 
check for safemode? Given the lack of comments, I will assume that it is a bug.

  was (Author: sureshms):
In this patch, along with the suggested changes in description, I found an 
issue where FSNamesystem#setTimes was not checking for safemode and throwing 
SafemodeException. The test failure is due to my patch adding check for 
safemode.

Was there any reason why FSNamesystem#setTimes() write operation does not check 
for safemode? Given the lack of comments I will assume that it is a bug.
  
> Namenode operation checks should be done in a consistent manner
> ---
>
> Key: HDFS-4679
> URL: https://issues.apache.org/jira/browse/HDFS-4679
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-4679.patch
>
>
> Different operations perform checks in different orders. I propose 
> consistently checking the following in namenode operations:
> # Print a debug log message related to the operation
> # Validate the input parameters and file names
> # Grab the read or write lock
> #* check if the system is ready for the read or write operation
> #* check if the system is in safemode (for write operations)
> #* check permissions to see if the user is the owner, has access, or has 
> superuser privileges
> # Release the lock

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-2576) Namenode should have a favored nodes hint to enable clients to have control over block placement.

2013-04-09 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-2576:
--

Attachment: hdfs-2576-trunk-7.patch

Updated the patch w.r.t. the current trunk.

> Namenode should have a favored nodes hint to enable clients to have control 
> over block placement.
> -
>
> Key: HDFS-2576
> URL: https://issues.apache.org/jira/browse/HDFS-2576
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Pritam Damania
> Attachments: hdfs-2576-1.txt, hdfs-2576-trunk-1.patch, 
> hdfs-2576-trunk-2.patch, hdfs-2576-trunk-7.patch
>
>
> Sometimes clients like HBase need to dynamically compute the datanodes on 
> which they wish to place the blocks of a file, for a higher level of locality. 
> For this purpose there needs to be a way to give the Namenode a hint, in the 
> form of a favoredNodes parameter, about the locations where the client wants 
> to put each block. The proposed solution is a favored-nodes parameter in the 
> addBlock() method and in the create() file method to enable clients to give 
> hints to the NameNode about the locations of each replica of the block. Note 
> that this would be just a hint; the NameNode would ultimately look at disk 
> usage, datanode load, etc. and decide whether it can respect the hints or not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627353#comment-13627353
 ] 

Suresh Srinivas commented on HDFS-4679:
---

In this patch, along with the suggested changes in the description, I found an 
issue where FSNamesystem#setTimes was not checking for safemode and throwing 
SafeModeException. The test failure is due to my patch adding a check for 
safemode.

Was there any reason why the FSNamesystem#setTimes() write operation did not 
check for safemode? Given the lack of comments, I will assume that it is a bug.
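
For illustration, the consistent check order applied to setTimes would look 
roughly like this (a sketch with illustrative names, not the exact patch):
{code}
void setTimes(String src, long mtime, long atime) throws IOException {
  writeLock();
  try {
    checkOperation(OperationCategory.WRITE);   // is the NN ready for writes?
    if (isInSafeMode()) {                      // the previously missing check
      throw new SafeModeException("Cannot set times for " + src, safeMode);
    }
    checkPathAccess(src, FsAction.WRITE);      // permission check
    // ... update the inode's times and log the edit ...
  } finally {
    writeUnlock();
  }
}
{code}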

> Namenode operation checks should be done in a consistent manner
> ---
>
> Key: HDFS-4679
> URL: https://issues.apache.org/jira/browse/HDFS-4679
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-4679.patch
>
>
> Different operations perform checks in different orders. I propose 
> consistently checking the following in namenode operations:
> # Print a debug log message related to the operation
> # Validate the input parameters and file names
> # Grab the read or write lock
> #* check if the system is ready for the read or write operation
> #* check if the system is in safemode (for write operations)
> #* check permissions to see if the user is the owner, has access, or has 
> superuser privileges
> # Release the lock

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4670) Style Hadoop HDFS web ui's with Twitter's bootstrap.

2013-04-09 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627336#comment-13627336
 ] 

Fengdong Yu commented on HDFS-4670:
---

bq. Fengdong Yu
I'll have to check but I thought that browse filesystem button is functional on 
the standby namenode. In which case displaying it isn't an error.

Are you serious?

Did you deploy using the official Hadoop 2 tar package with HA enabled? If you 
don't know how to deploy HA, just post your question here.

The standby namenode cannot serve reads or writes; if you click "browse the 
file system" on the standby namenode, you just get an exception, nothing else.

So you can change the UI style, but you cannot change that behavior, since it is 
by design, right?


> Style Hadoop HDFS web ui's with Twitter's bootstrap.
> 
>
> Key: HDFS-4670
> URL: https://issues.apache.org/jira/browse/HDFS-4670
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: ha2.PNG, Hadoop JournalNode.png, Hadoop NameNode.png, 
> HDFS-4670-0.patch, HDFS-4670-1.patch, hdfs_browser.png
>
>
> A users' first experience of Apache Hadoop is often looking at the web ui.  
> This should give the user confidence that the project is usable and 
> relatively current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4672) Support tiered storage policies

2013-04-09 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627309#comment-13627309
 ] 

Andrew Purtell commented on HDFS-4672:
--

bq. In terms of storage, you would see a big win by only requiring one 
attribute per dir vs per file & again you would have only a single place to 
look, so less code.

Can you clarify if you mean only ONE attribute per directory or file, or if you 
mean that one (or more) attributes apply only to the one directory or file they 
are associated with?

> Support tiered storage policies
> ---
>
> Key: HDFS-4672
> URL: https://issues.apache.org/jira/browse/HDFS-4672
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs-client, libhdfs, namenode
>Reporter: Andrew Purtell
>
> We would like to be able to create certain files on certain storage device 
> classes (e.g. spinning media, solid state devices, RAM disk, non-volatile 
> memory). HDFS-2832 enables heterogeneous storage at the DataNode, so the 
> NameNode can gain awareness of what different storage options are available 
> in the pool and where they are located, but no API is provided for clients or 
> block placement plugins to perform device aware block placement. We would 
> like to propose a set of extensions that also have broad applicability to use 
> cases where storage device affinity is important:
>  
> - Add an enum of generic storage device classes, borrowing from current 
> taxonomy of the storage industry
>  
> - Augment DataNode volume metadata in storage reports with this enum
>  
> - Extend the namespace so pluggable block policies can be specified on a 
> directory and storage device class can be tracked in the Inode. Perhaps this 
> could be a larger discussion on adding support for extended attributes in the 
> HDFS namespace. The Inode should track both the storage device class hint and 
> the current actual storage device class. FileStatus should expose this 
> information (or xattrs in general) to clients.
>  
> - Extend the pluggable block policy framework so policies can also consider, 
> and specify, affinity for a particular storage device class
>  
> - Extend the file creation API to accept a storage device class affinity 
> hint. Such a hint can be supplied directly as a parameter, or, if we are 
> considering extended attribute support, then instead as one of a set of 
> xattrs. The hint would be stored in the namespace and also used by the client 
> to indicate to the NameNode/block placement policy/DataNode constraints on 
> block placement. Furthermore, if xattrs or device storage class affinity 
> hints are associated with directories, then the NameNode should provide the 
> storage device affinity hint to the client in the create API response, so the 
> client can provide the appropriate hint to DataNodes when writing new blocks.
>  
> - The list of candidate DataNodes for new blocks supplied by the NameNode to 
> clients should be weighted/sorted by availability of the desired storage 
> device class. 
>  
> - Block replication should consider storage device affinity hints. If a 
> client move()s a file from a location under a path with affinity hint X to 
> under a path with affinity hint Y, then all blocks currently residing on 
> media X should be eventually replicated onto media Y with the then excess 
> replicas on media X deleted.
>  
> - Introduce the concept of degraded path: a path can be degraded if a block 
> placement policy is forced to abandon a constraint in order to persist the 
> block, when there may not be available space on the desired device class, or 
> to maintain the minimum necessary replication factor. This concept is 
> distinct from the corrupt path, where one or more blocks are missing. Paths 
> in degraded state should be periodically reevaluated for re-replication.
>  
> - The FSShell should be extended with commands for changing the storage 
> device class hint for a directory or file. 
>  
> - Clients like DistCP which compare metadata should be extended to be aware 
> of the storage device class hint. For DistCP specifically, there should be an 
> option to ignore the storage device class hints, enabled by default.
>  
> Suggested semantics:
>  
> - The default storage device class should be the null class, or simply the 
> “default class”, for all cases where a hint is not available. This should be 
> configurable. hdfs-defaults.xml could provide the default as spinning media.
>  
> - A storage device class hint should be provided (and is necessary) only when 
> the default is not sufficient.
>  
> - For backwards compatibility, any FSImage or edit log entry lacking a  
> storage device class hint is interpreted as having affinity for the null 
> class.
>  
> - All blocks for a given file share the

[jira] [Commented] (HDFS-4672) Support tiered storage policies

2013-04-09 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627307#comment-13627307
 ] 

eric baldeschwieler commented on HDFS-4672:
---

The complexity of lookup is why I'd suggested only the immediately containing 
directory.  You don't want to have to walk the tree to see what policy applies. 
 Checking exactly one directory would be a lot simpler.

In terms of storage, you would see a big win by only requiring one attribute 
per dir vs. per file, and again you would have only a single place to look, so 
less code.
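
A minimal sketch of that one-directory lookup (the enum values and accessor 
names are illustrative, not an existing API):
{code}
enum StorageDeviceClass { DEFAULT, HDD, SSD, RAM_DISK, NVM }

// Only the immediately containing directory is consulted; no tree walk.
StorageDeviceClass resolveStorageClass(INodeFile file) {
  INodeDirectory parent = file.getParent();
  StorageDeviceClass hint =
      (parent == null) ? null : parent.getStorageClassHint(); // hypothetical accessor
  return hint != null ? hint : StorageDeviceClass.DEFAULT;    // null/default class
}
{code}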



> Support tiered storage policies
> ---
>
> Key: HDFS-4672
> URL: https://issues.apache.org/jira/browse/HDFS-4672
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs-client, libhdfs, namenode
>Reporter: Andrew Purtell
>
> We would like to be able to create certain files on certain storage device 
> classes (e.g. spinning media, solid state devices, RAM disk, non-volatile 
> memory). HDFS-2832 enables heterogeneous storage at the DataNode, so the 
> NameNode can gain awareness of what different storage options are available 
> in the pool and where they are located, but no API is provided for clients or 
> block placement plugins to perform device aware block placement. We would 
> like to propose a set of extensions that also have broad applicability to use 
> cases where storage device affinity is important:
>  
> - Add an enum of generic storage device classes, borrowing from current 
> taxonomy of the storage industry
>  
> - Augment DataNode volume metadata in storage reports with this enum
>  
> - Extend the namespace so pluggable block policies can be specified on a 
> directory and storage device class can be tracked in the Inode. Perhaps this 
> could be a larger discussion on adding support for extended attributes in the 
> HDFS namespace. The Inode should track both the storage device class hint and 
> the current actual storage device class. FileStatus should expose this 
> information (or xattrs in general) to clients.
>  
> - Extend the pluggable block policy framework so policies can also consider, 
> and specify, affinity for a particular storage device class
>  
> - Extend the file creation API to accept a storage device class affinity 
> hint. Such a hint can be supplied directly as a parameter, or, if we are 
> considering extended attribute support, then instead as one of a set of 
> xattrs. The hint would be stored in the namespace and also used by the client 
> to indicate to the NameNode/block placement policy/DataNode constraints on 
> block placement. Furthermore, if xattrs or device storage class affinity 
> hints are associated with directories, then the NameNode should provide the 
> storage device affinity hint to the client in the create API response, so the 
> client can provide the appropriate hint to DataNodes when writing new blocks.
>  
> - The list of candidate DataNodes for new blocks supplied by the NameNode to 
> clients should be weighted/sorted by availability of the desired storage 
> device class. 
>  
> - Block replication should consider storage device affinity hints. If a 
> client move()s a file from a location under a path with affinity hint X to 
> under a path with affinity hint Y, then all blocks currently residing on 
> media X should be eventually replicated onto media Y with the then excess 
> replicas on media X deleted.
>  
> - Introduce the concept of degraded path: a path can be degraded if a block 
> placement policy is forced to abandon a constraint in order to persist the 
> block, when there may not be available space on the desired device class, or 
> to maintain the minimum necessary replication factor. This concept is 
> distinct from the corrupt path, where one or more blocks are missing. Paths 
> in degraded state should be periodically reevaluated for re-replication.
>  
> - The FSShell should be extended with commands for changing the storage 
> device class hint for a directory or file. 
>  
> - Clients like DistCP which compare metadata should be extended to be aware 
> of the storage device class hint. For DistCP specifically, there should be an 
> option to ignore the storage device class hints, enabled by default.
>  
> Suggested semantics:
>  
> - The default storage device class should be the null class, or simply the 
> “default class”, for all cases where a hint is not available. This should be 
> configurable. hdfs-defaults.xml could provide the default as spinning media.
>  
> - A storage device class hint should be provided (and is necessary) only when 
> the default is not sufficient.
>  
> - For backwards compatibility, any FSImage or edit log entry lacking a  
> storage device class hint is interpreted as having affinity for the null 
> class.
>  

[jira] [Commented] (HDFS-4672) Support tiered storage policies

2013-04-09 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627272#comment-13627272
 ] 

Colin Patrick McCabe commented on HDFS-4672:


We can add extended attributes in a way that imposes zero overhead for users 
who don't make use of them, by creating another subclass (or subclasses) of 
INode.  Inherited xattrs (that apply to all descendants) is also a reasonable 
idea.

> Support tiered storage policies
> ---
>
> Key: HDFS-4672
> URL: https://issues.apache.org/jira/browse/HDFS-4672
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs-client, libhdfs, namenode
>Reporter: Andrew Purtell
>
> We would like to be able to create certain files on certain storage device 
> classes (e.g. spinning media, solid state devices, RAM disk, non-volatile 
> memory). HDFS-2832 enables heterogeneous storage at the DataNode, so the 
> NameNode can gain awareness of what different storage options are available 
> in the pool and where they are located, but no API is provided for clients or 
> block placement plugins to perform device aware block placement. We would 
> like to propose a set of extensions that also have broad applicability to use 
> cases where storage device affinity is important:
>  
> - Add an enum of generic storage device classes, borrowing from current 
> taxonomy of the storage industry
>  
> - Augment DataNode volume metadata in storage reports with this enum
>  
> - Extend the namespace so pluggable block policies can be specified on a 
> directory and storage device class can be tracked in the Inode. Perhaps this 
> could be a larger discussion on adding support for extended attributes in the 
> HDFS namespace. The Inode should track both the storage device class hint and 
> the current actual storage device class. FileStatus should expose this 
> information (or xattrs in general) to clients.
>  
> - Extend the pluggable block policy framework so policies can also consider, 
> and specify, affinity for a particular storage device class
>  
> - Extend the file creation API to accept a storage device class affinity 
> hint. Such a hint can be supplied directly as a parameter, or, if we are 
> considering extended attribute support, then instead as one of a set of 
> xattrs. The hint would be stored in the namespace and also used by the client 
> to indicate to the NameNode/block placement policy/DataNode constraints on 
> block placement. Furthermore, if xattrs or device storage class affinity 
> hints are associated with directories, then the NameNode should provide the 
> storage device affinity hint to the client in the create API response, so the 
> client can provide the appropriate hint to DataNodes when writing new blocks.
>  
> - The list of candidate DataNodes for new blocks supplied by the NameNode to 
> clients should be weighted/sorted by availability of the desired storage 
> device class. 
>  
> - Block replication should consider storage device affinity hints. If a 
> client move()s a file from a location under a path with affinity hint X to 
> under a path with affinity hint Y, then all blocks currently residing on 
> media X should be eventually replicated onto media Y with the then excess 
> replicas on media X deleted.
>  
> - Introduce the concept of degraded path: a path can be degraded if a block 
> placement policy is forced to abandon a constraint in order to persist the 
> block, when there may not be available space on the desired device class, or 
> to maintain the minimum necessary replication factor. This concept is 
> distinct from the corrupt path, where one or more blocks are missing. Paths 
> in degraded state should be periodically reevaluated for re-replication.
>  
> - The FSShell should be extended with commands for changing the storage 
> device class hint for a directory or file. 
>  
> - Clients like DistCP which compare metadata should be extended to be aware 
> of the storage device class hint. For DistCP specifically, there should be an 
> option to ignore the storage device class hints, enabled by default.
>  
> Suggested semantics:
>  
> - The default storage device class should be the null class, or simply the 
> “default class”, for all cases where a hint is not available. This should be 
> configurable. hdfs-defaults.xml could provide the default as spinning media.
>  
> - A storage device class hint should be provided (and is necessary) only when 
> the default is not sufficient.
>  
> - For backwards compatibility, any FSImage or edit log entry lacking a  
> storage device class hint is interpreted as having affinity for the null 
> class.
>  
> - All blocks for a given file share the same storage device class. If the 
> replication factor for this file is increased the replicas shou

[jira] [Commented] (HDFS-4677) Editlog should support synchronous writes

2013-04-09 Thread Ivan Mitic (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627273#comment-13627273
 ] 

Ivan Mitic commented on HDFS-4677:
--

bq. 
http://stas-blogspot.blogspot.com/2011/11/java-file-flushing-performance.html
Thanks, indeed interesting data. The difference on Windows between "rwd" (which 
is also based on FILE_FLAG_WRITE_THROUGH) and {{FileChannel#force}} looks 
reasonable.
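
To make the comparison concrete, here is a toy sketch (not the actual FSEditLog 
code) of the two flush strategies being discussed; the file name and record 
contents are made up:

{code:java}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;

public class EditLogSyncSketch {
  public static void main(String[] args) throws IOException {
    File f = new File("edits.sketch");
    byte[] record = new byte[] {1, 2, 3, 4};

    // Current style: buffered writes followed by an explicit force(), which
    // maps to FlushFileBuffers on Windows.
    RandomAccessFile raf = new RandomAccessFile(f, "rw");
    try {
      FileChannel ch = raf.getChannel();
      raf.write(record);
      ch.force(false);
    } finally {
      raf.close();
    }

    // Proposed alternative: open with "rws" (or "rwd") so every write is
    // synchronous, which maps to FILE_FLAG_WRITE_THROUGH on Windows; no
    // separate force() call is needed.
    RandomAccessFile sync = new RandomAccessFile(f, "rws");
    try {
      sync.write(record);
    } finally {
      sync.close();
    }
  }
}
{code}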

> Editlog should support synchronous writes
> -
>
> Key: HDFS-4677
> URL: https://issues.apache.org/jira/browse/HDFS-4677
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1-win
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
>
> In the current implementation, NameNode editlog performs syncs to the 
> persistent storage using the {{FileChannel#force}} Java APIs. This API is 
> documented to be slower compared to an alternative where {{RandomAccessFile}} 
> is opened with "rws" flags (synchronous writes). 
> We instrumented {{FileChannel#force}} on Windows, and in some 
> software/hardware configurations it can perform significantly slower than the 
> "rws" alternative. 
> In terms of the Windows APIs, FileChannel#force internally calls 
> [FlushFileBuffers|http://msdn.microsoft.com/en-us/library/windows/desktop/aa364439(v=vs.85).aspx]
>  while RandomAccessFile (“rws”) opens the file with the 
> [FILE_FLAG_WRITE_THROUGH flag|http://support.microsoft.com/kb/99794]. 
> With this Jira I'd like to introduce a flag that provides a means to configure 
> the NameNode to use synchronous writes. There is a catch though: the behavior of 
> the "rws" flags is platform and hardware specific and might not provide the 
> same level of guarantees as {{FileChannel#force}} w.r.t. flushing the on-disk 
> cache. This is an expert level setting, and it should be documented as such.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3934) duplicative dfs_hosts entries handled wrong

2013-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627248#comment-13627248
 ] 

Hadoop QA commented on HDFS-3934:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577899/HDFS-3934.004.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4212//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4212//console

This message is automatically generated.

> duplicative dfs_hosts entries handled wrong
> ---
>
> Key: HDFS-3934
> URL: https://issues.apache.org/jira/browse/HDFS-3934
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
> HDFS-3934.003.patch, HDFS-3934.004.patch
>
>
> A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
> hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
> after the NN restarts because {{getDatanodeListForReport}} does not handle 
> such a "pseudo-duplicate" correctly:
> # the "Remove any nodes we know about from the map" loop no longer has the 
> knowledge to remove the spurious entries
> # the "The remaining nodes are ones that are referenced by the hosts files" 
> loop does not do hostname lookups, so does not know that the IP and hostname 
> refer to the same host.
> Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
> the JSP output:  The *Node* column shows ":50010" as the nodename, with HTML 
> markup {{ href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020";
>  title="172.29.97.216:50010">:50010}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4300) TransferFsImage.downloadEditsToStorage should use a tmp file for destination

2013-04-09 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627223#comment-13627223
 ] 

Andrew Wang commented on HDFS-4300:
---

I think this is okay. A storage failure while downloading will blacklist the 
storage (I believe until the SNN is restarted), preventing rename. It also 
redownloads edits on a future attempt if the final edits file does not exist. 
So when we go to rename, we should only be iterating over valid storages with 
freshly downloaded files.

> TransferFsImage.downloadEditsToStorage should use a tmp file for destination
> 
>
> Key: HDFS-4300
> URL: https://issues.apache.org/jira/browse/HDFS-4300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Andrew Wang
>Priority: Critical
> Attachments: hdfs-4300-1.patch
>
>
> Currently, in TransferFsImage.downloadEditsToStorage, we download the edits 
> file directly to its finalized path. So, if the transfer fails in the middle, 
> a half-written file is left and cannot be distinguished from a correct file. 
> So, future checkpoints by the 2NN will fail, since the file is truncated in 
> the middle -- but it won't ever download a good copy because it thinks it 
> already has the proper file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4672) Support tiered storage policies

2013-04-09 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627219#comment-13627219
 ] 

Andrew Purtell commented on HDFS-4672:
--

bq. One way to reduce complexity and RAM pressure would be to only support 
placement hints on directories and have them apply only to files in that 
immediate directory.  That should limit meta-data cost and address HBase and 
other use cases.

On minimizing RAM pressure, the thing to do here might be to allow hints on a 
directory to apply to all descendants. Otherwise, if we have N directories 
under one parent, then we would need N hints instead of 1. 

If the proposal for storage device class hints is to be generalized/incorporated 
into an extended attributes facility, then this may be an interesting 
discussion. In the case of at least Linux, Windows NT+, and *BSD, xattrs are 
arbitrary name/value pairs associated only with a single file or directory 
object, and a query on a given file or directory returns only the xattrs found 
in its inode (or equivalent). However, since namespace storage in HDFS is at a 
premium, it may make sense to introduce a bit that signals the xattr should be 
inherited by all descendants.
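
A rough sketch of how such an inherit bit could be resolved at lookup time; the 
names here are illustrative only and not a proposal for the actual INode API:

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// If an xattr is not set on the node itself, walk up the tree looking for an
// ancestor that set it with the "inheritable" bit.
class XAttrNodeSketch {
  final XAttrNodeSketch parent;
  final Map<String, String> xattrs = new HashMap<String, String>();
  final Set<String> inheritable = new HashSet<String>();

  XAttrNodeSketch(XAttrNodeSketch parent) {
    this.parent = parent;
  }

  String getEffectiveXAttr(String name) {
    if (xattrs.containsKey(name)) {
      return xattrs.get(name);            // set directly on this node
    }
    for (XAttrNodeSketch p = parent; p != null; p = p.parent) {
      if (p.inheritable.contains(name) && p.xattrs.containsKey(name)) {
        return p.xattrs.get(name);        // inherited from an ancestor
      }
    }
    return null;                          // fall through to the default class
  }
}
{code}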

> Support tiered storage policies
> ---
>
> Key: HDFS-4672
> URL: https://issues.apache.org/jira/browse/HDFS-4672
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs-client, libhdfs, namenode
>Reporter: Andrew Purtell
>
> We would like to be able to create certain files on certain storage device 
> classes (e.g. spinning media, solid state devices, RAM disk, non-volatile 
> memory). HDFS-2832 enables heterogeneous storage at the DataNode, so the 
> NameNode can gain awareness of what different storage options are available 
> in the pool and where they are located, but no API is provided for clients or 
> block placement plugins to perform device aware block placement. We would 
> like to propose a set of extensions that also have broad applicability to use 
> cases where storage device affinity is important:
>  
> - Add an enum of generic storage device classes, borrowing from current 
> taxonomy of the storage industry
>  
> - Augment DataNode volume metadata in storage reports with this enum
>  
> - Extend the namespace so pluggable block policies can be specified on a 
> directory and storage device class can be tracked in the Inode. Perhaps this 
> could be a larger discussion on adding support for extended attributes in the 
> HDFS namespace. The Inode should track both the storage device class hint and 
> the current actual storage device class. FileStatus should expose this 
> information (or xattrs in general) to clients.
>  
> - Extend the pluggable block policy framework so policies can also consider, 
> and specify, affinity for a particular storage device class
>  
> - Extend the file creation API to accept a storage device class affinity 
> hint. Such a hint can be supplied directly as a parameter, or, if we are 
> considering extended attribute support, then instead as one of a set of 
> xattrs. The hint would be stored in the namespace and also used by the client 
> to indicate to the NameNode/block placement policy/DataNode constraints on 
> block placement. Furthermore, if xattrs or device storage class affinity 
> hints are associated with directories, then the NameNode should provide the 
> storage device affinity hint to the client in the create API response, so the 
> client can provide the appropriate hint to DataNodes when writing new blocks.
>  
> - The list of candidate DataNodes for new blocks supplied by the NameNode to 
> clients should be weighted/sorted by availability of the desired storage 
> device class. 
>  
> - Block replication should consider storage device affinity hints. If a 
> client move()s a file from a location under a path with affinity hint X to 
> under a path with affinity hint Y, then all blocks currently residing on 
> media X should be eventually replicated onto media Y with the then excess 
> replicas on media X deleted.
>  
> - Introduce the concept of degraded path: a path can be degraded if a block 
> placement policy is forced to abandon a constraint in order to persist the 
> block, when there may not be available space on the desired device class, or 
> to maintain the minimum necessary replication factor. This concept is 
> distinct from the corrupt path, where one or more blocks are missing. Paths 
> in degraded state should be periodically reevaluated for re-replication.
>  
> - The FSShell should be extended with commands for changing the storage 
> device class hint for a directory or file. 
>  
> - Clients like DistCP which compare metadata should be extended to be aware 
> of the storage device class hint. For DistCP specifically,

[jira] [Commented] (HDFS-4672) Support tiered storage policies

2013-04-09 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627217#comment-13627217
 ] 

Andrew Purtell commented on HDFS-4672:
--

bq. It would be particularly interesting if people could use both flash and 
hard disks in the same cluster.  Perhaps the flash could be used for 
HBase-backed storage, and the hard disks for everything else, for example.

That is certainly a use case we are looking at. More specifically, migrating 
the blocks of a given column that is accessed in a read-mostly, random-access 
manner to the most suitable available storage device class for that type of 
workload. That would be the first aim of HBASE-6572. 

bq. I feel like we might also want to enable automatic migration between tiers, 
at least for some files.  I suppose this could also be done outside HDFS, with 
a daemon that looks at file access times (atimes) and attaches the correct 
xattrs. 

Our thinking is that block placement and replication policy plug points could 
be extended or introduced so that it's not necessary to deploy and manage an 
additional set of daemons, but that is only one possible implementation option.

> Support tiered storage policies
> ---
>
> Key: HDFS-4672
> URL: https://issues.apache.org/jira/browse/HDFS-4672
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs-client, libhdfs, namenode
>Reporter: Andrew Purtell
>
> We would like to be able to create certain files on certain storage device 
> classes (e.g. spinning media, solid state devices, RAM disk, non-volatile 
> memory). HDFS-2832 enables heterogeneous storage at the DataNode, so the 
> NameNode can gain awareness of what different storage options are available 
> in the pool and where they are located, but no API is provided for clients or 
> block placement plugins to perform device aware block placement. We would 
> like to propose a set of extensions that also have broad applicability to use 
> cases where storage device affinity is important:
>  
> - Add an enum of generic storage device classes, borrowing from current 
> taxonomy of the storage industry
>  
> - Augment DataNode volume metadata in storage reports with this enum
>  
> - Extend the namespace so pluggable block policies can be specified on a 
> directory and storage device class can be tracked in the Inode. Perhaps this 
> could be a larger discussion on adding support for extended attributes in the 
> HDFS namespace. The Inode should track both the storage device class hint and 
> the current actual storage device class. FileStatus should expose this 
> information (or xattrs in general) to clients.
>  
> - Extend the pluggable block policy framework so policies can also consider, 
> and specify, affinity for a particular storage device class
>  
> - Extend the file creation API to accept a storage device class affinity 
> hint. Such a hint can be supplied directly as a parameter, or, if we are 
> considering extended attribute support, then instead as one of a set of 
> xattrs. The hint would be stored in the namespace and also used by the client 
> to indicate to the NameNode/block placement policy/DataNode constraints on 
> block placement. Furthermore, if xattrs or device storage class affinity 
> hints are associated with directories, then the NameNode should provide the 
> storage device affinity hint to the client in the create API response, so the 
> client can provide the appropriate hint to DataNodes when writing new blocks.
>  
> - The list of candidate DataNodes for new blocks supplied by the NameNode to 
> clients should be weighted/sorted by availability of the desired storage 
> device class. 
>  
> - Block replication should consider storage device affinity hints. If a 
> client move()s a file from a location under a path with affinity hint X to 
> under a path with affinity hint Y, then all blocks currently residing on 
> media X should be eventually replicated onto media Y with the then excess 
> replicas on media X deleted.
>  
> - Introduce the concept of degraded path: a path can be degraded if a block 
> placement policy is forced to abandon a constraint in order to persist the 
> block, when there may not be available space on the desired device class, or 
> to maintain the minimum necessary replication factor. This concept is 
> distinct from the corrupt path, where one or more blocks are missing. Paths 
> in degraded state should be periodically reevaluated for re-replication.
>  
> - The FSShell should be extended with commands for changing the storage 
> device class hint for a directory or file. 
>  
> - Clients like DistCP which compare metadata should be extended to be aware 
> of the storage device class hint. For DistCP specifically, there should be an 
> option to ignore the storage device 

[jira] [Created] (HDFS-4680) Audit logging of client names

2013-04-09 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-4680:
-

 Summary: Audit logging of client names
 Key: HDFS-4680
 URL: https://issues.apache.org/jira/browse/HDFS-4680
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode, security
Affects Versions: 2.0.3-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang


HDFS audit logging tracks HDFS operations made by different users, e.g. 
creation and deletion of files. This is useful for after-the-fact root cause 
analysis and security. However, logging merely the username is insufficient for 
many use cases. For instance, it is common for a single user to run multiple 
MapReduce jobs (I believe this is the case with Hive). In this scenario, given 
a particular audit log entry, it is difficult to trace it back to the MR job or 
task that generated that entry.

I see a number of potential options for implementing this.

1. Make an optional "client name" field part of the NN RPC format. We already 
pass a {{clientName}} as a parameter in many RPC calls, so this would 
essentially make it standardized. MR tasks could then set this field to the job 
and task ID.
2. This could be generalized to a set of optional key-value *tags* in the NN 
RPC format, which would then be audit logged. This has standalone benefits 
outside of just verifying MR task ids.
3. Neither of the above two options actually securely verifies that MR clients 
are who they claim to be. Doing this securely requires the JobTracker to sign 
MR task attempts and the NN to verify that signature. However, this is 
substantially more work, and could be built on top of idea #2.

Thoughts welcomed.
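
Purely as an illustration of options 1 and 2, an audit line might gain a field 
like the one below; the field names (clientName, tags) are hypothetical and not 
part of the current audit log format:

{code:java}
public class AuditLineSketch {
  public static void main(String[] args) {
    String ugi = "hive";
    String cmd = "create";
    String src = "/warehouse/t1/part-00000";
    // Hypothetical: an MR task would set this to its task attempt id.
    String clientName = "attempt_201304091200_0042_m_000003_0";

    // Build a tab-separated audit entry with the extra field appended.
    String line = String.format("ugi=%s\tcmd=%s\tsrc=%s\tclientName=%s",
        ugi, cmd, src, clientName);
    System.out.println(line);
  }
}
{code}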


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3934) duplicative dfs_hosts entries handled wrong

2013-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627193#comment-13627193
 ] 

Hadoop QA commented on HDFS-3934:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577885/HDFS-3934.003.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4211//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4211//console

This message is automatically generated.

> duplicative dfs_hosts entries handled wrong
> ---
>
> Key: HDFS-3934
> URL: https://issues.apache.org/jira/browse/HDFS-3934
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
> HDFS-3934.003.patch, HDFS-3934.004.patch
>
>
> A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
> hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
> after the NN restarts because {{getDatanodeListForReport}} does not handle 
> such a "pseudo-duplicate" correctly:
> # the "Remove any nodes we know about from the map" loop no longer has the 
> knowledge to remove the spurious entries
> # the "The remaining nodes are ones that are referenced by the hosts files" 
> loop does not do hostname lookups, so does not know that the IP and hostname 
> refer to the same host.
> Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
> the JSP output:  The *Node* column shows ":50010" as the nodename, with HTML 
> markup {{ href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020";
>  title="172.29.97.216:50010">:50010}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4305) Add a configurable limit on number of blocks per file, and min block size

2013-04-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627159#comment-13627159
 ] 

Todd Lipcon commented on HDFS-4305:
---

This looks good to me.

One possible change: we can set the default min block size to 1MB or so, and 
then add a minimum of 0 to the hdfs-site.xml in src/test/resources to avoid 
having to rewrite a bunch of tests.

It would be an incompatible change, but perhaps a good one. What do folks think?
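
For what it's worth, the test-side override could equally be done 
programmatically; the key name below is taken from the attached patch and may 
still change before commit:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class MinBlockSizeOverrideSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Setting the minimum to 0 keeps existing tests that use tiny block
    // sizes passing, while production clusters get the 1MB default.
    conf.setLong("dfs.namenode.fs-limits.min-block-size", 0);
    System.out.println(
        conf.getLong("dfs.namenode.fs-limits.min-block-size", -1));
  }
}
{code}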

> Add a configurable limit on number of blocks per file, and min block size
> -
>
> Key: HDFS-4305
> URL: https://issues.apache.org/jira/browse/HDFS-4305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 1.0.4, 3.0.0, 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Andrew Wang
>Priority: Minor
> Attachments: hdfs-4305-1.patch
>
>
> We recently had an issue where a user set the block size very very low and 
> managed to create a single file with hundreds of thousands of blocks. This 
> caused problems with the edit log since the OP_ADD op was so large 
> (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To 
> prevent users from making such mistakes, we should:
> - introduce a configurable minimum block size, below which requests are 
> rejected
> - introduce a configurable maximum number of blocks per file, above which 
> requests to add another block are rejected (with a suitably high default as 
> to not prevent legitimate large files)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4643) Fix flakiness in TestQuorumJournalManager

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627154#comment-13627154
 ] 

Hudson commented on HDFS-4643:
--

Integrated in Hadoop-trunk-Commit #3586 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3586/])
HDFS-4643. Fix flakiness in TestQuorumJournalManager. Contributed by Todd 
Lipcon. (Revision 1466253)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466253
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/GenericTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/client/TestQuorumJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java


> Fix flakiness in TestQuorumJournalManager
> -
>
> Key: HDFS-4643
> URL: https://issues.apache.org/jira/browse/HDFS-4643
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm, test
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Trivial
> Fix For: 3.0.0, 2.0.5-beta
>
> Attachments: hdfs-4643.txt, hdfs-4643.txt, hdfs-4643.txt
>
>
> TestQuorumJournalManager can occasionally fail if two consecutive test cases 
> pick the same port number for the JournalNodes. In this case, sometimes an 
> IPC client can be cached from a previous test case, and then fail when it 
> tries to make an IPC over that cached connection to the now-broken 
> connection. We need to more carefully call close() on all the QJMs to prevent 
> this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4300) TransferFsImage.downloadEditsToStorage should use a tmp file for destination

2013-04-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627152#comment-13627152
 ] 

Todd Lipcon commented on HDFS-4300:
---

Hey Andrew. One question in a fault scenario: let's say it's trying to download 
some edits, and one of the dirs fails. This would leave the tmp files in place. 
Is it possible that then, in a future attempt at checkpointing, we might 
accidentally rename that tmp file into the final location?

One potential fix for that would be to make the tmp file names use the current 
timestamp as a suffix, so that they aren't reused in later attempts.
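
Something along these lines is what I have in mind; just a sketch, not the 
TransferFsImage code itself, and the file names are made up:

{code:java}
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class TmpDownloadSketch {
  public static void main(String[] args) throws IOException {
    File finalFile = new File("edits_0000001-0000042");
    // A timestamp suffix keeps a stale tmp file from an earlier failed
    // attempt from being confused with, or renamed over, the current download.
    File tmpFile = new File(finalFile.getPath() + ".tmp."
        + System.currentTimeMillis());

    FileOutputStream out = new FileOutputStream(tmpFile);
    try {
      out.write(new byte[] {0});  // stand-in for the downloaded bytes
    } finally {
      out.close();
    }

    // Only after the transfer completes successfully is the file moved into
    // its finalized location.
    if (!tmpFile.renameTo(finalFile)) {
      throw new IOException("rename of " + tmpFile + " to " + finalFile
          + " failed");
    }
  }
}
{code}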

> TransferFsImage.downloadEditsToStorage should use a tmp file for destination
> 
>
> Key: HDFS-4300
> URL: https://issues.apache.org/jira/browse/HDFS-4300
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Andrew Wang
>Priority: Critical
> Attachments: hdfs-4300-1.patch
>
>
> Currently, in TransferFsImage.downloadEditsToStorage, we download the edits 
> file directly to its finalized path. So, if the transfer fails in the middle, 
> a half-written file is left and cannot be distinguished from a correct file. 
> So, future checkpoints by the 2NN will fail, since the file is truncated in 
> the middle -- but it won't ever download a good copy because it thinks it 
> already has the proper file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4643) Fix flakiness in TestQuorumJournalManager

2013-04-09 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-4643:
--

   Resolution: Fixed
Fix Version/s: 2.0.5-beta
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2

> Fix flakiness in TestQuorumJournalManager
> -
>
> Key: HDFS-4643
> URL: https://issues.apache.org/jira/browse/HDFS-4643
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: qjm, test
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Trivial
> Fix For: 3.0.0, 2.0.5-beta
>
> Attachments: hdfs-4643.txt, hdfs-4643.txt, hdfs-4643.txt
>
>
> TestQuorumJournalManager can occasionally fail if two consecutive test cases 
> pick the same port number for the JournalNodes. In this case, sometimes an 
> IPC client can be cached from a previous test case, and then fail when it 
> tries to make an IPC over that cached connection to the now-broken 
> connection. We need to more carefully call close() on all the QJMs to prevent 
> this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3934) duplicative dfs_hosts entries handled wrong

2013-04-09 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3934:
---

Attachment: HDFS-3934.004.patch

when the port is not specified in the hosts file, use a default from the 
configuration, rather than a compile-time default.

> duplicative dfs_hosts entries handled wrong
> ---
>
> Key: HDFS-3934
> URL: https://issues.apache.org/jira/browse/HDFS-3934
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
> HDFS-3934.003.patch, HDFS-3934.004.patch
>
>
> A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
> hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
> after the NN restarts because {{getDatanodeListForReport}} does not handle 
> such a "pseudo-duplicate" correctly:
> # the "Remove any nodes we know about from the map" loop no longer has the 
> knowledge to remove the spurious entries
> # the "The remaining nodes are ones that are referenced by the hosts files" 
> loop does not do hostname lookups, so does not know that the IP and hostname 
> refer to the same host.
> Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
> the JSP output:  The *Node* column shows ":50010" as the nodename, with HTML 
> markup {{ href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020";
>  title="172.29.97.216:50010">:50010}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627139#comment-13627139
 ] 

Hadoop QA commented on HDFS-4679:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577871/HDFS-4679.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestSafeMode

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4210//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4210//console

This message is automatically generated.

> Namenode operation checks should be done in a consistent manner
> ---
>
> Key: HDFS-4679
> URL: https://issues.apache.org/jira/browse/HDFS-4679
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-4679.patch
>
>
> Different operations perform checks in different orders. I propose 
> consistently checking the following in namenode operations:
> # Print a debug log message related to the operation
> # Validate the input parameters and file names
> # Grab the read or write lock
> #* check if the system is ready for the read or write operation
> #* check if the system is in safemode (for write operations)
> #* check permissions to see if the user is the owner, has access, or has 
> superuser privileges
> # Release the lock

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3934) duplicative dfs_hosts entries handled wrong

2013-04-09 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3934:
---

Attachment: HDFS-3934.003.patch

* add a test for duplicate hosts entries.

* Add test timeouts for tests in {{TestDecommission}}.

* In {{DatanodeManager#getDatanodeListForReport}}, de-dupe by ip address + 
port, rather than by ip address alone.  (since multiple DNs can run on the same 
node with different ports).
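
A small standalone illustration of de-duping by ip:port rather than by ip alone 
(the data and map here are made up, not the DatanodeManager code):

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class DedupeByIpAndPortSketch {
  public static void main(String[] args) {
    String[][] entries = {
        {"172.29.97.216", "50010"},
        {"172.29.97.216", "50011"},   // second DN on the same host
        {"172.29.97.216", "50010"},   // true duplicate, collapsed below
    };
    // Keyed by ip:port so two DNs on one host survive, but exact
    // duplicates collapse to a single entry.
    Map<String, String[]> byIpAndPort = new LinkedHashMap<String, String[]>();
    for (String[] e : entries) {
      byIpAndPort.put(e[0] + ":" + e[1], e);
    }
    // Prints [172.29.97.216:50010, 172.29.97.216:50011]
    System.out.println(byIpAndPort.keySet());
  }
}
{code}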

> duplicative dfs_hosts entries handled wrong
> ---
>
> Key: HDFS-3934
> URL: https://issues.apache.org/jira/browse/HDFS-3934
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
> HDFS-3934.003.patch
>
>
> A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
> hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
> after the NN restarts because {{getDatanodeListForReport}} does not handle 
> such a "pseudo-duplicate" correctly:
> # the "Remove any nodes we know about from the map" loop no longer has the 
> knowledge to remove the spurious entries
> # the "The remaining nodes are ones that are referenced by the hosts files" 
> loop does not do hostname lookups, so does not know that the IP and hostname 
> refer to the same host.
> Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
> the JSP output:  The *Node* column shows ":50010" as the nodename, with HTML 
> markup {{ href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020";
>  title="172.29.97.216:50010">:50010}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627062#comment-13627062
 ] 

Suresh Srinivas commented on HDFS-4434:
---

bq. such a symlink is not supported in any Posix system today
But posix does not support addressing inodes in the path like we are doing 
either. To me, leaving the target uninterpreted and interpreting it only when 
the link is accessed seems to comply more closely with posix. 

> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627044#comment-13627044
 ] 

Sanjay Radia commented on HDFS-4434:


bq. symlink - note /.reserved does not make sense for target - hence you need 
to check for it.
To further clarify: if the target is allowed to have /.reserved/inode/id and it 
is *left unchanged*, then you are allowing a "sticky symlink", in the sense that 
if the normalized target file is deleted and recreated with the same name, the 
symlink will never follow through, because the new file will have a new inodeId; 
such a symlink is not supported in any Posix system today. Further, if the id in 
the target does not yet exist, then one cannot be sure what it will eventually 
point to. I propose that you disallow "/.reserved" in the target.
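
In other words, something like the following check on the symlink target; the 
exception type, the exact prefix test, and the example inode id are 
placeholders, not the real FSNamesystem code:

{code:java}
public class SymlinkTargetCheckSketch {
  // Reject any target that references the reserved prefix.
  static void validateTarget(String target) {
    if (target.equals("/.reserved") || target.startsWith("/.reserved/")) {
      throw new IllegalArgumentException(
          "symlink target may not reference the reserved path: " + target);
    }
  }

  public static void main(String[] args) {
    validateTarget("/user/foo/data");          // accepted
    validateTarget("/.reserved/inode/16387");  // rejected (id is made up)
  }
}
{code}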

> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4670) Style Hadoop HDFS web ui's with Twitter's bootstrap.

2013-04-09 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627012#comment-13627012
 ] 

Elliott Clark commented on HDFS-4670:
-

[~azuryy]
I'll have to check, but I thought the browse-filesystem button is functional on 
the standby namenode, in which case displaying it isn't an error.

[~cnauroth]
# Browser support: This degrades all the way down to lynx, so browser support 
should be good.  Things may not be pretty in IE6, but they should all be usable.
# Yep, right now there's no reliance on javascript.  I've tested in lynx and 
everything looked good and was pretty easy to read.
# That's how HBase handled it.

Good catch on the typo.

[~tgraves]
Bootstrap is absolutely the gold standard in base css frameworks.  It's got the 
most momentum and some of the best community support.

This is just a css change; if you want functionality such as paging and ajax 
single-page applications, this is a stepping stone toward that (though the css 
constructs for displaying it all already exist).


> Style Hadoop HDFS web ui's with Twitter's bootstrap.
> 
>
> Key: HDFS-4670
> URL: https://issues.apache.org/jira/browse/HDFS-4670
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: ha2.PNG, Hadoop JournalNode.png, Hadoop NameNode.png, 
> HDFS-4670-0.patch, HDFS-4670-1.patch, hdfs_browser.png
>
>
> A users' first experience of Apache Hadoop is often looking at the web ui.  
> This should give the user confidence that the project is usable and 
> relatively current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627003#comment-13627003
 ] 

Suresh Srinivas commented on HDFS-4434:
---

I created HDFS-4679 and posted cleanup patch to ensure consistent ordering of 
checks.

> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4679:
--

Attachment: HDFS-4679.patch

> Namenode operation checks should be done in a consistent manner
> ---
>
> Key: HDFS-4679
> URL: https://issues.apache.org/jira/browse/HDFS-4679
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-4679.patch
>
>
> Different operations perform checks in different orders. I propose 
> consistently checking the following in namenode operations:
> # Print a debug log message related to the operation
> # Validate the input parameters and file names
> # Grab the read or write lock
> #* check if the system is ready for the read or write operation
> #* check if the system is in safemode (for write operations)
> #* check permissions to see if the user is the owner, has access, or has 
> superuser privileges
> # Release the lock

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4679:
--

Status: Patch Available  (was: Open)

> Namenode operation checks should be done in a consistent manner
> ---
>
> Key: HDFS-4679
> URL: https://issues.apache.org/jira/browse/HDFS-4679
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-4679.patch
>
>
> Different operations perform checks in different orders. I propose 
> consistently checking the following in namenode operations:
> # Print a debug log message related to the operation
> # Validate the input parameters and file names
> # Grab the read or write lock
> #* check if the system is ready for the read or write operation
> #* check if the system is in safemode (for write operations)
> #* check permissions to see if the user is the owner, has access, or has 
> superuser privileges
> # Release the lock

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Suresh Srinivas (JIRA)
Suresh Srinivas created HDFS-4679:
-

 Summary: Namenode operation checks should be done in a consistent 
manner
 Key: HDFS-4679
 URL: https://issues.apache.org/jira/browse/HDFS-4679
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas


Different operations perform checks in different orders. I propose consistently 
checking the following in namenode operations:
# Print a debug log message related to the operation
# Validate the input parameters and file names
# Grab the read or write lock
#* check if the system is ready for the read or write operation
#* check if the system is in safemode (for write operations)
#* check permissions to see if the user is the owner, has access, or has 
superuser privileges
# Release the lock
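
As a sketch of what that consistent ordering could look like in code (method 
names and locking are simplified and do not mirror FSNamesystem exactly):

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class OperationOrderSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  void writeOperation(String src) {
    // 1. debug log related to the operation
    System.out.println("DEBUG: writeOperation src=" + src);
    // 2. validate the input parameters / file names
    if (src == null || !src.startsWith("/")) {
      throw new IllegalArgumentException("invalid path: " + src);
    }
    // 3. grab the write lock, then run the per-operation checks
    lock.writeLock().lock();
    try {
      checkOperationReady();   // system ready for this operation?
      checkNotInSafeMode();    // write operations are rejected in safemode
      checkPermissions(src);   // owner / access / superuser
      // ... perform the actual namespace mutation here ...
    } finally {
      // 4. release the lock
      lock.writeLock().unlock();
    }
  }

  private void checkOperationReady() { /* placeholder */ }
  private void checkNotInSafeMode() { /* placeholder */ }
  private void checkPermissions(String src) { /* placeholder */ }
}
{code}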


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4679) Namenode operation checks should be done in a consistent manner

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627000#comment-13627000
 ] 

Suresh Srinivas commented on HDFS-4679:
---

While making this change, I found issues in FSNamesystem that, if confirmed, 
should be addressed in separate jiras:
# getAdditionalDatanode grabs readLock but checks for safemode?
# commitBlockSynchronization has no check for permissions against the path. 
Should we


> Namenode operation checks should be done in a consistent manner
> ---
>
> Key: HDFS-4679
> URL: https://issues.apache.org/jira/browse/HDFS-4679
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>
> Different operations perform checks in different orders. I propose 
> consistently checking the following in namenode operations:
> # Print a debug log message related to the operation
> # Validate the input parameters and file names
> # Grab the read or write lock
> #* check if the system is ready for the read or write operation
> #* check if the system is in safemode (for write operations)
> #* check permissions to see if the user is the owner, has access, or has 
> superuser privileges
> # Release the lock

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4669) TestBlockPoolManager fails using IBM java

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626970#comment-13626970
 ] 

Hudson commented on HDFS-4669:
--

Integrated in Hadoop-trunk-Commit #3583 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3583/])
HDFS-4669. TestBlockPoolManager fails using IBM java. Contributed by Tian 
Hong Wang. (Revision 1466176)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466176
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockPoolManager.java


> TestBlockPoolManager fails using IBM java
> -
>
> Key: HDFS-4669
> URL: https://issues.apache.org/jira/browse/HDFS-4669
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.3-alpha
>Reporter: Tian Hong Wang
>Assignee: Tian Hong Wang
>  Labels: patch
> Fix For: 2.0.5-beta
>
> Attachments: HADOOP-4669.patch
>
>
> TestBlockPoolManager unit test fails with the following error message using 
> IBM java:
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 27 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
> The root cause is:
> (1) If we want to remove the first NS and keep the second NS, it should be 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns2), not 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns1).
> (2) HashMap & HashSet make no guarantee about iteration order, so IBM Java and 
> Oracle Java can return the entries in different orders, which causes ns1 & ns2 
> to come back in a random order. The code should use LinkedHashMap & 
> LinkedHashSet to keep the original insertion order.
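
A quick standalone illustration of the ordering difference described above (not 
the BlockPoolManager code itself):

{code:java}
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapOrderingSketch {
  public static void main(String[] args) {
    Map<String, String> hash = new HashMap<String, String>();
    Map<String, String> linked = new LinkedHashMap<String, String>();
    for (String ns : new String[] {"ns1", "ns2"}) {
      hash.put(ns, ns);
      linked.put(ns, ns);
    }
    System.out.println(hash.keySet());    // iteration order unspecified
    System.out.println(linked.keySet());  // always [ns1, ns2]
  }
}
{code}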

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626948#comment-13626948
 ] 

Suresh Srinivas edited comment on HDFS-4434 at 4/9/13 7:07 PM:
---

bq.  It appears to generally be a simple ordering issue.
Agreed. I will post an updated patch.


  was (Author: sureshms):
bq.  It appears to generally be a simple ordering issue.
Agreed. I will post an update patch.

  
> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4670) Style Hadoop HDFS web ui's with Twitter's bootstrap.

2013-04-09 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626964#comment-13626964
 ] 

Thomas Graves commented on HDFS-4670:
-

Can you add more details about what benefits this has over other frameworks?  
YARN has a newer UI, so how does this compare, and why not use that for HDFS?  
I'm not necessarily a huge fan of the YARN one, but it took quite a while for 
it to stabilize.

We use a lot of tables in YARN, and I didn't see much detail about options for 
tables other than http://twitter.github.io/bootstrap/base-css.html#tables.  
Does it handle paging, scrolling, etc. nicely?

> Style Hadoop HDFS web ui's with Twitter's bootstrap.
> 
>
> Key: HDFS-4670
> URL: https://issues.apache.org/jira/browse/HDFS-4670
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: ha2.PNG, Hadoop JournalNode.png, Hadoop NameNode.png, 
> HDFS-4670-0.patch, HDFS-4670-1.patch, hdfs_browser.png
>
>
> A users' first experience of Apache Hadoop is often looking at the web ui.  
> This should give the user confidence that the project is usable and 
> relatively current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4669) TestBlockPoolManager fails using IBM java

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4669:
--

   Resolution: Fixed
Fix Version/s: (was: 2.0.3-alpha)
   2.0.5-beta
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

+1 for the patch. Committed the patch to trunk and branch-2.

Thank you Tian. Thank you Arpit for the review.



> TestBlockPoolManager fails using IBM java
> -
>
> Key: HDFS-4669
> URL: https://issues.apache.org/jira/browse/HDFS-4669
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.3-alpha
>Reporter: Tian Hong Wang
>Assignee: Tian Hong Wang
>  Labels: patch
> Fix For: 2.0.5-beta
>
> Attachments: HADOOP-4669.patch
>
>
> TestBlockPoolManager unit test fails with the following error message using 
> IBM java:
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 27 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
> The root cause is:
> (1) If we want to remove the first NS and keep the second NS, it should be 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns2), not 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns1).
> (2) HashMap & HashSet make no guarantee about iteration order, so IBM Java and 
> Oracle Java can return the entries in different orders, which causes ns1 & ns2 
> to come back in a random order. The code should use LinkedHashMap & 
> LinkedHashSet to keep the original insertion order.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4672) Support tiered storage policies

2013-04-09 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626955#comment-13626955
 ] 

eric baldeschwieler commented on HDFS-4672:
---

We should be careful to add as little complexity as possible while enabling the 
core feature here...

Adding extended attributes is a well discussed idea, probably the right way to 
go, but it adds RAM pressure on the NN and needs to be thought out carefully.  
I believe there is already a JIRA on that?

One way to reduce complexity and RAM pressure would be to only support 
placement hints on directories and have them apply only to files in that 
immediate directory.  That should limit meta-data cost and address HBase and 
other use cases.

That said, tying namespace data to the blocks, where replication policy is 
applied, is a little complicated and deserves discussion. This is something 
Sanjay, Suresh, and I have been discussing; maybe they can jump in with their 
thoughts.
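
As a very rough sketch of the kind of storage-class taxonomy and
directory-level hint being discussed (all names below are hypothetical and are
not part of any existing HDFS API or of the attached proposal):

{code}
// Hypothetical taxonomy of storage device classes, borrowing from the
// description above; the real enum, if added, would come out of the
// heterogeneous-storage work.
enum StorageDeviceClass {
  DEFAULT,    // no preference / "null" class
  DISK,       // spinning media
  SSD,        // solid state devices
  RAM_DISK,   // memory-backed volumes
  NVM         // non-volatile memory
}

// A directory-level hint as suggested above: it applies only to files created
// directly in that directory, which limits the metadata cost on the NameNode.
class DirectoryPlacementHint {
  final String directory;
  final StorageDeviceClass hint;

  DirectoryPlacementHint(String directory, StorageDeviceClass hint) {
    this.directory = directory;
    this.hint = hint;
  }
}
{code}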



> Support tiered storage policies
> ---
>
> Key: HDFS-4672
> URL: https://issues.apache.org/jira/browse/HDFS-4672
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, hdfs-client, libhdfs, namenode
>Reporter: Andrew Purtell
>
> We would like to be able to create certain files on certain storage device 
> classes (e.g. spinning media, solid state devices, RAM disk, non-volatile 
> memory). HDFS-2832 enables heterogeneous storage at the DataNode, so the 
> NameNode can gain awareness of what different storage options are available 
> in the pool and where they are located, but no API is provided for clients or 
> block placement plugins to perform device aware block placement. We would 
> like to propose a set of extensions that also have broad applicability to use 
> cases where storage device affinity is important:
>  
> - Add an enum of generic storage device classes, borrowing from current 
> taxonomy of the storage industry
>  
> - Augment DataNode volume metadata in storage reports with this enum
>  
> - Extend the namespace so pluggable block policies can be specified on a 
> directory and storage device class can be tracked in the Inode. Perhaps this 
> could be a larger discussion on adding support for extended attributes in the 
> HDFS namespace. The Inode should track both the storage device class hint and 
> the current actual storage device class. FileStatus should expose this 
> information (or xattrs in general) to clients.
>  
> - Extend the pluggable block policy framework so policies can also consider, 
> and specify, affinity for a particular storage device class
>  
> - Extend the file creation API to accept a storage device class affinity 
> hint. Such a hint can be supplied directly as a parameter, or, if we are 
> considering extended attribute support, then instead as one of a set of 
> xattrs. The hint would be stored in the namespace and also used by the client 
> to indicate to the NameNode/block placement policy/DataNode constraints on 
> block placement. Furthermore, if xattrs or device storage class affinity 
> hints are associated with directories, then the NameNode should provide the 
> storage device affinity hint to the client in the create API response, so the 
> client can provide the appropriate hint to DataNodes when writing new blocks.
>  
> - The list of candidate DataNodes for new blocks supplied by the NameNode to 
> clients should be weighted/sorted by availability of the desired storage 
> device class. 
>  
> - Block replication should consider storage device affinity hints. If a 
> client move()s a file from a location under a path with affinity hint X to 
> under a path with affinity hint Y, then all blocks currently residing on 
> media X should be eventually replicated onto media Y with the then excess 
> replicas on media X deleted.
>  
> - Introduce the concept of degraded path: a path can be degraded if a block 
> placement policy is forced to abandon a constraint in order to persist the 
> block, when there may not be available space on the desired device class, or 
> to maintain the minimum necessary replication factor. This concept is 
> distinct from the corrupt path, where one or more blocks are missing. Paths 
> in degraded state should be periodically reevaluated for re-replication.
>  
> - The FSShell should be extended with commands for changing the storage 
> device class hint for a directory or file. 
>  
> - Clients like DistCP which compare metadata should be extended to be aware 
> of the storage device class hint. For DistCP specifically, there should be an 
> option to ignore the storage device class hints, enabled by default.
>  
> Suggested semantics:
>  
> - The default storage device class should be the null class, or simply the 
> “default class”, for all cases where a hint

[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626950#comment-13626950
 ] 

Sanjay Radia commented on HDFS-4434:


 Some early comments from my first skim of the code; more feedback in a future comment.
 * ResolvePath
 ** this method is merely normalizing the path by checking the prefix - I 
prefer the name normalizePath (a rough sketch of this kind of prefix check is 
given below)
 ** Typo - javadoc of FSDirectory#resolvePath   -  @return if the *patch* 
indicates an inode,
 ** can improve the overhead of splitting the path if you check for the prefix 
in getPathComponents OR do the split/concat in resolvePath.  I see that you are 
minimizing what is done under the lock.
  Note that down the road we should do the split in the beginning and pass the 
components down to the rest of the code - this is a fairly invasive change and 
should be done later in another jira.
 * symlink - note /.reserved does not make sense for a target - hence you need to 
check for it.
 * Related work for another jira - .reserved should not be allowed anywhere. We 
may want to allow chrooted file systems (note viewfs already has an internal 
client-side fs to support chrooted fs). We can discuss this further in that jira 
- disallowing it at the root is good enough for this jira.
 * ls of /.reserved  (but not ls of /.reserved/inodes) - again this is best 
done in another jira.
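
A minimal sketch of the kind of reserved-prefix check being discussed. The
constant, method and map names here are illustrative only, not the actual
HDFS-4434 patch:

{code}
import java.util.Map;

class InodePathDemo {
  // Illustrative prefix; the exact reserved path is whatever the patch defines.
  static final String DOT_RESERVED_PREFIX = "/.reserved/.inodes/";

  /** Returns the path unchanged unless it uses the reserved inode prefix. */
  static String normalizePath(String path, Map<Long, String> inodeIdToPath) {
    if (!path.startsWith(DOT_RESERVED_PREFIX)) {
      return path;                                // ordinary path, nothing to do
    }
    String idStr = path.substring(DOT_RESERVED_PREFIX.length());
    long inodeId = Long.parseLong(idStr);         // throws on malformed ids
    String resolved = inodeIdToPath.get(inodeId);
    if (resolved == null) {
      throw new IllegalArgumentException("No inode with id " + inodeId);
    }
    return resolved;
  }
}
{code}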

> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626948#comment-13626948
 ] 

Suresh Srinivas commented on HDFS-4434:
---

bq.  It appears to generally be a simple ordering issue.
Agreed. I will post an updated patch.


> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4669) TestBlockPoolManager fails using IBM java

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4669:
--

Assignee: Tian Hong Wang  (was: Suresh Srinivas)

> TestBlockPoolManager fails using IBM java
> -
>
> Key: HDFS-4669
> URL: https://issues.apache.org/jira/browse/HDFS-4669
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.3-alpha
>Reporter: Tian Hong Wang
>Assignee: Tian Hong Wang
>  Labels: patch
> Fix For: 2.0.3-alpha
>
> Attachments: HADOOP-4669.patch
>
>
> TestBlockPoolManager unit test fails with the following error message using 
> IBM java:
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 27 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
> The root cause is:
> (1) if we want to remove the first NS and keep the second NS, it should be 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns2), not 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns1).
> (2) HashMap & HashSet do not guarantee iteration order, so IBM java and 
> Oracle java can return the entries in different orders, which makes the 
> ns1 & ns2 values come back in a random order. The code should use 
> LinkedHashMap & LinkedHashSet to keep the original insertion order.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HDFS-4669) TestBlockPoolManager fails using IBM java

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas reassigned HDFS-4669:
-

Assignee: Suresh Srinivas

> TestBlockPoolManager fails using IBM java
> -
>
> Key: HDFS-4669
> URL: https://issues.apache.org/jira/browse/HDFS-4669
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.3-alpha
>Reporter: Tian Hong Wang
>Assignee: Suresh Srinivas
>  Labels: patch
> Fix For: 2.0.3-alpha
>
> Attachments: HADOOP-4669.patch
>
>
> TestBlockPoolManager unit test fails with the following error message using 
> IBM java:
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 27 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
> The root cause is:
> (1) if we want to remove the first NS and keep the second NS, it should be 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns2), not 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns1).
> (2) HashMap & HashSet do not guarantee iteration order, so IBM java and 
> Oracle java can return the entries in different orders, which makes the 
> ns1 & ns2 values come back in a random order. The code should use 
> LinkedHashMap & LinkedHashSet to keep the original insertion order.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4669) TestBlockPoolManager fails using IBM java

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4669:
--

Summary: TestBlockPoolManager fails using IBM java  (was: 
org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager fails using IBM 
java)

> TestBlockPoolManager fails using IBM java
> -
>
> Key: HDFS-4669
> URL: https://issues.apache.org/jira/browse/HDFS-4669
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.3-alpha
>Reporter: Tian Hong Wang
>  Labels: patch
> Fix For: 2.0.3-alpha
>
> Attachments: HADOOP-4669.patch
>
>
> TestBlockPoolManager unit test fails with the following error message using 
> IBM java:
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 27 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
> The root cause is:
> (1) if we want to remove the first NS and keep the second NS, it should be 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns2), not 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns1).
> (2) HashMap & HashSet do not guarantee iteration order, so IBM java and 
> Oracle java can return the entries in different orders, which makes the 
> ns1 & ns2 values come back in a random order. The code should use 
> LinkedHashMap & LinkedHashSet to keep the original insertion order.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4669) org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager fails using IBM java

2013-04-09 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626925#comment-13626925
 ] 

Chris Nauroth commented on HDFS-4669:
-

+1 for the patch.  I verified that the test passes on Mac and Windows.  Thank 
you, Tian!

> org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager fails using IBM 
> java
> 
>
> Key: HDFS-4669
> URL: https://issues.apache.org/jira/browse/HDFS-4669
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.3-alpha
>Reporter: Tian Hong Wang
>  Labels: patch
> Fix For: 2.0.3-alpha
>
> Attachments: HADOOP-4669.patch
>
>
> TestBlockPoolManager unit test fails with the following error message using 
> IBM java:
> testFederationRefresh(org.apache.hadoop.hdfs.server.datanode.TestBlockPoolManager)
>   Time elapsed: 27 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected: refresh #2]
> > but was: refresh #1]
> >
> The root cause is:
> (1) if we want to remove the first NS and keep the second NS, it should be 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns2), not 
> conf.set(DFSConfigKeys.DFS_NAMESERVICES, ns1).
> (2) HashMap & HashSet do not guarantee iteration order, so IBM java and 
> Oracle java can return the entries in different orders, which makes the 
> ns1 & ns2 values come back in a random order. The code should use 
> LinkedHashMap & LinkedHashSet to keep the original insertion order.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4660) Duplicated checksum on DN in a recovered pipeline

2013-04-09 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626916#comment-13626916
 ] 

Todd Lipcon commented on HDFS-4660:
---

Hi Peng. Can you please see if you can add a unit test for this?

> Duplicated checksum on DN in a recovered pipeline
> -
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: PengZhang
>Priority: Critical
> Attachments: HDFS-4660.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_10042013-04-01
>  21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> client send packet after recover pipeline
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be ceil(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking meta on DN4, I found checksum of chunk 262 is duplicated, but 
> data not.
> Later, after the block was finalized, DN4's scanner detected the bad block and 
> reported it to the NN. The NN sent a command to delete this block and to 
> re-replicate it from another DN in the pipeline to satisfy the replication factor.
> I think this is because BlockReceiver skips the data bytes already written, 
> but does not skip the checksum bytes already written. And the function 
> adjustCrcFilePosition is only used for the last non-completed chunk, but
> not for this situation.
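
As a sanity check on the offsets quoted above, here is the arithmetic spelled
out (assuming 512-byte checksum chunks, a 4-byte CRC per chunk and a 7-byte
meta-file header, which matches the numbers in the log):

{code}
public class MetaOffsetCheck {
  public static void main(String[] args) {
    long blockLen = 134640;        // bytes visible after the flush
    int bytesPerChunk = 512;       // checksum chunk size
    int checksumSize = 4;          // CRC32 per chunk
    int headerSize = 7;            // meta file header

    long chunks = (blockLen + bytesPerChunk - 1) / bytesPerChunk;   // 263, last chunk partial
    long expectedMetaLen = headerSize + chunks * checksumSize;      // 1059, as seen on DN3
    System.out.println(expectedMetaLen);

    // DN4 reports 1063 = 1059 + 4, i.e. one extra 4-byte checksum:
    // the checksum for chunk 262 was written twice while the data was not.
  }
}
{code}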

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626883#comment-13626883
 ] 

Daryn Sharp commented on HDFS-4434:
---

Sure.  Today I can't determine path names if I don't have access to any of its 
ancestor directories.  With this patch's inode resolution I can figure out path 
names.  As Sanjay pointed out, the permission checker throws an exception with 
only the path fragment that access permissions allow you to see.  That 
mitigates the issue _if_ the permission check is the _first_ check applied to 
an inode resolved path.  It's not in many cases - ex. a safe mode exception, or 
other pre-conditions prior to the permission check, will divulge the full path 
resolved from an inode.  It appears to generally be a simple ordering issue.

> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626881#comment-13626881
 ] 

Suresh Srinivas commented on HDFS-4489:
---

bq. The GSet used for InodeID to INode map is also semi-fixed. Is it allocated 
similarly to BlocksMap?
Yes. Please see the patch in HDFS-4434. About 1% of heap is used for the GSet.

bq. Simply saying the overhead is insignificant won't convince users. We should 
explain why the benefit from having this feature justifies the overhead. I 
don't think on/off switch is necessary.
I think the assertion here is not that the overhead is insignificant. Depending 
on the details of how the namespace of a system is laid out, I would think the 
overhead would be anywhere from 2 to 5%.

As for the benefits, I laid this out in the main description:

---
This helps in several use cases:
# HDFS can evolve to support ID based protocols such as NFS. We plan to add an 
experimental NFS V3 gateway to HDFS using this mechanism. Will post a github 
link soon.
# InodeID can be used by tools such as distcp to track a single instance of a 
file, for caching data or for tracking and checking for modification based on 
the INodeID.
# Path cannot identify a unique instance of a file. This causes issues as 
described in HDFS-4258 and HDFS-4437. It has also been a requirement of many 
other jiras such as HDFS-385.
# Using InodeID as an identifier instead of a path can be more efficient than 
path-based accesses.
---

bq. We have a namenode which will not work well if we upgrade to a release with 
this feature since it will need extra 4-6GB for the steady-state operation. 
Even if it could absorb the extra memory requirement, we would have to tell 
users that the namespace limit is X% worse.
Is this because the namenode does not have enough RAM? With this change, it is 
expected that the NN is allocated more memory, say 5%. If this is done, I am not 
sure why users should be told the namespace limit is X% worse.

My rationale, repeating what I said earlier, is that machines are becoming 
available with more RAM. Adding 5% to the JVM heap should not be a problem. In 
fact, most namenodes are already configured with enough head room and might not 
even need a change. But if this is a big concern, I am okay making an additional 
change to bring the memory consumption down close to zero. 



> Use InodeID as as an identifier of a file in HDFS protocols and APIs
> 
>
> Key: HDFS-4489
> URL: https://issues.apache.org/jira/browse/HDFS-4489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brandon Li
>Assignee: Brandon Li
>
> The benefits of using InodeID to uniquely identify a file are multi-fold. 
> Here are a few of them:
> 1. uniquely identify a file across renames; related JIRAs include HDFS-4258, 
> HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been 
> replaced or renamed, the file name and size combination is not reliable, 
> but the combination of file id and size is unique.
> 3. id based protocol support (e.g., NFS)
> 4. to make the pluggable block placement policy use fileid instead of 
> filename (HDFS-385).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4674) TestBPOfferService fails on Windows due to failure parsing datanode data directory as URI

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626870#comment-13626870
 ] 

Hudson commented on HDFS-4674:
--

Integrated in Hadoop-trunk-Commit #3582 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3582/])
HDFS-4674. TestBPOfferService fails on Windows due to failure parsing 
datanode data directory as URI. Contributed by Chris Nauroth. (Revision 1466148)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466148
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java


> TestBPOfferService fails on Windows due to failure parsing datanode data 
> directory as URI
> -
>
> Key: HDFS-4674
> URL: https://issues.apache.org/jira/browse/HDFS-4674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0
>
> Attachments: HDFS-4674.1.patch
>
>
> {{TestBPOfferService}} does not set {{dfs.datanode.data.dir}}.  When 
> {{BPServiceActor}} starts, it attempts to use a thread name containing 
> {{dfs.datanode.data.dir}} parsed as URI.  On Windows, this will not parse 
> correctly due to presence of '\'.
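
A small illustration of the parsing problem: a raw Windows-style path does not
parse as a URI, while going through java.io.File first does. The path below is
an assumed example, not the actual test configuration:

{code}
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

public class WindowsPathAsUri {
  public static void main(String[] args) {
    String dataDir = "C:\\hdfs\\data";          // typical Windows-style directory

    // Parsing the raw string fails: '\' is not a legal URI character.
    try {
      new URI(dataDir);
    } catch (URISyntaxException e) {
      System.out.println("Raw string does not parse: " + e.getMessage());
    }

    // Converting through File first produces a valid file: URI.
    URI ok = new File(dataDir).toURI();
    System.out.println(ok);                     // e.g. file:/C:/hdfs/data/
  }
}
{code}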

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs

2013-04-09 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626868#comment-13626868
 ] 

Kihwal Lee commented on HDFS-4489:
--

bq. Please look at the overall increase in memory usage instead of increase 
over used memory. 
Your point would be valid only if the overhead was entirely a fixed amount 
(e.g. GSet).  Since the extra memory consumption increases as the size of 
namespace grows, factoring the arbitrary max heap size into this can be 
misleading.  But I agree that the 9% figure does not have an absolute meaning 
either. If the inode-to-block ratio is different, the number will be different. 
For the clusters I have seen, it will be a lower number. The GSet used for 
InodeID to INode map is also semi-fixed. Is it allocated similarly to 
BlocksMap? 

In any case, I would not call this insignificant. We have a namenode which will 
not work well if we upgrade to a release with this feature since it will need 
extra 4-6GB for the steady-state operation. Even if it could absorb the extra 
memory requirement, we would have to tell users that the namespace limit is X% 
worse.  

Simply saying the overhead is insignificant won't convince users. We should 
explain why the benefit from having this feature justifies the overhead.  I 
don't think on/off switch is necessary. 

> Use InodeID as as an identifier of a file in HDFS protocols and APIs
> 
>
> Key: HDFS-4489
> URL: https://issues.apache.org/jira/browse/HDFS-4489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brandon Li
>Assignee: Brandon Li
>
> The benefits of using InodeID to uniquely identify a file are multi-fold. 
> Here are a few of them:
> 1. uniquely identify a file across renames; related JIRAs include HDFS-4258, 
> HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been 
> replaced or renamed, the file name and size combination is not reliable, 
> but the combination of file id and size is unique.
> 3. id based protocol support (e.g., NFS)
> 4. to make the pluggable block placement policy use fileid instead of 
> filename (HDFS-385).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4674) TestBPOfferService fails on Windows due to failure parsing datanode data directory as URI

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4674:
--

   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I committed the patch to trunk. Thank you Chris.

> TestBPOfferService fails on Windows due to failure parsing datanode data 
> directory as URI
> -
>
> Key: HDFS-4674
> URL: https://issues.apache.org/jira/browse/HDFS-4674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Fix For: 3.0.0
>
> Attachments: HDFS-4674.1.patch
>
>
> {{TestBPOfferService}} does not set {{dfs.datanode.data.dir}}.  When 
> {{BPServiceActor}} starts, it attempts to use a thread name containing 
> {{dfs.datanode.data.dir}} parsed as URI.  On Windows, this will not parse 
> correctly due to presence of '\'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4674) TestBPOfferService fails on Windows due to failure parsing datanode data directory as URI

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626845#comment-13626845
 ] 

Suresh Srinivas commented on HDFS-4674:
---

+1 for the patch.

> TestBPOfferService fails on Windows due to failure parsing datanode data 
> directory as URI
> -
>
> Key: HDFS-4674
> URL: https://issues.apache.org/jira/browse/HDFS-4674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: HDFS-4674.1.patch
>
>
> {{TestBPOfferService}} does not set {{dfs.datanode.data.dir}}.  When 
> {{BPServiceActor}} starts, it attempts to use a thread name containing 
> {{dfs.datanode.data.dir}} parsed as URI.  On Windows, this will not parse 
> correctly due to presence of '\'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626835#comment-13626835
 ] 

Suresh Srinivas commented on HDFS-4434:
---

bq. Do you plan to add the consistent path checks before the inode mapping? 
That would be preferable to opening and then closing a hole for divulging paths.
I am not sure I understand. Can you add more details?

> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626829#comment-13626829
 ] 

Daryn Sharp commented on HDFS-4434:
---

bq. Agreed. I am planning to open a jira to ensure all the methods make 
consistent checks. I will make the required change to not divulge the complete 
path after that.

Do you plan to add the consistent path checks before the inode mapping?  That 
would be preferable to opening and then closing a hole for divulging paths.

> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4670) Style Hadoop HDFS web ui's with Twitter's bootstrap.

2013-04-09 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626811#comment-13626811
 ] 

Chris Nauroth commented on HDFS-4670:
-

I took the new UI for a test drive.  It looks great!  A couple of higher-level 
questions:

# Is there a specific set of browsers/versions that we intend to support?  The 
old UI was so basic that pretty much any browser could handle it to some 
degree.  Bootstrap itself claims wide cross-browser support all the way back to 
IE7, though I don't know if that's only for a specific subset of its 
functionality.  I did some non-exhaustive testing with Chrome, Firefox and 
Safari on Mac, and IE8 on Windows, and they all looked fine.  I don't have 
access to anything older than IE8.  I haven't tried any mobile browsers.
# Does it degrade gracefully if Javascript is disabled?  I have seen 
deployments that place the Hadoop cluster behind a firewall with users only 
having access to the UI through a browser that by policy disables Javascript.
# Considering that it's common to screen-scrape the old pages, do we need to 
mark this change backwards-incompatible, or at least put a warning in the 
release notes?

I haven't checked all of the code, but I did spot this typo:

{code}
  " Compilcation Information" +
{code}

Thanks, Elliott!


> Style Hadoop HDFS web ui's with Twitter's bootstrap.
> 
>
> Key: HDFS-4670
> URL: https://issues.apache.org/jira/browse/HDFS-4670
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: ha2.PNG, Hadoop JournalNode.png, Hadoop NameNode.png, 
> HDFS-4670-0.patch, HDFS-4670-1.patch, hdfs_browser.png
>
>
> A user's first experience of Apache Hadoop is often looking at the web ui.  
> This should give the user confidence that the project is usable and 
> relatively current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs

2013-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626799#comment-13626799
 ] 

Daryn Sharp commented on HDFS-4489:
---

Maybe something simple like GridMix to get a rough feel for the overhead of the 
extra resolution.  I don't expect it to be much, but it'd be nice to know.

> Use InodeID as as an identifier of a file in HDFS protocols and APIs
> 
>
> Key: HDFS-4489
> URL: https://issues.apache.org/jira/browse/HDFS-4489
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Brandon Li
>Assignee: Brandon Li
>
> The benefits of using InodeID to uniquely identify a file are multi-fold. 
> Here are a few of them:
> 1. uniquely identify a file across renames; related JIRAs include HDFS-4258, 
> HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been 
> replaced or renamed, the file name and size combination is not reliable, 
> but the combination of file id and size is unique.
> 3. id based protocol support (e.g., NFS)
> 4. to make the pluggable block placement policy use fileid instead of 
> filename (HDFS-385).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4678) libhdfs casts Japanese character incorrectly to Java API

2013-04-09 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626797#comment-13626797
 ] 

Eric Yang commented on HDFS-4678:
-

This particular JIRA is focusing on whether the macro that converts char* to 
NewUTF8String may have bugs in the serialization.  I think the example's 
setlocale should use "ja_JP.UTF-8" to ensure the char* being passed is a UTF-8 
encoded string.

>  libhdfs casts Japanese character incorrectly to Java API 
> --
>
> Key: HDFS-4678
> URL: https://issues.apache.org/jira/browse/HDFS-4678
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 1.1.2
> Environment: Platform:    Linux64
> Locale:    Japanese (ja_JP.UTF-8)
>Reporter: Jiqiu
>Priority: Critical
> Fix For: 1.2.0
>
>
> put a local file with Japanese characters to hdfs,
> while browsing it in hdfs, it cannot be recognized. 
> here is the test.c
> #include "hdfs.h"
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <locale.h>
> int main(int argc, char **argv) {
> if(!setlocale(LC_CTYPE, "ja_JP")) {
>   printf("Can not set locale type\n");
> }
> printf("0\n");
> hdfsFS fs = hdfsConnect("localhost", 9000);
> printf("1\n");
> const char* writePath = "/tmp/\xF0\xA0\x80\x8B.txt";
> printf("2\n");
> hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 
> 0);
> if(!writeFile) {
>   fprintf(stderr, "Failed to open %s for writing!\n", writePath);
>   exit(-1);
> }
> char* buffer = "Hello, World! \xF0\xA0\x80\x8B";
> tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, 
> strlen(buffer)+1);
> if (hdfsFlush(fs, writeFile)) {
>fprintf(stderr, "Failed to 'flush' %s\n", writePath); 
>   exit(-1);
> }
>printf("3\n");
>hdfsCloseFile(fs, writeFile);
> }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626782#comment-13626782
 ] 

Suresh Srinivas commented on HDFS-4434:
---

bq. If the directory denies access, then we should ensure that no operation 
will expose the path names because the user isn't supposed to be able to see 
them.
Agreed. I am planning to open a jira to ensure all the methods make consistent 
checks. I will make the required change to not divulge the complete path after 
that.

bq. Does lease renewal work for inode paths?
renewLease() operation currently takes only clientName. I think we should 
change this method to add path along with the clientName. I plan on doing this 
in another jira. Once that is done, I plan on making changes to allow inode 
path after that jira.


> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4676) TestHDFSFileSystemContract should set MiniDFSCluster variable to null to free up memory

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626774#comment-13626774
 ] 

Hudson commented on HDFS-4676:
--

Integrated in Hadoop-trunk-Commit #3581 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3581/])
HDFS-4676. TestHDFSFileSystemContract should set MiniDFSCluster variable to 
null to free up memory. Contributed by Suresh Srinivas. (Revision 1466099)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466099
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHDFSFileSystemContract.java


> TestHDFSFileSystemContract should set MiniDFSCluster variable to null to free 
> up memory
> ---
>
> Key: HDFS-4676
> URL: https://issues.apache.org/jira/browse/HDFS-4676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Minor
> Fix For: 2.0.5-beta
>
> Attachments: HDFS-4676.patch
>
>
> TestHDFSFileSystemContract should reset the cluster member to null in order 
> to let garbage collection quickly reclaim the large chunk of memory associated 
> with MiniDFSCluster. This avoids OutOfMemory errors.
> See 
> https://issues.apache.org/jira/browse/HDFS-4434?focusedCommentId=13624246&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13624246
>  and the next jenkins tests where the OOM was fixed.
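
A minimal sketch of the pattern being described (an assumed JUnit-style test
skeleton, not the actual HDFS-4676 patch):

{code}
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.After;

public class ExampleHdfsTest {
  private MiniDFSCluster cluster;   // holds a large object graph while a test runs

  @After
  public void tearDown() {
    if (cluster != null) {
      cluster.shutdown();
      cluster = null;               // drop the reference so GC can reclaim the memory
    }
  }
}
{code}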

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4676) TestHDFSFileSystemContract should set MiniDFSCluster variable to null to free up memory

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4676:
--

   Resolution: Fixed
Fix Version/s: 2.0.5-beta
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I committed the patch to trunk and branch-2. Thank you Sanjay for the review.

> TestHDFSFileSystemContract should set MiniDFSCluster variable to null to free 
> up memory
> ---
>
> Key: HDFS-4676
> URL: https://issues.apache.org/jira/browse/HDFS-4676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Minor
> Fix For: 2.0.5-beta
>
> Attachments: HDFS-4676.patch
>
>
> TestHDFSFileSystemContract should reset the cluster member to null in order 
> to let garbage collection quickly reclaim the large chunk of memory associated 
> with MiniDFSCluster. This avoids OutOfMemory errors.
> See 
> https://issues.apache.org/jira/browse/HDFS-4434?focusedCommentId=13624246&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13624246
>  and the next jenkins tests where the OOM was fixed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4659) Support setting execution bit for regular files

2013-04-09 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626766#comment-13626766
 ] 

Brandon Li commented on HDFS-4659:
--

The OS not supporting the sticky bit doesn't mean the sticky bit can't be set.

From a file system point of view, the sticky bit for a regular file has little 
meaning (the sticky bit for a directory is a different story). If we try it on 
Linux, the sticky bit can still be set for regular files, though the kernel may 
just ignore it.

> Support setting execution bit for regular files
> ---
>
> Key: HDFS-4659
> URL: https://issues.apache.org/jira/browse/HDFS-4659
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
> Attachments: HDFS-4659.patch, HDFS-4659.patch, HDFS-4659.patch, 
> HDFS-4659.patch
>
>
> By default regular files are created with mode "rw-r--r--", which is similar 
> to that on many UNIX platforms. However, setting the execution bit for regular 
> files is not supported by HDFS. 
> It's the client's choice to set the file access mode. HDFS would be easier to 
> use if it could support this, especially when HDFS is accessed by network file 
> system protocols. This JIRA is to track the change to support the execution bit. 
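
A small sketch of what the client-side call could look like once the NameNode
accepts the execute bit on regular files. The path is hypothetical and this is
not part of the attached patches:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class SetExecBit {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/tmp/example.sh");            // hypothetical file

    // rwxr-xr-x: only meaningful once HDFS allows the execute bit on
    // regular files, which is what this JIRA proposes.
    fs.setPermission(file, new FsPermission((short) 0755));
  }
}
{code}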

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3940) Add Gset#clear method and clear the block map when namenode is shutdown

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-3940:
--

   Resolution: Fixed
Fix Version/s: 2.0.5-beta
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I committed this to trunk and branch-2. Thank you Sanjay for the review.

> Add Gset#clear method and clear the block map when namenode is shutdown
> ---
>
> Key: HDFS-3940
> URL: https://issues.apache.org/jira/browse/HDFS-3940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Suresh Srinivas
>Priority: Minor
> Fix For: 2.0.5-beta
>
> Attachments: HDFS-3940.patch, HDFS-3940.patch
>
>
> Per HDFS-3936 it would be useful if GSet has a clear method so BM#close could 
> clear out the LightWeightGSet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4676) TestHDFSFileSystemContract should set MiniDFSCluster variable to null to free up memory

2013-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626746#comment-13626746
 ] 

Hadoop QA commented on HDFS-4676:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577673/HDFS-4676.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4209//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4209//console

This message is automatically generated.

> TestHDFSFileSystemContract should set MiniDFSCluster variable to null to free 
> up memory
> ---
>
> Key: HDFS-4676
> URL: https://issues.apache.org/jira/browse/HDFS-4676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Minor
> Attachments: HDFS-4676.patch
>
>
> TestHDFSFileSystemContract should reset the cluster member to null in order 
> to let garbage collection quickly reclaim the large chunk of memory associated 
> with MiniDFSCluster. This avoids OutOfMemory errors.
> See 
> https://issues.apache.org/jira/browse/HDFS-4434?focusedCommentId=13624246&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13624246
>  and the next jenkins tests where the OOM was fixed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4678) libhdfs casts Japanese character incorrectly to Java API

2013-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626732#comment-13626732
 ] 

Daryn Sharp commented on HDFS-4678:
---

HDFS uses UTF-8 encoding.

>  libhdfs casts Japanese character incorrectly to Java API 
> --
>
> Key: HDFS-4678
> URL: https://issues.apache.org/jira/browse/HDFS-4678
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 1.1.2
> Environment: Platform:    Linux64
> Locale:    Japanese (ja_JP.UTF-8)
>Reporter: Jiqiu
>Priority: Critical
> Fix For: 1.2.0
>
>
> put a local file with Japanese characters to hdfs,
> while browsing it in hdfs, it cannot be recognized. 
> here is the test.c
> #include "hdfs.h"
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <locale.h>
> int main(int argc, char **argv) {
> if(!setlocale(LC_CTYPE, "ja_JP")) {
>   printf("Can not set locale type\n");
> }
> printf("0\n");
> hdfsFS fs = hdfsConnect("localhost", 9000);
> printf("1\n");
> const char* writePath = "/tmp/\xF0\xA0\x80\x8B.txt";
> printf("2\n");
> hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 
> 0);
> if(!writeFile) {
>   fprintf(stderr, "Failed to open %s for writing!\n", writePath);
>   exit(-1);
> }
> char* buffer = "Hello, World! \xF0\xA0\x80\x8B";
> tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, 
> strlen(buffer)+1);
> if (hdfsFlush(fs, writeFile)) {
>fprintf(stderr, "Failed to 'flush' %s\n", writePath); 
>   exit(-1);
> }
>printf("3\n");
>hdfsCloseFile(fs, writeFile);
> }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4659) Support setting execution bit for regular files

2013-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626730#comment-13626730
 ] 

Daryn Sharp commented on HDFS-4659:
---

I'm not sure that the sticky bit on files should now be allowed.  Older 
*nix flavors used it to keep the text segment of an executable in swap.  Linux 
has never supported it, and I believe most other OSes dropped it.

> Support setting execution bit for regular files
> ---
>
> Key: HDFS-4659
> URL: https://issues.apache.org/jira/browse/HDFS-4659
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
> Attachments: HDFS-4659.patch, HDFS-4659.patch, HDFS-4659.patch, 
> HDFS-4659.patch
>
>
> By default regular files are created with mode "rw-r--r--", which is similar 
> to that on many UNIX platforms. However, setting the execution bit for regular 
> files is not supported by HDFS. 
> It's the client's choice to set the file access mode. HDFS would be easier to 
> use if it could support this, especially when HDFS is accessed by network file 
> system protocols. This JIRA is to track the change to support the execution bit. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4677) Editlog should support synchronous writes

2013-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626712#comment-13626712
 ] 

Daryn Sharp commented on HDFS-4677:
---

A bit dated, but interesting data:
http://stas-blogspot.blogspot.com/2011/11/java-file-flushing-performance.html

> Editlog should support synchronous writes
> -
>
> Key: HDFS-4677
> URL: https://issues.apache.org/jira/browse/HDFS-4677
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 1-win
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
>
> In the current implementation, NameNode editlog performs syncs to the 
> persistent storage using the {{FileChannel#force}} Java APIs. This API is 
> documented to be slower compared to an alternative where {{RandomAccessFile}} 
> is opened with "rws" flags (synchronous writes). 
> We instrumented {{FileChannel#force}} on Windows and in some 
> software/hardware configurations it can perform significantly slower than the 
> “rws” alternative.
> In terms of the Windows APIs, FileChannel#force internally calls 
> [FlushFileBuffers|http://msdn.microsoft.com/en-us/library/windows/desktop/aa364439(v=vs.85).aspx]
>  while RandomAccessFile (“rws”) opens the file with the 
> [FILE_FLAG_WRITE_THROUGH flag|http://support.microsoft.com/kb/99794]. 
> With this Jira I'd like to introduce a flag that provides a means to configure 
> the NameNode to use synchronous writes. There is a catch though: the behavior of 
> the "rws" flag is platform and hardware specific and might not provide the 
> same level of guarantees as {{FileChannel#force}} w.r.t. flushing the on-disk 
> cache. This is an expert-level setting, and it should be documented as such.
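
A minimal sketch of the two flush strategies being compared; the file name is
illustrative and this is not the NameNode edit log code:

{code}
import java.io.File;
import java.io.RandomAccessFile;

public class SyncWriteDemo {
  public static void main(String[] args) throws Exception {
    byte[] record = "edit op".getBytes("UTF-8");
    File f = new File("demo-editlog.bin");      // hypothetical file

    // Current approach: buffered writes followed by an explicit force().
    RandomAccessFile raf1 = new RandomAccessFile(f, "rw");
    raf1.write(record);
    raf1.getChannel().force(true);              // flush data and metadata to disk
    raf1.close();

    // Proposed alternative: open with "rws" so every write is synchronous.
    RandomAccessFile raf2 = new RandomAccessFile(f, "rws");
    raf2.write(record);                         // content and metadata written through
    raf2.close();
  }
}
{code}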

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode

2013-04-09 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626689#comment-13626689
 ] 

Daryn Sharp commented on HDFS-4434:
---

bq.  Exception might print the regular path corresponding to a given inode ID. 
Do you see any issue with it?
Yes.  In some environments, the path name itself might be enough to divulge 
sensitive information.  There are two cases to consider:
# If a directory allows listing, but child paths are not readable, then 
allowing arbitrary inode resolution is ok because the user could see the paths 
anyway.
# If the directory denies access, then we should ensure that no operation will 
expose the path names because the user isn't supposed to be able to see them.

bq.  the exception will not give you any additional info as long *as the 
exception does not return the full path*
Yes, exactly my concern, per #2 above.  Some operations appear to check 
preconditions before using the permission checker, or resolve before checking 
safemode - and then throw exceptions with the resolved path.  Ex. mkdir, 
append, lease recovery, completeFile, rename, delete, and probably others will 
divulge paths.

Concat doesn't appear to handle inode paths.  Does lease renewal work for inode 
paths?
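
To make the ordering concern in #2 concrete, here is a toy illustration of
checking access before resolving, so a failure cannot leak the path name. The
types and method names below are hypothetical stand-ins, not HDFS internals.

import java.nio.file.AccessDeniedException;
import java.util.Map;
import java.util.Set;

class InodePathResolver {
  private final Map<Long, String> inodeIdToPath; // stand-in for the inode map
  private final Set<Long> readableIds;           // stand-in for a permission check

  InodePathResolver(Map<Long, String> inodeIdToPath, Set<Long> readableIds) {
    this.inodeIdToPath = inodeIdToPath;
    this.readableIds = readableIds;
  }

  String resolve(long inodeId) throws AccessDeniedException {
    // Check access *before* resolving, and report only the opaque id on
    // failure, so the exception cannot divulge the real path name.
    if (!readableIds.contains(inodeId)) {
      throw new AccessDeniedException("inode id " + inodeId);
    }
    return inodeIdToPath.get(inodeId);
  }
}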

> Provide a mapping from INodeId to INode
> ---
>
> Key: HDFS-4434
> URL: https://issues.apache.org/jira/browse/HDFS-4434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Suresh Srinivas
> Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, 
> HDFS-4434.patch, HDFS-4434.patch
>
>
> This JIRA is to provide a way to access the INode via its id. The proposed 
> solution is to have an in-memory mapping from INodeId to INode. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4676) TestHDFSFileSystemContract should set MiniDFSCluster variable to null to free up memory

2013-04-09 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4676:
--

Status: Patch Available  (was: Open)

> TestHDFSFileSystemContract should set MiniDFSCluster variable to null to free 
> up memory
> ---
>
> Key: HDFS-4676
> URL: https://issues.apache.org/jira/browse/HDFS-4676
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Minor
> Attachments: HDFS-4676.patch
>
>
> TestHDFSFileSystemContract should reset the cluster member to null so that 
> garbage collection can quickly reclaim the large chunk of memory associated 
> with MiniDFSCluster. This avoids OutOfMemory errors.
> See 
> https://issues.apache.org/jira/browse/HDFS-4434?focusedCommentId=13624246&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13624246
>  and the subsequent Jenkins tests where the OOM was fixed.
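
A minimal sketch of the idea, assuming JUnit 4 and the hadoop-hdfs test
artifact on the classpath; the class name is illustrative and this is not the
attached patch.

import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.After;

public abstract class ClusterTestBaseSketch {
  protected MiniDFSCluster cluster; // holds a large object graph while the test runs

  @After
  public void tearDownCluster() {
    if (cluster != null) {
      cluster.shutdown(); // release daemon threads and storage directories
      cluster = null;     // drop the reference so GC can reclaim the heap it pins
    }
  }
}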

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4339) Persist inode id in fsimage and editlog

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626582#comment-13626582
 ] 

Hudson commented on HDFS-4339:
--

Integrated in Hadoop-Mapreduce-trunk #1394 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1394/])
HDFS-4339. Persist inode id in fsimage and editlog. Contributed by Brandon 
Li. (Revision 1465835)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465835
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageVisitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored.xml


> Persist inode id in fsimage and editlog
> ---
>
> Key: HDFS-4339
> URL: https://issues.apache.org/jira/browse/HDFS-4339
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
> Fix For: 3.0.0
>
> Attachments: editsStored, HDFS-4339.patch, HDFS-4339.patch, 
> HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, 
> HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch
>
>
>  Persist inode id in fsimage and editlog and update offline viewers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3940) Add Gset#clear method and clear the block map when namenode is shutdown

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626580#comment-13626580
 ] 

Hudson commented on HDFS-3940:
--

Integrated in Hadoop-Mapreduce-trunk #1394 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1394/])
HDFS-3940. Add Gset#clear method and clear the block map when namenode is 
shutdown. Contributed by Suresh Srinivas. (Revision 1465851)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465851
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/GSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/GSetByHashMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightGSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestGSet.java


> Add Gset#clear method and clear the block map when namenode is shutdown
> ---
>
> Key: HDFS-3940
> URL: https://issues.apache.org/jira/browse/HDFS-3940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Suresh Srinivas
>Priority: Minor
> Attachments: HDFS-3940.patch, HDFS-3940.patch
>
>
> Per HDFS-3936 it would be useful if GSet has a clear method so BM#close could 
> clear out the LightWeightGSet.
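
A minimal stand-in for the proposed change; this is not the actual GSet
interface from the patch, and the names are illustrative.

interface SimpleGSet<K, E extends K> extends Iterable<E> {
  boolean contains(K key);
  E get(K key);
  E put(E element);
  E remove(K key);
  int size();

  /** Proposed addition: drop every entry so the backing storage can be GC'd. */
  void clear();
}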

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3981) access time is set without holding FSNamesystem write lock

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626578#comment-13626578
 ] 

Hudson commented on HDFS-3981:
--

Integrated in Hadoop-Mapreduce-trunk #1394 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1394/])
HDFS-3981. Fix handling of FSN lock in getBlockLocations. Contributed by 
Xiaobo Peng and Todd Lipcon. (Revision 1465751)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465751
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/MockitoUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSetTimes.java


> access time is set without holding FSNamesystem write lock
> --
>
> Key: HDFS-3981
> URL: https://issues.apache.org/jira/browse/HDFS-3981
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 0.23.3, 2.0.3-alpha, 0.23.5
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
> Fix For: 3.0.0, 2.0.5-beta
>
> Attachments: HDFS-3981-branch-0.23.4.patch, 
> HDFS-3981-branch-0.23.patch, HDFS-3981-branch-2.patch, HDFS-3981-trunk.patch, 
> hdfs-3981.txt
>
>
> An incorrect condition in {{FSNamesystem.getBlockLocations()}} can lead to 
> updating times without the write lock. In most cases this condition will force 
> {{FSNamesystem.getBlockLocations()}} to hold the write lock, even if times do 
> not need to be updated.
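
A toy illustration of the locking pattern at issue (not FSNamesystem code; the
names and the precision value are made up): take the read lock for the common
case and take the write lock only when the access time actually needs updating.

import java.util.concurrent.locks.ReentrantReadWriteLock;

class AccessTimeSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long accessTime;
  private final long precision = 60_000L; // update at most once a minute

  long readWithAccessTimeUpdate(long now) {
    lock.readLock().lock();
    try {
      if (now <= accessTime + precision) {
        return accessTime;               // common case: read lock is enough
      }
    } finally {
      lock.readLock().unlock();
    }
    lock.writeLock().lock();             // rare case: re-check and update under write lock
    try {
      if (now > accessTime + precision) {
        accessTime = now;
      }
      return accessTime;
    } finally {
      lock.writeLock().unlock();
    }
  }
}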

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4339) Persist inode id in fsimage and editlog

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626558#comment-13626558
 ] 

Hudson commented on HDFS-4339:
--

Integrated in Hadoop-Hdfs-trunk #1367 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1367/])
HDFS-4339. Persist inode id in fsimage and editlog. Contributed by Brandon 
Li. (Revision 1465835)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465835
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageVisitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored.xml


> Persist inode id in fsimage and editlog
> ---
>
> Key: HDFS-4339
> URL: https://issues.apache.org/jira/browse/HDFS-4339
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
> Fix For: 3.0.0
>
> Attachments: editsStored, HDFS-4339.patch, HDFS-4339.patch, 
> HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, 
> HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch
>
>
>  Persist inode id in fsimage and editlog and update offline viewers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3940) Add Gset#clear method and clear the block map when namenode is shutdown

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626556#comment-13626556
 ] 

Hudson commented on HDFS-3940:
--

Integrated in Hadoop-Hdfs-trunk #1367 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1367/])
HDFS-3940. Add Gset#clear method and clear the block map when namenode is 
shutdown. Contributed by Suresh Srinivas. (Revision 1465851)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465851
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/GSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/GSetByHashMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightGSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestGSet.java


> Add Gset#clear method and clear the block map when namenode is shutdown
> ---
>
> Key: HDFS-3940
> URL: https://issues.apache.org/jira/browse/HDFS-3940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Suresh Srinivas
>Priority: Minor
> Attachments: HDFS-3940.patch, HDFS-3940.patch
>
>
> Per HDFS-3936 it would be useful if GSet has a clear method so BM#close could 
> clear out the LightWeightGSet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3981) access time is set without holding FSNamesystem write lock

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626554#comment-13626554
 ] 

Hudson commented on HDFS-3981:
--

Integrated in Hadoop-Hdfs-trunk #1367 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1367/])
HDFS-3981. Fix handling of FSN lock in getBlockLocations. Contributed by 
Xiaobo Peng and Todd Lipcon. (Revision 1465751)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465751
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/MockitoUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSetTimes.java


> access time is set without holding FSNamesystem write lock
> --
>
> Key: HDFS-3981
> URL: https://issues.apache.org/jira/browse/HDFS-3981
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 0.23.3, 2.0.3-alpha, 0.23.5
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
> Fix For: 3.0.0, 2.0.5-beta
>
> Attachments: HDFS-3981-branch-0.23.4.patch, 
> HDFS-3981-branch-0.23.patch, HDFS-3981-branch-2.patch, HDFS-3981-trunk.patch, 
> hdfs-3981.txt
>
>
> An incorrect condition in {{FSNamesystem.getBlockLocations()}} can lead to 
> updating times without the write lock. In most cases this condition will force 
> {{FSNamesystem.getBlockLocations()}} to hold the write lock, even if times do 
> not need to be updated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4339) Persist inode id in fsimage and editlog

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626467#comment-13626467
 ] 

Hudson commented on HDFS-4339:
--

Integrated in Hadoop-Yarn-trunk #178 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/178/])
HDFS-4339. Persist inode id in fsimage and editlog. Contributed by Brandon 
Li. (Revision 1465835)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465835
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageVisitor.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored.xml


> Persist inode id in fsimage and editlog
> ---
>
> Key: HDFS-4339
> URL: https://issues.apache.org/jira/browse/HDFS-4339
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
> Fix For: 3.0.0
>
> Attachments: editsStored, HDFS-4339.patch, HDFS-4339.patch, 
> HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, 
> HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch
>
>
>  Persist inode id in fsimage and editlog and update offline viewers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3940) Add Gset#clear method and clear the block map when namenode is shutdown

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626465#comment-13626465
 ] 

Hudson commented on HDFS-3940:
--

Integrated in Hadoop-Yarn-trunk #178 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/178/])
HDFS-3940. Add Gset#clear method and clear the block map when namenode is 
shutdown. Contributed by Suresh Srinivas. (Revision 1465851)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465851
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/GSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/GSetByHashMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightGSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestGSet.java


> Add Gset#clear method and clear the block map when namenode is shutdown
> ---
>
> Key: HDFS-3940
> URL: https://issues.apache.org/jira/browse/HDFS-3940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Suresh Srinivas
>Priority: Minor
> Attachments: HDFS-3940.patch, HDFS-3940.patch
>
>
> Per HDFS-3936 it would be useful if GSet has a clear method so BM#close could 
> clear out the LightWeightGSet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3981) access time is set without holding FSNamesystem write lock

2013-04-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626463#comment-13626463
 ] 

Hudson commented on HDFS-3981:
--

Integrated in Hadoop-Yarn-trunk #178 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/178/])
HDFS-3981. Fix handling of FSN lock in getBlockLocations. Contributed by 
Xiaobo Peng and Todd Lipcon. (Revision 1465751)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1465751
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/MockitoUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSetTimes.java


> access time is set without holding FSNamesystem write lock
> --
>
> Key: HDFS-3981
> URL: https://issues.apache.org/jira/browse/HDFS-3981
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 0.23.3, 2.0.3-alpha, 0.23.5
>Reporter: Xiaobo Peng
>Assignee: Xiaobo Peng
> Fix For: 3.0.0, 2.0.5-beta
>
> Attachments: HDFS-3981-branch-0.23.4.patch, 
> HDFS-3981-branch-0.23.patch, HDFS-3981-branch-2.patch, HDFS-3981-trunk.patch, 
> hdfs-3981.txt
>
>
> An incorrect condition in {{FSNamesystem.getBlockLocations()}} can lead to 
> updating times without the write lock. In most cases this condition will force 
> {{FSNamesystem.getBlockLocations()}} to hold the write lock, even if times do 
> not need to be updated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4678) libhdfs casts Japanese character incorrectly to Java API

2013-04-09 Thread Jiqiu (JIRA)
Jiqiu created HDFS-4678:
---

 Summary:  libhdfs casts Japanese character incorrectly to Java API 
 Key: HDFS-4678
 URL: https://issues.apache.org/jira/browse/HDFS-4678
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: 1.1.2
 Environment: Platform:    Linux64
Locale:    Japanese (ja_JP.UTF-8)
Reporter: Jiqiu
Priority: Critical
 Fix For: 1.2.0


Put a local file whose name contains Japanese characters into HDFS; when 
browsing it in HDFS, the name cannot be recognized. 


Here is the test.c:

#include "hdfs.h"
#include 
#include 

int main(int argc, char **argv) {
if(!setlocale(LC_CTYPE, "ja_JP")) {
  printf("Can not set locale type\n");
}
printf("0\n");
hdfsFS fs = hdfsConnect("localhost", 9000);
printf("1\n");
const char* writePath = "/tmp/\xF0\xA0\x80\x8B.txt";
printf("2\n");
hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 0);
if(!writeFile) {
  fprintf(stderr, "Failed to open %s for writing!\n", writePath);
  exit(-1);
}
char* buffer = "Hello, World! \xF0\xA0\x80\x8B";
tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, 
strlen(buffer)+1);
if (hdfsFlush(fs, writeFile)) {
   fprintf(stderr, "Failed to 'flush' %s\n", writePath); 
  exit(-1);
}
   printf("3\n");
   hdfsCloseFile(fs, writeFile);
}



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4670) Style Hadoop HDFS web ui's with Twitter's bootstrap.

2013-04-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626329#comment-13626329
 ] 

Hadoop QA commented on HDFS-4670:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12577737/ha2.PNG
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4208//console

This message is automatically generated.

> Style Hadoop HDFS web ui's with Twitter's bootstrap.
> 
>
> Key: HDFS-4670
> URL: https://issues.apache.org/jira/browse/HDFS-4670
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: ha2.PNG, Hadoop JournalNode.png, Hadoop NameNode.png, 
> HDFS-4670-0.patch, HDFS-4670-1.patch, hdfs_browser.png
>
>
> A users' first experience of Apache Hadoop is often looking at the web ui.  
> This should give the user confidence that the project is usable and 
> relatively current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4670) Style Hadoop HDFS web ui's with Twitter's bootstrap.

2013-04-09 Thread Fengdong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengdong Yu updated HDFS-4670:
--

Attachment: ha2.PNG

> Style Hadoop HDFS web ui's with Twitter's bootstrap.
> 
>
> Key: HDFS-4670
> URL: https://issues.apache.org/jira/browse/HDFS-4670
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: ha2.PNG, Hadoop JournalNode.png, Hadoop NameNode.png, 
> HDFS-4670-0.patch, HDFS-4670-1.patch, hdfs_browser.png
>
>
> A users' first experience of Apache Hadoop is often looking at the web ui.  
> This should give the user confidence that the project is usable and 
> relatively current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4670) Style Hadoop HDFS web ui's with Twitter's bootstrap.

2013-04-09 Thread Fengdong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626326#comment-13626326
 ] 

Fengdong Yu commented on HDFS-4670:
---

I attached HA2.PNG.

I applied this patch in my test cluster with HA enabled, but the standby 
NameNode's web UI also displays "browse file system", which should be hidden.

Please fix this issue.

> Style Hadoop HDFS web ui's with Twitter's bootstrap.
> 
>
> Key: HDFS-4670
> URL: https://issues.apache.org/jira/browse/HDFS-4670
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.3-alpha
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Attachments: ha2.PNG, Hadoop JournalNode.png, Hadoop NameNode.png, 
> HDFS-4670-0.patch, HDFS-4670-1.patch, hdfs_browser.png
>
>
> A users' first experience of Apache Hadoop is often looking at the web ui.  
> This should give the user confidence that the project is usable and 
> relatively current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira