[jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-11-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144576#comment-13144576
 ] 

Hadoop QA commented on HDFS-2476:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12502566/hashStructures.patch-7
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.TestFileAppend2
  org.apache.hadoop.hdfs.TestBalancerBandwidth

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1535//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1535//console

This message is automatically generated.

> More CPU efficient data structure for 
> under-replicated/over-replicated/invalidate blocks
> 
>
> Key: HDFS-2476
> URL: https://issues.apache.org/jira/browse/HDFS-2476
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: hashStructures.patch, hashStructures.patch-2, 
> hashStructures.patch-3, hashStructures.patch-4, hashStructures.patch-5, 
> hashStructures.patch-6, hashStructures.patch-7
>
>
> This patch introduces two hash data structures for storing under-replicated, 
> over-replicated and invalidated blocks:
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we are using java.util.TreeSet, which adds 
> unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> - cluster instability times, when these queues (especially under-replicated) 
> tend to grow quite drastically,
> - initial cluster startup, when the queues are initialized after leaving 
> safemode,
> - block reports,
> - explicit acks for block addition and deletion.
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike the current 
> O(log n) for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of 
> elements (get+remove), which are used for handling the queues.
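
A minimal sketch of the polling contract in point 4 above, built on 
java.util.LinkedHashSet rather than the patch's custom, self-resizing 
open-hash structure; PollableSet and pollN are illustrative names, not the 
patch's API:

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.LinkedHashSet;
    import java.util.List;

    // Models O(1) expected add/contains/remove plus bulk polling (get+remove).
    public class PollableSet<T> {
      private final LinkedHashSet<T> set = new LinkedHashSet<T>();

      public boolean add(T e)      { return set.add(e); }
      public boolean contains(T e) { return set.contains(e); }
      public boolean remove(T e)   { return set.remove(e); }

      /** Removes and returns up to n elements, in insertion order. */
      public List<T> pollN(int n) {
        List<T> polled = new ArrayList<T>(n);
        Iterator<T> it = set.iterator();
        while (n-- > 0 && it.hasNext()) {
          polled.add(it.next());
          it.remove();
        }
        return polled;
      }
    }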

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2495) Increase granularity of write operations in ReplicationMonitor thus reducing contention for write lock

2011-11-04 Thread Tomasz Nykiel (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Nykiel updated HDFS-2495:


Attachment: replicationMon.patch-1

> Increase granularity of write operations in ReplicationMonitor thus reducing 
> contention for write lock
> --
>
> Key: HDFS-2495
> URL: https://issues.apache.org/jira/browse/HDFS-2495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: replicationMon.patch, replicationMon.patch-1
>
>
> For processing blocks in ReplicationMonitor 
> (BlockManager.computeReplicationWork), we first obtain a list of blocks to be 
> replicated by calling chooseUnderReplicatedBlocks, and then for each block 
> that was found, we call computeReplicationWorkForBlock. The latter processes 
> a block in three stages, acquiring the write lock twice per call:
> 1. obtaining block-related info (liveNodes, srcNode, etc.) under lock
> 2. choosing targets for replication
> 3. scheduling replication (under lock)
> We would like to change this behaviour and decrease contention for the write 
> lock by batching blocks and executing stages 1-3 for sets of blocks, rather 
> than for each block separately. This would decrease the number of write lock 
> acquisitions to 2, from 2 * number of blocks.
> Also, the info-level logging can be pushed outside the write lock.
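
A minimal sketch of the batched locking pattern described above (illustrative 
names; Block, Work and the stage methods are placeholders, not BlockManager's 
actual code):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Two write-lock acquisitions for the whole batch, instead of two per block.
    class BatchedReplicationSketch {
      static class Block {}
      static class Work {}

      private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

      void computeReplicationWork(List<Block> blocks) {
        List<Work> work = new ArrayList<Work>();
        lock.writeLock().lock();                    // acquisition 1, whole batch
        try {
          for (Block b : blocks) work.add(gatherInfo(b));   // stage 1
        } finally {
          lock.writeLock().unlock();
        }
        for (Work w : work) chooseTargets(w);               // stage 2, unlocked
        lock.writeLock().lock();                    // acquisition 2, whole batch
        try {
          for (Work w : work) scheduleReplication(w);       // stage 3
        } finally {
          lock.writeLock().unlock();
        }
        System.out.println("scheduled " + work.size());     // logging, unlocked
      }

      Work gatherInfo(Block b) { return new Work(); }
      void chooseTargets(Work w) {}
      void scheduleReplication(Work w) {}
    }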

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2495) Increase granularity of write operations in ReplicationMonitor thus reducing contention for write lock

2011-11-04 Thread Tomasz Nykiel (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144544#comment-13144544
 ] 

Tomasz Nykiel commented on HDFS-2495:
-

Nicholas, I fixed the formatting issues in the patch. Thanks.

> Increase granularity of write operations in ReplicationMonitor thus reducing 
> contention for write lock
> --
>
> Key: HDFS-2495
> URL: https://issues.apache.org/jira/browse/HDFS-2495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: replicationMon.patch, replicationMon.patch-1
>
>
> For processing blocks in ReplicationMonitor 
> (BlockManager.computeReplicationWork), we first obtain a list of blocks to be 
> replicated by calling chooseUnderReplicatedBlocks, and then for each block 
> that was found, we call computeReplicationWorkForBlock. The latter processes 
> a block in three stages, acquiring the write lock twice per call:
> 1. obtaining block-related info (liveNodes, srcNode, etc.) under lock
> 2. choosing targets for replication
> 3. scheduling replication (under lock)
> We would like to change this behaviour and decrease contention for the write 
> lock by batching blocks and executing stages 1-3 for sets of blocks, rather 
> than for each block separately. This would decrease the number of write lock 
> acquisitions to 2, from 2 * number of blocks.
> Also, the info-level logging can be pushed outside the write lock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2495) Increase granularity of write operations in ReplicationMonitor thus reducing contention for write lock

2011-11-04 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144545#comment-13144545
 ] 

jirapos...@reviews.apache.org commented on HDFS-2495:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2542/
---

(Updated 2011-11-05 03:30:29.153921)


Review request for Hairong Kuang.


Changes
---

-synced with the trunk
-fixed formatting issues


Summary
---

For processing blocks in ReplicationMonitor 
(BlockManager.computeReplicationWork), we first obtain a list of blocks to be 
replicated by calling chooseUnderReplicatedBlocks, and then for each block 
that was found, we call computeReplicationWorkForBlock. The latter processes a 
block in three stages, acquiring the write lock twice per call:

1. obtaining block-related info (liveNodes, srcNode, etc.) under lock
2. choosing targets for replication
3. scheduling replication (under lock)

We would like to change this behaviour and decrease contention for the write 
lock by batching blocks and executing stages 1-3 for sets of blocks, rather 
than for each block separately. This would decrease the number of write lock 
acquisitions to 2, from 2 * number of blocks.

Also, the info-level logging can be pushed outside the write lock.


This addresses bug HDFS-2495.
https://issues.apache.org/jira/browse/HDFS-2495


Diffs (updated)
-

  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
 1197875 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
 1197875 

Diff: https://reviews.apache.org/r/2542/diff


Testing
---

Running JUnit.


Thanks,

Tomasz



> Increase granularity of write operations in ReplicationMonitor thus reducing 
> contention for write lock
> --
>
> Key: HDFS-2495
> URL: https://issues.apache.org/jira/browse/HDFS-2495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: replicationMon.patch, replicationMon.patch-1
>
>
> For processing blocks in ReplicationMonitor 
> (BlockManager.computeReplicationWork), we first obtain a list of blocks to be 
> replicated by calling chooseUnderReplicatedBlocks, and then for each block 
> that was found, we call computeReplicationWorkForBlock. The latter processes 
> a block in three stages, acquiring the write lock twice per call:
> 1. obtaining block-related info (liveNodes, srcNode, etc.) under lock
> 2. choosing targets for replication
> 3. scheduling replication (under lock)
> We would like to change this behaviour and decrease contention for the write 
> lock by batching blocks and executing stages 1-3 for sets of blocks, rather 
> than for each block separately. This would decrease the number of write lock 
> acquisitions to 2, from 2 * number of blocks.
> Also, the info-level logging can be pushed outside the write lock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-11-04 Thread Tomasz Nykiel (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Nykiel updated HDFS-2476:


Attachment: hashStructures.patch-7

Synced with the trunk.

> More CPU efficient data structure for 
> under-replicated/over-replicated/invalidate blocks
> 
>
> Key: HDFS-2476
> URL: https://issues.apache.org/jira/browse/HDFS-2476
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: hashStructures.patch, hashStructures.patch-2, 
> hashStructures.patch-3, hashStructures.patch-4, hashStructures.patch-5, 
> hashStructures.patch-6, hashStructures.patch-7
>
>
> This patch introduces two hash data structures for storing under-replicated, 
> over-replicated and invalidated blocks:
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we are using java.util.TreeSet, which adds 
> unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> - cluster instability times, when these queues (especially under-replicated) 
> tend to grow quite drastically,
> - initial cluster startup, when the queues are initialized after leaving 
> safemode,
> - block reports,
> - explicit acks for block addition and deletion.
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike the current 
> O(log n) for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of 
> elements (get+remove), which are used for handling the queues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2246) Shortcut a local client reads to a Datanodes files directly

2011-11-04 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144522#comment-13144522
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2246:
--

- For DataNode.getBlockLocalPathInfo(..),
-* DataNode should check DFS_CLIENT_READ_SHORTCIRCUIT.
-* Should it call checkBlockToken?
-* Should it check whether the client address is local?

- DataNode.getBlockLocalPathInfo(..) is not supposed to call 
FSDataset.getMetaFile(..) directly.  It should only call FSDatasetInterface 
methods.  How about changing the new FSDatasetInterface.getBlockFile(..) to 
getBlockLocalPathInfo(..), which returns BlockLocalPathInfo?

- Change DataNode.userWithLocalPathAccess to final.

- SoftLRUCache is implemented with SoftReference values.  In general, it is a 
useful data structure.  However, it may not be useful here, since the size of 
the cache is about 10k and the size of each value is small (< 1k).  I suggest 
using a synchronized LinkedHashMap (no SoftReference) in BlockReaderLocal.cache 
and removing SoftLRUCache from the patch (a sketch follows below).

- BlockReaderLocal.cache should not be initialized if checkShortCircuit is not 
enabled.

I have not yet finished reading the client code.
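
A minimal sketch of the suggested bounded LRU cache, using LinkedHashMap's 
access-order mode plus a synchronized wrapper; newLruCache is an illustrative 
name, and the 10000 bound only echoes the ~10k cache size mentioned above:

    import java.util.Collections;
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class LruCacheSketch {
      public static <K, V> Map<K, V> newLruCache(final int maxEntries) {
        return Collections.synchronizedMap(
            new LinkedHashMap<K, V>(16, 0.75f, true) { // true = access order
              @Override
              protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;   // evict least recently used
              }
            });
      }

      public static void main(String[] args) {
        Map<String, String> cache = newLruCache(10000);
        cache.put("blk_1", "/data/current/blk_1");
        System.out.println(cache.get("blk_1"));
      }
    }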


> Shortcut a local client reads to a Datanodes files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch, 
> HDFS-2246-branch-0.20-security-205.patch, 
> HDFS-2246-branch-0.20-security-205.patch, 
> HDFS-2246-branch-0.20-security.patch, HDFS-2246.20s.1.patch, 
> HDFS-2246.20s.2.txt, HDFS-2246.20s.3.txt, HDFS-2246.20s.4.txt, 
> HDFS-2246.20s.patch, localReadShortcut20-security.2patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-1943) fail to start datanode while start-dfs.sh is executed by root user

2011-11-04 Thread Roman Shaposhnik (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HDFS-1943:
---

Affects Version/s: 0.20.205.0

> fail to start datanode while start-dfs.sh is executed by root user
> --
>
> Key: HDFS-1943
> URL: https://issues.apache.org/jira/browse/HDFS-1943
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.20.205.0, 0.22.0, 0.23.0
>Reporter: Wei Yongjun
>Assignee: Wei Yongjun
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: HDFS-1943.patch
>
>
> When start-dfs.sh is run by the root user, we get the following error message:
> # start-dfs.sh
> Starting namenodes on [localhost ]
> localhost: namenode running as process 2556. Stop it first.
> localhost: starting datanode, logging to 
> /usr/hadoop/hadoop-common-0.23.0-SNAPSHOT/bin/../logs/hadoop-root-datanode-cspf01.out
> localhost: Unrecognized option: -jvm
> localhost: Could not create the Java virtual machine.
> The -jvm option should be passed to jsvc when we are starting a secure
> datanode, but it is still passed to java when start-dfs.sh is run by root
> while the secure datanode is disabled. This is a bug in bin/hdfs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-617) Support for non-recursive create() in HDFS

2011-11-04 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144468#comment-13144468
 ] 

Suresh Srinivas commented on HDFS-617:
--

+1 for 20-security patch.

> Support for non-recursive create() in HDFS
> --
>
> Key: HDFS-617
> URL: https://issues.apache.org/jira/browse/HDFS-617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client, name-node
>Affects Versions: 0.20-append
>Reporter: Kan Zhang
>Assignee: Kan Zhang
> Fix For: 0.20.205.1, 0.21.0
>
> Attachments: HDFS-617-branch-0.20-security.patch, 
> HDFS-617_20-append.patch, h617-01.patch, h617-02.patch, h617-03.patch, 
> h617-04.patch, h617-06.patch
>
>
> HADOOP-4952 calls for a create call that doesn't automatically create missing 
> parent directories.
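
A hedged sketch of the kind of call this issue enables, assuming the 
FileSystem#createNonRecursive signature used below; the path and parameters 
are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class NonRecursiveCreateExample {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path("/existing-dir/file.txt"); // parent must already exist
        // Unlike create(), this is expected to fail if /existing-dir is
        // missing, rather than creating it on the fly.
        FSDataOutputStream out = fs.createNonRecursive(
            p, true, 4096, (short) 3, 64L * 1024 * 1024, null);
        out.close();
      }
    }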

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

2011-11-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144403#comment-13144403
 ] 

Hudson commented on HDFS-2477:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1270 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1270/])
Add the missing test file to HDFS-2477.

hairong : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1197801
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfo.java


> Optimize computing the diff between a block report and the namenode state.
> --
>
> Key: HDFS-2477
> URL: https://issues.apache.org/jira/browse/HDFS-2477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Fix For: 0.24.0
>
> Attachments: reportDiff.patch, reportDiff.patch-2, 
> reportDiff.patch-3, reportDiff.patch-4, reportDiff.patch-5
>
>
> When a block report is processed at the NN, BlockManager.reportDiff 
> traverses all blocks contained in the report, and each block that is also 
> present in the corresponding datanode descriptor is moved to the head of the 
> list of blocks in that datanode descriptor.
> With HDFS-395, the huge majority of the blocks in the report are also present 
> in the datanode descriptor, which means that almost every block in the report 
> will have to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
> which removes a block from a list and then inserts it. In this process, we 
> call findDatanode several times (AFAIR, 6 times for each moveBlockToHead 
> call). findDatanode is relatively expensive, since it goes linearly through 
> the triplets to locate the given datanode.
> With this patch, we do some memoization of findDatanode, so we can save 2 
> findDatanode calls. Our experiments show that this can improve reportDiff 
> (which is executed under the write lock) by around 15%. Currently, with 
> HDFS-395, reportDiff is responsible for almost 100% of the block report 
> processing time.
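
A minimal sketch of the memoization pattern being described (illustrative 
names, not the patch's code): the index found by one linear scan is threaded 
through the subsequent list operations instead of being recomputed by each.

    import java.util.ArrayList;
    import java.util.List;

    // findSlot stands in for the linear findDatanode scan over the triplets.
    public class MemoizedMoveSketch {
      private final List<String> blocks = new ArrayList<String>();

      private int findSlot(String block) {  // linear, hence expensive
        return blocks.indexOf(block);
      }

      public void moveToHead(String block) {
        int slot = findSlot(block);         // one scan, memoized in 'slot'
        if (slot <= 0) return;              // absent, or already at the head
        blocks.remove(slot);                // reuses the known index
        blocks.add(0, block);               // no second scan needed
      }

      public static void main(String[] args) {
        MemoizedMoveSketch s = new MemoizedMoveSketch();
        s.blocks.add("blk_1"); s.blocks.add("blk_2"); s.blocks.add("blk_3");
        s.moveToHead("blk_3");
        System.out.println(s.blocks);       // [blk_3, blk_1, blk_2]
      }
    }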

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2540) Change WebHdfsFileSystem to two-step create/append

2011-11-04 Thread Tsz Wo (Nicholas), SZE (Created) (JIRA)
Change WebHdfsFileSystem to two-step create/append
--

 Key: HDFS-2540
 URL: https://issues.apache.org/jira/browse/HDFS-2540
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

2011-11-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144385#comment-13144385
 ] 

Hudson commented on HDFS-2477:
--

Integrated in Hadoop-Common-trunk-Commit #1248 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1248/])
Add the missing test file to HDFS-2477.

hairong : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1197801
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfo.java


> Optimize computing the diff between a block report and the namenode state.
> --
>
> Key: HDFS-2477
> URL: https://issues.apache.org/jira/browse/HDFS-2477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Fix For: 0.24.0
>
> Attachments: reportDiff.patch, reportDiff.patch-2, 
> reportDiff.patch-3, reportDiff.patch-4, reportDiff.patch-5
>
>
> When a block report is processed at the NN, BlockManager.reportDiff 
> traverses all blocks contained in the report, and each block that is also 
> present in the corresponding datanode descriptor is moved to the head of the 
> list of blocks in that datanode descriptor.
> With HDFS-395, the huge majority of the blocks in the report are also present 
> in the datanode descriptor, which means that almost every block in the report 
> will have to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
> which removes a block from a list and then inserts it. In this process, we 
> call findDatanode several times (AFAIR, 6 times for each moveBlockToHead 
> call). findDatanode is relatively expensive, since it goes linearly through 
> the triplets to locate the given datanode.
> With this patch, we do some memoization of findDatanode, so we can save 2 
> findDatanode calls. Our experiments show that this can improve reportDiff 
> (which is executed under the write lock) by around 15%. Currently, with 
> HDFS-395, reportDiff is responsible for almost 100% of the block report 
> processing time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

2011-11-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144383#comment-13144383
 ] 

Hudson commented on HDFS-2477:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1322 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1322/])
Add the missing test file to HDFS-2477.

hairong : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1197801
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfo.java


> Optimize computing the diff between a block report and the namenode state.
> --
>
> Key: HDFS-2477
> URL: https://issues.apache.org/jira/browse/HDFS-2477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Fix For: 0.24.0
>
> Attachments: reportDiff.patch, reportDiff.patch-2, 
> reportDiff.patch-3, reportDiff.patch-4, reportDiff.patch-5
>
>
> When a block report is processed at the NN, BlockManager.reportDiff 
> traverses all blocks contained in the report, and each block that is also 
> present in the corresponding datanode descriptor is moved to the head of the 
> list of blocks in that datanode descriptor.
> With HDFS-395, the huge majority of the blocks in the report are also present 
> in the datanode descriptor, which means that almost every block in the report 
> will have to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
> which removes a block from a list and then inserts it. In this process, we 
> call findDatanode several times (AFAIR, 6 times for each moveBlockToHead 
> call). findDatanode is relatively expensive, since it goes linearly through 
> the triplets to locate the given datanode.
> With this patch, we do some memoization of findDatanode, so we can save 2 
> findDatanode calls. Our experiments show that this can improve reportDiff 
> (which is executed under the write lock) by around 15%. Currently, with 
> HDFS-395, reportDiff is responsible for almost 100% of the block report 
> processing time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

2011-11-04 Thread Hairong Kuang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144380#comment-13144380
 ] 

Hairong Kuang commented on HDFS-2477:
-

Eli, I just added the missing file TestBlockInfo.java. Thanks for checking this!

> Optimize computing the diff between a block report and the namenode state.
> --
>
> Key: HDFS-2477
> URL: https://issues.apache.org/jira/browse/HDFS-2477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Fix For: 0.24.0
>
> Attachments: reportDiff.patch, reportDiff.patch-2, 
> reportDiff.patch-3, reportDiff.patch-4, reportDiff.patch-5
>
>
> When a block report is processed at the NN, BlockManager.reportDiff 
> traverses all blocks contained in the report, and each block that is also 
> present in the corresponding datanode descriptor is moved to the head of the 
> list of blocks in that datanode descriptor.
> With HDFS-395, the huge majority of the blocks in the report are also present 
> in the datanode descriptor, which means that almost every block in the report 
> will have to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
> which removes a block from a list and then inserts it. In this process, we 
> call findDatanode several times (AFAIR, 6 times for each moveBlockToHead 
> call). findDatanode is relatively expensive, since it goes linearly through 
> the triplets to locate the given datanode.
> With this patch, we do some memoization of findDatanode, so we can save 2 
> findDatanode calls. Our experiments show that this can improve reportDiff 
> (which is executed under the write lock) by around 15%. Currently, with 
> HDFS-395, reportDiff is responsible for almost 100% of the block report 
> processing time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2539) Support doAs and GETHOMEDIRECTORY in webhdfs

2011-11-04 Thread Tsz Wo (Nicholas), SZE (Created) (JIRA)
Support doAs and GETHOMEDIRECTORY in webhdfs


 Key: HDFS-2539
 URL: https://issues.apache.org/jira/browse/HDFS-2539
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-11-04 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144367#comment-13144367
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

1. Agree, symlink could be optional.  BTW, "isDir" and "isSymlink" are 
replaced with "type", which is an enum {FILE, DIRECTORY, SYMLINK}.

2. Sure.  Let's change it in a separate JIRA then.

3. Path is a parameter value, therefore case sensitive.

I think case sensitivity causes more confusion:

Q: Why is this not working?  http://nn:port/webhdfs/v1/path?Op=GETFILECHECKSUM
A: You must use lower case: "Op" should be "op".

Q: Why is this not working?  
http://nn:port/webhdfs/v1/path?op=GETFILECHECKSUM&does=nicholas
A: It is a typo: "does" should be "doas".
Q: But "doas" looks more like a typo than "does".  I wish I could use "doAs".

How about op values and boolean values?  Do you also think that they should be 
case sensitive?

4. "name" suggests file/directory names.  "localName" is an empty string if 
the full path is given.  How about calling it "pathSuffix"?

5. As Arpit mentioned, the payload for liststatus won't increase 
significantly.  We only need two more words per request (instead of one more 
word per status).

Is Hoop going to support file systems other than HDFS?

It is common to convert JSON to/from XML.  We should make the conversion 
trivial.

6. Webhdfs and hoop should not share the same scheme, since they require 
different implementations.  Webhdfs should use "webhdfs://".  For hoop, I 
suggest not using "http://" as a file system scheme.
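
A hedged sketch of points 1 and 3 above (illustrative names; GetOp here is not 
the project's GetOpParam): the new "type" enum, and op values parsed 
case-insensitively while parameter names stay lower case.

    public class WebHdfsEnumSketch {
      // Mirrors the enum replacing the former "isDir"/"isSymlink" booleans.
      public enum PathType { FILE, DIRECTORY, SYMLINK }

      public enum GetOp { OPEN, GETFILESTATUS, LISTSTATUS, GETFILECHECKSUM }

      // Op *values* accepted in any case; the parameter *name* stays "op".
      public static GetOp parseOp(String value) {
        return GetOp.valueOf(value.toUpperCase());
      }

      public static void main(String[] args) {
        System.out.println(parseOp("getfilechecksum")); // GETFILECHECKSUM
        System.out.println(PathType.FILE);
      }
    }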


> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --
>
> Key: HDFS-2316
> URL: https://issues.apache.org/jira/browse/HDFS-2316
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: WebHdfsAPI20111020.pdf, WebHdfsAPI2003.pdf
>
>
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a 
> read-only FileSystem and does not provide "write" access.
> In HDFS-2284, we propose to have webhdfs provide a complete FileSystem 
> implementation for accessing HDFS over HTTP.  This is the umbrella JIRA for 
> the tasks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2246) Shortcut a local client reads to a Datanodes files directly

2011-11-04 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2246:
---

Attachment: HDFS-2246-branch-0.20-security-205.patch

Updated the patch for 205 against the latest state of the branch; it addresses 
Arpit's comments.

> Shortcut a local client reads to a Datanodes files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch, 
> HDFS-2246-branch-0.20-security-205.patch, 
> HDFS-2246-branch-0.20-security-205.patch, 
> HDFS-2246-branch-0.20-security.patch, HDFS-2246.20s.1.patch, 
> HDFS-2246.20s.2.txt, HDFS-2246.20s.3.txt, HDFS-2246.20s.4.txt, 
> HDFS-2246.20s.patch, localReadShortcut20-security.2patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2246) Shortcut a local client reads to a Datanodes files directly

2011-11-04 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2246:
---

Release Note: 
# dfs.block.local-path-access.user is the key in the datanode configuration to 
specify the user allowed to do short circuit reads.
# dfs.client.read.shortcircuit is the key in the client side configuration to 
enable short circuit reads.
# dfs.client.read.shortcircuit.checksum is the key to bypass the checksum check 
at the client side.

By default none of the above are enabled, and short circuit reads will not kick 
in.
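
A hedged sketch of setting these keys through the Java Configuration API (the 
user name is only an example value, and treating the checksum key as a boolean 
is an assumption):

    import org.apache.hadoop.conf.Configuration;

    public class ShortCircuitConfExample {
      public static void main(String[] args) {
        // Datanode side: the user allowed to read block files directly
        // ("hbase" is only an example value).
        Configuration dnConf = new Configuration();
        dnConf.set("dfs.block.local-path-access.user", "hbase");

        // Client side: opt in to short circuit reads and (assumed boolean)
        // bypass the client-side checksum check.
        Configuration clientConf = new Configuration();
        clientConf.setBoolean("dfs.client.read.shortcircuit", true);
        clientConf.setBoolean("dfs.client.read.shortcircuit.checksum", true);
      }
    }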

> Shortcut a local client reads to a Datanodes files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch, 
> HDFS-2246-branch-0.20-security-205.patch, 
> HDFS-2246-branch-0.20-security.patch, HDFS-2246.20s.1.patch, 
> HDFS-2246.20s.2.txt, HDFS-2246.20s.3.txt, HDFS-2246.20s.4.txt, 
> HDFS-2246.20s.patch, localReadShortcut20-security.2patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-617) Support for non-recursive create() in HDFS

2011-11-04 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-617:
--

Attachment: HDFS-617-branch-0.20-security.patch

Patch for 20-security uploaded.
The deprecation of the older create API is removed. I didn't bump the client 
protocol version, for backward compatibility with 205. I also added 
FileAlreadyExistsException to Hdfs, because otherwise hdfs would have a 
dependency on mapred.

> Support for non-recursive create() in HDFS
> --
>
> Key: HDFS-617
> URL: https://issues.apache.org/jira/browse/HDFS-617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client, name-node
>Affects Versions: 0.20-append
>Reporter: Kan Zhang
>Assignee: Kan Zhang
> Fix For: 0.20.205.1, 0.21.0
>
> Attachments: HDFS-617-branch-0.20-security.patch, 
> HDFS-617_20-append.patch, h617-01.patch, h617-02.patch, h617-03.patch, 
> h617-04.patch, h617-06.patch
>
>
> HADOOP-4952 calls for a create call that doesn't automatically create missing 
> parent directories.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-617) Support for non-recursive create() in HDFS

2011-11-04 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-617:
--

Target Version/s: 0.20.205.1
   Fix Version/s: 0.20.205.1

> Support for non-recursive create() in HDFS
> --
>
> Key: HDFS-617
> URL: https://issues.apache.org/jira/browse/HDFS-617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client, name-node
>Affects Versions: 0.20-append
>Reporter: Kan Zhang
>Assignee: Kan Zhang
> Fix For: 0.20.205.1, 0.21.0
>
> Attachments: HDFS-617-branch-0.20-security.patch, 
> HDFS-617_20-append.patch, h617-01.patch, h617-02.patch, h617-03.patch, 
> h617-04.patch, h617-06.patch
>
>
> HADOOP-4952 calls for a create call that doesn't automatically create missing 
> parent directories.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2528) webhdfs rest call to a secure dn fails when a token is sent

2011-11-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144329#comment-13144329
 ] 

Hadoop QA commented on HDFS-2528:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12502525/h2528_2003_0.20s.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1534//console

This message is automatically generated.

> webhdfs rest call to a secure dn fails when a token is sent
> ---
>
> Key: HDFS-2528
> URL: https://issues.apache.org/jira/browse/HDFS-2528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2528_2001.patch, h2528_2001_0.20s.patch, 
> h2528_2001b.patch, h2528_2001b_0.20s.patch, h2528_2002.patch, 
> h2528_2002_0.20s.patch, h2528_2003.patch, h2528_2003_0.20s.patch, 
> h2528_2003_0.20s.patch
>
>
> curl -L -u : --negotiate -i 
> "http://NN:50070/webhdfs/v1/tmp/webhdfs_data/file_small_data.txt?op=OPEN"
> The following exception is thrown by the datanode when the redirect happens:
> {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Call
>  to  failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]"}}
> Interestingly, when using ./bin/hadoop with a webhdfs path, we are able to cat 
> or tail a file successfully.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2528) webhdfs rest call to a secure dn fails when a token is sent

2011-11-04 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-2528:
-

Attachment: h2528_2003_0.20s.patch

> webhdfs rest call to a secure dn fails when a token is sent
> ---
>
> Key: HDFS-2528
> URL: https://issues.apache.org/jira/browse/HDFS-2528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2528_2001.patch, h2528_2001_0.20s.patch, 
> h2528_2001b.patch, h2528_2001b_0.20s.patch, h2528_2002.patch, 
> h2528_2002_0.20s.patch, h2528_2003.patch, h2528_2003_0.20s.patch, 
> h2528_2003_0.20s.patch
>
>
> curl -L -u : --negotiate -i 
> "http://NN:50070/webhdfs/v1/tmp/webhdfs_data/file_small_data.txt?op=OPEN"
> The following exception is thrown by the datanode when the redirect happens:
> {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Call
>  to  failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]"}}
> Interestingly, when using ./bin/hadoop with a webhdfs path, we are able to cat 
> or tail a file successfully.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2495) Increase granularity of write operations in ReplicationMonitor thus reducing contention for write lock

2011-11-04 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144305#comment-13144305
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2495:
--

Hi Tom, could you clean up the indentation changes in the patch?  Also, we 
don't use tabs in Hadoop.

> Increase granularity of write operations in ReplicationMonitor thus reducing 
> contention for write lock
> --
>
> Key: HDFS-2495
> URL: https://issues.apache.org/jira/browse/HDFS-2495
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: replicationMon.patch
>
>
> For processing blocks in ReplicationMonitor 
> (BlockManager.computeReplicationWork), we first obtain a list of blocks to be 
> replicated by calling chooseUnderReplicatedBlocks, and then for each block 
> that was found, we call computeReplicationWorkForBlock. The latter processes 
> a block in three stages, acquiring the write lock twice per call:
> 1. obtaining block-related info (liveNodes, srcNode, etc.) under lock
> 2. choosing targets for replication
> 3. scheduling replication (under lock)
> We would like to change this behaviour and decrease contention for the write 
> lock by batching blocks and executing stages 1-3 for sets of blocks, rather 
> than for each block separately. This would decrease the number of write lock 
> acquisitions to 2, from 2 * number of blocks.
> Also, the info-level logging can be pushed outside the write lock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2246) Shortcut a local client reads to a Datanodes files directly

2011-11-04 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144275#comment-13144275
 ] 

Arpit Gupta commented on HDFS-2246:
---

Also noticed that when running a hadoop CLI command, the following is printed 
out:

11/11/04 19:42:49 INFO hdfs.DFSClient: Short circuit read is false

We should change the above to debug.

> Shortcut a local client reads to a Datanodes files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch, 
> HDFS-2246-branch-0.20-security-205.patch, 
> HDFS-2246-branch-0.20-security.patch, HDFS-2246.20s.1.patch, 
> HDFS-2246.20s.2.txt, HDFS-2246.20s.3.txt, HDFS-2246.20s.4.txt, 
> HDFS-2246.20s.patch, localReadShortcut20-security.2patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2246) Shortcut a local client reads to a Datanodes files directly

2011-11-04 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144259#comment-13144259
 ] 

Arpit Gupta commented on HDFS-2246:
---

The javadoc for BlockReaderLocal states:

The client performing short circuit reads must be running as the same user as 
that of the datanode.

Whereas the implementation states that 'dfs.block.local-path-access.user' needs 
to be set on the DN to the appropriate user. We should update the javadoc with 
that info.

> Shortcut a local client reads to a Datanodes files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch, 
> HDFS-2246-branch-0.20-security-205.patch, 
> HDFS-2246-branch-0.20-security.patch, HDFS-2246.20s.1.patch, 
> HDFS-2246.20s.2.txt, HDFS-2246.20s.3.txt, HDFS-2246.20s.4.txt, 
> HDFS-2246.20s.patch, localReadShortcut20-security.2patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2246) Shortcut a local client reads to a Datanodes files directly

2011-11-04 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2246:
---

Attachment: HDFS-2246-branch-0.20-security-205.patch

Patch for 205.

> Shortcut a local client reads to a Datanodes files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch, 
> HDFS-2246-branch-0.20-security-205.patch, 
> HDFS-2246-branch-0.20-security.patch, HDFS-2246.20s.1.patch, 
> HDFS-2246.20s.2.txt, HDFS-2246.20s.3.txt, HDFS-2246.20s.4.txt, 
> HDFS-2246.20s.patch, localReadShortcut20-security.2patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2528) webhdfs rest call to a secure dn fails when a token is sent

2011-11-04 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144165#comment-13144165
 ] 

Arpit Gupta commented on HDFS-2528:
---

Actually the issue mentioned in comment 
https://issues.apache.org/jira/browse/HDFS-2528?focusedCommentId=13143641&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13143641
 is when you make a getdelegationtoken call and send a delegation param.

For example,

curl -i 
"http://NN:50070/webhdfs/v1/tmp/webhdfs_data/file_small_data.txt?op=getdelegationtoken&delegation=sdfsd"

throws a 500.

When the delegation param is not sent, it throws a 401 with no error response.

> webhdfs rest call to a secure dn fails when a token is sent
> ---
>
> Key: HDFS-2528
> URL: https://issues.apache.org/jira/browse/HDFS-2528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2528_2001.patch, h2528_2001_0.20s.patch, 
> h2528_2001b.patch, h2528_2001b_0.20s.patch, h2528_2002.patch, 
> h2528_2002_0.20s.patch, h2528_2003.patch, h2528_2003_0.20s.patch
>
>
> curl -L -u : --negotiate -i 
> "http://NN:50070/webhdfs/v1/tmp/webhdfs_data/file_small_data.txt?op=OPEN"
> The following exception is thrown by the datanode when the redirect happens:
> {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Call
>  to  failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]"}}
> Interestingly, when using ./bin/hadoop with a webhdfs path, we are able to cat 
> or tail a file successfully.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2538) option for fsck dots to be on/off

2011-11-04 Thread Kihwal Lee (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144102#comment-13144102
 ] 

Kihwal Lee commented on HDFS-2538:
--

I like the patch very much, but at the same time the part I like the most may 
need some changes. :)

- Instead of {{-annoying}}, {{-verbose}} or {{-progress}} would be nice. 
- At the same time, can we change {{%100}} to something bigger?

> option for fsck dots to be on/off 
> --
>
> Key: HDFS-2538
> URL: https://issues.apache.org/jira/browse/HDFS-2538
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Allen Wittenauer
> Attachments: HDFS-2538-branch-0.20-security-204.patch, 
> HDFS-2538-branch-0.20-security-204.patch
>
>
> this patch turns the dots during fsck off by default and provides an option 
> to turn them back on if you have a fetish for millions and millions of dots 
> on your terminal.  i haven't done any benchmarks, but i suspect fsck is now 
> 300% faster to boot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2538) option for fsck dots to be on/off

2011-11-04 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144098#comment-13144098
 ] 

Todd Lipcon commented on HDFS-2538:
---

Given that we know how many files/blocks/whatevers we have in the namespace 
before we even start the operation, maybe we can just make the dot frequency 
relative to that? E.g., print a dot for every 1% or 5%, so you just get a line 
or two but still have some indication of progress.
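
A minimal sketch of that suggestion (illustrative, not fsck's actual code): 
the dot interval is derived from the known total, capping output at roughly 
100 dots however large the namespace is.

    class ProgressDotsSketch {
      static void scan(long totalItems) {
        long interval = Math.max(1, totalItems / 100);  // one dot per ~1%
        for (long i = 0; i < totalItems; i++) {
          // ... check item i ...
          if (i % interval == 0) System.out.print('.');
        }
        System.out.println(" done");
      }

      public static void main(String[] args) {
        scan(1234567L);
      }
    }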

> option for fsck dots to be on/off 
> --
>
> Key: HDFS-2538
> URL: https://issues.apache.org/jira/browse/HDFS-2538
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Allen Wittenauer
> Attachments: HDFS-2538-branch-0.20-security-204.patch, 
> HDFS-2538-branch-0.20-security-204.patch
>
>
> this patch turns the dots during fsck off by default and provides an option 
> to turn them back on if you have a fetish for millions and millions of dots 
> on your terminal.  i haven't done any benchmarks, but i suspect fsck is now 
> 300% faster to boot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2538) option for fsck dots to be on/off

2011-11-04 Thread Allen Wittenauer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2538:
---

Attachment: HDFS-2538-branch-0.20-security-204.patch

Whoops, this included another patch. Let's try that again.

> option for fsck dots to be on/off 
> --
>
> Key: HDFS-2538
> URL: https://issues.apache.org/jira/browse/HDFS-2538
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Allen Wittenauer
> Attachments: HDFS-2538-branch-0.20-security-204.patch, 
> HDFS-2538-branch-0.20-security-204.patch
>
>
> this patch turns the dots during fsck off by default and provides an option 
> to turn them back on if you have a fetish for millions and millions of dots 
> on your terminal.  i haven't done any benchmarks, but i suspect fsck is now 
> 300% faster to boot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2538) option for fsck dots to be on/off

2011-11-04 Thread Allen Wittenauer (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-2538:
---

Attachment: HDFS-2538-branch-0.20-security-204.patch

> option for fsck dots to be on/off 
> --
>
> Key: HDFS-2538
> URL: https://issues.apache.org/jira/browse/HDFS-2538
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Allen Wittenauer
> Attachments: HDFS-2538-branch-0.20-security-204.patch
>
>
> this patch turns the dots during fsck off by default and provides an option 
> to turn them back on if you have a fetish for millions and millions of dots 
> on your terminal.  i haven't done any benchmarks, but i suspect fsck is now 
> 300% faster to boot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2538) option for fsck dots to be on/off

2011-11-04 Thread Allen Wittenauer (Created) (JIRA)
option for fsck dots to be on/off 
--

 Key: HDFS-2538
 URL: https://issues.apache.org/jira/browse/HDFS-2538
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Allen Wittenauer


this patch turns the dots during fsck off by default and provides an option to 
turn them back on if you have a fetish for millions and millions of dots on 
your terminal.  i haven't done any benchmarks, but i suspect fsck is now 300% 
faster to boot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2527) Remove the use of Range header from webhdfs

2011-11-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144001#comment-13144001
 ] 

Hudson commented on HDFS-2527:
--

Integrated in Hadoop-Mapreduce-trunk #887 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/887/])
HDFS-2527. WebHdfs: remove the use of "Range" header in Open; use ugi 
username if renewer parameter is null in GetDelegationToken; response OK when 
setting replication for non-files; rename GETFILEBLOCKLOCATIONS to 
GET_BLOCK_LOCATIONS and state that it is a private unstable API; replace 
isDirectory and isSymlink with enum {FILE, DIRECTORY, SYMLINK} in 
HdfsFileStatus JSON object.

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1197329
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ByteRangeInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/resources/DatanodeWebHdfsMethods.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/GetOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestByteRangeInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestOffsetUrlInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java
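
For context on the first item in the commit message above: after this change a 
WebHdfs OPEN carries the byte range as offset/length query parameters instead 
of an HTTP "Range" header. A minimal sketch of the resulting request, with 
host, port, and path as placeholders:

{code:java}
// Sketch of an OPEN request without the "Range" header; the byte range
// travels as offset/length query parameters. Host/port are placeholders.
import java.net.HttpURLConnection;
import java.net.URL;

class WebHdfsOpenSketch {
  static HttpURLConnection open(String file, long offset, long length)
      throws Exception {
    URL url = new URL("http://namenode:50070/webhdfs/v1" + file
        + "?op=OPEN&offset=" + offset + "&length=" + length);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setInstanceFollowRedirects(true);  // namenode redirects to a datanode
    return conn;                            // note: no setRequestProperty("Range", ...)
  }
}
{code}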


> Remove the use of Range header from webhdfs
> ---
>
> Key: HDFS-2527
> URL: https://issues.apache.org/jira/browse/HDFS-2527
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.20.205.1, 0.20.206.0, 0.24.0, 0.23.1
>
> Attachments: h2527_2001b_0.20s.patch, h2527_2002.patch, 
> h2527_2002_0.20s.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2527) Remove the use of Range header from webhdfs

2011-11-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143997#comment-13143997
 ] 

Hudson commented on HDFS-2527:
--

Integrated in Hadoop-Mapreduce-0.23-Build #80 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/80/])
svn merge -c 1197329 from trunk for HDFS-2527.

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1197335
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ByteRangeInputStream.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/resources/DatanodeWebHdfsMethods.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/GetOpParam.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestByteRangeInputStream.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestOffsetUrlInputStream.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java


> Remove the use of Range header from webhdfs
> ---
>
> Key: HDFS-2527
> URL: https://issues.apache.org/jira/browse/HDFS-2527
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.20.205.1, 0.20.206.0, 0.24.0, 0.23.1
>
> Attachments: h2527_2001b_0.20s.patch, h2527_2002.patch, 
> h2527_2002_0.20s.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2527) Remove the use of Range header from webhdfs

2011-11-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143977#comment-13143977
 ] 

Hudson commented on HDFS-2527:
--

Integrated in Hadoop-Hdfs-trunk #853 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/853/])
HDFS-2527. WebHdfs: remove the use of "Range" header in Open; use ugi 
username if renewer parameter is null in GetDelegationToken; response OK when 
setting replication for non-files; rename GETFILEBLOCKLOCATIONS to 
GET_BLOCK_LOCATIONS and state that it is a private unstable API; replace 
isDirectory and isSymlink with enum {FILE, DIRECTORY, SYMLINK} in 
HdfsFileStatus JSON object.

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1197329
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ByteRangeInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/resources/DatanodeWebHdfsMethods.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/GetOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestByteRangeInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestOffsetUrlInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java


> Remove the use of Range header from webhdfs
> ---
>
> Key: HDFS-2527
> URL: https://issues.apache.org/jira/browse/HDFS-2527
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.20.205.1, 0.20.206.0, 0.24.0, 0.23.1
>
> Attachments: h2527_2001b_0.20s.patch, h2527_2002.patch, 
> h2527_2002_0.20s.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2527) Remove the use of Range header from webhdfs

2011-11-04 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143969#comment-13143969
 ] 

Hudson commented on HDFS-2527:
--

Integrated in Hadoop-Hdfs-0.23-Build #66 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/66/])
svn merge -c 1197329 from trunk for HDFS-2527.

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1197335
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ByteRangeInputStream.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/resources/DatanodeWebHdfsMethods.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/GetOpParam.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestByteRangeInputStream.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestOffsetUrlInputStream.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java


> Remove the use of Range header from webhdfs
> ---
>
> Key: HDFS-2527
> URL: https://issues.apache.org/jira/browse/HDFS-2527
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.20.205.1, 0.20.206.0, 0.24.0, 0.23.1
>
> Attachments: h2527_2001b_0.20s.patch, h2527_2002.patch, 
> h2527_2002_0.20s.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2246) Shortcut local client reads to Datanode files directly

2011-11-04 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2246:
---

Attachment: HDFS-2246-branch-0.20-security.patch

The attached patch adds a server-side configuration option that specifies the 
user who may use short-circuit reads. This parameter should be set on the 
datanode, which will then allow getBlockLocalPathInfo only for that user.
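
A minimal sketch of that datanode-side check, assuming a configuration key 
named dfs.block.local-path-access.user; the key and the exception message are 
assumptions for illustration, not necessarily those in the patch:

{code:java}
// Sketch: only the configured user may call getBlockLocalPathInfo.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

class ShortCircuitAccessSketch {
  private final String allowedUser;

  ShortCircuitAccessSketch(Configuration conf) {
    // Assumed key naming the user permitted to use short-circuit reads.
    allowedUser = conf.get("dfs.block.local-path-access.user", "");
  }

  void checkBlockLocalPathAccess(String callerUser) throws IOException {
    if (!allowedUser.equals(callerUser)) {
      throw new IOException("getBlockLocalPathInfo is not allowed for user "
          + callerUser);
    }
  }
}
{code}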

> Shortcut local client reads to Datanode files directly
> ---
>
> Key: HDFS-2246
> URL: https://issues.apache.org/jira/browse/HDFS-2246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Sanjay Radia
> Attachments: 0001-HDFS-347.-Local-reads.patch, 
> HDFS-2246-branch-0.20-security.patch, HDFS-2246.20s.1.patch, 
> HDFS-2246.20s.2.txt, HDFS-2246.20s.3.txt, HDFS-2246.20s.4.txt, 
> HDFS-2246.20s.patch, localReadShortcut20-security.2patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2528) webhdfs rest call to a secure dn fails when a token is sent

2011-11-04 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143825#comment-13143825
 ] 

Hadoop QA commented on HDFS-2528:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12502270/h2528_2003.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.TestFileAppend2
  org.apache.hadoop.hdfs.TestBalancerBandwidth

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1533//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1533//console

This message is automatically generated.

> webhdfs rest call to a secure dn fails when a token is sent
> ---
>
> Key: HDFS-2528
> URL: https://issues.apache.org/jira/browse/HDFS-2528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2528_2001.patch, h2528_2001_0.20s.patch, 
> h2528_2001b.patch, h2528_2001b_0.20s.patch, h2528_2002.patch, 
> h2528_2002_0.20s.patch, h2528_2003.patch, h2528_2003_0.20s.patch
>
>
> curl -L -u : --negotiate -i 
> "http://NN:50070/webhdfs/v1/tmp/webhdfs_data/file_small_data.txt?op=OPEN"
> The following exception is thrown by the datanode when the redirect happens:
> {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Call
>  to  failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]"}}
> Interestingly, when using ./bin/hadoop with a webhdfs path, we are able to 
> cat or tail a file successfully.
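
One plausible reading of the stack trace is that the datanode attempts a 
Kerberos-authenticated call even though the redirected request already carries 
a delegation token, and the datanode has no TGT of its own. A hedged sketch of 
the intended dispatch; every name below is an assumption for illustration, not 
the actual fix:

{code:java}
// Illustrative only: prefer the delegation token carried in the request
// over Kerberos when the datanode talks back to the namenode.
class SecureDnWebHdfsSketch {
  interface NamenodeClient {
    NamenodeClient withDelegationToken(String token);  // assumed helper
    NamenodeClient withKerberos();                     // assumed helper
  }

  NamenodeClient connect(NamenodeClient client, String delegationParam) {
    if (delegationParam != null && !delegationParam.isEmpty()) {
      // The request already carries a token; no Kerberos TGT is needed.
      return client.withDelegationToken(delegationParam);
    }
    return client.withKerberos();  // fall back to SPNEGO/Kerberos
  }
}
{code}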

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2535) A Model for Data Durability

2011-11-04 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143792#comment-13143792
 ] 

Suresh Srinivas commented on HDFS-2535:
---

Thanks, Rob, for making the spreadsheet available :-)

> A Model for Data Durability
> ---
>
> Key: HDFS-2535
> URL: https://issues.apache.org/jira/browse/HDFS-2535
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: documentation
>Reporter: Robert Chansler
>Assignee: Robert Chansler
> Attachments: LosingBlocks.xlsx
>
>
> This is a statistical model of data durability in HDFS in the presence of 
> datanode failures. The attached spreadsheet considers the probability of 
> losing a block with three replicas in the case of uncorrelated DN failures. 
> Also included is a section that looks at the consequences of simultaneous 
> failures.
> The model parameters reflect experience at Yahoo with a large cluster, but it 
> is easy to change the parameters in the spreadsheet. The number of replicas, 
> however, is not an easily adjusted parameter. And while published reports 
> (the Google papers) suggest that node failures are not really uncorrelated, 
> the model does give some practical insight into HDFS durability.
> Others and I have quoted from this work in the past, and I thought it good to 
> make the details conveniently available.
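
The uncorrelated-failure arithmetic the description sketches is simple enough 
to show inline; the parameter values below are illustrative guesses, not the 
ones in the attached spreadsheet:

{code:java}
// Back-of-the-envelope block-loss estimate under uncorrelated DN failures.
// All parameter values are illustrative, not taken from the spreadsheet.
public class BlockLossSketch {
  public static void main(String[] args) {
    double pNodeFail = 1e-3;    // P(a given DN fails within one re-replication window)
    int replication = 3;        // replicas per block, on distinct DNs
    long blocks = 50000000L;    // blocks in the cluster

    // A block is lost only if all of its replicas fail before
    // re-replication completes; with independence this is p^r.
    double pBlockLoss = Math.pow(pNodeFail, replication);
    double expectedLost = blocks * pBlockLoss;

    System.out.printf("P(one block lost)          = %.3e%n", pBlockLoss);
    System.out.printf("Expected losses per window = %.2f%n", expectedLost);
  }
}
{code}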

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira