[jira] [Commented] (HDFS-2481) Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol

2011-10-20 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132368#comment-13132368
 ] 

Hadoop QA commented on HDFS-2481:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12500088/testFailures1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.server.namenode.TestBackupNode
  org.apache.hadoop.hdfs.TestFileCreationClient
  org.apache.hadoop.hdfs.TestSetrepIncreasing
  org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1404//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1404//console

This message is automatically generated.

> Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol
> 
>
> Key: HDFS-2481
> URL: https://issues.apache.org/jira/browse/HDFS-2481
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Sanjay Radia
> Attachments: testFailures1.patch
>
>
> A few unit tests are failing, e.g.
> {noformat}
> Test set: org.apache.hadoop.hdfs.TestDistributedFileSystem
> ---
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 16.813 sec 
> <<< FAILURE!
> testAllWithDualPort(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.254 sec  <<< ERROR!
> java.io.IOException: java.io.IOException: Unknown protocol: 
> org.apache.hadoop.hdfs.protocol.ClientProtocol
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:615)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1517)
>   ...
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-10-20 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132364#comment-13132364
 ] 

jirapos...@reviews.apache.org commented on HDFS-2476:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2515/#review2738
---

Ship it!


Looks very good. See the small nits below. Are there any non-trivial changes in 
this patch compared to what you're running in production?


trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java


This javadoc describes a linked hashset, but the implementation is just a 
normal hashset. Copy-paste error?



trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java


can hashCode and element be marked final?



trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java


formatting is a little off - should go on same line



trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java


I find this javadoc a little misleading - it implies that there's a linked 
list of elements separate from the hashtable chains. The removal order is in 
hash-bucket order rather than insertion order. Maybe something like: "Removes N 
entries from the hashtable. The order in which entries are removed is 
unspecified, and may not correspond to the order in which they were inserted."


- Todd


On 2011-10-20 22:58:45, Tomasz Nykiel wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2515/
bq.  ---
bq.  
bq.  (Updated 2011-10-20 22:58:45)
bq.  
bq.  
bq.  Review request for Hairong Kuang.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  This patch introduces two hash data structures for storing 
under-replicated, over-replicated and invalidated blocks.
bq.  
bq.  1. LightWeightHashSet
bq.  2. LightWeightLinkedSet
bq.  
bq.  Currently in all these cases we are using java.util.TreeSet which adds 
unnecessary overhead.
bq.  
bq.  The main bottlenecks addressed by this patch are:
bq.  -cluster instability times, when these queues (especially 
under-replicated) tend to grow quite drastically,
bq.  -initial cluster startup, when the queues are initialized, after leaving 
safemode,
bq.  -block reports,
bq.  -explicit acks for block addition and deletion
bq.  
bq.  1. The introduced structures are CPU-optimized.
bq.  2. They shrink and expand according to current capacity.
bq.  3. Add/contains/delete ops are performed in O(1) time (unlike the current 
O(log n) for TreeSet).
bq.  4. The sets are equipped with fast access methods for polling a number of 
elements (get+remove), which are used for handling the queues.
bq.  
bq.  
bq.  This addresses bug HDFS-2476.
bq.  https://issues.apache.org/jira/browse/HDFS-2476
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
 PRE-CREATION 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java
 PRE-CREATION 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
 1187124 
bq.
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLigh
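
The summary above describes the two sets only at a high level. As an
illustration of the technique (not the actual LightWeightHashSet code; the
class and method names below are invented for this sketch), a minimal chained
hash set with O(1) add/contains/remove, capacity-based resizing, and a bulk
pollN (get+remove) operation might look like this:

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of a chained hash set with O(1) ops and bulk polling. */
public class TinyHashSet<T> {
    private static class Entry<T> {
        final T element; final int hashCode; Entry<T> next;
        Entry(T e, int h, Entry<T> n) { element = e; hashCode = h; next = n; }
    }

    private Entry<T>[] buckets;
    private int size;

    @SuppressWarnings("unchecked")
    public TinyHashSet() { buckets = new Entry[16]; }

    private int index(int h) { return (h & 0x7fffffff) % buckets.length; }

    /** Adds e if absent; returns false for duplicates. */
    public boolean add(T e) {
        int h = e.hashCode(), i = index(h);
        for (Entry<T> c = buckets[i]; c != null; c = c.next)
            if (c.hashCode == h && c.element.equals(e)) return false;
        buckets[i] = new Entry<>(e, h, buckets[i]);
        size++;
        if (size > buckets.length * 3 / 4) resize(buckets.length * 2);
        return true;
    }

    public boolean contains(T e) {
        int h = e.hashCode();
        for (Entry<T> c = buckets[index(h)]; c != null; c = c.next)
            if (c.hashCode == h && c.element.equals(e)) return true;
        return false;
    }

    public boolean remove(T e) {
        int h = e.hashCode(), i = index(h);
        Entry<T> prev = null;
        for (Entry<T> c = buckets[i]; c != null; prev = c, c = c.next) {
            if (c.hashCode == h && c.element.equals(e)) {
                if (prev == null) buckets[i] = c.next; else prev.next = c.next;
                size--;
                return true;
            }
        }
        return false;
    }

    /** Removes and returns up to n elements, in hash-bucket (unspecified) order. */
    public List<T> pollN(int n) {
        List<T> out = new ArrayList<>();
        for (int i = 0; i < buckets.length && out.size() < n; i++) {
            while (buckets[i] != null && out.size() < n) {
                out.add(buckets[i].element);
                buckets[i] = buckets[i].next;
                size--;
            }
        }
        return out;
    }

    public int size() { return size; }

    @SuppressWarnings("unchecked")
    private void resize(int cap) {
        Entry<T>[] old = buckets;
        buckets = new Entry[cap];
        for (Entry<T> head : old) {
            for (Entry<T> c = head; c != null; ) {
                Entry<T> next = c.next;              // detach and rehash each node
                int i = index(c.hashCode);
                c.next = buckets[i];
                buckets[i] = c;
                c = next;
            }
        }
    }
}
```

Note that pollN here drains elements in hash-bucket order, which matches the
"unspecified, not insertion order" behavior discussed in the review comments
above; the linked variant would additionally thread entries on an
insertion-order list.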

[jira] [Commented] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132356#comment-13132356
 ] 

Hadoop QA commented on HDFS-2480:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12500087/HDFS-2480.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 33 javac compiler warnings (more 
than the trunk's current 31 warnings).

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.TestDistributedFileSystem
  org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool
  org.apache.hadoop.hdfs.TestAbandonBlock
  org.apache.hadoop.hdfs.server.namenode.TestBackupNode
  org.apache.hadoop.hdfs.TestRestartDFS

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1403//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1403//console

This message is automatically generated.

> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt, HDFS-2480.txt, HDFS-2480.txt
>
>
> This jira separates for NamenodeProtocol the wire types from the types used 
> by the client and server, similar to HDFS-2181.





[jira] [Updated] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data

2011-10-20 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-347:
-

Component/s: performance
 hdfs client
 data-node

> DFS read performance suboptimal when client co-located on nodes with data
> -
>
> Key: HDFS-347
> URL: https://issues.apache.org/jira/browse/HDFS-347
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, hdfs client, performance
>Reporter: George Porter
>Assignee: Todd Lipcon
> Attachments: BlockReaderLocal1.txt, HADOOP-4801.1.patch, 
> HADOOP-4801.2.patch, HADOOP-4801.3.patch, HDFS-347-branch-20-append.txt, 
> all.tsv, hdfs-347.png, hdfs-347.txt, local-reads-doc
>
>
> One of the major strategies Hadoop uses to get scalable data processing is to 
> move the code to the data.  However, putting the DFS client on the same 
> physical node as the data blocks it acts on doesn't improve read performance 
> as much as expected.
> After looking at Hadoop and O/S traces (via HADOOP-4049), I think the problem 
> is due to the HDFS streaming protocol causing many more read I/O operations 
> (iops) than necessary.  Consider the case of a DFSClient fetching a 64 MB 
> disk block from the DataNode process (running in a separate JVM) running on 
> the same machine.  The DataNode will satisfy the single disk block request by 
> sending data back to the HDFS client in 64-KB chunks.  In BlockSender.java, 
> this is done in the sendChunk() method, relying on Java's transferTo() 
> method.  Depending on the host O/S and JVM implementation, transferTo() is 
> implemented as either a sendfilev() syscall or a pair of mmap() and write().  
> In either case, each chunk is read from the disk by issuing a separate I/O 
> operation for each chunk.  The result is that the single request for a 64-MB 
> block ends up hitting the disk as over a thousand smaller requests of 64 KB 
> each.
> Since the DFSClient runs in a different JVM and process than the DataNode, 
> shuttling data from the disk to the DFSClient also results in context 
> switches each time network packets get sent (in this case, each 64-KB chunk 
> turns into a large number of 1500-byte packet send operations).  Thus we see 
> a large number of context switches for each block send operation.
> I'd like to get some feedback on the best way to address this, but my 
> suggestion is to provide a mechanism for a DFSClient to directly open data 
> blocks that happen to be on the same machine.  It could do this by examining 
> the set of 
> LocatedBlocks returned by the NameNode, marking those that should be resident 
> on the local host.  Since the DataNode and DFSClient (probably) share the 
> same hadoop configuration, the DFSClient should be able to find the files 
> holding the block data, and it could directly open them and send data back to 
> the client.  This would avoid the context switches imposed by the network 
> layer, and would allow for much larger read buffers than 64KB, which should 
> reduce the number of iops imposed by each read block operation.
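
The proposal above boils down to replacing many small chunked transfers with
direct positioned reads against the local block file. A rough sketch, assuming
the client has already resolved the block's local path (that resolution step,
and the class and method names, are hypothetical here, not the patch's code):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Sketch of a short-circuit local read: one large positioned read against
 * the block file, instead of streaming 64-KB chunks through the DataNode.
 */
public class LocalBlockReader {
    /** Reads up to buf.remaining() bytes of the block file starting at offset. */
    public static int readLocal(Path blockFile, long offset, ByteBuffer buf)
            throws IOException {
        try (FileChannel ch = FileChannel.open(blockFile, StandardOpenOption.READ)) {
            int total = 0;
            while (buf.hasRemaining()) {
                // Positioned read: one syscall per call, no seek, no copy
                // through a second process, no network packetization.
                int n = ch.read(buf, offset + total);
                if (n < 0) break;   // end of file
                total += n;
            }
            return total;
        }
    }
}
```

Because the read buffer can be arbitrarily large, a 64-MB block needs a
handful of I/O operations rather than the thousand-plus 64-KB requests
described above.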





[jira] [Updated] (HDFS-918) Use single Selector and small thread pool to replace many instances of BlockSender for reads

2011-10-20 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-918:
-

Component/s: performance

> Use single Selector and small thread pool to replace many instances of 
> BlockSender for reads
> 
>
> Key: HDFS-918
> URL: https://issues.apache.org/jira/browse/HDFS-918
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Reporter: Jay Booth
>Assignee: Jay Booth
> Attachments: hbase-hdfs-benchmarks.ods, hdfs-918-20100201.patch, 
> hdfs-918-20100203.patch, hdfs-918-20100211.patch, hdfs-918-20100228.patch, 
> hdfs-918-20100309.patch, hdfs-918-TRUNK.patch, 
> hdfs-918-branch20-append.patch, hdfs-918-branch20.2.patch, 
> hdfs-918-pool.patch, hdfs-multiplex.patch
>
>
> Currently, on read requests, the DataXCeiver server allocates a new thread 
> per request, which must allocate its own buffers and leads to 
> higher-than-optimal CPU and memory usage by the sending threads.  If we had a 
> single selector and a small threadpool to multiplex request packets, we could 
> theoretically achieve higher performance while taking up fewer resources and 
> leaving more CPU on datanodes available for mapred, hbase or whatever.  This 
> can be done without changing any wire protocols.
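
The single-selector design described above can be sketched in a few lines:
one Selector watches all readable channels, so a single dispatch thread (plus
a small worker pool, omitted here) services every connection instead of one
thread per request. This is an illustration of the approach, with invented
names, not the DataXceiver server's actual code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

/** Minimal sketch of a one-selector read dispatch loop. */
public class SelectorSketch {
    /** Blocks until some registered channel is readable, then drains the events. */
    public static int serviceOnce(Selector selector, ByteBuffer buf) throws IOException {
        selector.select();                       // wait for readiness on any channel
        int bytesRead = 0;
        for (SelectionKey key : selector.selectedKeys()) {
            if (key.isReadable()) {
                // A real server would hand the ready channel to the thread
                // pool here rather than reading inline.
                bytesRead += ((ReadableByteChannel) key.channel()).read(buf);
            }
        }
        selector.selectedKeys().clear();         // mark the events as handled
        return bytesRead;
    }
}
```

The server would register each client's SocketChannel with the shared
selector; the sketch can be exercised in-process with a selectable Pipe.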





[jira] [Updated] (HDFS-1323) Pool/share file channels for HDFS read

2011-10-20 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1323:
--

Component/s: performance

> Pool/share file channels for HDFS read
> --
>
> Key: HDFS-1323
> URL: https://issues.apache.org/jira/browse/HDFS-1323
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Reporter: Jay Booth
> Attachments: hdfs-1323-20100730.patch, hdfs-1323-trunk.txt
>
>
> Currently, all reads in HDFS require opening and closing the underlying 
> block/meta filechannels.  We could pool these filechannels and save some 
> system calls and other work.  Since HDFS read requests can be satisfied by 
> positioned reads and transferTos, we can even share these filechannels 
> between concurrently executing requests.
> The attached patch was benchmarked as part of work on HDFS-918 and exhibited 
> a 10% performance increase for small random reads.
> This does not affect client logic and involves minimal change to server 
> logic.  Patch is based on branch 20-append. 
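
The pooling idea rests on the point made above: positioned reads and
transferTo do not move a FileChannel's own position, so one read-only channel
per block file can safely be shared by concurrent requests instead of being
opened and closed per read. A minimal sketch (the class name and the
keep-forever cache policy are placeholders, not the patch's implementation):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Sketch of a shared pool of read-only file channels, keyed by path. */
public class ChannelPool {
    private final ConcurrentMap<Path, FileChannel> pool = new ConcurrentHashMap<>();

    /** Returns the shared channel for a file, opening it on first use. */
    public FileChannel get(Path file) {
        return pool.computeIfAbsent(file, p -> {
            try {
                return FileChannel.open(p, StandardOpenOption.READ);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Closes every pooled channel; a real pool would also evict idle entries. */
    public void closeAll() throws IOException {
        for (FileChannel ch : pool.values()) ch.close();
        pool.clear();
    }
}
```

Callers must use only the positioned read(buf, pos) / transferTo variants on
the shared channel; the stateful read(buf) would race between requests.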





[jira] [Updated] (HDFS-2080) Speed up DFS read path by lessening checksum overhead

2011-10-20 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2080:
--

Component/s: performance

> Speed up DFS read path by lessening checksum overhead
> -
>
> Key: HDFS-2080
> URL: https://issues.apache.org/jira/browse/HDFS-2080
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client, performance
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.23.0
>
> Attachments: hdfs-2080.txt, hdfs-2080.txt
>
>
> I've developed a series of patches that speeds up the HDFS read path by a 
> factor of about 2.5x (~300M/sec to ~800M/sec for localhost reading from 
> buffer cache) and also will make it easier to allow for advanced users (eg 
> hbase) to skip a buffer copy. 





[jira] [Updated] (HDFS-941) Datanode xceiver protocol should allow reuse of a connection

2011-10-20 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-941:
-

Component/s: performance

> Datanode xceiver protocol should allow reuse of a connection
> 
>
> Key: HDFS-941
> URL: https://issues.apache.org/jira/browse/HDFS-941
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, hdfs client, performance
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: bc Wong
> Fix For: 0.22.0
>
> Attachments: 941.22.txt, 941.22.txt, 941.22.v2.txt, 941.22.v3.txt, 
> HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch, 
> HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.22.patch, HDFS-941-6.patch, 
> HDFS-941-6.patch, HDFS-941-6.patch, fix-close-delta.txt, hdfs-941.txt, 
> hdfs-941.txt, hdfs-941.txt, hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one 
> operation.
> In the case that an operation leaves the stream in a well-defined state (eg a 
> client reads to the end of a block successfully) the same connection could be 
> reused for a second operation. This should improve random read performance 
> significantly.





[jira] [Updated] (HDFS-1148) Convert FSDataset to ReadWriteLock

2011-10-20 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1148:
--

Component/s: performance

> Convert FSDataset to ReadWriteLock
> --
>
> Key: HDFS-1148
> URL: https://issues.apache.org/jira/browse/HDFS-1148
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Reporter: Todd Lipcon
>Assignee: Dave Thompson
> Attachments: hdfs-1148-old.txt, hdfs-1148-trunk.txt, 
> patch-HDFS-1148-rel0.20.2.txt
>
>
> In benchmarking HDFS-941 I noticed that for the random read workload, the 
> FSDataset lock is highly contended. After converting it to a 
> ReentrantReadWriteLock, I saw a ~25% improvement on both latency and 
> ops/second.
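
The ~25% gain described above comes from letting concurrent readers proceed
in parallel while writers keep exclusive access. A minimal sketch of the
conversion, with a plain map standing in for FSDataset (the class and method
names here are illustrative, not the actual FSDataset API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch: replace a single monitor with a ReentrantReadWriteLock. */
public class ReadMostlyDataset {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<Long, String> blockToFile = new HashMap<>();

    /** Hot read path: many readers may hold the read lock at once. */
    public String getBlockFile(long blockId) {
        lock.readLock().lock();
        try {
            return blockToFile.get(blockId);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Rarer write path: exclusive, as the old synchronized monitor was. */
    public void addBlock(long blockId, String file) {
        lock.writeLock().lock();
        try {
            blockToFile.put(blockId, file);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Under a random-read workload the lookups dominate, so read-lock sharing
removes most of the contention a single monitor would impose.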





[jira] [Updated] (HDFS-2129) Simplify BlockReader to not inherit from FSInputChecker

2011-10-20 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2129:
--

Component/s: performance

> Simplify BlockReader to not inherit from FSInputChecker
> ---
>
> Key: HDFS-2129
> URL: https://issues.apache.org/jira/browse/HDFS-2129
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs client, performance
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.23.0
>
> Attachments: hdfs-2129-benchmark.png, hdfs-2129.txt, hdfs-2129.txt, 
> hdfs-2129.txt, seq-read-1gb-bench.png
>
>
> BlockReader is currently quite complicated since it has to conform to the 
> FSInputChecker inheritance structure. It would be much simpler to implement 
> it standalone. Benchmarking indicates it's slightly faster, as well.





[jira] [Commented] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132322#comment-13132322
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2316:
--

> Here is the WebHdfs API.

I mean the attached file WebHdfsAPI20111020.pdf.


> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --
>
> Key: HDFS-2316
> URL: https://issues.apache.org/jira/browse/HDFS-2316
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: WebHdfsAPI20111020.pdf
>
>
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a 
> read-only FileSystem and does not provide "write" access.
> In HDFS-2284, we propose to have webhdfs for providing a complete FileSystem 
> implementation for accessing HDFS over HTTP.  This is the umbrella JIRA for 
> the tasks.





[jira] [Commented] (HDFS-2178) Contributing Hoop to HDFS, replacement for HDFS proxy with read/write capabilities

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132321#comment-13132321
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2178:
--

> attaching HTML version of the API not to truncate the URLs

I posted the WebHdfs API in HDFS-2316.

> Contributing Hoop to HDFS, replacement for HDFS proxy with read/write 
> capabilities
> --
>
> Key: HDFS-2178
> URL: https://issues.apache.org/jira/browse/HDFS-2178
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.0
>
> Attachments: HDFS-2178.patch, HDFSoverHTTP-API.html, HdfsHttpAPI.pdf
>
>
> We'd like to contribute Hoop to Hadoop HDFS as a replacement (an improvement) 
> for HDFS Proxy.
> Hoop provides access to all Hadoop Distributed File System (HDFS) operations 
> (read and write) over HTTP/S.
> The Hoop server component is a REST HTTP gateway to HDFS supporting all file 
> system operations. It can be accessed using standard HTTP tools (i.e. curl 
> and wget), HTTP libraries from different programming languages (e.g. Perl, 
> JavaScript) as well as using the Hoop client. The Hoop server component is a 
> standard Java web-application and it has been implemented using Jersey 
> (JAX-RS).
> The Hoop client component is an implementation of Hadoop FileSystem client 
> that allows using the familiar Hadoop filesystem API to access HDFS data 
> through a Hoop server.
>   Repo: https://github.com/cloudera/hoop
>   Docs: http://cloudera.github.com/hoop
>   Blog: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
> Hoop is a Maven based project that depends on Hadoop HDFS and Alfredo (for 
> Kerberos HTTP SPNEGO authentication). 
> To make the integration easy, HDFS Mavenization (HDFS-2096) would have to be 
> done first, as well as the Alfredo contribution (HADOOP-7119).





[jira] [Updated] (HDFS-2316) webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-2316:
-

Attachment: WebHdfsAPI20111020.pdf

Here is the WebHdfs API.

> webhdfs: a complete FileSystem implementation for accessing HDFS over HTTP
> --
>
> Key: HDFS-2316
> URL: https://issues.apache.org/jira/browse/HDFS-2316
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: WebHdfsAPI20111020.pdf
>
>
> We currently have hftp for accessing HDFS over HTTP.  However, hftp is a 
> read-only FileSystem and does not provide "write" access.
> In HDFS-2284, we propose to have webhdfs for providing a complete FileSystem 
> implementation for accessing HDFS over HTTP.  This is the umbrella JIRA for 
> the tasks.





[jira] [Updated] (HDFS-2481) Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol

2011-10-20 Thread Sanjay Radia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-2481:
---

Status: Patch Available  (was: Open)

> Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol
> 
>
> Key: HDFS-2481
> URL: https://issues.apache.org/jira/browse/HDFS-2481
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Sanjay Radia
> Attachments: testFailures1.patch
>
>
> A few unit tests are failing, e.g.
> {noformat}
> Test set: org.apache.hadoop.hdfs.TestDistributedFileSystem
> ---
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 16.813 sec 
> <<< FAILURE!
> testAllWithDualPort(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.254 sec  <<< ERROR!
> java.io.IOException: java.io.IOException: Unknown protocol: 
> org.apache.hadoop.hdfs.protocol.ClientProtocol
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:615)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1517)
>   ...
> {noformat}





[jira] [Updated] (HDFS-2481) Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol

2011-10-20 Thread Sanjay Radia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-2481:
---

Attachment: testFailures1.patch

> Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol
> 
>
> Key: HDFS-2481
> URL: https://issues.apache.org/jira/browse/HDFS-2481
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Sanjay Radia
> Attachments: testFailures1.patch
>
>
> A few unit tests are failing, e.g.
> {noformat}
> Test set: org.apache.hadoop.hdfs.TestDistributedFileSystem
> ---
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 16.813 sec 
> <<< FAILURE!
> testAllWithDualPort(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.254 sec  <<< ERROR!
> java.io.IOException: java.io.IOException: Unknown protocol: 
> org.apache.hadoop.hdfs.protocol.ClientProtocol
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:615)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1517)
>   ...
> {noformat}





[jira] [Updated] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Suresh Srinivas (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-2480:
--

Attachment: HDFS-2480.txt

bq. In NamenodeWireProtocol, do we need the constants (e.g. ACT_SHUTDOWN)?
Removed

bq. We probably should have utility methods for createNamenodeWithRetry(..) and 
createNamenode(..).
bq. Remove NamenodeProtocolTranslatorR23.getProxyWithoutRetry()
I will clean this up when the NamenodeProtocol translators are used, in a 
separate patch. The utility methods are also required for other patches; I 
will address them in another issue.

> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt, HDFS-2480.txt, HDFS-2480.txt
>
>
> This jira separates for NamenodeProtocol the wire types from the types used 
> by the client and server, similar to HDFS-2181.





[jira] [Resolved] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Konstantin Boudnik (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik resolved HDFS-2452.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

I have committed it to branch-0.22. Thanks Uma.

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch_with-around_.patch, 
> HDFS-2452-22branch_with-around_.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Konstantin Shvachko (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132290#comment-13132290
 ] 

Konstantin Shvachko commented on HDFS-2452:
---

I ran the test several times. It is working fine now.
I also ran the test target, which passed.
+1 on the patch. Thanks Cos and Uma for working on this.

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch_with-around_.patch, 
> HDFS-2452-22branch_with-around_.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2427) webhdfs mkdirs api call creates path with 777 permission, we should default it to 755

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-2427:
-

   Resolution: Fixed
Fix Version/s: 0.24.0
   0.20.206.0
   0.20.205.1
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I have committed this.
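
For context, the permission arithmetic behind the new default can be sketched as follows (illustrative class name, and assuming the usual default umask of 022): a 777 request masked by 022 yields 755.

```java
/**
 * Sketch of the permission arithmetic behind the fix (illustrative class
 * name): a requested mode masked with the default umask 022 yields 755.
 */
public class UmaskSketch {
    /** Clears the bits set in the umask from the requested mode. */
    static int apply(int requested, int umask) {
        return requested & ~umask;
    }
}
```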

> webhdfs mkdirs api call creates path with 777 permission, we should default 
> it to 755
> -
>
> Key: HDFS-2427
> URL: https://issues.apache.org/jira/browse/HDFS-2427
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.20.205.1, 0.20.206.0, 0.24.0
>
> Attachments: h2427_20111019.patch, h2427_20111019b.patch, 
> h2427_20111020.patch, h2427_20111020_svn_mv.patch, 
> h2427_20111020_svn_mv_0.20s.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2427) webhdfs mkdirs api call creates path with 777 permission, we should default it to 755

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-2427:
-

Attachment: h2427_20111020_svn_mv.patch
h2427_20111020_svn_mv_0.20s.patch


Patches using "svn mv".

> webhdfs mkdirs api call creates path with 777 permission, we should default 
> it to 755
> -
>
> Key: HDFS-2427
> URL: https://issues.apache.org/jira/browse/HDFS-2427
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2427_20111019.patch, h2427_20111019b.patch, 
> h2427_20111020.patch, h2427_20111020_svn_mv.patch, 
> h2427_20111020_svn_mv_0.20s.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132187#comment-13132187
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2480:
--

+1 patch looks good.

Some comments (you may work on them for the next patch):

- In NamenodeWireProtocol, do we need the constants (e.g. ACT_SHUTDOWN)?

- CheckpointSignatureWritable.setBlockpoolID(..), 
BlockWithLocationsWritable.getBlocks(), and all getters and setters in 
XxxWritable should be removed.

- We probably should have utility methods for createNamenodeWithRetry(..) and 
createNamenode(..).

- Remove NamenodeProtocolTranslatorR23.getProxyWithoutRetry()

- In NamenodeProtocolServerSideTranslatorR23, the error messages "Datanode 
Serverside ..." should be "Namenode Serverside ...".
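
The items above all concern the R23 translator layout. The separation itself can be sketched like this (simplified, illustrative types, not the patch source): the wire-side Writable stays stable across releases and exposes conversions rather than getters and setters, apart from the type the client and server code uses.

```java
/**
 * Simplified sketch (illustrative types, not the patch source) of the wire
 * type / server type separation: the Writable stays wire-stable and exposes
 * conversions rather than getters and setters.
 */
public class TranslatorSketch {
    /** Type used by client and server code; free to evolve. */
    public static class CheckpointSignature {
        public final String blockpoolID;
        public CheckpointSignature(String blockpoolID) {
            this.blockpoolID = blockpoolID;
        }
    }

    /** Wire-side type, kept stable across releases. */
    public static class CheckpointSignatureWritable {
        private final String blockpoolID;

        private CheckpointSignatureWritable(String blockpoolID) {
            this.blockpoolID = blockpoolID;
        }

        /** Conversion in, replacing setters on the Writable. */
        public static CheckpointSignatureWritable convertFrom(CheckpointSignature s) {
            return new CheckpointSignatureWritable(s.blockpoolID);
        }

        /** Conversion out, replacing getters on the Writable. */
        public CheckpointSignature convert() {
            return new CheckpointSignature(blockpoolID);
        }
    }
}
```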


> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt, HDFS-2480.txt
>
>
> This jira separates, for NamenodeProtocol, the wire types from the types used 
> by the client and server, similar to HDFS-2181.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-10-20 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132168#comment-13132168
 ] 

jirapos...@reviews.apache.org commented on HDFS-2476:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2515/
---

(Updated 2011-10-20 22:58:45.114557)


Review request for Hairong Kuang.


Summary (updated)
---

This patch introduces two hash data structures for storing under-replicated, 
over-replicated and invalidated blocks.

1. LightWeightHashSet
2. LightWeightLinkedSet

Currently, in all these cases, we use java.util.TreeSet, which adds 
unnecessary overhead.

The main bottlenecks addressed by this patch are:
-cluster instability times, when these queues (especially under-replicated) 
tend to grow quite drastically,
-initial cluster startup, when the queues are initialized, after leaving 
safemode,
-block reports,
-explicit acks for block addition and deletion

1. The introduced structures are CPU-optimized.
2. They shrink and expand according to current capacity.
3. Add/contains/delete ops are performed in O(1) time (unlike the current 
O(log n) for TreeSet).
4. The sets are equipped with fast access methods for polling a number of 
elements (get+remove), which are used for handling the queues.
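
A minimal sketch of the idea (not the actual LightWeightHashSet source; resizing and other details omitted): chained buckets give expected O(1) add/contains/remove, plus a bulk pollN(n) that gets and removes elements in one pass, as used for draining the queues.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch (not the actual HDFS implementation) of a hash set with
 * expected O(1) add/contains/remove and a pollN(n) bulk get+remove.
 * Resizing is omitted for brevity; names are illustrative.
 */
public class LightWeightHashSetSketch<T> {
    private static class Entry<T> {
        final T value;
        Entry<T> next;
        Entry(T value, Entry<T> next) { this.value = value; this.next = next; }
    }

    private final Entry<T>[] buckets;
    private int size;

    @SuppressWarnings("unchecked")
    public LightWeightHashSetSketch() {
        buckets = (Entry<T>[]) new Entry[16];
    }

    private int index(T v) {
        return (v.hashCode() & 0x7fffffff) % buckets.length;
    }

    public boolean add(T v) {
        if (contains(v)) return false;
        int i = index(v);
        buckets[i] = new Entry<>(v, buckets[i]);  // push onto bucket chain
        size++;
        return true;
    }

    public boolean contains(T v) {
        for (Entry<T> e = buckets[index(v)]; e != null; e = e.next) {
            if (e.value.equals(v)) return true;
        }
        return false;
    }

    public boolean remove(T v) {
        int i = index(v);
        Entry<T> prev = null;
        for (Entry<T> e = buckets[i]; e != null; prev = e, e = e.next) {
            if (e.value.equals(v)) {
                if (prev == null) buckets[i] = e.next; else prev.next = e.next;
                size--;
                return true;
            }
        }
        return false;
    }

    /** Bulk get+remove of up to n elements, used for handling the queues. */
    public List<T> pollN(int n) {
        List<T> out = new ArrayList<>();
        for (int i = 0; i < buckets.length && out.size() < n; i++) {
            while (buckets[i] != null && out.size() < n) {
                out.add(buckets[i].value);
                buckets[i] = buckets[i].next;
                size--;
            }
        }
        return out;
    }

    public int size() { return size; }
}
```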


This addresses bug HDFS-2476.
https://issues.apache.org/jira/browse/HDFS-2476


Diffs
-

  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
 PRE-CREATION 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java
 PRE-CREATION 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightHashSet.java
 PRE-CREATION 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightLinkedSet.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2515/diff


Testing
---

Provided JUnit tests.


Thanks,

Tomasz



> More CPU efficient data structure for 
> under-replicated/over-replicated/invalidate blocks
> 
>
> Key: HDFS-2476
> URL: https://issues.apache.org/jira/browse/HDFS-2476
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: hashStructures.patch, hashStructures.patch-2
>
>
> This patch introduces two hash data structures for storing under-replicated, 
> over-replicated and invalidated blocks. 
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we are using java.util.TreeSet which adds 
> unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> -cluster instability times, when these queues (especially under-replicated) 
> tend to grow quite drastically,
> -initial cluster startup, when the queues are initialized, after leaving 
> safemode,
> -block reports,
> -explicit acks for block addition and deletion
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike current log n 
> for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of 
> elements (get+remove), which are used for handling the queues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apac

[jira] [Updated] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

2011-10-20 Thread Tomasz Nykiel (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Nykiel updated HDFS-2477:


Attachment: reportDiff.patch-2

> Optimize computing the diff between a block report and the namenode state.
> --
>
> Key: HDFS-2477
> URL: https://issues.apache.org/jira/browse/HDFS-2477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: reportDiff.patch, reportDiff.patch-2
>
>
> When a block report is processed at the NN, the BlockManager.reportDiff 
> traverses all blocks contained in the report, and for each one block, which 
> is also present in the corresponding datanode descriptor, the block is moved 
> to the head of the list of the blocks in this datanode descriptor.
> With HDFS-395 the huge majority of the blocks in the report, are also present 
> in the datanode descriptor, which means that almost every block in the report 
> will have to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
> which removes a block from a list and then inserts it. In this process, we 
> call findDatanode several times (afair 6 times for each moveBlockToHead 
> call). findDatanode is relatively expensive, since it linearly goes through 
> the triplets to locate the given datanode.
> With this patch, we do some memoization of findDatanode, so we can reclaim 2 
> findDatanode calls. Our experiments show that this can improve the reportDiff 
> (which is executed under write lock) by around 15%. Currently with HDFS-395, 
> reportDiff is responsible for almost 100% of the block report processing time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

2011-10-20 Thread Tomasz Nykiel (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Nykiel updated HDFS-2477:


Attachment: (was: hashStructures.patch-2)

> Optimize computing the diff between a block report and the namenode state.
> --
>
> Key: HDFS-2477
> URL: https://issues.apache.org/jira/browse/HDFS-2477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: reportDiff.patch, reportDiff.patch-2
>
>
> When a block report is processed at the NN, the BlockManager.reportDiff 
> traverses all blocks contained in the report, and for each one block, which 
> is also present in the corresponding datanode descriptor, the block is moved 
> to the head of the list of the blocks in this datanode descriptor.
> With HDFS-395 the huge majority of the blocks in the report, are also present 
> in the datanode descriptor, which means that almost every block in the report 
> will have to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
> which removes a block from a list and then inserts it. In this process, we 
> call findDatanode several times (afair 6 times for each moveBlockToHead 
> call). findDatanode is relatively expensive, since it linearly goes through 
> the triplets to locate the given datanode.
> With this patch, we do some memoization of findDatanode, so we can reclaim 2 
> findDatanode calls. Our experiments show that this can improve the reportDiff 
> (which is executed under write lock) by around 15%. Currently with HDFS-395, 
> reportDiff is responsible for almost 100% of the block report processing time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

2011-10-20 Thread Tomasz Nykiel (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Nykiel updated HDFS-2477:


Attachment: hashStructures.patch-2

> Optimize computing the diff between a block report and the namenode state.
> --
>
> Key: HDFS-2477
> URL: https://issues.apache.org/jira/browse/HDFS-2477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: hashStructures.patch-2, reportDiff.patch
>
>
> When a block report is processed at the NN, the BlockManager.reportDiff 
> traverses all blocks contained in the report, and for each one block, which 
> is also present in the corresponding datanode descriptor, the block is moved 
> to the head of the list of the blocks in this datanode descriptor.
> With HDFS-395 the huge majority of the blocks in the report, are also present 
> in the datanode descriptor, which means that almost every block in the report 
> will have to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
> which removes a block from a list and then inserts it. In this process, we 
> call findDatanode several times (afair 6 times for each moveBlockToHead 
> call). findDatanode is relatively expensive, since it linearly goes through 
> the triplets to locate the given datanode.
> With this patch, we do some memoization of findDatanode, so we can reclaim 2 
> findDatanode calls. Our experiments show that this can improve the reportDiff 
> (which is executed under write lock) by around 15%. Currently with HDFS-395, 
> reportDiff is responsible for almost 100% of the block report processing time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

2011-10-20 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132164#comment-13132164
 ] 

jirapos...@reviews.apache.org commented on HDFS-2477:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2516/
---

Review request for Hairong Kuang.


Summary
---

When a block report is processed at the NN, BlockManager.reportDiff traverses 
all blocks contained in the report and, for each block that is also present in 
the corresponding datanode descriptor, moves the block to the head of the list 
of blocks in that datanode descriptor.

With HDFS-395, the huge majority of the blocks in the report are also present 
in the datanode descriptor, which means that almost every block in the report 
has to be moved to the head of the list.

Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
which removes a block from a list and then inserts it. In this process, we call 
findDatanode several times (AFAIR 6 times for each moveBlockToHead call). 
findDatanode is relatively expensive, since it goes linearly through the 
triplets to locate the given datanode.

With this patch, we memoize the result of findDatanode, reclaiming 2 
findDatanode calls per move. Our experiments show that this can improve 
reportDiff (which is executed under the write lock) by around 15%. Currently, 
with HDFS-395, reportDiff is responsible for almost 100% of the block report 
processing time.
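
The saving can be illustrated with a toy sketch (illustrative names and counters, not HDFS source): compute the datanode's index once per report and pass it into the move, instead of re-running the linear scan for every block.

```java
/**
 * Toy sketch (illustrative names, not HDFS source) of memoizing the result
 * of a linear findDatanode scan so per-block moves do not repeat it.
 */
public class ReportDiffSketch {
    static final String[] datanodes = {"dn1", "dn2", "dn3"};

    static int findCalls = 0;  // counts the expensive scans

    /** Linear scan over the datanodes; the expensive call the patch reclaims. */
    static int findDatanode(String dn) {
        findCalls++;
        for (int i = 0; i < datanodes.length; i++) {
            if (datanodes[i].equals(dn)) return i;
        }
        return -1;
    }

    /** Unmemoized: every per-block move re-runs the scan. */
    static void moveBlockToHead(String dn) {
        int idx = findDatanode(dn);  // one scan per block
        moveBlockToHeadAt(idx);
    }

    /** Memoized: scan once per report, reuse the index for every block. */
    static void reportDiff(String dn, int blocksInReport) {
        int idx = findDatanode(dn);  // single scan
        for (int b = 0; b < blocksInReport; b++) {
            moveBlockToHeadAt(idx);  // reuses the memoized index
        }
    }

    static void moveBlockToHeadAt(int datanodeIndex) {
        // ... unlink the block and relink it at the head, using the
        // precomputed index instead of searching again ...
    }
}
```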


This addresses bug HDFS-2477.
https://issues.apache.org/jira/browse/HDFS-2477


Diffs
-

  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java
 1187125 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
 1187125 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
 1187125 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfo.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2516/diff


Testing
---

Additional JUnit tests.


Thanks,

Tomasz



> Optimize computing the diff between a block report and the namenode state.
> --
>
> Key: HDFS-2477
> URL: https://issues.apache.org/jira/browse/HDFS-2477
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: hashStructures.patch-2, reportDiff.patch
>
>
> When a block report is processed at the NN, the BlockManager.reportDiff 
> traverses all blocks contained in the report, and for each one block, which 
> is also present in the corresponding datanode descriptor, the block is moved 
> to the head of the list of the blocks in this datanode descriptor.
> With HDFS-395 the huge majority of the blocks in the report, are also present 
> in the datanode descriptor, which means that almost every block in the report 
> will have to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, 
> which removes a block from a list and then inserts it. In this process, we 
> call findDatanode several times (afair 6 times for each moveBlockToHead 
> call). findDatanode is relatively expensive, since it linearly goes through 
> the triplets to locate the given datanode.
> With this patch, we do some memoization of findDatanode, so we can reclaim 2 
> findDatanode calls. Our experiments show that this can improve the reportDiff 
> (which is executed under write lock) by around 15%. Currently with HDFS-395, 
> reportDiff is responsible for almost 100% of the block report processing time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-10-20 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132161#comment-13132161
 ] 

jirapos...@reviews.apache.org commented on HDFS-2476:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2515/
---

Review request for Hairong Kuang.


Summary
---

This patch introduces two hash data structures for storing under-replicated, 
over-replicated and invalidated blocks.

1. LightWeightHashSet
2. LightWeightLinkedSet

Currently, in all these cases, we use java.util.TreeSet, which adds 
unnecessary overhead.

The main bottlenecks addressed by this patch are:
-cluster instability times, when these queues (especially under-replicated) 
tend to grow quite drastically,
-initial cluster startup, when the queues are initialized, after leaving 
safemode,
-block reports,
-explicit acks for block addition and deletion

1. The introduced structures are CPU-optimized.
2. They shrink and expand according to current capacity.
3. Add/contains/delete ops are performed in O(1) time (unlike the current 
O(log n) for TreeSet).
4. The sets are equipped with fast access methods for polling a number of 
elements (get+remove), which are used for handling the queues.


This addresses bug HDFS-2476.
https://issues.apache.org/jira/browse/HDFS-2476


Diffs
-

  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
 PRE-CREATION 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java
 PRE-CREATION 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
 1187124 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightHashSet.java
 PRE-CREATION 
  
trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightLinkedSet.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/2515/diff


Testing
---

Provided JUnit tests.


Thanks,

Tomasz



> More CPU efficient data structure for 
> under-replicated/over-replicated/invalidate blocks
> 
>
> Key: HDFS-2476
> URL: https://issues.apache.org/jira/browse/HDFS-2476
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: hashStructures.patch, hashStructures.patch-2
>
>
> This patch introduces two hash data structures for storing under-replicated, 
> over-replicated and invalidated blocks. 
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we are using java.util.TreeSet which adds 
> unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> -cluster instability times, when these queues (especially under-replicated) 
> tend to grow quite drastically,
> -initial cluster startup, when the queues are initialized, after leaving 
> safemode,
> -block reports,
> -explicit acks for block addition and deletion
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike current log n 
> for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of 
> elements (get+remove), which are used for handling the queues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.

[jira] [Updated] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-10-20 Thread Tomasz Nykiel (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomasz Nykiel updated HDFS-2476:


Attachment: hashStructures.patch-2

> More CPU efficient data structure for 
> under-replicated/over-replicated/invalidate blocks
> 
>
> Key: HDFS-2476
> URL: https://issues.apache.org/jira/browse/HDFS-2476
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: hashStructures.patch, hashStructures.patch-2
>
>
> This patch introduces two hash data structures for storing under-replicated, 
> over-replicated and invalidated blocks. 
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we are using java.util.TreeSet which adds 
> unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> -cluster instability times, when these queues (especially under-replicated) 
> tend to grow quite drastically,
> -initial cluster startup, when the queues are initialized, after leaving 
> safemode,
> -block reports,
> -explicit acks for block addition and deletion
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike current log n 
> for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of 
> elements (get+remove), which are used for handling the queues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HDFS-1266) Missing license headers in branch-20-append

2011-10-20 Thread Todd Lipcon (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-1266.
---

Resolution: Invalid

branch-0.20-append is abandoned. The license headers in 0.20-security should be OK.

> Missing license headers in branch-20-append
> ---
>
> Key: HDFS-1266
> URL: https://issues.apache.org/jira/browse/HDFS-1266
> Project: Hadoop HDFS
>  Issue Type: Task
>Affects Versions: 0.20-append
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Trivial
> Fix For: 0.20-append
>
>
> We appear to have some files without license headers; we should do a quick 
> pass through and fix them.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2427) webhdfs mkdirs api call creates path with 777 permission, we should default it to 755

2011-10-20 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132145#comment-13132145
 ] 

Suresh Srinivas commented on HDFS-2427:
---

+1 for the patch.

> webhdfs mkdirs api call creates path with 777 permission, we should default 
> it to 755
> -
>
> Key: HDFS-2427
> URL: https://issues.apache.org/jira/browse/HDFS-2427
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2427_20111019.patch, h2427_20111019b.patch, 
> h2427_20111020.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132142#comment-13132142
 ] 

Suresh Srinivas commented on HDFS-2480:
---

Note: I have not changed the client and server to use the translators. Will do 
that in a different patch.

> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt, HDFS-2480.txt
>
>
> This jira separates, for NamenodeProtocol, the wire types from the types used 
> by the client and server, similar to HDFS-2181.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132141#comment-13132141
 ] 

Suresh Srinivas commented on HDFS-2480:
---

The two javac warnings are due to calls to two deprecated methods. I am not sure 
why the javac warnings still come up even after adding @SuppressWarnings.
[WARNING] 
/Users/suresh/Documents/workspace/hadoop.committer/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/NamenodeProtocolServerSideTranslatorR23.java:[122,37]
 [deprecation] rollEditLog() in 
org.apache.hadoop.hdfs.protocolR23Compatible.NamenodeWireProtocol has been 
deprecated
[WARNING] 
/Users/suresh/Documents/workspace/hadoop.committer/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/NamenodeProtocolTranslatorR23.java:[144,29]
 [deprecation] rollEditLog() in 
org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol has been deprecated
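
For what it's worth, a minimal standalone sketch (illustrative names, not the Hadoop classes) of where `@SuppressWarnings("deprecation")` has to sit: javac suppresses the warning only when the annotation is on the declaration that encloses the deprecated call.

```java
// Illustrative sketch (not Hadoop code): @SuppressWarnings("deprecation")
// silences the javac warning only when it annotates the declaration that
// encloses the deprecated call site.
class LegacyApi {
    @Deprecated
    static String rollEditLog() {   // stand-in for the deprecated method
        return "rolled";
    }
}

public class SuppressDemo {
    // Annotating the calling method: javac emits no deprecation warning
    // for the call inside this body.
    @SuppressWarnings("deprecation")
    static String quietCall() {
        return LegacyApi.rollEditLog();
    }

    public static void main(String[] args) {
        System.out.println(quietCall());   // prints "rolled"
    }
}
```

If the warning persists in the build, one possibility (an assumption, not confirmed from the build logs) is that the annotation ended up on a different element than the one that actually contains the call, e.g. the enclosing class rather than the translator method.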


> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt, HDFS-2480.txt
>
>
> This jira separates for NamenodeProtocol the wire types from the types used 
> by the client and server, similar to HDFS-2181.





[jira] [Commented] (HDFS-1975) HA: Support for sharing the namenode state from active to standby.

2011-10-20 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132137#comment-13132137
 ] 

Todd Lipcon commented on HDFS-1975:
---

Cool, I'll look forward to your next revision. Let me know if I can help in any 
way.

Another thing I was considering while reading your patch is that it would be 
nice if the messages went through the same code path regardless of whether the 
NN is in standby or active mode. That way we have fewer code paths to debug. 
Does that seem feasible?

> HA: Support for sharing the namenode state from active to standby.
> --
>
> Key: HDFS-1975
> URL: https://issues.apache.org/jira/browse/HDFS-1975
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Suresh Srinivas
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1975-HA.2.patch, HDFS-1975-HA.patch, hdfs-1975.txt, 
> hdfs-1975.txt
>
>
> To enable hot standby namenode, the standby node must have current 
> information for - namenode state (image + edits) and block location 
> information. This jira addresses keeping the namenode state current in the 
> standby node. To do this, the proposed solution in this jira is to use a 
> shared storage to store the namenode state. 
> Note one could also build an alternative solution by augmenting the backup 
> node. A separate jira could explore this.





[jira] [Commented] (HDFS-1975) HA: Support for sharing the namenode state from active to standby.

2011-10-20 Thread Jitendra Nath Pandey (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132131#comment-13132131
 ] 

Jitendra Nath Pandey commented on HDFS-1975:


Thanks for the early review, Todd. 

The patch is still in the works. To reduce the amount of memory required to store 
the pending messages, I am considering the following two approaches.

  1) Instead of storing the entire block report, store only those blocks that 
have a newer gs. This will reduce the memory required to store pending messages.

  2) Allow reading segments from the middle, but only in the following two cases:
   1) The segment is finalized.
   2) The segment is in progress and a threshold of time has passed, just to 
avoid opening the in-progress file too frequently.
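
As a rough illustration of approach 1), here is a sketch (all names hypothetical, not the actual Hadoop data structures) of filtering a block report down to only the blocks whose generation stamp (gs) is newer than what the standby already knows:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of approach 1): keep only the blocks from a report
// whose generation stamp (gs) is newer than the one the standby has applied.
public class PendingReportFilter {
    static final class Block {
        final long id;
        final long gs;
        Block(long id, long gs) { this.id = id; this.gs = gs; }
    }

    // gs currently known to the standby, keyed by block id
    private final Map<Long, Long> knownGs = new HashMap<>();

    /** Record the gs of a block the standby has already applied. */
    void markApplied(Block b) {
        knownGs.put(b.id, b.gs);
    }

    /** Keep only blocks that are unknown or carry a newer gs. */
    List<Block> pendingOnly(List<Block> fullReport) {
        List<Block> pending = new ArrayList<>();
        for (Block b : fullReport) {
            Long known = knownGs.get(b.id);
            if (known == null || b.gs > known) {
                pending.add(b);
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        PendingReportFilter f = new PendingReportFilter();
        f.markApplied(new Block(1L, 5L));
        List<Block> pending = f.pendingOnly(List.of(
                new Block(1L, 4L),   // stale gs: dropped
                new Block(1L, 6L),   // newer gs: kept
                new Block(2L, 1L))); // unknown block: kept
        System.out.println(pending.size()); // 2
    }
}
```

The memory saved is proportional to the fraction of blocks in a report whose gs is unchanged, which for a mostly quiescent cluster is nearly all of them.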


> HA: Support for sharing the namenode state from active to standby.
> --
>
> Key: HDFS-1975
> URL: https://issues.apache.org/jira/browse/HDFS-1975
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Suresh Srinivas
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1975-HA.2.patch, HDFS-1975-HA.patch, hdfs-1975.txt, 
> hdfs-1975.txt
>
>
> To enable hot standby namenode, the standby node must have current 
> information for - namenode state (image + edits) and block location 
> information. This jira addresses keeping the namenode state current in the 
> standby node. To do this, the proposed solution in this jira is to use a 
> shared storage to store the namenode state. 
> Note one could also build an alternative solution by augmenting the backup 
> node. A separate jira could explore this.





[jira] [Updated] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Konstantin Boudnik (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-2452:
-

Attachment: HDFS-2452-22branch_with-around_.patch

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch_with-around_.patch, 
> HDFS-2452-22branch_with-around_.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Updated] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Konstantin Boudnik (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-2452:
-

Attachment: HDFS-2452-22branch_with-around_.patch

Uma, we can throw an exception instead of calling the run() method. In this case 
the before(...): advice has to be replaced with a void around(...): one (see the 
new patch).

Or, doing this before invoking run() means that it should be happening in 
DataXceiverServer, right? Then the patch needs to be slightly different and we 
need to instrument DataXceiverServer instead of DataXceiver. Sorry, I am 
confused.

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch_with-around_.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Commented] (HDFS-2178) Contributing Hoop to HDFS, replacement for HDFS proxy with read/write capabilities

2011-10-20 Thread Sanjay Radia (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132111#comment-13132111
 ] 

Sanjay Radia commented on HDFS-2178:


* 100 Continue issue
** I like Arpit's suggestion since it does not leak implementation limitations 
into the API; this is almost the same as getCreateHandle and getAppendHandle 
operations. That is, simply submit the "put" without the data and use the result 
as the handle. Alejandro, is this acceptable?
* I agree that we want to keep the proxy and webhdfs APIs the same or 
almost the same - but if an operation does not make sense for the proxy, or if 
an operation does not make sense for webhdfs, we should allow such differences. 
For example, will the proxy ever redirect? Does getDelegationToken make sense?
* API - the webhdfs API has been derived from the original Hoop API with 
changes made based on feedback over the last month. I would like to use the 
webhdfs API as currently in trunk and 205 as the starting point. Nicholas 
will post a document that describes the current implementation.
* Code sharing - I agree - right now let's get the APIs to match, and then over 
time we can move to a shared implementation.
* Pure proxy vs hdfs proxy. I agree that the hdfs proxy has merit beyond a pure 
proxy (e.g. proxy for S3, authentication mapping, etc.). Would it make sense to 
simply forward the webhdfs operations directly, as is, to webhdfs inside HDFS? 
This makes code sharing even easier. 

> Contributing Hoop to HDFS, replacement for HDFS proxy with read/write 
> capabilities
> --
>
> Key: HDFS-2178
> URL: https://issues.apache.org/jira/browse/HDFS-2178
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.0
>
> Attachments: HDFS-2178.patch, HDFSoverHTTP-API.html, HdfsHttpAPI.pdf
>
>
> We'd like to contribute Hoop to Hadoop HDFS as a replacement (an improvement) 
> for HDFS Proxy.
> Hoop provides access to all Hadoop Distributed File System (HDFS) operations 
> (read and write) over HTTP/S.
> The Hoop server component is a REST HTTP gateway to HDFS supporting all file 
> system operations. It can be accessed using standard HTTP tools (i.e. curl 
> and wget), HTTP libraries from different programming languages (i.e. Perl, 
> JavaScript) as well as using the Hoop client. The Hoop server component is a 
> standard Java web-application and it has been implemented using Jersey 
> (JAX-RS).
> The Hoop client component is an implementation of Hadoop FileSystem client 
> that allows using the familiar Hadoop filesystem API to access HDFS data 
> through a Hoop server.
>   Repo: https://github.com/cloudera/hoop
>   Docs: http://cloudera.github.com/hoop
>   Blog: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
> Hoop is a Maven based project that depends on Hadoop HDFS and Alfredo (for 
> Kerberos HTTP SPNEGO authentication). 
> To make the integration easy, HDFS Mavenization (HDFS-2096) would have to be 
> done first, as well as the Alfredo contribution (HADOOP-7119).





[jira] [Updated] (HDFS-2459) Separate datatypes for Journal protocol

2011-10-20 Thread Suresh Srinivas (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-2459:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I committed the patch.

> Separate datatypes for Journal protocol
> ---
>
> Key: HDFS-2459
> URL: https://issues.apache.org/jira/browse/HDFS-2459
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2459.txt, HDFS-2459.txt, HDFS-2459.txt
>
>
> This jira separates for JournalProtocol the wire types from the types used by 
> the client and server, similar to HDFS-2181.





[jira] [Commented] (HDFS-2427) webhdfs mkdirs api call creates path with 777 permission, we should default it to 755

2011-10-20 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132087#comment-13132087
 ] 

Hadoop QA commented on HDFS-2427:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12499919/h2427_20111020.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.TestDistributedFileSystem
  org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool
  org.apache.hadoop.hdfs.TestAbandonBlock
  org.apache.hadoop.hdfs.server.namenode.TestBackupNode
  org.apache.hadoop.hdfs.TestRestartDFS

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1400//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1400//console

This message is automatically generated.

> webhdfs mkdirs api call creates path with 777 permission, we should default 
> it to 755
> -
>
> Key: HDFS-2427
> URL: https://issues.apache.org/jira/browse/HDFS-2427
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2427_20111019.patch, h2427_20111019b.patch, 
> h2427_20111020.patch
>
>






[jira] [Commented] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132085#comment-13132085
 ] 

Hadoop QA commented on HDFS-2480:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12499906/HDFS-2480.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 33 javac compiler warnings (more 
than the trunk's current 31 warnings).

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.server.namenode.TestBackupNode
  org.apache.hadoop.hdfs.TestFileAppend2
  org.apache.hadoop.hdfs.TestBalancerBandwidth
  org.apache.hadoop.hdfs.TestRestartDFS
  org.apache.hadoop.hdfs.TestDistributedFileSystem
  org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1401//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1401//console

This message is automatically generated.

> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt, HDFS-2480.txt
>
>
> This jira separates for NamenodeProtocol the wire types from the types used 
> by the client and server, similar to HDFS-2181.





[jira] [Commented] (HDFS-2479) HDFS Client Data Types in Protocol Buffers

2011-10-20 Thread Sanjay Radia (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132082#comment-13132082
 ] 

Sanjay Radia commented on HDFS-2479:


Another option is to create a new type for HdfsLocatedFileStatus as
{code}
HdfsLocatedFileStatusProto {
  HdfsFileStatusProto fs = 1;
  LocatedBlocksProto locations = 2;
}
{code}
  

> HDFS Client Data Types in Protocol Buffers
> --
>
> Key: HDFS-2479
> URL: https://issues.apache.org/jira/browse/HDFS-2479
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
> Attachments: hdfs.proto, pbClientTypes1.patch
>
>






[jira] [Updated] (HDFS-2481) Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-2481:
-

Assignee: Sanjay Radia  (was: Suresh Srinivas)

> Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol
> 
>
> Key: HDFS-2481
> URL: https://issues.apache.org/jira/browse/HDFS-2481
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Sanjay Radia
>
> A few unit tests are failing, e.g.
> {noformat}
> Test set: org.apache.hadoop.hdfs.TestDistributedFileSystem
> ---
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 16.813 sec 
> <<< FAILURE!
> testAllWithDualPort(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.254 sec  <<< ERROR!
> java.io.IOException: java.io.IOException: Unknown protocol: 
> org.apache.hadoop.hdfs.protocol.ClientProtocol
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:615)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1517)
>   ...
> {noformat}





[jira] [Commented] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132067#comment-13132067
 ] 

Uma Maheswara Rao G commented on HDFS-2452:
---

Cos, do we have the option in AspectJ to throw the exception before invoking 
run()?
Here the expectation is that the injected exception should be propagated to the 
DataXceiverServer try-catch block.
That will not happen in the current case.

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Commented] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132059#comment-13132059
 ] 

Uma Maheswara Rao G commented on HDFS-2452:
---

I think the test won't work as expected because an exception thrown from run() 
will not get propagated to the parent thread; the child threads will die silently.
I am not sure we can throw the exception before actually spawning the thread.

I feel the simple option would be to throw from the DataXceiver constructor.


Thanks
Uma
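
The point above can be demonstrated in isolation (plain Java, no AspectJ; names illustrative, not Hadoop's): an error thrown inside a child thread's run() never reaches the parent's try-catch, whereas an error thrown on the parent thread before the child is spawned, e.g. from a constructor, is catchable.

```java
// Standalone illustration (no AspectJ, no Hadoop classes): why a fault
// injected inside run() cannot be caught by the spawning thread, while a
// fault thrown before spawning (e.g. from a constructor) can.
public class ThreadFaultDemo {

    static String faultInsideRun() {
        Thread child = new Thread(() -> {
            // Dies silently from the parent's point of view; only the
            // uncaught-exception handler ever sees this.
            throw new OutOfMemoryError("injected in run()");
        });
        try {
            child.start();
            child.join();                 // returns normally
            return "parent saw nothing";
        } catch (OutOfMemoryError e) {
            return "caught: " + e.getMessage();
        } catch (InterruptedException e) {
            return "interrupted";
        }
    }

    static String faultBeforeSpawn() {
        try {
            // Stands in for throwing from the DataXceiver constructor,
            // i.e. on the parent thread, before any child thread exists.
            throw new OutOfMemoryError("injected before spawn");
        } catch (OutOfMemoryError e) {
            return "caught: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(faultInsideRun());    // parent saw nothing
        System.out.println(faultBeforeSpawn());  // caught: injected before spawn
    }
}
```

A Thread.UncaughtExceptionHandler could observe the first fault, but it still runs on the child thread, so it does not help a try-catch in the spawning code.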

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Updated] (HDFS-2483) [HDFS-RAID] DistributedRaidFileSystem cannot handle token delegation

2011-10-20 Thread Andrew Purtell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HDFS-2483:
-

Attachment: HDFS-RAID-delegate.patch

The patch is from a private development branch, so it may not apply, but it 
should serve to illustrate a possible approach.

> [HDFS-RAID] DistributedRaidFileSystem cannot handle token delegation
> 
>
> Key: HDFS-2483
> URL: https://issues.apache.org/jira/browse/HDFS-2483
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Priority: Minor
> Attachments: HDFS-RAID-delegate.patch
>
>
> DistributedRaidFileSystem cannot handle token delegation, so it is impossible 
> to specify it as fs.hdfs.impl in a secure configuration.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2483) [HDFS-RAID] DistributedRaidFileSystem cannot handle token delegation

2011-10-20 Thread Andrew Purtell (Created) (JIRA)
[HDFS-RAID] DistributedRaidFileSystem cannot handle token delegation


 Key: HDFS-2483
 URL: https://issues.apache.org/jira/browse/HDFS-2483
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Andrew Purtell
Priority: Minor


DistributedRaidFileSystem cannot handle token delegation, so it is impossible 
to specify it as fs.hdfs.impl in a secure configuration.





[jira] [Updated] (HDFS-2482) [HDFS-RAID] ExtFSInputStream#read wrapper does not preserve semantics

2011-10-20 Thread Andrew Purtell (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HDFS-2482:
-

Attachment: HDFS-RAID-fix-ExtFSInputStream.patch

The patch may not apply, since it's from a private distribution, but it does 
illustrate a possible fix.

> [HDFS-RAID] ExtFSInputStream#read wrapper does not preserve semantics
> -
>
> Key: HDFS-2482
> URL: https://issues.apache.org/jira/browse/HDFS-2482
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Priority: Minor
> Attachments: HDFS-RAID-fix-ExtFSInputStream.patch
>
>
> The ExtFSInputStream#read wrapper has signed byte issues. No need to use a 
> local byte buffer either, IMO.





[jira] [Created] (HDFS-2482) [HDFS-RAID] ExtFSInputStream#read wrapper does not preserve semantics

2011-10-20 Thread Andrew Purtell (Created) (JIRA)
[HDFS-RAID] ExtFSInputStream#read wrapper does not preserve semantics
-

 Key: HDFS-2482
 URL: https://issues.apache.org/jira/browse/HDFS-2482
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Andrew Purtell
Priority: Minor


The ExtFSInputStream#read wrapper has signed byte issues. No need to use a 
local byte buffer either, IMO.





[jira] [Commented] (HDFS-1971) HA: Send block report from datanode to both active and standby namenodes

2011-10-20 Thread Sanjay Radia (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132028#comment-13132028
 ] 

Sanjay Radia commented on HDFS-1971:


Sorry. Will do. I am making some other changes and will post an updated patch 
shortly.

> HA: Send block report from datanode to both active and standby namenodes
> 
>
> Key: HDFS-1971
> URL: https://issues.apache.org/jira/browse/HDFS-1971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, name-node
>Reporter: Suresh Srinivas
>Assignee: Sanjay Radia
> Attachments: DualBlockReports.pdf, daulBr1.patch
>
>
> To enable hot standby namenode, the standby node must have current 
> information for - namenode state (image + edits) and block location 
> information. This jira addresses keeping the block location information 
> current in the standby node. To do this, the proposed solution is to send 
> block reports from the datanodes to both the active and the standby namenode.





[jira] [Updated] (HDFS-2479) HDFS Client Data Types in Protocol Buffers

2011-10-20 Thread Sanjay Radia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-2479:
---

Attachment: hdfs.proto

For convenience, here is the .proto file (note this is not a new file; I have 
simply added some new types).

> HDFS Client Data Types in Protocol Buffers
> --
>
> Key: HDFS-2479
> URL: https://issues.apache.org/jira/browse/HDFS-2479
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
> Attachments: hdfs.proto, pbClientTypes1.patch
>
>






[jira] [Commented] (HDFS-2178) Contributing Hoop to HDFS, replacement for HDFS proxy with read/write capabilities

2011-10-20 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132011#comment-13132011
 ] 

Hadoop QA commented on HDFS-2178:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12499940/HDFS-2178.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 107 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1402//console

This message is automatically generated.

> Contributing Hoop to HDFS, replacement for HDFS proxy with read/write 
> capabilities
> --
>
> Key: HDFS-2178
> URL: https://issues.apache.org/jira/browse/HDFS-2178
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.0
>
> Attachments: HDFS-2178.patch, HDFSoverHTTP-API.html, HdfsHttpAPI.pdf
>
>
> We'd like to contribute Hoop to Hadoop HDFS as a replacement (an improvement) 
> for HDFS Proxy.
> Hoop provides access to all Hadoop Distributed File System (HDFS) operations 
> (read and write) over HTTP/S.
> The Hoop server component is a REST HTTP gateway to HDFS supporting all file 
> system operations. It can be accessed using standard HTTP tools (i.e. curl 
> and wget), HTTP libraries from different programming languages (i.e. Perl, 
> JavaScript) as well as using the Hoop client. The Hoop server component is a 
> standard Java web-application and it has been implemented using Jersey 
> (JAX-RS).
> The Hoop client component is an implementation of Hadoop FileSystem client 
> that allows using the familiar Hadoop filesystem API to access HDFS data 
> through a Hoop server.
>   Repo: https://github.com/cloudera/hoop
>   Docs: http://cloudera.github.com/hoop
>   Blog: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
> Hoop is a Maven based project that depends on Hadoop HDFS and Alfredo (for 
> Kerberos HTTP SPNEGO authentication). 
> To make the integration easy, HDFS Mavenization (HDFS-2096) would have to be 
> done first, as well as the Alfredo contribution (HADOOP-7119).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2479) HDFS Client Data Types in Protocol Buffers

2011-10-20 Thread Sanjay Radia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-2479:
---

Attachment: pbClientTypes1.patch

Early patch of pb types. Note:
* I have made some of the fields of HdfsFileStatus optional, since they only 
apply to files. Thoughts?
* Rather than define a new HdfsLocatedFileStatus (as done with the writables), 
I made locations optional. Thoughts?

> HDFS Client Data Types in Protocol Buffers
> --
>
> Key: HDFS-2479
> URL: https://issues.apache.org/jira/browse/HDFS-2479
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
> Attachments: pbClientTypes1.patch
>
>






[jira] [Updated] (HDFS-2178) Contributing Hoop to HDFS, replacement for HDFS proxy with read/write capabilities

2011-10-20 Thread Alejandro Abdelnur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HDFS-2178:
-

Status: Patch Available  (was: Open)

> Contributing Hoop to HDFS, replacement for HDFS proxy with read/write 
> capabilities
> --
>
> Key: HDFS-2178
> URL: https://issues.apache.org/jira/browse/HDFS-2178
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.0
>
> Attachments: HDFS-2178.patch, HDFSoverHTTP-API.html, HdfsHttpAPI.pdf
>
>
> We'd like to contribute Hoop to Hadoop HDFS as a replacement (an improvement) 
> for HDFS Proxy.
> Hoop provides access to all Hadoop Distributed File System (HDFS) operations 
> (read and write) over HTTP/S.
> The Hoop server component is a REST HTTP gateway to HDFS supporting all file 
> system operations. It can be accessed using standard HTTP tools (e.g. curl 
> and wget), HTTP libraries from different programming languages (e.g. Perl, 
> JavaScript) as well as using the Hoop client. The Hoop server component is a 
> standard Java web-application and it has been implemented using Jersey 
> (JAX-RS).
> The Hoop client component is an implementation of Hadoop FileSystem client 
> that allows using the familiar Hadoop filesystem API to access HDFS data 
> through a Hoop server.
>   Repo: https://github.com/cloudera/hoop
>   Docs: http://cloudera.github.com/hoop
>   Blog: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
> Hoop is a Maven based project that depends on Hadoop HDFS and Alfredo (for 
> Kerberos HTTP SPNEGO authentication). 
> To make the integration easy, HDFS Mavenization (HDFS-2096) would have to be 
> done first, as well as the Alfredo contribution (HADOOP-7119).





[jira] [Updated] (HDFS-2178) Contributing Hoop to HDFS, replacement for HDFS proxy with read/write capabilities

2011-10-20 Thread Alejandro Abdelnur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HDFS-2178:
-

Attachment: HDFS-2178.patch

Patch adding Hoop to the Hadoop HDFS project.

* It uses the HTTP REST API previously posted in this JIRA (it uses a handle 
request for create and append to support redirection).
* It is a new Maven module, hadoop-hdfs-httpfs.
* It adds an HttpFileSystem client to hadoop-common, as well as its 
registration in core-site.xml.
* It runs as a separate service, httpfs.
* It is integrated into the final tar following the new layout convention.
* It uses a Tomcat server to run the httpfs web application (Tomcat is bundled 
and preconfigured with it).
* It has documentation and is fully javadoc'd.
* It supports pseudo and Kerberos authentication.


> Contributing Hoop to HDFS, replacement for HDFS proxy with read/write 
> capabilities
> --
>
> Key: HDFS-2178
> URL: https://issues.apache.org/jira/browse/HDFS-2178
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.0
>
> Attachments: HDFS-2178.patch, HDFSoverHTTP-API.html, HdfsHttpAPI.pdf
>
>
> We'd like to contribute Hoop to Hadoop HDFS as a replacement (an improvement) 
> for HDFS Proxy.
> Hoop provides access to all Hadoop Distributed File System (HDFS) operations 
> (read and write) over HTTP/S.
> The Hoop server component is a REST HTTP gateway to HDFS supporting all file 
> system operations. It can be accessed using standard HTTP tools (e.g. curl 
> and wget), HTTP libraries from different programming languages (e.g. Perl, 
> JavaScript) as well as using the Hoop client. The Hoop server component is a 
> standard Java web-application and it has been implemented using Jersey 
> (JAX-RS).
> The Hoop client component is an implementation of Hadoop FileSystem client 
> that allows using the familiar Hadoop filesystem API to access HDFS data 
> through a Hoop server.
>   Repo: https://github.com/cloudera/hoop
>   Docs: http://cloudera.github.com/hoop
>   Blog: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
> Hoop is a Maven based project that depends on Hadoop HDFS and Alfredo (for 
> Kerberos HTTP SPNEGO authentication). 
> To make the integration easy, HDFS Mavenization (HDFS-2096) would have to be 
> done first, as well as the Alfredo contribution (HADOOP-7119).





[jira] [Commented] (HDFS-1975) HA: Support for sharing the namenode state from active to standby.

2011-10-20 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131983#comment-13131983
 ] 

Todd Lipcon commented on HDFS-1975:
---

- can we move EditLogTailer to the ha package?
- should probably have some brief javadoc there. Also, if it is a public class, 
it needs InterfaceAudience/Stability annotations
- why do we sleep 60 seconds between tails? I think keeping the standby as up 
to date as possible is important. Though it's not an immediate goal of today's 
project, we should keep in mind the secondary goal of serving stale reads from 
the standby.
- using the terminology "sharedEditDirs" implies that we only support 
directories here. Instead, shouldn't we call it "sharedEditUris"? Same with the 
configs, etc.
- the code in {{stopReadingEditLogs}} seems really race-prone. We need better 
inter-thread coordination than just sleeping.
- The name {{waitForGenStamp}} implies that it waits for something, but in fact 
this is just {{isGenStampInFuture}}
- need license on PendingDataNodeMessages
- need javadoc in lots of spots - explain the purposes of the new class, etc
- getMaxGsInBlockList could be moved to a static method in BlockListAsLongs

- DataNodeMessage and subclasses: make the fields final

- needs unit tests - there are some in my earlier patch for this issue that 
could be used to verify EditLogTailer, I think.

I want to also do some thinking on synchronization for the datanode messages, 
etc. Will write more later.
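
The coordination concern above (a sleep-based {{stopReadingEditLogs}}) can be 
sketched with plain JDK primitives: stop the tailing loop via a flag plus an 
interrupt and then join, rather than sleeping and hoping the loop has noticed. 
This is an illustrative sketch, not the actual EditLogTailer; the class and 
method names are invented, and the 60-second interval is just the value 
questioned in the review.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch (not HDFS code): a polling thread whose shutdown is
// coordinated with interrupt + join instead of sleep-and-hope.
class TailerSketch {
    private final Thread worker;
    private volatile boolean running = true;
    private final CountDownLatch started = new CountDownLatch(1);

    TailerSketch(Runnable pollOnce, long sleepMs) {
        worker = new Thread(() -> {
            started.countDown();
            while (running) {
                pollOnce.run();                // e.g. tail new edit log segments
                try {
                    Thread.sleep(sleepMs);     // poll interval (60s in the patch under review)
                } catch (InterruptedException e) {
                    // interrupt is the shutdown signal; loop re-checks 'running'
                }
            }
        }, "tailer");
    }

    void start() throws InterruptedException {
        worker.start();
        started.await();                       // don't return until the loop is live
    }

    void stop() throws InterruptedException {
        running = false;
        worker.interrupt();                    // wake the sleep immediately
        worker.join(TimeUnit.SECONDS.toMillis(10));
    }

    boolean isStopped() {
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        TailerSketch t = new TailerSketch(() -> {}, 60_000);
        t.start();
        t.stop();
        System.out.println("stopped cleanly: " + t.isStopped());
    }
}
```

The interrupt makes shutdown immediate even with a long poll interval, which 
is the kind of inter-thread coordination the review comment is asking for.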


> HA: Support for sharing the namenode state from active to standby.
> --
>
> Key: HDFS-1975
> URL: https://issues.apache.org/jira/browse/HDFS-1975
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Suresh Srinivas
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1975-HA.2.patch, HDFS-1975-HA.patch, hdfs-1975.txt, 
> hdfs-1975.txt
>
>
> To enable hot standby namenode, the standby node must have current 
> information for - namenode state (image + edits) and block location 
> information. This jira addresses keeping the namenode state current in the 
> standby node. To do this, the proposed solution in this jira is to use a 
> shared storage to store the namenode state. 
> Note one could also build an alternative solution by augmenting the backup 
> node. A seperate jira could explore this.





[jira] [Commented] (HDFS-2481) Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol

2011-10-20 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131980#comment-13131980
 ] 

Todd Lipcon commented on HDFS-2481:
---

A bunch of the non-mavenized MR tests seem to be failing like this too, e.g.:

Testcase: testAuditLoggerWithIP took 0.535 sec
Caused an ERROR
java.io.IOException: Unknown protocol: 
org.apache.hadoop.ipc.TestRPC$TestProtocol


> Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol
> 
>
> Key: HDFS-2481
> URL: https://issues.apache.org/jira/browse/HDFS-2481
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Suresh Srinivas
>
> A few unit tests are failing, e.g.
> {noformat}
> Test set: org.apache.hadoop.hdfs.TestDistributedFileSystem
> ---
> Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 16.813 sec 
> <<< FAILURE!
> testAllWithDualPort(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
> elapsed: 0.254 sec  <<< ERROR!
> java.io.IOException: java.io.IOException: Unknown protocol: 
> org.apache.hadoop.hdfs.protocol.ClientProtocol
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:615)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1517)
>   ...
> {noformat}





[jira] [Commented] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Konstantin Boudnik (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131979#comment-13131979
 ] 

Konstantin Boudnik commented on HDFS-2452:
--

And yes - patch looks good to me overall.

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Commented] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Konstantin Boudnik (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131976#comment-13131976
 ] 

Konstantin Boudnik commented on HDFS-2452:
--

Thanks for fixing the aspect code, Uma - I was doing it in haste last night 
and apparently did a suboptimal job.
Aspects are injected by the 'injectfaults' target, which is called by the test 
target. You don't need anything else to make this happen: all components of 
the AOP framework are in place by default. 

I have applied your patch and run the test, and I see that the injected code 
is working as expected. I see many OOM errors thrown from a huge number of 
DataXceiver threads, like this:
{noformat}
[junit] Exception in thread 
"org.apache.hadoop.hdfs.server.datanode.DataXceiver@7e3feb83" 
java.lang.OutOfMemoryError: Pretend there's no more memory
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataXceiverAspects.ajc$before$org_apache_hadoop_hdfs_server_datanode_DataXceiverAspects$1$1a38ea(DataXceiverAspects.aj:36)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:120)
[junit] at java.lang.Thread.run(Thread.java:680)
[junit] Exception in thread 
"org.apache.hadoop.hdfs.server.datanode.DataXceiver@29b18dc0" 
java.lang.OutOfMemoryError: Pretend there's no more memory
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataXceiverAspects.ajc$before$org_apache_hadoop_hdfs_server_datanode_DataXceiverAspects$1$1a38ea(DataXceiverAspects.aj:36)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:120)
[junit] at java.lang.Thread.run(Thread.java:680)
{noformat}
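
The failure mode being injected here - an OutOfMemoryError escaping from 
thread creation and killing the whole daemon - and the defensive shape of the 
fix can be sketched as follows. This is a hypothetical illustration, not the 
HDFS-2452 patch itself; the class name, the ThreadFactory indirection (used 
only so the OOM path can be exercised), and the zero back-off are invented.

```java
import java.util.concurrent.ThreadFactory;

// Hypothetical sketch (not HDFS code): a per-connection thread spawner that
// catches OutOfMemoryError instead of letting it take down the daemon.
class XceiverServerSketch {
    private final ThreadFactory factory;
    private final long backoffMs;

    XceiverServerSketch(ThreadFactory factory, long backoffMs) {
        this.factory = factory;
        this.backoffMs = backoffMs;
    }

    /** @return true if a handler thread was started; false if OOM was caught. */
    boolean spawn(Runnable handler) {
        try {
            factory.newThread(handler).start();  // may throw OutOfMemoryError
            return true;
        } catch (OutOfMemoryError oom) {
            // Log and back off; the accept loop survives to try again later.
            System.err.println("OOM spawning xceiver, backing off: " + oom.getMessage());
            try {
                Thread.sleep(backoffMs);
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulate the injected fault from the aspect output above.
        XceiverServerSketch s = new XceiverServerSketch(
            r -> { throw new OutOfMemoryError("Pretend there's no more memory"); }, 0);
        System.out.println("survived OOM: " + !s.spawn(() -> {}));
    }
}
```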

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Commented] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Konstantin Boudnik (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131978#comment-13131978
 ] 

Konstantin Boudnik commented on HDFS-2452:
--

I have run the test twice: it passed once and crashed the second time.

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Updated] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Suresh Srinivas (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-2480:
--

Status: Patch Available  (was: Open)

> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt, HDFS-2480.txt
>
>
> This jira separates for NamenodeProtocol the wire types from the types used 
> by the client and server, similar to HDFS-2181.





[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-10-20 Thread Doug Cutting (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131970#comment-13131970
 ] 

Doug Cutting commented on HDFS-2298:


I believe the plan is to merge this to 0.23 after HADOOP-7524 and then 
HADOOP-7693 have been merged there.  If those issues are not merged to 0.23 
then the best option would be simply to remove this test, as was done in 
HDFS-2383.  Sanjay said he intended to merge HADOOP-7524 to 0.23, which gates this.

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Aaron T. Myers
>Assignee: Doug Cutting
> Fix For: 0.24.0
>
> Attachments: HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch, 
> HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}





[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-10-20 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131969#comment-13131969
 ] 

Uma Maheswara Rao G commented on HDFS-2298:
---

0.23 also has the same failures. I think we should merge it to 0.23 as well.

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Aaron T. Myers
>Assignee: Doug Cutting
> Fix For: 0.24.0
>
> Attachments: HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch, 
> HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}





[jira] [Created] (HDFS-2481) Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Created) (JIRA)
Unknown protocol: org.apache.hadoop.hdfs.protocol.ClientProtocol


 Key: HDFS-2481
 URL: https://issues.apache.org/jira/browse/HDFS-2481
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Suresh Srinivas


A few unit tests are failing, e.g.
{noformat}
Test set: org.apache.hadoop.hdfs.TestDistributedFileSystem
---
Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 16.813 sec <<< 
FAILURE!
testAllWithDualPort(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time 
elapsed: 0.254 sec  <<< ERROR!
java.io.IOException: java.io.IOException: Unknown protocol: 
org.apache.hadoop.hdfs.protocol.ClientProtocol
at 
org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:615)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1517)
...
{noformat}






[jira] [Commented] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131964#comment-13131964
 ] 

Uma Maheswara Rao G commented on HDFS-2452:
---

Here is a patch with a CountDownLatch.

Other changes from the previous patch:
 
1) Fixed a null pointer caused by not mocking getConf() on the DN; added the 
mock in the test. 
 
2) The Apache license header in the DataXceiverAspects file was not a header 
comment:
{code}
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
{code}
Updated it.

3) Changed System.setProperty("fi.enabledOOM", null) to 
System.setProperty("fi.enabledOOM", "false"), because null cannot be set in 
properties. Also changed the check in the AspectJ file to compare against 
"true".


To be frank, I am not familiar with AspectJ programs, so I am requesting Cos 
to review my changes once more. Also, when I ran the tests with your command, 
I did not see any aspects injected into DataXceiver.class and the test did 
not run; I am not sure whether my build has problems with AspectJ. Since this 
is urgent for some of the load tests Konstantin is running, I updated the 
patch here.

Konstantin, can you check the patch and give the tests a run?


Thanks,
Uma
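
The role of the CountDownLatch in the updated test can be illustrated with a 
self-contained sketch: the test thread blocks until every (simulated) injected 
OOM has actually been observed, instead of sleeping for a fixed time and 
racing the worker threads. All names and counts here are invented; this is 
not code from the patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of latch-based test coordination for injected faults.
class LatchDemo {
    static boolean runDemo(int xceivers) throws InterruptedException {
        CountDownLatch oomSeen = new CountDownLatch(xceivers);
        for (int i = 0; i < xceivers; i++) {
            new Thread(() -> {
                try {
                    // Stand-in for the fault the aspect injects.
                    throw new OutOfMemoryError("Pretend there's no more memory");
                } catch (OutOfMemoryError expected) {
                    oomSeen.countDown();       // record the injected fault
                }
            }).start();
        }
        // Timeout only as a safety net; normally all counts drop immediately.
        return oomSeen.await(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("all injected OOMs observed: " + runDemo(5));
    }
}
```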

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Commented] (HDFS-2298) TestDfsOverAvroRpc is failing on trunk

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131957#comment-13131957
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2298:
--

Should this be committed to 0.23?

> TestDfsOverAvroRpc is failing on trunk
> --
>
> Key: HDFS-2298
> URL: https://issues.apache.org/jira/browse/HDFS-2298
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Aaron T. Myers
>Assignee: Doug Cutting
> Fix For: 0.24.0
>
> Attachments: HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch, 
> HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch, HDFS-2298.patch
>
>
> The relevant bit of the error:
> {noformat}
> ---
> Test set: org.apache.hadoop.hdfs.TestDfsOverAvroRpc
> ---
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.486 sec <<< 
> FAILURE!
> testWorkingDirectory(org.apache.hadoop.hdfs.TestDfsOverAvroRpc)  Time 
> elapsed: 1.424 sec  <<< ERROR!
> org.apache.avro.AvroTypeException: Two methods with same name: delete
> {noformat}





[jira] [Updated] (HDFS-2427) webhdfs mkdirs api call creates path with 777 permission, we should default it to 755

2011-10-20 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-2427:
-

Attachment: h2427_20111020.patch

h2427_20111020.patch: fixed the bugs and added some new tests

> webhdfs mkdirs api call creates path with 777 permission, we should default 
> it to 755
> -
>
> Key: HDFS-2427
> URL: https://issues.apache.org/jira/browse/HDFS-2427
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h2427_20111019.patch, h2427_20111019b.patch, 
> h2427_20111020.patch
>
>






[jira] [Updated] (HDFS-2452) OutOfMemoryError in DataXceiverServer takes down the DataNode

2011-10-20 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2452:
--

Attachment: HDFS-2452-22Branch.2.patch

> OutOfMemoryError in DataXceiverServer takes down the DataNode
> -
>
> Key: HDFS-2452
> URL: https://issues.apache.org/jira/browse/HDFS-2452
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Uma Maheswara Rao G
> Fix For: 0.22.0
>
> Attachments: HDFS-2452-22Branch.2.patch, HDFS-2452-22branch.1.patch, 
> HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, HDFS-2452-22branch.patch, 
> HDFS-2452-22branch.patch
>
>
> OutOfMemoryError brings down DataNode, when DataXceiverServer tries to spawn 
> a new data transfer thread.





[jira] [Commented] (HDFS-2178) Contributing Hoop to HDFS, replacement for HDFS proxy with read/write capabilities

2011-10-20 Thread Alejandro Abdelnur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131920#comment-13131920
 ] 

Alejandro Abdelnur commented on HDFS-2178:
--

@Cos, yes, I considered WebDAV before starting with Hoop. The main issues I 
found were:

* WebDAV introduces several HTTP extensions, such as additional verbs 
(methods) and headers. General-purpose HTTP libraries sometimes fail to 
support such extensions, and the same goes for some HTTP proxy 
implementations.
* WebDAV requires functionality that would be difficult to implement in an 
hdfs-proxy, like COPY.
* WebDAV usage with plain HTTP tools/libs is quite complicated; you need a 
WebDAV client.

That said, I'm not dismissing a WebDAV bridge; it just didn't match the goal 
of hdfs-proxy/Hoop.


> Contributing Hoop to HDFS, replacement for HDFS proxy with read/write 
> capabilities
> --
>
> Key: HDFS-2178
> URL: https://issues.apache.org/jira/browse/HDFS-2178
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.0
>
> Attachments: HDFSoverHTTP-API.html, HdfsHttpAPI.pdf
>
>
> We'd like to contribute Hoop to Hadoop HDFS as a replacement (an improvement) 
> for HDFS Proxy.
> Hoop provides access to all Hadoop Distributed File System (HDFS) operations 
> (read and write) over HTTP/S.
> The Hoop server component is a REST HTTP gateway to HDFS supporting all file 
> system operations. It can be accessed using standard HTTP tools (e.g. curl 
> and wget), HTTP libraries from different programming languages (e.g. Perl, 
> JavaScript) as well as using the Hoop client. The Hoop server component is a 
> standard Java web-application and it has been implemented using Jersey 
> (JAX-RS).
> The Hoop client component is an implementation of Hadoop FileSystem client 
> that allows using the familiar Hadoop filesystem API to access HDFS data 
> through a Hoop server.
>   Repo: https://github.com/cloudera/hoop
>   Docs: http://cloudera.github.com/hoop
>   Blog: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/
> Hoop is a Maven based project that depends on Hadoop HDFS and Alfredo (for 
> Kerberos HTTP SPNEGO authentication). 
> To make the integration easy, HDFS Mavenization (HDFS-2096) would have to be 
> done first, as well as the Alfredo contribution (HADOOP-7119).





[jira] [Commented] (HDFS-2450) Only complete hostname is supported to access data via hdfs://

2011-10-20 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131919#comment-13131919
 ] 

Hadoop QA commented on HDFS-2450:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12499910/HDFS-2450-1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1399//console

This message is automatically generated.

> Only complete hostname is supported to access data via hdfs://
> --
>
> Key: HDFS-2450
> URL: https://issues.apache.org/jira/browse/HDFS-2450
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Daryn Sharp
> Attachments: HDFS-2450-1.patch, HDFS-2450.patch
>
>
> If my complete hostname is  host1.abc.xyz.com, only complete hostname must be 
> used to access data via hdfs://
> I am running following in .20.205 Client to get data from .20.205 NN (host1)
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1/tmp
> copyFromLocal: Wrong FS: hdfs://host1/tmp, expected: hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> $hadoop dfs -copyFromLocal /etc/passwd  hftp://host1.abc.xyz/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> Only following is supported 
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc.xyz.com/tmp/
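One plausible way to accept the shorthand hostnames shown above (a sketch of the general idea only, not the attached patch) is to canonicalize both authorities before comparing them, so that a short name and its fully qualified form match when DNS resolves them to the same host:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostCheck {
    // Returns true when the authority the user typed (possibly a short
    // name like "host1") resolves to the same canonical host as the
    // filesystem's configured authority (e.g. "host1.abc.xyz.com").
    static boolean sameHost(String requested, String configured) {
        try {
            String a = InetAddress.getByName(requested).getCanonicalHostName();
            String b = InetAddress.getByName(configured).getCanonicalHostName();
            return a.equalsIgnoreCase(b);
        } catch (UnknownHostException e) {
            // Unresolvable names keep the strict "Wrong FS" behavior.
            return false;
        }
    }
}
```

With a check like this, hdfs://host1/tmp and hdfs://host1.abc.xyz.com/tmp compare equal whenever DNS resolves both names to the same canonical host.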





[jira] [Commented] (HDFS-2450) Only complete hostname is supported to access data via hdfs://

2011-10-20 Thread Daryn Sharp (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131913#comment-13131913
 ] 

Daryn Sharp commented on HDFS-2450:
---

Trimmed out as much as I could, reverted the refactor of {{checkPath}}, and 
removed extraneous comments based on input from Suresh & Nicholas.

> Only complete hostname is supported to access data via hdfs://
> --
>
> Key: HDFS-2450
> URL: https://issues.apache.org/jira/browse/HDFS-2450
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Daryn Sharp
> Attachments: HDFS-2450-1.patch, HDFS-2450.patch
>
>
> If my complete hostname is  host1.abc.xyz.com, only complete hostname must be 
> used to access data via hdfs://
> I am running following in .20.205 Client to get data from .20.205 NN (host1)
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1/tmp
> copyFromLocal: Wrong FS: hdfs://host1/tmp, expected: hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> $hadoop dfs -copyFromLocal /etc/passwd  hftp://host1.abc.xyz/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> Only following is supported 
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc.xyz.com/tmp/





[jira] [Updated] (HDFS-2450) Only complete hostname is supported to access data via hdfs://

2011-10-20 Thread Daryn Sharp (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-2450:
--

Attachment: HDFS-2450-1.patch

 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.


> Only complete hostname is supported to access data via hdfs://
> --
>
> Key: HDFS-2450
> URL: https://issues.apache.org/jira/browse/HDFS-2450
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Daryn Sharp
> Attachments: HDFS-2450-1.patch, HDFS-2450.patch
>
>
> If my complete hostname is  host1.abc.xyz.com, only complete hostname must be 
> used to access data via hdfs://
> I am running following in .20.205 Client to get data from .20.205 NN (host1)
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1/tmp
> copyFromLocal: Wrong FS: hdfs://host1/tmp, expected: hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> $hadoop dfs -copyFromLocal /etc/passwd  hftp://host1.abc.xyz/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal  ... ]
> Only following is supported 
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc.xyz.com/tmp/





[jira] [Commented] (HDFS-2440) webhdfs open a file and send 0 or invalid bufferSize, length, offset throws a 500, we should throw a 400

2011-10-20 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131890#comment-13131890
 ] 

Arpit Gupta commented on HDFS-2440:
---

Also, the GETFILECHECKSUM REST API throws a 500 when called on a directory. We 
should throw a 403 with an appropriate message stating why the call failed, so 
the user can determine the cause.
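The distinction the ticket asks for is between client errors (bad request parameters, which deserve a 400) and genuine server errors (500). A minimal validation sketch, assuming the parameter semantics implied by the summary (bufferSize must be positive, offset and length non-negative):

```java
public class OpenParams {
    static final int SC_OK = 200;
    static final int SC_BAD_REQUEST = 400;

    // Classifies the open-file query parameters: malformed values are the
    // client's fault and should yield 400, never a generic 500.
    static int validate(long offset, long length, int bufferSize) {
        if (offset < 0 || length < 0 || bufferSize <= 0) {
            return SC_BAD_REQUEST;
        }
        return SC_OK;
    }
}
```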

> webhdfs open a file and send 0 or invalid bufferSize, length, offset throws a 
> 500, we should throw a 400
> ---
>
> Key: HDFS-2440
> URL: https://issues.apache.org/jira/browse/HDFS-2440
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>






[jira] [Updated] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Suresh Srinivas (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-2480:
--

Attachment: HDFS-2480.txt

> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt, HDFS-2480.txt
>
>
> This jira separates for NamenodeProtocol the wire types from the types used 
> by the client and server, similar to HDFS-2181.





[jira] [Updated] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Suresh Srinivas (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-2480:
--

Attachment: HDFS-2480.txt

Attached patch

> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2480.txt
>
>
> This jira separates for NamenodeProtocol the wire types from the types used 
> by the client and server, similar to HDFS-2181.





[jira] [Created] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Suresh Srinivas (Created) (JIRA)
Separate datatypes for NamenodeProtocol
---

 Key: HDFS-2480
 URL: https://issues.apache.org/jira/browse/HDFS-2480
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0, 0.24.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas


This jira separates for JournalProtocol the wire types from the types used by 
the client and server, similar to HDFS-2181.





[jira] [Updated] (HDFS-2480) Separate datatypes for NamenodeProtocol

2011-10-20 Thread Suresh Srinivas (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-2480:
--

Description: This jira separates for NamenodeProtocol the wire types from 
the types used by the client and server, similar to HDFS-2181.  (was: This jira 
separates for JournalProtocol the wire types from the types used by the client 
and server, similar to HDFS-2181.)

> Separate datatypes for NamenodeProtocol
> ---
>
> Key: HDFS-2480
> URL: https://issues.apache.org/jira/browse/HDFS-2480
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>
> This jira separates for NamenodeProtocol the wire types from the types used 
> by the client and server, similar to HDFS-2181.





[jira] [Commented] (HDFS-2459) Separate datatypes for Journal protocol

2011-10-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131773#comment-13131773
 ] 

Hudson commented on HDFS-2459:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1138 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1138/])
HDFS-2459. Separate datatypes for Journal Protocol. (suresh)

suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186896
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalProtocolServerSideTranslatorR23.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalProtocolTranslatorR23.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalWireProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/NamenodeRegistrationWritable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/StorageInfoWritable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupOutputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/JournalProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamenodeRegistration.java


> Separate datatypes for Journal protocol
> ---
>
> Key: HDFS-2459
> URL: https://issues.apache.org/jira/browse/HDFS-2459
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2459.txt, HDFS-2459.txt, HDFS-2459.txt
>
>
> This jira separates for JournalProtocol the wire types from the types used by 
> the client and server, similar to HDFS-2181.





[jira] [Created] (HDFS-2479) HDFS Client Data Types in Protocol Buffers

2011-10-20 Thread Sanjay Radia (Created) (JIRA)
HDFS Client Data Types in Protocol Buffers
--

 Key: HDFS-2479
 URL: https://issues.apache.org/jira/browse/HDFS-2479
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Created] (HDFS-2478) HDFS Protocols in Protocol Buffers

2011-10-20 Thread Sanjay Radia (Created) (JIRA)
HDFS Protocols in Protocol Buffers
--

 Key: HDFS-2478
 URL: https://issues.apache.org/jira/browse/HDFS-2478
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Sanjay Radia
Assignee: Sanjay Radia








[jira] [Updated] (HDFS-2060) DFS client RPCs using protobufs

2011-10-20 Thread Sanjay Radia (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-2060:
---

Issue Type: Sub-task  (was: New Feature)
Parent: HDFS-2478

> DFS client RPCs using protobufs
> ---
>
> Key: HDFS-2060
> URL: https://issues.apache.org/jira/browse/HDFS-2060
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-2060-getblocklocations.txt
>
>
> The most important place for wire-compatibility in DFS is between clients and 
> the cluster, since lockstep upgrade is very difficult and a single client may 
> want to talk to multiple server versions. So, I'd like to focus this JIRA on 
> making the RPCs between the DFS client and the NN/DNs wire-compatible using 
> protocol buffer based serialization.





[jira] [Commented] (HDFS-2459) Separate datatypes for Journal protocol

2011-10-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131751#comment-13131751
 ] 

Hudson commented on HDFS-2459:
--

Integrated in Hadoop-Common-trunk-Commit #1122 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1122/])
HDFS-2459. Separate datatypes for Journal Protocol. (suresh)

suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186896
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalProtocolServerSideTranslatorR23.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalProtocolTranslatorR23.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalWireProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/NamenodeRegistrationWritable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/StorageInfoWritable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupOutputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/JournalProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamenodeRegistration.java


> Separate datatypes for Journal protocol
> ---
>
> Key: HDFS-2459
> URL: https://issues.apache.org/jira/browse/HDFS-2459
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2459.txt, HDFS-2459.txt, HDFS-2459.txt
>
>
> This jira separates for JournalProtocol the wire types from the types used by 
> the client and server, similar to HDFS-2181.





[jira] [Commented] (HDFS-2459) Separate datatypes for Journal protocol

2011-10-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131750#comment-13131750
 ] 

Hudson commented on HDFS-2459:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1201 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1201/])
HDFS-2459. Separate datatypes for Journal Protocol. (suresh)

suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186896
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalProtocolServerSideTranslatorR23.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalProtocolTranslatorR23.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/JournalWireProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/NamenodeRegistrationWritable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolR23Compatible/StorageInfoWritable.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupOutputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/JournalProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/NamenodeRegistration.java


> Separate datatypes for Journal protocol
> ---
>
> Key: HDFS-2459
> URL: https://issues.apache.org/jira/browse/HDFS-2459
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2459.txt, HDFS-2459.txt, HDFS-2459.txt
>
>
> This jira separates for JournalProtocol the wire types from the types used by 
> the client and server, similar to HDFS-2181.





[jira] [Commented] (HDFS-2471) Add Federation feature, configuration and tools documentation

2011-10-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131613#comment-13131613
 ] 

Hudson commented on HDFS-2471:
--

Integrated in Hadoop-Mapreduce-0.23-Build #57 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/57/])
HDFS-2471. Add documentation for HDFS federation feature. Contributed by 
Suresh Srinivas.

suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186579
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/Federation.apt.vm
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/index.apt.vm
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/resources
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/federation-background.gif
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/federation.gif


> Add Federation feature, configuration and tools documentation
> -
>
> Key: HDFS-2471
> URL: https://issues.apache.org/jira/browse/HDFS-2471
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.23.0, 0.24.0
>
> Attachments: HDFS-2471.patch, HDFS-2471.patch
>
>
> This jira intends to add Federation documentation.





[jira] [Commented] (HDFS-2445) Incorrect exit code for hadoop-hdfs-test tests when exception thrown

2011-10-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131607#comment-13131607
 ] 

Hudson commented on HDFS-2445:
--

Integrated in Hadoop-Mapreduce-0.23-Build #57 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/57/])
Merge -c 1186550 from trunk to branch-0.23 to complete fix for HDFS-2445.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186551
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/test/HdfsTestDriver.java


> Incorrect exit code for hadoop-hdfs-test tests when exception thrown
> 
>
> Key: HDFS-2445
> URL: https://issues.apache.org/jira/browse/HDFS-2445
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.20.205.0, 0.23.0, 0.24.0
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 0.23.0
>
> Attachments: HDFS-2445.patch
>
>
> Please see MAPREDUCE-3179 for a full description.





[jira] [Commented] (HDFS-2451) TestNodeCount.testNodeCount fails with NPE

2011-10-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131584#comment-13131584
 ] 

Hudson commented on HDFS-2451:
--

Integrated in Hadoop-Hdfs-22-branch #100 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-22-branch/100/])
HDFS-2451. TestNodeCount.testNodeCount fails with NPE. Contributed by 
Konstantin Boudnik.

cos : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186556
Files : 
* /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.22/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestNodeCount.java


> TestNodeCount.testNodeCount fails with NPE
> ---
>
> Key: HDFS-2451
> URL: https://issues.apache.org/jira/browse/HDFS-2451
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0
>Reporter: Konstantin Boudnik
>Assignee: Konstantin Boudnik
>Priority: Blocker
> Attachments: HDFS-2451.patch, HDFS-2451.patch, HDFS-2451.patch
>
>
> in the [commit build #97|http://is.gd/UPuXg2] TestNodeCount.testNodeCount 
> failed with NPE
> {noformat}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.BlockManager.countNodes(BlockManager.java:1436)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestNodeCount.__CLR2_4_39bdgm6whp(TestNodeCount.java:119)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestNodeCount.testNodeCount(TestNodeCount.java:40)
> {noformat}





[jira] [Commented] (HDFS-2471) Add Federation feature, configuration and tools documentation

2011-10-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131575#comment-13131575
 ] 

Hudson commented on HDFS-2471:
--

Integrated in Hadoop-Hdfs-0.23-Build #45 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/45/])
HDFS-2471. Add documentation for HDFS federation feature. Contributed by 
Suresh Srinivas.

suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186579
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ClusterSetup.apt.vm
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/Federation.apt.vm
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/index.apt.vm
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/resources
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/federation-background.gif
* 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/resources/federation.gif


> Add Federation feature, configuration and tools documentation
> -
>
> Key: HDFS-2471
> URL: https://issues.apache.org/jira/browse/HDFS-2471
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: documentation
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.23.0, 0.24.0
>
> Attachments: HDFS-2471.patch, HDFS-2471.patch
>
>
> This jira intends to add Federation documentation.





[jira] [Commented] (HDFS-2445) Incorrect exit code for hadoop-hdfs-test tests when exception thrown

2011-10-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131570#comment-13131570
 ] 

Hudson commented on HDFS-2445:
--

Integrated in Hadoop-Hdfs-0.23-Build #45 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/45/])
Merge -c 1186550 from trunk to branch-0.23 to complete fix for HDFS-2445.

acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1186551
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/test/HdfsTestDriver.java


> Incorrect exit code for hadoop-hdfs-test tests when exception thrown
> 
>
> Key: HDFS-2445
> URL: https://issues.apache.org/jira/browse/HDFS-2445
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.20.205.0, 0.23.0, 0.24.0
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Fix For: 0.23.0
>
> Attachments: HDFS-2445.patch
>
>
> Please see MAPREDUCE-3179 for a full description.





[jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2011-10-20 Thread Steve Loughran (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131528#comment-13131528
 ] 

Steve Loughran commented on HDFS-2476:
--

I like this idea, though someone who understands the NN internals would have to 
review the changes there. The new collections may be best placed in 
hadoop-common, as they could be of broader value.

> More CPU efficient data structure for 
> under-replicated/over-replicated/invalidate blocks
> 
>
> Key: HDFS-2476
> URL: https://issues.apache.org/jira/browse/HDFS-2476
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Tomasz Nykiel
>Assignee: Tomasz Nykiel
> Attachments: hashStructures.patch
>
>
> This patch introduces two hash data structures for storing under-replicated, 
> over-replicated and invalidated blocks: 
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we use java.util.TreeSet, which adds 
> unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> - cluster instability periods, when these queues (especially the 
> under-replicated one) tend to grow quite drastically,
> - initial cluster startup, when the queues are initialized after leaving 
> safemode,
> - block reports,
> - explicit acks for block addition and deletion.
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to the current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike the current 
> O(log n) for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of 
> elements (get+remove), which are used for handling the queues.
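The description above can be illustrated with a minimal chained hash set in Java. This is a sketch only, not the actual Hadoop LightWeightHashSet: the class and method names here are invented for illustration, but it shows the three properties claimed in the patch description, namely O(1) amortized add/contains/remove, resizing in both directions with the current size, and a bulk get+remove (pollN) for draining a queue.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a lightweight chained hash set (not the HDFS code).
class SketchHashSet<T> {
    private static class Node<T> {
        final T value;
        Node<T> next;
        Node(T value, Node<T> next) { this.value = value; this.next = next; }
    }

    private Node<T>[] buckets;
    private int size;

    @SuppressWarnings("unchecked")
    SketchHashSet() { buckets = (Node<T>[]) new Node[16]; }

    private int index(T value) {
        return (value.hashCode() & 0x7fffffff) % buckets.length;
    }

    boolean add(T value) {
        if (contains(value)) return false;
        int i = index(value);
        buckets[i] = new Node<>(value, buckets[i]);
        size++;
        if (size > buckets.length * 3 / 4) resize(buckets.length * 2); // expand
        return true;
    }

    boolean contains(T value) {
        for (Node<T> n = buckets[index(value)]; n != null; n = n.next) {
            if (n.value.equals(value)) return true;
        }
        return false;
    }

    boolean remove(T value) {
        int i = index(value);
        Node<T> prev = null;
        for (Node<T> n = buckets[i]; n != null; prev = n, n = n.next) {
            if (n.value.equals(value)) {
                if (prev == null) buckets[i] = n.next; else prev.next = n.next;
                size--;
                if (buckets.length > 16 && size < buckets.length / 8) {
                    resize(buckets.length / 2); // shrink when sparse
                }
                return true;
            }
        }
        return false;
    }

    // Bulk get+remove of up to n elements, the "polling" access pattern
    // used when handling the replication queues.
    List<T> pollN(int n) {
        List<T> out = new ArrayList<>();
        for (int i = 0; i < buckets.length && out.size() < n; i++) {
            while (buckets[i] != null && out.size() < n) {
                out.add(buckets[i].value);
                buckets[i] = buckets[i].next;
                size--;
            }
        }
        return out;
    }

    int size() { return size; }

    @SuppressWarnings("unchecked")
    private void resize(int newCap) {
        Node<T>[] old = buckets;
        buckets = (Node<T>[]) new Node[newCap];
        for (Node<T> head : old) {
            for (Node<T> n = head; n != null; n = n.next) {
                int i = index(n.value);
                buckets[i] = new Node<>(n.value, buckets[i]);
            }
        }
    }
}
```

A TreeSet pays O(log n) per operation to keep elements ordered, which the block queues do not need; a hash set trades that ordering away for O(1) amortized operations, and shrinking on removal keeps memory proportional to the live queue size.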





[jira] [Commented] (HDFS-2334) Add Closeable to JournalManager

2011-10-20 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131464#comment-13131464
 ] 

jirapos...@reviews.apache.org commented on HDFS-2334:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2247/
---

(Updated 2011-10-20 09:00:57.815922)


Review request for hadoop-hdfs.


Summary (updated)
---

A JournalManager may hold resources for the duration of its lifetime. 
This isn't the case at the moment for FileJournalManager, but 
BookKeeperJournalManager will, and it's conceivable that FileJournalManager 
could take a lock on a directory, etc.

This JIRA is to add Closeable to JournalManager so that these resources can be 
cleaned up when FSEditLog is closed.
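The shape of the change can be sketched as follows. This is an illustrative sketch, not the patch itself: the type names mirror the HDFS classes under discussion, but the method bodies and the `closed` flag are invented for demonstration. The point is that once JournalManager extends Closeable, FSEditLog.close() can release every journal's resources uniformly.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch: JournalManager extends Closeable so FSEditLog can release
// journal-held resources (directory locks, BookKeeper connections) on close.
interface JournalManager extends Closeable {
    void startLogSegment(long txId) throws IOException;
    void finalizeLogSegment(long firstTxId, long lastTxId) throws IOException;
}

class FileJournalManager implements JournalManager {
    boolean closed = false; // illustrative flag, not in the real class

    public void startLogSegment(long txId) { /* open edits_inprogress_N */ }
    public void finalizeLogSegment(long f, long l) { /* rename to finalized */ }

    @Override
    public void close() throws IOException {
        // Nothing is held today, but e.g. a directory lock would be
        // released here.
        closed = true;
    }
}

class FSEditLog implements Closeable {
    private final List<JournalManager> journals = new ArrayList<>();

    void addJournal(JournalManager jm) { journals.add(jm); }

    @Override
    public void close() throws IOException {
        // Closing the edit log closes every journal, so resources taken
        // for the journal's lifetime are always cleaned up.
        for (JournalManager jm : journals) {
            jm.close();
        }
    }
}
```

Using the standard java.io.Closeable interface (rather than an ad-hoc close method) also lets callers treat a JournalManager like any other closeable resource.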


This addresses bug HDFS-2334.
http://issues.apache.org/jira/browse/HDFS-2334


Diffs
-

  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java
 6976620 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
 4a41a2c 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
 8cfc975 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
 0bb7b0f 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java
 0d6bc74 

Diff: https://reviews.apache.org/r/2247/diff


Testing
---


Thanks,

Ivan



> Add Closeable to JournalManager
> ---
>
> Key: HDFS-2334
> URL: https://issues.apache.org/jira/browse/HDFS-2334
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.23.0
>
> Attachments: HDFS-2334.diff, HDFS-2334.diff
>
>
> A JournalManager may hold resources for the duration of its 
> lifetime. This isn't the case at the moment for FileJournalManager, but 
> BookKeeperJournalManager will, and it's conceivable that FileJournalManager 
> could take a lock on a directory, etc. 
> This JIRA is to add Closeable to JournalManager so that these resources can 
> be cleaned up when FSEditLog is closed.





[jira] [Commented] (HDFS-1975) HA: Support for sharing the namenode state from active to standby.

2011-10-20 Thread Ivan Kelly (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131445#comment-13131445
 ] 

Ivan Kelly commented on HDFS-1975:
--

{quote}
initJournals checks the state.
{quote}
Ah, so it does. Ignore that comment then.

> HA: Support for sharing the namenode state from active to standby.
> --
>
> Key: HDFS-1975
> URL: https://issues.apache.org/jira/browse/HDFS-1975
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Suresh Srinivas
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1975-HA.2.patch, HDFS-1975-HA.patch, hdfs-1975.txt, 
> hdfs-1975.txt
>
>
> To enable a hot standby namenode, the standby node must have current 
> information for the namenode state (image + edits) and block location 
> information. This jira addresses keeping the namenode state current in the 
> standby node. The solution proposed in this jira is to use 
> shared storage to store the namenode state. 
> Note one could also build an alternative solution by augmenting the backup 
> node. A separate jira could explore this.
