[jira] [Commented] (HDFS-4456) Add concat to HttpFS and WebHDFS REST API docs

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569503#comment-13569503
 ] 

Hudson commented on HDFS-4456:
--

Integrated in Hadoop-Yarn-trunk #115 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/115/])
HDFS-4456. Add concat to HttpFS and WebHDFS REST API docs. (plamenj2003 via 
tucu) (Revision 1441603)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441603
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ConcatSourcesParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm


 Add concat to HttpFS and WebHDFS REST API docs
 --

 Key: HDFS-4456
 URL: https://issues.apache.org/jira/browse/HDFS-4456
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: webhdfs
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Plamen Jeliazkov
 Fix For: 2.0.3-alpha

 Attachments: HDFS-3598.trunk.docAndHttpFS.patch, 
 HDFS-4456.trunk.docAndHttpFS.patch, HDFS-4456.trunk.docAndHttpFS.patch, 
 HDFS-4456.trunk.genericsRemoval.patch, HDFS-4456.trunk.patch, 
 HDFS-4456.trunk.patch


 HDFS-3598 adds the concat feature to WebHDFS.  The REST API should be updated 
 accordingly.
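 For reference, a minimal client-side sketch (host, port and paths are 
 placeholders; FileSystem#concat is assumed to be the programmatic entry 
 point, with the REST form this patch documents shown in the comment):
 {code}
 import java.net.URI;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;

 public class ConcatExample {
   public static void main(String[] args) throws Exception {
     // Over REST this is roughly:
     //   POST http://<host>:<port>/webhdfs/v1/<target>?op=CONCAT&sources=/a,/b
     FileSystem fs = FileSystem.get(
         URI.create("webhdfs://namenode:50070"), new Configuration());
     Path target = new Path("/user/alice/part-all");
     Path[] sources = { new Path("/user/alice/part-1"),
                        new Path("/user/alice/part-2") };
     fs.concat(target, sources);  // appends the sources to target, then removes them
     fs.close();
   }
 }
 {code}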

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4452) getAdditionalBlock() can create multiple blocks if the client times out and retries.

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569505#comment-13569505
 ] 

Hudson commented on HDFS-4452:
--

Integrated in Hadoop-Yarn-trunk #115 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/115/])
HDFS-4452. getAdditionalBlock() can create multiple blocks if the client 
times out and retries. Contributed by Konstantin Shvachko. (Revision 1441681)

 Result = SUCCESS
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441681
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAddBlockRetry.java


 getAdditionalBlock() can create multiple blocks if the client times out and 
 retries.
 

 Key: HDFS-4452
 URL: https://issues.apache.org/jira/browse/HDFS-4452
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.2-alpha
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
Priority: Critical
 Fix For: 2.0.3-alpha

 Attachments: getAdditionalBlock-branch2.patch, 
 getAdditionalBlock.patch, getAdditionalBlock.patch, getAdditionalBlock.patch, 
 TestAddBlockRetry.java


 HDFS client tries to addBlock() to a file. If the NameNode is busy, the client 
 can time out and reissue the same request. The two requests will race 
 with each other in {{FSNamesystem.getAdditionalBlock()}}, which can result in 
 creating two new blocks on the NameNode while the client knows of only 
 one of them. This eventually results in {{NotReplicatedYetException}}, because 
 the extra block is never reported by any DataNode, which stalls file creation 
 and leaves the file in an invalid state with an empty block in the middle.
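 As an illustration of the idempotence such a retry needs, here is a toy, 
 self-contained sketch (not the actual FSNamesystem code): if the caller's 
 previous block is already the penultimate block of the file, the racing 
 duplicate must have allocated the last block, so return that block instead of 
 creating another:
 {code}
 import java.util.ArrayList;
 import java.util.List;

 public class AddBlockRetrySketch {
   private final List<Long> blocks = new ArrayList<Long>();  // block ids of the file
   private long nextId = 1;

   synchronized long getAdditionalBlock(Long previous) {
     int n = blocks.size();
     // Retry detection: "previous" is already penultimate, so a duplicate of
     // this request already allocated the last block; hand it back.
     if (n >= 2 && blocks.get(n - 2).equals(previous)) {
       return blocks.get(n - 1);
     }
     long id = nextId++;
     blocks.add(id);
     return id;
   }

   public static void main(String[] args) {
     AddBlockRetrySketch file = new AddBlockRetrySketch();
     long b1 = file.getAdditionalBlock(null);
     long b2 = file.getAdditionalBlock(b1);
     long b2Retry = file.getAdditionalBlock(b1);  // timed-out client retries
     System.out.println(b2 == b2Retry);           // true: no extra block created
   }
 }
 {code}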

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3119) Overreplicated block is not deleted even after the replication factor is reduced after sync followed by closing that file

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569518#comment-13569518
 ] 

Hudson commented on HDFS-3119:
--

Integrated in Hadoop-Hdfs-0.23-Build #513 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/513/])
merge -r 1311379:1311380 Merging from trunk to branch-0.23 to fix HDFS-3119 
(Revision 1441656)

 Result = SUCCESS
kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441656
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestOverReplicatedBlocks.java


 Overreplicated block is not deleted even after the replication factor is 
 reduced after sync followed by closing that file
 

 Key: HDFS-3119
 URL: https://issues.apache.org/jira/browse/HDFS-3119
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.24.0
Reporter: J.Andreina
Assignee: Ashish Singhi
Priority: Minor
  Labels: patch
 Fix For: 0.24.0, 2.0.0-alpha, 0.23.7

 Attachments: HDFS-3119-1.patch, HDFS-3119-1.patch, HDFS-3119.patch


 cluster setup:
 --
 1 NN, 2 DN, replication factor 2, block report interval 3 sec, block size 256 MB
 step 1: write a file filewrite.txt of size 90 bytes with sync (not closed) 
 step 2: change the replication factor to 1 using the command: ./hdfs dfs 
 -setrep 1 /filewrite.txt
 step 3: close the file
 * At the NN side, the log message Decreasing replication from 2 to 1 for 
 /filewrite.txt has occurred, but the overreplicated blocks are not 
 deleted even after the block report is sent from the DN
 * While listing the file in the console using ./hdfs dfs -ls, the 
 replication factor for that file is shown as 1
 * The fsck report for that file displays that the file is replicated to 2 
 datanodes
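 The steps above as a minimal FileSystem-API sketch (paths and sizes taken 
 from this report; hflush() is assumed to play the role of sync):
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FSDataOutputStream;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;

 public class OverReplicationRepro {
   public static void main(String[] args) throws Exception {
     FileSystem fs = FileSystem.get(new Configuration());
     Path p = new Path("/filewrite.txt");
     FSDataOutputStream out = fs.create(p, (short) 2);  // step 1: replication 2
     out.write(new byte[90]);
     out.hflush();                                      // sync; file still open
     fs.setReplication(p, (short) 1);                   // step 2: setrep 1
     out.close();                                       // step 3: close
     // Expected: the extra replica is invalidated after the next block report.
   }
 }
 {code}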

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4444) Add space between total transaction time and number of transactions in FSEditLog#printStatistics

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569528#comment-13569528
 ] 

Hudson commented on HDFS-4444:
--

Integrated in Hadoop-Hdfs-0.23-Build #513 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/513/])
HDFS-4444. Add space between total transaction time and number of 
transactions in FSEditLog#printStatistics. (Stephen Chu via tgraves) (Revision 
1441652)

 Result = SUCCESS
tgraves : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441652
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java


 Add space between total transaction time and number of transactions in 
 FSEditLog#printStatistics
 

 Key: HDFS-4444
 URL: https://issues.apache.org/jira/browse/HDFS-4444
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Stephen Chu
Assignee: Stephen Chu
Priority: Trivial
 Fix For: 1.2.0, 2.0.3-alpha, 0.23.7

 Attachments: HDFS-4444.patch.001, HDFS-4444.patch.branch-1


 Currently, when we log statistics, we see something like
 {code}
 13/01/25 23:16:59 INFO namenode.FSNamesystem: Number of transactions: 0 Total 
 time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number 
 of syncs: 0 SyncTimes(ms): 0
 {code}
 Notice how the values for the total transaction time and the number of 
 transactions batched in syncs need a space to separate them.
 FSEditLog#printStatistics:
 {code}
   private void printStatistics(boolean force) {
     long now = now();
     if (lastPrintTime + 60000 > now && !force) {
       return;
     }
     lastPrintTime = now;
     StringBuilder buf = new StringBuilder();
     buf.append("Number of transactions: ");
     buf.append(numTransactions);
     buf.append(" Total time for transactions(ms): ");
     buf.append(totalTimeTransactions);
     buf.append("Number of transactions batched in Syncs: ");
     buf.append(numTransactionsBatchedInSync);
     buf.append(" Number of syncs: ");
     buf.append(editLogStream.getNumSync());
     buf.append(" SyncTimes(ms): ");
     buf.append(journalSet.getSyncTimes());
     LOG.info(buf);
   }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569529#comment-13569529
 ] 

Hudson commented on HDFS-2476:
--

Integrated in Hadoop-Hdfs-0.23-Build #513 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/513/])
merge -r 1201990:1201991 Merging from trunk to branch-0.23 to fix HDFS-2476 
(Revision 1441463)

 Result = SUCCESS
kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441463
Files : 
* /hadoop/common/branches/branch-0.23
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightHashSet.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightLinkedSet.java


 More CPU efficient data structure for 
 under-replicated/over-replicated/invalidate blocks
 

 Key: HDFS-2476
 URL: https://issues.apache.org/jira/browse/HDFS-2476
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: 0.23.0
Reporter: Tomasz Nykiel
Assignee: Tomasz Nykiel
 Fix For: 2.0.0-alpha, 0.23.7

 Attachments: hashStructures.patch, hashStructures.patch-2, 
 hashStructures.patch-3, hashStructures.patch-4, hashStructures.patch-5, 
 hashStructures.patch-6, hashStructures.patch-7, hashStructures.patch-8, 
 hashStructures.patch-9


 This patch introduces two hash data structures for storing under-replicated, 
 over-replicated and invalidated blocks:
 1. LightWeightHashSet
 2. LightWeightLinkedSet
 Currently in all these cases we are using java.util.TreeSet, which adds 
 unnecessary overhead.
 The main bottlenecks addressed by this patch are:
 - cluster instability times, when these queues (especially under-replicated) 
 tend to grow quite drastically,
 - initial cluster startup, when the queues are initialized after leaving 
 safemode,
 - block reports,
 - explicit acks for block addition and deletion.
 1. The introduced structures are CPU-optimized.
 2. They shrink and expand according to current capacity.
 3. Add/contains/delete ops are performed in O(1) time (unlike the current 
 O(log n) for TreeSet).
 4. The sets are equipped with fast access methods for polling a number of 
 elements (get+remove), which are used for handling the queues.
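 A minimal usage sketch, assuming the batch accessor is pollN(int) as described 
 above (treat the exact signature as illustrative, not authoritative):
 {code}
 import java.util.List;

 import org.apache.hadoop.hdfs.util.LightWeightLinkedSet;

 public class PollingSketch {
   public static void main(String[] args) {
     LightWeightLinkedSet<Long> underReplicated = new LightWeightLinkedSet<Long>();
     for (long blockId = 0; blockId < 1000; blockId++) {
       underReplicated.add(blockId);                 // O(1), no tree rebalancing
     }
     boolean known = underReplicated.contains(42L);  // O(1) lookup
     List<Long> batch = underReplicated.pollN(100);  // get+remove a work batch
     System.out.println(known + " polled=" + batch.size()
         + " left=" + underReplicated.size());
   }
 }
 {code}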

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1765) Block Replication should respect under-replication block priority

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569530#comment-13569530
 ] 

Hudson commented on HDFS-1765:
--

Integrated in Hadoop-Hdfs-0.23-Build #513 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/513/])
merge -r 1213536:1213537 Merging from trunk to branch-0.23 to fix HDFS-1765 
(Revision 1441577)

 Result = SUCCESS
kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441577
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/metrics/TestNameNodeMetrics.java


 Block Replication should respect under-replication block priority
 -

 Key: HDFS-1765
 URL: https://issues.apache.org/jira/browse/HDFS-1765
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.0
Reporter: Hairong Kuang
Assignee: Uma Maheswara Rao G
 Fix For: 2.0.0-alpha, 0.23.7

 Attachments: HDFS-1765.patch, HDFS-1765.patch, HDFS-1765.patch, 
 HDFS-1765.patch, HDFS-1765.pdf, underReplicatedQueue.pdf

  Time Spent: 0.5h
  Remaining Estimate: 0h

 Currently under-replicated blocks are assigned different priorities depending 
 on how many replicas a block has. However, the replication monitor works on 
 blocks in a round-robin fashion, so newly added high-priority blocks 
 won't get replicated until all low-priority blocks are done. One example is 
 that on the decommissioning datanode WebUI we often observe that blocks with 
 only decommissioning replicas do not get scheduled to replicate before other 
 blocks, risking data availability if the node is shut down for repair 
 before decommissioning completes.
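 An illustrative sketch of priority bucketing (not the actual 
 UnderReplicatedBlocks rules), where a lower number means replicate sooner:
 {code}
 public class ReplicationPrioritySketch {
   /** Lower value = more urgent; the thresholds here are illustrative only. */
   static int getPriority(int curReplicas, int decommissionedReplicas,
       int expectedReplicas) {
     if (curReplicas == 0 && decommissionedReplicas > 0) {
       return 0;  // only decommissioning replicas left: replicate first
     } else if (curReplicas == 1) {
       return 1;  // a single live replica
     } else if (curReplicas * 3 < expectedReplicas) {
       return 2;  // badly under-replicated
     } else {
       return 3;  // mildly under-replicated: can wait
     }
   }

   public static void main(String[] args) {
     System.out.println(getPriority(0, 1, 3));  // 0: only a decommissioning replica
     System.out.println(getPriority(2, 0, 3));  // 3: one replica short, can wait
   }
 }
 {code}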

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4456) Add concat to HttpFS and WebHDFS REST API docs

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569538#comment-13569538
 ] 

Hudson commented on HDFS-4456:
--

Integrated in Hadoop-Hdfs-trunk #1304 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1304/])
HDFS-4456. Add concat to HttpFS and WebHDFS REST API docs. (plamenj2003 via 
tucu) (Revision 1441603)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441603
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ConcatSourcesParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm


 Add concat to HttpFS and WebHDFS REST API docs
 --

 Key: HDFS-4456
 URL: https://issues.apache.org/jira/browse/HDFS-4456
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: webhdfs
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Plamen Jeliazkov
 Fix For: 2.0.3-alpha

 Attachments: HDFS-3598.trunk.docAndHttpFS.patch, 
 HDFS-4456.trunk.docAndHttpFS.patch, HDFS-4456.trunk.docAndHttpFS.patch, 
 HDFS-4456.trunk.genericsRemoval.patch, HDFS-4456.trunk.patch, 
 HDFS-4456.trunk.patch


 HDFS-3598 adds the concat feature to WebHDFS.  The REST API should be updated 
 accordingly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4452) getAdditionalBlock() can create multiple blocks if the client times out and retries.

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569540#comment-13569540
 ] 

Hudson commented on HDFS-4452:
--

Integrated in Hadoop-Hdfs-trunk #1304 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1304/])
HDFS-4452. getAdditionalBlock() can create multiple blocks if the client 
times out and retries. Contributed by Konstantin Shvachko. (Revision 1441681)

 Result = SUCCESS
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441681
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAddBlockRetry.java


 getAdditionalBlock() can create multiple blocks if the client times out and 
 retries.
 

 Key: HDFS-4452
 URL: https://issues.apache.org/jira/browse/HDFS-4452
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.2-alpha
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
Priority: Critical
 Fix For: 2.0.3-alpha

 Attachments: getAdditionalBlock-branch2.patch, 
 getAdditionalBlock.patch, getAdditionalBlock.patch, getAdditionalBlock.patch, 
 TestAddBlockRetry.java


 HDFS client tries to addBlock() to a file. If the NameNode is busy, the client 
 can time out and reissue the same request. The two requests will race 
 with each other in {{FSNamesystem.getAdditionalBlock()}}, which can result in 
 creating two new blocks on the NameNode while the client knows of only 
 one of them. This eventually results in {{NotReplicatedYetException}}, because 
 the extra block is never reported by any DataNode, which stalls file creation 
 and leaves the file in an invalid state with an empty block in the middle.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4464) Combine collectSubtreeBlocksAndClear with deleteDiffsForSnapshot

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569546#comment-13569546
 ] 

Hudson commented on HDFS-4464:
--

Integrated in Hadoop-Hdfs-Snapshots-Branch-build #89 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-Snapshots-Branch-build/89/])
HDFS-4464. Combine collectSubtreeBlocksAndClear with deleteDiffsForSnapshot 
and rename it to destroySubtreeAndCollectBlocks. (Revision 1441680)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441680
Files : 
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiffList.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectoryWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeFileUnderConstructionWithSnapshot.java
* 
/hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeFileWithSnapshot.java


 Combine collectSubtreeBlocksAndClear with deleteDiffsForSnapshot
 

 Key: HDFS-4464
 URL: https://issues.apache.org/jira/browse/HDFS-4464
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: Snapshot (HDFS-2802)

 Attachments: h4464_20120201b.patch, h4464_20120201.patch


 Both collectSubtreeBlocksAndClear and deleteDiffsForSnapshot are recursive 
 methods for deleting inodes and collecting blocks for further block 
 deletion/update.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4452) getAdditionalBlock() can create multiple blocks if the client times out and retries.

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569557#comment-13569557
 ] 

Hudson commented on HDFS-4452:
--

Integrated in Hadoop-Mapreduce-trunk #1332 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1332/])
HDFS-4452. getAdditionalBlock() can create multiple blocks if the client 
times out and retries. Contributed by Konstantin Shvachko. (Revision 1441681)

 Result = SUCCESS
shv : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441681
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAddBlockRetry.java


 getAdditionalBlock() can create multiple blocks if the client times out and 
 retries.
 

 Key: HDFS-4452
 URL: https://issues.apache.org/jira/browse/HDFS-4452
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.2-alpha
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
Priority: Critical
 Fix For: 2.0.3-alpha

 Attachments: getAdditionalBlock-branch2.patch, 
 getAdditionalBlock.patch, getAdditionalBlock.patch, getAdditionalBlock.patch, 
 TestAddBlockRetry.java


 HDFS client tries to addBlock() to a file. If the NameNode is busy, the client 
 can time out and reissue the same request. The two requests will race 
 with each other in {{FSNamesystem.getAdditionalBlock()}}, which can result in 
 creating two new blocks on the NameNode while the client knows of only 
 one of them. This eventually results in {{NotReplicatedYetException}}, because 
 the extra block is never reported by any DataNode, which stalls file creation 
 and leaves the file in an invalid state with an empty block in the middle.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4456) Add concat to HttpFS and WebHDFS REST API docs

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569555#comment-13569555
 ] 

Hudson commented on HDFS-4456:
--

Integrated in Hadoop-Mapreduce-trunk #1332 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1332/])
HDFS-4456. Add concat to HttpFS and WebHDFS REST API docs. (plamenj2003 via 
tucu) (Revision 1441603)

 Result = SUCCESS
tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441603
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ConcatSourcesParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm


 Add concat to HttpFS and WebHDFS REST API docs
 --

 Key: HDFS-4456
 URL: https://issues.apache.org/jira/browse/HDFS-4456
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: webhdfs
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Plamen Jeliazkov
 Fix For: 2.0.3-alpha

 Attachments: HDFS-3598.trunk.docAndHttpFS.patch, 
 HDFS-4456.trunk.docAndHttpFS.patch, HDFS-4456.trunk.docAndHttpFS.patch, 
 HDFS-4456.trunk.genericsRemoval.patch, HDFS-4456.trunk.patch, 
 HDFS-4456.trunk.patch


 HDFS-3598 adds the concat feature to WebHDFS.  The REST API should be updated 
 accordingly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4197) SNN JMX is lacking checkpoint info

2013-02-02 Thread Daisuke Kobayashi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569561#comment-13569561
 ] 

Daisuke Kobayashi commented on HDFS-4197:
-

Just found HDFS-3409, and this is a dup of it.  Does it need to expose the info on the 2NN?

 SNN JMX is lacking checkpoint info
 --

 Key: HDFS-4197
 URL: https://issues.apache.org/jira/browse/HDFS-4197
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.2-alpha
Reporter: Andy Isaacson
Assignee: Daisuke Kobayashi
  Labels: newbie

 The SecondaryNameNode status.jsp page contains the following:
 {noformat}
 SecondaryNameNode Status
 Name Node Address: snn1/172.29.122.91:50020
 Start Time   : Fri Nov 09 09:25:29 PST 2012
 Last Checkpoint Time : Thu Nov 15 15:35:06 PST 2012
 Checkpoint Period: 30 seconds
 Checkpoint Size  : 39.06 KB (= 40000 bytes)
 Checkpoint Dirs  : [file:///tmp/hdfs-adi/dfs/namesecondary]
 Checkpoint Edits Dirs: [file:///tmp/hdfs-adi/dfs/namesecondary]
 {noformat}
 The JMX page at {{:50090/jmx}} should also provide this info; at the very 
 least the Last Checkpoint Time and Checkpoint Size so that users are not 
 tempted to scrape the {{status.jsp}} output.  Perhaps I'm missing it but 
 these data seem to be missing.
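 A minimal sketch of scraping the JMX servlet instead of status.jsp, assuming 
 default ports; the attribute names in the comment (LastCheckpointTime, 
 CheckpointSize) are hypothetical, since exposing them is exactly what this 
 jira asks for:
 {code}
 import java.io.BufferedReader;
 import java.io.InputStreamReader;
 import java.net.URL;

 public class SnnJmxProbe {
   public static void main(String[] args) throws Exception {
     // The JMX JSON servlet; the qry parameter narrows the beans returned.
     URL url = new URL("http://snn1:50090/jmx?qry=Hadoop:*");
     BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
     for (String line; (line = in.readLine()) != null; ) {
       System.out.println(line);  // grep for LastCheckpointTime / CheckpointSize
     }
     in.close();
   }
 }
 {code}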

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4414) Create a DiffReport class to represent the diff between snapshots to end users

2013-02-02 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569634#comment-13569634
 ] 

Suresh Srinivas commented on HDFS-4414:
---

Comments:
# Remove the unnecessary import change in DFSClient.java.
# In DFSClient.java and DistributedFileSystem.java, point to the documentation 
in ClientProtocol#getSnapshotDiffReport() instead of repeating the same 
javadoc.
# ClientProtocol.java - The methods getSnapshotDiffReport, allowSnapshot, and 
disallowSnapshot should document the specific exceptions thrown by the method 
and the conditions in which they are thrown. This is unrelated to the change in 
this patch and can be done as a separate jira.
# The SnapshotDiffReport class should be in the o.a.h.protocol package.
# INodeDirectoryWithSnapshot - javadoc typo: fromEarlierSnapshot -> fromEarlier
# SnapshotDiffReport.java - lineSepearator could be a static final variable.
# We should create another jira to convert the current implementation into an 
iterative report. Otherwise, dealing with a large set of changes in a single 
response will result in issues. This should work similarly to the iterative ls 
operation.

+1 with these changes.

 Create a DiffReport class to represent the diff between snapshots to end users
 --

 Key: HDFS-4414
 URL: https://issues.apache.org/jira/browse/HDFS-4414
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-4414.001.patch, HDFS-4414.003.patch, 
 HDFS-4414.004.patch, HDFS-4414+4131.002.patch


 HDFS-4131 computes the difference between two snapshots (or between a 
 snapshot and the current tree). In this jira we create a DiffReport class to 
 represent the diff to end users.
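 A minimal usage sketch, assuming the DistributedFileSystem#getSnapshotDiffReport 
 signature this work adds and that SnapshotDiffReport lands in 
 o.a.h.hdfs.protocol (per the review comments above); snapshot names are 
 placeholders:
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.hdfs.DistributedFileSystem;
 import org.apache.hadoop.hdfs.protocol.SnapshotDiffReport;

 public class SnapshotDiffExample {
   public static void main(String[] args) throws Exception {
     // Assumes fs.defaultFS points at an HDFS cluster with snapshots enabled.
     DistributedFileSystem dfs =
         (DistributedFileSystem) FileSystem.get(new Configuration());
     Path dir = new Path("/user/alice/data");
     SnapshotDiffReport report = dfs.getSnapshotDiffReport(dir, "s1", "s2");
     System.out.println(report);  // one line per created/modified/deleted entry
   }
 }
 {code}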

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4414) Add support for getting snapshot diff from DistributedFileSystem

2013-02-02 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4414:
--

Summary: Add support for getting snapshot diff from DistributedFileSystem  
(was: Create a DiffReport class to represent the diff between snapshots to end 
users)

 Add support for getting snapshot diff from DistributedFileSystem
 

 Key: HDFS-4414
 URL: https://issues.apache.org/jira/browse/HDFS-4414
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-4414.001.patch, HDFS-4414.003.patch, 
 HDFS-4414.004.patch, HDFS-4414+4131.002.patch


 HDFS-4131 computes the difference between two snapshots (or between a 
 snapshot and the current tree). In this jira we create a DiffReport class to 
 represent the diff to end users.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4414) Add support for getting snapshot diff from DistributedFileSystem

2013-02-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-4414:


Attachment: HDFS-4414.005.patch

Thanks for the comments, Suresh! Uploaded the new patch. Will create jiras to 
address 3 and 7.

 Add support for getting snapshot diff from DistributedFileSystem
 

 Key: HDFS-4414
 URL: https://issues.apache.org/jira/browse/HDFS-4414
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-4414.001.patch, HDFS-4414.003.patch, 
 HDFS-4414.004.patch, HDFS-4414.005.patch, HDFS-4414+4131.002.patch


 HDFS-4131 computes the difference between two snapshots (or between a 
 snapshot and the current tree). In this jira we create a DiffReport class to 
 represent the diff to end users.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4414) Add support for getting snapshot diff from DistributedFileSystem

2013-02-02 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-4414.
---

   Resolution: Fixed
Fix Version/s: Snapshot (HDFS-2802)
 Hadoop Flags: Reviewed

Committed the patch to HDFS-2802 branch. Thank you Jing!

 Add support for getting snapshot diff from DistributedFileSystem
 

 Key: HDFS-4414
 URL: https://issues.apache.org/jira/browse/HDFS-4414
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: Snapshot (HDFS-2802)

 Attachments: HDFS-4414.001.patch, HDFS-4414.003.patch, 
 HDFS-4414.004.patch, HDFS-4414.005.patch, HDFS-4414+4131.002.patch


 HDFS-4131 computes the difference between two snapshots (or between a 
 snapshot and the current tree). In this jira we create a DiffReport class to 
 represent the diff to end users.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4350) Make enabling of stale marking on read and write paths independent

2013-02-02 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569653#comment-13569653
 ] 

Suresh Srinivas commented on HDFS-4350:
---

+1 for the branch-1 patch as well.

 Make enabling of stale marking on read and write paths independent
 --

 Key: HDFS-4350
 URL: https://issues.apache.org/jira/browse/HDFS-4350
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-4350-1.patch, hdfs-4350-2.patch, hdfs-4350-3.patch, 
 hdfs-4350-4.patch, hdfs-4350-5.patch, hdfs-4350-6.patch, hdfs-4350-7.patch, 
 hdfs-4350-branch-1-1.patch, hdfs-4350-branch-1-2.patch, 
 hdfs-4350-branch-1-3.patch, hdfs-4350.txt


 Marking of datanodes as stale for the read and write path was introduced in 
 HDFS-3703 and HDFS-3912 respectively. This is enabled using two new keys, 
 {{DFS_NAMENODE_CHECK_STALE_DATANODE_KEY}} and 
 {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY}}. However, there currently 
 exists a dependency: you cannot enable write marking without also 
 enabling read marking, because the first key enables both the staleness check 
 and read marking.
 I propose renaming the first key to 
 {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY}} and making the check enabled 
 if either of the keys is set. This will allow read and write marking to be 
 enabled independently.
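 A minimal sketch of toggling the two paths independently after the rename (the 
 read key name comes from this jira's release note; the write key name is 
 assumed from HDFS-3912, so verify against hdfs-default.xml):
 {code}
 import org.apache.hadoop.conf.Configuration;

 public class StaleConfigSketch {
   public static void main(String[] args) {
     Configuration conf = new Configuration();
     // After the rename, read and write marking can be set independently:
     conf.setBoolean("dfs.namenode.avoid.read.stale.datanode", true);   // read path
     conf.setBoolean("dfs.namenode.avoid.write.stale.datanode", true);  // write path
     System.out.println(
         conf.getBoolean("dfs.namenode.avoid.read.stale.datanode", false));
   }
 }
 {code}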

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4350) Make enabling of stale marking on read and write paths independent

2013-02-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569654#comment-13569654
 ] 

Hudson commented on HDFS-4350:
--

Integrated in Hadoop-trunk-Commit #3315 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3315/])
HDFS-4350. Make enabling of stale marking on read and write paths 
independent. Contributed by Andrew Wang. (Revision 1441819)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1441819
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/metrics/TestNameNodeMetrics.java


 Make enabling of stale marking on read and write paths independent
 --

 Key: HDFS-4350
 URL: https://issues.apache.org/jira/browse/HDFS-4350
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-4350-1.patch, hdfs-4350-2.patch, hdfs-4350-3.patch, 
 hdfs-4350-4.patch, hdfs-4350-5.patch, hdfs-4350-6.patch, hdfs-4350-7.patch, 
 hdfs-4350-branch-1-1.patch, hdfs-4350-branch-1-2.patch, 
 hdfs-4350-branch-1-3.patch, hdfs-4350.txt


 Marking of datanodes as stale for the read and write path was introduced in 
 HDFS-3703 and HDFS-3912 respectively. This is enabled using two new keys, 
 {{DFS_NAMENODE_CHECK_STALE_DATANODE_KEY}} and 
 {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY}}. However, there currently 
 exists a dependency: you cannot enable write marking without also 
 enabling read marking, because the first key enables both the staleness check 
 and read marking.
 I propose renaming the first key to 
 {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY}} and making the check enabled 
 if either of the keys is set. This will allow read and write marking to be 
 enabled independently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3703) Decrease the datanode failure detection time

2013-02-02 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-3703:
--

Fix Version/s: (was: 2.0.3-alpha)
   (was: 3.0.0)
   2.0.2-alpha

 Decrease the datanode failure detection time
 

 Key: HDFS-3703
 URL: https://issues.apache.org/jira/browse/HDFS-3703
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 1.0.3, 2.0.0-alpha, 3.0.0
Reporter: nkeywal
Assignee: Jing Zhao
 Fix For: 1.1.0, 2.0.2-alpha

 Attachments: 3703-hadoop-1.0.txt, 
 HDFS-3703-branch-1.1-read-only.patch, HDFS-3703-branch-1.1-read-only.patch, 
 HDFS-3703-branch2.patch, HDFS-3703.patch, HDFS-3703-trunk-read-only.patch, 
 HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch, 
 HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch, 
 HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch, 
 HDFS-3703-trunk-with-write.patch


 By default, if a box dies, the datanode will be marked as dead by the 
 namenode after 10:30 minutes. In the meantime, this datanode will still be 
 proposed by the namenode to write blocks or to read replicas. It happens as 
 well if the datanode crashes: there are no shutdown hooks to tell the namenode 
 we're not there anymore.
 It is especially an issue with HBase. The HBase regionserver timeout for 
 production is often 30s. So with these configs, when a box dies HBase starts 
 to recover after 30s while, for 10 minutes, the namenode will consider the 
 blocks on the same box as available. Beyond the write errors, this will 
 trigger a lot of missed reads:
 - during the recovery, HBase needs to read the blocks used on the dead box 
 (the ones in the 'HBase Write-Ahead-Log')
 - after the recovery, reading these data blocks (the 'HBase region') will 
 fail 33% of the time with the default number of replicas, slowing the data 
 access, especially when the errors are socket timeouts (i.e. around 60s most 
 of the time). 
 Globally, it would be ideal if the HDFS failure-detection settings could be 
 kept under the HBase ones. 
 As a side note, HBase relies on ZooKeeper to detect regionserver issues.
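 For reference, the 10:30 figure follows from the usual dead-node formula, 
 assuming the default 3 s heartbeat and 5 min recheck interval:
 {noformat}
 2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval
   = 2 * 300 s + 10 * 3 s = 630 s = 10 min 30 s
 {noformat}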

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3703) Decrease the datanode failure detection time

2013-02-02 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-3703:
--

Fix Version/s: (was: 2.0.2-alpha)
   2.0.3-alpha

 Decrease the datanode failure detection time
 

 Key: HDFS-3703
 URL: https://issues.apache.org/jira/browse/HDFS-3703
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 1.0.3, 2.0.0-alpha, 3.0.0
Reporter: nkeywal
Assignee: Jing Zhao
 Fix For: 1.1.0, 2.0.3-alpha

 Attachments: 3703-hadoop-1.0.txt, 
 HDFS-3703-branch-1.1-read-only.patch, HDFS-3703-branch-1.1-read-only.patch, 
 HDFS-3703-branch2.patch, HDFS-3703.patch, HDFS-3703-trunk-read-only.patch, 
 HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch, 
 HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch, 
 HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch, 
 HDFS-3703-trunk-with-write.patch


 By default, if a box dies, the datanode will be marked as dead by the 
 namenode after 10:30 minutes. In the meantime, this datanode will still be 
 proposed by the namenode to write blocks or to read replicas. It happens as 
 well if the datanode crashes: there are no shutdown hooks to tell the namenode 
 we're not there anymore.
 It is especially an issue with HBase. The HBase regionserver timeout for 
 production is often 30s. So with these configs, when a box dies HBase starts 
 to recover after 30s while, for 10 minutes, the namenode will consider the 
 blocks on the same box as available. Beyond the write errors, this will 
 trigger a lot of missed reads:
 - during the recovery, HBase needs to read the blocks used on the dead box 
 (the ones in the 'HBase Write-Ahead-Log')
 - after the recovery, reading these data blocks (the 'HBase region') will 
 fail 33% of the time with the default number of replicas, slowing the data 
 access, especially when the errors are socket timeouts (i.e. around 60s most 
 of the time). 
 Globally, it would be ideal if the HDFS failure-detection settings could be 
 kept under the HBase ones. 
 As a side note, HBase relies on ZooKeeper to detect regionserver issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4350) Make enabling of stale marking on read and write paths independent

2013-02-02 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-4350:
--

  Resolution: Fixed
   Fix Version/s: 2.0.3-alpha
  1.2.0
Target Version/s: 1.2.0, 2.0.3-alpha  (was: 1.2.0, 3.0.0)
Release Note: 
This patch makes an incompatible configuration change, as described below:
In release 1.1.0 and other 1.1.x point releases, the configuration parameter 
dfs.namenode.check.stale.datanode could be used to turn on checking for 
stale nodes. This configuration is no longer supported from release 1.2.0 
onwards and is renamed dfs.namenode.avoid.read.stale.datanode. 

How the feature works and how to configure it:
As described in the HDFS-3703 release notes, the datanode stale period can be 
configured using the parameter dfs.namenode.stale.datanode.interval in seconds 
(default value is 30 seconds). The NameNode can be configured to use this 
staleness information for reads via 
dfs.namenode.avoid.read.stale.datanode. When this parameter is set to true, the 
namenode picks a stale datanode as the last target to read from when returning 
block locations for reads. Using staleness information for writes is as 
described in the release notes of HDFS-3912.

Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)
  Status: Resolved  (was: Patch Available)

I committed the patch to trunk, branch-2 and branch-1.

Thank you Andrew!

 Make enabling of stale marking on read and write paths independent
 --

 Key: HDFS-4350
 URL: https://issues.apache.org/jira/browse/HDFS-4350
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 1.2.0, 2.0.3-alpha

 Attachments: hdfs-4350-1.patch, hdfs-4350-2.patch, hdfs-4350-3.patch, 
 hdfs-4350-4.patch, hdfs-4350-5.patch, hdfs-4350-6.patch, hdfs-4350-7.patch, 
 hdfs-4350-branch-1-1.patch, hdfs-4350-branch-1-2.patch, 
 hdfs-4350-branch-1-3.patch, hdfs-4350.txt


 Marking of datanodes as stale for the read and write path was introduced in 
 HDFS-3703 and HDFS-3912 respectively. This is enabled using two new keys, 
 {{DFS_NAMENODE_CHECK_STALE_DATANODE_KEY}} and 
 {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY}}. However, there currently 
 exists a dependency: you cannot enable write marking without also 
 enabling read marking, because the first key enables both the staleness check 
 and read marking.
 I propose renaming the first key to 
 {{DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_READ_KEY}} and making the check enabled 
 if either of the keys is set. This will allow read and write marking to be 
 enabled independently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4343) When the storageID of dfs.data.dir is inconsistent, restarting the datanode will fail.

2013-02-02 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-4343.
---

Resolution: Invalid

Closing this jira as Invalid. The behavior in the description is the expected 
behavior. When storage directories are in an inconsistent state, invalid or 
stale directories are expected to be cleaned manually before restarting a 
datanode.

If you disagree, feel free to reopen the jira. Please justify why you think it 
is a valid jira when you reopen it.

 When the storageID of dfs.data.dir is inconsistent, restarting the datanode 
 will fail.
 ---

 Key: HDFS-4343
 URL: https://issues.apache.org/jira/browse/HDFS-4343
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
 Environment: namenode datanode 
Reporter: liuyang
 Attachments: hadoop-root-datanode-167-52-0-55.log, VERSION-1, 
 VERSION-2


 A datanode has multiple storage directories configured using dfs.data.dir. 
 When the storageIDs in the VERSION files in these directories differ, the 
 datanode fails to start up. Consider a scenario where old data in a storage 
 directory is not cleared: the storage ID from it will not match the storage 
 ID in the other storage directories. In this situation, the DataNode will 
 quit and the restart fails.
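 For illustration, a datanode VERSION file looks roughly like this (values made 
 up; the exact field set varies by release); the storageID line must agree 
 across all dfs.data.dir entries:
 {noformat}
 namespaceID=123456789
 storageID=DS-1073741825-167.52.0.55-50010-1355271574951
 cTime=0
 storageType=DATA_NODE
 layoutVersion=-40
 {noformat}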

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3374) hdfs' TestDelegationToken fails intermittently with a race condition

2013-02-02 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569658#comment-13569658
 ] 

Suresh Srinivas commented on HDFS-3374:
---

bq. I will upload a branch-1 patch to remove the synchronization in 
ExpiredTokenRemover.run().
Can you please do this in a separate jira?

 hdfs' TestDelegationToken fails intermittently with a race condition
 

 Key: HDFS-3374
 URL: https://issues.apache.org/jira/browse/HDFS-3374
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 1.0.3

 Attachments: HDFS-3374-branch-1.0.patch, hdfs-3374.patch, 
 HDFS-3374.patch, HDFS-3374.trunk.patch


 The testcase is failing because the MiniDFSCluster is shutdown before the 
 secret manager can change the key, which calls system.exit with no edit 
 streams available.
 {code}
 [junit] 2012-05-04 15:03:51,521 WARN  common.Storage 
 (FSImage.java:updateRemovedDirs(224)) - Removing storage dir 
 /home/horton/src/hadoop/build/test/data/dfs/name1
 [junit] 2012-05-04 15:03:51,522 FATAL namenode.FSNamesystem 
 (FSEditLog.java:fatalExit(388)) - No edit streams are accessible
 [junit] java.lang.Exception: No edit streams are accessible
 [junit] at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.fatalExit(FSEditLog.java:388)
 [junit] at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.exitIfNoStreams(FSEditLog.java:407)
 [junit] at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.removeEditsAndStorageDir(FSEditLog.java:432)
 [junit] at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.removeEditsStreamsAndStorageDirs(FSEditLog.java:468)
 [junit] at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:1028)
 [junit] at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logUpdateMasterKey(FSNamesystem.java:5641)
 [junit] at 
 org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logUpdateMasterKey(DelegationTokenSecretManager.java:286)
 [junit] at 
 org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey(AbstractDelegationTokenSecretManager.java:150)
 [junit] at 
 org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.rollMasterKey(AbstractDelegationTokenSecretManager.java:174)
 [junit] at 
 org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:385)
 [junit] at java.lang.Thread.run(Thread.java:662)
 [junit] Running org.apache.hadoop.hdfs.security.TestDelegationToken
 [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
 [junit] Test org.apache.hadoop.hdfs.security.TestDelegationToken FAILED 
 (crashed)
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3771) Namenode can't restart due to corrupt edit logs, timing issue with shutdown and edit log rolling

2013-02-02 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3771:
--

Affects Version/s: (was: 2.0.0-alpha)

This isn't needed in 2.x - perhaps the 0.23.x maintainers want to keep this 
open for 0.23.x? Otherwise feel free to close. (I removed the 2.x affects 
version)

 Namenode can't restart due to corrupt edit logs, timing issue with shutdown 
 and edit log rolling
 

 Key: HDFS-3771
 URL: https://issues.apache.org/jira/browse/HDFS-3771
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.3
 Environment: QE, 20 node Federated cluster with 3 NNs and 15 DNs, 
 using Kerberos based security
Reporter: patrick white
Priority: Critical

 Our 0.23.3 nightly HDFS regression suite encountered a particularly nasty 
 issue recently, which resulted in the cluster's default Namenode being unable 
 to restart; this was on a 20 node Federated cluster with security. The cause 
 appears to be that the NN was just starting to roll its edit log when a 
 shutdown occurred; the shutdown was intentional, to restart the cluster as 
 part of an automated test.
 The tests that were running do not appear to be the issue in themselves; the 
 cluster was just wrapping up an adminReport subset, and this failure case has 
 not reproduced so far, nor was it failing previously. It looks like a chance 
 occurrence of sending the shutdown just as the edit log roll had begun.
 From the NN log, the following sequence is noted:
 1. an InvalidateBlocks operation had completed
 2. FSNamesystem: Roll Edit Log from [Secondary Namenode IPaddr]
 3. FSEditLog: Ending log segment 23963
 4. FSEditLog: Starting log segment at 23967
 5. NameNode: SHUTDOWN_MSG
 = the NN shuts down and then is restarted...
 6. FSImageTransactionalStorageInspector: Logs beginning at txid 23967 were 
 all in-progress
 7. FSImageTransactionalStorageInspector: Marking log at 
 /grid/[PATH]/edits_inprogress_0023967 as corrupt since it has no 
 transactions in it.
 8. NameNode: Exception in namenode join 
 [main]java.lang.IllegalStateException: No non-corrupt logs for txid 23967
 = NN start attempts continue to cycle trying to restart but can't, failing 
 on the same exception due to the lack of non-corrupt edit logs
 If the observations are correct and the issue comes from a shutdown happening 
 as the edit logs are rolling, does the NN have an equivalent to the 
 conventional fs 'sync' blocking action that should be called, or perhaps has 
 a timing hole?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira