[jira] Commented: (HDFS-63) Datanode stops cleaning disk space

2010-12-13 Thread Prateek Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970798#action_12970798
 ] 

Prateek Sharma commented on HDFS-63:


We are experiencing the same issue on Hadoop 0.18.2. Over time, one of the 
datanodes fills up its disk, but after a restart of the process the disk 
usage evens out. Does the issue still persist in the newer 0.20 releases?

> Datanode stops cleaning disk space
> --
>
> Key: HDFS-63
> URL: https://issues.apache.org/jira/browse/HDFS-63
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Linux
>Reporter: Igor Bolotin
>Priority: Critical
>
> Here is the situation: a DFS cluster running Hadoop version 0.19.0. The 
> cluster is running on multiple servers with practically identical hardware. 
> Everything works perfectly well, except for one thing: from time to time, one 
> of the datanodes (every time it's a different node) starts to consume more 
> and more disk space. The node keeps going, and if we don't do anything, it 
> runs out of space completely (ignoring the 20GB reserved-space setting). 
> Once restarted, it cleans up disk space rapidly and goes back to approximately 
> the same utilization as the rest of the datanodes in the cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970936#action_12970936
 ] 

Mahadev konar commented on HDFS-1526:
-

I was just thinking about this: would it be better for log parsing if both 
DFSClient names looked the same, meaning with and without mapreduce? 

Something like:

DFSClient_applicationid_randomint_threadid (where applicationid = 
mapred.task.id, or else "null" or some other constant).

I don't know much about the dfsclient code, so I don't know whether this would 
be useful or not.
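
A minimal sketch of this naming scheme in plain Java (the class and method 
names are illustrative, not the actual DFSClient code; "NONMAPREDUCE" is the 
placeholder constant discussed later in this thread):

{code}
import java.util.Random;

public class ClientNameSketch {
  private static final Random RAND = new Random();

  // Builds a name of the form DFSClient_<applicationid>_<randomint>_<threadid>.
  // applicationId would be mapred.task.id inside a map/reduce task, or an
  // agreed-upon constant for every other client.
  static String newClientName(String applicationId) {
    if (applicationId == null) {
      applicationId = "NONMAPREDUCE"; // assumed placeholder for non-MR clients
    }
    return "DFSClient_" + applicationId
        + "_" + RAND.nextInt(Integer.MAX_VALUE)
        + "_" + Thread.currentThread().getId();
  }
}
{code}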



> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.
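
A hedged sketch of the lease behavior described above, assuming (purely for 
illustration) a lease table keyed by client name; the class and method names 
are hypothetical, not the actual NameNode code. Two writers sharing one name 
both pass the check:

{code}
import java.util.HashMap;
import java.util.Map;

class LeaseCheckSketch {
  // path -> client name currently holding the write lease
  private final Map<String, String> holders = new HashMap<String, String>();

  // create/delete/addBlock would all funnel through a check like this one
  void checkLease(String path, String clientName) {
    String holder = holders.get(path);
    if (holder != null && !holder.equals(clientName)) {
      // a second writer with a *different* name would be rejected here
      throw new IllegalStateException(path + " is leased by " + holder);
    }
    // T1 and T2 both present the map task id, so neither ever conflicts
    holders.put(path, clientName);
  }
}
{code}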

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970970#action_12970970
 ] 

dhruba borthakur commented on HDFS-1526:


I agree with Mahadev; we could follow his suggestion.

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1536) Improve HDFS WebUI

2010-12-13 Thread Hairong Kuang (JIRA)
Improve HDFS WebUI
--

 Key: HDFS-1536
 URL: https://issues.apache.org/jira/browse/HDFS-1536
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Fix For: 0.23.0


1. Make the missing blocks count accurate;
2. Exclude missing blocks from the under-replicated blocks count.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1536) Improve HDFS WebUI

2010-12-13 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1536:


Attachment: missingBlocksWebUI.patch

The patch adds a separate queue in neededReplication for blocks that have 
zero replicas. The missing blocks count is set to the size of this queue, 
while the under-replicated blocks count equals the number of the remaining 
under-replicated blocks.

> Improve HDFS WebUI
> --
>
> Key: HDFS-1536
> URL: https://issues.apache.org/jira/browse/HDFS-1536
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: missingBlocksWebUI.patch
>
>
> 1. Make the missing blocks count accurate;
> 2. Exclude missing blocks from the under-replicated blocks count.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971062#action_12971062
 ] 

Hairong Kuang commented on HDFS-1526:
-

Which constant string should we choose? "null" does not seem user friendly. 
How about an empty string or "NONMAPREDUCE"?

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971066#action_12971066
 ] 

Mahadev konar commented on HDFS-1526:
-

Agreed, NONMAPREDUCE seems fine (or you could use MAHADEV, just kidding!!  :) )

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-1526:


Attachment: randClientId3.patch

Here is a patch that uses "NONMAPREDUCE" as the app name for dfs clients that 
are not mapreduce tasks. (MAHADEV sounds good to me too. :)

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch, randClientId3.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated HDFS-1526:


Status: Open  (was: Patch Available)

Resubmitting for Hudson.

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch, randClientId3.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971083#action_12971083
 ] 

Mahadev konar commented on HDFS-1526:
-

+1 the patch looks good. 

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch, randClientId3.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated HDFS-1526:


Status: Patch Available  (was: Open)

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch, randClientId3.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971113#action_12971113
 ] 

Hairong Kuang commented on HDFS-1526:
-

 [exec] -1 overall.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] -1 tests included.  The patch doesn't appear to include any new or 
 [exec] modified tests.
 [exec] Please justify why no new tests are needed for this patch.
 [exec] Also please list what manual steps were performed to verify this patch.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of 
 [exec] javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
 [exec] (version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total 
 [exec] number of release audit warnings.
 [exec] 
 [exec] +1 system test framework.  The patch passed system test framework 
 [exec] compile.

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch, randClientId3.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1115) DFSClient unable to create new block

2010-12-13 Thread sanford rockowitz (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971176#action_12971176
 ] 

sanford rockowitz commented on HDFS-1115:
-

We hit a similar problem under openSUSE 11.3 and solved it by installing an 
Oracle JDK and ensuring the JAVA_HOME environment variable was set properly 
for the daemons as well as the client.

> DFSClient unable to create new block
> 
>
> Key: HDFS-1115
> URL: https://issues.apache.org/jira/browse/HDFS-1115
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.20.2
> Environment: OpenSuse 11.2 running as a Virtual Machine on Windows 
> Vista
>Reporter: manas
>
> Here, input is a folder containing all .xml files from ./conf  
> Then trying the command:
> ./bin/hadoop fs -copyFromLocal input input
> The following message is displayed: 
> {noformat}
> INFO hdfs.DFSClient: Exception in createBlockOutputStream 
> java.net.SocketException: Operation not supported
> INFO hdfs.DFSClient: Abandoning block blk_-1884214035513073759_1010
> INFO hdfs.DFSClient: Exception in createBlockOutputStream 
> java.net.SocketException: Protocol not available
> INFO hdfs.DFSClient: Abandoning block blk_5533397873275401028_1010
> INFO hdfs.DFSClient: Exception in createBlockOutputStream 
> java.net.SocketException: Protocol not available
> INFO hdfs.DFSClient: Abandoning block blk_-237603871573204731_1011
> INFO hdfs.DFSClient: Exception in createBlockOutputStream 
> java.net.SocketException: Protocol not available
> INFO hdfs.DFSClient: Abandoning block blk_-8668593183126057334_1011
> WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to 
> create new block.
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
> WARN hdfs.DFSClient: Error Recovery for block blk_-8668593183126057334_1011 
> bad datanode[0] nodes == null
> WARN hdfs.DFSClient: Could not get block locations. Source file 
> "/user/max/input/core-site.xml" - Aborting...
> copyFromLocal: Protocol not available
> ERROR hdfs.DFSClient: Exception closing file /user/max/input/core-site.xml : 
> java.net.SocketException: Protocol not available
> java.net.SocketException: Protocol not available
> at sun.nio.ch.Net.getIntOption0(Native Method)
> at sun.nio.ch.Net.getIntOption(Net.java:178)
> at sun.nio.ch.SocketChannelImpl$1.getInt(SocketChannelImpl.java:419)
> at sun.nio.ch.SocketOptsImpl.getInt(SocketOptsImpl.java:60)
> at sun.nio.ch.SocketOptsImpl.sendBufferSize(SocketOptsImpl.java:156)
> at 
> sun.nio.ch.SocketOptsImpl$IP$TCP.sendBufferSize(SocketOptsImpl.java:286)
> at sun.nio.ch.OptionAdaptor.getSendBufferSize(OptionAdaptor.java:129)
> at sun.nio.ch.SocketAdaptor.getSendBufferSize(SocketAdaptor.java:328)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2873)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
> INFO hdfs.DFSClient: Exception in createBlockOutputStream 
> java.net.SocketException: Operation not supported
> INFO hdfs.DFSClient: Abandoning block blk_-1884214035513073759_1010
> INFO hdfs.DFSClient: Exception in createBlockOutputStream 
> java.net.SocketException: Protocol not available
> INFO hdfs.DFSClient: Abandoning block blk_5533397873275401028_1010
> INFO hdfs.DFSClient: Exception in createBlockOutputStream 
> java.net.SocketException: Protocol not available
> INFO hdfs.DFSClient: Abandoning block blk_-237603871573204731_1011
> INFO hdfs.DFSClient: Exception in createBlockOutputStream 
> java.net.SocketException: Protocol not available
> INFO hdfs.DFSClient: Abandoning block blk_-8668593183126057334_1011
> WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to 
> create new block.
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
> WARN hdfs.DFSClient: Error Recovery for block blk_-8668593183126057334_1011 
> bad datanode[0] nodes == null
> WARN hdfs.DFSClient: Could not get block locations. Source file 
> "/user/max/input/core-site.xml" - Aborting...
> {noformat}

[jira] Commented: (HDFS-1536) Improve HDFS WebUI

2010-12-13 Thread Nigel Daley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971180#action_12971180
 ] 

Nigel Daley commented on HDFS-1536:
---

Hairong, can you add a tooltip that describes the new meaning of this value in 
context?  Something like:

{code}
private String colTxt(String title) {
  return " <td title=\"" + title + "\"> ";
}
...
colTxt("Excludes missing blocks.")
{code}

We should probably do this for other fields too, but that's a separate jira.

> Improve HDFS WebUI
> --
>
> Key: HDFS-1536
> URL: https://issues.apache.org/jira/browse/HDFS-1536
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: missingBlocksWebUI.patch
>
>
> 1. Make the missing blocks count accurate;
> 2. Exclude missing blocks from the under-replicated blocks count.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1526) Dfs client name for a map/reduce task should have some randomness

2010-12-13 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971187#action_12971187
 ] 

Hairong Kuang commented on HDFS-1526:
-

All unit tests passed except for the known failures.

> Dfs client name for a map/reduce task should have some randomness
> -
>
> Key: HDFS-1526
> URL: https://issues.apache.org/jira/browse/HDFS-1526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.23.0
>
> Attachments: clientName.patch, randClientId1.patch, 
> randClientId2.patch, randClientId3.patch
>
>
> Fsck shows that one of the files in our dfs cluster is corrupt.
> /bin/hadoop fsck aFile -files -blocks -locations
> aFile: 4633 bytes, 2 block(s): 
> aFile: CORRUPT block blk_-4597378336099313975
> OK
> 0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
> 1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
> Status: CORRUPT
> On disk, these two blocks have the same size and the same content. It turns 
> out the file was written by a multi-threaded map task, where each thread may 
> write to the same file. One possible interleaving of two threads that could 
> make this happen:
> [T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to 
> aFile] [T2: addBlock 1 to aFile] ...
> Because T1 and T2 have the same client name, which is the map task id, these 
> interactions can complete without any lease exception, eventually leading to 
> a corrupt file. To solve the problem, a mapreduce task's client name could be 
> formed from its task id followed by a random number.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-981) test-contrib fails due to test-cactus failure

2010-12-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971193#action_12971193
 ] 

Todd Lipcon commented on HDFS-981:
--

This seems to be the case again: the Apache mirror we use only keeps the 
latest version, not back versions.

> test-contrib fails due to test-cactus failure
> -
>
> Key: HDFS-981
> URL: https://issues.apache.org/jira/browse/HDFS-981
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: contrib/hdfsproxy
>Affects Versions: 0.22.0
>Reporter: Eli Collins
> Fix For: 0.22.0
>
>
> Relevant output from a recent run 
> http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/232/console
>  [exec] BUILD FAILED
>  [exec] 
> /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/build.xml:568:
>  The following error occurred while executing this line:
>  [exec] 
> /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/src/contrib/build.xml:48:
>  The following error occurred while executing this line:
>  [exec] 
> /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/src/contrib/hdfsproxy/build.xml:292:
>  org.codehaus.cargo.container.ContainerException: Failed to download 
> [http://apache.osuosl.org/tomcat/tomcat-6/v6.0.18/bin/apache-tomcat-6.0.18.zip]

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.