[jira] Created: (HDFS-770) SocketTimeoutException: timeout while waiting for channel to be ready for read

2009-11-13 Thread Leon Mergen (JIRA)
SocketTimeoutException: timeout while waiting for channel to be ready for read
--

 Key: HDFS-770
 URL: https://issues.apache.org/jira/browse/HDFS-770
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs, data-node, hdfs client, name-node
Affects Versions: 0.20.1
 Environment: Ubuntu Linux 8.04
Reporter: Leon Mergen
 Attachments: client.txt, datanode.txt, namenode.txt

We're having issues with timeouts occurring in our client: for some reason, a 
timeout of 63000 milliseconds is triggered while writing HDFS data. Since we 
currently have a single-server setup, this results in our client terminating 
with an "All datanodes are bad" IOException.

We're running all services, including the client, on our single server, so it 
cannot be a network error. The load on the client is extremely low during this 
period: only a few kilobytes a minute were being written around the time the 
error occurred. 

After browsing a bit online, a lot of people suggest setting 
dfs.datanode.socket.write.timeout to 0 as a solution for this problem. Given 
the low load on our system during this period, however, I do feel this is a 
real error and a timeout that should not be occurring. I have attached 3 logs 
of the namenode, datanode and client.

It could be that this is related to 
http://issues.apache.org/jira/browse/HDFS-693

Any pointers on how I can assist in resolving this issue will be greatly 
appreciated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-770) SocketTimeoutException: timeout while waiting for channel to be ready for read

2009-11-13 Thread Leon Mergen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leon Mergen updated HDFS-770:
-

Attachment: client.txt
namenode.txt
datanode.txt

 SocketTimeoutException: timeout while waiting for channel to be ready for read
 --

 Key: HDFS-770
 URL: https://issues.apache.org/jira/browse/HDFS-770
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs, data-node, hdfs client, name-node
Affects Versions: 0.20.1
 Environment: Ubuntu Linux 8.04
Reporter: Leon Mergen
 Attachments: client.txt, datanode.txt, namenode.txt


 We're having issues with timeouts occurring in our client: for some reason, a 
 timeout of 63000 milliseconds is triggered while writing HDFS data. Since we 
 currently have a single-server setup, this results in our client terminating 
 with an "All datanodes are bad" IOException.
 We're running all services, including the client, on our single server, so it 
 cannot be a network error. The load on the client is extremely low during 
 this period: only a few kilobytes a minute were being written around the time 
 the error occurred. 
 After browsing a bit online, a lot of people suggest setting 
 dfs.datanode.socket.write.timeout to 0 as a solution for this problem. Given 
 the low load on our system during this period, however, I do feel this is a 
 real error and a timeout that should not be occurring. I have attached 3 
 logs of the namenode, datanode and client.
 It could be that this is related to 
 http://issues.apache.org/jira/browse/HDFS-693
 Any pointers on how I can assist in resolving this issue will be greatly 
 appreciated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-763:
--

Status: Open  (was: Patch Available)

 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 
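
A minimal sketch of the proposed check (hedged: only 
DataBlockScanner.totalScanErrors and dataset.getFile(block) come from the 
description above; the method shape and other names are assumptions):
{code}
// Sketch only: count a scan error only if the block file still exists;
// a null from dataset.getFile(block) means the block was deleted mid-scan.
private void recordScanFailure(Block block) {
  if (dataset.getFile(block) == null) {
    return; // deleted block, not a real CRC mismatch
  }
  totalScanErrors++;
}
{code}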

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-763:
--

Status: Patch Available  (was: Open)

Trigger HadoopQA tests

 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-94) The Heap Size in HDFS web ui may not be accurate

2009-11-13 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777438#action_12777438
 ] 

dhruba borthakur commented on HDFS-94:
--

Currently, the code uses
{code}
long totalMemory = Runtime.getRuntime().totalMemory();
long maxMemory = Runtime.getRuntime().maxMemory();
long used = (totalMemory * 100)/maxMemory;
{code}

Is it better to use:

{code}
MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
MemoryUsage status = memoryMXBean.getHeapMemoryUsage();
usedMemory = status.getUsed();
maxMemory = status.getMax();
{code}
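
For illustration, a self-contained comparison of the two approaches (a 
sketch, not NameNode code): Runtime.totalMemory() is the heap the JVM has 
reserved so far, so once the heap grows to -Xmx the existing ratio shows 100% 
regardless of actual usage.
{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapReport {
  public static void main(String[] args) {
    // Existing approach: totalMemory() is the heap reserved so far,
    // not the bytes actually in use, so the ratio drifts toward 100%.
    long totalMemory = Runtime.getRuntime().totalMemory();
    long maxMemory = Runtime.getRuntime().maxMemory();
    System.out.println("reserved/max = " + (totalMemory * 100) / maxMemory + "%");

    // Suggested approach: MemoryUsage.getUsed() reports live heap usage.
    MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
    MemoryUsage status = memoryMXBean.getHeapMemoryUsage();
    System.out.println("used/max = " + (status.getUsed() * 100) / status.getMax() + "%");
  }
}
{code}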

 The Heap Size in HDFS web ui may not be accurate
 --

 Key: HDFS-94
 URL: https://issues.apache.org/jira/browse/HDFS-94
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Tsz Wo (Nicholas), SZE

 It seems that the Heap Size shown in HDFS web UI is not accurate.  It keeps 
 showing 100% of usage.  e.g.
 {noformat}
 Heap Size is 10.01 GB / 10.01 GB (100%) 
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777441#action_12777441
 ] 

Hadoop QA commented on HDFS-763:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424762/scanErrors.txt
  against trunk revision 835752.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/109/console

This message is automatically generated.

 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-763:
--

Attachment: scanErrors.txt

 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-763:
--

Status: Patch Available  (was: Open)

Trigger HadoopQA.

 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-763:
--

Status: Open  (was: Patch Available)

 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777471#action_12777471
 ] 

Hadoop QA commented on HDFS-763:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424830/scanErrors.txt
  against trunk revision 835752.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/110/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/110/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/110/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/110/console

This message is automatically generated.

 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-769) test-c++-libhdfs constantly fails

2009-11-13 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777473#action_12777473
 ] 

dhruba borthakur commented on HDFS-769:
---

I will try to take a look at this one.

 test-c++-libhdfs constantly fails
 -

 Key: HDFS-769
 URL: https://issues.apache.org/jira/browse/HDFS-769
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.22.0
Reporter: Konstantin Boudnik

 Execution of {{test-c++-libhdfs}} always fails.
 Running 
 {noformat}
   % ant test-c++-libhdfs -Dcompile.c++=yes -Dlibhdfs=yes
 {noformat}
 fails with the following diagnostic:
 {noformat}
 test-c++-libhdfs:
 [mkdir] Created dir: /homes/xxx/work/Hdfs.trunk/build/test/libhdfs
 ...
  [exec] /homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/test-libhdfs.sh
  [exec] 
 
  [exec] LIB_JVM_DIR = /usr/java/latest/jre/lib/i386/server
  [exec] 
 
  [exec] /homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/test-libhdfs.sh: 
 line 118: /homes/xxx/work/Hdfs.trunk/bin/hadoop: No such file or directory
  [exec] 
 CLASSPATH=/homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/conf:/homes/xxx/work/Hdfs.trunk/conf:/homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/conf:/homes/cot
  [exec] Exception in thread main java.lang.NoClassDefFoundError: 
 org/apache/hadoop/conf/Configuration
  [exec] Can't construct instance of class 
 org.apache.hadoop.conf.Configuration
  [exec] Oops! Failed to connect to hdfs!
  [exec] exiting with 255
  [exec] /homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/test-libhdfs.sh: 
 line 126: /homes/xxx/work/Hdfs.trunk/bin/hadoop-daemon.sh: No such file or 
 directory
  [exec] make: *** [test] Error 255
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-630) In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.

2009-11-13 Thread Cosmin Lehene (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777500#action_12777500
 ] 

Cosmin Lehene commented on HDFS-630:


stack: I can't reproduce it on 0.21. I did find it in the NN log before 
upgrading the HBase jar to the patched hdfs. 

java.io.IOException: Cannot complete block: block has not been COMMITTED by the 
client
at 
org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction.convertToCompleteBlock(BlockInfoUnderConstruction.java:158)
at 
org.apache.hadoop.hdfs.server.namenode.BlockManager.completeBlock(BlockManager.java:288)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1243)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:637)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:621)
at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:516)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:964)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:960)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:958)

I should point out that
 at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:621)

line 621 in the NameNode means it was called from an unpatched DFSClient that 
calls the old NameNode interface.

Line 621 is: return addBlock(src, clientName, null, null);

This is part of public LocatedBlock addBlock(String src, String clientName, 
Block previous):

  @Override
  public LocatedBlock addBlock(String src, String clientName,
      Block previous) throws IOException {
    return addBlock(src, clientName, null, null);
  }

This is different from your stack trace http://pastie.org/695936, which calls 
the complete() method. 

However, could you search for the same error while adding a new block with 
addBlock() (like mine)? If you find it, you could figure out the entry point 
in the NameNode, and if it's line 621 you might have an unpatched DFSClient. 

However, even with an unpatched DFSClient, I have yet to figure out why it 
would cause this. Perhaps I should get a better understanding of the cause of 
the exception. So far, from the code comments in BlockInfoUnderConstruction, 
I gather that "the state of the block (the generation stamp and the length) 
has not been committed by the client or it does not have at least a minimal 
number of replicas reported from data-nodes."

 In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific 
 datanodes when locating the next block.
 ---

 Key: HDFS-630
 URL: https://issues.apache.org/jira/browse/HDFS-630
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs client
Affects Versions: 0.21.0
Reporter: Ruyue Ma
Assignee: Ruyue Ma
Priority: Minor
 Fix For: 0.21.0

 Attachments: 0001-Fix-HDFS-630-for-0.21.patch, HDFS-630.patch


 Created from HDFS-200.
 If, during a write, the dfsclient sees that a block replica location for a 
 newly allocated block is not connectable, it re-requests the NN to get a 
 fresh set of replica locations for the block. It tries this 
 dfs.client.block.write.retries times (default 3), sleeping 6 seconds between 
 each retry (see DFSClient.nextBlockOutputStream).
 This setting works well when you have a reasonably sized cluster; if you 
 have few datanodes in the cluster, every retry may pick the dead datanode 
 and the above logic bails out.
 Our solution: when getting block locations from the namenode, we give the NN 
 the excluded datanodes. The list of dead datanodes is only for one block 
 allocation.
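
Roughly, the client-side loop would look like this (a sketch under assumed 
names: connectable() and firstBadNode() are hypothetical helpers, and the 
addBlock() overload taking an excluded-nodes argument is the extension this 
issue proposes):
{code}
// Sketch: retry block allocation, telling the NN which datanodes failed,
// so a small cluster stops handing back the same dead node.
private LocatedBlock allocateBlock(String src, String clientName)
    throws IOException {
  List<DatanodeInfo> excluded = new ArrayList<DatanodeInfo>();
  for (int retry = 0; retry < blockWriteRetries; retry++) {
    LocatedBlock candidate = namenode.addBlock(src, clientName,
        excluded.toArray(new DatanodeInfo[excluded.size()]));
    if (connectable(candidate)) {            // hypothetical connectivity check
      return candidate;
    }
    excluded.add(firstBadNode(candidate));   // excluded for this block only
  }
  throw new IOException("All datanodes are bad");
}
{code}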

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-771) Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode

2009-11-13 Thread Konstantin Boudnik (JIRA)
Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode


 Key: HDFS-771
 URL: https://issues.apache.org/jira/browse/HDFS-771
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.22.0
 Environment: Apache Hudson build machine
Reporter: Konstantin Boudnik


During an execution of the tests, the following exception was thrown:
{noformat}
Error Message

port out of range:-1

Stacktrace
java.lang.IllegalArgumentException: port out of range:-1
at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:371)
at org.apache.hadoop.hdfs.server.namenode.NameNode.activate(NameNode.java:313)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:304)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:410)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:404)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1211)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:287)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:131)
at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testEditLog(TestEditLog.java:92)
{noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-771) Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode

2009-11-13 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-771:


Attachment: testEditLog.html

Full log of the test execution

 Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode
 

 Key: HDFS-771
 URL: https://issues.apache.org/jira/browse/HDFS-771
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.22.0
 Environment: Apache Hudson build machine
Reporter: Konstantin Boudnik
 Attachments: testEditLog.html


 During an execution of the tests, the following exception was thrown:
 {noformat}
 Error Message
 port out of range:-1
 Stacktrace
 java.lang.IllegalArgumentException: port out of range:-1
   at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:371)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.activate(NameNode.java:313)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:304)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:410)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:404)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1211)
   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:287)
   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:131)
   at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testEditLog(TestEditLog.java:92)
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-771) Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode

2009-11-13 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777544#action_12777544
 ] 

Konstantin Boudnik commented on HDFS-771:
-

The environment and all details are available from the [Hudson 
build|http://hudson.zones.apache.org/hudson/view/Hadoop/job/Hadoop-Hdfs-trunk-Commit/109/testReport/org.apache.hadoop.hdfs.server.namenode/TestEditLog/testEditLog/]

 Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode
 

 Key: HDFS-771
 URL: https://issues.apache.org/jira/browse/HDFS-771
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.22.0
 Environment: Apache Hudson build machine
Reporter: Konstantin Boudnik
 Attachments: testEditLog.html


 During an execution of the tests, the following exception was thrown:
 {noformat}
 Error Message
 port out of range:-1
 Stacktrace
 java.lang.IllegalArgumentException: port out of range:-1
   at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:371)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.activate(NameNode.java:313)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:304)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:410)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:404)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1211)
   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:287)
   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:131)
   at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testEditLog(TestEditLog.java:92)
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777605#action_12777605
 ] 

Raghu Angadi commented on HDFS-763:
---

+1.

totalErrors shown in the 'blockScannerReport' now becomes the same as the 
number of verification failures, rather than all the errors seen. 


 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-733) TestBlockReport fails intermittently

2009-11-13 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777610#action_12777610
 ] 

Konstantin Boudnik commented on HDFS-733:
-

I've run the patched test on the Hudson hardware a few times and everything 
seems to be all right: no failures are seen. I'm going to commit this 
shortly.

 TestBlockReport fails intermittently
 

 Key: HDFS-733
 URL: https://issues.apache.org/jira/browse/HDFS-733
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Suresh Srinivas
Assignee: Konstantin Boudnik
 Fix For: 0.21.0, 0.22.0

 Attachments: HDFS-733.2.patch, HDFS-733.patch, HDFS-733.patch, 
 HDFS-733.patch, HDFS-733.patch


 Details at 
 http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/58/testReport/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-770) SocketTimeoutException: timeout while waiting for channel to be ready for read

2009-11-13 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777612#action_12777612
 ] 

Raghu Angadi commented on HDFS-770:
---

From the datanode log:

 2009-11-13 06:18:21,965 DEBUG org.apache.hadoop.ipc.RPC: Call: sendHeartbeat 
 14
 2009-11-13 06:19:38,081 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) 
 connection to dfs.hadoop.tsukku.solatis/127.0.0.1:9000 from hadoop: closed

Note that there is no activity on the DataNode for 77 seconds. There are a 
number of possibilities, a common one being GC, though we haven't seen GC 
take this long on a DN.

Assuming the DN went to sleep for some reason, the rest of the behaviour is 
expected. If you do expect such delays, what you need to increase is the read 
timeout for the responder thread in DFSOutputStream (there is a config for 
the generic read timeout that applies to sockets in many contexts).
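
For example (hedged: {{dfs.socket.timeout}} is the 0.20-era key for the 
generic socket read timeout mentioned above; the value is illustrative), in 
hdfs-site.xml:
{noformat}
<property>
  <!-- generic socket read timeout in milliseconds (illustrative value) -->
  <name>dfs.socket.timeout</name>
  <value>180000</value>
</property>
{noformat}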

 SocketTimeoutException: timeout while waiting for channel to be ready for read
 --

 Key: HDFS-770
 URL: https://issues.apache.org/jira/browse/HDFS-770
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs, data-node, hdfs client, name-node
Affects Versions: 0.20.1
 Environment: Ubuntu Linux 8.04
Reporter: Leon Mergen
 Attachments: client.txt, datanode.txt, namenode.txt


 We're having issues with timeouts occurring in our client: for some reason, a 
 timeout of 63000 milliseconds is triggered while writing HDFS data. Since we 
 currently have a single-server setup, this results in our client terminating 
 with an "All datanodes are bad" IOException.
 We're running all services, including the client, on our single server, so it 
 cannot be a network error. The load on the client is extremely low during 
 this period: only a few kilobytes a minute were being written around the time 
 the error occurred. 
 After browsing a bit online, a lot of people suggest setting 
 dfs.datanode.socket.write.timeout to 0 as a solution for this problem. Given 
 the low load on our system during this period, however, I do feel this is a 
 real error and a timeout that should not be occurring. I have attached 3 
 logs of the namenode, datanode and client.
 It could be that this is related to 
 http://issues.apache.org/jira/browse/HDFS-693
 Any pointers on how I can assist in resolving this issue will be greatly 
 appreciated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-733) TestBlockReport fails intermittently

2009-11-13 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-733:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

The fix is committed to trunk and to branch 0.21.

 TestBlockReport fails intermittently
 

 Key: HDFS-733
 URL: https://issues.apache.org/jira/browse/HDFS-733
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Suresh Srinivas
Assignee: Konstantin Boudnik
 Fix For: 0.21.0, 0.22.0

 Attachments: HDFS-733.2.patch, HDFS-733.patch, HDFS-733.patch, 
 HDFS-733.patch, HDFS-733.patch


 Details at 
 http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/58/testReport/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-733) TestBlockReport fails intermittently

2009-11-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777627#action_12777627
 ] 

Hudson commented on HDFS-733:
-

Integrated in Hadoop-Hdfs-trunk-Commit #110 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/110/])
HDFS-733. TestBlockReport fails intermittently (cos)


 TestBlockReport fails intermittently
 

 Key: HDFS-733
 URL: https://issues.apache.org/jira/browse/HDFS-733
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Suresh Srinivas
Assignee: Konstantin Boudnik
 Fix For: 0.21.0, 0.22.0

 Attachments: HDFS-733.2.patch, HDFS-733.patch, HDFS-733.patch, 
 HDFS-733.patch, HDFS-733.patch


 Details at 
 http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/58/testReport/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-767) Job failure due to BlockMissingException

2009-11-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777635#action_12777635
 ] 

Todd Lipcon commented on HDFS-767:
--

Hi Ning,

Sounds good - your formula seems to make sense. If you can add a few lines of 
comments around the formula (or a pointer to this JIRA) I think that would be 
helpful to make sure people looking at the code down the line will understand 
where it came from.

Additionally, I think making the 3000 parameter a configuration variable (even 
if an undocumented one) would be swell.
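
For illustration, a generic jittered-backoff sketch (not Ning's actual 
formula; the 3000 ms base and the helper name are placeholders):
{code}
// Sketch: spread retries over a growing random window so thousands of
// clients hitting the same block do not all retry at the same instant.
static void backoffSleep(java.util.Random rand, int attempt, int baseWaitMs)
    throws InterruptedException {
  long window = (long) baseWaitMs * (attempt + 1);    // grows each attempt
  long sleepMs = (long) (rand.nextDouble() * window); // random point in window
  Thread.sleep(sleepMs);
}
{code}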

 Job failure due to BlockMissingException
 

 Key: HDFS-767
 URL: https://issues.apache.org/jira/browse/HDFS-767
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Ning Zhang

 If a block is requested by too many mappers/reducers (say, 3000) at the same 
 time, a BlockMissingException is thrown because it exceeds the upper limit (I 
 think 256 by default) on the number of threads accessing the same block at 
 the same time. The DFSClient will catch that exception and retry 3 times 
 after waiting for 3 seconds. Since the wait time is a fixed value, a lot of 
 clients will retry at about the same time and a large portion of them get 
 another failure. After 3 retries, there are about 256*4 = 1024 clients that 
 got the block. If the number of clients is more than that, the job will fail. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-641) Move all of the benchmarks and tests that depend on mapreduce to mapreduce

2009-11-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777647#action_12777647
 ] 

Hudson commented on HDFS-641:
-

Integrated in Hadoop-Mapreduce-trunk-Commit #118 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/118/])
HDFS-641. Move all of the components that depend on map/reduce to 
map/reduce. (omalley)


 Move all of the benchmarks and tests that depend on mapreduce to mapreduce
 --

 Key: HDFS-641
 URL: https://issues.apache.org/jira/browse/HDFS-641
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.20.2
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.21.0


 Currently, we have a bad cycle where to build hdfs you need to test mapreduce 
 and iterate once. This is broken.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-706) Intermittent failures in TestFiHFlush

2009-11-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-706:


Assignee: Konstantin Boudnik
Hadoop Flags: [Reviewed]
  Status: Patch Available  (was: Open)

+1 patch looks good.

 Intermittent failures in TestFiHFlush
 -

 Key: HDFS-706
 URL: https://issues.apache.org/jira/browse/HDFS-706
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik
 Attachments: HDFS-706.patch, 
 TEST-org.apache.hadoop.hdfs.TestHFlush.txt


 Running tests on a Linux box, I've started seeing intermittent failures 
 among TestFiHFlush test cases. 
 It turns out that occasional failures are also observed on my laptop running 
 BSD.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-13 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1215#action_1215
 ] 

Raghu Angadi commented on HDFS-763:
---


I don't think this needs an extra unit test. The stat affected here is only 
for display purposes, and it is not related to the stats reported to stats 
servers like Simon.

 DataBlockScanner reporting of bad blocks is slightly misleading
 ---

 Key: HDFS-763
 URL: https://issues.apache.org/jira/browse/HDFS-763
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: scanErrors.txt, scanErrors.txt, scanErrors.txt


 The Datanode generates a report of the periodic block scanning that verifies 
 CRCs. It reports something like the following:
 Scans since restart : 192266
 Scan errors since restart : 33
 Transient scan errors : 0
 The statement saying that there were 33 errors is slightly misleading because 
 these are not CRC mismatches; rather, the block was being deleted when the 
 CRC verification was about to happen. 
 I propose that DataBlockScanner.totalScanErrors not be updated if 
 dataset.getFile(block) is null, i.e. the block has been deleted from the 
 datanode. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-94) The Heap Size in HDFS web ui may not be accurate

2009-11-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1222#action_1222
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-94:


It makes sense to replace the current code with MemoryMXBean since it 
provides more information.  I think it is better to show more numbers like 
non-heap usage, init, used, committed, max, etc.
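
Something along these lines, presumably (a sketch of what the UI could print, 
using only standard java.lang.management calls):
{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryReport {
  public static void main(String[] args) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    MemoryUsage heap = bean.getHeapMemoryUsage();
    MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
    // Each MemoryUsage carries init/used/committed/max values.
    System.out.printf("Heap: init=%d used=%d committed=%d max=%d%n",
        heap.getInit(), heap.getUsed(), heap.getCommitted(), heap.getMax());
    System.out.printf("Non-heap: init=%d used=%d committed=%d max=%d%n",
        nonHeap.getInit(), nonHeap.getUsed(), nonHeap.getCommitted(),
        nonHeap.getMax());
  }
}
{code}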

 The Heap Size in HDFS web ui may not be accurate
 --

 Key: HDFS-94
 URL: https://issues.apache.org/jira/browse/HDFS-94
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Tsz Wo (Nicholas), SZE

 It seems that the Heap Size shown in HDFS web UI is not accurate.  It keeps 
 showing 100% of usage.  e.g.
 {noformat}
 Heap Size is 10.01 GB / 10.01 GB (100%) 
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-706) Intermittent failures in TestFiHFlush

2009-11-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1224#action_1224
 ] 

Hadoop QA commented on HDFS-706:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12424811/HDFS-706.patch
  against trunk revision 835958.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/111/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/111/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/111/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/111/console

This message is automatically generated.

 Intermittent failures in TestFiHFlush
 -

 Key: HDFS-706
 URL: https://issues.apache.org/jira/browse/HDFS-706
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik
 Attachments: HDFS-706.patch, HDFS-706.patch, 
 TEST-org.apache.hadoop.hdfs.TestHFlush.txt


 Running tests on a Linux box, I've started seeing intermittent failures 
 among TestFiHFlush test cases. 
 It turns out that occasional failures are also observed on my laptop running 
 BSD.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-706) Intermittent failures in TestFiHFlush

2009-11-13 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1235#action_1235
 ] 

Konstantin Boudnik commented on HDFS-706:
-

The test failure is unrelated to this patch.

 Intermittent failures in TestFiHFlush
 -

 Key: HDFS-706
 URL: https://issues.apache.org/jira/browse/HDFS-706
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik
 Attachments: HDFS-706.patch, HDFS-706.patch, 
 TEST-org.apache.hadoop.hdfs.TestHFlush.txt


 Running tests on a Linux box, I've started seeing intermittent failures 
 among TestFiHFlush test cases. 
 It turns out that occasional failures are also observed on my laptop running 
 BSD.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-772) DFSClient.getFileChecksum(..) computes file md5 with extra padding

2009-11-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-772:


Attachment: h772_20091113.patch

h772_20091113.patch: use md5out.getLength() to limit the data.
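
Presumably the change is along these lines (a guess at the patch, assuming 
the (byte[], int, int) overload of MD5Hash.digest):
{code}
// Only the first md5out.getLength() bytes of the buffer are valid data;
// hash just that range instead of the whole backing array.
final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData(), 0, md5out.getLength());
{code}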

 DFSClient.getFileChecksum(..) computes file md5 with extra padding
 --

 Key: HDFS-772
 URL: https://issues.apache.org/jira/browse/HDFS-772
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.20.1
Reporter: Tsz Wo (Nicholas), SZE
 Attachments: h772_20091113.patch


 {code}
 //DFSClient.getFileChecksum(..)
 final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData()); 
 {code}
 The fileMD5 is computed with the entire byte array returned by 
 md5out.getData().  However, the data are valid only up to md5out.getLength().  
 Therefore, the current implementation of the algorithm computes fileMD5 with 
 extra padding.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-758) Improve reporting of progress of decommissioning

2009-11-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-758:
--

Attachment: HDFS-758.1.patch

 Improve reporting of progress of decommissioning
 

 Key: HDFS-758
 URL: https://issues.apache.org/jira/browse/HDFS-758
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Jitendra Nath Pandey
 Attachments: HDFS-758.1.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-641) Move all of the benchmarks and tests that depend on mapreduce to mapreduce

2009-11-13 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-641.


Resolution: Fixed

I just committed this.

 Move all of the benchmarks and tests that depend on mapreduce to mapreduce
 --

 Key: HDFS-641
 URL: https://issues.apache.org/jira/browse/HDFS-641
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.20.2
Reporter: Owen O'Malley
Assignee: Owen O'Malley
Priority: Blocker
 Fix For: 0.21.0


 Currently, we have a bad cycle where to build hdfs you need to test mapreduce 
 and iterate once. This is broken.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-596) Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory for mOwner and mGroup

2009-11-13 Thread Christian Kunz (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1265#action_1265
 ] 

Christian Kunz commented on HDFS-596:
-

We just hit this bug big time. Our applications ran out of memory.
We will have to apply this patch ourselves.
Why was this issue not declared a blocker? 
This is a *serious* memory issue introduced between hadoop-0.18 and 
hadoop-0.20.

 Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory 
 for mOwner and mGroup
 

 Key: HDFS-596
 URL: https://issues.apache.org/jira/browse/HDFS-596
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/fuse-dfs
Affects Versions: 0.20.1
 Environment: Linux hadoop-001 2.6.28-14-server #47-Ubuntu SMP Sat Jul 
 25 01:18:34 UTC 2009 i686 GNU/Linux. Namenode with 1GB memory. 
Reporter: Zhang Bingjun
Priority: Critical
 Fix For: 0.20.2

 Attachments: HDFS-596.patch

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 This bug affects fuse-dfs severely. In my test, about 1GB of memory was 
 exhausted and the fuse-dfs mount directory was disconnected after writing 
 14000 files. This bug is related to the memory leak problem of this issue: 
 http://issues.apache.org/jira/browse/HDFS-420. 
 The bug can be fixed very easily. In function hdfsFreeFileInfo() in file 
 hdfs.c (under c++/libhdfs/), change this code block:
 //Free the mName
 int i;
 for (i = 0; i < numEntries; ++i) {
   if (hdfsFileInfo[i].mName) {
     free(hdfsFileInfo[i].mName);
   }
 }
 into:
 // free mName, mOwner and mGroup
 int i;
 for (i = 0; i < numEntries; ++i) {
   if (hdfsFileInfo[i].mName) {
     free(hdfsFileInfo[i].mName);
   }
   if (hdfsFileInfo[i].mOwner) {
     free(hdfsFileInfo[i].mOwner);
   }
   if (hdfsFileInfo[i].mGroup) {
     free(hdfsFileInfo[i].mGroup);
   }
 }
 I am new to Jira and haven't figured out a way to generate a .patch file 
 yet. Could anyone help me do that so that the changes can be committed into 
 the code base? Thanks!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-596) Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory for mOwner and mGroup

2009-11-13 Thread Christian Kunz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Kunz updated HDFS-596:


Priority: Blocker  (was: Critical)

 Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory 
 for mOwner and mGroup
 

 Key: HDFS-596
 URL: https://issues.apache.org/jira/browse/HDFS-596
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/fuse-dfs
Affects Versions: 0.20.1
 Environment: Linux hadoop-001 2.6.28-14-server #47-Ubuntu SMP Sat Jul 
 25 01:18:34 UTC 2009 i686 GNU/Linux. Namenode with 1GB memory. 
Reporter: Zhang Bingjun
Priority: Blocker
 Fix For: 0.20.2

 Attachments: HDFS-596.patch

   Original Estimate: 0.5h
  Remaining Estimate: 0.5h

 This bug affects fuse-dfs severely. In my test, about 1GB of memory was 
 exhausted and the fuse-dfs mount directory was disconnected after writing 
 14000 files. This bug is related to the memory leak problem of this issue: 
 http://issues.apache.org/jira/browse/HDFS-420. 
 The bug can be fixed very easily. In function hdfsFreeFileInfo() in file 
 hdfs.c (under c++/libhdfs/), change this code block:
 //Free the mName
 int i;
 for (i = 0; i < numEntries; ++i) {
   if (hdfsFileInfo[i].mName) {
     free(hdfsFileInfo[i].mName);
   }
 }
 into:
 // free mName, mOwner and mGroup
 int i;
 for (i = 0; i < numEntries; ++i) {
   if (hdfsFileInfo[i].mName) {
     free(hdfsFileInfo[i].mName);
   }
   if (hdfsFileInfo[i].mOwner) {
     free(hdfsFileInfo[i].mOwner);
   }
   if (hdfsFileInfo[i].mGroup) {
     free(hdfsFileInfo[i].mGroup);
   }
 }
 I am new to Jira and haven't figured out a way to generate a .patch file 
 yet. Could anyone help me do that so that the changes can be committed into 
 the code base? Thanks!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-758) Improve reporting of progress of decommissioning

2009-11-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-758:
--

Attachment: HDFS-758.2.patch

 Improve reporting of progress of decommissioning
 

 Key: HDFS-758
 URL: https://issues.apache.org/jira/browse/HDFS-758
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HDFS-758.1.patch, HDFS-758.2.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-771) NameNode's HttpServer can't instantiate InetSocketAddress: IllegalArgumentException is thrown

2009-11-13 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-771:


Component/s: (was: test)
 name-node
   Priority: Blocker  (was: Major)
   Tags: regression
Summary: NameNode's HttpServer can't instantiate InetSocketAddress: 
IllegalArgumentException is thrown  (was: Jetty crashes: MiniDFSCluster 
supplies incorrect port number to the NameNode)

This issue seems to be a regression (or rather the result of an incomplete 
fix of HADOOP-4744).
The problem surfaces quite often in Pig (a few times per week), so I'm 
raising the priority to Blocker, because all components are affected by this 
issue.

 NameNode's HttpServer can't instantiate InetSocketAddress: 
 IllegalArgumentException is thrown
 -

 Key: HDFS-771
 URL: https://issues.apache.org/jira/browse/HDFS-771
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.22.0
 Environment: Apache Hudson build machine
Reporter: Konstantin Boudnik
Priority: Blocker
 Attachments: testEditLog.html


 During an execution of the tests, the following exception was thrown:
 {noformat}
 Error Message
 port out of range:-1
 Stacktrace
 java.lang.IllegalArgumentException: port out of range:-1
   at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:371)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.activate(NameNode.java:313)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:304)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:410)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:404)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1211)
   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:287)
   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:131)
   at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testEditLog(TestEditLog.java:92)
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-771) NameNode's HttpServer can't instantiate InetSocketAddress: IllegalArgumentException is thrown

2009-11-13 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777806#action_12777806
 ] 

Konstantin Boudnik commented on HDFS-771:
-

After all, it seems to be a race condition in Jetty, e.g. (NameNode:367):
{noformat}
this.httpServer.start();
{noformat}

The relevant log:
{noformat}
2009-11-13 07:02:04,605 INFO  http.HttpServer (HttpServer.java:start(432)) - 
Port returned by webServer.getConnectors()[0].getLocalPort() before open() is 
-1. Opening the listener on 0
2009-11-13 07:02:04,606 INFO  http.HttpServer (HttpServer.java:start(437)) - 
listener.getLocalPort() returned 37817 
webServer.getConnectors()[0].getLocalPort() returned 37817
2009-11-13 07:02:04,607 INFO  http.HttpServer (HttpServer.java:start(470)) - 
Jetty bound to port 37817
2009-11-13 07:02:04,607 INFO  mortbay.log (?:invoke0(?)) - jetty-6.1.14
2009-11-13 07:03:04,231 INFO  mortbay.log (?:invoke0(?)) - Started 
selectchannelconnec...@localhost:37817
{noformat}

And then this code is executed (NameNode:370-371):
{noformat}
// The web-server port can be ephemeral... ensure we have the correct info
infoPort = this.httpServer.getPort();
this.httpAddress = new InetSocketAddress(infoHost, infoPort);
{noformat}
and {{this.httpServer.getPort()}} returns -1 as the infoPort value.

I'll try to work out a minimal test case to reproduce this problem; however, 
it might be hard.
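
One defensive option (a sketch, not the committed fix) is to validate the 
port before constructing the address, so the failure is reported where it 
happens:
{noformat}
// Sketch: fail fast (or retry) when Jetty has not yet reported a bound port.
infoPort = this.httpServer.getPort();
if (infoPort < 0) {
  throw new IOException("HTTP server returned invalid port: " + infoPort);
}
this.httpAddress = new InetSocketAddress(infoHost, infoPort);
{noformat}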

 NameNode's HttpServer can't instantiate InetSocketAddress: 
 IllegalArgumentException is thrown
 -

 Key: HDFS-771
 URL: https://issues.apache.org/jira/browse/HDFS-771
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.22.0
 Environment: Apache Hudson build machine
Reporter: Konstantin Boudnik
Priority: Blocker
 Attachments: testEditLog.html


 During an execution of the tests, the following exception was thrown:
 {noformat}
 Error Message
 port out of range:-1
 Stacktrace
 java.lang.IllegalArgumentException: port out of range:-1
   at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:371)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.activate(NameNode.java:313)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:304)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:410)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:404)
   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1211)
   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:287)
   at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:131)
   at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testEditLog(TestEditLog.java:92)
 {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream

2009-11-13 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777815#action_12777815
 ] 

Eli Collins commented on HDFS-755:
--

Hey Todd -- patch looks great.  Did you test w/o checksums enabled? 



 Read multiple checksum chunks at once in DFSInputStream
 ---

 Key: HDFS-755
 URL: https://issues.apache.org/jira/browse/HDFS-755
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt


 HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple 
 checksum chunks in a single call to readChunk. This is the HDFS-side use of 
 that new feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem

2009-11-13 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777827#action_12777827
 ] 

dhruba borthakur commented on HDFS-718:
---

Thanks Nicholas, Todd and Allen for the comments.

Todd: The idea of the proposed configuration is to ensure that *no* script 
can format this namenode, however hard it may try. The main purpose is to not 
add the -y option. This is for the paranoid administrator who, for sure, 
never wants *any* script to format this namenode.
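
For reference, an illustrative hdfs-site.xml entry (the property name comes 
from the patch description below; the default true preserves current 
behavior):
{noformat}
<property>
  <name>dfs.namenode.support.allowformat</name>
  <value>false</value>
</property>
{noformat}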

 configuration parameter to prevent accidental formatting of HDFS filesystem
 ---

 Key: HDFS-718
 URL: https://issues.apache.org/jira/browse/HDFS-718
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.22.0
 Environment: Any
Reporter: Andrew Ryan
Assignee: Andrew Ryan
Priority: Minor
 Attachments: HDFS-718.patch.txt


 Currently, any time the NameNode is not running, an HDFS filesystem will 
 accept the 'format' command, and will duly format itself. There are those of 
 us who have multi-PB HDFS filesystems who are really quite uncomfortable with 
 this behavior. There is Y/N confirmation in the format command, but if the 
 formatter genuinely believes themselves to be doing the right thing, the 
 filesystem will be formatted.
 This patch adds a configuration parameter to the namenode, 
 dfs.namenode.support.allowformat, which defaults to true, the current 
 behavior: always allow formatting if the NameNode is down or some other 
 process is not holding the namenode lock. But if 
 dfs.namenode.support.allowformat is set to false, the NameNode will not 
 allow itself to be formatted until this config parameter is changed to true.
 The general idea is that for production HDFS filesystems, the user would 
 format the HDFS once, then set dfs.namenode.support.allowformat to false 
 for all time.
 The attached patch was generated against trunk and +1's on my test machine. 
 We have a 0.20 version that we are using in our cluster as well.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-756) libhdfs unit tests do not run

2009-11-13 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777833#action_12777833
 ] 

Konstantin Boudnik commented on HDFS-756:
-

I'd suggest raising the priority on this, because it makes the full build 
({{ant test}}) fail all the time.

 libhdfs unit tests do not run 
 --

 Key: HDFS-756
 URL: https://issues.apache.org/jira/browse/HDFS-756
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Reporter: dhruba borthakur
Assignee: Eli Collins
 Fix For: 0.22.0


 The libhdfs unit tests (ant test-c++-libhdfs -Dislibhdfs=1) do not run yet 
 because the scripts are in the common subproject.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.