[jira] [Commented] (HDFS-744) Support hsync in HDFS
[ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270228#comment-13270228 ]

Todd Lipcon commented on HDFS-744:
----------------------------------

Hi Lars. Sorry, I wasn't watching this before, so I missed your work until you mentioned it on the HBase JIRA. I'll try to take a look at this this week. Feel free to grab me on IRC if you have specific questions.

Support hsync in HDFS
---------------------
Key: HDFS-744
URL: https://issues.apache.org/jira/browse/HDFS-744
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Hairong Kuang
Attachments: hdfs-744-v2.txt, hdfs-744-v3.txt, hdfs-744.txt

HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, the real expected semantics are that the data is flushed out to all replicas and all replicas have done the POSIX fsync equivalent, i.e. the OS has flushed it to the disk device (but the disk may still have it in its cache). This JIRA aims to implement the expected behaviour.
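To make the semantics above concrete, here is a minimal client-side sketch of the difference between hflush (data visible to new readers, but possibly only in OS buffers on each DataNode) and the hsync proposed here (each replica has done the POSIX fsync equivalent). This assumes the final API keeps the existing hsync() signature on FSDataOutputStream, which was still being ironed out on this JIRA:

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HsyncSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (FSDataOutputStream out = fs.create(new Path("/tmp/wal"))) {
      out.write("edit-1".getBytes("UTF-8"));
      // hflush: pushes the data to all replicas; new readers can see it,
      // but each DataNode may still hold it only in OS buffers.
      out.hflush();
      // hsync (as proposed here): like hflush, but every replica also does
      // the POSIX fsync equivalent, so the OS has handed the bytes to the
      // disk device (the device's own cache may still hold them).
      out.hsync();
    }
  }
}
{noformat}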
[jira] [Commented] (HDFS-744) Support hsync in HDFS
[ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270233#comment-13270233 ]

Todd Lipcon commented on HDFS-744:
----------------------------------

One quick note, though: in order for this to be reviewed/committed, the patch needs to be on trunk. You'll find that on trunk we use protobufs for the data transfer protocol, so adding a new sync packet type/flag should be much simpler! I think we should work on trunk to iron out the APIs, and then figure out how to shoehorn it into the 1.x protocol if need be at a later date.
[jira] [Created] (HDFS-3381) BK JM : After restart bookkeeper continuously throwing error beause BK JM create all Namenode related znode under /ledger znode.
surendra singh lilhore created HDFS-3381:
-----------------------------------------

Summary: BK JM : After restart bookkeeper continuously throwing error beause BK JM create all Namenode related znode under /ledger znode.
Key: HDFS-3381
URL: https://issues.apache.org/jira/browse/HDFS-3381
Project: Hadoop HDFS
Issue Type: Improvement
Components: ha
Reporter: surendra singh lilhore
Fix For: 0.24.0

Issue: The BookKeeper journal manager creates all of the NameNode-related znodes under the '/ledgers' znode in ZooKeeper. When BookKeeper reads the ledgers from the '/ledgers' znode, it treats these znodes (version, lock, maxtxid) as incorrectly formatted ledgers and logs the following errors:

{noformat}
2012-04-20 11:52:25,611 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: ledgers
2012-04-20 11:52:25,611 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: lock
2012-04-20 11:52:25,612 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: maxtxid
2012-04-20 11:52:26,613 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: version
2012-04-20 11:52:26,613 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: ledgers
2012-04-20 11:52:26,613 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: lock
2012-04-20 11:52:26,613 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: maxtxid
2012-04-20 11:52:27,614 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: version
2012-04-20 11:52:27,614 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: ledgers
2012-04-20 11:52:27,614 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: lock
2012-04-20 11:52:27,615 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: maxtxid
2012-04-20 11:52:28,616 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: version
2012-04-20 11:52:28,616 - WARN [main-EventThread:AbstractZkLedgerManager$2@123] - Error extracting ledgerId from ZK ledger node: ledgers
{noformat}

I think the NameNode-related znodes should be created under a separate znode in ZooKeeper.
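A hedged sketch of the layout change the reporter suggests: keep BKJM's own bookkeeping znodes under a separate parent, so that only real ledger znodes live under '/ledgers'. The path names below are illustrative, not the actual BKJM znode names; the ZooKeeper client calls are the standard API:

{noformat}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class BkjmLayoutSketch {
  // Illustrative path; the actual parent znode name would be configurable.
  static final String BKJM_ROOT = "/bkjm";

  static void createNamenodeZnodes(ZooKeeper zk) throws Exception {
    // Before: version/lock/maxtxid were created directly under /ledgers,
    // where BookKeeper's ledger manager mistakes them for malformed ledgers.
    // After: create them under their own parent, away from /ledgers.
    zk.create(BKJM_ROOT, new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    for (String child : new String[] {"version", "lock", "maxtxid"}) {
      zk.create(BKJM_ROOT + "/" + child, new byte[0],
          Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }
  }
}
{noformat}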
[jira] [Commented] (HDFS-744) Support hsync in HDFS
[ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270239#comment-13270239 ]

Lars Hofhansl commented on HDFS-744:
------------------------------------

Thanks Todd. I'll create a trunk patch tomorrow (hopefully).
[jira] [Updated] (HDFS-3381) BK JM : After restart bookkeeper continuously throwing error because BK JM create all Namenode related znode under '/ledgers' znode.
[ https://issues.apache.org/jira/browse/HDFS-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

surendra singh lilhore updated HDFS-3381:
-----------------------------------------

Summary: BK JM : After restart bookkeeper continuously throwing error because BK JM create all Namenode related znode under '/ledgers' znode. (was: BK JM : After restart bookkeeper continuously throwing error because BK JM create all Namenode related znode under '/ledger' znode.)
[jira] [Updated] (HDFS-3381) BK JM : After restart bookkeeper continuously throwing error because BK JM create all Namenode related znode under '/ledger' znode.
[ https://issues.apache.org/jira/browse/HDFS-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

surendra singh lilhore updated HDFS-3381:
-----------------------------------------

Summary: BK JM : After restart bookkeeper continuously throwing error because BK JM create all Namenode related znode under '/ledger' znode. (was: BK JM : After restart bookkeeper continuously throwing error beause BK JM create all Namenode related znode under /ledger znode.)
[jira] [Commented] (HDFS-3328) Premature EOFException in HdfsProtoUtil#vintPrefixed
[ https://issues.apache.org/jira/browse/HDFS-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270252#comment-13270252 ]

Hadoop QA commented on HDFS-3328:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525954/hdfs-3328.txt
against trunk revision.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2387//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2387//console

This message is automatically generated.

Premature EOFException in HdfsProtoUtil#vintPrefixed
----------------------------------------------------
Key: HDFS-3328
URL: https://issues.apache.org/jira/browse/HDFS-3328
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 2.0.0
Reporter: Uma Maheswara Rao G
Assignee: Eli Collins
Priority: Minor
Attachments: hdfs-3328.txt

While running the tests, I have seen these exceptions. The tests passed, so I am not sure whether this is a problem:

{quote}
2012-04-26 23:15:51,763 WARN hdfs.DFSClient (DFSOutputStream.java:run(710)) - DFSOutputStream ResponseProcessor exception for block BP-1372255573-49.249.124.17-1335462329685:blk_-843504080180201_1005
java.io.EOFException: Premature EOF: no length prefix available
	at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
	at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:657)
Exception in thread DataXceiver for client /127.0.0.1:52323 [Cleaning up] java.lang.NullPointerException
	at org.apache.hadoop.ipc.Server$Listener.getAddress(Server.java:669)
	at org.apache.hadoop.ipc.Server.getListenerAddress(Server.java:1988)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.getIpcPort(DataNode.java:882)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.getDisplayName(DataNode.java:863)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:171)
	at java.lang.Thread.run(Unknown Source)
{quote}
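For context on the "Premature EOF: no length prefix available" message, here is a simplified sketch of what a varint-length-prefixed read does, assuming protobuf-style base-128 framing. This is an illustration, not the actual HdfsProtoUtil code; the exception fires when the peer closes the stream before even the first length byte arrives:

{noformat}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class VintPrefixedSketch {
  // Read a protobuf-style varint length prefix, then that many payload bytes.
  static byte[] readPrefixed(InputStream in) throws IOException {
    int b = in.read();
    if (b == -1) {
      // The peer closed the connection before sending anything: this is the
      // "Premature EOF: no length prefix available" case from the report.
      throw new EOFException("Premature EOF: no length prefix available");
    }
    int len = 0, shift = 0;
    while (true) {                       // decode base-128 varint
      len |= (b & 0x7f) << shift;
      if ((b & 0x80) == 0) break;
      shift += 7;
      b = in.read();
      if (b == -1) throw new EOFException("Truncated varint");
    }
    byte[] payload = new byte[len];
    int off = 0;
    while (off < len) {
      int n = in.read(payload, off, len - off);
      if (n == -1) throw new EOFException("Truncated payload");
      off += n;
    }
    return payload;
  }
}
{noformat}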
[jira] [Created] (HDFS-3382) BookKeeperJournalManager: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes
Rakesh R created HDFS-3382:
---------------------------

Summary: BookKeeperJournalManager: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes
Key: HDFS-3382
URL: https://issues.apache.org/jira/browse/HDFS-3382
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Rakesh R
Fix For: 0.24.0

Say the inProgress_000X znode is corrupted because the data (version, ledgerId, firstTxId) was never written to it. NameNode startup has logic to recover all the unfinalized segments; there it will try to read the segment and get shut down:

{noformat}
EditLogLedgerMetadata.java:
static EditLogLedgerMetadata read(ZooKeeper zkc, String path)
    throws IOException, KeeperException.NoNodeException {
  byte[] data = zkc.getData(path, false, null);
  String[] parts = new String(data).split(";");
  if (parts.length == 3) {
    // reading inprogress metadata
  } else if (parts.length == 4) {
    // reading finalized metadata
  } else {
    throw new IOException("Invalid ledger entry, " + new String(data));
  }
}
{noformat}

Scenario (leaving a bad inProgress_000X node): assume BKJM has created the inProgress_000X znode and ZK is not available when BKJM tries to add the metadata. Now inProgress_000X ends up with partial information.
[jira] [Commented] (HDFS-3382) BookKeeperJournalManager: NN startup is failing, when tries to recoverUnfinalizedSegments() a bad inProgress_ ZNodes
[ https://issues.apache.org/jira/browse/HDFS-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270287#comment-13270287 ]

Rakesh R commented on HDFS-3382:
--------------------------------

This is an endless condition: it does not allow the NN to start as long as it has bad inprogress znodes. IMHO, since this is dirty or partial data, it would be good to delete those entries with a warning message. Otherwise, to start the NN, an admin has to manually clean them up in ZooKeeper.
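A minimal sketch of the recovery behavior suggested above, under the assumption that a malformed inProgress_ znode can simply be warned about and deleted. The class and method names are illustrative stand-ins, not the real BKJM code:

{noformat}
import java.io.IOException;
import java.util.List;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

public class RecoverySketch {
  static void recoverUnfinalizedSegments(ZooKeeper zk, String editsRoot)
      throws IOException, InterruptedException, KeeperException {
    List<String> children = zk.getChildren(editsRoot, false);
    for (String child : children) {
      if (!child.startsWith("inProgress_")) continue;
      String path = editsRoot + "/" + child;
      byte[] data = zk.getData(path, false, null);
      String[] parts = new String(data, "UTF-8").split(";");
      if (parts.length == 3 || parts.length == 4) {
        // Well-formed metadata: recover the segment as today.
      } else {
        // Partial write (e.g. ZK died between create() and setData()):
        // warn and delete instead of aborting NN startup forever.
        System.err.println("WARN: deleting bad inprogress znode " + path);
        zk.delete(path, -1);
      }
    }
  }
}
{noformat}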
[jira] [Created] (HDFS-3383) libhdfs does not build on ARM because jni_md.h is not found
Trevor Robinson created HDFS-3383:
----------------------------------

Summary: libhdfs does not build on ARM because jni_md.h is not found
Key: HDFS-3383
URL: https://issues.apache.org/jira/browse/HDFS-3383
Project: Hadoop HDFS
Issue Type: Bug
Components: libhdfs
Affects Versions: 0.23.1
Environment: Linux 3.2.0-1412-omap4 #16-Ubuntu SMP PREEMPT Tue Apr 17 19:38:42 UTC 2012 armv7l armv7l armv7l GNU/Linux
java version "1.7.0_04-ea"
Java(TM) SE Runtime Environment for Embedded (build 1.7.0_04-ea-b20, headless)
Java HotSpot(TM) Embedded Server VM (build 23.0-b21, mixed mode, experimental)
Reporter: Trevor Robinson

The wrong include directory is used for jni_md.h:

{noformat}
[INFO] --- make-maven-plugin:1.0-beta-1:make-install (compile) @ hadoop-hdfs ---
[INFO] /bin/bash ./libtool --tag=CC --mode=compile gcc -DPACKAGE_NAME=\"libhdfs\" -DPACKAGE_TARNAME=\"libhdfs\" -DPACKAGE_VERSION=\"0.1.0\" -DPACKAGE_STRING=\"libhdfs 0.1.0\" -DPACKAGE_BUGREPORT=\"omal...@apache.org\" -DPACKAGE_URL=\"\" -DPACKAGE=\"libhdfs\" -DVERSION=\"0.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_STRDUP=1 -DHAVE_STRERROR=1 -DHAVE_STRTOUL=1 -DHAVE_FCNTL_H=1 -DHAVE__BOOL=1 -DHAVE_STDBOOL_H=1 -I. -g -O2 -DOS_LINUX -DDSO_DLFCN -DCPU=\"arm\" -I/usr/lib/jvm/ejdk1.7.0_04/include -I/usr/lib/jvm/ejdk1.7.0_04/include/arm -Wall -Wstrict-prototypes -MT hdfs.lo -MD -MP -MF .deps/hdfs.Tpo -c -o hdfs.lo hdfs.c
[INFO] libtool: compile: gcc [same flags as above] -c hdfs.c -fPIC -DPIC -o .libs/hdfs.o
[INFO] In file included from hdfs.h:33:0,
[INFO]                  from hdfs.c:19:
[INFO] /usr/lib/jvm/ejdk1.7.0_04/include/jni.h:45:20: fatal error: jni_md.h: No such file or directory
[INFO] compilation terminated.
[INFO] make: *** [hdfs.lo] Error 1
{noformat}

The problem is caused by hadoop-hdfs-project/hadoop-hdfs/src/main/native/m4/apsupport.m4 overriding supported_os="arm" when host_cpu=arm*; supported_os should remain "linux", since it determines the jni_md.h include path. OpenJDK 6 and 7 (in Ubuntu 12.04, at least) and Oracle EJDK put jni_md.h in include/linux. Not sure if/why this ever worked before.
[jira] [Updated] (HDFS-3383) libhdfs does not build on ARM because jni_md.h is not found
[ https://issues.apache.org/jira/browse/HDFS-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Trevor Robinson updated HDFS-3383:
----------------------------------

Attachment: HDFS-3383.patch
[jira] [Updated] (HDFS-3383) libhdfs does not build on ARM because jni_md.h is not found
[ https://issues.apache.org/jira/browse/HDFS-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Trevor Robinson updated HDFS-3383:
----------------------------------

Status: Patch Available (was: Open)
[jira] [Created] (HDFS-3384) DataStreamer thread should be closed immediately when failed to setup a PipelineForAppendOrRecovery
Brahma Reddy Battula created HDFS-3384:
---------------------------------------

Summary: DataStreamer thread should be closed immediately when failed to setup a PipelineForAppendOrRecovery
Key: HDFS-3384
URL: https://issues.apache.org/jira/browse/HDFS-3384
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Brahma Reddy Battula
Priority: Minor

Scenario:
=========
Write a file, corrupt the block manually, then call append.

{noformat}
2012-04-19 09:33:10,776 INFO hdfs.DFSClient (DFSOutputStream.java:createBlockOutputStream(1059)) - Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
	at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1039)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:939)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
2012-04-19 09:33:10,807 WARN hdfs.DFSClient (DFSOutputStream.java:run(549)) - DataStreamer Exception
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:510)
2012-04-19 09:33:10,807 WARN hdfs.DFSClient (DFSOutputStream.java:hflush(1511)) - Error while syncing
java.io.IOException: All datanodes 10.18.40.20:50010 are bad. Aborting...
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:908)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
java.io.IOException: All datanodes 10.18.40.20:50010 are bad. Aborting...
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:908)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
{noformat}
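A hedged reading of the fix the report implies, reduced to a simplified stand-in (this is not the actual DFSOutputStream.DataStreamer code): if pipeline setup fails, mark the streamer closed and surface the error, rather than continuing the loop with a dead pipeline and hitting the NullPointerException seen above:

{noformat}
import java.io.IOException;

class DataStreamerSketch extends Thread {
  private volatile boolean closed;
  private volatile IOException lastException;

  @Override
  public void run() {
    while (!closed) {
      try {
        if (!setupPipeline()) {   // analogous to setupPipelineForAppendOrRecovery()
          lastException = new IOException("Failed to set up append/recovery pipeline");
          closed = true;          // stop immediately instead of spinning into an NPE
          break;
        }
        // ... stream packets down the (now valid) pipeline ...
      } catch (IOException e) {
        lastException = e;
        closed = true;
      }
    }
  }

  private boolean setupPipeline() { return false; /* placeholder */ }
}
{noformat}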
[jira] [Commented] (HDFS-3383) libhdfs does not build on ARM because jni_md.h is not found
[ https://issues.apache.org/jira/browse/HDFS-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270339#comment-13270339 ]

Hadoop QA commented on HDFS-3383:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525971/HDFS-3383.patch
against trunk revision.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2388//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2388//console

This message is automatically generated.
[jira] [Created] (HDFS-3385) ClassCastException when trying to append a file
Brahma Reddy Battula created HDFS-3385:
---------------------------------------

Summary: ClassCastException when trying to append a file
Key: HDFS-3385
URL: https://issues.apache.org/jira/browse/HDFS-3385
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 3.0.0
Environment: HDFS
Reporter: Brahma Reddy Battula
Priority: Minor
Fix For: 3.0.0

When I try to append to a file, I get:

{noformat}
2012-05-08 18:13:40,506 WARN util.KerberosName (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, setting default realm to empty
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:217)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42592)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
	at org.apache.hadoop.ipc.Client.call(Client.java:1159)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
	at $Proxy9.append(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
	at $Proxy9.append(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:204)
	at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
	at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
	at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
	at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
	at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
	at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
	at org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
{noformat}
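One plausible defensive pattern for the failing downcast in recoverLeaseInternal(), offered only as an illustration and not as the actual fix; the class names below are stand-ins for the real BlockInfo/BlockInfoUnderConstruction types:

{noformat}
class BlockInfo { }
class BlockInfoUnderConstruction extends BlockInfo {
  void initializeRecovery() { /* ... */ }
}

class LeaseRecoverySketch {
  void recoverLastBlock(BlockInfo lastBlock) throws java.io.IOException {
    if (!(lastBlock instanceof BlockInfoUnderConstruction)) {
      // The file's last block is already complete: fail with a meaningful
      // error instead of letting a blind cast throw ClassCastException.
      throw new java.io.IOException("Last block is not under construction");
    }
    ((BlockInfoUnderConstruction) lastBlock).initializeRecovery();
  }
}
{noformat}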
[jira] [Commented] (HDFS-3363) blockmanagement should stop using INodeFile & INodeFileUC
[ https://issues.apache.org/jira/browse/HDFS-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270435#comment-13270435 ]

Hudson commented on HDFS-3363:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1038 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1038/])
HDFS-3363. Define BlockCollection and MutableBlockCollection interfaces so that INodeFile and INodeFileUnderConstruction do not have to be used in block management. Contributed by John George (Revision 1335304)

Result = FAILURE
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335304
Files:
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/MutableBlockCollection.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSInodeInfo.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFileUnderConstruction.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java

blockmanagement should stop using INodeFile & INodeFileUC
---------------------------------------------------------
Key: HDFS-3363
URL: https://issues.apache.org/jira/browse/HDFS-3363
Project: Hadoop HDFS
Issue Type: Sub-task
Components: name-node
Affects Versions: 2.0.0, 3.0.0
Reporter: John George
Assignee: John George
Priority: Minor
Fix For: 2.0.0
Attachments: HDFS-3363.java, HDFS-3363.java, HDFS-3363.java, HDFS-3363.patch

Blockmanagement should stop using INodeFile and INodeFileUnderConstruction. One way would be to create an interface, like BlockCollection, that is passed along to the blockmanagement.
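Reduced to a sketch, the shape of the change is that block management codes against a narrow interface instead of the concrete INode classes. The method names below are illustrative, not necessarily those in the committed patch:

{noformat}
// Block management sees only this view of "a thing that owns blocks"...
interface BlockCollection {
  short getReplication();
  String getName();
}

// ...plus a mutable variant for files still being written to.
interface MutableBlockCollection extends BlockCollection {
  void setLastBlock(Object lastBlockUnderConstruction);
}

// The namespace side implements the interfaces, so BlockManager no longer
// needs to reference INodeFile / INodeFileUnderConstruction directly.
class INodeFileSketch implements BlockCollection {
  public short getReplication() { return 3; }
  public String getName() { return "/example/file"; }
}
{noformat}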
[jira] [Commented] (HDFS-3365) Enable users to disable socket caching in DFS client configuration
[ https://issues.apache.org/jira/browse/HDFS-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270436#comment-13270436 ]

Hudson commented on HDFS-3365:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1038 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1038/])
HDFS-3365. Enable users to disable socket caching in DFS client configuration. Contributed by Todd Lipcon. (Revision 1335222)

Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335222
Files:
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SocketCache.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestConnCache.java

Enable users to disable socket caching in DFS client configuration
-------------------------------------------------------------------
Key: HDFS-3365
URL: https://issues.apache.org/jira/browse/HDFS-3365
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs client
Affects Versions: 2.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
Fix For: 2.0.0
Attachments: hdfs-3365.txt

Currently the user may configure the number of sockets to cache, but if this conf is set to 0, an exception results. Instead, setting it to 0 should effectively disable the caching behavior.
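The behavior described in the summary, as a small sketch: a capacity of 0 turns the cache into a no-op instead of throwing. This is illustrative code, not the actual SocketCache class:

{noformat}
import java.io.IOException;
import java.net.Socket;
import java.util.LinkedList;

class SocketCacheSketch {
  private final int capacity;  // e.g. from dfs.client.socketcache.capacity
  private final LinkedList<Socket> sockets = new LinkedList<Socket>();

  SocketCacheSketch(int capacity) { this.capacity = capacity; }

  synchronized void put(Socket sock) throws IOException {
    if (capacity <= 0) {       // caching disabled: just close the socket
      sock.close();
      return;
    }
    if (sockets.size() >= capacity) {
      sockets.removeFirst().close();  // evict the oldest entry
    }
    sockets.addLast(sock);
  }

  synchronized Socket get() {
    return sockets.isEmpty() ? null : sockets.removeFirst();
  }
}
{noformat}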
[jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets
[ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270439#comment-13270439 ]

Hudson commented on HDFS-3376:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1038 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1038/])
HDFS-3376. DFSClient fails to make connection to DN if there are many unusable cached sockets. Contributed by Todd Lipcon. (Revision 1335115)

Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335115
Files:
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferKeepalive.java

DFSClient fails to make connection to DN if there are many unusable cached sockets
-----------------------------------------------------------------------------------
Key: HDFS-3376
URL: https://issues.apache.org/jira/browse/HDFS-3376
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client
Affects Versions: 2.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
Fix For: 2.0.0
Attachments: hdfs-3376.txt

After fixing the datanode side of keepalive to properly disconnect stale clients (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries to use cached sockets, and will try a configurable number of sockets from the cache. If there are more cached sockets than the configured number of retries, and all of them have been closed by the datanode side, then the client will throw an exception and mark the replica node as dead.
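The fix pattern the description implies, sketched: failures on cached (possibly stale) sockets should not count against the retry budget; drain the cache and always end with one attempt on a fresh connection. This is illustrative code, not the DFSInputStream change itself:

{noformat}
import java.io.IOException;
import java.net.Socket;

class ConnectSketch {
  interface SocketSource {
    Socket getCached();                  // null when the cache is empty
    Socket newSocket() throws IOException;
  }

  static Socket connectToDatanode(SocketSource src) throws IOException {
    Socket sock;
    // Drain the cache first: sockets closed by the DN's keepalive timeout
    // fail fast here, and those failures must not mark the DN as dead.
    while ((sock = src.getCached()) != null) {
      try {
        sendReadRequest(sock);           // placeholder for the real handshake
        return sock;                     // cached connection is still alive
      } catch (IOException e) {
        closeQuietly(sock);              // stale; try the next cached socket
      }
    }
    // All cached sockets were stale (or none existed): one fresh attempt.
    // A failure *here* is a real connectivity problem.
    sock = src.newSocket();
    sendReadRequest(sock);
    return sock;
  }

  static void sendReadRequest(Socket s) throws IOException { /* placeholder */ }
  static void closeQuietly(Socket s) { try { s.close(); } catch (IOException ignored) {} }
}
{noformat}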
[jira] [Commented] (HDFS-3375) Put client name in DataXceiver thread name for readBlock and keepalive
[ https://issues.apache.org/jira/browse/HDFS-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270440#comment-13270440 ]

Hudson commented on HDFS-3375:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1038 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1038/])
HDFS-3375. Put client name in DataXceiver thread name for readBlock and keepalive. Contributed by Todd Lipcon. (Revision 1335270)

Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335270
Files:
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java

Put client name in DataXceiver thread name for readBlock and keepalive
-----------------------------------------------------------------------
Key: HDFS-3375
URL: https://issues.apache.org/jira/browse/HDFS-3375
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node
Affects Versions: 2.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Trivial
Fix For: 2.0.0
Attachments: hdfs-3375.txt

Currently the datanode thread names include the client for write operations, but not for read. We should include the client name for read ops. Additionally, in the keepalive phase, we should include the client name that initiated the previous request as well.
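What the improvement boils down to, as a sketch: fold the client name into the xceiver thread's name so that jstack output is attributable for reads and keepalive waits too. The real change lives in DataXceiver; the format string below is illustrative:

{noformat}
class XceiverNamingSketch {
  static void updateThreadName(String op, String remoteAddress, String clientName) {
    // e.g. "DataXceiver for client DFSClient_abc at /10.0.0.5:40231 [Sending block ...]"
    Thread.currentThread().setName(
        "DataXceiver for client " + clientName + " at " + remoteAddress + " [" + op + "]");
  }
}
{noformat}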
[jira] [Commented] (HDFS-3378) Remove DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY and DEFAULT
[ https://issues.apache.org/jira/browse/HDFS-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270437#comment-13270437 ]

Hudson commented on HDFS-3378:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1038 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1038/])
HDFS-3378. Remove DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY and DEFAULT. Contributed by Eli Collins (Revision 1335309)

Result = FAILURE
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335309
Files:
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java

Remove DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY and DEFAULT
--------------------------------------------------------
Key: HDFS-3378
URL: https://issues.apache.org/jira/browse/HDFS-3378
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Trivial
Fix For: 2.0.0
Attachments: hdfs-3378.txt

DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY/DEFAULT are dead code as of HDFS-2617, let's remove them.
[jira] [Commented] (HDFS-3328) Premature EOFException in HdfsProtoUtil#vintPrefixed
[ https://issues.apache.org/jira/browse/HDFS-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270443#comment-13270443 ]

Daryn Sharp commented on HDFS-3328:
-----------------------------------

Nice, this looks like the more appropriate port for {{toString()}}. However, the metrics are also inited with the display name. Does it matter for metrics collection whether the xfer or ipc port is used?
[jira] [Updated] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Kelly updated HDFS-234:
----------------------------

Attachment: HDFS-234-branch-2.patch

This patch is a direct copy of the bkjournal code from trunk as of 8 May 2012. To be applied on branch-2.

Integration with BookKeeper logging system
------------------------------------------
Key: HDFS-234
URL: https://issues.apache.org/jira/browse/HDFS-234
Project: Hadoop HDFS
Issue Type: New Feature
Reporter: Luca Telloli
Assignee: Ivan Kelly
Fix For: 3.0.0
Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar

BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system, since it is the metadata repository of the entire file system for HDFS.
[jira] [Commented] (HDFS-3363) blockmanagement should stop using INodeFile & INodeFileUC
[ https://issues.apache.org/jira/browse/HDFS-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270469#comment-13270469 ]

Hudson commented on HDFS-3363:
------------------------------

Integrated in Hadoop-Mapreduce-trunk #1073 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1073/])
HDFS-3363. Define BlockCollection and MutableBlockCollection interfaces so that INodeFile and INodeFileUnderConstruction do not have to be used in block management. Contributed by John George (Revision 1335304)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335304
[jira] [Commented] (HDFS-3378) Remove DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY and DEFAULT
[ https://issues.apache.org/jira/browse/HDFS-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270471#comment-13270471 ] Hudson commented on HDFS-3378: -- Integrated in Hadoop-Mapreduce-trunk #1073 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1073/]) HDFS-3378. Remove DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY and DEFAULT. Contributed by Eli Collins (Revision 1335309) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1335309 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java Remove DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY and DEFAULT Key: HDFS-3378 URL: https://issues.apache.org/jira/browse/HDFS-3378 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Eli Collins Assignee: Eli Collins Priority: Trivial Fix For: 2.0.0 Attachments: hdfs-3378.txt DFS_NAMENODE_SECONDARY_HTTPS_PORT_KEY/DEFAULT are dead code as of HDFS-2617, let's remove them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3365) Enable users to disable socket caching in DFS client configuration
[ https://issues.apache.org/jira/browse/HDFS-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270470#comment-13270470 ] Hudson commented on HDFS-3365: -- Integrated in Hadoop-Mapreduce-trunk #1073 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1073/]) HDFS-3365. Enable users to disable socket caching in DFS client configuration. Contributed by Todd Lipcon. (Revision 1335222) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1335222 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SocketCache.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestConnCache.java Enable users to disable socket caching in DFS client configuration -- Key: HDFS-3365 URL: https://issues.apache.org/jira/browse/HDFS-3365 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 2.0.0 Attachments: hdfs-3365.txt Currently the user may configure the number of sockets to cache. But, if this conf is set to 0, then an exception results. Instead, setting to 0 should effectively disable the caching behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
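To make the intended semantics concrete, here is a minimal sketch (illustrative only, not the actual org.apache.hadoop.hdfs.SocketCache) of a cache that treats a configured capacity of 0 as "caching disabled" instead of throwing:
{code}
import java.io.IOException;
import java.net.Socket;
import java.util.LinkedList;

class SimpleSocketCache {
  private final int capacity;
  private final LinkedList<Socket> sockets = new LinkedList<Socket>();

  SimpleSocketCache(int capacity) {
    this.capacity = capacity;          // capacity == 0 means caching is disabled
  }

  synchronized void put(Socket sock) throws IOException {
    if (capacity <= 0) {
      sock.close();                    // disabled: close instead of caching or throwing
      return;
    }
    if (sockets.size() >= capacity) {
      sockets.removeFirst().close();   // evict the oldest entry
    }
    sockets.addLast(sock);
  }

  synchronized Socket get() {
    return sockets.isEmpty() ? null : sockets.removeFirst();
  }
}
{code}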
[jira] [Commented] (HDFS-3375) Put client name in DataXceiver thread name for readBlock and keepalive
[ https://issues.apache.org/jira/browse/HDFS-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270474#comment-13270474 ] Hudson commented on HDFS-3375: -- Integrated in Hadoop-Mapreduce-trunk #1073 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1073/]) HDFS-3375. Put client name in DataXceiver thread name for readBlock and keepalive. Contributed by Todd Lipcon. (Revision 1335270) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1335270 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java Put client name in DataXceiver thread name for readBlock and keepalive -- Key: HDFS-3375 URL: https://issues.apache.org/jira/browse/HDFS-3375 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 2.0.0 Attachments: hdfs-3375.txt Currently the datanode thread names include the client for write operations, but not for read. We should include the client name for read ops. Additionally, in the keepalive phase, we should include the client name that initiated the previous request as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
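The technique itself is simple; a hedged sketch (not the DataXceiver code) of a worker thread that renames itself to carry the client name while an operation is in flight, and keeps the previous client's name during the keepalive phase:
{code}
// Illustrative only: serve() and the name format are assumptions, not the
// actual DataXceiver implementation.
void serve(String clientName, String op) {
  String idleName = Thread.currentThread().getName();
  Thread.currentThread().setName(idleName + " [op=" + op
      + ", client=" + clientName + "]");
  try {
    // ... handle the readBlock request here ...
  } finally {
    // keepalive phase: remember which client issued the previous request
    Thread.currentThread().setName(idleName
        + " [previous client=" + clientName + "]");
  }
}
{code}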
[jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets
[ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270473#comment-13270473 ] Hudson commented on HDFS-3376: -- Integrated in Hadoop-Mapreduce-trunk #1073 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1073/]) HDFS-3376. DFSClient fails to make connection to DN if there are many unusable cached sockets. Contributed by Todd Lipcon. (Revision 1335115) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1335115 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferKeepalive.java DFSClient fails to make connection to DN if there are many unusable cached sockets -- Key: HDFS-3376 URL: https://issues.apache.org/jira/browse/HDFS-3376 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 2.0.0 Attachments: hdfs-3376.txt After fixing the datanode side of keepalive to properly disconnect stale clients, (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries to use cached sockets, and will try a configurable number of sockets from the cache. If there are more cached sockets than the configured number of retries, and all of them have been closed by the datanode side, then the client will throw an exception and mark the replica node as dead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
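A hedged sketch of the fix direction implied by the report: drain stale cached sockets without charging them against the retry budget, and only fail once a brand-new connection also fails. Here cache.get(), isUsable() and newConnection() are assumed helpers, not real DFSClient APIs:
{code}
Socket connectToDatanode() throws IOException {
  Socket sock;
  while ((sock = cache.get()) != null) {
    if (isUsable(sock)) {      // e.g. a lightweight liveness probe
      return sock;             // healthy cached socket: reuse it
    }
    sock.close();              // stale: discard and keep draining the cache
  }
  return newConnection();      // cache exhausted: open a fresh socket
}
{code}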
[jira] [Created] (HDFS-3386) Namenode is not deleting its lock entry '/ledgers/lock/lock-0000X' when it fails to acquire the lock
surendra singh lilhore created HDFS-3386: Summary: Namenode is not deleting its lock entry '/ledgers/lock/lock-X' when it fails to acquire the lock Key: HDFS-3386 URL: https://issues.apache.org/jira/browse/HDFS-3386 Project: Hadoop HDFS Issue Type: Bug Components: ha Reporter: surendra singh lilhore Priority: Minor Fix For: 0.23.0 When a Standby NN becomes Active, it will first create its sequential lock entry lock-000X in ZK and then try to acquire the lock as shown below:
{code}
myznode = zkc.create(lockpath + "/lock-", new byte[] {'0'},
    Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
  if (LOG.isTraceEnabled()) {
    LOG.trace("Lock acquired - " + myznode);
  }
  lockCount.set(1);
  zkc.exists(myznode, this);
  return;
} else {
  LOG.error("Failed to acquire lock with " + myznode + ", "
      + nodes.get(0) + " already has it");
  throw new IOException("Could not acquire lock");
}
{code}
Say the transition to active fails to acquire the lock: it will throw the exception and the NN gets shut down. The problem is that the lock entry lock-000X will remain in ZK until session expiry, and further start-ups will not be able to acquire the lock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3386) BK JM : Namenode is not deleting its lock entry '/ledgers/lock/lock-0000X' when it fails to acquire the lock
[ https://issues.apache.org/jira/browse/HDFS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore updated HDFS-3386: - Description: When a Standby NN becomes Active, it will first create its sequential lock entry lock-000X in ZK and then try to acquire the lock as shown below:
{code}
myznode = zkc.create(lockpath + "/lock-", new byte[] {'0'},
    Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
  if (LOG.isTraceEnabled()) {
    LOG.trace("Lock acquired - " + myznode);
  }
  lockCount.set(1);
  zkc.exists(myznode, this);
  return;
} else {
  LOG.error("Failed to acquire lock with " + myznode + ", "
      + nodes.get(0) + " already has it");
  throw new IOException("Could not acquire lock");
}
{code}
Say the transition to active fails to acquire the lock: it will throw the exception and the NN gets shut down. The problem is that the lock entry lock-000X will remain in ZK until session expiry, and further start-ups will not be able to acquire the lock. was: When a Standby NN becomes Active, it will first create its sequential lock entry lock-000X in ZK and then try to acquire the lock as shown below:
{code}
myznode = zkc.create(lockpath + "/lock-", new byte[] {'0'},
    Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
  if (LOG.isTraceEnabled()) {
    LOG.trace("Lock acquired - " + myznode);
  }
  lockCount.set(1);
  zkc.exists(myznode, this);
  return;
} else {
  LOG.error("Failed to acquire lock with " + myznode + ", "
      + nodes.get(0) + " already has it");
  throw new IOException("Could not acquire lock");
}
{code}
Say the transition to active fails to acquire the lock: it will throw the exception and the NN gets shut down. The problem is that the lock entry lock-000X will remain in ZK until session expiry, and further start-ups will not be able to acquire the lock. Summary: BK JM : Namenode is not deleting its lock entry '/ledgers/lock/lock-X' when it fails to acquire the lock (was: Namenode is not deleting its lock entry '/ledgers/lock/lock-X' when it fails to acquire the lock) BK JM : Namenode is not deleting its lock entry '/ledgers/lock/lock-X' when it fails to acquire the lock -- Key: HDFS-3386 URL: https://issues.apache.org/jira/browse/HDFS-3386 Project: Hadoop HDFS Issue Type: Bug Components: ha Reporter: surendra singh lilhore Priority: Minor Fix For: 0.23.0 When a Standby NN becomes Active, it will first create its sequential lock entry lock-000X in ZK and then try to acquire the lock as shown above. Say the transition to active fails to acquire the lock: it will throw the exception and the NN gets shut down. The problem is that the lock entry lock-000X will remain in ZK until session expiry, and further start-ups will not be able to acquire the lock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3386) BK JM : Namenode is not deleting its lock entry '/ledgers/lock/lock-0000X' when it fails to acquire the lock
[ https://issues.apache.org/jira/browse/HDFS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270502#comment-13270502 ] Uma Maheswara Rao G commented on HDFS-3386: --- Yes, I think we can delete that newly created lock file if it fails to acquire the lock. BK JM : Namenode is not deleting its lock entry '/ledgers/lock/lock-X' when it fails to acquire the lock -- Key: HDFS-3386 URL: https://issues.apache.org/jira/browse/HDFS-3386 Project: Hadoop HDFS Issue Type: Bug Components: ha Reporter: surendra singh lilhore Priority: Minor Fix For: 0.23.0 When a Standby NN becomes Active, it will first create its sequential lock entry lock-000X in ZK and then try to acquire the lock as shown below:
{code}
myznode = zkc.create(lockpath + "/lock-", new byte[] {'0'},
    Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
  if (LOG.isTraceEnabled()) {
    LOG.trace("Lock acquired - " + myznode);
  }
  lockCount.set(1);
  zkc.exists(myznode, this);
  return;
} else {
  LOG.error("Failed to acquire lock with " + myznode + ", "
      + nodes.get(0) + " already has it");
  throw new IOException("Could not acquire lock");
}
{code}
Say the transition to active fails to acquire the lock: it will throw the exception and the NN gets shut down. The problem is that the lock entry lock-000X will remain in ZK until session expiry, and further start-ups will not be able to acquire the lock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
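Uma's suggestion, expressed as a hedged sketch (not the committed fix): if this node is not the lowest sequence number, delete its own znode before propagating the failure, so the stale lock entry does not linger until ZooKeeper session expiry. zkc is the ZooKeeper client handle assumed by the snippet above:
{code}
if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
  lockCount.set(1);
  zkc.exists(myznode, this);
  return;
} else {
  LOG.error("Failed to acquire lock with " + myznode + ", "
      + nodes.get(0) + " already has it");
  zkc.delete(myznode, -1);   // clean up our own stale lock entry first
  throw new IOException("Could not acquire lock");
}
{code}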
[jira] [Updated] (HDFS-3328) NPE in DataNode.getIpcPort
[ https://issues.apache.org/jira/browse/HDFS-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3328: -- Summary: NPE in DataNode.getIpcPort (was: Premature EOFException in HdfsProtoUtil#vintPrefixed) NPE in DataNode.getIpcPort -- Key: HDFS-3328 URL: https://issues.apache.org/jira/browse/HDFS-3328 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0 Reporter: Uma Maheswara Rao G Assignee: Eli Collins Priority: Minor Attachments: hdfs-3328.txt While running the tests, I have seen these exceptions. The tests passed, so I am not sure whether this is a problem.
{noformat}
2012-04-26 23:15:51,763 WARN hdfs.DFSClient (DFSOutputStream.java:run(710)) - DFSOutputStream ResponseProcessor exception for block BP-1372255573-49.249.124.17-1335462329685:blk_-843504080180201_1005
java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:657)
Exception in thread "DataXceiver for client /127.0.0.1:52323 [Cleaning up]" java.lang.NullPointerException
    at org.apache.hadoop.ipc.Server$Listener.getAddress(Server.java:669)
    at org.apache.hadoop.ipc.Server.getListenerAddress(Server.java:1988)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.getIpcPort(DataNode.java:882)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.getDisplayName(DataNode.java:863)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:171)
    at java.lang.Thread.run(Unknown Source)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3328) NPE in DataNode.getIpcPort
[ https://issues.apache.org/jira/browse/HDFS-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270516#comment-13270516 ] Eli Collins commented on HDFS-3328: --- Thanks Daryn. This puts the behavior back as it was previously - metrics and toString previously used the xfer port. It would only matter if people care about the particular port value in the DataNodeActivity-hostname-port key. I suspect most people care about the hostname; the port is used to make it unique for the case of running multiple DNs on the same host. NPE in DataNode.getIpcPort -- Key: HDFS-3328 URL: https://issues.apache.org/jira/browse/HDFS-3328 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0 Reporter: Uma Maheswara Rao G Assignee: Eli Collins Priority: Minor Attachments: hdfs-3328.txt While running the tests, I have seen these exceptions. The tests passed, so I am not sure whether this is a problem.
{noformat}
2012-04-26 23:15:51,763 WARN hdfs.DFSClient (DFSOutputStream.java:run(710)) - DFSOutputStream ResponseProcessor exception for block BP-1372255573-49.249.124.17-1335462329685:blk_-843504080180201_1005
java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:657)
Exception in thread "DataXceiver for client /127.0.0.1:52323 [Cleaning up]" java.lang.NullPointerException
    at org.apache.hadoop.ipc.Server$Listener.getAddress(Server.java:669)
    at org.apache.hadoop.ipc.Server.getListenerAddress(Server.java:1988)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.getIpcPort(DataNode.java:882)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.getDisplayName(DataNode.java:863)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:171)
    at java.lang.Thread.run(Unknown Source)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
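The stack trace shows the NPE arises when getDisplayName() queries the IPC server while the DataXceiver thread is cleaning up. A hedged, illustrative variant of the idea Eli describes (field and method names are assumptions, not the actual DataNode code): build the display name from the streaming (xfer) address, which is always set, instead of touching the possibly-stopped IPC listener:
{code}
String getDisplayName() {
  // never consults the IPC Server, so it cannot NPE during shutdown
  return hostName + ":" + getXferPort();
}
{code}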
[jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets
[ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270551#comment-13270551 ] stack commented on HDFS-3376: - +1 on pulling the hadoop-8280, etc., series into 0.23 branch. DFSClient fails to make connection to DN if there are many unusable cached sockets -- Key: HDFS-3376 URL: https://issues.apache.org/jira/browse/HDFS-3376 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 2.0.0 Attachments: hdfs-3376.txt After fixing the datanode side of keepalive to properly disconnect stale clients, (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries to use cached sockets, and will try a configurable number of sockets from the cache. If there are more cached sockets than the configured number of retries, and all of them have been closed by the datanode side, then the client will throw an exception and mark the replica node as dead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3153) For HA, a logical name is visible in URIs - add an explicit logical name
[ https://issues.apache.org/jira/browse/HDFS-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270567#comment-13270567 ] Sanjay Radia commented on HDFS-3153: A configuration change will affect users; the URI is derived from this configuration parameter. Do you feel it is acceptable to make this change after 2.0 is shipped? For HA, a logical name is visible in URIs - add an explicit logical name Key: HDFS-3153 URL: https://issues.apache.org/jira/browse/HDFS-3153 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Sanjay Radia Priority: Blocker Fix For: 2.0.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3153) For HA, a logical name is visible in URIs - add an explicit logical name
[ https://issues.apache.org/jira/browse/HDFS-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270571#comment-13270571 ] Aaron T. Myers commented on HDFS-3153: -- bq. A configuration change will affect users - The URI is derived from this configuration parameter. The URI is already derived from a configuration parameter, it's just that that configuration parameter is overloaded to be both the nameservice ID and the URI which users refer to from clients. As I understand the proposal, it's not obvious to me that this change would necessarily affect users, since the publicly-visible URIs could theoretically remain the same both before and after this change - only the underlying configs would change. bq. Do you feel it is acceptable to make this change after 2.0 is shipped? I think we should see a concrete proposal or patch to be able to make the determination as to whether or not 2.0.0 can ship without this change. Only then can we determine whether or not this change can be made in a backward-compatible fashion. Regardless of that, however, I agree with Todd and Eli that it seems fine to ship 2.0.0-alpha without this change, as it is a clearly-labeled alpha release. For HA, a logical name is visible in URIs - add an explicit logical name Key: HDFS-3153 URL: https://issues.apache.org/jira/browse/HDFS-3153 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Sanjay Radia Priority: Blocker Fix For: 2.0.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3370) HDFS hardlink
[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270589#comment-13270589 ] Daryn Sharp commented on HDFS-3370: --- While I really like the idea of hardlinks, I believe there are more non-trivial considerations with this proposed implementation. I'm by no means a SME, but I experimented with a very different approach a while ago. Here are some of the issues I encountered:
I think the quota considerations may be a bit trickier. The original creator of the file takes the nsquota and dsquota hit. The links take just the dsquota hit. However, when the original creator of the file is removed, one of the other links must absorb the dsquota. If there are multiple remaining links, which one takes the hit? What if none of the remaining links have available quota? If the dsquota can always be exceeded, I can bypass my quota by creating the file in one dir, hardlinking from my out-of-dsquota dir, then removing the original. If the dsquota cannot be exceeded, I can (maliciously?) hardlink from my out-of-dsquota dir to deny the original creator the ability to delete the file -- perhaps causing them to be unable to reduce their quota usage.
Block management will also be impacted. The manager currently operates on an inode mapping (changing to an interface though), but which of the hardlink inodes will it be? The original? When that link is removed, how will the block manager be updated with another hardlink inode?
When a file is open for writing, the inode converts to under construction, so there would need to be a hardlink under construction. You will have to think about how other hardlinks are affected/handled. The case applies to hardlinks during file creation and appending. There may also be an impact to file leases. I believe they are path based, so leases will now need to be enforced across multiple paths.
What if one hardlink changes the replication factor? The maximum replication factor for all hardlinks should probably be obeyed, but now the setrep command will never succeed since it waits for the replication value to actually change.
HDFS hardlink - Key: HDFS-3370 URL: https://issues.apache.org/jira/browse/HDFS-3370 Project: Hadoop HDFS Issue Type: New Feature Reporter: Hairong Kuang Assignee: Liyin Tang Attachments: HDFS-HardLinks.pdf We'd like to add a new feature hardlink to HDFS that allows hardlinked files to share data without copying. Currently we will support hardlinking only closed files, but it could be extended to unclosed files as well. Among many potential use cases of the feature, the following two are primarily used in facebook: 1. This provides a lightweight way for applications like hbase to create a snapshot; 2. This also allows an application like Hive to move a table to a different directory without breaking current running hive queries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets
[ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270596#comment-13270596 ] Daryn Sharp commented on HDFS-3376: --- Perhaps a naive question, but why can't {{socket.isClosed()}} be used to determine if the socket is unusable? The closed sockets could be skipped and removed from the cache. DFSClient fails to make connection to DN if there are many unusable cached sockets -- Key: HDFS-3376 URL: https://issues.apache.org/jira/browse/HDFS-3376 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 2.0.0 Attachments: hdfs-3376.txt After fixing the datanode side of keepalive to properly disconnect stale clients, (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries to use cached sockets, and will try a configurable number of sockets from the cache. If there are more cached sockets than the configured number of retries, and all of them have been closed by the datanode side, then the client will throw an exception and mark the replica node as dead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3328) NPE in DataNode.getIpcPort
[ https://issues.apache.org/jira/browse/HDFS-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270609#comment-13270609 ] Daryn Sharp commented on HDFS-3328: --- +1 Ok, thanks for the clarification! NPE in DataNode.getIpcPort -- Key: HDFS-3328 URL: https://issues.apache.org/jira/browse/HDFS-3328 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0 Reporter: Uma Maheswara Rao G Assignee: Eli Collins Priority: Minor Attachments: hdfs-3328.txt While running the tests, I have seen these exceptions. The tests passed, so I am not sure whether this is a problem.
{noformat}
2012-04-26 23:15:51,763 WARN hdfs.DFSClient (DFSOutputStream.java:run(710)) - DFSOutputStream ResponseProcessor exception for block BP-1372255573-49.249.124.17-1335462329685:blk_-843504080180201_1005
java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:657)
Exception in thread "DataXceiver for client /127.0.0.1:52323 [Cleaning up]" java.lang.NullPointerException
    at org.apache.hadoop.ipc.Server$Listener.getAddress(Server.java:669)
    at org.apache.hadoop.ipc.Server.getListenerAddress(Server.java:1988)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.getIpcPort(DataNode.java:882)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.getDisplayName(DataNode.java:863)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:171)
    at java.lang.Thread.run(Unknown Source)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets
[ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270627#comment-13270627 ] Todd Lipcon commented on HDFS-3376: --- bq. Perhaps a naive question, but why can't socket.isClosed() be used to determine if the socket is unusable? The closed sockets could be skipped and removed from the cache. Unfortunately the .isClosed() method just checks a local flag which is set by close(). Here's the JDK source: {code} public boolean isClosed() { synchronized(closeLock) { return closed; } } {code} It may be possible to determine closed-ness by setting up a selector and selecting only for errors, but that seems somewhat complicated and for not much gain. DFSClient fails to make connection to DN if there are many unusable cached sockets -- Key: HDFS-3376 URL: https://issues.apache.org/jira/browse/HDFS-3376 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 2.0.0 Attachments: hdfs-3376.txt After fixing the datanode side of keepalive to properly disconnect stale clients, (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries to use cached sockets, and will try a configurable number of sockets from the cache. If there are more cached sockets than the configured number of retries, and all of them have been closed by the datanode side, then the client will throw an exception and mark the replica node as dead. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
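For illustration, a hedged sketch of the kind of probe Todd alludes to: a non-blocking read on the socket's NIO channel returns -1 once the peer has closed. Caveat: if the peer had actually sent data, this would consume a byte, so it sketches the idea rather than production logic, and it assumes the socket was created with an associated SocketChannel:
{code}
import java.io.IOException;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

static boolean peerClosed(Socket sock) throws IOException {
  SocketChannel ch = sock.getChannel();   // assumed non-null here
  ch.configureBlocking(false);
  try {
    return ch.read(ByteBuffer.allocate(1)) == -1;  // -1 == remote close seen
  } finally {
    ch.configureBlocking(true);           // restore blocking mode for normal I/O
  }
}
{code}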
[jira] [Commented] (HDFS-3153) For HA, a logical name is visible in URIs - add an explicit logical name
[ https://issues.apache.org/jira/browse/HDFS-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270631#comment-13270631 ] Todd Lipcon commented on HDFS-3153: --- bq. Regardless of that, however, I agree with Todd and Eli that it seems fine to ship 2.0.0-alpha without this change, as it is a clearly-labeled alpha release. +1. We can also clearly label that the HA features are somewhat new and the configurations are liable to change in future 2.0.x releases as it stabilizes and we continue to improve it. For HA, a logical name is visible in URIs - add an explicit logical name Key: HDFS-3153 URL: https://issues.apache.org/jira/browse/HDFS-3153 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Sanjay Radia Priority: Blocker Fix For: 2.0.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3387) [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser
Brahma Reddy Battula created HDFS-3387: -- Summary: [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser Key: HDFS-3387 URL: https://issues.apache.org/jira/browse/HDFS-3387 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Brahma Reddy Battula Priority: Trivial Scenario: Execute any fsshell command with invalid options, like ./hdfs haadmin -transitionToActive... Here it logs the following: bin/hadoop command [genericOptions] [commandOptions]... Expected: The help message misleads the user by saying bin/hadoop, which is not what the user actually ran; it would be better to log bin/hdfs. Anyway, hadoop is deprecated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
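The requested behavior amounts to parameterizing the usage banner with the script the user actually ran. A purely hypothetical sketch: the property name "hadoop.launcher.script" is invented for illustration and GenericOptionsParser has no such hook; the launcher scripts (bin/hadoop, bin/hdfs, ...) would have to set it:
{code}
static String usagePrefix() {
  // falls back to "hadoop" when the launcher script did not identify itself
  String launcher = System.getProperty("hadoop.launcher.script", "hadoop");
  return "bin/" + launcher + " command [genericOptions] [commandOptions]";
}
{code}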
[jira] [Updated] (HDFS-3387) [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser
[ https://issues.apache.org/jira/browse/HDFS-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HDFS-3387: --- Fix Version/s: 3.0.0 2.0.0 Status: Patch Available (was: Open) Attaching patch.. [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser -- Key: HDFS-3387 URL: https://issues.apache.org/jira/browse/HDFS-3387 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Brahma Reddy Battula Priority: Trivial Fix For: 2.0.0, 3.0.0 Scenario: Execute any fsshell command with invalid options, like ./hdfs haadmin -transitionToActive... Here it logs the following: bin/hadoop command [genericOptions] [commandOptions]... Expected: The help message misleads the user by saying bin/hadoop, which is not what the user actually ran; it would be better to log bin/hdfs. Anyway, hadoop is deprecated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3387) [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser
[ https://issues.apache.org/jira/browse/HDFS-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HDFS-3387: --- Attachment: HDFS-3387.patch [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser -- Key: HDFS-3387 URL: https://issues.apache.org/jira/browse/HDFS-3387 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Brahma Reddy Battula Priority: Trivial Fix For: 2.0.0, 3.0.0 Attachments: HDFS-3387.patch Scenario: Execute any fsshell command with invalid options, like ./hdfs haadmin -transitionToActive... Here it logs the following: bin/hadoop command [genericOptions] [commandOptions]... Expected: The help message misleads the user by saying bin/hadoop, which is not what the user actually ran; it would be better to log bin/hdfs. Anyway, hadoop is deprecated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3387) [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser
[ https://issues.apache.org/jira/browse/HDFS-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270650#comment-13270650 ] Hadoop QA commented on HDFS-3387: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526018/HDFS-3387.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2389//console
This message is automatically generated. [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser -- Key: HDFS-3387 URL: https://issues.apache.org/jira/browse/HDFS-3387 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Brahma Reddy Battula Priority: Trivial Fix For: 2.0.0, 3.0.0 Attachments: HDFS-3387.patch Scenario: Execute any fsshell command with invalid options, like ./hdfs haadmin -transitionToActive... Here it logs the following: bin/hadoop command [genericOptions] [commandOptions]... Expected: The help message misleads the user by saying bin/hadoop, which is not what the user actually ran; it would be better to log bin/hdfs. Anyway, hadoop is deprecated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270658#comment-13270658 ] Todd Lipcon commented on HDFS-234: -- Uma: I'm a little concerned about having this as a main part of our 2.0 release, to be honest. There hasn't been activity and the only activity I've seen around it has been from Ivan. Is anyone committed to actually QAing and running this code? A lot of the other features around HA don't integrate with it, it isn't documented, etc. Maybe we should start a discussion on the list. Integration with BookKeeper logging system -- Key: HDFS-234 URL: https://issues.apache.org/jira/browse/HDFS-234 Project: Hadoop HDFS Issue Type: New Feature Reporter: Luca Telloli Assignee: Ivan Kelly Fix For: 3.0.0 Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system for being the metadata repository of the entire file system for HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3277) fail over to loading a different FSImage if the first one we try to load is corrupt
[ https://issues.apache.org/jira/browse/HDFS-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3277: --- Attachment: HDFS-3277.003.patch * fix bug where we weren't always loading the newest image(s) * rebase on trunk fail over to loading a different FSImage if the first one we try to load is corrupt --- Key: HDFS-3277 URL: https://issues.apache.org/jira/browse/HDFS-3277 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3277.002.patch, HDFS-3277.003.patch Most users store multiple copies of the FSImage in order to prevent catastrophic data loss if a hard disk fails. However, our image loading code is currently not set up to start reading another FSImage if loading the first one does not succeed. We should add this capability. We should also be sure to remove the FSImage directory that failed from the list of FSImage directories to write to, in the way we normally do when a write (as opopsed to read) fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
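A hedged, method-level sketch of the failover idea (loadImage(), removeFromWriteList() and the newest-first ordering are assumed placeholders, not the FSImage API): try each candidate image in order and fall through to the next copy on corruption, dropping the failed directory from the write set as the description requires:
{code}
File loadFirstGoodImage(List<File> candidates) throws IOException {
  IOException last = null;
  for (File image : candidates) {   // assumed sorted newest-first
    try {
      loadImage(image);             // may throw on a corrupt image
      return image;                 // success: this copy is authoritative
    } catch (IOException e) {
      last = e;
      removeFromWriteList(image);   // stop writing to the failed directory
    }
  }
  throw new IOException("All FSImage copies failed to load", last);
}
{code}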
[jira] [Commented] (HDFS-3370) HDFS hardlink
[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270660#comment-13270660 ] John George commented on HDFS-3370: --- Thanks for uploading the design document. Do you plan to support hardlink using FileContext? In the design document, I see FileSystem and FsShell being mentioned as client interface - hence the question. HDFS hardlink - Key: HDFS-3370 URL: https://issues.apache.org/jira/browse/HDFS-3370 Project: Hadoop HDFS Issue Type: New Feature Reporter: Hairong Kuang Assignee: Liyin Tang Attachments: HDFS-HardLinks.pdf We'd like to add a new feature hardlink to HDFS that allows hardlinked files to share data without copying. Currently we will support hardlinking only closed files, but it could be extended to unclosed files as well. Among many potential use cases of the feature, the following two are primarily used in facebook: 1. This provides a lightweight way for applications like hbase to create a snapshot; 2. This also allows an application like Hive to move a table to a different directory without breaking current running hive queries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270686#comment-13270686 ] Uma Maheswara Rao G commented on HDFS-234: -- Hi Todd, as this is also one option for HA, we have merged this patch to Hadoop-2 internally and are testing it with BookKeeper. Also, this is a contrib part of HDFS. As it is a pluggable option, why can't we keep it in contrib? If we have other options ready and performing well, then we can choose one among them, as you mentioned in other discussions. I have also seen the comment from Dhruba about continuing to invest in the BK-based approach: https://issues.apache.org/jira/browse/HDFS-3092?focusedCommentId=13265110page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13265110 Since the basics are ready on the BKJM side, it will help some users evaluate it, as this is an alpha-quality release. We also identified some issues like HDFS-3058 in our testing; in fact we have to push that in as well. Regarding documentation: I agree, we may have to prepare good documentation for using this option; currently it is available only in the BKJM class. Integration with BookKeeper logging system -- Key: HDFS-234 URL: https://issues.apache.org/jira/browse/HDFS-234 Project: Hadoop HDFS Issue Type: New Feature Reporter: Luca Telloli Assignee: Ivan Kelly Fix For: 3.0.0 Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system, being the metadata repository of the entire file system for HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3387) [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser
[ https://issues.apache.org/jira/browse/HDFS-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270684#comment-13270684 ] Aaron T. Myers commented on HDFS-3387: -- Hey Brahma, you need to root your patch at the repo root. The patch you attached doesn't include the path to the file you edited. [Fsshell] It's better to provide hdfs instead of hadoop in GenericOptionsParser -- Key: HDFS-3387 URL: https://issues.apache.org/jira/browse/HDFS-3387 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Brahma Reddy Battula Priority: Trivial Fix For: 2.0.0, 3.0.0 Attachments: HDFS-3387.patch Scenario: Execute any fsshell command with invalid options, like ./hdfs haadmin -transitionToActive... Here it logs the following: bin/hadoop command [genericOptions] [commandOptions]... Expected: The help message misleads the user by saying bin/hadoop, which is not what the user actually ran; it would be better to log bin/hdfs. Anyway, hadoop is deprecated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
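For reference, an assumed workflow (matching the project's patch conventions of the era) that produces a patch whose paths are rooted at the repository root:
{noformat}
$ cd hadoop-trunk                       # the checkout root, not a subdirectory
$ svn diff > HDFS-3387.patch            # paths come out relative to the root
# or, from a git mirror (--no-prefix so the paths apply with patch -p0):
$ git diff --no-prefix trunk > HDFS-3387.patch
{noformat}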
[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270691#comment-13270691 ] dhruba borthakur commented on HDFS-234: --- Hi Uma, while it is true that we are dedicating resources to evaluate HDFS-HA with BK, it will possibly be for trunk and not for 2.0. Are you folks planning to use this in Hadoop 2.0 in production? Integration with BookKeeper logging system -- Key: HDFS-234 URL: https://issues.apache.org/jira/browse/HDFS-234 Project: Hadoop HDFS Issue Type: New Feature Reporter: Luca Telloli Assignee: Ivan Kelly Fix For: 3.0.0 Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system, being the metadata repository of the entire file system for HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-2236) DfsServlet and FileChecksumServlets still use the filename parameter
[ https://issues.apache.org/jira/browse/HDFS-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lohit giri reassigned HDFS-2236: Assignee: lohit giri DfsServlet and FileChecksumServlets still use the filename parameter - Key: HDFS-2236 URL: https://issues.apache.org/jira/browse/HDFS-2236 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Eli Collins Assignee: lohit giri Priority: Minor Attachments: HDFS-2236.0.22.patch DfsServlet and FileChecksumServlets still use the filename parameter; they weren't modified in HDFS-1109 to pass the file name as part of the URI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3385) ClassCastException when trying to append a file
[ https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3385: - Description: When I try to append a file I got
{noformat}
2012-05-08 18:13:40,506 WARN util.KerberosName (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, setting default realm to empty
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
    ...
    at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
    at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
    at org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
{noformat}
was: When I try to append a file I got
2012-05-08 18:13:40,506 WARN util.KerberosName (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, setting default realm to empty
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:217)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42592)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
    at org.apache.hadoop.ipc.Client.call(Client.java:1159)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
    at $Proxy9.append(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at $Proxy9.append(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:204)
    at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
    at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
    at org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
It seems that the ClassCastException does not show up in the existing unit tests. How to reproduce this exactly? Could you give more details? (Let's remove the rpc stack trace in the description.)
[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270705#comment-13270705 ] Eli Collins commented on HDFS-234: -- Uma, given all the bugs you found recently that haven't been triaged/assigned, the lack of docs, etc., I'd keep this slated for 3.0. If it gets tested/documented and stabilizes on trunk, then we can consider merging. We're actually trying to get away from using contrib. See this thread for history: http://search-hadoop.com/m/EPdTjoLpuh2. I removed most of the hdfs contrib components. Thanks, Eli Integration with BookKeeper logging system -- Key: HDFS-234 URL: https://issues.apache.org/jira/browse/HDFS-234 Project: Hadoop HDFS Issue Type: New Feature Reporter: Luca Telloli Assignee: Ivan Kelly Fix For: 3.0.0 Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system, being the metadata repository of the entire file system for HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3385) ClassCastException when trying to append a file
[ https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3385: - Attachment: h3385_20120508.patch The line numbers in the stack trace are off. If my guess is correct, the fix could be as simple as the following patch. h3385_20120508.patch ClassCastException when trying to append a file --- Key: HDFS-3385 URL: https://issues.apache.org/jira/browse/HDFS-3385 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 3.0.0 Environment: HDFS Reporter: Brahma Reddy Battula Priority: Minor Fix For: 3.0.0 Attachments: h3385_20120508.patch When I try to append a file I got
{noformat}
2012-05-08 18:13:40,506 WARN util.KerberosName (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, setting default realm to empty
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
    ...
    at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
    at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
    at org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
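A hedged guess at the shape of such a fix (the attached h3385_20120508.patch is not reproduced here; lastBlock is an assumed local in recoverLeaseInternal): check the runtime type before downcasting the last block:
{code}
if (lastBlock instanceof BlockInfoUnderConstruction) {
  BlockInfoUnderConstruction uc = (BlockInfoUnderConstruction) lastBlock;
  // proceed with the under-construction lease/block recovery path
} else {
  // the block is already complete; skip the step that assumed the cast
}
{code}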
[jira] [Commented] (HDFS-3370) HDFS hardlink
[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270715#comment-13270715 ] Liyin Tang commented on HDFS-3370: -- @Daryn Sharp: very good comments :) 1) Quota is the trickiest part of hard links. nsquota usage is added when creating hardlinks and decreased when removing them. dsquota usage is increased and decreased only for directories that are not common ancestors of any linked files. For example, for ln /root/dir1/file1 /root/dir1/file2, there is no need to increase the ds quota usage when creating the link file file2. Likewise, for rm /root/dir1/file1, there is no need to decrease the ds quota usage when removing the original source file file1. The bottom line is that there is no case where we need to increase any dsquota during a file removal operation: if the directory is a common ancestor, no dsquota needs to be updated; otherwise, the dsquota was already updated at hard link creation time. 2) You are right that each blockInfo of the linked files needs to be updated when the original file is deleted. I shall update the design doc to explain this part in detail. 3) Currently, at least for V1, we shall support hardlinking only for closed files and won't support the append operation on linked files, but this could be extended in the future. 4) Very good point that hardlinked files shall respect the max replication factor. From my understanding, setReplication is just a memory footprint update, and the namenode will increase the actual replication in the background. HDFS hardlink - Key: HDFS-3370 URL: https://issues.apache.org/jira/browse/HDFS-3370 Project: Hadoop HDFS Issue Type: New Feature Reporter: Hairong Kuang Assignee: Liyin Tang Attachments: HDFS-HardLinks.pdf We'd like to add a new feature hardlink to HDFS that allows hardlinked files to share data without copying. Currently we will support hardlinking only closed files, but it could be extended to unclosed files as well. Among many potential use cases of the feature, the following two are primarily used in facebook: 1. This provides a lightweight way for applications like hbase to create a snapshot; 2. This also allows an application like Hive to move a table to a different directory without breaking current running hive queries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
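To make the common-ancestor rule in point 1 above concrete, here is a small self-contained Java sketch of the bookkeeping it implies. Paths are plain strings and all names are illustrative; this is the rule as stated in the comment, not HDFS code or the design-doc algorithm:
{code}
import java.util.*;

// Sketch of the ancestor rule: when a new hard link is created, ds quota is
// charged only to ancestor directories that are NOT already ancestors of an
// existing link to the same blocks (i.e. not common ancestors).
public class HardLinkQuotaSketch {
  static List<String> ancestors(String path) {
    List<String> result = new ArrayList<>();
    int idx = path.lastIndexOf('/');
    while (idx > 0) {
      path = path.substring(0, idx);
      result.add(path);
      idx = path.lastIndexOf('/');
    }
    result.add("/");
    return result;
  }

  // Directories to charge for a new link, given the existing links.
  static Set<String> dirsToCharge(String newLink, List<String> existingLinks) {
    Set<String> charge = new LinkedHashSet<>(ancestors(newLink));
    for (String link : existingLinks) {
      charge.removeAll(ancestors(link)); // common ancestors already paid
    }
    return charge;
  }

  public static void main(String[] args) {
    // ln /root/dir1/file1 /root/dir1/file2: all ancestors are shared,
    // so no directory absorbs additional ds quota -- prints []
    System.out.println(dirsToCharge("/root/dir1/file2",
        Arrays.asList("/root/dir1/file1")));
    // A link into a different directory: only the non-shared ancestor
    // pays -- prints [/root/dir2]
    System.out.println(dirsToCharge("/root/dir2/link",
        Arrays.asList("/root/dir1/file1")));
  }
}
{code}
The same set-difference view covers the deletion case: directories that remain common ancestors of the surviving links never see their usage change.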
[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270721#comment-13270721 ] Suresh Srinivas commented on HDFS-234: -- Eli, given that this is an option that Uma is trying out and 2.0-alpha is being released, I feel it is fine for HDFS-234 to be merged to 2.0. As regards contrib components, let's have a separate discussion on how BK-related code should be made available, if not via contrib. Integration with BookKeeper logging system -- Key: HDFS-234 URL: https://issues.apache.org/jira/browse/HDFS-234 Project: Hadoop HDFS Issue Type: New Feature Reporter: Luca Telloli Assignee: Ivan Kelly Fix For: 3.0.0 Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system for being the metadata repository of the entire file system for HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file
[ https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270725#comment-13270725 ] Tsz Wo (Nicholas), SZE commented on HDFS-3385: -- Brahma, thanks for filing the bug. Could you also check if h3385_20120508.patch works for your test case? ClassCastException when trying to append a file --- Key: HDFS-3385 URL: https://issues.apache.org/jira/browse/HDFS-3385 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 3.0.0 Environment: HDFS Reporter: Brahma Reddy Battula Priority: Minor Fix For: 3.0.0 Attachments: h3385_20120508.patch When I try to append a file I got {noformat} 2012-05-08 18:13:40,506 WARN util.KerberosName (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, setting default realm to empty Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425) ... at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150) at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189) at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177) at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221) at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981) at org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270729#comment-13270729 ] Uma Maheswara Rao G commented on HDFS-234: -- Eli, HDFS-2717 and HDFS-3058 are the main issues identified on the BKJM side through testing and discussions with Ivan. We have also been filing some in the BK project. {quote} given that this is an option that Uma is trying out and 2.0-alpha is being released, I feel it is fine for HDFS-234 to be merged to 2.0. {quote} Yes, Suresh, that is my opinion as well. Integration with BookKeeper logging system -- Key: HDFS-234 URL: https://issues.apache.org/jira/browse/HDFS-234 Project: Hadoop HDFS Issue Type: New Feature Reporter: Luca Telloli Assignee: Ivan Kelly Fix For: 3.0.0 Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system for being the metadata repository of the entire file system for HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3277) fail over to loading a different FSImage if the first one we try to load is corrupt
[ https://issues.apache.org/jira/browse/HDFS-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270732#comment-13270732 ] Hadoop QA commented on HDFS-3277: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526022/HDFS-3277.003.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSUpgradeFromImage org.apache.hadoop.hdfs.server.namenode.TestStartup +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2390//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2390//console This message is automatically generated. fail over to loading a different FSImage if the first one we try to load is corrupt --- Key: HDFS-3277 URL: https://issues.apache.org/jira/browse/HDFS-3277 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3277.002.patch, HDFS-3277.003.patch Most users store multiple copies of the FSImage in order to prevent catastrophic data loss if a hard disk fails. However, our image loading code is currently not set up to start reading another FSImage if loading the first one does not succeed. We should add this capability. We should also be sure to remove the FSImage directory that failed from the list of FSImage directories to write to, in the way we normally do when a write (as opposed to read) fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
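To sketch the failover behaviour the issue asks for, the loop below tries each image copy in turn, remembers directories whose copy is unreadable, and only gives up when every copy has failed. loadImage() is a trivial stand-in for the real image-loading code, and none of this is the attached patch:
{code}
import java.io.File;
import java.io.IOException;
import java.util.List;

// Hedged sketch of "fail over to the next FSImage copy" per the description.
public class ImageFailoverSketch {
  static void loadImage(File imageFile) throws IOException {
    if (!imageFile.canRead()) {
      throw new IOException("cannot read " + imageFile);
    }
    // ... parse and apply the image (the real, much larger, code path) ...
  }

  static File loadFirstGoodImage(List<File> imageFiles,
                                 List<File> badFiles) throws IOException {
    for (File f : imageFiles) {
      try {
        loadImage(f);
        return f;               // success: stop here
      } catch (IOException e) {
        badFiles.add(f);        // remember it, so we also stop writing to it
      }
    }
    throw new IOException("all " + imageFiles.size()
        + " FSImage copies failed to load");
  }
}
{code}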
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270769#comment-13270769 ] Hudson commented on HDFS-3157: -- Integrated in Hadoop-Hdfs-trunk-Commit #2284 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2284/]) HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. Contributed by Ashish Singhi. (Revision 1335719) Result = SUCCESS umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335719 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened - Key: HDFS-3157 URL: https://issues.apache.org/jira/browse/HDFS-3157 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: J.Andreina Assignee: Ashish Singhi Fix For: 0.24.0 Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch Cluster setup: 1NN, three DN (DN1, DN2, DN3), replication factor 2, dfs.blockreport.intervalMsec 300, dfs.datanode.directoryscan.interval 1 step 1: write one file a.txt with sync (not closed) step 2: Delete the blocks in one of the datanodes, say DN1 (from rbw), to which replication happened. step 3: close the file. Since the replication factor is 2, the blocks are replicated to the other datanode. Then at the NN side the following cmd is issued to the DN from which the block is deleted - {noformat} 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas. {noformat} From the datanode side on which the block was deleted, the following exception occurred {noformat} 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap. 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command java.io.IOException: Error in deleting blocks. at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662) at java.lang.Thread.run(Thread.java:619) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270774#comment-13270774 ] Uma Maheswara Rao G commented on HDFS-3157: --- I have committed this to trunk and branch-2. Thanks a lot Ashish for the contribution! Thanks Nicholas, for the review! Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened - Key: HDFS-3157 URL: https://issues.apache.org/jira/browse/HDFS-3157 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: J.Andreina Assignee: Ashish Singhi Fix For: 0.24.0 Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch Cluster setup: 1NN, three DN (DN1, DN2, DN3), replication factor 2, dfs.blockreport.intervalMsec 300, dfs.datanode.directoryscan.interval 1 step 1: write one file a.txt with sync (not closed) step 2: Delete the blocks in one of the datanodes, say DN1 (from rbw), to which replication happened. step 3: close the file. Since the replication factor is 2, the blocks are replicated to the other datanode. Then at the NN side the following cmd is issued to the DN from which the block is deleted - {noformat} 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas. {noformat} From the datanode side on which the block was deleted, the following exception occurred {noformat} 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap. 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command java.io.IOException: Error in deleting blocks. at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662) at java.lang.Thread.run(Thread.java:619) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException
Brandon Li created HDFS-3388: Summary: GetJournalEditServlet should catch more exceptions, not just IOException Key: HDFS-3388 URL: https://issues.apache.org/jira/browse/HDFS-3388 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li Assignee: Brandon Li GetJournalEditServlet has the same problem as that of GetImageServlet (HDFS-3330). It should be fixed in the same way. Also need to make CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
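For the shape of fix the description points at (mirroring the HDFS-3330 change to GetImageServlet), a minimal hedged sketch: catch Throwable rather than only IOException, so that runtime errors also become a proper HTTP error instead of escaping the servlet. The class name and messages below are illustrative only, not the actual patch:
{code}
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Illustrative sketch of "catch more exceptions, not just IOException".
public class GetEditsServletSketch extends HttpServlet {
  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    try {
      // ... validate the request, locate the edit file, stream it out ...
    } catch (Throwable t) {
      // RuntimeExceptions and Errors now also produce an error response.
      String errMsg = "getedit failed. " + t;
      resp.sendError(HttpServletResponse.SC_GONE, errMsg);
      throw new IOException(errMsg, t);
    }
  }
}
{code}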
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270772#comment-13270772 ] Hudson commented on HDFS-3157: -- Integrated in Hadoop-Common-trunk-Commit #2209 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2209/]) HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. Contributed by Ashish Singhi. (Revision 1335719) Result = SUCCESS umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335719 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened - Key: HDFS-3157 URL: https://issues.apache.org/jira/browse/HDFS-3157 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: J.Andreina Assignee: Ashish Singhi Fix For: 0.24.0 Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch Cluster setup: 1NN, three DN (DN1, DN2, DN3), replication factor 2, dfs.blockreport.intervalMsec 300, dfs.datanode.directoryscan.interval 1 step 1: write one file a.txt with sync (not closed) step 2: Delete the blocks in one of the datanodes, say DN1 (from rbw), to which replication happened. step 3: close the file. Since the replication factor is 2, the blocks are replicated to the other datanode. Then at the NN side the following cmd is issued to the DN from which the block is deleted - {noformat} 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas. {noformat} From the datanode side on which the block was deleted, the following exception occurred {noformat} 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap. 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command java.io.IOException: Error in deleting blocks. at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662) at java.lang.Thread.run(Thread.java:619) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3370) HDFS hardlink
[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270780#comment-13270780 ] Daryn Sharp commented on HDFS-3370: --- I'm glad you find my questions helpful! bq. For example, ln /root/dir1/file1 /root/dir1/file2 : there is no need to increase the ds quota usage when creating the link file: file2. Also rm /root/dir1/file1 : there is no need to decrease the ds quota usage when removing the original source file: file1. I agree that ds quota doesn't need to be changed when there are links in the same directory. I'm referring to the case of hardlinks across directories, i.e. /dir/dir2/file and /dir/dir3/hardlink. If dir2 and dir3 have separate ds quotas, then dir3 has to absorb the ds quota when the original file is removed from dir2. What if there is a /dir/dir4/hardlink2? Does dir3 or dir4 absorb the ds quota? What if neither has the necessary quota available? bq. Currently, at least for V1, we shall support the hardlinking only for the closed files and won't support to append operation against linked files, but it could be extended in the future. A reasonable approach, but it may lead to user confusion. It almost begs for an immutable flag (i.e. chattr +i/-i) to prevent inadvertent hard linking to files intended to be mutable. Nonetheless, I'd suggest exploring the difficulties of reconciling the current design of the namesystem/block management with your design. It may help avoid boxing ourselves into a corner with limited hard link support. bq. From my understanding, the setReplication is just a memory footprint update and the name node will increase actual replication in the background. Yes, but the FsShell setrep command actively monitors the files and does not exit until the replication factor is what the user requested -- as determined by the number of hosts per block. Another consideration is that ds quota is based on a multiple of the replication factor, so who is allowed to change the replication factor, since increasing it may impact a different user's quota? HDFS hardlink - Key: HDFS-3370 URL: https://issues.apache.org/jira/browse/HDFS-3370 Project: Hadoop HDFS Issue Type: New Feature Reporter: Hairong Kuang Assignee: Liyin Tang Attachments: HDFS-HardLinks.pdf We'd like to add a new feature hardlink to HDFS that allows hardlinked files to share data without copying. Currently we will support hardlinking only closed files, but it could be extended to unclosed files as well. Among many potential use cases of the feature, the following two are primarily used in facebook: 1. This provides a lightweight way for applications like hbase to create a snapshot; 2. This also allows an application like Hive to move a table to a different directory without breaking current running hive queries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3389) Document the BKJM usage in Namenode HA.
[ https://issues.apache.org/jira/browse/HDFS-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G reassigned HDFS-3389: - Assignee: Uma Maheswara Rao G Document the BKJM usage in Namenode HA. --- Key: HDFS-3389 URL: https://issues.apache.org/jira/browse/HDFS-3389 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 2.0.0, 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G As per the discussion in HDFS-234, we need clear documentation for BKJM usage in Namenode HA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3389) Document the BKJM usage in Namenode HA.
Uma Maheswara Rao G created HDFS-3389: - Summary: Document the BKJM usage in Namenode HA. Key: HDFS-3389 URL: https://issues.apache.org/jira/browse/HDFS-3389 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 2.0.0, 3.0.0 Reporter: Uma Maheswara Rao G As per the discussion in HDFS-234, we need clear documentation for BKJM usage in Namenode HA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270781#comment-13270781 ] Hadoop QA commented on HDFS-3157: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526038/HDFS-3157.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2391//console This message is automatically generated. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened - Key: HDFS-3157 URL: https://issues.apache.org/jira/browse/HDFS-3157 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: J.Andreina Assignee: Ashish Singhi Fix For: 0.24.0 Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch Cluster setup: 1NN, three DN (DN1, DN2, DN3), replication factor 2, dfs.blockreport.intervalMsec 300, dfs.datanode.directoryscan.interval 1 step 1: write one file a.txt with sync (not closed) step 2: Delete the blocks in one of the datanodes, say DN1 (from rbw), to which replication happened. step 3: close the file. Since the replication factor is 2, the blocks are replicated to the other datanode. Then at the NN side the following cmd is issued to the DN from which the block is deleted - {noformat} 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas. {noformat} From the datanode side on which the block was deleted, the following exception occurred {noformat} 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap. 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command java.io.IOException: Error in deleting blocks. at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662) at java.lang.Thread.run(Thread.java:619) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270782#comment-13270782 ] Hudson commented on HDFS-3157: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2226 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2226/]) HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. Contributed by Ashish Singhi. (Revision 1335719) Result = SUCCESS umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335719 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened - Key: HDFS-3157 URL: https://issues.apache.org/jira/browse/HDFS-3157 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: J.Andreina Assignee: Ashish Singhi Fix For: 0.24.0 Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch Cluster setup: 1NN, three DN (DN1, DN2, DN3), replication factor 2, dfs.blockreport.intervalMsec 300, dfs.datanode.directoryscan.interval 1 step 1: write one file a.txt with sync (not closed) step 2: Delete the blocks in one of the datanodes, say DN1 (from rbw), to which replication happened. step 3: close the file. Since the replication factor is 2, the blocks are replicated to the other datanode. Then at the NN side the following cmd is issued to the DN from which the block is deleted - {noformat} 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas. {noformat} From the datanode side on which the block was deleted, the following exception occurred {noformat} 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap. 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command java.io.IOException: Error in deleting blocks. at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662) at java.lang.Thread.run(Thread.java:619) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270790#comment-13270790 ] Uma Maheswara Rao G commented on HDFS-234: -- Others, do you agree with Suresh? Integration with BookKeeper logging system -- Key: HDFS-234 URL: https://issues.apache.org/jira/browse/HDFS-234 Project: Hadoop HDFS Issue Type: New Feature Reporter: Luca Telloli Assignee: Ivan Kelly Fix For: 3.0.0 Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system for being the metadata repository of the entire file system for HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3387) [Fsshell]It's better to provide hdfs instead of hadoop in GenericOptionsParser
[ https://issues.apache.org/jira/browse/HDFS-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270797#comment-13270797 ] Daryn Sharp commented on HDFS-3387: --- Are you sure hadoop fs is deprecated? It doesn't seem to make sense to run hdfs fs on a non-hdfs filesystem. [Fsshell]It's better to provide hdfs instead of hadoop in GenericOptionsParser -- Key: HDFS-3387 URL: https://issues.apache.org/jira/browse/HDFS-3387 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Brahma Reddy Battula Priority: Trivial Fix For: 2.0.0, 3.0.0 Attachments: HDFS-3387.patch Scenario: -- Execute any fsshell command with invalid options, like ./hdfs haadmin -transitionToActive... It logs the following: bin/hadoop command [genericOptions] [commandOptions]... Expected: The help message is misleading, since it tells the user to run bin/hadoop, which is not what the user actually ran; it would be better to log bin/hdfs. Anyway, hadoop is deprecated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-3388: - Attachment: HDFS-3388.HDFS-3092.patch GetJournalEditServlet should catch more exceptions, not just IOException Key: HDFS-3388 URL: https://issues.apache.org/jira/browse/HDFS-3388 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-3388.HDFS-3092.patch GetJournalEditServlet has the same problem as that of GetImageServlet (HDFS-3330). It should be fixed in the same way. Also need to make CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3370) HDFS hardlink
[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270805#comment-13270805 ] Liyin Tang commented on HDFS-3370: -- bq. I agree that ds quota doesn't need to be changed when there are links in the same directory. I'm referring to the case of hardlinks across directories. Ie. /dir/dir2/file and /dir/dir3/hardlink. If dir2 and dir3 have separate ds quotas, then dir3 has to absorb the ds quota when the original file is removed from dir2. What if there is a /dir/dir4/hardlink2? Does dir3 or dir4 absorb the ds quota? What if neither has the necessary quota available? Based on the same example you mentioned, when linking /dir/dir2/file and /dir/dir3/hardlink, it will increase the dsquota for dir3 but not /dir, because dir3 is NOT a common ancestor but dir is. If dir3 doesn't have enough dsquota, it shall throw a quota exception. Also, if there is a /dir/dir4/hardlink2, it absorbs the dsquota for dir4 as well. So the point is that dsquota is only absorbed at link creation time and decreased at link deletion time. From my understanding, the basic semantics of HardLink is to allow users to create multiple logical files referencing the same set of blocks/bytes on disk. Users can thus set different file-level attributes for each linked file, such as owner, permission, and modification time. Since these linked files share the same set of blocks, block-level settings shall be shared. It may be a little confusing to decide whether the replication factor in HDFS is a file-level or a block-level attribute. If we agree that the replication factor is a block-level attribute, then we shall pay the overhead (wait time) when increasing the replication factor, just as when increasing the replication factor of a regular file, and the setReplication operation is supposed to fail if it breaks the dsquota. HDFS hardlink - Key: HDFS-3370 URL: https://issues.apache.org/jira/browse/HDFS-3370 Project: Hadoop HDFS Issue Type: New Feature Reporter: Hairong Kuang Assignee: Liyin Tang Attachments: HDFS-HardLinks.pdf We'd like to add a new feature hardlink to HDFS that allows hardlinked files to share data without copying. Currently we will support hardlinking only closed files, but it could be extended to unclosed files as well. Among many potential use cases of the feature, the following two are primarily used in facebook: 1. This provides a lightweight way for applications like hbase to create a snapshot; 2. This also allows an application like Hive to move a table to a different directory without breaking current running hive queries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270816#comment-13270816 ] Todd Lipcon commented on HDFS-3092: --- A few of us are meeting up today at 3:30pm PST to discuss this JIRA. If you'd like to join by phone, please ping me by email. Sorry for the late notice; we just planned this impromptu meeting. Enable journal protocol based editlog streaming for standby namenode Key: HDFS-3092 URL: https://issues.apache.org/jira/browse/HDFS-3092 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: ComparisonofApproachesforHAJournals.pdf, JNStates.png, MultipleSharedJournals.pdf, MultipleSharedJournals.pdf, MultipleSharedJournals.pdf Currently standby namenode relies on reading shared editlogs to stay current with the active namenode, for namespace changes. BackupNode used streaming edits from active namenode for doing the same. This jira is to explore using journal protocol based editlog streams for the standby namenode. A daemon in standby will get the editlogs from the active and write it to local edits. To begin with, the existing standby mechanism of reading from a file, will continue to be used, instead of from shared edits, from the local edits. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270832#comment-13270832 ] Todd Lipcon commented on HDFS-3388: --- Good idea. Maybe we should also rename CheckpointFaultInjector to FileTransferFaultInjector or something? Or split it in two, since if I recall correctly some of the fault points are checkpoint-specific whereas others are just for the GetImageServlet. GetJournalEditServlet should catch more exceptions, not just IOException Key: HDFS-3388 URL: https://issues.apache.org/jira/browse/HDFS-3388 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-3388.HDFS-3092.patch GetJournalEditServlet has the same problem as that of GetImageServlet (HDFS-3330). It should be fixed in the same way. Also need to make CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3390) DFSAdmin should print full stack traces of errors when DEBUG logging is enabled
Aaron T. Myers created HDFS-3390: Summary: DFSAdmin should print full stack traces of errors when DEBUG logging is enabled Key: HDFS-3390 URL: https://issues.apache.org/jira/browse/HDFS-3390 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Priority: Minor If an error is encountered when running an `hdfs dfsadmin ...' command, only the exception's message is output. It would be handy for debugging if the full stack trace of the exception were output when DEBUG logging is enabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2236) DfsServlet and FileChecksumServlets still use the filename parameter
[ https://issues.apache.org/jira/browse/HDFS-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-2236: -- Affects Version/s: (was: 0.23.0) 0.22.0 This was fixed for 23/2.0 by HDFS-2235. DfsServlet and FileChecksumServlets still use the filename parameter - Key: HDFS-2236 URL: https://issues.apache.org/jira/browse/HDFS-2236 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Eli Collins Assignee: lohit giri Priority: Minor Attachments: HDFS-2236.0.22.patch DfsServlet and FileChecksumServlets still use the filename parameter; they weren't modified in HDFS-1109 to pass the file name as part of the URI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270853#comment-13270853 ] Brandon Li commented on HDFS-3388: -- Renaming it sounds better to me. How about calling it FaultInjector so it can also be used for error simulation in other areas inside HDFS? GetJournalEditServlet should catch more exceptions, not just IOException Key: HDFS-3388 URL: https://issues.apache.org/jira/browse/HDFS-3388 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-3388.HDFS-3092.patch GetJournalEditServlet has the same problem as that of GetImageServlet (HDFS-3330). It should be fixed in the same way. Also need to make CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3390) DFSAdmin should print full stack traces of errors when DEBUG logging is enabled
[ https://issues.apache.org/jira/browse/HDFS-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3390: - Status: Patch Available (was: Open) DFSAdmin should print full stack traces of errors when DEBUG logging is enabled --- Key: HDFS-3390 URL: https://issues.apache.org/jira/browse/HDFS-3390 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Priority: Minor Attachments: HDFS-3390.patch If an error is encountered when running an `hdfs dfsadmin ...' command, only the exception's message is output. It would be handy for debugging if the full stack trace of the exception were output when DEBUG logging is enabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3390) DFSAdmin should print full stack traces of errors when DEBUG logging is enabled
[ https://issues.apache.org/jira/browse/HDFS-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3390: - Attachment: HDFS-3390.patch Here's a patch which addresses the issue. It catches all exceptions from DFSAdmin commands, logs the full stack trace of the exception if LOG.isDebugEnabled, and then re-throws the exception in all cases. I tested this manually by setting HADOOP_ROOT_LOGGER=DEBUG,console and running a few invalid commands, e.g. with illegal arguments or non-HDFS URIs. DFSAdmin should print full stack traces of errors when DEBUG logging is enabled --- Key: HDFS-3390 URL: https://issues.apache.org/jira/browse/HDFS-3390 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Priority: Minor Attachments: HDFS-3390.patch If an error is encountered when running an `hdfs dfsadmin ...' command, only the exception's message is output. It would be handy for debugging if the full stack trace of the exception were output when DEBUG logging is enabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
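The pattern described above is simple enough to show as a standalone hedged sketch (commons-logging, as Hadoop used at the time); this is illustrative only, not the attached HDFS-3390.patch:
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Sketch of "log the full trace only at DEBUG, then re-throw": callers keep
// printing the one-line message, while DEBUG runs also get the stack trace.
public class DebugRethrowSketch {
  private static final Log LOG = LogFactory.getLog(DebugRethrowSketch.class);

  static void runCommand(Runnable cmd) throws Exception {
    try {
      cmd.run();
    } catch (Exception e) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Full exception trace for failed command:", e);
      }
      throw e; // behaviour is unchanged when DEBUG is off
    }
  }
}
{code}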
[jira] [Commented] (HDFS-234) Integration with BookKeeper logging system
[ https://issues.apache.org/jira/browse/HDFS-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270874#comment-13270874 ] Nathan Roberts commented on HDFS-234: - I agree it would be good to have it in 2.0-alpha for folks to experiment with and compare with other alternatives. Integration with BookKeeper logging system -- Key: HDFS-234 URL: https://issues.apache.org/jira/browse/HDFS-234 Project: Hadoop HDFS Issue Type: New Feature Reporter: Luca Telloli Assignee: Ivan Kelly Fix For: 3.0.0 Attachments: HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-trunk-preview.patch, HADOOP-5189-v.19.patch, HADOOP-5189.patch, HDFS-234-branch-2.patch, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.diff, HDFS-234.patch, create.png, hdfs_tpt_lat.pdf, zookeeper-dev-bookkeeper.jar, zookeeper-dev.jar BookKeeper is a system to reliably log streams of records (https://issues.apache.org/jira/browse/ZOOKEEPER-276). The NameNode is a natural target for such a system for being the metadata repository of the entire file system for HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3363) blockmanagement should stop using INodeFile & INodeFileUC
[ https://issues.apache.org/jira/browse/HDFS-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270880#comment-13270880 ] Hudson commented on HDFS-3363: -- Integrated in Hadoop-Hdfs-trunk-Commit #2285 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2285/]) Remove the empty file FSInodeInfo.java for HDFS-3363. (Revision 1335788) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335788 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSInodeInfo.java blockmanagement should stop using INodeFile & INodeFileUC -- Key: HDFS-3363 URL: https://issues.apache.org/jira/browse/HDFS-3363 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 2.0.0, 3.0.0 Reporter: John George Assignee: John George Priority: Minor Fix For: 2.0.0 Attachments: HDFS-3363.java, HDFS-3363.java, HDFS-3363.java, HDFS-3363.patch Blockmanagement should stop using INodeFile and INodeFileUnderConstruction. One way would be to create an interface, like BlockCollection, that is passed along to the blockmanagement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3363) blockmanagement should stop using INodeFile & INodeFileUC
[ https://issues.apache.org/jira/browse/HDFS-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270882#comment-13270882 ] Hudson commented on HDFS-3363: -- Integrated in Hadoop-Common-trunk-Commit #2210 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2210/]) Remove the empty file FSInodeInfo.java for HDFS-3363. (Revision 1335788) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335788 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSInodeInfo.java blockmanagement should stop using INodeFile & INodeFileUC -- Key: HDFS-3363 URL: https://issues.apache.org/jira/browse/HDFS-3363 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 2.0.0, 3.0.0 Reporter: John George Assignee: John George Priority: Minor Fix For: 2.0.0 Attachments: HDFS-3363.java, HDFS-3363.java, HDFS-3363.java, HDFS-3363.patch Blockmanagement should stop using INodeFile and INodeFileUnderConstruction. One way would be to create an interface, like BlockCollection, that is passed along to the blockmanagement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3335) check for edit log corruption at the end of the log
[ https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3335: --- Attachment: HDFS-3335.006.patch * rename verifyEndOfLog to verifyTerminator * add GarbageAfterTerminatorException. This exception stores the number of padding bytes we successfully read. This allows us to skip them after a resync(). This makes recovery work faster, which is important for unit tests. check for edit log corruption at the end of the log --- Key: HDFS-3335 URL: https://issues.apache.org/jira/browse/HDFS-3335 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, HDFS-3335-b1.003.patch, HDFS-3335.001.patch, HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, HDFS-3335.005.patch, HDFS-3335.006.patch Even after encountering an OP_INVALID, we should check the end of the edit log to make sure that it contains no more edits. This will catch things like rare race conditions or log corruptions that would otherwise remain undetected. They will go from being silent data loss scenarios to being cases that we can detect and fix. Using recovery mode, we can choose to ignore the end of the log if necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
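Based on that description, a hedged sketch of what such an exception type might look like; the field and accessor names are guesses from the comment above, not necessarily what HDFS-3335.006.patch contains:
{code}
import java.io.IOException;

// Sketch of an exception that records how many padding bytes were read past
// the log terminator, so a resync() can skip them instead of rescanning.
public class GarbageAfterTerminatorException extends IOException {
  private final int numAfterTerminator;

  public GarbageAfterTerminatorException(String msg, int numAfterTerminator) {
    super(msg);
    this.numAfterTerminator = numAfterTerminator;
  }

  /** Padding bytes successfully read past the terminator. */
  public int getNumAfterTerminator() {
    return numAfterTerminator;
  }
}
{code}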
[jira] [Commented] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270955#comment-13270955 ] Tsz Wo (Nicholas), SZE commented on HDFS-3388: -- The correct way to fix the bug is to not close the output stream when an error is sent. Below is a suggestion.
{code}
@@ -140,8 +141,12 @@
     DataTransferThrottler throttler = GetImageServlet.getThrottler(conf);

     // send edits
-    TransferFsImage.getFileServer(response.getOutputStream(),
-        editFile, throttler);
+    OutputStream out = response.getOutputStream();
+    try {
+      TransferFsImage.getFileServer(out, editFile, throttler);
+    } finally {
+      out.close();
+    }
   } else {
     response
         .sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED,
@@ -155,8 +160,6 @@
     String errMsg = "getedit failed. " + StringUtils.stringifyException(ie);
     response.sendError(HttpServletResponse.SC_GONE, errMsg);
     throw new IOException(errMsg);
-  } finally {
-    response.getOutputStream().close();
   }
 }
{code}
GetJournalEditServlet should catch more exceptions, not just IOException Key: HDFS-3388 URL: https://issues.apache.org/jira/browse/HDFS-3388 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-3388.HDFS-3092.patch GetJournalEditServlet has the same problem as that of GetImageServlet (HDFS-3330). It should be fixed in the same way. Also need to make CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13270995#comment-13270995 ] Todd Lipcon commented on HDFS-3388: --- Hey Brandon. I personally prefer to keep the fault injector classes local to each individual component being fault-tested. Otherwise it might grow unmanageable over time. They could even become package-private inner classes of each individual class that uses this fault injection technique. I think that's cleanest if doable? GetJournalEditServlet should catch more exceptions, not just IOException Key: HDFS-3388 URL: https://issues.apache.org/jira/browse/HDFS-3388 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-3388.HDFS-3092.patch GetJournalEditServlet has the same problem as that of GetImageServlet (HDFS-3330). It should be fixed in the same way. Also need to make CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
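For illustration, the component-local, package-private fault injector Todd describes might look roughly like the sketch below; every name here is hypothetical, not from the actual HDFS-3388 patch:
{code}
import java.io.IOException;

// Hypothetical sketch of a fault injector kept local to the component under
// test; none of these names come from the actual patch.
class JournalEditServer {

  /** Package-private inner class: only tests in this package can swap it. */
  static class FaultInjector {
    static FaultInjector instance = new FaultInjector();

    void beforeSendingEdits() throws IOException {
      // no-op in production; a test subclass overrides this to throw
    }
  }

  void serveEdits() throws IOException {
    FaultInjector.instance.beforeSendingEdits(); // test hook
    // ... normal logic: stream the edit file to the caller ...
  }
}
{code}
A test in the same package would then install a throwing injector (e.g. an anonymous FaultInjector subclass whose beforeSendingEdits() throws an IOException) and assert that the servlet's error handling behaves as expected.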
[jira] [Updated] (HDFS-744) Support hsync in HDFS
[ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HDFS-744: --- Attachment: HDFS-744-trunk.patch Here's a patch against trunk. The principle is the same: CreateFlag now also has a FORCE flag. It compiles, and I did test it with a simple client. I have not managed to get HBase running with Hadoop trunk yet. Support hsync in HDFS - Key: HDFS-744 URL: https://issues.apache.org/jira/browse/HDFS-744 Project: Hadoop HDFS Issue Type: New Feature Reporter: Hairong Kuang Attachments: HDFS-744-trunk.patch, hdfs-744-v2.txt, hdfs-744-v3.txt, hdfs-744.txt HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, the real expected semantics should be "flushes out to all replicas and all replicas have done posix fsync equivalent - ie the OS has flushed it to the disk device (but the disk may have it in its cache)". This jira aims to implement the expected behaviour. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
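As a rough sketch of how a client might exercise the new flag (CreateFlag.FORCE is from the patch; the surrounding create/hsync calls are the standard FileSystem API, and the path, sizes, and class name are arbitrary):
{code}
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Sketch of a simple test client; assumes the patch's CreateFlag.FORCE.
public class HsyncTestClient {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out = fs.create(new Path("/tmp/hsync-test"),
        FsPermission.getDefault(),
        EnumSet.of(CreateFlag.CREATE, CreateFlag.FORCE), // FORCE from the patch
        4096, (short) 3, 64L * 1024 * 1024, null);
    out.write("bytes that should survive a datanode power failure".getBytes());
    out.hsync(); // with FORCE, replicas should fsync to disk, not just flush
    out.close();
  }
}
{code}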
[jira] [Updated] (HDFS-3390) DFSAdmin should print full stack traces of errors when DEBUG logging is enabled
[ https://issues.apache.org/jira/browse/HDFS-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3390: - Attachment: HDFS-3390.patch Thanks a lot for the review, Nicholas. The reason I did it the way I did in the original patch is that the DFSAdmin code currently has multiple separate catch blocks, and thus some code would need to be repeated in each one. In the attached patch, I've assigned the exception to a variable and only printed it out in one place at the end. I'm not sure I agree that this is much clearer, but I don't feel strongly about it. If you think this is better, then that's fine by me. I strongly suspect that the test failures are unrelated to this patch. DFSAdmin should print full stack traces of errors when DEBUG logging is enabled --- Key: HDFS-3390 URL: https://issues.apache.org/jira/browse/HDFS-3390 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Priority: Minor Attachments: HDFS-3390.patch, HDFS-3390.patch If an error is encountered when running an `hdfs dfsadmin ...' command, only the exception's message is output. It would be handy for debugging if the full stack trace of the exception were output when DEBUG logging is enabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
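The structure described above, with the exception captured in each catch block and the stack trace printed once at the end, might be sketched roughly as follows; the class and variable names are illustrative, not the actual patch:
{code}
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative sketch of the approach described above; not the actual patch.
public class DebugTraceSketch {
  private static final Log LOG = LogFactory.getLog(DebugTraceSketch.class);

  static int runCommand(String[] argv) throws IOException {
    return 0; // stand-in for the real DFSAdmin command dispatch
  }

  public static void main(String[] argv) {
    Exception debugException = null; // assigned in every catch block
    int exitCode = -1;
    try {
      exitCode = runCommand(argv);
    } catch (IllegalArgumentException arge) {
      debugException = arge;
      System.err.println("dfsadmin: " + arge.getLocalizedMessage());
    } catch (IOException ioe) {
      debugException = ioe;
      System.err.println("dfsadmin: " + ioe.getLocalizedMessage());
    }
    // One place at the end: the full stack trace, only when DEBUG is enabled.
    if (LOG.isDebugEnabled() && debugException != null) {
      LOG.debug("Exception encountered:", debugException);
    }
    System.exit(exitCode);
  }
}
{code}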
[jira] [Commented] (HDFS-3390) DFSAdmin should print full stack traces of errors when DEBUG logging is enabled
[ https://issues.apache.org/jira/browse/HDFS-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271048#comment-13271048 ] Hadoop QA commented on HDFS-3390: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526090/HDFS-3390.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2394//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2394//console This message is automatically generated. DFSAdmin should print full stack traces of errors when DEBUG logging is enabled --- Key: HDFS-3390 URL: https://issues.apache.org/jira/browse/HDFS-3390 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Priority: Minor Attachments: HDFS-3390.patch, HDFS-3390.patch If an error is encountered when running an `hdfs dfsadmin ...' command, only the exception's message is output. It would be handy for debugging if the full stack trace of the exception were output when DEBUG logging is enabled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271060#comment-13271060 ] Todd Lipcon commented on HDFS-3391: --- Are you seeing these fail regularly? I haven't seen either fail in recent months. Failing tests in branch-2 - Key: HDFS-3391 URL: https://issues.apache.org/jira/browse/HDFS-3391 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0 Reporter: Arun C Murthy Priority: Critical Fix For: 2.0.0 Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec FAILURE! -- Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec FAILURE! -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3384) DataStreamer thread should be closed immediately when it fails to set up a PipelineForAppendOrRecovery
[ https://issues.apache.org/jira/browse/HDFS-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3384: -- Component/s: hdfs client Priority: Major (was: Minor) Target Version/s: 2.0.0 Affects Version/s: 2.0.0 DataStreamer thread should be closed immediately when it fails to set up a PipelineForAppendOrRecovery -- Key: HDFS-3384 URL: https://issues.apache.org/jira/browse/HDFS-3384 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Brahma Reddy Battula Scenario:
=========
write a file
corrupt block manually
call append..
{noformat}
2012-04-19 09:33:10,776 INFO hdfs.DFSClient (DFSOutputStream.java:createBlockOutputStream(1059)) - Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
	at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1039)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:939)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
2012-04-19 09:33:10,807 WARN hdfs.DFSClient (DFSOutputStream.java:run(549)) - DataStreamer Exception
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:510)
2012-04-19 09:33:10,807 WARN hdfs.DFSClient (DFSOutputStream.java:hflush(1511)) - Error while syncing
java.io.IOException: All datanodes 10.18.40.20:50010 are bad. Aborting...
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:908)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
java.io.IOException: All datanodes 10.18.40.20:50010 are bad. Aborting...
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:908)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271075#comment-13271075 ] Eli Collins commented on HDFS-3391: --- I think this is a stale jar in mvn; re-running after mvn clean works, and the method in question does exist. Failing tests in branch-2 - Key: HDFS-3391 URL: https://issues.apache.org/jira/browse/HDFS-3391 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0 Reporter: Arun C Murthy Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Fix For: 2.0.0 Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec FAILURE! -- Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec FAILURE! -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved HDFS-3391. --- Resolution: Not A Problem Assignee: (was: Tsz Wo (Nicholas), SZE) After mvn clean both tests pass for me, and this error looks like a stale jar. Closing. Failing tests in branch-2 - Key: HDFS-3391 URL: https://issues.apache.org/jira/browse/HDFS-3391 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0 Reporter: Arun C Murthy Priority: Critical Fix For: 2.0.0 Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec FAILURE! -- Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec FAILURE! -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3384) DataStreamer thread should be closed immediately when it fails to set up a PipelineForAppendOrRecovery
[ https://issues.apache.org/jira/browse/HDFS-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G reassigned HDFS-3384: - Assignee: Uma Maheswara Rao G DataStreamer thread should be closed immediately when it fails to set up a PipelineForAppendOrRecovery -- Key: HDFS-3384 URL: https://issues.apache.org/jira/browse/HDFS-3384 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0 Reporter: Brahma Reddy Battula Assignee: Uma Maheswara Rao G Scenario:
=========
write a file
corrupt block manually
call append..
{noformat}
2012-04-19 09:33:10,776 INFO hdfs.DFSClient (DFSOutputStream.java:createBlockOutputStream(1059)) - Exception in createBlockOutputStream
java.io.EOFException: Premature EOF: no length prefix available
	at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1039)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:939)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
2012-04-19 09:33:10,807 WARN hdfs.DFSClient (DFSOutputStream.java:run(549)) - DataStreamer Exception
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:510)
2012-04-19 09:33:10,807 WARN hdfs.DFSClient (DFSOutputStream.java:hflush(1511)) - Error while syncing
java.io.IOException: All datanodes 10.18.40.20:50010 are bad. Aborting...
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:908)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
java.io.IOException: All datanodes 10.18.40.20:50010 are bad. Aborting...
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:908)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy reopened HDFS-3391: - TestRBWBlockInvalidation fails consistently with:
{noformat}
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.827 sec FAILURE!
testBlockInvalidationWhenRBWReplicaMissedInDN(org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation)  Time elapsed: 6.725 sec  FAILURE!
java.lang.AssertionError: There should be three live replicas expected:<3> but was:<2>
	at org.junit.Assert.fail(Assert.java:91)
{noformat}
TestPipelinesFailover fails sporadically. Failing tests in branch-2 - Key: HDFS-3391 URL: https://issues.apache.org/jira/browse/HDFS-3391 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0 Reporter: Arun C Murthy Priority: Critical Fix For: 2.0.0 Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec FAILURE! -- Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec FAILURE! -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271080#comment-13271080 ] Eli Collins commented on HDFS-3391: --- testBlockInvalidationWhenRBWReplicaMissedInDN passes consistently for me in a fresh branch-2 tree. Does your tree have HDFS-3157? I looped TestPipelinesFailover and sometimes it fails with:
{noformat}
testLeaseRecoveryAfterFailover(org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover)  Time elapsed: 18.19 sec  ERROR!
java.io.IOException: p=/test-file, length=6144, i=4096
	at org.apache.hadoop.hdfs.AppendTestUtil.check(AppendTestUtil.java:135)
	at org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.testLeaseRecoveryAfterFailover(TestPipelinesFailover.java:250)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1782491480-192.168.1.113-1336537917180:blk_-3060921670577866018_1006 file=/test-file
	at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:644)
	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:437)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:577)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:500)
	at java.io.FilterInputStream.read(FilterInputStream.java:66)
	at org.apache.hadoop.hdfs.AppendTestUtil.check(AppendTestUtil.java:129)
	... 10 more
{noformat}
Failing tests in branch-2 - Key: HDFS-3391 URL: https://issues.apache.org/jira/browse/HDFS-3391 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0 Reporter: Arun C Murthy Priority: Critical Fix For: 2.0.0 Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec FAILURE! -- Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec FAILURE! -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3157) Error in deleting block keeps coming from DN even after the block report and directory scanning have happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-3157: -- Resolution: Fixed Fix Version/s: (was: 0.24.0) 3.0.0 2.0.0 Target Version/s: 2.0.0, 3.0.0 (was: 0.24.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Error in deleting block keeps coming from DN even after the block report and directory scanning have happened - Key: HDFS-3157 URL: https://issues.apache.org/jira/browse/HDFS-3157 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: J.Andreina Assignee: Ashish Singhi Fix For: 2.0.0, 3.0.0 Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch Cluster setup: 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, dfs.blockreport.intervalMsec 300, dfs.datanode.directoryscan.interval 1
step 1: write one file a.txt with sync (not closed)
step 2: delete the blocks in one of the datanodes, say DN1 (from rbw), to which replication happened.
step 3: close the file.
Since the replication factor is 2, the blocks are replicated to the other datanode. Then at the NN side the following cmd is issued to the DN from which the block was deleted -
{noformat}
2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
{noformat}
On the datanode side from which the block was deleted, the following exception occurred:
{noformat}
2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
java.io.IOException: Error in deleting blocks.
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
	at java.lang.Thread.run(Thread.java:619)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271087#comment-13271087 ] Arun C Murthy commented on HDFS-3391: - bq. testBlockInvalidationWhenRBWReplicaMissedInDN passes consistently for me in a fresh branch-2 tree. Does your tree have HDFS-3157? Yep, I do have HDFS-3157. Can you pls check on branch-2.0.0-alpha? Tx. Failing tests in branch-2 - Key: HDFS-3391 URL: https://issues.apache.org/jira/browse/HDFS-3391 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0 Reporter: Arun C Murthy Priority: Critical Fix For: 2.0.0 Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec FAILURE! -- Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec FAILURE! -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271098#comment-13271098 ] Eli Collins commented on HDFS-3092: --- Notes from the meeting:
* Discussed the two approaches for the client side (HDFS-3077 and HDFS-3092); the server sides are similar modulo small differences in recovery. Discussed the tradeoffs in terms of going down to n-1 servers/requiring a quorum vs latency sensitivity, in the context of typical cluster configurations. Think it's possible to wed the client side of 3077 (quorum journal) to the server side of 3092 (journal daemon). Will pursue further on jira.
* Discussed journal-based fencing. Current NN fencers are not needed when the journal handles fencing.
* Discussed IP based failover and stonith; the primary motivation for the current approach is that other master services that are not yet HA often run on the same machines, and stonith doesn't work there.
* Discussed making the state machine for auto failover more explicit.
* Discussed separate vs embedded FCs. Separate works well for now, though it currently means we'll need a FC per service that gets failover (vs embedding a FC in each service that will need failover).
Enable journal protocol based editlog streaming for standby namenode Key: HDFS-3092 URL: https://issues.apache.org/jira/browse/HDFS-3092 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: ComparisonofApproachesforHAJournals.pdf, JNStates.png, MultipleSharedJournals.pdf, MultipleSharedJournals.pdf, MultipleSharedJournals.pdf, Removingfilerdependency.pdf Currently standby namenode relies on reading shared editlogs to stay current with the active namenode, for namespace changes. BackupNode used streaming edits from active namenode for doing the same. This jira is to explore using journal protocol based editlog streams for the standby namenode. A daemon in standby will get the editlogs from the active and write it to local edits. To begin with, the existing standby mechanism of reading from a file, will continue to be used, instead of from shared edits, from the local edits. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3092) Enable journal protocol based editlog streaming for standby namenode
[ https://issues.apache.org/jira/browse/HDFS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3092: -- Attachment: Removingfilerdependency.pdf Attaching some thoughts/framework for comparing approaches I wrote last month. Enable journal protocol based editlog streaming for standby namenode Key: HDFS-3092 URL: https://issues.apache.org/jira/browse/HDFS-3092 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: ComparisonofApproachesforHAJournals.pdf, JNStates.png, MultipleSharedJournals.pdf, MultipleSharedJournals.pdf, MultipleSharedJournals.pdf, Removingfilerdependency.pdf Currently standby namenode relies on reading shared editlogs to stay current with the active namenode, for namespace changes. BackupNode used streaming edits from active namenode for doing the same. This jira is to explore using journal protocol based editlog streams for the standby namenode. A daemon in standby will get the editlogs from the active and write it to local edits. To begin with, the existing standby mechanism of reading from a file, will continue to be used, instead of from shared edits, from the local edits. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13271117#comment-13271117 ] Eli Collins commented on HDFS-3391: --- Passes for me consistently in a fresh branch-2.0.0-alpha tree. Trying TestPipelinesFailover now. Failing tests in branch-2 - Key: HDFS-3391 URL: https://issues.apache.org/jira/browse/HDFS-3391 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0 Reporter: Arun C Murthy Priority: Critical Fix For: 2.0.0 Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec FAILURE! -- Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec FAILURE! -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira