[jira] [Updated] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-31 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3667:
-

Attachment: h3667_20120730_b-1.patch

h3667_20120730_b-1.patch: for branch-1.

 Add retry support to WebHdfsFileSystem
 --

 Key: HDFS-3667
 URL: https://issues.apache.org/jira/browse/HDFS-3667
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3667_20120718.patch, h3667_20120721.patch, 
 h3667_20120722.patch, h3667_20120725.patch, h3667_20120730.patch, 
 h3667_20120730_b-1.patch


 DFSClient (i.e. DistributedFileSystem) has a configurable retry policy, and it 
 retries on exceptions such as connection failures and safemode.  
 WebHdfsFileSystem should have similar retry support.
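
To illustrate the kind of retry behavior being requested, here is a minimal, self-contained sketch: retry an operation a bounded number of times with a fixed sleep between attempts, the way DFSClient does for connection failures and safemode. All names here are invented for illustration; the actual patch presumably builds on Hadoop's io.retry policy framework rather than hand-rolling a loop.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    // Retry op up to maxRetries times, sleeping sleepMs between attempts.
    static <T> T callWithRetries(Callable<T> op, int maxRetries, long sleepMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {  // e.g. a connection failure or safemode error
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(sleepMs);
                }
            }
        }
        throw last;  // all attempts exhausted; surface the last failure
    }

    public static void main(String[] args) throws Exception {
        final int[] failures = {2};  // simulate two transient failures
        String result = callWithRetries(() -> {
            if (failures[0]-- > 0) {
                throw new java.io.IOException("connection refused");
            }
            return "ok";
        }, 3, 10L);
        System.out.println(result);  // prints "ok" after two retries
    }
}
```

A real retry policy would also distinguish retriable from fatal exceptions and use backoff rather than a fixed sleep; this sketch only shows the shape of the loop.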

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425577#comment-13425577
 ] 

Hadoop QA commented on HDFS-3667:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538504/h3667_20120730.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2928//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2928//console

This message is automatically generated.

 Add retry support to WebHdfsFileSystem
 --

 Key: HDFS-3667
 URL: https://issues.apache.org/jira/browse/HDFS-3667
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3667_20120718.patch, h3667_20120721.patch, 
 h3667_20120722.patch, h3667_20120725.patch, h3667_20120730.patch, 
 h3667_20120730_b-1.patch


 DFSClient (i.e. DistributedFileSystem) has a configurable retry policy, and it 
 retries on exceptions such as connection failures and safemode.  
 WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3637) Add support for encrypting the DataTransferProtocol

2012-07-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425578#comment-13425578
 ] 

Hadoop QA commented on HDFS-3637:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538495/HDFS-3637.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 10 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.ha.TestZKFailoverController

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2927//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2927//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2927//console

This message is automatically generated.

 Add support for encrypting the DataTransferProtocol
 ---

 Key: HDFS-3637
 URL: https://issues.apache.org/jira/browse/HDFS-3637
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, hdfs client, security
Affects Versions: 2.0.0-alpha
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Attachments: HDFS-3637.patch, HDFS-3637.patch


 Currently all HDFS RPCs performed by NNs/DNs/clients can be optionally 
 encrypted. However, actual data read or written between DNs and clients (or 
 DNs to DNs) is sent in the clear. When processing sensitive data on a shared 
 cluster, confidentiality of the data read/written from/to HDFS may be desired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425580#comment-13425580
 ] 

Hadoop QA commented on HDFS-3667:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12538506/h3667_20120730_b-1.patch
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2929//console

This message is automatically generated.

 Add retry support to WebHdfsFileSystem
 --

 Key: HDFS-3667
 URL: https://issues.apache.org/jira/browse/HDFS-3667
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3667_20120718.patch, h3667_20120721.patch, 
 h3667_20120722.patch, h3667_20120725.patch, h3667_20120730.patch, 
 h3667_20120730_b-1.patch


 DFSClient (i.e. DistributedFileSystem) has a configurable retry policy, and it 
 retries on exceptions such as connection failures and safemode.  
 WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3637) Add support for encrypting the DataTransferProtocol

2012-07-31 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3637:
-

Attachment: HDFS-3637.patch

Identical to the last patch, but fixes the findbugs warning.

 Add support for encrypting the DataTransferProtocol
 ---

 Key: HDFS-3637
 URL: https://issues.apache.org/jira/browse/HDFS-3637
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, hdfs client, security
Affects Versions: 2.0.0-alpha
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Attachments: HDFS-3637.patch, HDFS-3637.patch, HDFS-3637.patch


 Currently all HDFS RPCs performed by NNs/DNs/clients can be optionally 
 encrypted. However, actual data read or written between DNs and clients (or 
 DNs to DNs) is sent in the clear. When processing sensitive data on a shared 
 cluster, confidentiality of the data read/written from/to HDFS may be desired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-528) Add ability for safemode to wait for a minimum number of live datanodes

2012-07-31 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-528:


Attachment: h528_20120731_b-1.patch

h528_20120731_b-1.patch: for branch-1.

 Add ability for safemode to wait for a minimum number of live datanodes
 ---

 Key: HDFS-528
 URL: https://issues.apache.org/jira/browse/HDFS-528
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: scripts
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.22.0

 Attachments: h528_20120731_b-1.patch, hdfs-528-v2.txt, 
 hdfs-528-v3.txt, hdfs-528-v4.txt, hdfs-528.txt, hdfs-528.txt


 When starting up a fresh cluster programmatically, users often want to wait 
 until DFS is writable before continuing in a script. "dfsadmin -safemode 
 wait" doesn't quite work for this on a completely fresh cluster, since when 
 there are 0 blocks on the system, 100% of them are accounted for before any 
 DNs have reported.
 This JIRA is to add a command which waits until a certain number of DNs have 
 reported as alive to the NN.
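
The proposed command amounts to a poll-until-threshold loop. A minimal sketch of that logic follows; `liveCount` stands in for whatever namenode query the real command would use, and every name here is invented for illustration, not taken from the patch.

```java
import java.util.function.IntSupplier;

public class WaitForDatanodes {
    // Poll a live-datanode count until it reaches minLive or timeoutMs elapses.
    static boolean waitForLiveNodes(IntSupplier liveCount, int minLive,
                                    long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (liveCount.getAsInt() >= minLive) {
                return true;       // enough DNs have reported; DFS is writable
            }
            Thread.sleep(pollMs);
        }
        return false;              // timed out before enough DNs reported
    }

    public static void main(String[] args) throws InterruptedException {
        final int[] reported = {0};
        // Simulated cluster: one more DN "reports in" on every poll.
        boolean ok = waitForLiveNodes(() -> ++reported[0], 3, 5000L, 10L);
        System.out.println(ok);    // prints "true"
    }
}
```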

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3703) Decrease the datanode failure detection time

2012-07-31 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425629#comment-13425629
 ] 

nkeywal commented on HDFS-3703:
---

For HBase, it would be good to have an option to cap this value, as HBase 
itself relies on ZooKeeper, and ZooKeeper has a fixed timeout. Imho, if a server 
does not respond to a ping for 30s, in 99% of the cases it won't be able to 
respond to a read or write request either. We could imagine slowing down the 
client if we detect that 20% of the cluster is missing... But today we're still 
optimizing for the simple situations, such as a few nodes missing...

 Decrease the datanode failure detection time
 

 Key: HDFS-3703
 URL: https://issues.apache.org/jira/browse/HDFS-3703
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, name-node
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: nkeywal
Assignee: Suresh Srinivas

 By default, if a box dies, the datanode will be marked as dead by the 
 namenode after 10:30 minutes. In the meantime, this datanode will still be 
 proposed by the namenode to write blocks or to read replicas. It happens as 
 well if the datanode crashes: there are no shutdown hooks to tell the namenode 
 we're not there anymore.
 It is especially an issue with HBase. The HBase regionserver timeout for production 
 is often 30s. So with these configs, when a box dies, HBase starts to recover 
 after 30s while, for 10 minutes, the namenode will consider the blocks on the 
 same box as available. Beyond the write errors, this will trigger a lot of 
 missed reads:
 - during the recovery, HBase needs to read the blocks used on the dead box 
 (the ones in the 'HBase Write-Ahead-Log')
 - after the recovery, reading these data blocks (the 'HBase region') will 
 fail 33% of the time with the default number of replicas, slowing the data 
 access, especially when the errors are socket timeouts (i.e. around 60s most 
 of the time). 
 Globally, it would be ideal if HDFS settings could be under HBase settings. 
 As a side note, HBase relies on ZooKeeper to detect regionserver issues.
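
The 10:30 figure comes from the namenode's dead-node formula as commonly documented for these Hadoop versions: an interval of 2 × the heartbeat recheck interval plus 10 × the heartbeat interval. With the stock defaults (5 minutes and 3 seconds, from `dfs.namenode.heartbeat.recheck-interval` and `dfs.heartbeat.interval` in standard configs, not from this JIRA), that works out as follows:

```java
public class DeadNodeInterval {
    // Namenode marks a DN dead after 2 * recheck + 10 * heartbeat.
    public static long deadIntervalMs(long recheckMs, long heartbeatMs) {
        return 2 * recheckMs + 10 * heartbeatMs;
    }

    public static void main(String[] args) {
        long recheckMs = 5 * 60 * 1000;  // default recheck interval: 5 min
        long heartbeatMs = 3 * 1000;     // default heartbeat interval: 3 s
        long dead = deadIntervalMs(recheckMs, heartbeatMs);
        System.out.println(dead / 1000 + " s");  // prints "630 s" = 10 min 30 s
    }
}
```

Capping the detection time, as suggested above, would mean lowering one or both of these intervals (at the cost of more namenode load from recheck scans).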

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3637) Add support for encrypting the DataTransferProtocol

2012-07-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425663#comment-13425663
 ] 

Hadoop QA commented on HDFS-3637:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538519/HDFS-3637.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 10 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2930//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2930//console

This message is automatically generated.

 Add support for encrypting the DataTransferProtocol
 ---

 Key: HDFS-3637
 URL: https://issues.apache.org/jira/browse/HDFS-3637
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, hdfs client, security
Affects Versions: 2.0.0-alpha
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Attachments: HDFS-3637.patch, HDFS-3637.patch, HDFS-3637.patch


 Currently all HDFS RPCs performed by NNs/DNs/clients can be optionally 
 encrypted. However, actual data read or written between DNs and clients (or 
 DNs to DNs) is sent in the clear. When processing sensitive data on a shared 
 cluster, confidentiality of the data read/written from/to HDFS may be desired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0

2012-07-31 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425798#comment-13425798
 ] 

Robert Joseph Evans commented on HDFS-3731:
---

I thought that hardlinks to directories are not typically supported.  HFS+ on 
the Mac is the only one I know of that allows it.  I am nervous about 
implementing an upgrade path that will only work on a Mac.  Did you actually 
mean a symbolic link, or did you intend to hardlink all of the files in the 
directories? 

 2.0 release upgrade must handle blocks being written from 1.0
 -

 Key: HDFS-3731
 URL: https://issues.apache.org/jira/browse/HDFS-3731
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 2.0.0-alpha
Reporter: Suresh Srinivas
Assignee: Todd Lipcon
Priority: Blocker

 Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 
 release. Problem reported by Brahma Reddy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0

2012-07-31 Thread Raju (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425849#comment-13425849
 ] 

Raju commented on HDFS-3731:


{quote}We should probably hook the finalize code to also rm -rf the 
blocksbeingwritten directory, or else the storage will be leaked forever, 
right?{quote}

Yes, Todd, I forgot to mention finalize; we need to delete the BBW dir in 
finalize.

{quote}If you agree, may be we can file separate JIRA for that, as this JIRA 
mainly talking about 2.0 upgrade from 1.0.
{quote}

Uma, I accept your opinion, but we would need to modify code which is already 
released, and I am not very clear on how to fix that.


{quote}I thought that hardlinks to directories are not typically supported 
...{quote}
Robert, here I am not directly referring to a hardlink like
{code}
ln sourceDir destDir
{code}

I am talking about using 
{code}
void org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocks(File from, 
File to, int oldLV, HardLink hl)
{code}

which will do the individual file linking for all the blocks.
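
Conceptually, per-file linking means hardlinking every file in the source directory into the destination directory rather than linking the directory itself (which most filesystems forbid). The following is a simplified, self-contained stand-in for what `DataStorage.linkBlocks` does, not the actual implementation; it also assumes a filesystem that supports hard links.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class LinkDirFiles {
    // Hardlink each regular file in `from` into `to`; returns the link count.
    static int linkAllFiles(Path from, Path to) throws IOException {
        Files.createDirectories(to);
        int linked = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(from)) {
            for (Path f : files) {
                if (Files.isRegularFile(f)) {
                    // Same inode, no data copy: cheap even for large blocks.
                    Files.createLink(to.resolve(f.getFileName()), f);
                    linked++;
                }
            }
        }
        return linked;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempDirectory("bbw");
        Path dst = src.resolveSibling(src.getFileName() + "-linked");
        Files.write(src.resolve("blk_1"), new byte[]{1});
        Files.write(src.resolve("blk_2"), new byte[]{2});
        System.out.println(linkAllFiles(src, dst));  // prints "2"
    }
}
```

The real `linkBlocks` additionally recurses into subdirectories and handles layout-version differences, which this sketch omits.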




 2.0 release upgrade must handle blocks being written from 1.0
 -

 Key: HDFS-3731
 URL: https://issues.apache.org/jira/browse/HDFS-3731
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 2.0.0-alpha
Reporter: Suresh Srinivas
Assignee: Todd Lipcon
Priority: Blocker

 Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 
 release. Problem reported by Brahma Reddy.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2956) calling fetchdt without a --renewer argument throws NPE

2012-07-31 Thread Raju (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425861#comment-13425861
 ] 

Raju commented on HDFS-2956:


Here we are defining the protocol message as
{code}
message GetDelegationTokenRequestProto {
  required string renewer = 1;
}
{code}

Based on some of the above comments, I feel the renewer should be optional 
(since null can be passed, meaning no renewer is provided).

Even with optional, the generated class will still have a null check for 
renewer, so I guess we can null-check the renewer ourselves, like:

{code}
// Build the request with or without the optional renewer field set,
// keeping req in scope either way.
GetDelegationTokenRequestProto.Builder builder =
    GetDelegationTokenRequestProto.newBuilder();
if (renewer != null) {
  builder.setRenewer(renewer.toString());
}
GetDelegationTokenRequestProto req = builder.build();
{code}
This should be possible since we are declaring the renewer optional; similarly, 
we can parse the message back in the server-side translator.

Please correct me if I am wrong.

 calling fetchdt without a --renewer argument throws NPE
 ---

 Key: HDFS-2956
 URL: https://issues.apache.org/jira/browse/HDFS-2956
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security
Affects Versions: 0.24.0
Reporter: Todd Lipcon
Assignee: Daryn Sharp

 If I call "bin/hdfs fetchdt /tmp/mytoken" without a "--renewer foo" argument, 
 then it will throw a NullPointerException:
 Exception in thread "main" java.lang.NullPointerException
 at 
 org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:830)
 This is because getDelegationToken is being called with a null renewer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3743) QJM: improve formatting behavior for JNs

2012-07-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425982#comment-13425982
 ] 

Andrew Purtell commented on HDFS-3743:
--

Not sure about the notion of automating an unsafe startup in the case where the 
majority of JNs are unformatted. What if, instead, it were possible to start up the 
NN in recovery mode and have it interactively suggest actions, including 
initializing the unformatted JNs? Could it summarize the most recent txn (or a few 
txns) of the available logs before asking which txid to choose as latest?

 QJM: improve formatting behavior for JNs
 

 Key: HDFS-3743
 URL: https://issues.apache.org/jira/browse/HDFS-3743
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon

 Currently, the JournalNodes automatically format themselves when a new writer 
 takes over, if they don't have any data for that namespace. However, this has 
 a few problems:
 1) if the administrator accidentally points a new NN at the wrong quorum (eg 
 corresponding to another cluster), it will auto-format a directory on those 
 nodes. This doesn't cause any data loss, but it would be better to bail out with 
 an error indicating that they need to be formatted.
 2) if a journal node crashes and needs to be reformatted, it should be able 
 to re-join the cluster and start storing new segments without having to fail 
 over to a new NN.
 3) if 2/3 JNs get accidentally reformatted (eg the mount point becomes 
 undone), and the user starts the NN, it should fail to start, because it may 
 end up missing edits. If it auto-formats in this case, the user might have 
 silent rollback of the most recent edits.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3579) libhdfs: fix exception handling

2012-07-31 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3579:
---

  Description: 
libhdfs does not consistently handle exceptions.  Sometimes we don't free the 
memory associated with them (memory leak).  Sometimes we invoke JNI functions 
that are not supposed to be invoked when an exception is active.

Running a libhdfs test program with -Xcheck:jni shows the latter problem 
clearly:
{code}
WARNING in native method: JNI call made with exception pending
WARNING in native method: JNI call made with exception pending
WARNING in native method: JNI call made with exception pending
WARNING in native method: JNI call made with exception pending
WARNING in native method: JNI call made with exception pending
Exception in thread "main" java.io.IOException: ...
{code}

  was:
Running a libhdfs test program with -Xcheck:jni reveals some problems.

{code}
WARNING in native method: JNI call made with exception pending
WARNING in native method: JNI call made with exception pending
WARNING in native method: JNI call made with exception pending
WARNING in native method: JNI call made with exception pending
WARNING in native method: JNI call made with exception pending
Exception in thread "main" java.io.IOException: ...
{code}

The problem seems to be that in errnoFromException, we are calling 
classNameOfObject and some other JNI methods prior to clearing the pending 
exception.  It should be simple enough to avoid doing this.

 Priority: Major  (was: Minor)
 Target Version/s: 2.2.0-alpha
Affects Version/s: (was: 2.1.0-alpha)
   2.0.1-alpha
Fix Version/s: (was: 2.1.0-alpha)
  Summary: libhdfs: fix exception handling  (was: libhdfs: fix 
WARNING in native method: JNI call made with exception pending in 
errnoFromException)

 libhdfs: fix exception handling
 ---

 Key: HDFS-3579
 URL: https://issues.apache.org/jira/browse/HDFS-3579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3579.004.patch, HDFS-3579.005.patch


 libhdfs does not consistently handle exceptions.  Sometimes we don't free the 
 memory associated with them (memory leak).  Sometimes we invoke JNI functions 
 that are not supposed to be invoked when an exception is active.
 Running a libhdfs test program with -Xcheck:jni shows the latter problem 
 clearly:
 {code}
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 Exception in thread "main" java.io.IOException: ...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3739) Not all HA props support nameservice-id specific config

2012-07-31 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe reassigned HDFS-3739:
--

Assignee: Colin Patrick McCabe

 Not all HA props support nameservice-id specific config
 ---

 Key: HDFS-3739
 URL: https://issues.apache.org/jira/browse/HDFS-3739
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Colin Patrick McCabe

 Many parameters like dfs.client.failover.proxy.provider, 
 dfs.namenode.rpc-address, and dfs.namenode.http-address support 
 nameservice-id specific definition in an effort to support a single 
 configuration file everywhere. Some HA properties, however, don't seem to 
 support this. Specifically, dfs.namenode.shared.edits.dir, 
 dfs.ha.fencing.methods, dfs.ha.automatic-failover.enabled, and 
 ha.zookeeper.quorum (core-site) seem to have this issue (or are documented as 
 such).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-07-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-3672:
--

Attachment: hdfs-3672-4.patch

Fix findbugs. I also parallelized the DN RPCs with Callables and a threadpool.
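
The parallelization described here can be sketched as follows: submit one Callable per datanode to a fixed-size thread pool and collect the responses in order. This is a minimal illustration under invented names (the placeholder lambda stands in for the real per-DN disk-id RPC); it is not the code from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelDnRpcs {
    // Issue one "RPC" per datanode concurrently; results come back in
    // task-submission order because invokeAll preserves ordering.
    static List<String> queryAll(List<String> datanodes, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String dn : datanodes) {
                tasks.add(() -> "diskIds-from-" + dn);  // placeholder RPC
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());  // propagates any per-DN failure
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(queryAll(Arrays.asList("dn1", "dn2"), 2));
        // prints "[diskIds-from-dn1, diskIds-from-dn2]"
    }
}
```

A production version would also bound each RPC with a timeout and decide whether one slow or failed datanode should fail the whole call or just its entry.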

 Expose disk-location information for blocks to enable better scheduling
 ---

 Key: HDFS-3672
 URL: https://issues.apache.org/jira/browse/HDFS-3672
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-3672-1.patch, hdfs-3672-2.patch, hdfs-3672-3.patch, 
 hdfs-3672-4.patch


 Currently, HDFS exposes on which datanodes a block resides, which allows 
 clients to make scheduling decisions for locality and load balancing. 
 Extending this to also expose on which disk on a datanode a block resides 
 would enable even better scheduling, on a per-disk rather than coarse 
 per-datanode basis.
 This API would likely look similar to Filesystem#getFileBlockLocations, but 
 also involve a series of RPCs to the responsible datanodes to determine disk 
 ids.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3744) Decommissioned nodes are included in cluster after switch which is not expected

2012-07-31 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426154#comment-13426154
 ] 

Aaron T. Myers commented on HDFS-3744:
--

Brahma, in the above description, I assume that you excluded DN1 by adding it 
to the excludes file? Did you add it to that file on both of the NN machines?

The issue is that some dfsadmin commands need to perform client-side failover 
and talk to the appropriate NN, while others should actually be run against 
both NNs.

 Decommissioned nodes are included in cluster after switch which is not 
 expected
 ---

 Key: HDFS-3744
 URL: https://issues.apache.org/jira/browse/HDFS-3744
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.0.0-alpha, 2.1.0-alpha, 2.0.1-alpha
Reporter: Brahma Reddy Battula

 Scenario:
 =
 Start ANN and SNN with three DNs.
 Exclude DN1 from the cluster by using the decommission feature 
 (./hdfs dfsadmin -fs hdfs://ANNIP:8020 -refreshNodes).
 After the decommission is successful, do a switch such that the SNN becomes Active.
 Here the excluded node (DN1) is included in the cluster. We are able to write files 
 to the excluded node since it's not excluded.
 Checked the SNN (which was Active before the switch) UI: decommissioned=1; the ANN 
 UI shows decommissioned=0.
 One more observation:
 
 All dfsadmin commands will create a proxy only on nn1, irrespective of which NN is 
 Active or standby. I think we need to re-look at this as well.
 I do not understand why dfsadmin commands are not given HA support.
 Please correct me if I am wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3579) libhdfs: fix exception handling

2012-07-31 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426166#comment-13426166
 ] 

Andy Isaacson commented on HDFS-3579:
-

{code}
...
+jvalue jVal;
+
+jthrowable jthr = classNameOfObject(exc, env, className);
+if (jthr) {
{code}
the blank line should be after the declaration of jthr.
{code}
+fprintf(stderr, "PrintExceptionAndFree: error determining class name "
+"of exception!");
{code}
add a \n to this printf.  If it were me I would replace all the surprised "!"s 
with "." but I am not going to insist on unexcitifying the error messages.
{code}
+fprintf(stderr, " error: (no exception)");
{code}
another missing \n.
{code}
-constructNewObjectOfClass(env, NULL, "org/apache/hadoop/fs/Path",
+jthr = constructNewObjectOfClass(env, jPath, "org/apache/hadoop/fs/Path",
   "(Ljava/lang/String;)V", jPathString);
{code}
Indent the continuation line to match the "(".
{code}
+jthr = newCStr(env, jRet, val);
+if (jthr)
 goto done;
 done:
{code}
Having a "goto done" on the line before "done:" is a bit confusing.  I'd just 
drop it.
{code}
--- hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.h
+++ hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.h
...
+void destroyLocalReference(JNIEnv *env, jobject jObject);
{code}
Since this is an external-linkage C API, I think we need to use a reserved 
namespace prefix to avoid breaking outside code.  So, make this 
hdfsDestroyLocalReference, perhaps?  This is a large change, and this is newly 
exposed code, so that change can be done in a separate jira.

An alternate fix, if we only support a .so and don't ship a .a, is to use a 
linker script to control symbol visibility.

But one way or another we need to make sure we have sanitary symbol table usage.

{code}
+if (jthr) {
+ret = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
+"hdfsDisconnect: org.apache.hadoop.fs.FileSystem::close");
{code}
I don't think "::" is right for Java?  I think it should be 
"org.apache.hadoop.fs.FileSystem#close".

Please check all the printExceptionAndFree calls and verify that they are 
consistent; I see some with and some without the org.class.path.  I don't care 
which standard, but I think "cFuncName: org.class.path#method" is the right 
string unless there's a better idea.

{code}
 //bufferSize
 if (!bufferSize) {
{code}
Please delete this useless comment. (Not your code, but you're editing right 
here, let's just fix it.)

{code}
 }  else if ((flags & O_WRONLY) && (flags & O_APPEND)) {
+// WRITE/APPEND?
+   jthr = invokeMethod(env, jVal, INSTANCE, jFS, HADOOP_FS,
{code}
There's a bunch of funky whitespace here, let's fix it.  "}  else" has an 
extra space, and I think the "// WRITE" comment is indented one space too far.
{code}
+file = calloc(1, sizeof(struct hdfsFile_internal));
{code}
I find {{file = calloc(1, sizeof *file);}} easier to convince myself of, but I 
don't know if we have a local style guideline on this point?
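For what it's worth, the difference between the two spellings is only robustness to type changes; a minimal, self-contained illustration (generic struct standing in for hdfsFile_internal):

```c
#include <stdlib.h>

/* Illustrative struct; the real hdfsFile_internal layout doesn't
 * matter for the point being made. */
struct file_sketch {
    void *stream;
    int   flags;
};

static struct file_sketch *alloc_file(void)
{
    /* Equivalent to calloc(1, sizeof(struct file_sketch)), but
     * "sizeof *file" keeps tracking the variable's type even if
     * that type is later renamed or changed. */
    struct file_sketch *file = calloc(1, sizeof *file);
    return file;
}
```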
{code}
+if ((flags & O_WRONLY) == 0) {
{code}
This is wrong per HDFS-3710. (Many occurrences of this pattern.)  No need to 
fix the existing ones outside of the code you're touching, but please don't add 
more.
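My reading of HDFS-3710: O_RDONLY, O_WRONLY and O_RDWR form a small enumeration in the low bits of the flags word (O_RDONLY is typically 0), not independent bit flags, so masking against O_WRONLY misclassifies O_RDWR handles. A sketch of the safe form:

```c
#include <fcntl.h>

/* The access mode must be extracted with O_ACCMODE and compared
 * for equality.  "(flags & O_WRONLY) == 0" would wrongly report
 * an O_RDWR descriptor as read-only, since on typical
 * implementations O_RDWR & O_WRONLY == 0. */
static int is_read_only(int flags)
{
    return (flags & O_ACCMODE) == O_RDONLY;
}

static int is_write_only(int flags)
{
    return (flags & O_ACCMODE) == O_WRONLY;
}
```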

more to come in next comment...

 libhdfs: fix exception handling
 ---

 Key: HDFS-3579
 URL: https://issues.apache.org/jira/browse/HDFS-3579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3579.004.patch, HDFS-3579.005.patch


 libhdfs does not consistently handle exceptions.  Sometimes we don't free the 
 memory associated with them (memory leak).  Sometimes we invoke JNI functions 
 that are not supposed to be invoked when an exception is active.
 Running a libhdfs test program with -Xcheck:jni shows the latter problem 
 clearly:
 {code}
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 Exception in thread main java.io.IOException: ...
 {code}

--




[jira] [Created] (HDFS-3745) fsck prints that it's using KSSL even when it's in fact using SPNEGO for authentication

2012-07-31 Thread Aaron T. Myers (JIRA)
Aaron T. Myers created HDFS-3745:


 Summary: fsck prints that it's using KSSL even when it's in fact 
using SPNEGO for authentication
 Key: HDFS-3745
 URL: https://issues.apache.org/jira/browse/HDFS-3745
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client, security
Affects Versions: 2.0.0-alpha, 1.2.0
Reporter: Aaron T. Myers
Priority: Trivial


In branch-2 (which exclusively uses SPNEGO for HTTP authentication) and in 
branch-1 (which can optionally use SPNEGO for HTTP authentication), running 
fsck will print the following, which isn't quite right:

{quote}
FSCK started by hdfs (auth:KERBEROS_SSL) from...
{quote}

--




[jira] [Commented] (HDFS-3745) fsck prints that it's using KSSL even when it's in fact using SPNEGO for authentication

2012-07-31 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426169#comment-13426169
 ] 

Aaron T. Myers commented on HDFS-3745:
--

I should've mentioned, this issue was discovered by Stephen Chu.

 fsck prints that it's using KSSL even when it's in fact using SPNEGO for 
 authentication
 ---

 Key: HDFS-3745
 URL: https://issues.apache.org/jira/browse/HDFS-3745
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client, security
Affects Versions: 1.2.0, 2.0.0-alpha
Reporter: Aaron T. Myers
Priority: Trivial
  Labels: newbie

 In branch-2 (which exclusively uses SPNEGO for HTTP authentication) and in 
 branch-1 (which can optionally use SPNEGO for HTTP authentication), running 
 fsck will print the following, which isn't quite right:
 {quote}
 FSCK started by hdfs (auth:KERBEROS_SSL) from...
 {quote}

--




[jira] [Commented] (HDFS-3721) hsync support broke wire compatibility

2012-07-31 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426176#comment-13426176
 ] 

Aaron T. Myers commented on HDFS-3721:
--

The patch looks pretty good to me, and I agree it's a good refactor to make 
RemoteBlockReader and RemoteBlockReader2 share some code.

One tiny nit on the code, it looks like you have some extra whitespace here:
{code}
if ( checksumBuf.capacity() != checksumLen) {
{code}

It looks to me like the TestFileConcurrentReader failure is due to this patch. 
I can't recall that test being flaky, and at least on my box the test passes 
without this patch, but fails with it applied.

 hsync support broke wire compatibility
 --

 Key: HDFS-3721
 URL: https://issues.apache.org/jira/browse/HDFS-3721
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client
Affects Versions: 2.1.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Attachments: hdfs-3721.txt


 HDFS-744 added support for hsync to the data transfer wire protocol. However, 
 it actually broke wire compatibility: if the client has hsync support but the 
 server does not, the client cannot read or write data on the old cluster.

--




[jira] [Commented] (HDFS-3721) hsync support broke wire compatibility

2012-07-31 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426182#comment-13426182
 ] 

Suresh Srinivas commented on HDFS-3721:
---

Todd, I meant to review this. But the code refactoring, even though it is a 
good idea, has made the review difficult. Given that I may not be able to get 
through my code review in a short period of time, here are the comments I had 
based on my review so far:


# DFSOutputStream.java
#* Packet constructor: can you please add javadoc (especially to describe 
pktSize)?
#* Math in computePacketChunkSize seems correct, but it results in a different 
value from the previous code.
# PacketReceiver.java
#* Make #bufferPool final
# PacketHeader.java
#* Builder import not used
#* Please add javadoc on PacketHeader structure
#* BlockSender.java javadoc could just point to the javadoc of PacketHeader for 
header information. We have this in multiple places.


 hsync support broke wire compatibility
 --

 Key: HDFS-3721
 URL: https://issues.apache.org/jira/browse/HDFS-3721
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client
Affects Versions: 2.1.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Attachments: hdfs-3721.txt


 HDFS-744 added support for hsync to the data transfer wire protocol. However, 
 it actually broke wire compatibility: if the client has hsync support but the 
 server does not, the client cannot read or write data on the old cluster.

--




[jira] [Commented] (HDFS-3721) hsync support broke wire compatibility

2012-07-31 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426183#comment-13426183
 ] 

Suresh Srinivas commented on HDFS-3721:
---

Also there are a bunch of empty-line additions in the patch that could be 
removed (in DFSOutputStream.java, RemoteBlockReader2.java, etc.).

 hsync support broke wire compatibility
 --

 Key: HDFS-3721
 URL: https://issues.apache.org/jira/browse/HDFS-3721
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client
Affects Versions: 2.1.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Attachments: hdfs-3721.txt


 HDFS-744 added support for hsync to the data transfer wire protocol. However, 
 it actually broke wire compatibility: if the client has hsync support but the 
 server does not, the client cannot read or write data on the old cluster.

--




[jira] [Comment Edited] (HDFS-3721) hsync support broke wire compatibility

2012-07-31 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426182#comment-13426182
 ] 

Suresh Srinivas edited comment on HDFS-3721 at 7/31/12 10:36 PM:
-

Todd, I meant to review this. But the code refactoring, even though it is a 
good idea, has made the review difficult. Given that I may not be able to get 
through my code review in a short period of time, here are the comments I had 
accumulated based on my review so far:


# DFSOutputStream.java
#* Packet constructor: can you please add javadoc (especially to describe 
pktSize)?
#* Math in computePacketChunkSize seems correct, but it results in a different 
value from the previous code.
# PacketReceiver.java
#* Make #bufferPool final
# PacketHeader.java
#* Builder import not used
#* Please add javadoc on PacketHeader structure
#* BlockSender.java javadoc could just point to the javadoc of PacketHeader for 
header information. We have this in multiple places.


  was (Author: sureshms):
Todd, I meant to review this. But the code refactoring, even though it is a 
good idea, has made the review difficult. Given that I may not be able to get 
though my code review, in a short period of time, here are the comments I had 
based on my review so far:


# DFSOutputStream.java
#* Packet consturctor, can you please add javadoc (especially to describe 
pktSize)
#* Math in computePacketChunkSize seems correct, but it results in different 
value from previous code.
# PacketReceiver.java
#* Make #bufferPool final
# PacketHeader.java
#* Builder import not used
#* Please add javadoc on PacketHeader structure
#* BlockSender.java javadoc could just point to the javadoc of PacketHeader for 
header information. We have this in multiple places.

  
 hsync support broke wire compatibility
 --

 Key: HDFS-3721
 URL: https://issues.apache.org/jira/browse/HDFS-3721
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client
Affects Versions: 2.1.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Attachments: hdfs-3721.txt


 HDFS-744 added support for hsync to the data transfer wire protocol. However, 
 it actually broke wire compatibility: if the client has hsync support but the 
 server does not, the client cannot read or write data on the old cluster.

--




[jira] [Commented] (HDFS-3475) Make the replication monitor multipliers configurable

2012-07-31 Thread Adam Muise (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426186#comment-13426186
 ] 

Adam Muise commented on HDFS-3475:
--

Note on this value: while we tested at 100, this may be too high for a cluster 
under even moderate workload. The cluster in question had very powerful nodes 
and dual-bonded 10Gb interfaces. We also had to increase the DataNode memory 
to the range of 4-6 GB. You may choose to start the value at 10 and go up 
based on your available memory for the DataNode, I/O, and network capacity. 

 Make the replication monitor multipliers configurable
 -

 Key: HDFS-3475
 URL: https://issues.apache.org/jira/browse/HDFS-3475
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial
 Fix For: 2.1.0-alpha

 Attachments: HDFS-3475.patch, HDFS-3475.patch, HDFS-3475.patch


 BlockManager currently hardcodes the following two constants:
 {code}
 private static final int INVALIDATE_WORK_PCT_PER_ITERATION = 32;
 private static final int REPLICATION_WORK_MULTIPLIER_PER_ITERATION = 2;
 {code}
 These are used to throttle/limit the amount of deletion and 
 replication-to-other-DN work done per heartbeat interval of a live DN.
 Not many have had reasons to want these changed so far, but there have been a 
 few requests I've faced over the past year from a variety of clusters I've 
 helped maintain. I think with the improvements in disks and networks that 
 have already started to be rolled out in production environments out there, 
 changing these may start making sense to some.
 Let's at least make them advanced-configurable, with proper docs that warn 
 adequately, and with the defaults being what they are today. With hardcodes, 
 it comes down to a recompile for admins, which is not something they may like.
 Please let me know your thoughts.
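To make the throttle concrete, here is a back-of-the-envelope sketch of how the two constants bound per-iteration work, based on my reading of BlockManager (treat the formulas as an approximation, not a quote of the source):

```c
/* Approximate per-iteration work limits implied by the two
 * hardcoded constants (my reading of BlockManager; the exact
 * code may differ). */
#define INVALIDATE_WORK_PCT_PER_ITERATION 32
#define REPLICATION_WORK_MULTIPLIER_PER_ITERATION 2

/* Up to 2 block replications scheduled per live DN per pass. */
static int blocks_to_replicate(int num_live_nodes)
{
    return num_live_nodes * REPLICATION_WORK_MULTIPLIER_PER_ITERATION;
}

/* Deletion work dispatched to roughly 32% of live DNs per pass
 * (integer ceiling, so small clusters still make progress). */
static int nodes_to_invalidate(int num_live_nodes)
{
    return (num_live_nodes * INVALIDATE_WORK_PCT_PER_ITERATION + 99) / 100;
}
```

So on a 100-node cluster the defaults schedule at most 200 replications and send deletion commands to 32 datanodes per iteration, which suggests why raising the multiplier mainly costs DataNode memory, I/O, and network capacity, as noted in the comment above.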

--




[jira] [Updated] (HDFS-3735) [ NNUI -- NNJspHelper.java ] Last three fields not considered for display data in sorting

2012-07-31 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-3735:
--

Description: 
Live datanode list is not correctly sorted for the columns "Block Pool Used 
(GB)", "Block Pool Used (%)" and "Failed Volumes". Read comments for more 
details.

  was:
 *Fields which will display on dfsnodelist.jsp?whatNodes=LIVE (NN UI - click on 
live nodes)* 

Node | Last-contact | Admin State | Configured Capacity (GB) | Used (GB) | 
Non DFS Used (GB) | Remaining (GB) | Used (%) | Used (%) | Remaining (%) | 
Blocks | Block Pool Used (GB) | Block Pool Used (%) | Failed Volumes
{color:red}the last three fields are not considered for sorting{color}

We can display data in sorted order by clicking on any one of the above 
fields. It works fine for all fields except the last three.

 *code where we are considering fileds.* 
{code}
class NodeComapare implements Comparator<DatanodeDescriptor> {
  static final int 
FIELD_NAME  = 1,
FIELD_LAST_CONTACT  = 2,
FIELD_BLOCKS= 3,
FIELD_CAPACITY  = 4,
FIELD_USED  = 5,
FIELD_PERCENT_USED  = 6,
FIELD_NONDFS_USED   = 7,
FIELD_REMAINING = 8,
FIELD_PERCENT_REMAINING = 9,
FIELD_ADMIN_STATE   = 10,
FIELD_DECOMMISSIONED= 11,
SORT_ORDER_ASC  = 1,
SORT_ORDER_DSC  = 2;
{code}

Here, the last three fields are not considered; hence the default field is 
assigned.
 {code}
 } else if (field.equals("blocks")) {
  sortField = FIELD_BLOCKS;
} else if (field.equals("adminstate")) {
  sortField = FIELD_ADMIN_STATE;
} else if (field.equals("decommissioned")) {
  sortField = FIELD_DECOMMISSIONED;
} else {
  sortField = FIELD_NAME;
}
{code}

Please correct me if I am wrong...



 [ NNUI -- NNJspHelper.java ] Last three fields not considered for display 
 data in sorting
 --

 Key: HDFS-3735
 URL: https://issues.apache.org/jira/browse/HDFS-3735
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 2.0.0-alpha, 2.0.1-alpha
Reporter: Brahma Reddy Battula
Priority: Minor

 Live datanode list is not correctly sorted for the columns "Block Pool Used 
 (GB)", "Block Pool Used (%)" and "Failed Volumes". Read comments for more 
 details.

--




[jira] [Commented] (HDFS-3735) [ NNUI -- NNJspHelper.java ] Last three fields not considered for display data in sorting

2012-07-31 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426197#comment-13426197
 ] 

Suresh Srinivas commented on HDFS-3735:
---

Brahma, can you please keep the description of the jira short and add the 
details in a comment later. 

Your analysis seems to be correct. Please post a patch, I will review it and 
commit it.

 [ NNUI -- NNJspHelper.java ] Last three fields not considered for display 
 data in sorting
 --

 Key: HDFS-3735
 URL: https://issues.apache.org/jira/browse/HDFS-3735
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 2.0.0-alpha, 2.0.1-alpha
Reporter: Brahma Reddy Battula
Priority: Minor

 Live datanode list is not correctly sorted for the columns "Block Pool Used 
 (GB)", "Block Pool Used (%)" and "Failed Volumes". Read comments for more 
 details.

--




[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-07-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426203#comment-13426203
 ] 

Hadoop QA commented on HDFS-3672:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538617/hdfs-3672-4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2931//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2931//console

This message is automatically generated.

 Expose disk-location information for blocks to enable better scheduling
 ---

 Key: HDFS-3672
 URL: https://issues.apache.org/jira/browse/HDFS-3672
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-3672-1.patch, hdfs-3672-2.patch, hdfs-3672-3.patch, 
 hdfs-3672-4.patch


 Currently, HDFS exposes on which datanodes a block resides, which allows 
 clients to make scheduling decisions for locality and load balancing. 
 Extending this to also expose on which disk on a datanode a block resides 
 would enable even better scheduling, on a per-disk rather than coarse 
 per-datanode basis.
 This API would likely look similar to Filesystem#getFileBlockLocations, but 
 also involve a series of RPCs to the responsible datanodes to determine disk 
 ids.

--




[jira] [Updated] (HDFS-3723) All commands should support meaningful --help

2012-07-31 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-3723:


Attachment: (was: HDFS-3723.patch)

 All commands should support meaningful --help
 -

 Key: HDFS-3723
 URL: https://issues.apache.org/jira/browse/HDFS-3723
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
 Attachments: HDFS-3723.patch


 Some (sub)commands support -help or -h options for detailed help while others 
 do not. Ideally, all commands should support meaningful help that works 
 regardless of current state or configuration.
 For example, hdfs zkfc --help (or -h or -help) is not very useful. Option 
 checking should occur before state / configuration checking.
 {code}
 [esammer@hadoop-fed01 ~]# hdfs zkfc --help
 Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: 
 HA is not enabled for this namenode.
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
 {code}
 This would go a long way toward better usability for ops staff.
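The ordering fix being asked for can be sketched in a few lines; here in C (illustrative only, not the actual DFSZKFailoverController code): recognize the help flags before any state or configuration is touched, so --help works even on a misconfigured node.

```c
#include <string.h>

/* Illustrative sketch: scan argv for a help flag before loading
 * any configuration, so "--help" cannot be preempted by a
 * configuration error.  Names are made up for the example. */
static int wants_help(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-h") == 0 ||
            strcmp(argv[i], "-help") == 0 ||
            strcmp(argv[i], "--help") == 0)
            return 1;
    }
    return 0;
}
```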

--




[jira] [Updated] (HDFS-3723) All commands should support meaningful --help

2012-07-31 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-3723:


Attachment: HDFS-3723.patch

 All commands should support meaningful --help
 -

 Key: HDFS-3723
 URL: https://issues.apache.org/jira/browse/HDFS-3723
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
 Attachments: HDFS-3723.patch


 Some (sub)commands support -help or -h options for detailed help while others 
 do not. Ideally, all commands should support meaningful help that works 
 regardless of current state or configuration.
 For example, hdfs zkfc --help (or -h or -help) is not very useful. Option 
 checking should occur before state / configuration checking.
 {code}
 [esammer@hadoop-fed01 ~]# hdfs zkfc --help
 Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: 
 HA is not enabled for this namenode.
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
 {code}
 This would go a long way toward better usability for ops staff.

--




[jira] [Commented] (HDFS-3723) All commands should support meaningful --help

2012-07-31 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426207#comment-13426207
 ] 

Aaron T. Myers commented on HDFS-3723:
--

Hi Jing Zhao, you should mark the patch "Patch Available" by clicking the 
"Submit Patch" button so that the pre-commit Jenkins tests run.

I'll do that for you now.

 All commands should support meaningful --help
 -

 Key: HDFS-3723
 URL: https://issues.apache.org/jira/browse/HDFS-3723
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
 Attachments: HDFS-3723.patch


 Some (sub)commands support -help or -h options for detailed help while others 
 do not. Ideally, all commands should support meaningful help that works 
 regardless of current state or configuration.
 For example, hdfs zkfc --help (or -h or -help) is not very useful. Option 
 checking should occur before state / configuration checking.
 {code}
 [esammer@hadoop-fed01 ~]# hdfs zkfc --help
 Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: 
 HA is not enabled for this namenode.
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
 {code}
 This would go a long way toward better usability for ops staff.

--




[jira] [Updated] (HDFS-3723) All commands should support meaningful --help

2012-07-31 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3723:
-

Target Version/s: 2.2.0-alpha
  Status: Patch Available  (was: Open)

 All commands should support meaningful --help
 -

 Key: HDFS-3723
 URL: https://issues.apache.org/jira/browse/HDFS-3723
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
 Attachments: HDFS-3723.patch


 Some (sub)commands support -help or -h options for detailed help while others 
 do not. Ideally, all commands should support meaningful help that works 
 regardless of current state or configuration.
 For example, hdfs zkfc --help (or -h or -help) is not very useful. Option 
 checking should occur before state / configuration checking.
 {code}
 [esammer@hadoop-fed01 ~]# hdfs zkfc --help
 Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: 
 HA is not enabled for this namenode.
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
 {code}
 This would go a long way toward better usability for ops staff.

--




[jira] [Updated] (HDFS-3579) libhdfs: fix exception handling

2012-07-31 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3579:
---

Attachment: HDFS-3579.006.patch

* rebase on trunk

* fix whitespace

* NOPRINT_EXC_ILLEGAL_ARGUMENT should be 0x10

* get rid of "typedef jvalue RetVal" -- it provides no value beyond just using 
jvalue directly

* Use "#" instead of "::" to describe Java methods

* don't restate the long name of Java classes in error messages -- the 
exceptions themselves contain that information, so it's just visual clutter

* newRuntimeError is declared in exception.h, so it doesn't need to also be 
declared in jni_helper.h

 libhdfs: fix exception handling
 ---

 Key: HDFS-3579
 URL: https://issues.apache.org/jira/browse/HDFS-3579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3579.004.patch, HDFS-3579.005.patch, 
 HDFS-3579.006.patch


 libhdfs does not consistently handle exceptions.  Sometimes we don't free the 
 memory associated with them (memory leak).  Sometimes we invoke JNI functions 
 that are not supposed to be invoked when an exception is active.
 Running a libhdfs test program with -Xcheck:jni shows the latter problem 
 clearly:
 {code}
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 Exception in thread "main" java.io.IOException: ...
 {code}





[jira] [Commented] (HDFS-3579) libhdfs: fix exception handling

2012-07-31 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426214#comment-13426214
 ] 

Colin Patrick McCabe commented on HDFS-3579:


Thanks for the review, Andy.  I'm looking forward to the next comment.

bq. [O_WRONLY discussion]

I held off on the O_WRONLY fixes, since we have HDFS-3710 open for that.  What 
I've done here is just move the code, not add any new (mis)uses.

bq. [linker script discussion]

Yeah, HDFS-3742 is open for this.

I tried to fix as much whitespace as I could; I'm sure there's still funky 
stuff lingering somewhere.  We can always circle back on that later, though-- 
as long as this patch moves things in the right direction.





[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-31 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426216#comment-13426216
 ] 

Suresh Srinivas commented on HDFS-3667:
---

+1 for the new patch.

 Add retry support to WebHdfsFileSystem
 --

 Key: HDFS-3667
 URL: https://issues.apache.org/jira/browse/HDFS-3667
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3667_20120718.patch, h3667_20120721.patch, 
 h3667_20120722.patch, h3667_20120725.patch, h3667_20120730.patch, 
 h3667_20120730_b-1.patch


 DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and it 
 retries on exceptions such as connection failure, safemode.  
 WebHdfsFileSystem should have similar retry support.
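The behaviour described here can be sketched as a policy-driven retry loop (a simplified, hypothetical stand-in for Hadoop's org.apache.hadoop.io.retry machinery; maxRetries and sleepMs are illustrative knobs, not the patch's actual configuration):

```java
import java.util.concurrent.Callable;

// Minimal retry-loop sketch: retry a failed operation a bounded number of
// times with a fixed sleep between attempts, rethrowing the last failure.
public class SimpleRetry {
    public static <T> T runWithRetries(Callable<T> op, int maxRetries, long sleepMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;  // e.g. a connection failure or safemode exception
                if (attempt < maxRetries) {
                    Thread.sleep(sleepMs);
                }
            }
        }
        throw last;
    }
}
```

For an HTTP-based file system, only idempotent operations can safely be retried unconditionally; non-idempotent ones (e.g. create without overwrite) need more care.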





[jira] [Assigned] (HDFS-2984) S-live: Rate operation count for delete is worse than 0.20.204 by 28.8%

2012-07-31 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash reassigned HDFS-2984:
--

Assignee: Ravi Prakash  (was: Eric Payne)

 S-live: Rate operation count for delete is worse than 0.20.204 by 28.8%
 ---

 Key: HDFS-2984
 URL: https://issues.apache.org/jira/browse/HDFS-2984
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: benchmarks
Affects Versions: 0.23.1
Reporter: Vinay Kumar Thota
Assignee: Ravi Prakash
Priority: Critical

 Rate operation count for delete is worse than 0.20.204.xx by 28.8%





[jira] [Updated] (HDFS-2984) S-live: Rate operation count for delete is worse than 0.20.204 by 28.8%

2012-07-31 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-2984:
---

Attachment: slive.tar.gz

Ok! I've been slacking on this bug for way too long. But here are my 
experiments and the data.

WHAT ARE THE FILES IN THIS TARBALL?

{{patch}} is the diff of 2 minor optimizations I made in hadoop-23.

I then ran Slive on clean HDFS installations for 0.23 and 0.204. These are the 
commands I ran. First, create 20 files (hopefully that's what it does, 
though it's not important if it doesn't):
bin/hadoop org.apache.hadoop.fs.slive.SliveTest -duration 50 -dirSize 1225 
-files 20 -maps 4 -readSize 104850,104850 -writeSize 104850,104850 
-appendSize 104850,104850 -replication 1,1 -reduces 1 -blockSize 1024,1024 
-mkdir 0,uniform -rename 0,uniform -append 0,uniform -delete 0,uniform -ls 
0,uniform -read 0,uniform -create 100,uniform
and then delete 5 files (again, hopefully that's what it does):
bin/hadoop org.apache.hadoop.fs.slive.SliveTest -duration 50 -dirSize 1225 
-files 5 -maps 4 -readSize 104850,104850 -writeSize 104850,104850 
-appendSize 104850,104850 -replication 1,1 -reduces 1 -blockSize 1024,1024 
-mkdir 0,uniform -rename 0,uniform -append 0,uniform -delete 100,uniform -ls 
0,uniform -read 0,uniform -create 0,uniform
I do this 3 times, hence the 6 files:
branch.C200 - create 200k files
branch.C200D50 - delete 50k files

In the last run, I delete 50 files, and use jvisualvm to create snapshots
while I am profiling. The two snapshot*.npm files can be loaded into jvisualvm.



OBSERVATIONS
=

Create seems to be twice as fast in 0.23. So I'm not too worried about that.

Delete on the other hand is a lot slower. I've tried optimizing, but I don't
know if there's much else that can be done. A huge reason is probably this:
http://blog.rapleaf.com/dev/2011/06/16/java-performance-synchronized-vs-lock/
In 0.20 we were using the synchronized variant, which, although 2-7.5x
faster (as reported in the blog), is unfair. In 0.23 we are using a fair
ReentrantReadWriteLock. This is obviously going to be slower, and since
writeLock() is what's taking the most time (ref the jvisualvm profile), I am
led to believe that we must incur the performance hit in order to be fair.
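To illustrate the trade-off: fairness is just a constructor flag on ReentrantReadWriteLock, and the fair variant grants the lock in roughly FIFO arrival order at a significant throughput cost.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// The two locking schemes discussed above differ only in the fairness flag
// passed at construction time.
public class LockFairnessDemo {
    public static ReentrantReadWriteLock makeLock(boolean fair) {
        return new ReentrantReadWriteLock(fair);
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock fairLock = makeLock(true);     // FIFO-ish, slower
        ReentrantReadWriteLock bargingLock = makeLock(false); // default, faster
        System.out.println(fairLock.isFair());    // prints true
        System.out.println(bargingLock.isFair()); // prints false
    }
}
```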

Comments are welcome. Please let me know your thoughts.


@Todd: These are on the latest branch-23 
74fd5cb929adc926a13eb062df7869894c0cc013





[jira] [Commented] (HDFS-2984) S-live: Rate operation count for delete is worse than 0.20.204 by 28.8%

2012-07-31 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426236#comment-13426236
 ] 

Ravi Prakash commented on HDFS-2984:


@Todd : And yes these are against branch-204. 
43d9fd86c87514bd0e0e9ea19c84c2a0109bf77e





[jira] [Commented] (HDFS-3579) libhdfs: fix exception handling

2012-07-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426253#comment-13426253
 ] 

Hadoop QA commented on HDFS-3579:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538632/HDFS-3579.006.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.  Please justify why no new tests are needed for this patch.  Also 
please list what manual steps were performed to verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2932//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2932//console

This message is automatically generated.





[jira] [Commented] (HDFS-3738) TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config

2012-07-31 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426260#comment-13426260
 ] 

Eli Collins commented on HDFS-3738:
---

+1

And thanks for the spelunking that led to HDFS-3401.

 TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config
 --

 Key: HDFS-3738
 URL: https://issues.apache.org/jira/browse/HDFS-3738
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
 Attachments: HDFS-3738.patch, HDFS-3738.patch


 TestDFSClientRetries#testFailuresArePerOperation involves testing retries by 
 making use of expected timeouts. However, this test sets the wrong config to 
 lower the timeout, and thus takes far longer than it should.





[jira] [Commented] (HDFS-3579) libhdfs: fix exception handling

2012-07-31 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426262#comment-13426262
 ] 

Andy Isaacson commented on HDFS-3579:
-

{code}
@@ -870,15 +812,12 @@ int hdfsCloseFile(hdfsFS fs, hdfsFile file)
-//Parameters
-jobject jStream = (jobject)(file ? file->file : NULL);
...
+(*env)->DeleteGlobalRef(env, file->file);
 free(file);
-(*env)->DeleteGlobalRef(env, jStream);
{code}
Let's preserve the interface that {{hdfsFile file}} can be NULL without causing 
SEGV. Just toss in a {{if (file == NULL) return -1;}} near the top.

{code}
+jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
+"read", "([B)I", jbRarray);
...
+if (jVal.i < 0) {
+// EOF
+destroyLocalReference(env, jbRarray);
+return 0;
+} else if (jVal.i == 0) {
+destroyLocalReference(env, jbRarray);
+errno = EINTR;
+return -1;
{code}
Is this correct?  FSDataInputStream#read returns -1 on EOF and 0 on EINTR?  
That's special.  I see docs for the -1 case, but I don't see where the 0 
could come from.
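For reference, FSDataInputStream extends java.io.InputStream and follows its read contract: read(byte[]) returns -1 only at end of stream, and returns 0 only when the destination buffer has length 0. A small demonstration with a plain InputStream:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Demonstrates the InputStream read contract: -1 signals EOF; 0 is only
// returned when the destination buffer is zero-length.
public class ReadContractDemo {
    public static int readOnce(byte[] data, int bufLen) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        return in.read(new byte[bufLen]);
    }
}
```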

{code}
 tSize hdfsPread(hdfsFS fs, hdfsFile f, tOffset position,
 void* buffer, tSize length)
 {
-// JAVA EQUIVALENT:
-//  byte [] bR = new byte[length];
-//  fis.read(pos, bR, 0, length);
{code}

I find these JAVA EQUIVALENT comments to be very helpful, could we keep them 
around?  so long as they're accurate, I mean.  If they're misleading then 
deleting is correct.
{code}
+if (jthr) {
+errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
+"hdfsTell: org.apache.hadoop.fs.%s::getPos",
+((f->type == INPUT) ? "FSDataInputStream" :
+ "FSDataOutputStream"));
{code}
Please use {{interface}} here rather than recapitulating its ternary.

more review to come ... two-thirds done now ...

 libhdfs: fix exception handling
 ---

 Key: HDFS-3579
 URL: https://issues.apache.org/jira/browse/HDFS-3579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3579.004.patch, HDFS-3579.005.patch, 
 HDFS-3579.006.patch


 libhdfs does not consistently handle exceptions.  Sometimes we don't free the 
 memory associated with them (memory leak).  Sometimes we invoke JNI functions 
 that are not supposed to be invoked when an exception is active.
 Running a libhdfs test program with -Xcheck:jni shows the latter problem 
 clearly:
 {code}
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 Exception in thread main java.io.IOException: ...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3723) All commands should support meaningful --help

2012-07-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426263#comment-13426263
 ] 

Hadoop QA commented on HDFS-3723:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12538630/HDFS-3723.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.  Please justify why no new tests are needed for this patch.  Also 
please list what manual steps were performed to verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.ha.TestHAAdmin
  org.apache.hadoop.hdfs.tools.TestDFSHAAdmin

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2933//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2933//console

This message is automatically generated.





[jira] [Updated] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-31 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3667:
-

Fix Version/s: 2.0.1-alpha
 Hadoop Flags: Reviewed

I have committed the new patch to trunk and branch-2.





[jira] [Updated] (HDFS-3737) retry support for webhdfs is breaking HttpFS support

2012-07-31 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3737:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

The problematic patch was reverted and a new patch was committed in HDFS-3667.


 retry support for webhdfs is breaking HttpFS support
 

 Key: HDFS-3737
 URL: https://issues.apache.org/jira/browse/HDFS-3737
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.2.0-alpha
Reporter: Alejandro Abdelnur
Assignee: Tsz Wo (Nicholas), SZE
Priority: Blocker
 Fix For: 2.2.0-alpha

 Attachments: h3737_20120730.patch


 HDFS-3667 is breaking HttpFS testcases:
 {code}
 Running org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem
 Tests run: 30, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 14.585 sec 
 <<< FAILURE!
 Results :
 Tests in error: 
   testOperation[1](org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem): 
 Unexpected HTTP response: code=200 != 307, op=OPEN, message=OK
   
 testOperationDoAs[1](org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem): 
 Unexpected HTTP response: code=200 != 307, op=OPEN, message=OK
   testOperation[2](org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem)
   testOperationDoAs[2](org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem)
   testOperation[3](org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem): 
 Unexpected HTTP response: code=400 != 200, op=APPEND, message=Data upload 
 requests must have content-type set to 'application/octet-stream'
   
 testOperationDoAs[3](org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem): 
 Unexpected HTTP response: code=400 != 200, op=APPEND, message=Data upload 
 requests must have content-type set to 'application/octet-stream'
   testOperation[13](org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem): 
 Unexpected HTTP response: code=200 != 307, op=GETFILECHECKSUM, message=OK
   
 testOperationDoAs[13](org.apache.hadoop.fs.http.client.TestWebhdfsFileSystem):
  Unexpected HTTP response: code=200 != 307, op=GETFILECHECKSUM, message=OK
 Tests run: 30, Failures: 0, Errors: 8, Skipped: 0
 {code}





[jira] [Resolved] (HDFS-3740) Multiple test cases in TestWebhdfsFileSystem are failing

2012-07-31 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE resolved HDFS-3740.
--

   Resolution: Fixed
Fix Version/s: (was: 2.2.0-alpha)
   (was: 3.0.0)

The problematic patch was reverted and a new patch was committed in HDFS-3667.

 Multiple test cases in TestWebhdfsFileSystem are failing
 

 Key: HDFS-3740
 URL: https://issues.apache.org/jira/browse/HDFS-3740
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.1.0-alpha
Reporter: Kihwal Lee
Assignee: Tsz Wo (Nicholas), SZE

 After HDFS-3667, 7-8 cases have been failing in 2.0 build.
 These are with Clover enabled.





[jira] [Updated] (HDFS-3737) retry support for webhdfs is breaking HttpFS support

2012-07-31 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3737:
-

Fix Version/s: (was: 2.2.0-alpha)





[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426278#comment-13426278
 ] 

Hudson commented on HDFS-3667:
--

Integrated in Hadoop-Common-trunk-Commit #2545 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2545/])
HDFS-3667.  Add retry support to WebHdfsFileSystem. (Revision 1367841)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1367841
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ByteRangeInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/DeleteOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/GetOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/HttpOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/PostOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/PutOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestDelegationTokenForProxyUser.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestOffsetUrlInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java






[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426277#comment-13426277
 ] 

Hudson commented on HDFS-3667:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2610 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2610/])
HDFS-3667.  Add retry support to WebHdfsFileSystem. (Revision 1367841)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1367841
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ByteRangeInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/DeleteOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/GetOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/HttpOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/PostOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/PutOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestDelegationTokenForProxyUser.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestOffsetUrlInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java


 Add retry support to WebHdfsFileSystem
 --

 Key: HDFS-3667
 URL: https://issues.apache.org/jira/browse/HDFS-3667
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 2.0.1-alpha

 Attachments: h3667_20120718.patch, h3667_20120721.patch, 
 h3667_20120722.patch, h3667_20120725.patch, h3667_20120730.patch, 
 h3667_20120730_b-1.patch


 DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and 
 retries on exceptions such as connection failures and safemode. 
 WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3738) TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config

2012-07-31 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3738:
-

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks a lot for the review, Eli. I've just committed this to trunk and 
branch-2.

 TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config
 --

 Key: HDFS-3738
 URL: https://issues.apache.org/jira/browse/HDFS-3738
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
 Fix For: 2.2.0-alpha

 Attachments: HDFS-3738.patch, HDFS-3738.patch


 TestDFSClientRetries#testFailuresArePerOperation involves testing retries by 
 making use of expected timeouts. However, this test sets the wrong config to 
 lower the timeout, and thus takes far longer than it should.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3738) TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config

2012-07-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426284#comment-13426284
 ] 

Hudson commented on HDFS-3738:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2611 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2611/])
HDFS-3738. TestDFSClientRetries#testFailuresArePerOperation sets incorrect 
timeout config. Contributed by Aaron T. Myers. (Revision 1367844)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1367844
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java


 TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config
 --

 Key: HDFS-3738
 URL: https://issues.apache.org/jira/browse/HDFS-3738
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
 Fix For: 2.2.0-alpha

 Attachments: HDFS-3738.patch, HDFS-3738.patch


 TestDFSClientRetries#testFailuresArePerOperation involves testing retries by 
 making use of expected timeouts. However, this test sets the wrong config to 
 lower the timeout, and thus takes far longer than it should.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3738) TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config

2012-07-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426286#comment-13426286
 ] 

Hudson commented on HDFS-3738:
--

Integrated in Hadoop-Common-trunk-Commit #2546 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2546/])
HDFS-3738. TestDFSClientRetries#testFailuresArePerOperation sets incorrect 
timeout config. Contributed by Aaron T. Myers. (Revision 1367844)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1367844
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java


 TestDFSClientRetries#testFailuresArePerOperation sets incorrect timeout config
 --

 Key: HDFS-3738
 URL: https://issues.apache.org/jira/browse/HDFS-3738
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
 Fix For: 2.2.0-alpha

 Attachments: HDFS-3738.patch, HDFS-3738.patch


 TestDFSClientRetries#testFailuresArePerOperation involves testing retries by 
 making use of expected timeouts. However, this test sets the wrong config to 
 lower the timeout, and thus takes far longer than it should.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread meng gong (JIRA)
meng gong created HDFS-3746:
---

 Summary: Invalid counter tag in HDFS balancer which lead to 
infinite loop
 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
 Fix For: 2.1.0-alpha, 3.0.0


Every time the balancer tries to move a block across racks, a new balancer is 
instantiated for each NameNodeConnector. The notChangedIterations counter is 
therefore reset to 0, so it never reaches 5 and the thread never exits after 
5 consecutive iterations with no blocks moved. This leads to an infinite loop.
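The reported counter bug can be illustrated with a small sketch (invented names, not the actual Balancer code): when the object that owns the no-progress counter is re-created on every pass, the counter is reset before it can reach the exit threshold.

```java
public class BalancerCounterSketch {
    static final int MAX_NOT_CHANGED = 5;  // exit after 5 idle iterations

    static class Balancer {
        int notChangedIterations = 0;  // reset whenever a new Balancer is created

        // Record one iteration; return true when the balancer should give up.
        boolean recordIteration(long bytesMoved) {
            if (bytesMoved > 0) {
                notChangedIterations = 0;
            } else {
                notChangedIterations++;
            }
            return notChangedIterations >= MAX_NOT_CHANGED;
        }
    }

    // Buggy pattern: a fresh Balancer per pass, so the counter never accumulates.
    static boolean runsForever(int passes) {
        for (int i = 0; i < passes; i++) {
            Balancer b = new Balancer();             // counter reset here every time
            if (b.recordIteration(0)) return false;  // never true
        }
        return true;  // the loop never terminated on its own
    }

    // Fixed pattern: one Balancer instance survives across iterations.
    static int iterationsUntilExit() {
        Balancer b = new Balancer();
        int iterations = 1;
        while (!b.recordIteration(0)) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println("buggy pattern never exits over 100 passes: " + runsForever(100));
        System.out.println("fixed pattern exits after " + iterationsUntilExit() + " idle iterations");
    }
}
```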

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3579) libhdfs: fix exception handling

2012-07-31 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426291#comment-13426291
 ] 

Colin Patrick McCabe commented on HDFS-3579:


bq. I find these JAVA EQUIVALENT comments to be very helpful, could we keep 
them around? so long as they're accurate, I mean. If they're misleading then 
deleting is correct.

ok.

bq. Please use interface here rather than recapitulating its ternary.

It was done to avoid printing out org/apache/hadoop/fs/FSDataInputStream etc., 
since, as you commented above, it's nicer to print something shorter.  I don't 
have a strong feeling about it either way, but I suspect it's easier just to 
redo the ternary here.

bq. Is this correct? FSDataInputStream#read returns -1 on EOF and 0 on EINTR? 
That's special. I see docs for the -1 case, but I don't see anywhere that the 0 
could come from?

The standard Java convention is that -1 means EOF, and 0 is just a short read.  
hdfsRead, on the other hand, follows the UNIX convention.  See HADOOP-1582 for 
more details.
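The convention difference can be shown in a tiny sketch (`toUnixReturn` is an invented helper for illustration, not an actual libhdfs function): Java's read returns -1 at EOF and 0 only for a zero-byte read, while the UNIX convention uses 0 for EOF.

```java
public class ReadConventions {
    // Map a Java-style read result to a UNIX-style one:
    // Java: -1 means EOF, 0 is just a zero-byte (short) read.
    // UNIX: 0 means EOF, positive values are byte counts.
    static int toUnixReturn(int javaRead) {
        if (javaRead == -1) {
            return 0;        // Java EOF -> UNIX EOF
        }
        return javaRead;     // byte counts pass through unchanged
    }

    public static void main(String[] args) {
        System.out.println(toUnixReturn(-1));  // EOF under the UNIX convention
        System.out.println(toUnixReturn(42));  // a normal (possibly short) read
    }
}
```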

bq. Let's preserve the interface that hdfsFile file can be NULL without causing 
SEGV. Just toss in a if (file == NULL) return -1; near the top.

The whole thing is kind of messy.  Passing a NULL pointer to hdfsClose is a 
user error, yet we check for it for some reason.

There's no way to actually *get* an hdfsFile which has file->type 
UNINITIALIZED.  Every hdfsOpen path either leads to returning null, or 
returning a file of type INPUT or OUTPUT.  There's no way to close a file that 
hasn't been opened either.  Similarly, there's no way to get a file where 
file->file is NULL.

The original code didn't check for file->file being NULL either (look at it 
carefully, you'll see what I mean).

tl;dr:  I didn't change the behavior here.  But someone should eventually.

 libhdfs: fix exception handling
 ---

 Key: HDFS-3579
 URL: https://issues.apache.org/jira/browse/HDFS-3579
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3579.004.patch, HDFS-3579.005.patch, 
 HDFS-3579.006.patch


 libhdfs does not consistently handle exceptions.  Sometimes we don't free the 
 memory associated with them (memory leak).  Sometimes we invoke JNI functions 
 that are not supposed to be invoked when an exception is active.
 Running a libhdfs test program with -Xcheck:jni shows the latter problem 
 clearly:
 {code}
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 WARNING in native method: JNI call made with exception pending
 Exception in thread main java.io.IOException: ...
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426292#comment-13426292
 ] 

Hudson commented on HDFS-3667:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2563 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2563/])
HDFS-3667.  Add retry support to WebHdfsFileSystem. (Revision 1367841)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1367841
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/ByteRangeInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/DeleteOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/GetOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/HttpOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/PostOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/PutOpParam.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestDelegationTokenForProxyUser.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestOffsetUrlInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java


 Add retry support to WebHdfsFileSystem
 --

 Key: HDFS-3667
 URL: https://issues.apache.org/jira/browse/HDFS-3667
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 2.0.1-alpha

 Attachments: h3667_20120718.patch, h3667_20120721.patch, 
 h3667_20120722.patch, h3667_20120725.patch, h3667_20120730.patch, 
 h3667_20120730_b-1.patch


 DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and 
 retries on exceptions such as connection failures and safemode. 
 WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas reassigned HDFS-3746:
-

Assignee: Jing Zhao  (was: Jingguo Yao)

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: Jing Zhao
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas reassigned HDFS-3746:
-

Assignee: Jingguo Yao

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: Jingguo Yao
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3747) In some TestCase used in Maven, the balancer test failed because of timeout

2012-07-31 Thread meng gong (JIRA)
meng gong created HDFS-3747:
---

 Summary: In some TestCase used in Maven, the balancer test failed 
because of timeout
 Key: HDFS-3747
 URL: https://issues.apache.org/jira/browse/HDFS-3747
 Project: Hadoop HDFS
  Issue Type: Test
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
 Fix For: 2.1.0-alpha, 3.0.0


When running a given balancer test case, the balancer thread tries to move 
some blocks across racks but cannot find any available blocks in the source 
rack. The thread then does not stop until the isTimeUp flag reaches 20 
minutes, but Maven judges the test failed because the thread has already run 
for 15 minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread meng gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

meng gong resolved HDFS-3746.
-

  Resolution: Fixed
Release Note: Changed the balancer to a single-instance model for every 
namenode in the multi-NameNode framework

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: Jing Zhao
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426301#comment-13426301
 ] 

Eli Collins commented on HDFS-3746:
---

Why was this closed as fixed? There's no patch.

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: Jing Zhao
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Reopened] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du reopened HDFS-3746:
--


 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: Junping Du
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du reassigned HDFS-3746:


Assignee: Junping Du  (was: Jing Zhao)

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: Junping Du
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426310#comment-13426310
 ] 

Junping Du commented on HDFS-3746:
--

Meng, that's a good finding. I think we should keep a HashMap from connector 
to balancer instance, so that there is a singleton per connection. Please 
provide and upload a patch. Before that, I will reopen this issue and will 
assign it to you later. 
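The map-of-singletons idea can be sketched roughly as follows (hypothetical names, not the actual patch): a concurrent map keyed by connector, with putIfAbsent guaranteeing a single balancer instance per connector so its state is never reset by re-instantiation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class BalancerRegistry {
    // Stand-in for the real Balancer; only identity matters for this sketch.
    static class Balancer {
        final String connectorId;
        Balancer(String connectorId) { this.connectorId = connectorId; }
    }

    private final ConcurrentMap<String, Balancer> instances =
            new ConcurrentHashMap<String, Balancer>();

    // Return the existing balancer for this connector, creating it at most once.
    Balancer forConnector(String connectorId) {
        Balancer b = instances.get(connectorId);
        if (b == null) {
            Balancer created = new Balancer(connectorId);
            // putIfAbsent makes the creation race-safe across threads.
            Balancer prev = instances.putIfAbsent(connectorId, created);
            b = (prev != null) ? prev : created;
        }
        return b;
    }

    public static void main(String[] args) {
        BalancerRegistry reg = new BalancerRegistry();
        Balancer first = reg.forConnector("nn1");
        // Same connector id yields the same instance; a different id gets its own.
        System.out.println(first == reg.forConnector("nn1"));
        System.out.println(first == reg.forConnector("nn2"));
    }
}
```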

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: Junping Du
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426312#comment-13426312
 ] 

Junping Du commented on HDFS-3746:
--

Eli, can you add Meng Gong to some alias so that I can assign this issue to 
him? Thanks!

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: Junping Du
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins reassigned HDFS-3746:
-

Assignee: meng gong  (was: Junping Du)

Junping, done.

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: meng gong
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop

2012-07-31 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426315#comment-13426315
 ] 

Junping Du commented on HDFS-3746:
--

Thanks! Eli.

 Invalid counter tag in HDFS balancer which lead to infinite loop
 

 Key: HDFS-3746
 URL: https://issues.apache.org/jira/browse/HDFS-3746
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: meng gong
Assignee: meng gong
  Labels: patch
 Fix For: 2.1.0-alpha, 3.0.0


 Every time the balancer tries to move a block across racks, a new balancer 
 is instantiated for each NameNodeConnector. The notChangedIterations counter 
 is therefore reset to 0, so it never reaches 5 and the thread never exits 
 after 5 consecutive iterations with no blocks moved. This leads to an 
 infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3744) Decommissioned nodes are included in cluster after switch which is not expected

2012-07-31 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426336#comment-13426336
 ] 

Brahma Reddy Battula commented on HDFS-3744:


{quote}
The issue is that some dsfadmin commands need to perform client-side failover 
and talk to the appropriate NN, while others should actually be run against 
both NNs.
{quote}

But even if we do client-side failover, I think the BNN will not come to know.
We need to execute the command on both NNs.

 Decommissioned nodes are included in cluster after switch which is not 
 expected
 ---

 Key: HDFS-3744
 URL: https://issues.apache.org/jira/browse/HDFS-3744
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.0.0-alpha, 2.1.0-alpha, 2.0.1-alpha
Reporter: Brahma Reddy Battula

 Scenario:
 =
 Start ANN and SNN with three DNs.
 Exclude DN1 from the cluster using the decommission feature 
 (./hdfs dfsadmin -fs hdfs://ANNIP:8020 -refreshNodes).
 After the decommission succeeds, do a switch so that the SNN becomes Active.
 Now the excluded node (DN1) is included in the cluster, and we are able to 
 write files to it since it is no longer excluded.
 Checked the SNN (Active before the switch) UI: decommissioned=1; the ANN UI 
 shows decommissioned=0.
 One more observation:
 
 All dfsadmin commands create a proxy only on nn1, irrespective of which NN is 
 Active or standby. I think we need to re-look at this as well.
 I do not understand why HA is not provided for dfsadmin commands.
 Please correct me if I am wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3744) Decommissioned nodes are included in cluster after switch which is not expected

2012-07-31 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426337#comment-13426337
 ] 

Uma Maheswara Rao G commented on HDFS-3744:
---

From the issue, what I understood is:

We have to execute this particular command on both the NameNodes. 
Otherwise, even if we make failover work here, the SNN will still not know 
about the excluded nodes, since it did not receive any refreshNodes command.

Do we need to handle this kind of command separately in DFSAdmin?
e.g. we can iterate over all the NNs available in the list and send the 
refreshNodes command to each.

Another option might be to perform refreshNodes on switch.
We have to think about the impacts of that.
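The "iterate over all the NNs" option above could be sketched as a simple loop over the configured NameNode addresses; the host names below are assumptions, and the command is echoed as a dry run:

```shell
# Sketch: send -refreshNodes to every NameNode in the HA pair, not only nn1,
# so both ANN and SNN pick up the updated exclude file.
# nn1/nn2 host names are hypothetical placeholders.
NAMENODES="nn1.example.com:8020 nn2.example.com:8020"
for nn in $NAMENODES; do
  # 'echo' makes this a dry run; drop it to issue the real command on a cluster.
  echo hdfs dfsadmin -fs "hdfs://$nn" -refreshNodes
done
```

On a real deployment the address list would come from the dfs.ha.namenodes.* configuration rather than being hard-coded.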

 





[jira] [Commented] (HDFS-3744) Decommissioned nodes are included in cluster after switch which is not expected

2012-07-31 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13426339#comment-13426339
 ] 

Uma Maheswara Rao G commented on HDFS-3744:
---

{quote}other option might be like we have to perform refresh nodes on 
switch.{quote}
One problem here: if we have just configured the excludes but not yet executed 
the refreshNodes command, and a switch happens, this would perform 
refreshNodes anyway. So this may not be the preferable way to go.

We can think about the other option in the above comment.
