[jira] [Commented] (HDFS-3721) hsync support broke wire compatibility

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422038#comment-13422038
 ] 

Hadoop QA commented on HDFS-3721:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537798/hdfs-3721.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestFileConcurrentReader

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2902//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2902//console

This message is automatically generated.

> hsync support broke wire compatibility
> --
>
> Key: HDFS-3721
> URL: https://issues.apache.org/jira/browse/HDFS-3721
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 2.1.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-3721.txt
>
>
> HDFS-744 added support for hsync to the data transfer wire protocol. However, 
> it actually broke wire compatibility: if the client has hsync support but the 
> server does not, the client cannot read or write data on the old cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422036#comment-13422036
 ] 

Hadoop QA commented on HDFS-3696:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537802/h3696_20120724.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestReplication
  org.apache.hadoop.hdfs.TestDatanodeBlockScanner
  org.apache.hadoop.hdfs.TestPersistBlocks

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2901//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2901//console

This message is automatically generated.

> FsShell put using WebHdfsFileSystem goes OOM when file size is big
> --
>
> Key: HDFS-3696
> URL: https://issues.apache.org/jira/browse/HDFS-3696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Kihwal Lee
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: h3696_20120724.patch
>
>
> When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes 
> OOM if the file size is large. When I tested, 20MB files were fine, but 200MB 
> didn't work.  
> I also tried reading a large file by issuing "-cat" and piping to a slow sink 
> in order to force buffering. The read path didn't have this problem. The 
> memory consumption stayed the same regardless of progress.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3722) TaskTracker's heartbeat is out of control

2012-07-24 Thread Liyin Liang (JIRA)
Liyin Liang created HDFS-3722:
-

 Summary: TaskTracker's heartbeat is out of control
 Key: HDFS-3722
 URL: https://issues.apache.org/jira/browse/HDFS-3722
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 1.0.3, 1.0.2, 1.0.1, 1.0.0
Reporter: Liyin Liang




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422009#comment-13422009
 ] 

Suresh Srinivas commented on HDFS-3718:
---

+1 for the patch

> Datanode won't shutdown because of runaway DataBlockScanner thread
> --
>
> Key: HDFS-3718
> URL: https://issues.apache.org/jira/browse/HDFS-3718
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Priority: Critical
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3718.patch.txt
>
>
> Datanode sometimes does not shutdown because the block pool scanner thread 
> keeps running. It prints out "Starting a new period" every five seconds, even 
> after {{shutdown()}} is called.  Somehow the interrupt is missed.
> {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, 
> but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked 
> before it is being set to false.
> Is there any reason why {{datanode.shouldRun}} is set to false later? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422007#comment-13422007
 ] 

Suresh Srinivas commented on HDFS-2815:
---

Uma, the second option makes sense. Let's back port HDFS-173 and then this 
patch. Thanks for doing it.

> Namenode is not coming out of safemode when we perform ( NN crash + restart ) 
> .  Also FSCK report shows blocks missed.
> --
>
> Key: HDFS-2815
> URL: https://issues.apache.org/jira/browse/HDFS-2815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.0.0-alpha, 3.0.0
>
> Attachments: HDFS-2815-22-branch.patch, HDFS-2815-Branch-1.patch, 
> HDFS-2815.patch, HDFS-2815.patch
>
>
> While testing our internal HA setup with continuous failovers at roughly
> 5-minute intervals, I found some *blocks missing*, and the namenode went into
> safemode after the next switch.
>
> After analysis, I found that these files had already been deleted by clients,
> but I don't see any delete commands in the namenode log files. The namenode
> nevertheless added those blocks to invalidateSets and the DNs deleted them.
> When the namenode was restarted, it went into safemode, expecting more blocks
> before it could leave safemode.
> The likely reason is that the file is deleted in memory and its blocks are
> added to invalidates before the edit is synced to the editlog file. By that
> time the NN has already asked the DNs to delete those blocks, and the namenode
> then shuts down before persisting to the editlog (the log is behind).
> Because of this, we may not get the INFO logs about the delete, and when we
> restart the namenode (in my scenario it is again a switch), the namenode also
> expects the deleted blocks, since the delete request was never persisted to
> the editlog.
> I reproduced this scenario with debug points. *I feel we should not add the
> blocks to invalidates before persisting to the editlog*.
> Note: for the switch, we used kill -9 (force kill).
> I am currently on version 0.20.2. The same was verified on 0.23 in a normal
> crash + restart scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-07-24 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422003#comment-13422003
 ] 

Uma Maheswara Rao G commented on HDFS-2815:
---

Yeah, Suresh.
It would be good to include HDFS-173 as well, since I have taken most of the 
parts from HDFS-173 and added the current fix on top.

But per the 
[comment|https://issues.apache.org/jira/browse/HDFS-2815?focusedCommentId=13207463&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13207463],
we had decided not to include HDFS-173 completely in this patch.

After the fixes from that comment, though, there won't be much difference from 
HDFS-173. So I will now consider HDFS-173 together with this patch and will 
include its tests as well.

The other option is that I provide a backport patch in HDFS-173 itself; after 
that issue is committed, we can add a straightforward patch for this JIRA. That 
may be clearer (instead of mixing the other JIRA's changes into this one). How 
does this sound to you?


Thanks,
Uma


> Namenode is not coming out of safemode when we perform ( NN crash + restart ) 
> .  Also FSCK report shows blocks missed.
> --
>
> Key: HDFS-2815
> URL: https://issues.apache.org/jira/browse/HDFS-2815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.0.0-alpha, 3.0.0
>
> Attachments: HDFS-2815-22-branch.patch, HDFS-2815-Branch-1.patch, 
> HDFS-2815.patch, HDFS-2815.patch
>
>
> While testing our internal HA setup with continuous failovers at roughly
> 5-minute intervals, I found some *blocks missing*, and the namenode went into
> safemode after the next switch.
>
> After analysis, I found that these files had already been deleted by clients,
> but I don't see any delete commands in the namenode log files. The namenode
> nevertheless added those blocks to invalidateSets and the DNs deleted them.
> When the namenode was restarted, it went into safemode, expecting more blocks
> before it could leave safemode.
> The likely reason is that the file is deleted in memory and its blocks are
> added to invalidates before the edit is synced to the editlog file. By that
> time the NN has already asked the DNs to delete those blocks, and the namenode
> then shuts down before persisting to the editlog (the log is behind).
> Because of this, we may not get the INFO logs about the delete, and when we
> restart the namenode (in my scenario it is again a switch), the namenode also
> expects the deleted blocks, since the delete request was never persisted to
> the editlog.
> I reproduced this scenario with debug points. *I feel we should not add the
> blocks to invalidates before persisting to the editlog*.
> Note: for the switch, we used kill -9 (force kill).
> I am currently on version 0.20.2. The same was verified on 0.23 in a normal
> crash + restart scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421991#comment-13421991
 ] 

Hadoop QA commented on HDFS-3718:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537784/hdfs-3718.patch.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDatanodeBlockScanner
  org.apache.hadoop.hdfs.TestPersistBlocks

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2899//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2899//console

This message is automatically generated.

> Datanode won't shutdown because of runaway DataBlockScanner thread
> --
>
> Key: HDFS-3718
> URL: https://issues.apache.org/jira/browse/HDFS-3718
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Priority: Critical
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3718.patch.txt
>
>
> Datanode sometimes does not shutdown because the block pool scanner thread 
> keeps running. It prints out "Starting a new period" every five seconds, even 
> after {{shutdown()}} is called.  Somehow the interrupt is missed.
> {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, 
> but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked 
> before it is being set to false.
> Is there any reason why {{datanode.shouldRun}} is set to false later? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3553) Hftp proxy tokens are broken

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421986#comment-13421986
 ] 

Hadoop QA commented on HDFS-3553:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537781/HDFS-3553.trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.common.TestJspHelper
  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2900//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2900//console

This message is automatically generated.

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, 
> HDFS-3553.branch-1.0.patch, HDFS-3553.branch-23.patch, HDFS-3553.trunk.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-24 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3696:
-

Status: Patch Available  (was: Open)

> FsShell put using WebHdfsFileSystem goes OOM when file size is big
> --
>
> Key: HDFS-3696
> URL: https://issues.apache.org/jira/browse/HDFS-3696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Kihwal Lee
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: h3696_20120724.patch
>
>
> When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes 
> OOM if the file size is large. When I tested, 20MB files were fine, but 200MB 
> didn't work.  
> I also tried reading a large file by issuing "-cat" and piping to a slow sink 
> in order to force buffering. The read path didn't have this problem. The 
> memory consumption stayed the same regardless of progress.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-24 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3696:
-

Attachment: h3696_20120724.patch

h3696_20120724.patch: add setChunkedStreamingMode(32kB).

I tried several chunk sizes for writing 300MB files.  32kB was the best in my 
test.

|| Chunk size || 1st || 2nd ||
| 4kB   |  3.95MB/s |  3.95MB/s |
| 16kB  |  7.81MB/s |  7.70MB/s |
| 24kB  | 12.58MB/s | 12.29MB/s |
| 32kB  | 14.15MB/s | 14.28MB/s |
| 48kB  | 14.25MB/s | 13.29MB/s |
| 64kB  | 13.65MB/s | 13.57MB/s |
| 128kB | 13.94MB/s | 13.15MB/s |
| 1MB   | 13.11MB/s | 13.45MB/s |
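
As a rough illustration of the fix above (my sketch, not the attached 
h3696_20120724.patch; the URL and input-stream variables are assumptions), 
enabling chunked streaming on the HttpURLConnection lets the client send the 
request body in fixed-size chunks instead of buffering the whole file in 
memory:

{code}
// Sketch only: stream a local file to a WebHDFS PUT URL in 32kB chunks so
// the client never holds the entire file contents in memory.
// (java.net and java.io imports omitted)
public static void chunkedPut(URL putUrl, InputStream localFileIn)
    throws IOException {
  HttpURLConnection conn = (HttpURLConnection) putUrl.openConnection();
  conn.setRequestMethod("PUT");
  conn.setDoOutput(true);
  conn.setChunkedStreamingMode(32 * 1024);  // 32kB, best value in the table above
  OutputStream out = conn.getOutputStream();
  try {
    byte[] buf = new byte[4096];
    int n;
    while ((n = localFileIn.read(buf)) != -1) {
      out.write(buf, 0, n);                 // sent as HTTP chunks, not buffered
    }
  } finally {
    out.close();
  }
  conn.getResponseCode();                   // complete the request
}
{code}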


> FsShell put using WebHdfsFileSystem goes OOM when file size is big
> --
>
> Key: HDFS-3696
> URL: https://issues.apache.org/jira/browse/HDFS-3696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Kihwal Lee
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: h3696_20120724.patch
>
>
> When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes 
> OOM if the file size is large. When I tested, 20MB files were fine, but 200MB 
> didn't work.  
> I also tried reading a large file by issuing "-cat" and piping to a slow sink 
> in order to force buffering. The read path didn't have this problem. The 
> memory consumption stayed the same regardless of progress.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3721) hsync support broke wire compatibility

2012-07-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421972#comment-13421972
 ] 

Todd Lipcon commented on HDFS-3721:
---

(I should also note that, even with this patch, a 2.0 client can still talk to 
a trunk cluster, since the server now properly handles variable-length headers)

> hsync support broke wire compatibility
> --
>
> Key: HDFS-3721
> URL: https://issues.apache.org/jira/browse/HDFS-3721
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 2.1.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-3721.txt
>
>
> HDFS-744 added support for hsync to the data transfer wire protocol. However, 
> it actually broke wire compatibility: if the client has hsync support but the 
> server does not, the client cannot read or write data on the old cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3721) hsync support broke wire compatibility

2012-07-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3721:
--

Attachment: hdfs-3721.txt

This patch fixes the issue as follows:
- Refactors out the packet-reading code from BlockReceiver and 
RemoteBlockReader2 into a new {{PacketReceiver}} class. This really simplified 
BlockReceiver in particular, and has the nice side effect of getting us 
significantly closer to HDFS-3529.
- All places where we used to assume a fixed-length packet header now support 
variable length. In some cases this is achieved by allocating a 
larger-than-necessary buffer and then, once we know the size for the header, 
putting it at the right spot to make the header contiguous with the data. In 
other cases, this is achieved by simply separating the buffer containing the 
header from the buffer containing the data.

- Regarding the issue above with 2.1 clients writing to 2.0 servers, I fixed 
the issue by having the PacketHeader class not set any value for the 
{{syncBlock}} flag when it is false. That means that, so long as the new hsync 
functionality isn't used, new clients can still talk to 2.0 servers. If hsync 
is used, an error will occur. It's slightly unfortunate, but given that the 2.0 
branch is pretty new and this is a new feature, I think this is acceptable.

I manually tested a trunk client both reading and writing from a 2.0.0-alpha 
cluster with this patch applied.
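
To illustrate the {{syncBlock}} trick (a minimal sketch assuming the 
protobuf-generated {{PacketHeaderProto}} builder and local variables for the 
header fields; not the patch itself):

{code}
// Leave the optional syncBlock field unset when it is false, so the encoded
// header keeps the old fixed length and 2.0 servers still accept it. Only a
// true value grows the header (and then old servers fail).
PacketHeaderProto.Builder builder = PacketHeaderProto.newBuilder()
    .setOffsetInBlock(offsetInBlock)
    .setSeqno(seqno)
    .setLastPacketInBlock(lastPacketInBlock)
    .setDataLen(dataLen);
if (syncBlock) {
  builder.setSyncBlock(true);
}
PacketHeaderProto proto = builder.build();
{code}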

> hsync support broke wire compatibility
> --
>
> Key: HDFS-3721
> URL: https://issues.apache.org/jira/browse/HDFS-3721
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 2.1.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-3721.txt
>
>
> HDFS-744 added support for hsync to the data transfer wire protocol. However, 
> it actually broke wire compatibility: if the client has hsync support but the 
> server does not, the client cannot read or write data on the old cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3721) hsync support broke wire compatibility

2012-07-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3721:
--

Status: Patch Available  (was: Open)

> hsync support broke wire compatibility
> --
>
> Key: HDFS-3721
> URL: https://issues.apache.org/jira/browse/HDFS-3721
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 2.1.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-3721.txt
>
>
> HDFS-744 added support for hsync to the data transfer wire protocol. However, 
> it actually broke wire compatibility: if the client has hsync support but the 
> server does not, the client cannot read or write data on the old cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-24 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE reassigned HDFS-3696:


Assignee: Tsz Wo (Nicholas), SZE  (was: Jing Zhao)

> FsShell put using WebHdfsFileSystem goes OOM when file size is big
> --
>
> Key: HDFS-3696
> URL: https://issues.apache.org/jira/browse/HDFS-3696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Kihwal Lee
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
>
> When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes 
> OOM if the file size is large. When I tested, 20MB files were fine, but 200MB 
> didn't work.  
> I also tried reading a large file by issuing "-cat" and piping to a slow sink 
> in order to force buffering. The read path didn't have this problem. The 
> memory consumption stayed the same regardless of progress.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1362) Provide volume management functionality for DataNode

2012-07-24 Thread Wang Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421959#comment-13421959
 ] 

Wang Xu commented on HDFS-1362:
---

Hi jiwan,

I am still keeping an eye on this issue, but since my current job is not on 
HDFS, I cannot keep track of the updates in time. Sorry for that.

As for "if the failed disk is still readable and the node has enough space, it 
can migrate data on the disk to other disks in the same node": that was done in 
the original version but removed in the posted version. Having discussed it 
with some committers, we all thought it better to keep the patch smaller and 
cleaner.

Thanks for your attention.


> Provide volume management functionality for DataNode
> 
>
> Key: HDFS-1362
> URL: https://issues.apache.org/jira/browse/HDFS-1362
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.23.0
>Reporter: Wang Xu
>Assignee: Wang Xu
> Fix For: 0.24.0
>
> Attachments: DataNode Volume Refreshment in HDFS-1362.pdf, 
> HDFS-1362.4_w7001.txt, HDFS-1362.5.patch, HDFS-1362.6.patch, 
> HDFS-1362.7.patch, HDFS-1362.8.patch, HDFS-1362.txt, 
> Provide_volume_management_for_DN_v1.pdf
>
>
> The current management unit in Hadoop is a node, i.e. if a node fails, it 
> will be kicked out and all the data on the node will be replicated.
> As almost all SATA controllers support hotplug, we add a new command line 
> interface to the datanode so it can list, add or remove a volume online, 
> which means we can change a disk without decommissioning the node. Moreover, 
> if the failed disk is still readable and the node has enough space, it can 
> migrate data on the disk to other disks in the same node.
> A more detailed design document will be attached.
> The original version in our lab is implemented against the 0.20 datanode 
> directly; would it be better to implement it in contrib? Or any other 
> suggestion?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1362) Provide volume management functionality for DataNode

2012-07-24 Thread jiwan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421957#comment-13421957
 ] 

jiwan commented on HDFS-1362:
-

Hi all, does anyone know the progress of this issue?

> Provide volume management functionality for DataNode
> 
>
> Key: HDFS-1362
> URL: https://issues.apache.org/jira/browse/HDFS-1362
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: data-node
>Affects Versions: 0.23.0
>Reporter: Wang Xu
>Assignee: Wang Xu
> Fix For: 0.24.0
>
> Attachments: DataNode Volume Refreshment in HDFS-1362.pdf, 
> HDFS-1362.4_w7001.txt, HDFS-1362.5.patch, HDFS-1362.6.patch, 
> HDFS-1362.7.patch, HDFS-1362.8.patch, HDFS-1362.txt, 
> Provide_volume_management_for_DN_v1.pdf
>
>
> The current management unit in Hadoop is a node, i.e. if a node fails, it 
> will be kicked out and all the data on the node will be replicated.
> As almost all SATA controllers support hotplug, we add a new command line 
> interface to the datanode so it can list, add or remove a volume online, 
> which means we can change a disk without decommissioning the node. Moreover, 
> if the failed disk is still readable and the node has enough space, it can 
> migrate data on the disk to other disks in the same node.
> A more detailed design document will be attached.
> The original version in our lab is implemented against the 0.20 datanode 
> directly; would it be better to implement it in contrib? Or any other 
> suggestion?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-24 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE reassigned HDFS-3696:


Assignee: Jing Zhao

> FsShell put using WebHdfsFileSystem goes OOM when file size is big
> --
>
> Key: HDFS-3696
> URL: https://issues.apache.org/jira/browse/HDFS-3696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Kihwal Lee
>Assignee: Jing Zhao
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
>
> When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes 
> OOM if the file size is large. When I tested, 20MB files were fine, but 200MB 
> didn't work.  
> I also tried reading a large file by issuing "-cat" and piping to a slow sink 
> in order to force buffering. The read path didn't have this problem. The 
> memory consumption stayed the same regardless of progress.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2656) Implement a pure c client based on webhdfs

2012-07-24 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-2656:


Attachment: HDFS-2656.unfinished.patch

I'm now working on it based on HDFS-2631.patch by Jaimin. The patch is 
unfinished and in progress. 

> Implement a pure c client based on webhdfs
> --
>
> Key: HDFS-2656
> URL: https://issues.apache.org/jira/browse/HDFS-2656
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Zhanwei.Wang
> Attachments: HDFS-2656.unfinished.patch
>
>
> Currently, the implementation of libhdfs is based on JNI. The overhead of the 
> JVM seems a little high, and libhdfs cannot be used in environments without an 
> HDFS installation.
> It seems a good idea to implement a pure C client by wrapping webhdfs. It 
> could also be used to access different versions of HDFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2533) Remove needless synchronization on FSDataSet.getBlockFile

2012-07-24 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421925#comment-13421925
 ] 

Brandon Li commented on HDFS-2533:
--

I did teragen/terasort/teravalid with the branch-1 patch and noticed some 
performance gain in terasort and teravalid, but no noticeable gain with 
teragen.

> Remove needless synchronization on FSDataSet.getBlockFile
> -
>
> Key: HDFS-2533
> URL: https://issues.apache.org/jira/browse/HDFS-2533
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: 0.24.0, 0.23.1
>
> Attachments: HDFS-2533.branch-1.patch, hdfs-2533.txt, hdfs-2533.txt
>
>
> HDFS-1148 discusses lock contention issues in FSDataset. It provides a more 
> comprehensive fix, converting it all to RWLocks, etc. This JIRA is for one 
> very specific fix which gives a decent performance improvement for 
> TestParallelRead: getBlockFile() currently holds the lock which is completely 
> unnecessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3721) hsync support broke wire compatibility

2012-07-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421923#comment-13421923
 ] 

Todd Lipcon commented on HDFS-3721:
---

I have a fix for this that rewrites BlockReceiver and RemoteBlockReader2 to 
handle variable-length packet headers. Unfortunately, since it modifies the 
server side, it still doesn't allow a new client to write to an old server 
(just read).

I'm trying to think of a creative way to fix this issue -- perhaps adding a 
flag in the response to "writeBlock" which indicates 
{{supportsVariableLengthPacketHeader}}. If the flag isn't set, then hsync 
wouldn't be supported on this stream, and it would fall back to the old fixed 
format header.

> hsync support broke wire compatibility
> --
>
> Key: HDFS-3721
> URL: https://issues.apache.org/jira/browse/HDFS-3721
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 2.1.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
>
> HDFS-744 added support for hsync to the data transfer wire protocol. However, 
> it actually broke wire compatibility: if the client has hsync support but the 
> server does not, the client cannot read or write data on the old cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2751) Datanode drops OS cache behind reads even for short reads

2012-07-24 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421921#comment-13421921
 ] 

Brandon Li commented on HDFS-2751:
--

I did teragen/terasort/teravalid with the branch-1 patch.

> Datanode drops OS cache behind reads even for short reads
> -
>
> Key: HDFS-2751
> URL: https://issues.apache.org/jira/browse/HDFS-2751
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.24.0, 0.23.1
>
> Attachments: HDFS-2751.branch-1.patch, hdfs-2751.txt, hdfs-2751.txt
>
>
> HDFS-2465 has some code which attempts to disable the "drop cache behind 
> reads" functionality when the reads are <256KB (eg HBase random access). But 
> this check was missing in the {{close()}} function, so it always drops cache 
> behind reads regardless of the size of the read. This hurts HBase random read 
> performance when this patch is enabled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2533) Remove needless synchronization on FSDataSet.getBlockFile

2012-07-24 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421916#comment-13421916
 ] 

Brandon Li commented on HDFS-2533:
--

Just noticed that this JIRA is interesting. Basically it moves the 
File.exists() call out of the lock, which could yield a non-trivial performance 
improvement. Depending on the workload, the cost of the File.exists() call 
varies, and so does the gain from this patch.

Here is a discussion of the cost of File.exists(): 
http://stackoverflow.com/questions/6321180/how-expensive-is-file-exists-in-java

I did a test on my laptop (Darwin Kernel Version 11.3.0). For a non-existent 
file, the first check took 30ms.

Uploaded a patch for branch-1.
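
To illustrate the idea (my sketch, assuming FSDataset's validateBlockFile() 
helper; not the attached branch-1 patch):

{code}
// Sketch: getBlockFile() without the dataset-wide synchronization, so the
// File.exists() check inside validateBlockFile() runs without holding the lock.
// Before: public synchronized File getBlockFile(Block b) throws IOException
public File getBlockFile(Block b) throws IOException {
  File f = validateBlockFile(b);   // may call File.exists(); no lock held here
  if (f == null) {
    throw new IOException("Block " + b + " is not valid.");
  }
  return f;
}
{code}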

> Remove needless synchronization on FSDataSet.getBlockFile
> -
>
> Key: HDFS-2533
> URL: https://issues.apache.org/jira/browse/HDFS-2533
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: 0.24.0, 0.23.1
>
> Attachments: HDFS-2533.branch-1.patch, hdfs-2533.txt, hdfs-2533.txt
>
>
> HDFS-1148 discusses lock contention issues in FSDataset. It provides a more 
> comprehensive fix, converting it all to RWLocks, etc. This JIRA is for one 
> very specific fix which gives a decent performance improvement for 
> TestParallelRead: getBlockFile() currently holds the lock which is completely 
> unnecessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2533) Remove needless synchronization on FSDataSet.getBlockFile

2012-07-24 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-2533:
-

Attachment: HDFS-2533.branch-1.patch

> Remove needless synchronization on FSDataSet.getBlockFile
> -
>
> Key: HDFS-2533
> URL: https://issues.apache.org/jira/browse/HDFS-2533
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: 0.24.0, 0.23.1
>
> Attachments: HDFS-2533.branch-1.patch, hdfs-2533.txt, hdfs-2533.txt
>
>
> HDFS-1148 discusses lock contention issues in FSDataset. It provides a more 
> comprehensive fix, converting it all to RWLocks, etc. This JIRA is for one 
> very specific fix which gives a decent performance improvement for 
> TestParallelRead: getBlockFile() currently holds the lock which is completely 
> unnecessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread

2012-07-24 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-3718:
-

Status: Patch Available  (was: Open)

> Datanode won't shutdown because of runaway DataBlockScanner thread
> --
>
> Key: HDFS-3718
> URL: https://issues.apache.org/jira/browse/HDFS-3718
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Priority: Critical
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3718.patch.txt
>
>
> Datanode sometimes does not shutdown because the block pool scanner thread 
> keeps running. It prints out "Starting a new period" every five seconds, even 
> after {{shutdown()}} is called.  Somehow the interrupt is missed.
> {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, 
> but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked 
> before it is being set to false.
> Is there any reason why {{datanode.shouldRun}} is set to false later? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread

2012-07-24 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-3718:
-

Attachment: hdfs-3718.patch.txt

Moved up the line where {{datanode.shouldRun}} is set to false.  I ran all HDFS 
tests and there were no new test failures.
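
In other words, a minimal sketch of the ordering change (not the exact patch 
text):

{code}
// DataNode#shutdown(), sketched: flip shouldRun before stopping the scanner,
// so DataBlockScanner's loop condition sees the flag even if the interrupt
// is missed.
public void shutdown() {
  shouldRun = false;             // moved up: previously set after the scanner shutdown
  if (blockScanner != null) {
    blockScanner.shutdown();     // ask the scanner thread to stop
  }
  // ... remainder of the normal DataNode shutdown sequence ...
}
{code}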

> Datanode won't shutdown because of runaway DataBlockScanner thread
> --
>
> Key: HDFS-3718
> URL: https://issues.apache.org/jira/browse/HDFS-3718
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Priority: Critical
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3718.patch.txt
>
>
> Datanode sometimes does not shutdown because the block pool scanner thread 
> keeps running. It prints out "Starting a new period" every five seconds, even 
> after {{shutdown()}} is called.  Somehow the interrupt is missed.
> {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, 
> but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked 
> before it is being set to false.
> Is there any reason why {{datanode.shouldRun}} is set to false later? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3553) Hftp proxy tokens are broken

2012-07-24 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3553:
--

Attachment: HDFS-3553.trunk.patch

Some tests will fail until the dependency HADOOP-8613 is integrated.

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, 
> HDFS-3553.branch-1.0.patch, HDFS-3553.branch-23.patch, HDFS-3553.trunk.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3553) Hftp proxy tokens are broken

2012-07-24 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421896#comment-13421896
 ] 

Daryn Sharp commented on HDFS-3553:
---

The trunk patch will not be as complete in DelegationTokenFetcher because kssl 
is removed from trunk.  The means for making authenticated url connections for 
spnego in fetchdt intersects the changes being made on HDFS-3509 for webhdfs 
proxy user support.  I'm going to leave it to that jira to handle making the 
spnego url authentication work for both webhdfs and fetchdt since the solution 
should be common/shared.

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, 
> HDFS-3553.branch-1.0.patch, HDFS-3553.branch-23.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3509) WebHdfsFilesystem does not work within a proxyuser doAs call in secure mode

2012-07-24 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421890#comment-13421890
 ] 

Daryn Sharp commented on HDFS-3509:
---

I think we need to see if the change to use real user needs to be pushed down 
lower, perhaps into {{SecurityUtil.openSecureHttpConnection}} or deeper.  
Otherwise it looks like things such as fetchdt aren't going to work and will 
need a copy-n-paste (ick) of the logic.  I'm noticing this because the trunk 
patch for HDFS-3553 is intersecting with this patch.

> WebHdfsFilesystem does not work within a proxyuser doAs call in secure mode
> ---
>
> Key: HDFS-3509
> URL: https://issues.apache.org/jira/browse/HDFS-3509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
>Priority: Critical
> Attachments: HDFS-3509-branch1.patch, HDFS-3509.patch
>
>
> It does not find kerberos credentials in the context (the UGI is logged in 
> from a keytab) and it fails with the following trace:
> {code}
> java.lang.IllegalStateException: unknown char '<'(60) in 
> org.mortbay.util.ajax.JSON$ReaderSource@23245e75
>   at org.mortbay.util.ajax.JSON.handleUnknown(JSON.java:788)
>   at org.mortbay.util.ajax.JSON.parse(JSON.java:777)
>   at org.mortbay.util.ajax.JSON.parse(JSON.java:603)
>   at org.mortbay.util.ajax.JSON.parse(JSON.java:183)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.jsonParse(WebHdfsFileSystem.java:259)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:268)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:427)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:722)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3650) Use MutableQuantiles to provide latency histograms for various operations

2012-07-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421886#comment-13421886
 ] 

Aaron T. Myers commented on HDFS-3650:
--

Patch looks great, Andrew. Just a few little comments:

# Instead of doing Configuration#get(...) and then splitting on commas 
yourself, you can use Configuration#getTrimmedStringCollection, which will do 
the comma handling for you (see the sketch after this list). For that matter, 
it might be nice to add a getIntegerCollection method to the Configuration 
class, to also handle the integer parsing.
# I find the variable name "splitted" rather unfortunate. How about 
"splitValues" ?
# There are a few spurious whitespace changes in TestDataNodeMetrics.
# You should add an entry in hdfs-default.xml for the new 
dfs.metrics.percentiles.intervals.key, even if it has an empty value, so that 
you can add a description of what it does, and the format of what it should be 
set to.
# I find the loop try/catch of AssertionError in 
TestDataNodeMetrics#testRoundTripAckPercentilesMetric kind of unfortunate. How 
about instead you get the list of DNs involved in the write pipeline via 
DFSOutputStream#getPipeline when writing the file, and then always assert the 
quantile gauges on the actual appropriate DN?
# If assertQuantileGauges are identical between TestDataNodeMetrics and 
TestNameNodeMetrics, how about refactoring those methods? Perhaps as a static 
method in DFSTestUtil?
# Could also stand to refactor the two new tests in TestNameNodeMetrics, since 
they appear identical, save for two values.
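
A rough sketch for item 1 (the property name and integer parsing below are my 
assumptions, not the final patch):

{code}
// Let Configuration do the comma splitting and trimming, then parse each
// entry as an integer interval in seconds. Property name is illustrative.
static int[] getPercentileIntervals(Configuration conf) {
  Collection<String> values =
      conf.getTrimmedStringCollection("dfs.metrics.percentiles.intervals");
  int[] intervals = new int[values.size()];
  int i = 0;
  for (String v : values) {
    intervals[i++] = Integer.parseInt(v);
  }
  return intervals;
}
{code}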

> Use MutableQuantiles to provide latency histograms for various operations
> -
>
> Key: HDFS-3650
> URL: https://issues.apache.org/jira/browse/HDFS-3650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3650-1.patch
>
>
> MutableQuantiles provide accurate estimation of various percentiles for a 
> stream of data. Many existing metrics reported by a MutableRate would also 
> benefit from having these percentiles; lets add MutableQuantiles where we 
> think it'd be useful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-24 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421875#comment-13421875
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3667:
--

Daryn and Robert,

I see your concerns now.  I am okay with separating the patch.

> Add retry support to WebHdfsFileSystem
> --
>
> Key: HDFS-3667
> URL: https://issues.apache.org/jira/browse/HDFS-3667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3667_20120718.patch, h3667_20120721.patch, 
> h3667_20120722.patch
>
>
> DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and it 
> retries on exceptions such as connection failure, safemode.  
> WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3719) Fix TestFileConcurrentReader#testUnfinishedBlockCrcErrorTransferToAppend and #testUnfinishedBlockCRCErrorNormalTransferAppend

2012-07-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-3719:
--

Affects Version/s: 2.0.0-alpha

Set affects version to 2.0.0-alpha, thanks Suresh.

> Fix TestFileConcurrentReader#testUnfinishedBlockCrcErrorTransferToAppend and 
> #testUnfinishedBlockCRCErrorNormalTransferAppend
> -
>
> Key: HDFS-3719
> URL: https://issues.apache.org/jira/browse/HDFS-3719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>
> Both of these tests are disabled. We should figure out what append 
> functionality we need to make the tests work again, and reenable them.
> {code}
>   // fails due to issue w/append, disable 
>   @Ignore
>   @Test
>   public void _testUnfinishedBlockCRCErrorTransferToAppend()
> throws IOException {
> runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
>   }
>   // fails due to issue w/append, disable 
>   @Ignore
>   @Test
>   public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
> throws IOException {
> runTestUnfinishedBlockCRCError(false, SyncType.APPEND, 
> DEFAULT_WRITE_SIZE);
>   }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3652) 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name

2012-07-24 Thread Matt Foley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated HDFS-3652:
-

 Target Version/s: 1.0.4, 1.1.0  (was: 1.0.4, 1.1.0, 1.2.0)
Affects Version/s: (was: 1.2.0)
Fix Version/s: (was: 1.2.0)

Since 1.2.0 is unreleased, it is sufficient to state that it is fixed in 1.1.0.

> 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have 
> same name
> -
>
> Key: HDFS-3652
> URL: https://issues.apache.org/jira/browse/HDFS-3652
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 1.0.3, 1.1.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 1.0.4, 1.1.0
>
> Attachments: hdfs-3652.txt
>
>
> In {{FSEditLog.removeEditsForStorageDir}}, we iterate over the edits streams 
> trying to find the stream corresponding to a given dir. To check equality, we 
> currently use the following condition:
> {code}
>   File parentDir = getStorageDirForStream(idx);
>   if (parentDir.getName().equals(sd.getRoot().getName())) {
> {code}
> ... which is horribly incorrect. If two or more storage dirs happen to have 
> the same terminal path component (eg /data/1/nn and /data/2/nn) then it will 
> pick the wrong stream(s) to remove.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3719) Fix TestFileConcurrentReader#testUnfinishedBlockCrcErrorTransferToAppend and #testUnfinishedBlockCRCErrorNormalTransferAppend

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421859#comment-13421859
 ] 

Suresh Srinivas commented on HDFS-3719:
---

Can you set the Affects Version/s field, please?

> Fix TestFileConcurrentReader#testUnfinishedBlockCrcErrorTransferToAppend and 
> #testUnfinishedBlockCRCErrorNormalTransferAppend
> -
>
> Key: HDFS-3719
> URL: https://issues.apache.org/jira/browse/HDFS-3719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Andrew Wang
>
> Both of these tests are disabled. We should figure out what append 
> functionality we need to make the tests work again, and reenable them.
> {code}
>   // fails due to issue w/append, disable 
>   @Ignore
>   @Test
>   public void _testUnfinishedBlockCRCErrorTransferToAppend()
> throws IOException {
> runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
>   }
>   // fails due to issue w/append, disable 
>   @Ignore
>   @Test
>   public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
> throws IOException {
> runTestUnfinishedBlockCRCError(false, SyncType.APPEND, 
> DEFAULT_WRITE_SIZE);
>   }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3553) Hftp proxy tokens are broken

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421848#comment-13421848
 ] 

Hadoop QA commented on HDFS-3553:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12537763/HDFS-3553.branch-23.patch
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2898//console

This message is automatically generated.

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, 
> HDFS-3553.branch-1.0.patch, HDFS-3553.branch-23.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3553) Hftp proxy tokens are broken

2012-07-24 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3553:
--

Attachment: HDFS-3553.branch-23.patch

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, 
> HDFS-3553.branch-1.0.patch, HDFS-3553.branch-23.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3720) hdfs.h must get packaged

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421832#comment-13421832
 ] 

Hadoop QA commented on HDFS-3720:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537748/HDFS-3720.001.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in hadoop-assemblies.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2897//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2897//console

This message is automatically generated.

> hdfs.h must get packaged
> 
>
> Key: HDFS-3720
> URL: https://issues.apache.org/jira/browse/HDFS-3720
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.1-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3720.001.patch
>
>
> hdfs.h should be packaged, but it currently is not.  This was broken when 
> some header files got renamed by HDFS-3537.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3720) hdfs.h must get packaged

2012-07-24 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421796#comment-13421796
 ] 

Eli Collins commented on HDFS-3720:
---

+1 pending jenkins

> hdfs.h must get packaged
> 
>
> Key: HDFS-3720
> URL: https://issues.apache.org/jira/browse/HDFS-3720
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.1-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3720.001.patch
>
>
> hdfs.h should be packaged, but it currently is not.  This was broken when 
> some header files got renamed by HDFS-3537.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421794#comment-13421794
 ] 

Andrew Wang commented on HDFS-3717:
---

One note here is that HDFS-3711 uses a small, non-zero value for {{DELTA}}, not 
0.0. I chose this because floating point values can vary slightly based on the 
order of supposedly commutative operations, so doing a straight equality 
comparison often won't work.
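
As a minimal illustration of that point (not from the patch; the DELTA value and the summation below are made-up examples), a JUnit 4 check with a small non-zero tolerance still passes when the operation order perturbs the last bits:
{code}
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class TestFloatDelta {
  // Small non-zero tolerance in the spirit of HDFS-3711; the exact value is illustrative.
  private static final double DELTA = 1e-9;

  @Test
  public void testSumIsOrderInsensitiveWithinDelta() {
    double[] values = {0.1, 0.2, 0.3};
    double leftToRight = (values[0] + values[1]) + values[2];
    double rightToLeft = values[0] + (values[1] + values[2]);
    // The two sums can differ in the last bits, so compare with a tolerance
    // rather than demanding exact equality (delta = 0.0).
    assertEquals(leftToRight, rightToLeft, DELTA);
  }
}
{code}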

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3720) hdfs.h must get packaged

2012-07-24 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3720:
--

Status: Patch Available  (was: Open)

> hdfs.h must get packaged
> 
>
> Key: HDFS-3720
> URL: https://issues.apache.org/jira/browse/HDFS-3720
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.1-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3720.001.patch
>
>
> hdfs.h should be packaged, but it currently is not.  This was broken when 
> some header files got renamed by HDFS-3537.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3717:
-

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Thanks for looking into this, Kihwal. Resolving this JIRA as a duplicate.

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3721) hsync support broke wire compatibility

2012-07-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3721:
--

Description: HDFS-744 added support for hsync to the data transfer wire 
protocol. However, it actually broke wire compatibility: if the client has 
hsync support but the server does not, the client cannot read or write data on 
the old cluster.  (was: HDFS-744 added support for hsync to the data transfer 
wire protocol. However, it actually broke wire compatibility: if the client has 
hsync support but the server does not, the client cannot write to the old 
cluster.)

This also affects the read path, since it uses the same packet header and likewise 
assumes it is fixed-size.

> hsync support broke wire compatibility
> --
>
> Key: HDFS-3721
> URL: https://issues.apache.org/jira/browse/HDFS-3721
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 2.1.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
>
> HDFS-744 added support for hsync to the data transfer wire protocol. However, 
> it actually broke wire compatibility: if the client has hsync support but the 
> server does not, the client cannot read or write data on the old cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421771#comment-13421771
 ] 

Kihwal Lee commented on HDFS-3717:
--

HDFS-3711 was committed several hours ago and it fixed the issue. We can dupe it 
to HDFS-3711. 

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3710) libhdfs misuses O_RDONLY/WRONLY/RDWR

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421766#comment-13421766
 ] 

Hadoop QA commented on HDFS-3710:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537738/hdfs-3710-2.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDatanodeBlockScanner

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2896//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2896//console

This message is automatically generated.

> libhdfs misuses O_RDONLY/WRONLY/RDWR
> 
>
> Key: HDFS-3710
> URL: https://issues.apache.org/jira/browse/HDFS-3710
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hdfs-3710-2.txt, hdfs-3710.txt
>
>
> The {{O_RDONLY}} / {{O_WRONLY}} / {{O_RDWR}} macros in {{fcntl.h}} are not a 
> bitmask; they are an enum stored in the low bits of the flag word.  The 
> proper way to use them is
> {code}
> if ((flags & O_ACCMODE) == O_RDONLY)
> {code}
> rather than
> {code}
> if ((flags & O_RDONLY) == 0)
> {code}
> There are many examples of this misuse in {{hdfs.c}}.
> As a result of this incorrect testing, erroneous code may be accepted without 
> error and correct code might not work correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421763#comment-13421763
 ] 

Andrew Purtell commented on HDFS-3717:
--

The JUnit documentation just says that INFINITY (without specifying the sign) 
means the delta is ignored. Anyway, just reporting what worked here. 
Maybe it's wrong. 

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3710) libhdfs misuses O_RDONLY/WRONLY/RDWR

2012-07-24 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421754#comment-13421754
 ] 

Colin Patrick McCabe commented on HDFS-3710:


+1 lgtm

> libhdfs misuses O_RDONLY/WRONLY/RDWR
> 
>
> Key: HDFS-3710
> URL: https://issues.apache.org/jira/browse/HDFS-3710
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hdfs-3710-2.txt, hdfs-3710.txt
>
>
> The {{O_RDONLY}} / {{O_WRONLY}} / {{O_RDWR}} macros in {{fcntl.h}} are not a 
> bitmask; they are an enum stored in the low bits of the flag word.  The 
> proper way to use them is
> {code}
> if ((flags & O_ACCMODE) == O_RDONLY)
> {code}
> rather than
> {code}
> if ((flags & O_RDONLY) == 0)
> {code}
> There are many examples of this misuse in {{hdfs.c}}.
> As a result of this incorrect testing, erroneous code may be accepted without 
> error and correct code might not work correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3721) hsync support broke wire compatibility

2012-07-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421751#comment-13421751
 ] 

Todd Lipcon commented on HDFS-3721:
---

The issue here is the following code:
{code}
  /** Header size for a packet */
  private static final int PROTO_SIZE = 
PacketHeaderProto.newBuilder()
  .setOffsetInBlock(0)
  .setSeqno(0)
  .setLastPacketInBlock(false)
  .setDataLen(0)
  .setSyncBlock(false)
  .build().getSerializedSize();
  public static final int PKT_HEADER_LEN =
6 + PROTO_SIZE;
{code}

Since the new {{syncBlock}} flag is optional, this caused the packet header to 
become variable-length depending on whether the client is post-hsync or not. 
This screws up the datanode, resulting in an exception:
{code}
12/07/24 13:55:45 INFO datanode.DataNode: Exception in receiveBlock for 
BP-2093170007-127.0.0.1-1342943513882:blk_3332306339985613438_1008
java.io.IOException: Data remaining in packet does not matchsum of checksumLen 
and dataLen  size remaining: 22 data len: 20 checksum Len: 4
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:595)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:532)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:748)
{code}
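
To make the variable-length behavior concrete, here is a small sketch (not part of any patch) comparing the serialized size of the generated {{PacketHeaderProto}} with and without the new optional flag set; the wrapper class name is made up for the example, and the import path is assumed from the 2.x generated sources:
{code}
import org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos.PacketHeaderProto;

public class PacketHeaderSizeDemo {
  public static void main(String[] args) {
    // Header as a post-hsync client builds it: syncBlock is explicitly set,
    // so the optional field is present on the wire.
    PacketHeaderProto withSync = PacketHeaderProto.newBuilder()
        .setOffsetInBlock(0)
        .setSeqno(0)
        .setLastPacketInBlock(false)
        .setDataLen(0)
        .setSyncBlock(false)
        .build();

    // Header as a pre-hsync client builds it: syncBlock is never set, and an
    // unset optional field is simply absent from the serialized bytes.
    PacketHeaderProto withoutSync = PacketHeaderProto.newBuilder()
        .setOffsetInBlock(0)
        .setSeqno(0)
        .setLastPacketInBlock(false)
        .setDataLen(0)
        .build();

    // The two sizes differ, so a single fixed PKT_HEADER_LEN cannot describe
    // both old and new clients.
    System.out.println("with syncBlock set: " + withSync.getSerializedSize());
    System.out.println("without syncBlock:  " + withoutSync.getSerializedSize());
  }
}
{code}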

> hsync support broke wire compatibility
> --
>
> Key: HDFS-3721
> URL: https://issues.apache.org/jira/browse/HDFS-3721
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 2.1.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
>
> HDFS-744 added support for hsync to the data transfer wire protocol. However, 
> it actually broke wire compatibility: if the client has hsync support but the 
> server does not, the client cannot write to the old cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421749#comment-13421749
 ] 

Aaron T. Myers commented on HDFS-3717:
--

Thanks for noticing this, Kihwal.

The patch looks good to me, and I agree that 0.0 seems to make the most sense 
for the value of the delta.

One thing I don't understand, though - why didn't this test fail during the 
test-patch run of HDFS-3583? For that matter, this test currently passes when 
run on my local box, on both trunk and branch-2. Any ideas why this might be?

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3721) hsync support broke wire compatibility

2012-07-24 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3721:
-

 Summary: hsync support broke wire compatibility
 Key: HDFS-3721
 URL: https://issues.apache.org/jira/browse/HDFS-3721
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client
Affects Versions: 2.1.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical


HDFS-744 added support for hsync to the data transfer wire protocol. However, 
it actually broke wire compatibility: if the client has hsync support but the 
server does not, the client cannot write to the old cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-07-24 Thread Tom White (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421745#comment-13421745
 ] 

Tom White commented on HDFS-3672:
-

A small comment on the patch - how about "DiskBlockLocation" instead of 
"HdfsBlockLocation" (and similarly for the method name) since other FileSystem 
implementations could implement this too.

> Expose disk-location information for blocks to enable better scheduling
> ---
>
> Key: HDFS-3672
> URL: https://issues.apache.org/jira/browse/HDFS-3672
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3672-1.patch
>
>
> Currently, HDFS exposes on which datanodes a block resides, which allows 
> clients to make scheduling decisions for locality and load balancing. 
> Extending this to also expose on which disk on a datanode a block resides 
> would enable even better scheduling, on a per-disk rather than coarse 
> per-datanode basis.
> This API would likely look similar to Filesystem#getFileBlockLocations, but 
> also involve a series of RPCs to the responsible datanodes to determine disk 
> ids.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3720) hdfs.h must get packaged

2012-07-24 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3720:
---

Attachment: HDFS-3720.001.patch

package hdfs.h in

{code}
./hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/include/hdfs.h
{code}
(or a similar directory, depending on version number)

> hdfs.h must get packaged
> 
>
> Key: HDFS-3720
> URL: https://issues.apache.org/jira/browse/HDFS-3720
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.1-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3720.001.patch
>
>
> hdfs.h should be packaged, but it currently is not.  This was broken when 
> some header files got renamed by HDFS-3537.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3720) hdfs.h must get packaged

2012-07-24 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-3720:
--

 Summary: hdfs.h must get packaged
 Key: HDFS-3720
 URL: https://issues.apache.org/jira/browse/HDFS-3720
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: libhdfs
Affects Versions: 2.0.1-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3720.001.patch

hdfs.h should be packaged, but it currently is not.  This was broken when some 
header files got renamed by HDFS-3537.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3583) Convert remaining tests to Junit4

2012-07-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421739#comment-13421739
 ] 

Aaron T. Myers commented on HDFS-3583:
--

bq. Should we revert this patch in 2.0 until HDFS-3717 is fixed?

I don't think so. It looks like we can commit HDFS-3717 quite soon, so 
reverting this patch from branch-2 will only make more work for us.

My bad for not noticing this test failure on branch-2. I'll go take a look at 
HDFS-3717 now.

> Convert remaining tests to Junit4
> -
>
> Key: HDFS-3583
> URL: https://issues.apache.org/jira/browse/HDFS-3583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andrew Wang
>  Labels: newbie
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3583-branch-2.patch, hdfs-3583-part2.patch, 
> hdfs-3583.patch, hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh
>
>
> JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's 
> convert the remaining tests over to Junit4 style.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421738#comment-13421738
 ] 

Kihwal Lee commented on HDFS-3717:
--

I don't think the use of {{Double.POSITIVE_INFINITY}} is correct. The condition 
{{(a < Double.POSITIVE_INFINITY)}} is always true for any real double number a. 
 Similarly, {{(a < Double.NEGATIVE_INFINITY)}} will be false.  By looking at 
junit4's source, using {{Double.NEGATIVE_INFINITY}} as {{delta}} will work. 
Probably you actually meant this. But the junit documentation says nothing 
about it, so I would prefer using 0.0.

{code:title=From Assert.java, junit 4.3}
static public void  [More ...] assertEquals(String message, double expected, 
double actual, double delta) {
  if (Double.compare(expected, actual) == 0)
return;
  if (!(Math.abs(expected - actual) <= delta))
failNotEquals(message, new Double(expected), new Double(actual));
}
{code}
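
A small sketch (illustrative values only, not from the patch) of how those delta choices behave with the implementation above:
{code}
import static org.junit.Assert.assertEquals;

public class DeltaSemanticsDemo {
  public static void main(String[] args) {
    // delta = 0.0: passes only when the values are equal
    // (Double.compare(a, b) == 0, or |a - b| <= 0.0).
    assertEquals(1.0, 1.0, 0.0);

    // delta = Double.NEGATIVE_INFINITY: |a - b| <= -Infinity is never true, so
    // only the Double.compare shortcut can pass, i.e. exact equality again.
    assertEquals(1.0, 1.0, Double.NEGATIVE_INFINITY);

    // delta = Double.POSITIVE_INFINITY: |a - b| <= +Infinity is always true,
    // so the assertion never fails, even for wildly different values.
    assertEquals(1.0, 42.0, Double.POSITIVE_INFINITY);
  }
}
{code}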

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3583) Convert remaining tests to Junit4

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421736#comment-13421736
 ] 

Suresh Srinivas commented on HDFS-3583:
---

Should we revert this patch in 2.0 until HDFS-3717 is fixed?

> Convert remaining tests to Junit4
> -
>
> Key: HDFS-3583
> URL: https://issues.apache.org/jira/browse/HDFS-3583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andrew Wang
>  Labels: newbie
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3583-branch-2.patch, hdfs-3583-part2.patch, 
> hdfs-3583.patch, hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh
>
>
> JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's 
> convert the remaining tests over to Junit4 style.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-07-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421728#comment-13421728
 ] 

Todd Lipcon commented on HDFS-3672:
---

bq. My concern is, if this is used in MapReduce it might be okay. But once it 
starts getting used in other downstream projects removing this would be a 
challenge

That's the whole point of the Unstable API annotation, isn't it? We can change 
the API and downstream projects should accept that.

What if we explicitly also mark it as {{throws UnsupportedOperationException}}, 
so users of the API would be encouraged to catch this exception?

Since it's a performance API, it's always going to be used in an "advisory" 
role anyway -- any use of it could safely fall back to the non-optimized code 
path.
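
As a rough caller-side sketch of that advisory pattern (the {{getDiskBlockLocations}} helper below is hypothetical, standing in for whatever the final API looks like, and a FileSystem without support is assumed to throw):
{code}
import java.io.IOException;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;

public class DiskHintDemo {
  // Hypothetical stand-in for the proposed per-disk API; an implementation
  // without support would throw UnsupportedOperationException.
  static BlockLocation[] getDiskBlockLocations(FileSystem fs, FileStatus st)
      throws IOException {
    throw new UnsupportedOperationException("disk-location info not supported");
  }

  static BlockLocation[] locate(FileSystem fs, FileStatus st) throws IOException {
    try {
      return getDiskBlockLocations(fs, st);                 // use the hint when available
    } catch (UnsupportedOperationException e) {
      return fs.getFileBlockLocations(st, 0, st.getLen());  // fall back to per-datanode locality
    }
  }
}
{code}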

I'd be OK compromising and calling it LimitedPrivate(MapReduce), but I know 
that at least one of our customers is interested in using an API like this as 
well. Unfortunately I can't give too many details on their use case due to NDA 
(lame, I know), but I just wanted to provide a data point that there is demand 
for this "in the wild".

bq. We're still running some experiments locally, but our assumption is that, 
within short time-scales (~0.5 seconds), the lagging 0.5 second usage is a 
reasonably good predictor of the next 0.5 seconds, given most Hadoop-style 
access is of 100MB+ chunks of data

I ran a simple experiment yesterday on one of our test clusters. The cluster is 
doing a mix of workloads - I think at the time it was running a Hive benchmark 
suite on ~100 nodes. So, it was under load, but not 100% utilization.

On all of the nodes, I collected /proc/diskstats once a second for an hour. I 
then removed all disk samples where there was 0 load on the disk, since that 
was just periods of inactivity between test runs. Then, I took the disk 
utilization at each sample, and appended it as a column to the data from the 
previous second. I loaded the data into R and constructed a few simple models 
for each second's disk utilization on a given disk based on the previous 
second's disk statistics.

Linear model using only the current utilization to predict the next second's 
utilization:
{code}
> m.linear.on.only.util <- lm(next_sec_util ~ ms_doing_io, data=d)
{code}
(this would correspond to a trivial model like "assume that if a disk is busy 
now, it will still be busy in the next second")

Linear model using all of the current statistics (queue length, read/write mix, 
etc) to predict next second's util:
{code}
> m.linear <- lm(next_sec_util ~ ., data=d)
{code}

Quadratic model using all of the current statistics, and their interaction 
terms, to predict next second's util:
{code}
> d.sample.200k <- d[sample(nrow(d), size=200000),]
> m.quadratic <- lm(next_sec_util ~ .:., data=d.sample.200k)
{code}

Random forest (a decision-tree based model, trained using only 1% of the data, 
since it's slow):
{code}
> d.sample.10k <- d[sample(nrow(d), size=10000),]
> m.rf <- randomForest(next_sec_util~., data=d.sample.10k)
{code}

The models fared as follows:

||Model||Percent variance explained||
| Linear on only utilization | 58.4% |
| Linear | 70.6% |
| Quadratic | 73.9% |
| Random forest | 76.9% |

Certainly the above analysis is just one workload, and one in which the disks 
are not being particularly slammed. But, it does show that looking at a disk's 
current status is a reasonable predictor of status over the next second on a 
typical MR cluster.

> Expose disk-location information for blocks to enable better scheduling
> ---
>
> Key: HDFS-3672
> URL: https://issues.apache.org/jira/browse/HDFS-3672
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3672-1.patch
>
>
> Currently, HDFS exposes on which datanodes a block resides, which allows 
> clients to make scheduling decisions for locality and load balancing. 
> Extending this to also expose on which disk on a datanode a block resides 
> would enable even better scheduling, on a per-disk rather than coarse 
> per-datanode basis.
> This API would likely look similar to Filesystem#getFileBlockLocations, but 
> also involve a series of RPCs to the responsible datanodes to determine disk 
> ids.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3710) libhdfs misuses O_RDONLY/WRONLY/RDWR

2012-07-24 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-3710:


Attachment: hdfs-3710-2.txt

Add a test that an invalid flags value is disallowed.

Also tweak errno on invalid flags; return EINVAL rather than ENOTSUP.

> libhdfs misuses O_RDONLY/WRONLY/RDWR
> 
>
> Key: HDFS-3710
> URL: https://issues.apache.org/jira/browse/HDFS-3710
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hdfs-3710-2.txt, hdfs-3710.txt
>
>
> The {{O_RDONLY}} / {{O_WRONLY}} / {{O_RDWR}} macros in {{fcntl.h}} are not a 
> bitmask; they are an enum stored in the low bits of the flag word.  The 
> proper way to use them is
> {code}
> if ((flags & O_ACCMODE) == O_RDONLY)
> {code}
> rather than
> {code}
> if ((flags & O_RDONLY) == 0)
> {code}
> There are many examples of this misuse in {{hdfs.c}}.
> As a result of this incorrect testing, erroneous code may be accepted without 
> error and correct code might not work correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3650) Use MutableQuantiles to provide latency histograms for various operations

2012-07-24 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421698#comment-13421698
 ] 

Andrew Wang commented on HDFS-3650:
---

These two test failures look like HADOOP-8596 and HDFS-3660; I think they are unrelated.

> Use MutableQuantiles to provide latency histograms for various operations
> -
>
> Key: HDFS-3650
> URL: https://issues.apache.org/jira/browse/HDFS-3650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3650-1.patch
>
>
> MutableQuantiles provide accurate estimation of various percentiles for a 
> stream of data. Many existing metrics reported by a MutableRate would also 
> benefit from having these percentiles; lets add MutableQuantiles where we 
> think it'd be useful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3553) Hftp proxy tokens are broken

2012-07-24 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421689#comment-13421689
 ] 

Owen O'Malley commented on HDFS-3553:
-

Daryn, the current patch looks good. Thanks!

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, 
> HDFS-3553.branch-1.0.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421685#comment-13421685
 ] 

Andrew Purtell commented on HDFS-3717:
--

Or if you use Double.POSITIVE_INFINITY then the delta parameter is ignored. I'm 
not sure if that is better than 0.0 but Double.POSITIVE_INFINITY fixed this for 
me on our private Jenkins last night.

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3694) QJM: Fix getEditLogManifest to fetch httpPort if necessary

2012-07-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421677#comment-13421677
 ] 

Aaron T. Myers commented on HDFS-3694:
--

+1, the patch looks good to me.

> QJM: Fix getEditLogManifest to fetch httpPort if necessary
> --
>
> Key: HDFS-3694
> URL: https://issues.apache.org/jira/browse/HDFS-3694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3694.txt
>
>
> This is necessary for QJM to work with HA. When the NNs start up, they start 
> in standby state and try to read from the JournalNodes. So they call 
> {{getEditLogManifest()}}. But they don't call {{getJournalInfo}} so the 
> {{httpPort}} field doesn't get filled in. This means that when they try to 
> actually fetch the remote edits, they fail since they don't know the JN's 
> HTTP port.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-24 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421678#comment-13421678
 ] 

Robert Joseph Evans commented on HDFS-3667:
---

Nicholas,

If it is too much of a pain to separate them, that is OK. I want the OOM fix in 
0.23, but I realize that is not a priority for a lot of others and I can port 
it over myself once this goes into branch-2. 

> Add retry support to WebHdfsFileSystem
> --
>
> Key: HDFS-3667
> URL: https://issues.apache.org/jira/browse/HDFS-3667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3667_20120718.patch, h3667_20120721.patch, 
> h3667_20120722.patch
>
>
> DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and it 
> retries on exceptions such as connection failure, safemode.  
> WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3650) Use MutableQuantiles to provide latency histograms for various operations

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421673#comment-13421673
 ] 

Hadoop QA commented on HDFS-3650:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537710/hdfs-3650-1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestFileAppend4
  org.apache.hadoop.hdfs.TestDatanodeBlockScanner

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2894//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2894//console

This message is automatically generated.

> Use MutableQuantiles to provide latency histograms for various operations
> -
>
> Key: HDFS-3650
> URL: https://issues.apache.org/jira/browse/HDFS-3650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3650-1.patch
>
>
> MutableQuantiles provide accurate estimation of various percentiles for a 
> stream of data. Many existing metrics reported by a MutableRate would also 
> benefit from having these percentiles; lets add MutableQuantiles where we 
> think it'd be useful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3693) QJM: JNStorage should read its storage info even before a writer becomes active

2012-07-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421671#comment-13421671
 ] 

Aaron T. Myers commented on HDFS-3693:
--

One tiny nit, you might want to put a space after ":" here:
{code}
System.err.println("storage string:" + storageString);
{code}

Otherwise the patch looks good to me. +1.

> QJM: JNStorage should read its storage info even before a writer becomes 
> active
> ---
>
> Key: HDFS-3693
> URL: https://issues.apache.org/jira/browse/HDFS-3693
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-3693.txt
>
>
> In order for QJM to work with HA, the standby needs to be able to read from a 
> JournalNode even when no active has written to it. In the current code, it 
> reads the StorageInfo only when {{newEpoch()}} is called. But, that's only 
> called when a writer becomes active. This causes the SBN to fail at startup 
> since the JN thinks its storage info is uninitialized.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3692) QJM: support purgeEditLogs() call to remotely purge logs

2012-07-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421657#comment-13421657
 ] 

Aaron T. Myers commented on HDFS-3692:
--

+1, the patch looks good to me.

> QJM: support purgeEditLogs() call to remotely purge logs
> 
>
> Key: HDFS-3692
> URL: https://issues.apache.org/jira/browse/HDFS-3692
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3692.txt
>
>
> This API is currently marked TODO. We need it to maintain bounded disk space 
> usage on the JournalNode storage.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3553) Hftp proxy tokens are broken

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421625#comment-13421625
 ] 

Hadoop QA commented on HDFS-3553:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12537725/HDFS-3553-3.branch-1.0.patch
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2895//console

This message is automatically generated.

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, 
> HDFS-3553.branch-1.0.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421622#comment-13421622
 ] 

Suresh Srinivas commented on HDFS-3718:
---

There is no specific reason for that order. We should set DataNode#shouldRun to 
false before other shutdown calls.
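
A minimal sketch of the reordering being suggested (names abbreviated from the real DataNode and DataBlockScanner classes, not the actual code):
{code}
public void shutdown() {
  // Flip the flag first so DataBlockScanner's loop observes shutdown even if
  // the interrupt below is missed.
  shouldRun = false;
  if (blockScanner != null) {
    blockScanner.shutdown();   // then stop the block pool scanner thread
  }
  // ... remaining shutdown steps unchanged ...
}
{code}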

> Datanode won't shutdown because of runaway DataBlockScanner thread
> --
>
> Key: HDFS-3718
> URL: https://issues.apache.org/jira/browse/HDFS-3718
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Priority: Critical
> Fix For: 3.0.0, 2.2.0-alpha
>
>
> Datanode sometimes does not shutdown because the block pool scanner thread 
> keeps running. It prints out "Starting a new period" every five seconds, even 
> after {{shutdown()}} is called.  Somehow the interrupt is missed.
> {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, 
> but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked 
> before it is being set to false.
> Is there any reason why {{datanode.shouldRun}} is set to false later? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3553) Hftp proxy tokens are broken

2012-07-24 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3553:
--

Attachment: HDFS-3553-3.branch-1.0.patch

Depends on the HADOOP-8613 patch for the correct ugi auth type from a token, and adds 
more tests to verify that the jsp returns a ugi with the correct auth type with or 
without a token.

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553-3.branch-1.0.patch, 
> HDFS-3553.branch-1.0.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3708) transitionToStandby may NPE

2012-07-24 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421588#comment-13421588
 ] 

Eli Collins commented on HDFS-3708:
---

Yup, thinking we should also warn if they're used, for those who don't read the 
docs.

> transitionToStandby may NPE
> ---
>
> Key: HDFS-3708
> URL: https://issues.apache.org/jira/browse/HDFS-3708
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Priority: Minor
>  Labels: newbie
>
> Setting both namenodes active and then trying to turn one to standby results 
> in a NullPointerException and the NameNode process is killed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3626) Creating file with invalid path can corrupt edit log

2012-07-24 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421585#comment-13421585
 ] 

Eli Collins commented on HDFS-3626:
---

Thanks Daryn, want to file a follow-up jira so we can address your concern?

> Creating file with invalid path can corrupt edit log
> 
>
> Key: HDFS-3626
> URL: https://issues.apache.org/jira/browse/HDFS-3626
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-3626.txt, hdfs-3626.txt, hdfs-3626.txt, 
> hdfs-3626.txt
>
>
> Joris Bontje reports the following:
> The following command results in a corrupt NN editlog (note the double slash 
> and reading from stdin):
> $ cat /usr/share/dict/words | hadoop fs -put - 
> hdfs://localhost:8020//path/file
> After this, restarting the namenode will result in the following fatal 
> exception:
> {code}
> 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_173-188
>  expecting start txid #173
> 2012-07-10 06:29:19,912 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation MkdirOp [length=0, path=/, timestamp=1341915658216, 
> permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182]
> java.lang.ArrayIndexOutOfBoundsException: -1
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins reassigned HDFS-3717:
-

Assignee: Kihwal Lee

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-496) Use PureJavaCrc32 in HDFS

2012-07-24 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-496:


Attachment: HDFS-496.branch-1.patch

> Use PureJavaCrc32 in HDFS
> -
>
> Key: HDFS-496
> URL: https://issues.apache.org/jira/browse/HDFS-496
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, hdfs client, performance
>Affects Versions: 0.21.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: HDFS-496.branch-1.patch, hdfs-496.txt, hdfs-496.txt
>
>
> Common now has a pure java CRC32 implementation which is more efficient than 
> java.util.zip.CRC32. This issue is to make use of it.
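
As a quick illustration of what "making use of it" looks like in calling code, 
here is a sketch assuming the org.apache.hadoop.util.PureJavaCrc32 class from 
Common, which implements java.util.zip.Checksum and is therefore a drop-in 
replacement:

{code}
import java.util.zip.Checksum;

import org.apache.hadoop.util.PureJavaCrc32;

public class CrcSwapSketch {
  public static void main(String[] args) {
    byte[] data = "example block data".getBytes();

    // Callers that previously did "new java.util.zip.CRC32()" only need to
    // construct the pure-Java implementation instead; the Checksum API is
    // unchanged.
    Checksum crc = new PureJavaCrc32();
    crc.update(data, 0, data.length);
    System.out.println("crc32 = " + crc.getValue());
  }
}
{code}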

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3650) Use MutableQuantiles to provide latency histograms for various operations

2012-07-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-3650:
--

Attachment: (was: hdfs-3650-1.patch)

> Use MutableQuantiles to provide latency histograms for various operations
> -
>
> Key: HDFS-3650
> URL: https://issues.apache.org/jira/browse/HDFS-3650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3650-1.patch
>
>
> MutableQuantiles provide accurate estimation of various percentiles for a 
> stream of data. Many existing metrics reported by a MutableRate would also 
> benefit from having these percentiles; let's add MutableQuantiles where we 
> think it'd be useful.
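
For illustration, a small sketch of the pattern being described (the class and 
metric names are hypothetical, and the MetricsRegistry#newQuantiles signature is 
assumed from the metrics2 library rather than taken from the attached patch): 
keep the existing MutableRate and register a MutableQuantiles next to it.

{code}
import org.apache.hadoop.metrics2.lib.MetricsRegistry;
import org.apache.hadoop.metrics2.lib.MutableQuantiles;
import org.apache.hadoop.metrics2.lib.MutableRate;

// Hypothetical metrics source, not one of the sources touched by the patch.
public class LatencyMetricsSketch {
  private final MetricsRegistry registry = new MetricsRegistry("ExampleOps");

  private final MutableRate syncLatency =
      registry.newRate("syncLatency", "Sync latency");

  // 60-second rollover interval for the quantile estimates (assumed API).
  private final MutableQuantiles syncLatencyQuantiles =
      registry.newQuantiles("syncLatencyQuantiles",
          "Sync latency percentiles", "ops", "latencyMicros", 60);

  public void recordSync(long micros) {
    syncLatency.add(micros);           // existing average / ops-per-sec metric
    syncLatencyQuantiles.add(micros);  // new percentile estimates
  }
}
{code}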

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3650) Use MutableQuantiles to provide latency histograms for various operations

2012-07-24 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-3650:
--

Attachment: hdfs-3650-1.patch

Freshly rebased for trunk.

> Use MutableQuantiles to provide latency histograms for various operations
> -
>
> Key: HDFS-3650
> URL: https://issues.apache.org/jira/browse/HDFS-3650
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3650-1.patch
>
>
> MutableQuantiles provide accurate estimation of various percentiles for a 
> stream of data. Many existing metrics reported by a MutableRate would also 
> benefit from having these percentiles; let's add MutableQuantiles where we 
> think it'd be useful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3719) Fix TestFileConcurrentReader#testUnfinishedBlockCrcErrorTransferToAppend and #testUnfinishedBlockCRCErrorNormalTransferAppend

2012-07-24 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-3719:
-

 Summary: Fix 
TestFileConcurrentReader#testUnfinishedBlockCrcErrorTransferToAppend and 
#testUnfinishedBlockCRCErrorNormalTransferAppend
 Key: HDFS-3719
 URL: https://issues.apache.org/jira/browse/HDFS-3719
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Andrew Wang


Both of these tests are disabled. We should figure out what append 
functionality we need to make the tests work again, and reenable them.

{code}
  // fails due to issue w/append, disable 
  @Ignore
  @Test
  public void _testUnfinishedBlockCRCErrorTransferToAppend()
throws IOException {
runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
  }

  // fails due to issue w/append, disable 
  @Ignore
  @Test
  public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
throws IOException {
runTestUnfinishedBlockCRCError(false, SyncType.APPEND, DEFAULT_WRITE_SIZE);
  }
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421543#comment-13421543
 ] 

Suresh Srinivas commented on HDFS-2815:
---

Additionally the change in INodeFile.java to set the InodeFile's blocks as null 
may be necessary in this patch as well.

> Namenode is not coming out of safemode when we perform ( NN crash + restart ) 
> .  Also FSCK report shows blocks missed.
> --
>
> Key: HDFS-2815
> URL: https://issues.apache.org/jira/browse/HDFS-2815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.0.0-alpha, 3.0.0
>
> Attachments: HDFS-2815-22-branch.patch, HDFS-2815-Branch-1.patch, 
> HDFS-2815.patch, HDFS-2815.patch
>
>
> When testing HA (internal) with continuous switches at roughly 5-minute 
> intervals, some *blocks were missed* and the namenode went into safemode 
> after the next switch.
>
> After analysis, I found that these files had already been deleted by clients, 
> but I do not see any delete command logs in the namenode log files. The 
> namenode nevertheless added those blocks to invalidateSets and the DNs 
> deleted the blocks.
> On restart, the namenode went into safemode, still expecting those blocks 
> before it could leave safemode.
> The likely reason is that the file is deleted in memory and its blocks are 
> added to invalidates before the edits are synced to the editlog file. By that 
> time the NN has asked the DNs to delete the blocks, and the namenode then 
> shuts down before persisting to the editlog (the log falls behind).
> Because of this, we may not get the INFO logs about the delete, and when we 
> restart the namenode (in my scenario it is again a switch), it still expects 
> the deleted blocks, since the delete request was never persisted to the 
> editlog.
> I reproduced this scenario with debug breakpoints. *I feel we should not add 
> the blocks to invalidates before persisting to the editlog*. 
> Note: for the switch, we used kill -9 (force kill).
> I am currently on version 0.20.2. The same was verified on 0.23 in a normal 
> crash + restart scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread

2012-07-24 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-3718:


 Summary: Datanode won't shutdown because of runaway 
DataBlockScanner thread
 Key: HDFS-3718
 URL: https://issues.apache.org/jira/browse/HDFS-3718
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 2.0.1-alpha
Reporter: Kihwal Lee
Priority: Critical
 Fix For: 3.0.0, 2.2.0-alpha


Datanode sometimes does not shut down because the block pool scanner thread 
keeps running. It prints out "Starting a new period" every five seconds, even 
after {{shutdown()}} is called.  Somehow the interrupt is missed.

{{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, 
but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before 
the flag is set to false.

Is there any reason why {{datanode.shouldRun}} is set to false later? 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421537#comment-13421537
 ] 

Suresh Srinivas commented on HDFS-2815:
---

bq. I think I have to remove the inner synchronization block
I think you should remove the outer method synchronization and retain the inner 
synchronization. That way you do not sync the editlog while holding the lock.

Is this patch quite a bit different from HDFS-173 with the above change? If so, 
should we just mark both HDFS-173 and HDFS-2815 as done for branch-1 as well? 
The test from HDFS-173 could then be included in this patch.
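
A rough sketch of the shape being discussed, with illustrative names only (this 
is not the actual FSNamesystem code): do the in-memory delete and write the edit 
record while holding the lock, then sync the editlog, and only afterwards hand 
the collected blocks to the invalidation path, which also matches the ordering 
concern raised in this jira.

{code}
// Illustrative only; the real code lives in FSNamesystem.
class DeleteOrderingSketch {
  private final Object nsLock = new Object();

  void deleteInternal(String src) {
    java.util.List<Long> collectedBlocks;
    synchronized (nsLock) {                        // inner synchronized block retained
      collectedBlocks = removeFromNamespace(src);  // in-memory delete + edit logged
    }
    syncEditLog();                                 // sync edits without holding the lock
    scheduleInvalidation(collectedBlocks);         // only now ask DNs to delete blocks
  }

  // Hypothetical stand-ins for the real namesystem/editlog methods.
  private java.util.List<Long> removeFromNamespace(String src) {
    return new java.util.ArrayList<Long>();
  }

  private void syncEditLog() {
  }

  private void scheduleInvalidation(java.util.List<Long> blocks) {
  }
}
{code}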

> Namenode is not coming out of safemode when we perform ( NN crash + restart ) 
> .  Also FSCK report shows blocks missed.
> --
>
> Key: HDFS-2815
> URL: https://issues.apache.org/jira/browse/HDFS-2815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.0.0-alpha, 3.0.0
>
> Attachments: HDFS-2815-22-branch.patch, HDFS-2815-Branch-1.patch, 
> HDFS-2815.patch, HDFS-2815.patch
>
>
> When testing HA (internal) with continuous switches at roughly 5-minute 
> intervals, some *blocks were missed* and the namenode went into safemode 
> after the next switch.
>
> After analysis, I found that these files had already been deleted by clients, 
> but I do not see any delete command logs in the namenode log files. The 
> namenode nevertheless added those blocks to invalidateSets and the DNs 
> deleted the blocks.
> On restart, the namenode went into safemode, still expecting those blocks 
> before it could leave safemode.
> The likely reason is that the file is deleted in memory and its blocks are 
> added to invalidates before the edits are synced to the editlog file. By that 
> time the NN has asked the DNs to delete the blocks, and the namenode then 
> shuts down before persisting to the editlog (the log falls behind).
> Because of this, we may not get the INFO logs about the delete, and when we 
> restart the namenode (in my scenario it is again a switch), it still expects 
> the deleted blocks, since the delete request was never persisted to the 
> editlog.
> I reproduced this scenario with debug breakpoints. *I feel we should not add 
> the blocks to invalidates before persisting to the editlog*. 
> Note: for the switch, we used kill -9 (force kill).
> I am currently on version 0.20.2. The same was verified on 0.23 in a normal 
> crash + restart scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3626) Creating file with invalid path can corrupt edit log

2012-07-24 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421532#comment-13421532
 ] 

Daryn Sharp commented on HDFS-3626:
---

Just to be sure I'm not holding this up, I'm a +0.5 because I'm not happy about 
normalization but it's important to fix the NN corruption.  Eli's +1 is 
sufficient to commit.

> Creating file with invalid path can corrupt edit log
> 
>
> Key: HDFS-3626
> URL: https://issues.apache.org/jira/browse/HDFS-3626
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-3626.txt, hdfs-3626.txt, hdfs-3626.txt, 
> hdfs-3626.txt
>
>
> Joris Bontje reports the following:
> The following command results in a corrupt NN editlog (note the double slash 
> and reading from stdin):
> $ cat /usr/share/dict/words | hadoop fs -put - 
> hdfs://localhost:8020//path/file
> After this, restarting the namenode will result in the following fatal 
> exception:
> {code}
> 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_173-188
>  expecting start txid #173
> 2012-07-10 06:29:19,912 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation MkdirOp [length=0, path=/, timestamp=1341915658216, 
> permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182]
> java.lang.ArrayIndexOutOfBoundsException: -1
> {code}
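
For illustration only, a minimal, self-contained check of the kind of problem 
involved (this is not the committed patch, which takes the normalization 
approach Daryn refers to above): it merely detects the empty path component 
produced by the double slash before such an operation would be applied and 
logged.

{code}
// Sketch: detect empty interior path components such as the "//" in the
// reproduction above. Hypothetical helper, not HDFS code.
public class PathCheckSketch {
  static boolean hasEmptyComponent(String path) {
    String[] parts = path.split("/");
    for (int i = 1; i < parts.length; i++) {  // index 0 is the empty root slot
      if (parts[i].isEmpty()) {
        return true;                          // e.g. "//path/file"
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(hasEmptyComponent("/path/file"));   // false
    System.out.println(hasEmptyComponent("//path/file"));  // true
  }
}
{code}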

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421530#comment-13421530
 ] 

Suresh Srinivas commented on HDFS-2815:
---

@manish - you meant to post the above comment in some other jira?

> Namenode is not coming out of safemode when we perform ( NN crash + restart ) 
> .  Also FSCK report shows blocks missed.
> --
>
> Key: HDFS-2815
> URL: https://issues.apache.org/jira/browse/HDFS-2815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.0.0-alpha, 3.0.0
>
> Attachments: HDFS-2815-22-branch.patch, HDFS-2815-Branch-1.patch, 
> HDFS-2815.patch, HDFS-2815.patch
>
>
> When testing HA (internal) with continuous switches at roughly 5-minute 
> intervals, some *blocks were missed* and the namenode went into safemode 
> after the next switch.
>
> After analysis, I found that these files had already been deleted by clients, 
> but I do not see any delete command logs in the namenode log files. The 
> namenode nevertheless added those blocks to invalidateSets and the DNs 
> deleted the blocks.
> On restart, the namenode went into safemode, still expecting those blocks 
> before it could leave safemode.
> The likely reason is that the file is deleted in memory and its blocks are 
> added to invalidates before the edits are synced to the editlog file. By that 
> time the NN has asked the DNs to delete the blocks, and the namenode then 
> shuts down before persisting to the editlog (the log falls behind).
> Because of this, we may not get the INFO logs about the delete, and when we 
> restart the namenode (in my scenario it is again a switch), it still expects 
> the deleted blocks, since the delete request was never persisted to the 
> editlog.
> I reproduced this scenario with debug breakpoints. *I feel we should not add 
> the blocks to invalidates before persisting to the editlog*. 
> Note: for the switch, we used kill -9 (force kill).
> I am currently on version 0.20.2. The same was verified on 0.23 in a normal 
> crash + restart scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421526#comment-13421526
 ] 

Hadoop QA commented on HDFS-3717:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537699/hdfs-3717.patch.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2893//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2893//console

This message is automatically generated.

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3711) Manually convert remaining tests to JUnit4

2012-07-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421511#comment-13421511
 ] 

Hudson commented on HDFS-3711:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2535 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2535/])
HDFS-3711. Manually convert remaining tests to JUnit4. Contributed by 
Andrew Wang. (Revision 1365119)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365119
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestParallelImageWrite.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/unit/org/apache/hadoop/hdfs/server/namenode/TestNNLeaseRecovery.java


> Manually convert remaining tests to JUnit4
> --
>
> Key: HDFS-3711
> URL: https://issues.apache.org/jira/browse/HDFS-3711
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3711.patch
>
>
> In HDFS-3583, we used a fixup script to automatically convert most of the 
> HDFS tests to JUnit4. There are a couple tests left that were too difficult 
> and need to be manually converted (try {{`grep -r "junit.framework" 
> hadoop-hdfs-project}} to see what's left).
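
For readers unfamiliar with what the manual conversion involves, here is a 
generic before/after sketch (a hypothetical test, not one of the files in the 
attached patch): drop the junit.framework.TestCase base class in favour of 
JUnit 4 annotations and static Assert imports.

{code}
import static org.junit.Assert.assertTrue;

import org.junit.Before;
import org.junit.Test;

public class ExampleConvertedTest {   // previously: extends TestCase
  private boolean ready;

  @Before                             // previously: protected void setUp()
  public void setUp() {
    ready = true;
  }

  @Test                               // previously: public void testReady()
  public void testReady() {
    assertTrue(ready);
  }
}
{code}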

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-07-24 Thread manish v dunani (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421497#comment-13421497
 ] 

manish v dunani commented on HDFS-2815:
---

Have you tried turning safemode off with the command: bin/hadoop dfsadmin 
-safemode leave? I think it's compatible.


> Namenode is not coming out of safemode when we perform ( NN crash + restart ) 
> .  Also FSCK report shows blocks missed.
> --
>
> Key: HDFS-2815
> URL: https://issues.apache.org/jira/browse/HDFS-2815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.0.0-alpha, 3.0.0
>
> Attachments: HDFS-2815-22-branch.patch, HDFS-2815-Branch-1.patch, 
> HDFS-2815.patch, HDFS-2815.patch
>
>
> When testing HA (internal) with continuous switches at roughly 5-minute 
> intervals, some *blocks were missed* and the namenode went into safemode 
> after the next switch.
>
> After analysis, I found that these files had already been deleted by clients, 
> but I do not see any delete command logs in the namenode log files. The 
> namenode nevertheless added those blocks to invalidateSets and the DNs 
> deleted the blocks.
> On restart, the namenode went into safemode, still expecting those blocks 
> before it could leave safemode.
> The likely reason is that the file is deleted in memory and its blocks are 
> added to invalidates before the edits are synced to the editlog file. By that 
> time the NN has asked the DNs to delete the blocks, and the namenode then 
> shuts down before persisting to the editlog (the log falls behind).
> Because of this, we may not get the INFO logs about the delete, and when we 
> restart the namenode (in my scenario it is again a switch), it still expects 
> the deleted blocks, since the delete request was never persisted to the 
> editlog.
> I reproduced this scenario with debug breakpoints. *I feel we should not add 
> the blocks to invalidates before persisting to the editlog*. 
> Note: for the switch, we used kill -9 (force kill).
> I am currently on version 0.20.2. The same was verified on 0.23 in a normal 
> crash + restart scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3711) Manually convert remaining tests to JUnit4

2012-07-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421479#comment-13421479
 ] 

Hudson commented on HDFS-3711:
--

Integrated in Hadoop-Common-trunk-Commit #2514 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2514/])
HDFS-3711. Manually convert remaining tests to JUnit4. Contributed by 
Andrew Wang. (Revision 1365119)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365119
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestParallelImageWrite.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/unit/org/apache/hadoop/hdfs/server/namenode/TestNNLeaseRecovery.java


> Manually convert remaining tests to JUnit4
> --
>
> Key: HDFS-3711
> URL: https://issues.apache.org/jira/browse/HDFS-3711
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3711.patch
>
>
> In HDFS-3583, we used a fixup script to automatically convert most of the 
> HDFS tests to JUnit4. There are a couple tests left that were too difficult 
> and need to be manually converted (try {{`grep -r "junit.framework" 
> hadoop-hdfs-project}} to see what's left).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3711) Manually convert remaining tests to JUnit4

2012-07-24 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3711:
-

  Resolution: Fixed
   Fix Version/s: 2.2.0-alpha
Target Version/s: 2.2.0-alpha
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I've just committed this to trunk and branch-2.

Thanks a lot for the contribution, Andrew, and thanks for the review, Eli.

> Manually convert remaining tests to JUnit4
> --
>
> Key: HDFS-3711
> URL: https://issues.apache.org/jira/browse/HDFS-3711
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3711.patch
>
>
> In HDFS-3583, we used a fixup script to automatically convert most of the 
> HDFS tests to JUnit4. There are a couple tests left that were too difficult 
> and need to be manually converted (try {{`grep -r "junit.framework" 
> hadoop-hdfs-project}} to see what's left).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3711) Manually convert remaining tests to JUnit4

2012-07-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421476#comment-13421476
 ] 

Hudson commented on HDFS-3711:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2579 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2579/])
HDFS-3711. Manually convert remaining tests to JUnit4. Contributed by 
Andrew Wang. (Revision 1365119)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1365119
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestSimulatedFSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestParallelImageWrite.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/unit/org/apache/hadoop/hdfs/server/namenode/TestNNLeaseRecovery.java


> Manually convert remaining tests to JUnit4
> --
>
> Key: HDFS-3711
> URL: https://issues.apache.org/jira/browse/HDFS-3711
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3711.patch
>
>
> In HDFS-3583, we used a fixup script to automatically convert most of the 
> HDFS tests to JUnit4. There are a couple tests left that were too difficult 
> and need to be manually converted (try {{`grep -r "junit.framework" 
> hadoop-hdfs-project}} to see what's left).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3711) Manually convert remaining tests to JUnit4

2012-07-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421473#comment-13421473
 ] 

Aaron T. Myers commented on HDFS-3711:
--

The test failure is unrelated. TestFileAppend4 is known to be flaky sometimes, 
and I just ran it locally with and without this patch, and confirmed that it 
passed just fine.

I'm going to commit this momentarily.

> Manually convert remaining tests to JUnit4
> --
>
> Key: HDFS-3711
> URL: https://issues.apache.org/jira/browse/HDFS-3711
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3711.patch
>
>
> In HDFS-3583, we used a fixup script to automatically convert most of the 
> HDFS tests to JUnit4. There are a couple tests left that were too difficult 
> and need to be manually converted (try {{`grep -r "junit.framework" 
> hadoop-hdfs-project}} to see what's left).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3717) Test cases in TestPBHelper fail

2012-07-24 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-3717:
-

Summary: Test cases in TestPBHelper fail  (was: Test cases in TestPBHelper 
fails)

> Test cases in TestPBHelper fail
> ---
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3717) Test cases in TestPBHelper fails

2012-07-24 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-3717:
-

Status: Patch Available  (was: Open)

> Test cases in TestPBHelper fails
> 
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3717) Test cases in TestPBHelper fails

2012-07-24 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-3717:
-

Attachment: hdfs-3717.patch.txt

> Test cases in TestPBHelper fails
> 
>
> Key: HDFS-3717
> URL: https://issues.apache.org/jira/browse/HDFS-3717
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
>Reporter: Kihwal Lee
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3717.patch.txt
>
>
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
> {{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}
> They all fail with:
> {noformat}
> java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
>  to compare floating-point numbers
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-07-24 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421455#comment-13421455
 ] 

Uma Maheswara Rao G commented on HDFS-2815:
---

Thanks a lot Suresh for the review!

I think I have to remove the inner synchronization block, as I am not protecting 
removeBlocks in a separate synchronized block. Also, this JIRA is not targeting 
the synchronization issue in the "large directory deletion" case, right?

For the test, I couldn't come up with a clear assertion for the behaviour of 
this issue :(.
Do you have any suggestions? 

> Namenode is not coming out of safemode when we perform ( NN crash + restart ) 
> .  Also FSCK report shows blocks missed.
> --
>
> Key: HDFS-2815
> URL: https://issues.apache.org/jira/browse/HDFS-2815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.0.0-alpha, 3.0.0
>
> Attachments: HDFS-2815-22-branch.patch, HDFS-2815-Branch-1.patch, 
> HDFS-2815.patch, HDFS-2815.patch
>
>
> When testing HA (internal) with continuous switches at roughly 5-minute 
> intervals, some *blocks were missed* and the namenode went into safemode 
> after the next switch.
>
> After analysis, I found that these files had already been deleted by clients, 
> but I do not see any delete command logs in the namenode log files. The 
> namenode nevertheless added those blocks to invalidateSets and the DNs 
> deleted the blocks.
> On restart, the namenode went into safemode, still expecting those blocks 
> before it could leave safemode.
> The likely reason is that the file is deleted in memory and its blocks are 
> added to invalidates before the edits are synced to the editlog file. By that 
> time the NN has asked the DNs to delete the blocks, and the namenode then 
> shuts down before persisting to the editlog (the log falls behind).
> Because of this, we may not get the INFO logs about the delete, and when we 
> restart the namenode (in my scenario it is again a switch), it still expects 
> the deleted blocks, since the delete request was never persisted to the 
> editlog.
> I reproduced this scenario with debug breakpoints. *I feel we should not add 
> the blocks to invalidates before persisting to the editlog*. 
> Note: for the switch, we used kill -9 (force kill).
> I am currently on version 0.20.2. The same was verified on 0.23 in a normal 
> crash + restart scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3679) fuse_dfs notrash option sets usetrash

2012-07-24 Thread Conrad Meyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421449#comment-13421449
 ] 

Conrad Meyer commented on HDFS-3679:


Great, thanks.

> fuse_dfs notrash option sets usetrash
> -
>
> Key: HDFS-3679
> URL: https://issues.apache.org/jira/browse/HDFS-3679
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.0.3, 2.0.0-alpha
>Reporter: Conrad Meyer
>Assignee: Conrad Meyer
>Priority: Minor
> Attachments: hdfs-3679.diff
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> fuse_dfs sets usetrash option when the "notrash" flag is given. This is the 
> exact opposite of the desired behavior. The "usetrash" flag sets usetrash as 
> well, but this is correct. Here are the relevant lines from fuse_options.c, 
> in latest HDFS HEAD[0]:
> 123 case KEY_USETRASH:
> 124   options.usetrash = 1;
> 125   break;
> 126 case KEY_NOTRASH:
> 127   options.usetrash = 1;
> 128   break;
> This is a pretty trivial bug to fix. I'm not familiar with the process here, 
> but I can attach a patch if needed.
> [0]: 
> https://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_options.c?view=markup

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421447#comment-13421447
 ] 

Suresh Srinivas commented on HDFS-3672:
---

bq. are you OK with introducing these as Unstable-annotated APIs
My concern is that if this is used in MapReduce it might be okay, but once it 
starts getting used in other downstream projects, removing it would be a 
challenge.

> Expose disk-location information for blocks to enable better scheduling
> ---
>
> Key: HDFS-3672
> URL: https://issues.apache.org/jira/browse/HDFS-3672
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3672-1.patch
>
>
> Currently, HDFS exposes on which datanodes a block resides, which allows 
> clients to make scheduling decisions for locality and load balancing. 
> Extending this to also expose on which disk on a datanode a block resides 
> would enable even better scheduling, on a per-disk rather than coarse 
> per-datanode basis.
> This API would likely look similar to Filesystem#getFileBlockLocations, but 
> also involve a series of RPCs to the responsible datanodes to determine disk 
> ids.
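
For discussion purposes, one hypothetical shape such an API could take; the 
names below are illustrative only and are not the interface proposed in the 
attached patch.

{code}
import java.io.IOException;
import java.util.List;

// Sketch mirroring FileSystem#getFileBlockLocations, resolved down to disks.
interface DiskLocationSketch {

  /** Opaque identifier for a disk (volume) on a datanode. */
  interface VolumeIdSketch {
  }

  /** Block location augmented with the volume each replica lives on. */
  class BlockDiskLocationSketch {
    String[] hosts;                  // datanodes holding the replicas
    List<VolumeIdSketch> volumes;    // one entry per replica, parallel to hosts
  }

  /** Like getFileBlockLocations, but per disk rather than per datanode. */
  BlockDiskLocationSketch[] getFileBlockDiskLocations(
      String path, long start, long length) throws IOException;
}
{code}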

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3717) Test cases in TestPBHelper fails

2012-07-24 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-3717:


 Summary: Test cases in TestPBHelper fails
 Key: HDFS-3717
 URL: https://issues.apache.org/jira/browse/HDFS-3717
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Kihwal Lee
 Fix For: 3.0.0, 2.2.0-alpha


{{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertBlockCommand}}
{{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertLocatedBlock}}
{{org.apache.hadoop.hdfs.protocolPB.TestPBHelper.testConvertRecoveringBlock}}

They all fail with:

{noformat}
java.lang.AssertionError: Use assertEquals(expected, actual, delta) 
 to compare floating-point numbers
{noformat}
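
The usual remedy for this failure, which may or may not be exactly what the 
attached patch does, is to pass an explicit tolerance to the three-argument 
assertEquals. A self-contained sketch:

{code}
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class FloatingPointAssertSketch {
  @Test
  public void compareDoublesWithDelta() {
    double expected = 0.5;
    double actual = computeRatio();
    // JUnit 4 rejects assertEquals(double, double) without a delta; supplying
    // an explicit tolerance avoids the AssertionError quoted above.
    assertEquals(expected, actual, 0.0001);
  }

  private double computeRatio() {
    return 1.0 / 2.0;
  }
}
{code}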



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3679) fuse_dfs notrash option sets usetrash

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421445#comment-13421445
 ] 

Suresh Srinivas commented on HDFS-3679:
---

I added you as a contributor and assigned the patch to you. +1 for the patch. I 
will commit it soon.

> fuse_dfs notrash option sets usetrash
> -
>
> Key: HDFS-3679
> URL: https://issues.apache.org/jira/browse/HDFS-3679
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.0.3, 2.0.0-alpha
>Reporter: Conrad Meyer
>Priority: Minor
> Attachments: hdfs-3679.diff
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> fuse_dfs sets usetrash option when the "notrash" flag is given. This is the 
> exact opposite of the desired behavior. The "usetrash" flag sets usetrash as 
> well, but this is correct. Here are the relevant lines from fuse_options.c, 
> in latest HDFS HEAD[0]:
> 123 case KEY_USETRASH:
> 124   options.usetrash = 1;
> 125   break;
> 126 case KEY_NOTRASH:
> 127   options.usetrash = 1;
> 128   break;
> This is a pretty trivial bug to fix. I'm not familiar with the process here, 
> but I can attach a patch if needed.
> [0]: 
> https://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_options.c?view=markup

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3679) fuse_dfs notrash option sets usetrash

2012-07-24 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas reassigned HDFS-3679:
-

Assignee: Conrad Meyer

> fuse_dfs notrash option sets usetrash
> -
>
> Key: HDFS-3679
> URL: https://issues.apache.org/jira/browse/HDFS-3679
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.0.3, 2.0.0-alpha
>Reporter: Conrad Meyer
>Assignee: Conrad Meyer
>Priority: Minor
> Attachments: hdfs-3679.diff
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> fuse_dfs sets usetrash option when the "notrash" flag is given. This is the 
> exact opposite of the desired behavior. The "usetrash" flag sets usetrash as 
> well, but this is correct. Here are the relevant lines from fuse_options.c, 
> in latest HDFS HEAD[0]:
> 123 case KEY_USETRASH:
> 124   options.usetrash = 1;
> 125   break;
> 126 case KEY_NOTRASH:
> 127   options.usetrash = 1;
> 128   break;
> This is a pretty trivial bug to fix. I'm not familiar with the process here, 
> but I can attach a patch if needed.
> [0]: 
> https://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_options.c?view=markup

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-07-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421439#comment-13421439
 ] 

Suresh Srinivas commented on HDFS-2815:
---

Uma, thanks for posting the branch-1 patch. The FSNamesystem#deleteInternal() 
method should no longer be synchronized, right, given the internal synchronized 
block?

Also should we add a unit test to this patch?


> Namenode is not coming out of safemode when we perform ( NN crash + restart ) 
> .  Also FSCK report shows blocks missed.
> --
>
> Key: HDFS-2815
> URL: https://issues.apache.org/jira/browse/HDFS-2815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.24.0, 0.23.1, 1.0.0, 1.1.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Critical
> Fix For: 2.0.0-alpha, 3.0.0
>
> Attachments: HDFS-2815-22-branch.patch, HDFS-2815-Branch-1.patch, 
> HDFS-2815.patch, HDFS-2815.patch
>
>
> When testing HA (internal) with continuous switches at roughly 5-minute 
> intervals, some *blocks were missed* and the namenode went into safemode 
> after the next switch.
>
> After analysis, I found that these files had already been deleted by clients, 
> but I do not see any delete command logs in the namenode log files. The 
> namenode nevertheless added those blocks to invalidateSets and the DNs 
> deleted the blocks.
> On restart, the namenode went into safemode, still expecting those blocks 
> before it could leave safemode.
> The likely reason is that the file is deleted in memory and its blocks are 
> added to invalidates before the edits are synced to the editlog file. By that 
> time the NN has asked the DNs to delete the blocks, and the namenode then 
> shuts down before persisting to the editlog (the log falls behind).
> Because of this, we may not get the INFO logs about the delete, and when we 
> restart the namenode (in my scenario it is again a switch), it still expects 
> the deleted blocks, since the delete request was never persisted to the 
> editlog.
> I reproduced this scenario with debug breakpoints. *I feel we should not add 
> the blocks to invalidates before persisting to the editlog*. 
> Note: for the switch, we used kill -9 (force kill).
> I am currently on version 0.20.2. The same was verified on 0.23 in a normal 
> crash + restart scenario.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3703) Decrease the datanode failure detection time

2012-07-24 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421436#comment-13421436
 ] 

Kihwal Lee commented on HDFS-3703:
--

I think there are differences in the definition of failures and the 
recovery/service semantics between HDFS and HBase. Most traditional HDFS use 
cases are better served by best-effort, eventual-success service semantics 
than by an early fail-out. They would rather wait out a long in-disk error 
recovery, for example, than fail the datanode early.

We all know HBase is different in this regard since it's a serving system. The 
three-layered failure propagation model nkeywal described above does not work 
well because the definition of failures is different. Also, the cluster health 
state is the result of a distributed decision, which can take a long time. For 
this reason, even the most obvious failure modes such as fail-stop won't cause 
the global state to be updated immediately. If a time-sensitive client depends 
solely on this state from the NN, it won't get satisfactory results.

In addition to providing more hints from the NN, as Suresh suggested, I believe 
the client has to be smarter, since there is a limit to the freshness of the 
health state the NN can guarantee at scale. Reading the WAL during recovery can 
be made more predictable if the client is allowed to be more aggressive and 
proactive, at the price of additional resource consumption and extra load. If 
its usage is limited and controlled, it should be acceptable to provide such a 
client implementation. I think HDFS-3705 and HDFS-3706 are along these lines. I 
am very much interested in learning more about the HBase requirements and wish 
list.

> Decrease the datanode failure detection time
> 
>
> Key: HDFS-3703
> URL: https://issues.apache.org/jira/browse/HDFS-3703
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, name-node
>Affects Versions: 1.0.3, 2.0.0-alpha
>Reporter: nkeywal
>Assignee: Suresh Srinivas
>
> By default, if a box dies, the datanode will be marked as dead by the 
> namenode after 10:30 minutes. In the meantime, this datanode will still be 
> proposed by the namenode to write blocks or to read replicas. It happens as 
> well if the datanode crashes: there are no shutdown hooks to tell the 
> namenode we're not there anymore.
> It is especially an issue with HBase. The HBase regionserver timeout for 
> production is often 30s. So with these configs, when a box dies HBase starts 
> to recover after 30s while, for about 10 minutes, the namenode still 
> considers the blocks on that box available. Beyond the write errors, this 
> will trigger a lot of missed reads:
> - during the recovery, HBase needs to read the blocks used on the dead box 
> (the ones in the 'HBase Write-Ahead-Log')
> - after the recovery, reading these data blocks (the 'HBase region') will 
> fail 33% of the time with the default number of replicas, slowing down data 
> access, especially when the errors are socket timeouts (i.e. around 60s most 
> of the time). 
> Globally, it would be ideal if the HDFS timeouts could be kept below the 
> HBase ones. 
> As a side note, HBase relies on ZooKeeper to detect regionserver issues.
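
For reference, the 10:30 figure above follows from the namenode's 
heartbeat-expiry arithmetic. A rough sketch, assuming the usual 2.x key names 
(dfs.namenode.heartbeat.recheck-interval, default 5 minutes, and 
dfs.heartbeat.interval, default 3 seconds):

{code}
public class DeadNodeInterval {
  public static void main(String[] args) {
    long recheckIntervalMs = 5 * 60 * 1000; // dfs.namenode.heartbeat.recheck-interval, default 300,000 ms
    long heartbeatIntervalSec = 3;          // dfs.heartbeat.interval, default 3 s
    // The namenode declares a datanode dead after roughly:
    long expiryMs = 2 * recheckIntervalMs + 10 * heartbeatIntervalSec * 1000;
    System.out.println(expiryMs / 1000 + " s"); // 630 s = 10 min 30 s

    // Lowering the recheck interval to 30 s would shrink detection to ~90 s,
    // at the cost of more namenode work and a higher risk of flapping nodes.
    long tunedMs = 2 * 30 * 1000 + 10 * heartbeatIntervalSec * 1000;
    System.out.println(tunedMs / 1000 + " s");  // 90 s
  }
}
{code}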

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3709) TestStartup tests still binding to the ephemeral port

2012-07-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421428#comment-13421428
 ] 

Hudson commented on HDFS-3709:
--

Integrated in Hadoop-Mapreduce-trunk #1146 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1146/])
HDFS-3709. TestStartup tests still binding to the ephemeral port. 
Contributed by Eli Collins (Revision 1364865)

 Result = FAILURE
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1364865
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestStartup.java


> TestStartup tests still binding to the ephemeral port 
> --
>
> Key: HDFS-3709
> URL: https://issues.apache.org/jira/browse/HDFS-3709
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3709.txt
>
>
> HDFS-3517 missed some test cases that bypass the default config. This 
> occasionally causes "port in use" test failures.
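
The usual remedy for this kind of flakiness is to have the test bind to port 0 
so the OS hands out a free ephemeral port. A minimal sketch of that pattern, 
not the actual HDFS-3709 patch, with illustrative key names:

{code}
import org.apache.hadoop.conf.Configuration;

public class EphemeralPortTestConf {
  /** Build a test configuration whose namenode addresses use port 0. */
  public static Configuration newTestConf() {
    Configuration conf = new Configuration();
    // Port 0 lets the OS assign a free ephemeral port, so concurrent or
    // back-to-back test runs never collide on a hard-coded port.
    conf.set("dfs.namenode.rpc-address", "127.0.0.1:0");
    conf.set("dfs.namenode.http-address", "127.0.0.1:0");
    return conf;
  }
}
{code}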

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3697) Enable fadvise readahead by default

2012-07-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421430#comment-13421430
 ] 

Hudson commented on HDFS-3697:
--

Integrated in Hadoop-Mapreduce-trunk #1146 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1146/])
HDFS-3697. Enable fadvise readahead by default. Contributed by Todd Lipcon. 
(Revision 1364698)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1364698
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


> Enable fadvise readahead by default
> ---
>
> Key: HDFS-3697
> URL: https://issues.apache.org/jira/browse/HDFS-3697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 3.0.0, 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3697-branch-1.txt, hdfs-3697.txt, hdfs-3697.txt
>
>
> The fadvise features have been implemented for some time, and we've enabled 
> them in production at a lot of customer sites without difficulty. I'd like to 
> enable the readahead feature by default in future versions so that users get 
> this benefit without any manual configuration required.
> The other fadvise features seem to be more workload-dependent and need 
> further testing before being enabled by default.
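
For completeness, the readahead behavior being enabled here is a datanode 
setting (dfs.datanode.readahead.bytes in 2.x, with the new default around 
4 MB). A hedged sketch of how a site could opt back out, assuming that key 
name and that a non-positive value turns readahead off:

{code}
import org.apache.hadoop.conf.Configuration;

public class ReadaheadOptOut {
  /** Set the datanode readahead length back to 0, which turns readahead off. */
  public static Configuration disableReadahead(Configuration conf) {
    conf.setLong("dfs.datanode.readahead.bytes", 0L);
    return conf;
  }
}
{code}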

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3509) WebHdfsFilesystem does not work within a proxyuser doAs call in secure mode

2012-07-24 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421402#comment-13421402
 ] 

Daryn Sharp commented on HDFS-3509:
---

We need to test this patch in conjunction with my HDFS-3553 change.  Since I 
don't have a working SPNEGO cluster at my disposal, could you please test them? 
 Also, I think this change should go into 1.x too.

> WebHdfsFilesystem does not work within a proxyuser doAs call in secure mode
> ---
>
> Key: HDFS-3509
> URL: https://issues.apache.org/jira/browse/HDFS-3509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
>Priority: Critical
> Attachments: HDFS-3509-branch1.patch, HDFS-3509.patch
>
>
> It does not find Kerberos credentials in the context (the UGI is logged in 
> from a keytab) and it fails with the following trace:
> {code}
> java.lang.IllegalStateException: unknown char '<'(60) in 
> org.mortbay.util.ajax.JSON$ReaderSource@23245e75
>   at org.mortbay.util.ajax.JSON.handleUnknown(JSON.java:788)
>   at org.mortbay.util.ajax.JSON.parse(JSON.java:777)
>   at org.mortbay.util.ajax.JSON.parse(JSON.java:603)
>   at org.mortbay.util.ajax.JSON.parse(JSON.java:183)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.jsonParse(WebHdfsFileSystem.java:259)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:268)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:427)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:722)
> {code}
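
The failing pattern is, roughly, a keytab-logged-in service impersonating an 
end user and using WebHDFS inside doAs(). A minimal sketch of that call path; 
the principal, keytab path, user name and URI below are illustrative only:

{code}
import java.net.URI;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class ProxyUserWebHdfs {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();
    // The service logs in from a keytab, then impersonates the end user.
    UserGroupInformation.loginUserFromKeytab(
        "service/host@EXAMPLE.COM", "/etc/security/keytabs/service.keytab");
    UserGroupInformation proxyUgi = UserGroupInformation.createProxyUser(
        "enduser", UserGroupInformation.getLoginUser());
    // All WebHDFS access happens inside the proxy user's doAs() block.
    FileStatus[] listing = proxyUgi.doAs(
        new PrivilegedExceptionAction<FileStatus[]>() {
          public FileStatus[] run() throws Exception {
            FileSystem fs = FileSystem.get(
                URI.create("webhdfs://namenode:50070/"), conf);
            return fs.listStatus(new Path("/user/enduser"));
          }
        });
    System.out.println(listing.length + " entries");
  }
}
{code}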

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



