[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution

2012-07-20 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419736#comment-13419736
 ] 

Allen Wittenauer commented on HDFS-2617:


a) Without these or similar patches, 2.0 can't run in secure mode on various OS 
distributions without crippling them. i.e., this is clearly a blocker for any 
2.0 stable release and should likely be a block for even non-stable releases.  

b) Upgrades almost always trigger a config change anyway.  Given that we want 
folks to move from hftp to webhdfs, forcing a config change is a good thing, as 
we can tell them to turn on webhdfs at the same time.

> Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
> --
>
> Key: HDFS-2617
> URL: https://issues.apache.org/jira/browse/HDFS-2617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 1.2.0, 2.1.0-alpha
>
> Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, 
> HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, 
> HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, 
> HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch
>
>
> The current approach to secure and authenticate nn web services is based on 
> Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now 
> that we have one, we can get rid of the non-standard KSSL and use SPNEGO 
> throughout.  This will simplify setup and configuration.  Also, Kerberized 
> SSL is a non-standard approach with its own quirks and dark corners 
> (HDFS-2386).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3647) Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1

2012-07-20 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419730#comment-13419730
 ] 

Harsh J commented on HDFS-3647:
---

Suresh - Sorry, added it cause it was a perf-tuning measure. Its OK to remove 
that.

I didn't do it on the old JIRA cause it was marked Closed. If its acceptable to 
edit Closed issues, I'll do that in the future. Apologies for the inconvenience 
:)

> Backport HDFS-2868 (Add number of active transfer threads to the DataNode 
> status) to branch-1
> -
>
> Key: HDFS-3647
> URL: https://issues.apache.org/jira/browse/HDFS-3647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.20.2
>Reporter: Steve Hoffman
>Assignee: Harsh J
> Fix For: 1.2.0
>
> Attachments: HDFS-3647.patch, Screen Shot 2012-07-14 at 12.41.07 
> AM.png
>
>
> Not sure if this is in a newer version of Hadoop, but in CDH3u3 it isn't 
> there.
> There is a lot of mystery surrounding how large to set 
> dfs.datanode.max.xcievers.  Most people say to just up it to 4096, but given 
> that exceeding this will cause an HBase RegionServer shutdown (see Lars' blog 
> post here: http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html), 
> it would be nice if we could expose the current count via the built-in 
> metrics framework (most likely under dfs).  In this way we could watch it to 
> see if we have it set too high, too low, time to bump it up, etc.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution

2012-07-20 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419728#comment-13419728
 ] 

eric baldeschwieler commented on HDFS-2617:
---

I've been talking over the options with various actors to determine where this 
needs to be patched.  This is what I propose:

1) We patch 1.0 as proposed here

2) We do not take these patches to 2.0.

3) We additionally patch the client to try first the SPNEGO token protocol and 
then KSSL if that fails.  We patch both 1.0 and 2.0 HFTP clients to do this.  

---

With these changes we introduce the least possible cruft into 2.0.  And we 
support a gradual transition in the installed base from week to strong, so that 
orgs do not need a DDay config switch, which will require organized validation 
and disruption.

Further the default behavior is right for folks not worrying about this 
transition.

Any concerns with this approach?

> Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
> --
>
> Key: HDFS-2617
> URL: https://issues.apache.org/jira/browse/HDFS-2617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 1.2.0, 2.1.0-alpha
>
> Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, 
> HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, 
> HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, 
> HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch
>
>
> The current approach to secure and authenticate nn web services is based on 
> Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now 
> that we have one, we can get rid of the non-standard KSSL and use SPNEGO 
> throughout.  This will simplify setup and configuration.  Also, Kerberized 
> SSL is a non-standard approach with its own quirks and dark corners 
> (HDFS-2386).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3643) hdfsJniHelper.c unchecked string pointers

2012-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419704#comment-13419704
 ] 

Hadoop QA commented on HDFS-3643:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537422/hdfs3643.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2882//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2882//console

This message is automatically generated.

> hdfsJniHelper.c unchecked string pointers
> -
>
> Key: HDFS-3643
> URL: https://issues.apache.org/jira/browse/HDFS-3643
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs3643.txt
>
>
> {code}
> str = methSignature;
> while (*str != ')') str++;
> str++;
> returnType = *str;
> {code}
> This loop needs to check for {{'\0'}}. Also the following {{if/else if/else 
> if}} cascade doesn't handle unexpected values.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3647) Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1

2012-07-20 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419693#comment-13419693
 ] 

Suresh Srinivas commented on HDFS-3647:
---

BTW removing performance from the component - add it if you disagree.

> Backport HDFS-2868 (Add number of active transfer threads to the DataNode 
> status) to branch-1
> -
>
> Key: HDFS-3647
> URL: https://issues.apache.org/jira/browse/HDFS-3647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.20.2
>Reporter: Steve Hoffman
>Assignee: Harsh J
> Fix For: 1.2.0
>
> Attachments: HDFS-3647.patch, Screen Shot 2012-07-14 at 12.41.07 
> AM.png
>
>
> Not sure if this is in a newer version of Hadoop, but in CDH3u3 it isn't 
> there.
> There is a lot of mystery surrounding how large to set 
> dfs.datanode.max.xcievers.  Most people say to just up it to 4096, but given 
> that exceeding this will cause an HBase RegionServer shutdown (see Lars' blog 
> post here: http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html), 
> it would be nice if we could expose the current count via the built-in 
> metrics framework (most likely under dfs).  In this way we could watch it to 
> see if we have it set too high, too low, time to bump it up, etc.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3647) Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1

2012-07-20 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-3647:
--

Component/s: (was: performance)

> Backport HDFS-2868 (Add number of active transfer threads to the DataNode 
> status) to branch-1
> -
>
> Key: HDFS-3647
> URL: https://issues.apache.org/jira/browse/HDFS-3647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.20.2
>Reporter: Steve Hoffman
>Assignee: Harsh J
> Fix For: 1.2.0
>
> Attachments: HDFS-3647.patch, Screen Shot 2012-07-14 at 12.41.07 
> AM.png
>
>
> Not sure if this is in a newer version of Hadoop, but in CDH3u3 it isn't 
> there.
> There is a lot of mystery surrounding how large to set 
> dfs.datanode.max.xcievers.  Most people say to just up it to 4096, but given 
> that exceeding this will cause an HBase RegionServer shutdown (see Lars' blog 
> post here: http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html), 
> it would be nice if we could expose the current count via the built-in 
> metrics framework (most likely under dfs).  In this way we could watch it to 
> see if we have it set too high, too low, time to bump it up, etc.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3647) Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1

2012-07-20 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419690#comment-13419690
 ] 

Suresh Srinivas commented on HDFS-3647:
---

Additional comment - Lets not create a new jira and attach the patch for 
backporting to the same jira. A separate jira should be created if the patch 
does not apply to older release and requires substantial work. 

> Backport HDFS-2868 (Add number of active transfer threads to the DataNode 
> status) to branch-1
> -
>
> Key: HDFS-3647
> URL: https://issues.apache.org/jira/browse/HDFS-3647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 0.20.2
>Reporter: Steve Hoffman
>Assignee: Harsh J
> Fix For: 1.2.0
>
> Attachments: HDFS-3647.patch, Screen Shot 2012-07-14 at 12.41.07 
> AM.png
>
>
> Not sure if this is in a newer version of Hadoop, but in CDH3u3 it isn't 
> there.
> There is a lot of mystery surrounding how large to set 
> dfs.datanode.max.xcievers.  Most people say to just up it to 4096, but given 
> that exceeding this will cause an HBase RegionServer shutdown (see Lars' blog 
> post here: http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html), 
> it would be nice if we could expose the current count via the built-in 
> metrics framework (most likely under dfs).  In this way we could watch it to 
> see if we have it set too high, too low, time to bump it up, etc.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3647) Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1

2012-07-20 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419688#comment-13419688
 ] 

Suresh Srinivas commented on HDFS-3647:
---

why does the component have performance?

> Backport HDFS-2868 (Add number of active transfer threads to the DataNode 
> status) to branch-1
> -
>
> Key: HDFS-3647
> URL: https://issues.apache.org/jira/browse/HDFS-3647
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 0.20.2
>Reporter: Steve Hoffman
>Assignee: Harsh J
> Fix For: 1.2.0
>
> Attachments: HDFS-3647.patch, Screen Shot 2012-07-14 at 12.41.07 
> AM.png
>
>
> Not sure if this is in a newer version of Hadoop, but in CDH3u3 it isn't 
> there.
> There is a lot of mystery surrounding how large to set 
> dfs.datanode.max.xcievers.  Most people say to just up it to 4096, but given 
> that exceeding this will cause an HBase RegionServer shutdown (see Lars' blog 
> post here: http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html), 
> it would be nice if we could expose the current count via the built-in 
> metrics framework (most likely under dfs).  In this way we could watch it to 
> see if we have it set too high, too low, time to bump it up, etc.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3697) Enable fadvise readahead by default

2012-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419664#comment-13419664
 ] 

Hadoop QA commented on HDFS-3697:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537405/hdfs-3697.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
  
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2880//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2880//console

This message is automatically generated.

> Enable fadvise readahead by default
> ---
>
> Key: HDFS-3697
> URL: https://issues.apache.org/jira/browse/HDFS-3697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 3.0.0, 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-3697.txt, hdfs-3697.txt
>
>
> The fadvise features have been implemented for some time, and we've enabled 
> them in production at a lot of customer sites without difficulty. I'd like to 
> enable the readahead feature by default in future versions so that users get 
> this benefit without any manual configuration required.
> The other fadvise features seem to be more workload-dependent and need 
> further testing before enabling by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3643) hdfsJniHelper.c unchecked string pointers

2012-07-20 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-3643:


Status: Patch Available  (was: Open)

Patch compiles.

> hdfsJniHelper.c unchecked string pointers
> -
>
> Key: HDFS-3643
> URL: https://issues.apache.org/jira/browse/HDFS-3643
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs3643.txt
>
>
> {code}
> str = methSignature;
> while (*str != ')') str++;
> str++;
> returnType = *str;
> {code}
> This loop needs to check for {{'\0'}}. Also the following {{if/else if/else 
> if}} cascade doesn't handle unexpected values.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3643) hdfsJniHelper.c unchecked string pointers

2012-07-20 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-3643:


Attachment: hdfs3643.txt

Attaching simple patch to avoid walking off the end of the string, and to do 
something sane when the return value is incorrectly specified.

> hdfsJniHelper.c unchecked string pointers
> -
>
> Key: HDFS-3643
> URL: https://issues.apache.org/jira/browse/HDFS-3643
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs3643.txt
>
>
> {code}
> str = methSignature;
> while (*str != ')') str++;
> str++;
> returnType = *str;
> {code}
> This loop needs to check for {{'\0'}}. Also the following {{if/else if/else 
> if}} cascade doesn't handle unexpected values.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419642#comment-13419642
 ] 

Eli Collins commented on HDFS-2617:
---

If security was enabled the default was not no authentication, that's why this 
change is incompatible. 

I think the alternative option Eric is referring to is to default 
use-weak-http-crypto to true, which means people with secure clusters who just 
update their bits don't switch from kssl to spnego automatically, ie they'd 
have to explicitly enable spnego via setting use-weak-http-crypto to false.

Fortunately people can override this and ship with use-weak-http-crypto true.


> Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
> --
>
> Key: HDFS-2617
> URL: https://issues.apache.org/jira/browse/HDFS-2617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 1.2.0, 2.1.0-alpha
>
> Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, 
> HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, 
> HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, 
> HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch
>
>
> The current approach to secure and authenticate nn web services is based on 
> Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now 
> that we have one, we can get rid of the non-standard KSSL and use SPNEGO 
> throughout.  This will simplify setup and configuration.  Also, Kerberized 
> SSL is a non-standard approach with its own quirks and dark corners 
> (HDFS-2386).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3626) Creating file with invalid path can corrupt edit log

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419636#comment-13419636
 ] 

Eli Collins commented on HDFS-3626:
---

Sounds good, thanks for the explanation.

> Creating file with invalid path can corrupt edit log
> 
>
> Key: HDFS-3626
> URL: https://issues.apache.org/jira/browse/HDFS-3626
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-3626.txt, hdfs-3626.txt, hdfs-3626.txt
>
>
> Joris Bontje reports the following:
> The following command results in a corrupt NN editlog (note the double slash 
> and reading from stdin):
> $ cat /usr/share/dict/words | hadoop fs -put - 
> hdfs://localhost:8020//path/file
> After this, restarting the namenode will result into the following fatal 
> exception:
> {code}
> 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_173-188
>  expecting start txid #173
> 2012-07-10 06:29:19,912 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation MkdirOp [length=0, path=/, timestamp=1341915658216, 
> permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182]
> java.lang.ArrayIndexOutOfBoundsException: -1
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3553) Hftp proxy tokens are broken

2012-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419628#comment-13419628
 ] 

Hadoop QA commented on HDFS-3553:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12537420/HDFS-3553-2.branch-1.0.patch
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2881//console

This message is automatically generated.

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553.branch-1.0.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3553) Hftp proxy tokens are broken

2012-07-20 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3553:
--

Attachment: HDFS-3553-2.branch-1.0.patch

Change if logic to be closer to what it used to be, set the ugi auth type to 
token if from token, add a bunch of tests that were missing.

> Hftp proxy tokens are broken
> 
>
> Key: HDFS-3553
> URL: https://issues.apache.org/jira/browse/HDFS-3553
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 1.0.2, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Blocker
> Attachments: HDFS-3553-1.branch-1.0.patch, 
> HDFS-3553-2.branch-1.0.patch, HDFS-3553.branch-1.0.patch
>
>
> Proxy tokens are broken for hftp.  The impact is systems using proxy tokens, 
> such as oozie jobs, cannot use hftp.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3697) Enable fadvise readahead by default

2012-07-20 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3697:
--

Description: 
The fadvise features have been implemented for some time, and we've enabled 
them in production at a lot of customer sites without difficulty. I'd like to 
enable the readahead feature by default in future versions so that users get 
this benefit without any manual configuration required.

The other fadvise features seem to be more workload-dependent and need further 
testing before enabling by default.

  was:The fadvise features have been implemented for some time, and we've 
enabled them in production at a lot of customer sites without difficulty. I'd 
like to enable them by default in future versions so that users get this 
benefit without any manual configuration required.


> Enable fadvise readahead by default
> ---
>
> Key: HDFS-3697
> URL: https://issues.apache.org/jira/browse/HDFS-3697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 3.0.0, 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-3697.txt, hdfs-3697.txt
>
>
> The fadvise features have been implemented for some time, and we've enabled 
> them in production at a lot of customer sites without difficulty. I'd like to 
> enable the readahead feature by default in future versions so that users get 
> this benefit without any manual configuration required.
> The other fadvise features seem to be more workload-dependent and need 
> further testing before enabling by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3626) Creating file with invalid path can corrupt edit log

2012-07-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419611#comment-13419611
 ] 

Todd Lipcon commented on HDFS-3626:
---

No test was actually doing that. Just in the course of developing the unit 
tests in this patch, I accidentally did that, and was confused by the NPE. I 
can remove this hunk and the tests should still pass. Sound good?

> Creating file with invalid path can corrupt edit log
> 
>
> Key: HDFS-3626
> URL: https://issues.apache.org/jira/browse/HDFS-3626
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-3626.txt, hdfs-3626.txt, hdfs-3626.txt
>
>
> Joris Bontje reports the following:
> The following command results in a corrupt NN editlog (note the double slash 
> and reading from stdin):
> $ cat /usr/share/dict/words | hadoop fs -put - 
> hdfs://localhost:8020//path/file
> After this, restarting the namenode will result into the following fatal 
> exception:
> {code}
> 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_173-188
>  expecting start txid #173
> 2012-07-10 06:29:19,912 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation MkdirOp [length=0, path=/, timestamp=1341915658216, 
> permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182]
> java.lang.ArrayIndexOutOfBoundsException: -1
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-07-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419604#comment-13419604
 ] 

Todd Lipcon commented on HDFS-3672:
---

A few comments on the initial patch:

- I definitely think we need to separate the API for getting disk locations so 
that you can pass a list of LocatedBlocks. For some of the above-mentioned use 
cases (eg MR scheduler), you need to get the locations for many files, and you 
don't want to have to do a fan-out round for each of the files separately.
- Per above, I agree that we should make the disk IDs opaque. But a single byte 
seems short-sighted. Let's expose them as an interface "DiskId" which can be 
entirely devoid of getters for now -- its only contract would be that it 
properly implements comparison, equals, and hashcode, so users can use them to 
aggregate stats by disk, etc. Internally we can implement it with a wrapper 
around a byte[].
- In the protobuf response, given the above, I think we should do something 
like:
{code}
message Response {
  repeated bytes diskIds;
  repeated uint32 diskIndexes; // for each block, pointers into above diskId 
array, or MAX_INT to indicate blocks not found
}
{code}
- Per above, need to figure out what you're doing for blocks that aren't found 
on a given DN. We also need to specify in the JavaDoc what happens in the 
response for DNs which don't respond. I think it's OK that the result would 
have some "unknown" - it's likely if any of the DNs are down.
- Doing the fan-out RPC does seem important. Unfortunately it might be tricky, 
so I agree we should do it in a separate follow-up optimization.

> Expose disk-location information for blocks to enable better scheduling
> ---
>
> Key: HDFS-3672
> URL: https://issues.apache.org/jira/browse/HDFS-3672
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3672-1.patch
>
>
> Currently, HDFS exposes on which datanodes a block resides, which allows 
> clients to make scheduling decisions for locality and load balancing. 
> Extending this to also expose on which disk on a datanode a block resides 
> would enable even better scheduling, on a per-disk rather than coarse 
> per-datanode basis.
> This API would likely look similar to Filesystem#getFileBlockLocations, but 
> also involve a series of RPCs to the responsible datanodes to determine disk 
> ids.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3583) Convert remaining tests to Junit4

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419596#comment-13419596
 ] 

Hudson commented on HDFS-3583:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2529 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2529/])
Move entry for HDFS-3583 in CHANGES.txt to be under branch-2. (Revision 
1363949)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363949
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Convert remaining tests to Junit4
> -
>
> Key: HDFS-3583
> URL: https://issues.apache.org/jira/browse/HDFS-3583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andrew Wang
>  Labels: newbie
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3583-branch-2.patch, hdfs-3583-part2.patch, 
> hdfs-3583.patch, hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh
>
>
> JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's 
> convert the remaining tests over to Junit4 style.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-07-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419575#comment-13419575
 ] 

Todd Lipcon commented on HDFS-3672:
---

Hey Suresh. I'll try to answer a few of your questions above from the 
perspective of HBase and MR.

The information is useful for clients when they have several tasks to complete 
which involve reading blocks on a given DataNode, but the order of the tasks 
doesn't matter. One example is in HBase: we currently have several compaction 
threads running inside the region server, and those compaction threads do a lot 
of IO. HBase could do a better job of scheduling the compactions if it knew 
which blocks were actually on the same underlying disk. If two blocks are on 
separate disks, you can get 2x the throughput by reading them at the same time, 
whereas if they're on the  same disk, it would be better to schedule them one 
after the other.

You can imagine this feature also being used at some point by MapReduce. 
Consider a map-only job which reads hundreds of blocks located on the same DN. 
When the associated NodeManager asks for a task to run, the application master 
can look at the already-running tasks on that node, understand which disks are 
currently not being read, and schedule a task which accesses an idle disk. 
Another MR use case is to keep track of which local disks the various tasks are 
reading from, and de-prioritize those disks when choosing which local disk on 
which to spill map output to avoid read-write contention.

The other motivation is to eventually correlate these disk IDs with 
statistics/metrics within advanced clients. In HBase, for example, we currently 
always read from the local replica if it is available. If, however, one of the 
local disks is going bad, this can really impact latency, and we'd rather read 
a remote replica instead - the network latency is much less than the cost of 
accessing failing media. But we need to be able to look at a block and know 
which disk it's on in order to track these statistics.

The overall guiding motivation is that we looked at heavily loaded clusters 
with 12 disks and found that we were suffering from pretty significant 
"hotspotting" of disk access. During any given second, about two thirds of the 
disks tend to be at 100% utilization while the others are basically idle. Using 
lsof to look at the number of open blocks on each data volume showed the same 
hotspotting: some disks had multiple tasks reading data whereas others had 
none. With a bit more client visibility into block<->disk correspondence, we 
can try to improve this.

bq. As regards NN knowing about this information, that is one of the 
motivations of HDFS-2832. If each storage volume that corresponds to a disk on 
Datanode has a separate storage ID, NN gets block reports and other stats per 
disk.

I agree HDFS-2832 will really be useful for this. But it's a larger 
restructuring with much bigger implications. This JIRA is just about adding a 
new API which exposes some information that's already available. We explicitly 
chose to make the "disk ID" opaque in the proposed API -- that way when 
HDFS-2832 arrives, we can really easily switch over the _implementation_ to be 
based on the storage IDs without breaking users of the API.

> Expose disk-location information for blocks to enable better scheduling
> ---
>
> Key: HDFS-3672
> URL: https://issues.apache.org/jira/browse/HDFS-3672
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: hdfs-3672-1.patch
>
>
> Currently, HDFS exposes on which datanodes a block resides, which allows 
> clients to make scheduling decisions for locality and load balancing. 
> Extending this to also expose on which disk on a datanode a block resides 
> would enable even better scheduling, on a per-disk rather than coarse 
> per-datanode basis.
> This API would likely look similar to Filesystem#getFileBlockLocations, but 
> also involve a series of RPCs to the responsible datanodes to determine disk 
> ids.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3583) Convert remaining tests to Junit4

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419565#comment-13419565
 ] 

Hudson commented on HDFS-3583:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2574 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2574/])
Move entry for HDFS-3583 in CHANGES.txt to be under branch-2. (Revision 
1363949)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363949
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Convert remaining tests to Junit4
> -
>
> Key: HDFS-3583
> URL: https://issues.apache.org/jira/browse/HDFS-3583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andrew Wang
>  Labels: newbie
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3583-branch-2.patch, hdfs-3583-part2.patch, 
> hdfs-3583.patch, hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh
>
>
> JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's 
> convert the remaining tests over to Junit4 style.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3583) Convert remaining tests to Junit4

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419558#comment-13419558
 ] 

Hudson commented on HDFS-3583:
--

Integrated in Hadoop-Common-trunk-Commit #2509 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2509/])
Move entry for HDFS-3583 in CHANGES.txt to be under branch-2. (Revision 
1363949)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363949
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Convert remaining tests to Junit4
> -
>
> Key: HDFS-3583
> URL: https://issues.apache.org/jira/browse/HDFS-3583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andrew Wang
>  Labels: newbie
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3583-branch-2.patch, hdfs-3583-part2.patch, 
> hdfs-3583.patch, hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh
>
>
> JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's 
> convert the remaining tests over to Junit4 style.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3626) Creating file with invalid path can corrupt edit log

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419556#comment-13419556
 ] 

Eli Collins commented on HDFS-3626:
---

But why do we want DFSTestUtil#createFile to throw an IOE in this case instead? 
 I'm +1 to this if we resolve this in a follow on jira that removes the check 
and fixes up the test that's creating / and is expecting an IOE instead of an 
NPE.

> Creating file with invalid path can corrupt edit log
> 
>
> Key: HDFS-3626
> URL: https://issues.apache.org/jira/browse/HDFS-3626
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-3626.txt, hdfs-3626.txt, hdfs-3626.txt
>
>
> Joris Bontje reports the following:
> The following command results in a corrupt NN editlog (note the double slash 
> and reading from stdin):
> $ cat /usr/share/dict/words | hadoop fs -put - 
> hdfs://localhost:8020//path/file
> After this, restarting the namenode will result into the following fatal 
> exception:
> {code}
> 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_173-188
>  expecting start txid #173
> 2012-07-10 06:29:19,912 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation MkdirOp [length=0, path=/, timestamp=1341915658216, 
> permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182]
> java.lang.ArrayIndexOutOfBoundsException: -1
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3608) fuse_dfs: detect changes in UID ticket cache

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419550#comment-13419550
 ] 

Hudson commented on HDFS-3608:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2528 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2528/])
HDFS-3608. fuse_dfs: detect changes in UID ticket cache. Contributed by 
Colin Patrick McCabe. (Revision 1363904)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363904
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/CMakeLists.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_context_handle.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_dfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_file_handle.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_chmod.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_chown.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_flush.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_getattr.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_mkdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_open.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_read.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_readdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_release.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_rename.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_rmdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_statfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_truncate.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_unlink.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_utimens.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_write.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_init.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/util
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/util/tree.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


> fuse_dfs: detect changes in UID ticket cache
> 
>
> Key: HDFS-3608
> URL: https://issues.apache.org/jira/browse/HDFS-3608
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3608.004.patch, HDFS-3608.006.patch, 
> HDFS-3608.007.patch, HDFS-3608.008.patch, HDFS-3608.009.patch, 
> HDFS-3608.010.patch, HDFS-3608.011.patch, HDFS-3608.patch
>
>
> Currently in fuse_dfs, if one kinits as some principal "foo" and then does 
> some operation on fuse_dfs, then kdestroy and kinit as some principal "bar", 
> subsequent operations done via fuse_dfs will still use cached credentials for 
> "foo". The reason for this is that fuse_dfs caches Filesystem instances using 
> the UID of the user running the command as the key into the cache.  This is a 
> very uncommon scenario, since it's pretty uncommon for a single user to want 
> to use credentials for several different principals on the same box.
> However, we can use inotify to detect changes in the Kerberos ticket cache 
> file and force the next operation to create a new FileSystem instance in that 
> case.  This will also require a reference counting mechanism in fuse_dfs so 
> that

[jira] [Updated] (HDFS-3583) Convert remaining tests to Junit4

2012-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3583:
-

  Resolution: Fixed
   Fix Version/s: (was: 3.0.0)
  2.2.0-alpha
Target Version/s: 2.2.0-alpha  (was: 2.1.0-alpha)
  Status: Resolved  (was: Patch Available)

I've just committed the branch-2 patch and moved the entry in CHANGES.txt on 
trunk.

If there are any more followup patches to remove remnants of Junit3, let's do 
those in separate JIRAs.

Thanks a lot for the contribution, Andrew!

> Convert remaining tests to Junit4
> -
>
> Key: HDFS-3583
> URL: https://issues.apache.org/jira/browse/HDFS-3583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andrew Wang
>  Labels: newbie
> Fix For: 2.2.0-alpha
>
> Attachments: hdfs-3583-branch-2.patch, hdfs-3583-part2.patch, 
> hdfs-3583.patch, hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh
>
>
> JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's 
> convert the remaining tests over to Junit4 style.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3697) Enable fadvise readahead by default

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419539#comment-13419539
 ] 

Eli Collins commented on HDFS-3697:
---

+1  pending jenkins. It's worth calling out in the description that this only 
enables readahead, not drop cache and sync.

> Enable fadvise readahead by default
> ---
>
> Key: HDFS-3697
> URL: https://issues.apache.org/jira/browse/HDFS-3697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 3.0.0, 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-3697.txt, hdfs-3697.txt
>
>
> The fadvise features have been implemented for some time, and we've enabled 
> them in production at a lot of customer sites without difficulty. I'd like to 
> enable them by default in future versions so that users get this benefit 
> without any manual configuration required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3697) Enable fadvise readahead by default

2012-07-20 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3697:
--

Attachment: hdfs-3697.txt

Oops, typo in the XML made it malformed. New one passes xmllint.

> Enable fadvise readahead by default
> ---
>
> Key: HDFS-3697
> URL: https://issues.apache.org/jira/browse/HDFS-3697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 3.0.0, 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-3697.txt, hdfs-3697.txt
>
>
> The fadvise features have been implemented for some time, and we've enabled 
> them in production at a lot of customer sites without difficulty. I'd like to 
> enable them by default in future versions so that users get this benefit 
> without any manual configuration required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3597) SNN can fail to start on upgrade

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419503#comment-13419503
 ] 

Hudson commented on HDFS-3597:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2527 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2527/])
HDFS-3597. SNN fails to start after DFS upgrade. Contributed by Andy 
Isaacson. (Revision 1363899)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363899
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CheckpointSignature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryNameNodeUpgrade.java


> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, 
> hdfs-3597.txt
>
>
> When upgrading from 1.x to 2.0.0, the SecondaryNameNode can fail to start up:
> {code}
> 2012-06-16 09:52:33,812 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in 
> doCheckpoint
> java.io.IOException: Inconsistent checkpoint fields.
> LV = -40 namespaceID = 64415959 cTime = 1339813974990 ; clusterId = 
> CID-07a82b97-8d04-4fdd-b3a1-f40650163245 ; blockpoolId = 
> BP-1792677198-172.29.121.67-1339813967723.
> Expecting respectively: -19; 64415959; 0; ; .
> at 
> org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:120)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:454)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:334)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:301)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> The error check we're hitting came from HDFS-1073, and it's intended to 
> verify that we're connecting to the correct NN.  But the check is too strict 
> and considers "different metadata version" to be the same as "different 
> clusterID".
> I believe the check in {{doCheckpoint}} simply needs to explicitly check for 
> and handle the update case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3697) Enable fadvise readahead by default

2012-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419474#comment-13419474
 ] 

Hadoop QA commented on HDFS-3697:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537392/hdfs-3697.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javac.  The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2879//console

This message is automatically generated.

> Enable fadvise readahead by default
> ---
>
> Key: HDFS-3697
> URL: https://issues.apache.org/jira/browse/HDFS-3697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 3.0.0, 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-3697.txt
>
>
> The fadvise features have been implemented for some time, and we've enabled 
> them in production at a lot of customer sites without difficulty. I'd like to 
> enable them by default in future versions so that users get this benefit 
> without any manual configuration required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3608) fuse_dfs: detect changes in UID ticket cache

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419470#comment-13419470
 ] 

Hudson commented on HDFS-3608:
--

Integrated in Hadoop-Common-trunk-Commit #2507 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2507/])
HDFS-3608. fuse_dfs: detect changes in UID ticket cache. Contributed by 
Colin Patrick McCabe. (Revision 1363904)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363904
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/CMakeLists.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_context_handle.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_dfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_file_handle.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_chmod.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_chown.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_flush.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_getattr.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_mkdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_open.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_read.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_readdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_release.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_rename.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_rmdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_statfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_truncate.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_unlink.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_utimens.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_write.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_init.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/util
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/util/tree.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


> fuse_dfs: detect changes in UID ticket cache
> 
>
> Key: HDFS-3608
> URL: https://issues.apache.org/jira/browse/HDFS-3608
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3608.004.patch, HDFS-3608.006.patch, 
> HDFS-3608.007.patch, HDFS-3608.008.patch, HDFS-3608.009.patch, 
> HDFS-3608.010.patch, HDFS-3608.011.patch, HDFS-3608.patch
>
>
> Currently in fuse_dfs, if one kinits as some principal "foo" and then does 
> some operation on fuse_dfs, then kdestroy and kinit as some principal "bar", 
> subsequent operations done via fuse_dfs will still use cached credentials for 
> "foo". The reason for this is that fuse_dfs caches Filesystem instances using 
> the UID of the user running the command as the key into the cache.  This is a 
> very uncommon scenario, since it's pretty uncommon for a single user to want 
> to use credentials for several different principals on the same box.
> However, we can use inotify to detect changes in the Kerberos ticket cache 
> file and force the next operation to create a new FileSystem instance in that 
> case.  This will also require a reference counting mechanism in fuse_dfs so 
> that we ca

[jira] [Commented] (HDFS-3608) fuse_dfs: detect changes in UID ticket cache

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419467#comment-13419467
 ] 

Hudson commented on HDFS-3608:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2572 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2572/])
HDFS-3608. fuse_dfs: detect changes in UID ticket cache. Contributed by 
Colin Patrick McCabe. (Revision 1363904)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363904
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/CMakeLists.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_connect.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_context_handle.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_dfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_file_handle.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_chmod.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_chown.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_flush.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_getattr.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_mkdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_open.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_read.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_readdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_release.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_rename.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_rmdir.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_statfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_truncate.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_unlink.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_utimens.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_write.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_init.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/util
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/util/tree.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml


> fuse_dfs: detect changes in UID ticket cache
> 
>
> Key: HDFS-3608
> URL: https://issues.apache.org/jira/browse/HDFS-3608
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3608.004.patch, HDFS-3608.006.patch, 
> HDFS-3608.007.patch, HDFS-3608.008.patch, HDFS-3608.009.patch, 
> HDFS-3608.010.patch, HDFS-3608.011.patch, HDFS-3608.patch
>
>
> Currently in fuse_dfs, if one kinits as some principal "foo" and then does 
> some operation on fuse_dfs, then kdestroy and kinit as some principal "bar", 
> subsequent operations done via fuse_dfs will still use cached credentials for 
> "foo". The reason for this is that fuse_dfs caches Filesystem instances using 
> the UID of the user running the command as the key into the cache.  This is a 
> very uncommon scenario, since it's pretty uncommon for a single user to want 
> to use credentials for several different principals on the same box.
> However, we can use inotify to detect changes in the Kerberos ticket cache 
> file and force the next operation to create a new FileSystem instance in that 
> case.  This will also require a reference counting mechanism in fuse_dfs so 
> that we can fr

[jira] [Updated] (HDFS-3697) Enable fadvise readahead by default

2012-07-20 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3697:
--

Status: Patch Available  (was: Open)

> Enable fadvise readahead by default
> ---
>
> Key: HDFS-3697
> URL: https://issues.apache.org/jira/browse/HDFS-3697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 3.0.0, 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-3697.txt
>
>
> The fadvise features have been implemented for some time, and we've enabled 
> them in production at a lot of customer sites without difficulty. I'd like to 
> enable them by default in future versions so that users get this benefit 
> without any manual configuration required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3697) Enable fadvise readahead by default

2012-07-20 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3697:
--

Attachment: hdfs-3697.txt

Attached patch enables the readahead to 4MB by default. Experimentally we've 
determined this provides a good performance boost without too high an increase 
in buffer cache usage.

I also took the liberty of adding documentation for the other fadvise 
parameters to hdfs-default.xml in this patch.

> Enable fadvise readahead by default
> ---
>
> Key: HDFS-3697
> URL: https://issues.apache.org/jira/browse/HDFS-3697
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, performance
>Affects Versions: 3.0.0, 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hdfs-3697.txt
>
>
> The fadvise features have been implemented for some time, and we've enabled 
> them in production at a lot of customer sites without difficulty. I'd like to 
> enable them by default in future versions so that users get this benefit 
> without any manual configuration required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3608) fuse_dfs: detect changes in UID ticket cache

2012-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3608:
-

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've just committed this to trunk and branch-2. Thanks a lot for the 
contribution, Colin.

> fuse_dfs: detect changes in UID ticket cache
> 
>
> Key: HDFS-3608
> URL: https://issues.apache.org/jira/browse/HDFS-3608
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3608.004.patch, HDFS-3608.006.patch, 
> HDFS-3608.007.patch, HDFS-3608.008.patch, HDFS-3608.009.patch, 
> HDFS-3608.010.patch, HDFS-3608.011.patch, HDFS-3608.patch
>
>
> Currently in fuse_dfs, if one kinits as some principal "foo" and then does 
> some operation on fuse_dfs, then kdestroy and kinit as some principal "bar", 
> subsequent operations done via fuse_dfs will still use cached credentials for 
> "foo". The reason for this is that fuse_dfs caches Filesystem instances using 
> the UID of the user running the command as the key into the cache.  This is a 
> very uncommon scenario, since it's pretty uncommon for a single user to want 
> to use credentials for several different principals on the same box.
> However, we can use inotify to detect changes in the Kerberos ticket cache 
> file and force the next operation to create a new FileSystem instance in that 
> case.  This will also require a reference counting mechanism in fuse_dfs so 
> that we can free the FileSystem classes when they refer to previous Kerberos 
> ticket caches.
> Another mechanism is to run a stat periodically on the ticket cache file.  
> This is a good fallback mechanism if inotify does not work on the file (for 
> example, because it's on an NFS mount.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3608) fuse_dfs: detect changes in UID ticket cache

2012-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3608:
-

Target Version/s: 2.2.0-alpha  (was: 2.1.0-alpha)

> fuse_dfs: detect changes in UID ticket cache
> 
>
> Key: HDFS-3608
> URL: https://issues.apache.org/jira/browse/HDFS-3608
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3608.004.patch, HDFS-3608.006.patch, 
> HDFS-3608.007.patch, HDFS-3608.008.patch, HDFS-3608.009.patch, 
> HDFS-3608.010.patch, HDFS-3608.011.patch, HDFS-3608.patch
>
>
> Currently in fuse_dfs, if one kinits as some principal "foo" and then does 
> some operation on fuse_dfs, then kdestroy and kinit as some principal "bar", 
> subsequent operations done via fuse_dfs will still use cached credentials for 
> "foo". The reason for this is that fuse_dfs caches Filesystem instances using 
> the UID of the user running the command as the key into the cache.  This is a 
> very uncommon scenario, since it's pretty uncommon for a single user to want 
> to use credentials for several different principals on the same box.
> However, we can use inotify to detect changes in the Kerberos ticket cache 
> file and force the next operation to create a new FileSystem instance in that 
> case.  This will also require a reference counting mechanism in fuse_dfs so 
> that we can free the FileSystem classes when they refer to previous Kerberos 
> ticket caches.
> Another mechanism is to run a stat periodically on the ticket cache file.  
> This is a good fallback mechanism if inotify does not work on the file (for 
> example, because it's on an NFS mount.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3608) fuse_dfs: detect changes in UID ticket cache

2012-07-20 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3608:
-

Attachment: HDFS-3608.patch

+1, the latest patch looks good to me.

Here's a very slightly updated patch which just changes some wording of the 
descriptions in hdfs-default.xml. I'm going to go ahead and commit this since 
the difference between this and the last is negligible.

> fuse_dfs: detect changes in UID ticket cache
> 
>
> Key: HDFS-3608
> URL: https://issues.apache.org/jira/browse/HDFS-3608
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3608.004.patch, HDFS-3608.006.patch, 
> HDFS-3608.007.patch, HDFS-3608.008.patch, HDFS-3608.009.patch, 
> HDFS-3608.010.patch, HDFS-3608.011.patch, HDFS-3608.patch
>
>
> Currently in fuse_dfs, if one kinits as some principal "foo" and then does 
> some operation on fuse_dfs, then kdestroy and kinit as some principal "bar", 
> subsequent operations done via fuse_dfs will still use cached credentials for 
> "foo". The reason for this is that fuse_dfs caches Filesystem instances using 
> the UID of the user running the command as the key into the cache.  This is a 
> very uncommon scenario, since it's pretty uncommon for a single user to want 
> to use credentials for several different principals on the same box.
> However, we can use inotify to detect changes in the Kerberos ticket cache 
> file and force the next operation to create a new FileSystem instance in that 
> case.  This will also require a reference counting mechanism in fuse_dfs so 
> that we can free the FileSystem classes when they refer to previous Kerberos 
> ticket caches.
> Another mechanism is to run a stat periodically on the ticket cache file.  
> This is a good fallback mechanism if inotify does not work on the file (for 
> example, because it's on an NFS mount.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3697) Enable fadvise readahead by default

2012-07-20 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3697:
-

 Summary: Enable fadvise readahead by default
 Key: HDFS-3697
 URL: https://issues.apache.org/jira/browse/HDFS-3697
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, performance
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor


The fadvise features have been implemented for some time, and we've enabled 
them in production at a lot of customer sites without difficulty. I'd like to 
enable them by default in future versions so that users get this benefit 
without any manual configuration required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3597) SNN can fail to start on upgrade

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419453#comment-13419453
 ] 

Hudson commented on HDFS-3597:
--

Integrated in Hadoop-Common-trunk-Commit #2506 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2506/])
HDFS-3597. SNN fails to start after DFS upgrade. Contributed by Andy 
Isaacson. (Revision 1363899)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363899
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CheckpointSignature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryNameNodeUpgrade.java


> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, 
> hdfs-3597.txt
>
>
> When upgrading from 1.x to 2.0.0, the SecondaryNameNode can fail to start up:
> {code}
> 2012-06-16 09:52:33,812 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in 
> doCheckpoint
> java.io.IOException: Inconsistent checkpoint fields.
> LV = -40 namespaceID = 64415959 cTime = 1339813974990 ; clusterId = 
> CID-07a82b97-8d04-4fdd-b3a1-f40650163245 ; blockpoolId = 
> BP-1792677198-172.29.121.67-1339813967723.
> Expecting respectively: -19; 64415959; 0; ; .
> at 
> org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:120)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:454)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:334)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:301)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> The error check we're hitting came from HDFS-1073, and it's intended to 
> verify that we're connecting to the correct NN.  But the check is too strict 
> and considers "different metadata version" to be the same as "different 
> clusterID".
> I believe the check in {{doCheckpoint}} simply needs to explicitly check for 
> and handle the update case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3597) SNN can fail to start on upgrade

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419450#comment-13419450
 ] 

Hudson commented on HDFS-3597:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2571 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2571/])
HDFS-3597. SNN fails to start after DFS upgrade. Contributed by Andy 
Isaacson. (Revision 1363899)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363899
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CheckpointSignature.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryNameNodeUpgrade.java


> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, 
> hdfs-3597.txt
>
>
> When upgrading from 1.x to 2.0.0, the SecondaryNameNode can fail to start up:
> {code}
> 2012-06-16 09:52:33,812 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in 
> doCheckpoint
> java.io.IOException: Inconsistent checkpoint fields.
> LV = -40 namespaceID = 64415959 cTime = 1339813974990 ; clusterId = 
> CID-07a82b97-8d04-4fdd-b3a1-f40650163245 ; blockpoolId = 
> BP-1792677198-172.29.121.67-1339813967723.
> Expecting respectively: -19; 64415959; 0; ; .
> at 
> org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:120)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:454)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:334)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:301)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> The error check we're hitting came from HDFS-1073, and it's intended to 
> verify that we're connecting to the correct NN.  But the check is too strict 
> and considers "different metadata version" to be the same as "different 
> clusterID".
> I believe the check in {{doCheckpoint}} simply needs to explicitly check for 
> and handle the update case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3626) Creating file with invalid path can corrupt edit log

2012-07-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419449#comment-13419449
 ] 

Todd Lipcon commented on HDFS-3626:
---

I added that change to DFSTestUtil because, without it, we end up calling 
fs.mkdirs(null). That (correctly) throws NullPointerException. I hit this in 
one of the test cases accidentally, but I found it hard to immediately see that 
my test code was in fact at fault (i.e not a true HDFS bug). Does that seem 
reasonable?

> Creating file with invalid path can corrupt edit log
> 
>
> Key: HDFS-3626
> URL: https://issues.apache.org/jira/browse/HDFS-3626
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-3626.txt, hdfs-3626.txt, hdfs-3626.txt
>
>
> Joris Bontje reports the following:
> The following command results in a corrupt NN editlog (note the double slash 
> and reading from stdin):
> $ cat /usr/share/dict/words | hadoop fs -put - 
> hdfs://localhost:8020//path/file
> After this, restarting the namenode will result into the following fatal 
> exception:
> {code}
> 2012-07-10 06:29:19,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> /var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current/edits_173-188
>  expecting start txid #173
> 2012-07-10 06:29:19,912 ERROR 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
> on operation MkdirOp [length=0, path=/, timestamp=1341915658216, 
> permissions=cloudera:supergroup:rwxr-xr-x, opCode=OP_MKDIR, txid=182]
> java.lang.ArrayIndexOutOfBoundsException: -1
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-07-20 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3597:
--

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, 
> hdfs-3597.txt
>
>
> When upgrading from 1.x to 2.0.0, the SecondaryNameNode can fail to start up:
> {code}
> 2012-06-16 09:52:33,812 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in 
> doCheckpoint
> java.io.IOException: Inconsistent checkpoint fields.
> LV = -40 namespaceID = 64415959 cTime = 1339813974990 ; clusterId = 
> CID-07a82b97-8d04-4fdd-b3a1-f40650163245 ; blockpoolId = 
> BP-1792677198-172.29.121.67-1339813967723.
> Expecting respectively: -19; 64415959; 0; ; .
> at 
> org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:120)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:454)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:334)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:301)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> The error check we're hitting came from HDFS-1073, and it's intended to 
> verify that we're connecting to the correct NN.  But the check is too strict 
> and considers "different metadata version" to be the same as "different 
> clusterID".
> I believe the check in {{doCheckpoint}} simply needs to explicitly check for 
> and handle the update case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1108) Log newly allocated blocks

2012-07-20 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419442#comment-13419442
 ] 

Sanjay Radia commented on HDFS-1108:


This is useful for restarting a failed NN  - with manual or automatic restart. 
Hence yes, it is useful for HA in Hadoop 1.

> Log newly allocated blocks
> --
>
> Key: HDFS-1108
> URL: https://issues.apache.org/jira/browse/HDFS-1108
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Reporter: dhruba borthakur
>Assignee: Todd Lipcon
> Fix For: HA branch (HDFS-1623), 1.1.0
>
> Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt, 
> hdfs-1108-habranch.txt, hdfs-1108-habranch.txt, hdfs-1108-habranch.txt, 
> hdfs-1108-habranch.txt, hdfs-1108-hadoop-1-v2.patch, 
> hdfs-1108-hadoop-1-v3.patch, hdfs-1108-hadoop-1-v4.patch, 
> hdfs-1108-hadoop-1-v5.patch, hdfs-1108-hadoop-1.patch, hdfs-1108.txt
>
>
> The current HDFS design says that newly allocated blocks for a file are not 
> persisted in the NN transaction log when the block is allocated. Instead, a 
> hflush() or a close() on the file persists the blocks into the transaction 
> log. It would be nice if we can immediately persist newly allocated blocks 
> (as soon as they are allocated) for specific files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3597) SNN can fail to start on upgrade

2012-07-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419441#comment-13419441
 ] 

Todd Lipcon commented on HDFS-3597:
---

+1, lgtm. Thanks Andy. I'll commit this momentarily.

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, 
> hdfs-3597.txt
>
>
> When upgrading from 1.x to 2.0.0, the SecondaryNameNode can fail to start up:
> {code}
> 2012-06-16 09:52:33,812 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in 
> doCheckpoint
> java.io.IOException: Inconsistent checkpoint fields.
> LV = -40 namespaceID = 64415959 cTime = 1339813974990 ; clusterId = 
> CID-07a82b97-8d04-4fdd-b3a1-f40650163245 ; blockpoolId = 
> BP-1792677198-172.29.121.67-1339813967723.
> Expecting respectively: -19; 64415959; 0; ; .
> at 
> org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:120)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:454)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:334)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:301)
> at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> The error check we're hitting came from HDFS-1073, and it's intended to 
> verify that we're connecting to the correct NN.  But the check is too strict 
> and considers "different metadata version" to be the same as "different 
> clusterID".
> I believe the check in {{doCheckpoint}} simply needs to explicitly check for 
> and handle the update case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3688) Namenode loses datanode hostname if datanode re-registers

2012-07-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419416#comment-13419416
 ] 

Jason Lowe commented on HDFS-3688:
--

Patch failed on trunk since it's only for 23.  Manually ran test-patch for this 
on branch-0.23, here's the output:

{noformat}
-1 overall.  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version ) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.TestDatanodeBlockScanner

+1 contrib tests.  The patch passed contrib unit tests.
{noformat}

The test failure appears to be unrelated, and when I ran the test again 
manually it passed.  I believe this test is known to have some races in it, see 
HDFS-2881.

> Namenode loses datanode hostname if datanode re-registers
> -
>
> Key: HDFS-3688
> URL: https://issues.apache.org/jira/browse/HDFS-3688
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
> Attachments: HDFS-3688.patch
>
>
> If a datanode ever re-registers with the namenode (e.g.: namenode restart, 
> temporary network cut, etc.) then the datanode ends up registering with an IP 
> address as the datanode name rather than the hostname.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3680) Allows customized audit logging in HDFS FSNamesystem

2012-07-20 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419413#comment-13419413
 ] 

Marcelo Vanzin commented on HDFS-3680:
--

Thanks for the comments everyone. Good to know FSNamesystem is a singleton, so 
no need to worry about that issue.

As for queuing / blocking, I understand the concerns, but I don't see how 
they're any different than today. To do something like this today, you'd do one 
of the following:

(i) Process logs post-facto, by tailing the HDFS log file or something along 
those lines.

This would be the "completely off process" model, not affecting the NN 
operation.

(ii) Use a custom log appender that parses log messages inside the NN.

This is almost the same as what my patch does; except it's tied to the log 
system implementation.

Both cases suffer from turning a log message into something expected to be a 
"stable" interface; the second approach (which is doable today, just to make 
that clear) adds on top of that all the concerns you guys listed.

Does anyone know how the different log systems behave when using file loggers, 
which I guess would be the vast majority of cases for this code? Do they do 
queuing, do they block waiting for the message to be written, what happens when 
they flush buffers, what if the log file is on NFS, etc? Lots of the concerns 
raised here are similar to those questions.

I agree that implementations of this interface can do all sorts of bad things, 
but I don't see how that's any worse than today. Unless you guys want to forgo 
using a log system at all for audit logging, and force writing to files as the 
only option, having your own custom code to do it and avoid as many of the 
issues discussed here as possible.

The code could definitely force queuing on this code path; since not everybody 
may need that (the current log approach being the example), I'm wary of turning 
that into a requirement.

So, those out of the way, a few comments about other things:
. audit logging under the namesystem lock: that can be hacked around. One ugly 
way would be to store the audit data in a thread local, and flush it in the 
unlock() methods.

. using the interface for the existing log: that can be easily done; my goal 
with not changing that part was to not change the existing behavior. I could 
use the "AUDITLOG access logger" as the default one, that would be very easy to 
do. A custom access logger would replace it (or we could make the config option 
a list, this allowing the use of both again).

> Allows customized audit logging in HDFS FSNamesystem
> 
>
> Key: HDFS-3680
> URL: https://issues.apache.org/jira/browse/HDFS-3680
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Marcelo Vanzin
>Assignee: Marcelo Vanzin
>Priority: Minor
> Attachments: accesslogger-v1.patch, accesslogger-v2.patch
>
>
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to 
> get audit logs in some log file. But it makes it kinda tricky to store audit 
> logs in any other way (let's say a database), because it would require the 
> code to implement a log appender (and thus know what logging system is 
> actually being used underneath the façade), and parse the textual log message 
> generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3577) WebHdfsFileSystem can not read files larger than 24KB

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419412#comment-13419412
 ] 

Eli Collins commented on HDFS-3577:
---

Shouldn't be an issue in branch-1 I mean. I verified that on a branch-1 build 
from today distcp using webhdfs of a directory that contains a 3gb file works.  
So looks like this is just an issue on trunk/2.x, though per my comment above 
the same distcp does not work even with 3577 applied.

> WebHdfsFileSystem can not read files larger than 24KB
> -
>
> Key: HDFS-3577
> URL: https://issues.apache.org/jira/browse/HDFS-3577
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Blocker
> Fix For: 0.23.3, 2.1.0-alpha
>
> Attachments: h3577_20120705.patch, h3577_20120708.patch, 
> h3577_20120714.patch, h3577_20120716.patch, h3577_20120717.patch
>
>
> If reading a file large enough for which the httpserver running 
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of 
> webhdfs), then the WebHdfsFileSystem client fails with an IOException with 
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem is delegating opening of the inputstream to 
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length* 
> header, but when using chunked transfer encoding the *Content-Length* header 
> is not present and  the *URLOpener.openInputStream()* method thrown an 
> exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3577) WebHdfsFileSystem can not read files larger than 24KB

2012-07-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419410#comment-13419410
 ] 

Daryn Sharp commented on HDFS-3577:
---

Yes, HDFS-3166 (add timeouts) exposed the 200s tail on >2GB files caused by a 
java bug.  The content-length has to be known in order to workaround the java 
bug, and thus avoid the read timeout.

What I think can be done:
* If chunked, eliminate the content-length requirement
* If not chunked, and no content-length, obtain the length from a file stat or 
a HEAD, etc

> WebHdfsFileSystem can not read files larger than 24KB
> -
>
> Key: HDFS-3577
> URL: https://issues.apache.org/jira/browse/HDFS-3577
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Blocker
> Fix For: 0.23.3, 2.1.0-alpha
>
> Attachments: h3577_20120705.patch, h3577_20120708.patch, 
> h3577_20120714.patch, h3577_20120716.patch, h3577_20120717.patch
>
>
> If reading a file large enough for which the httpserver running 
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of 
> webhdfs), then the WebHdfsFileSystem client fails with an IOException with 
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem is delegating opening of the inputstream to 
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length* 
> header, but when using chunked transfer encoding the *Content-Length* header 
> is not present and  the *URLOpener.openInputStream()* method thrown an 
> exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3680) Allows customized audit logging in HDFS FSNamesystem

2012-07-20 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419407#comment-13419407
 ] 

Aaron T. Myers commented on HDFS-3680:
--

bq. While conceptually this jira is good idea, this is in a very critical 
portion of the NN that risks dire performance impacts. I'm inclined to 
think/propose other audit loggers should post-process the audit file in a 
separate process.

While I understand very well the desire to not let users shoot themselves in 
the performance foot, I don't think that the case of writing a custom logger 
implementation is a place where users need to have their hand held. Not a ton 
of folks will want to write a custom logger implementation, and those who do 
should understand the potential impacts this will have on the daemons where 
they're installed.

> Allows customized audit logging in HDFS FSNamesystem
> 
>
> Key: HDFS-3680
> URL: https://issues.apache.org/jira/browse/HDFS-3680
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Marcelo Vanzin
>Assignee: Marcelo Vanzin
>Priority: Minor
> Attachments: accesslogger-v1.patch, accesslogger-v2.patch
>
>
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to 
> get audit logs in some log file. But it makes it kinda tricky to store audit 
> logs in any other way (let's say a database), because it would require the 
> code to implement a log appender (and thus know what logging system is 
> actually being used underneath the façade), and parse the textual log message 
> generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3583) Convert remaining tests to Junit4

2012-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419402#comment-13419402
 ] 

Hadoop QA commented on HDFS-3583:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12537377/hdfs-3583-branch-2.patch
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2878//console

This message is automatically generated.

> Convert remaining tests to Junit4
> -
>
> Key: HDFS-3583
> URL: https://issues.apache.org/jira/browse/HDFS-3583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andrew Wang
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-3583-branch-2.patch, hdfs-3583-part2.patch, 
> hdfs-3583.patch, hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh
>
>
> JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's 
> convert the remaining tests over to Junit4 style.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3577) WebHdfsFileSystem can not read files larger than 24KB

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419396#comment-13419396
 ] 

Eli Collins commented on HDFS-3577:
---

Per offline conversation with Daryn the reason this isn't an issue is that 
HDFS-3166 (Add timeout to Hftp connections) isn't in branch-1 (w/o the timeout 
we just get a 200s lag, not a failure) and 3166 exposed HDFS-3318 (Hftp hangs 
on transfers >2GB) which introduces the check that will fail if the content 
length header is not set (ie 20.x).



> WebHdfsFileSystem can not read files larger than 24KB
> -
>
> Key: HDFS-3577
> URL: https://issues.apache.org/jira/browse/HDFS-3577
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Blocker
> Fix For: 0.23.3, 2.1.0-alpha
>
> Attachments: h3577_20120705.patch, h3577_20120708.patch, 
> h3577_20120714.patch, h3577_20120716.patch, h3577_20120717.patch
>
>
> If reading a file large enough for which the httpserver running 
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of 
> webhdfs), then the WebHdfsFileSystem client fails with an IOException with 
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem is delegating opening of the inputstream to 
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length* 
> header, but when using chunked transfer encoding the *Content-Length* header 
> is not present and  the *URLOpener.openInputStream()* method thrown an 
> exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3583) Convert remaining tests to Junit4

2012-07-20 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-3583:
--

Attachment: hdfs-3583-branch-2.patch

Sorry I missed your comment Aaron, I missed it among all the Hudson messages. 
Attaching a branch-2 version generated by the same process. Tested via {{mvn 
test}}, which compiled and ran the first few tests successfully before I 
ctrl-C'd it.

The one caveat is that the hdfs-raid stuff has since been removed, so those 
tests haven't been auto-fixed.

> Convert remaining tests to Junit4
> -
>
> Key: HDFS-3583
> URL: https://issues.apache.org/jira/browse/HDFS-3583
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andrew Wang
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: hdfs-3583-branch-2.patch, hdfs-3583-part2.patch, 
> hdfs-3583.patch, hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh
>
>
> JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's 
> convert the remaining tests over to Junit4 style.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3680) Allows customized audit logging in HDFS FSNamesystem

2012-07-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419388#comment-13419388
 ] 

Daryn Sharp commented on HDFS-3680:
---

Todd: good points as well about log ordering and whether the impl should 
enforcing queueing, if so, bounded/unbounded queueing.  I'm not sure there's a 
one-size-fits-all, and flexibility could become very complex.  I'm a bit 
squeamish of no queueing and leaving it up to the user since the logging is 
inside the namesystem lock.  It'll be all too easy to add latency or even stall 
the NN entirely trying to log a write op.

While conceptually this jira is good idea, this is in a very critical portion 
of the NN that risks dire performance impacts. I'm inclined to think/propose 
other audit loggers should post-process the audit file in a separate process.




> Allows customized audit logging in HDFS FSNamesystem
> 
>
> Key: HDFS-3680
> URL: https://issues.apache.org/jira/browse/HDFS-3680
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Marcelo Vanzin
>Assignee: Marcelo Vanzin
>Priority: Minor
> Attachments: accesslogger-v1.patch, accesslogger-v2.patch
>
>
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to 
> get audit logs in some log file. But it makes it kinda tricky to store audit 
> logs in any other way (let's say a database), because it would require the 
> code to implement a log appender (and thus know what logging system is 
> actually being used underneath the façade), and parse the textual log message 
> generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3671) ByteRangeInputStream shouldn't require the content length header be present

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419374#comment-13419374
 ] 

Eli Collins commented on HDFS-3671:
---

The test for this is whether you can distcp a 2gb+ file from 0.20 (or 0.21) to 
2.x. Currently that fails so users don't have a way to migrate data from these 
releases (and there are a lot of people on 0.20).

> ByteRangeInputStream shouldn't require the content length header be present
> ---
>
> Key: HDFS-3671
> URL: https://issues.apache.org/jira/browse/HDFS-3671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Priority: Blocker
> Attachments: h3671_20120717.patch, h3671_20120719.patch
>
>
> Per HDFS-3318 the content length header check breaks distcp compatibility 
> with previous releases (0.20.2 and earlier, and 0.21). Like branch-1 this 
> check should be lenient.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-20 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419365#comment-13419365
 ] 

Suresh Srinivas commented on HDFS-3667:
---

Comments for the patch:
# WebHdfsFileSystem.java
#* #getWebHdfs throws unnecessary InterruptedException, URISyntaxException
#* #toIOException() please add some brief comments to describe it
#* Please add javadoc to Runner class
# Unrelated to this patch, I get a warning - The field PutOpParam.Op#NULL is 
hiding a field from type Param>. 
Same for PostOpParam, GetOpParam and DeleteOpParam.
# Remove unnecessary cast to DistributedFileSystem in 
TestDelegationTokenForProxyUser#testDelegationTokenWithRealUser.
# Please add javadoc to TestWebhdfsRetries and describe what test the class is 
for.


bq.  I think this is touching code introduced by HA, so it may be a challenge 
to merge into 23?
As you pointed, with HA changes, we already have an issue in being able to 
merge changes from trunk to 0.23. We may need to do a separate patches in some 
cases.


> Add retry support to WebHdfsFileSystem
> --
>
> Key: HDFS-3667
> URL: https://issues.apache.org/jira/browse/HDFS-3667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3667_20120718.patch
>
>
> DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and it 
> retries on exceptions such as connection failure, safemode.  
> WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419364#comment-13419364
 ] 

Daryn Sharp commented on HDFS-3667:
---

I assumed it also implemented generic retries for non-ha transient failures.  
I'm not saying the patch is bad or wrong, but I'm concerned if it can't be 
pulled back into 23 then it might impede back-merges of future webhdfs bug 
fixes.

> Add retry support to WebHdfsFileSystem
> --
>
> Key: HDFS-3667
> URL: https://issues.apache.org/jira/browse/HDFS-3667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3667_20120718.patch
>
>
> DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and it 
> retries on exceptions such as connection failure, safemode.  
> WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution

2012-07-20 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419337#comment-13419337
 ] 

Owen O'Malley commented on HDFS-2617:
-

Eric, we have talked through the options. We've gotten to the place where the 
user can choose:

# No authentication
# Weak authentication
# Strong authentication

The default has always been no authentication. If someone has bothered to ask 
for strong authentication, our project shouldn't subvert their effort by having 
them use known weak crypto unless they explicitly declare that hftp 
compatibility without a pre-fetched token is more important that the strength 
of their authentication.

> Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
> --
>
> Key: HDFS-2617
> URL: https://issues.apache.org/jira/browse/HDFS-2617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 1.2.0, 2.1.0-alpha
>
> Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, 
> HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, 
> HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, 
> HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch
>
>
> The current approach to secure and authenticate nn web services is based on 
> Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now 
> that we have one, we can get rid of the non-standard KSSL and use SPNEGO 
> throughout.  This will simplify setup and configuration.  Also, Kerberized 
> SSL is a non-standard approach with its own quirks and dark corners 
> (HDFS-2386).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3688) Namenode loses datanode hostname if datanode re-registers

2012-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419333#comment-13419333
 ] 

Hadoop QA commented on HDFS-3688:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537365/HDFS-3688.patch
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2877//console

This message is automatically generated.

> Namenode loses datanode hostname if datanode re-registers
> -
>
> Key: HDFS-3688
> URL: https://issues.apache.org/jira/browse/HDFS-3688
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
> Attachments: HDFS-3688.patch
>
>
> If a datanode ever re-registers with the namenode (e.g.: namenode restart, 
> temporary network cut, etc.) then the datanode ends up registering with an IP 
> address as the datanode name rather than the hostname.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-20 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419328#comment-13419328
 ] 

Suresh Srinivas commented on HDFS-3667:
---

bq. it may be a challenge to merge into 23
Do you see a need for this in 0.23? This should help mostly retries in HA 
scenario.

> Add retry support to WebHdfsFileSystem
> --
>
> Key: HDFS-3667
> URL: https://issues.apache.org/jira/browse/HDFS-3667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3667_20120718.patch
>
>
> DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and it 
> retries on exceptions such as connection failure, safemode.  
> WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3688) Namenode loses datanode hostname if datanode re-registers

2012-07-20 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HDFS-3688:
-

Status: Patch Available  (was: Open)

> Namenode loses datanode hostname if datanode re-registers
> -
>
> Key: HDFS-3688
> URL: https://issues.apache.org/jira/browse/HDFS-3688
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
> Attachments: HDFS-3688.patch
>
>
> If a datanode ever re-registers with the namenode (e.g.: namenode restart, 
> temporary network cut, etc.) then the datanode ends up registering with an IP 
> address as the datanode name rather than the hostname.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3688) Namenode loses datanode hostname if datanode re-registers

2012-07-20 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HDFS-3688:
-

Attachment: HDFS-3688.patch

Patch for branch-0.23 that has the datanode preserve the original hostname when 
registering with the namenode.

> Namenode loses datanode hostname if datanode re-registers
> -
>
> Key: HDFS-3688
> URL: https://issues.apache.org/jira/browse/HDFS-3688
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.23.3
>Reporter: Jason Lowe
> Attachments: HDFS-3688.patch
>
>
> If a datanode ever re-registers with the namenode (e.g.: namenode restart, 
> temporary network cut, etc.) then the datanode ends up registering with an IP 
> address as the datanode name rather than the hostname.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3680) Allows customized audit logging in HDFS FSNamesystem

2012-07-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419323#comment-13419323
 ] 

Todd Lipcon commented on HDFS-3680:
---

Daryn: decent point. Do you think we should just attach a big warning to the 
interface saying "make sure you do queueing", or should we build the queue into 
the core code itself?

The issue with doing our own queueing is that it's still nice to have the log4j 
logs interleave in-order with the other related log messages. I wouldn't want 
the log to be async. But definitely reporting to a JDBC-backed DB should be 
async via a queue.

Another thing to think through with the queue: we probably don't want it to be 
unbounded for fear of OOME. And I imagine some organizations may have a strict 
audit requirement that limits the possibility of lost audit events. In such a 
case, operations probably _should_ block on a full queue - else events could be 
permanently stuck in the queue if the underlying DB is down, and then lost if 
the NN crashes.

Marcelo, any thoughts?

> Allows customized audit logging in HDFS FSNamesystem
> 
>
> Key: HDFS-3680
> URL: https://issues.apache.org/jira/browse/HDFS-3680
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Marcelo Vanzin
>Assignee: Marcelo Vanzin
>Priority: Minor
> Attachments: accesslogger-v1.patch, accesslogger-v2.patch
>
>
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to 
> get audit logs in some log file. But it makes it kinda tricky to store audit 
> logs in any other way (let's say a database), because it would require the 
> code to implement a log appender (and thus know what logging system is 
> actually being used underneath the façade), and parse the textual log message 
> generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution

2012-07-20 Thread eric baldeschwieler (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419192#comment-13419192
 ] 

eric baldeschwieler commented on HDFS-2617:
---

Let's talk the options through and make a proposal.

> Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
> --
>
> Key: HDFS-2617
> URL: https://issues.apache.org/jira/browse/HDFS-2617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 1.2.0, 2.1.0-alpha
>
> Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, 
> HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, 
> HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, 
> HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch
>
>
> The current approach to secure and authenticate nn web services is based on 
> Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now 
> that we have one, we can get rid of the non-standard KSSL and use SPNEGO 
> throughout.  This will simplify setup and configuration.  Also, Kerberized 
> SSL is a non-standard approach with its own quirks and dark corners 
> (HDFS-2386).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-20 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-3696:
-

Description: 
When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM 
if the file size is large. When I tested, 20MB files were fine, but 200MB 
didn't work.  

I also tried reading a large file by issuing "-cat" and piping to a slow sink 
in order to force buffering. The read path didn't have this problem. The memory 
consumption stayed the same regardless of progress.


  was:
When dong "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM 
if the file size is large. When I tested, 20MB files were fine, but 200MB 
didn't work.  

I also tried reading a large file by issuing "-cat" and piping to a slow sink 
in order to force buffering. The read path didn't have this problem. The memory 
consumption stayed the same regardless of progress.



> FsShell put using WebHdfsFileSystem goes OOM when file size is big
> --
>
> Key: HDFS-3696
> URL: https://issues.apache.org/jira/browse/HDFS-3696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Kihwal Lee
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
>
> When doing "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes 
> OOM if the file size is large. When I tested, 20MB files were fine, but 200MB 
> didn't work.  
> I also tried reading a large file by issuing "-cat" and piping to a slow sink 
> in order to force buffering. The read path didn't have this problem. The 
> memory consumption stayed the same regardless of progress.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-20 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419182#comment-13419182
 ] 

Kihwal Lee commented on HDFS-3696:
--

bq. The map heap
The max heap

For reading of 1G piping into a sink that is consuming data at 10 KB/s, VSZ 
stayed at 547420 KB and RSZ 87540 KB.  It doesn't go OOM but the VM size seems 
rather big.

> FsShell put using WebHdfsFileSystem goes OOM when file size is big
> --
>
> Key: HDFS-3696
> URL: https://issues.apache.org/jira/browse/HDFS-3696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Kihwal Lee
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
>
> When dong "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM 
> if the file size is large. When I tested, 20MB files were fine, but 200MB 
> didn't work.  
> I also tried reading a large file by issuing "-cat" and piping to a slow sink 
> in order to force buffering. The read path didn't have this problem. The 
> memory consumption stayed the same regardless of progress.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-20 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419179#comment-13419179
 ] 

Kihwal Lee commented on HDFS-3696:
--

The following stack trace is from doing {{copyFromLocal}} with 140MB file. The 
map heap is 1G (-Xmx1000m) in the client side.

{noformat}
$ hadoop fs -copyFromLocal /tmp/xxx140m 
webhdfs://my.server.blah:50070/user/kihwal/xxx
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2786)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
at sun.net.www.http.PosterOutputStream.write(PosterOutputStream.java:61)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
at java.io.DataOutputStream.write(DataOutputStream.java:90)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:80)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
at 
org.apache.hadoop.fs.shell.CommandWithDestination.copyStreamToTarget(CommandWithDestination.java:240)
at 
org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:219)
at 
org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:165)
at 
org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:150)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
at 
org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
at 
org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:145)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
at 
org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:122)
at 
org.apache.hadoop.fs.shell.CopyCommands$Put.processArguments(CopyCommands.java:204)
at 
org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:254)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:304)
{noformat}

> FsShell put using WebHdfsFileSystem goes OOM when file size is big
> --
>
> Key: HDFS-3696
> URL: https://issues.apache.org/jira/browse/HDFS-3696
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Kihwal Lee
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
>
> When dong "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM 
> if the file size is large. When I tested, 20MB files were fine, but 200MB 
> didn't work.  
> I also tried reading a large file by issuing "-cat" and piping to a slow sink 
> in order to force buffering. The read path didn't have this problem. The 
> memory consumption stayed the same regardless of progress.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3696) FsShell put using WebHdfsFileSystem goes OOM when file size is big

2012-07-20 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-3696:


 Summary: FsShell put using WebHdfsFileSystem goes OOM when file 
size is big
 Key: HDFS-3696
 URL: https://issues.apache.org/jira/browse/HDFS-3696
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Kihwal Lee
Priority: Critical
 Fix For: 0.23.3, 3.0.0, 2.2.0-alpha


When dong "fs -put" to a WebHdfsFileSystem (webhdfs://), the FsShell goes OOM 
if the file size is large. When I tested, 20MB files were fine, but 200MB 
didn't work.  

I also tried reading a large file by issuing "-cat" and piping to a slow sink 
in order to force buffering. The read path didn't have this problem. The memory 
consumption stayed the same regardless of progress.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3671) ByteRangeInputStream shouldn't require the content length header be present

2012-07-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419171#comment-13419171
 ] 

Daryn Sharp commented on HDFS-3671:
---

bq. Tested read/write a 3GB file using WebHDFS with trunk and branch-1. It 
worked fine and the content-length check is unnecessary.

I think trunk is chunking, and branch-1 sets the content-length.  I think Eli's 
concern is about some pre-1.x releases which did neither.

> ByteRangeInputStream shouldn't require the content length header be present
> ---
>
> Key: HDFS-3671
> URL: https://issues.apache.org/jira/browse/HDFS-3671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Priority: Blocker
> Attachments: h3671_20120717.patch, h3671_20120719.patch
>
>
> Per HDFS-3318 the content length header check breaks distcp compatibility 
> with previous releases (0.20.2 and earlier, and 0.21). Like branch-1 this 
> check should be lenient.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3680) Allows customized audit logging in HDFS FSNamesystem

2012-07-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419169#comment-13419169
 ] 

Daryn Sharp commented on HDFS-3680:
---

I quickly scanned the patch, and agree the implementation should probably use a 
chain of {{FSAccessLoggers}} instead of hardcoding log4j + one more.  OTOH, I 
think implementing additional loggers is fraught with peril.  For instance, it 
makes it too easy for someone to write a custom logger which imposes an 
artificial performance penalty (like blocking on a db, instead of queueing 
message for another thread to update in the db).  Sub-par loggers may be 
attributed as NN performance problems which may tarnish hadoop's reputation...


> Allows customized audit logging in HDFS FSNamesystem
> 
>
> Key: HDFS-3680
> URL: https://issues.apache.org/jira/browse/HDFS-3680
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Marcelo Vanzin
>Assignee: Marcelo Vanzin
>Priority: Minor
> Attachments: accesslogger-v1.patch, accesslogger-v2.patch
>
>
> Currently, FSNamesystem writes audit logs to a logger; that makes it easy to 
> get audit logs in some log file. But it makes it kinda tricky to store audit 
> logs in any other way (let's say a database), because it would require the 
> code to implement a log appender (and thus know what logging system is 
> actually being used underneath the façade), and parse the textual log message 
> generated by FSNamesystem.
> I'm attaching a patch that introduces a cleaner interface for this use case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3675) libhdfs: follow documented return codes

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419162#comment-13419162
 ] 

Hudson commented on HDFS-3675:
--

Integrated in Hadoop-Mapreduce-trunk #1142 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1142/])
HDFS-3675. libhdfs: follow documented return codes. Contributed by Colin 
Patrick McCabe (Revision 1363459)

 Result = FAILURE
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363459
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c


> libhdfs: follow documented return codes
> ---
>
> Key: HDFS-3675
> URL: https://issues.apache.org/jira/browse/HDFS-3675
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs
>Affects Versions: 2.0.1-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3675.001.patch, HDFS-3675.002.patch
>
>
> libhdfs should follow its own documentation for return codes.  This means 
> always setting errno, and in most cases returning -1 (not some other value) 
> on error.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1249) with fuse-dfs, chown which only has owner (or only group) argument fails with Input/output error.

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419160#comment-13419160
 ] 

Hudson commented on HDFS-1249:
--

Integrated in Hadoop-Mapreduce-trunk #1142 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1142/])
HDFS-1249. With fuse-dfs, chown which only has owner (or only group) 
argument fails with Input/output error. Contributed by Colin Patrick McCabe 
(Revision 1363466)

 Result = FAILURE
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363466
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/fuse_impls_chown.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test_libhdfs_threaded.c


> with fuse-dfs, chown which only has owner (or only group) argument fails with 
> Input/output error.
> -
>
> Key: HDFS-1249
> URL: https://issues.apache.org/jira/browse/HDFS-1249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 0.20.1, 0.20.2
> Environment: x86 linux (ubuntu 10.04)
>Reporter: matsusaka kentaro
>Assignee: Colin Patrick McCabe
>Priority: Minor
>  Labels: fuse
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-1249.001.patch, HDFS-1249.002.patch, HDFS1249.1
>
>
> with fuse-dfs, chown which only has owner (or only group) argument fails with 
> Input/output error.
> --
> /mnt/hdfs/tmp# chown root file1
> chown: changing ownership of `file1': Input/output error
> /mnt/hdfs/tmp# chown root:root file1
> /mnt/hdfs/tmp# chown :root file1
> chown: changing group of `file1': Input/output error
> --
> I think it should be treated as unchanged for missing part(owner or group) 
> instead of returning an error.
> I took fuse_dfs log and it is saying
> --
> unique: 25, opcode: SETATTR (4), nodeid: 14, insize: 128
> chown /tmp/file1 0 4294967295
> could not lookup group -1
>unique: 25, error: -5 (Input/output error), outsize: 16
> unique: 26, opcode: SETATTR (4), nodeid: 14, insize: 128
> chown /tmp/file1 0 0
> getattr /tmp/file1
>unique: 26, success, outsize: 120
> unique: 27, opcode: SETATTR (4), nodeid: 14, insize: 128
> chown /tmp/file1 4294967295 0
> could not lookup userid -1
>unique: 27, error: -5 (Input/output error), outsize: 16
> --
> therefore this should happen because dfs_chown() in 
> src/contrib/fuse-dfs/src/fuse_impls_chown.c has following
> --
> ...
>   user = getUsername(uid);
>   if (NULL == user) {
> syslog(LOG_ERR,"Could not lookup the user id string %d\n",(int)uid); 
> fprintf(stderr, "could not lookup userid %d\n", (int)uid); 
> ret = -EIO;
>   }
>   if (0 == ret) {
> group = getGroup(gid);
> if (group == NULL) {
>   syslog(LOG_ERR,"Could not lookup the group id string %d\n",(int)gid); 
>   fprintf(stderr, "could not lookup group %d\n", (int)gid); 
>   ret = -EIO;
> } 
>   }
> ...
> --
> but actually, hdfsChown() in src/c++/libhdfs/hdfs.c has this
> --
> ...
> if (owner == NULL && group == NULL) {
>   fprintf(stderr, "Both owner and group cannot be null in chown");
>   errno = EINVAL;
>   return -1;
> }
> ...
> --
> and also, setOwner seems allowing NULL
> --
> username - If it is null, the original username remains unchanged.
> groupname - If it is null, the original groupname remains unchanged.
> --
> according to the api document.
> therefore, I think fuse_impls_chown.c should not treat only user(or only 
> group) lookup fail as an error.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3673) libhdfs: fix some compiler warnings

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419158#comment-13419158
 ] 

Hudson commented on HDFS-3673:
--

Integrated in Hadoop-Mapreduce-trunk #1142 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1142/])
HDFS-3673. libhdfs: fix some compiler warnings. Contributed by Colin 
Patrick McCabe (Revision 1363457)

 Result = FAILURE
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363457
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs_test.h
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/jni_helper.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test/test_libhdfs_read.c
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test_libhdfs_threaded.c


> libhdfs: fix some compiler warnings
> ---
>
> Key: HDFS-3673
> URL: https://issues.apache.org/jira/browse/HDFS-3673
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.0-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3673.001.patch, HDFS-3673.002.patch, 
> HDFS-3673.003.patch
>
>
> Fix some compiler warnings.  Some of these are misuses of fprintf (forgetting 
> the FILE* parameter), const mismatch warnings, format specifier warnings.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3690) BlockPlacementPolicyDefault incorrectly casts LOG

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419157#comment-13419157
 ] 

Hudson commented on HDFS-3690:
--

Integrated in Hadoop-Mapreduce-trunk #1142 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1142/])
HDFS-3690. BlockPlacementPolicyDefault incorrectly casts LOG. Contributed 
by Eli Collins (Revision 1363576)

 Result = FAILURE
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363576
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java


> BlockPlacementPolicyDefault incorrectly casts LOG
> -
>
> Key: HDFS-3690
> URL: https://issues.apache.org/jira/browse/HDFS-3690
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 2.1.0-alpha
>
> Attachments: hdfs-3690.txt
>
>
> The hadoop-tools tests, eg TestCopyListing, fails with
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.commons.logging.impl.SimpleLog cannot be cast to 
> org.apache.commons.logging.impl.Log4JLogger
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.(BlockPlacementPolicyDefault.java:55)
> {noformat}
> caused by this cast
> {code}
>   private static final String enableDebugLogging =
> "For more information, please enable DEBUG log level on "
> + ((Log4JLogger)LOG).getLogger().getName();
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution

2012-07-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419155#comment-13419155
 ] 

Daryn Sharp commented on HDFS-2617:
---

I would prefer compatibility by default, but a flag isn't the end of the world.

Can we pretty please merge this into branch-2?  I know that's an unpopular 
position, but we require at least client-side kssl compat on 2.x for hftp else 
we are going to have a _very hard time_ migrating data from earlier grids.  
Given recent webhdfs jiras, I think it's safe to say it's not yet hardened 
enough to be suitable for mission-critical production environments.  I'm 
confident webhdfs will be shored up in 2.x, so kssl compat can be dropped in 
future releases.

> Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
> --
>
> Key: HDFS-2617
> URL: https://issues.apache.org/jira/browse/HDFS-2617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: security
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 1.2.0, 2.1.0-alpha
>
> Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, 
> HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, HDFS-2617-branch-1.patch, 
> HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, 
> HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch
>
>
> The current approach to secure and authenticate nn web services is based on 
> Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now 
> that we have one, we can get rid of the non-standard KSSL and use SPNEGO 
> throughout.  This will simplify setup and configuration.  Also, Kerberized 
> SSL is a non-standard approach with its own quirks and dark corners 
> (HDFS-2386).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3667) Add retry support to WebHdfsFileSystem

2012-07-20 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419136#comment-13419136
 ] 

Daryn Sharp commented on HDFS-3667:
---

It's just that it touched more than I thought it would.  I assumed the url 
opener would just have reattempts so it'd be a tiny patch.  I'll try to review 
unless someone else beats me.

On a side note, I think this is touching code introduced by HA, so it may be a 
challenge to merge into 23?

> Add retry support to WebHdfsFileSystem
> --
>
> Key: HDFS-3667
> URL: https://issues.apache.org/jira/browse/HDFS-3667
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3667_20120718.patch
>
>
> DFSClient (i.e. DistributedFileSystem) has a configurable retry policy and it 
> retries on exceptions such as connection failure, safemode.  
> WebHdfsFileSystem should have similar retry support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3690) BlockPlacementPolicyDefault incorrectly casts LOG

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419107#comment-13419107
 ] 

Hudson commented on HDFS-3690:
--

Integrated in Hadoop-Hdfs-trunk #1110 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1110/])
HDFS-3690. BlockPlacementPolicyDefault incorrectly casts LOG. Contributed 
by Eli Collins (Revision 1363576)

 Result = FAILURE
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363576
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java


> BlockPlacementPolicyDefault incorrectly casts LOG
> -
>
> Key: HDFS-3690
> URL: https://issues.apache.org/jira/browse/HDFS-3690
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Blocker
> Fix For: 2.1.0-alpha
>
> Attachments: hdfs-3690.txt
>
>
> The hadoop-tools tests, eg TestCopyListing, fails with
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.commons.logging.impl.SimpleLog cannot be cast to 
> org.apache.commons.logging.impl.Log4JLogger
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.(BlockPlacementPolicyDefault.java:55)
> {noformat}
> caused by this cast
> {code}
>   private static final String enableDebugLogging =
> "For more information, please enable DEBUG log level on "
> + ((Log4JLogger)LOG).getLogger().getName();
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever

2012-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419093#comment-13419093
 ] 

Hudson commented on HDFS-3646:
--

Integrated in Hadoop-Hdfs-0.23-Build #319 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/319/])
HDFS-3646. LeaseRenewer can hold reference to inactive DFSClient instances 
forever (Kihwal Lee via daryn) (Revision 1363368)

 Result = SUCCESS
daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1363368
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/LeaseRenewer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSClientAdapter.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend4.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLease.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery2.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRenewer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadWhileWriting.java


> LeaseRenewer can hold reference to inactive DFSClient instances forever
> ---
>
> Key: HDFS-3646
> URL: https://issues.apache.org/jira/browse/HDFS-3646
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3646-branch-23.patch.txt, 
> hdfs-3646-branch-23.patch.txt, hdfs-3646.patch, hdfs-3646.patch.txt, 
> hdfs-3646.patch.txt
>
>
> If {{LeaseRenewer#closeClient()}} is not called, {{LeaseRenewer}} keeps the 
> reference to a {{DFSClient}} instance in {{dfsclients}} forever. This 
> prevents {{DFSClient}}, {{LeaseRenewer}}, conf, etc. from being garbage 
> collected, leading to memory leak.
> {{LeaseRenewer}} should remove the reference after some delay, if a 
> {{DFSClient}} instance no longer has active streams.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3671) ByteRangeInputStream shouldn't require the content length header be present

2012-07-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419008#comment-13419008
 ] 

Hadoop QA commented on HDFS-3671:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537309/h3671_20120719.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2876//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2876//console

This message is automatically generated.

> ByteRangeInputStream shouldn't require the content length header be present
> ---
>
> Key: HDFS-3671
> URL: https://issues.apache.org/jira/browse/HDFS-3671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Priority: Blocker
> Attachments: h3671_20120717.patch, h3671_20120719.patch
>
>
> Per HDFS-3318 the content length header check breaks distcp compatibility 
> with previous releases (0.20.2 and earlier, and 0.21). Like branch-1 this 
> check should be lenient.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3671) ByteRangeInputStream shouldn't require the content length header be present

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418981#comment-13418981
 ] 

Eli Collins commented on HDFS-3671:
---

Checkout my latest comment on HDFS-3577, a 3gb read/write may work but distcp 
still fails.

> ByteRangeInputStream shouldn't require the content length header be present
> ---
>
> Key: HDFS-3671
> URL: https://issues.apache.org/jira/browse/HDFS-3671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Priority: Blocker
> Attachments: h3671_20120717.patch, h3671_20120719.patch
>
>
> Per HDFS-3318 the content length header check breaks distcp compatibility 
> with previous releases (0.20.2 and earlier, and 0.21). Like branch-1 this 
> check should be lenient.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3577) WebHdfsFileSystem can not read files larger than 24KB

2012-07-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418974#comment-13418974
 ] 

Eli Collins commented on HDFS-3577:
---

Hey Nicholas,

Did you test this with distcp? Trying to distcp from a recent trunk build with 
this change still fails with *Content-Length header is missing*. Hadoop fs -get 
using webhdfs with the same file works.

{noformat}
12/07/19 23:56:43 INFO mapreduce.Job: Task Id : 
attempt_1342766959778_0002_m_00_0, Status : FAILED
Error: java.io.IOException: File copy failed: 
webhdfs://eli-thinkpad:50070/user/eli/data1/big.iso --> 
hdfs://localhost:8020/user/eli/data4/data1/big.iso
at 
org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:154)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:149)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying 
webhdfs://eli-thinkpad:50070/user/eli/data1/big.iso to 
hdfs://localhost:8020/user/eli/data4/data1/big.iso
at 
org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
at 
org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
... 10 more
Caused by: 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: 
java.io.IOException: Content-Length header is missing
at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:201)
at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:167)
at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:112)
at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:90)
at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:71)
at 
org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
... 11 more
Caused by: java.io.IOException: Content-Length header is missing
at 
org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:125)
at 
org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
at 
org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:158)
at java.io.DataInputStream.read(DataInputStream.java:132)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at java.io.FilterInputStream.read(FilterInputStream.java:90)
at 
org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:70)
at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:198)
... 16 more
{noformat}

> WebHdfsFileSystem can not read files larger than 24KB
> -
>
> Key: HDFS-3577
> URL: https://issues.apache.org/jira/browse/HDFS-3577
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Alejandro Abdelnur
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Blocker
> Fix For: 0.23.3, 2.1.0-alpha
>
> Attachments: h3577_20120705.patch, h3577_20120708.patch, 
> h3577_20120714.patch, h3577_20120716.patch, h3577_20120717.patch
>
>
> If reading a file large enough for which the httpserver running 
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of 
> webhdfs), then the WebHdfsFileSystem client fails with an IOException with 
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem is delegating opening of the inputstream to 
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length* 
> header, but when using chunked transfer encoding the *Content-Length* header 
> is not present and  the *URLOpener.openInputStream()* method thrown an 
> exception.

--
This message is automatically generated by JIRA.
If you think it wa