[jira] [Commented] (HADOOP-9221) Convert remaining xdocs to APT

2013-02-01 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569093#comment-13569093
 ] 

Andy Isaacson commented on HADOOP-9221:
---

bq. Are all the new docs linked from somewhere? 

I've added the links in HDFS-4460.

> Convert remaining xdocs to APT
> --
>
> Key: HADOOP-9221
> URL: https://issues.apache.org/jira/browse/HADOOP-9221
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Fix For: 2.0.3-alpha, 0.23.7
>
> Attachments: hadoop9221-1.txt, hadoop9221-2.txt, hadoop9221.txt
>
>
> The following Forrest XML documents are still present in trunk:
> {noformat}
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/Superusers.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/deployment_layout.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/native_libraries.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/single_node_setup.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/SLG_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hftp.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/libhdfs.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/webhdfs.xml
> {noformat}
> Several of them are leftover cruft, and all of them are out of date to one 
> degree or another, but it's easiest to simply convert them all to APT and 
> move forward with editing thereafter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9221) Convert remaining xdocs to APT

2013-01-29 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9221:
--

Attachment: hadoop9221-2.txt

Update patch after HADOOP-9190.

> Convert remaining xdocs to APT
> --
>
> Key: HADOOP-9221
> URL: https://issues.apache.org/jira/browse/HADOOP-9221
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hadoop9221-1.txt, hadoop9221-2.txt, hadoop9221.txt
>
>
> The following Forrest XML documents are still present in trunk:
> {noformat}
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/Superusers.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/deployment_layout.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/native_libraries.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/single_node_setup.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/SLG_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hftp.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/libhdfs.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/webhdfs.xml
> {noformat}
> Several of them are leftover cruft, and all of them are out of date to one 
> degree or another, but it's easiest to simply convert them all to APT and 
> move forward with editing thereafter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9253) Capture ulimit info in the logs at service start time

2013-01-28 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564996#comment-13564996
 ] 

Andy Isaacson commented on HADOOP-9253:
---

bq. {{head "$log"}} Is something that existed before and hence i left it as is.

Previously it made sense, since {{$log}} was probably only a few lines long.
Now that your change guarantees {{$log}} will be more than 10 lines long,
please adjust the {{head}} command accordingly.

The reason for using {{head}} here is that there may be a few lines of output
in the log that would be helpful for debugging, but it's also possible that
the log has thousands of lines of errors which would not be.  With {{head}}
you get the first few errors and avoid potentially dumping megabytes of errors
to the terminal.  Please preserve that behavior.  Since you're adding 17 lines
of output, perhaps add 17 lines to the number that {{head}} will print.
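
To make that concrete, a minimal sketch of the suggested adjustment (the counts
are illustrative, not taken from the actual patch):
{noformat}
sleep 1
# capture the ulimit report in the service log
echo "ulimit -a" >> "$log"
ulimit -a >> "$log" 2>&1
# print enough of the log to cover any startup errors plus the ~17 lines
# appended above (10 default head lines + 17 = 27)
head -n 27 "$log"
{noformat}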

> Capture ulimit info in the logs at service start time
> -
>
> Key: HADOOP-9253
> URL: https://issues.apache.org/jira/browse/HADOOP-9253
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 1.1.1, 2.0.2-alpha
>Reporter: Arpit Gupta
>Assignee: Arpit Gupta
> Attachments: HADOOP-9253.branch-1.patch, HADOOP-9253.branch-1.patch, 
> HADOOP-9253.patch
>
>
> output of ulimit -a is helpful while debugging issues on the system.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9253) Capture ulimit info in the logs at service start time

2013-01-28 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564968#comment-13564968
 ] 

Andy Isaacson commented on HADOOP-9253:
---

bq. I am not quite sure i understand what you are referring to. The log file 
that is being printed to the console should never have any left over contents 
as start commands overwrites it.

Your patch has:
{noformat}
+++ hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh
@@ -154,7 +154,11 @@ case $startStop in
   ;;
 esac
 echo $! > $pid
-sleep 1; head "$log"
+sleep 1
+# capture the ulimit output
+echo "ulimit -a" >> $log
+ulimit -a >> $log 2>&1
+head "$log"
{noformat}

The file {{$log}} might be empty, or it might have some content from the 
'nohup' command line a few lines up.  Regardless, your patch then adds two 
commands (echo, then ulimit) that {{>>}} append to {{$log}}. Together those 
will append 17 lines of output to {{$log}}.

Then you use {{head}} to print out the first 10 lines of {{$log}}.  These 10 
lines might include some errors or warning messages from nohup, and then a few 
lines of the 17 that were printed by ulimit.

So I have two feedback items: 1. It's unclear why {{ulimit}} output should be 
written to {{$log}} at all; why not just write it directly to the console? 
2. If it is written to {{$log}}, why use {{head}} to truncate the output?  At 
least change the {{head}} command to print the entire expected output, 
{{head -20}} or similar.
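
For comparison, a sketch of the first alternative, with the ulimit report going
straight to the console instead of into {{$log}} (illustrative only):
{noformat}
sleep 1
head "$log"
# print the ulimit report directly to the console; $log keeps only the
# daemon's own startup output
echo "ulimit -a:"
ulimit -a
{noformat}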

> Capture ulimit info in the logs at service start time
> -
>
> Key: HADOOP-9253
> URL: https://issues.apache.org/jira/browse/HADOOP-9253
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 1.1.1, 2.0.2-alpha
>Reporter: Arpit Gupta
>Assignee: Arpit Gupta
> Attachments: HADOOP-9253.branch-1.patch, HADOOP-9253.branch-1.patch, 
> HADOOP-9253.patch
>
>
> output of ulimit -a is helpful while debugging issues on the system.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8427) Convert Forrest docs to APT, incremental

2013-01-28 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564802#comment-13564802
 ] 

Andy Isaacson commented on HADOOP-8427:
---

bq. Is there a jira that tracks the remaining work? Noticed there's still an 
xdocs directory.

HADOOP-9190
HADOOP-9221

> Convert Forrest docs to APT, incremental
> 
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Fix For: 2.0.3-alpha, 0.23.6
>
> Attachments: hadoop8427-1.txt, hadoop8427-3.txt, hadoop8427-4.txt, 
> hadoop8427-5.txt, HADOOP-8427.sh, hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9253) Capture ulimit info in the logs at service start time

2013-01-28 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564628#comment-13564628
 ] 

Andy Isaacson commented on HADOOP-9253:
---

It's pretty odd to append to {{$log}} using {{>>}} and then print only the 
beginning of {{$log}} using {{head}}.  This results in the output duplicating 
the previous stanza's leftover contents of {{$log}}.
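
A small sketch of the scenario described above, assuming {{$log}} still holds
output from an earlier start:
{noformat}
echo "ulimit -a" >> $log
ulimit -a >> $log 2>&1
head "$log"   # prints the first 10 lines, i.e. the leftover old output,
              # while the freshly appended ulimit report is not shown
{noformat}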

> Capture ulimit info in the logs at service start time
> -
>
> Key: HADOOP-9253
> URL: https://issues.apache.org/jira/browse/HADOOP-9253
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 1.1.1, 2.0.2-alpha
>Reporter: Arpit Gupta
>Assignee: Arpit Gupta
> Attachments: HADOOP-9253.branch-1.patch, HADOOP-9253.patch
>
>
> output of ulimit -a is helpful while debugging issues on the system.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9190) packaging docs is broken

2013-01-18 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9190:
--

Attachment: hadoop9190-1.txt

New version of patch removing Forrest from BUILDING.txt

In addition the following files can be deleted:
{noformat}
svn rm hadoop-common-project/hadoop-common/src/main/docs/forrest.properties
svn rm hadoop-hdfs-project/hadoop-hdfs/src/main/docs/forrest.properties
{noformat}

> packaging docs is broken
> 
>
> Key: HADOOP-9190
> URL: https://issues.apache.org/jira/browse/HADOOP-9190
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Thomas Graves
>Assignee: Andy Isaacson
> Attachments: hadoop9190-1.txt, hadoop9190.txt
>
>
> It looks like after the docs got converted to apt format in HADOOP-8427, mvn 
> site package -Pdist,docs no longer works.  If you run mvn site or mvn 
> site:stage by itself, they work fine; it's when you go to package it that it 
> breaks.
> The error is with broken links, here is one of them:
> <broken-links>
> message="hadoop-common-project/hadoop-common/target/docs-src/src/documentation/content/xdocs/HttpAuthentication.xml
>  (No such file or directory)" uri="HttpAuthentication.html">

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9206) "Setting up a Single Node Cluster" instructions need improvement in 0.23.5/2.0.2-alpha branches

2013-01-18 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557723#comment-13557723
 ] 

Andy Isaacson commented on HADOOP-9206:
---

I've converted the xdoc to SingleNodeSetup.apt.vm in HADOOP-9221.

> "Setting up a Single Node Cluster" instructions need improvement in 
> 0.23.5/2.0.2-alpha branches
> ---
>
> Key: HADOOP-9206
> URL: https://issues.apache.org/jira/browse/HADOOP-9206
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Glen Mazza
>
> Hi, in contrast to the easy-to-follow 1.0.4 instructions 
> (http://hadoop.apache.org/docs/r1.0.4/single_node_setup.html) the 0.23.5 and 
> 2.0.2-alpha instructions 
> (http://hadoop.apache.org/docs/r2.0.2-alpha/hadoop-yarn/hadoop-yarn-site/SingleCluster.html)
>  need more clarification -- it seems to be written for people who already 
> know and understand hadoop.  In particular, these points need clarification:
> 1.) Text: "You should be able to obtain the MapReduce tarball from the 
> release."
> Question: What is the MapReduce tarball?  What is its name?  I don't see such 
> an object within the hadoop-0.23.5.tar.gz download.
> 2.) Quote: "NOTE: You will need protoc installed of version 2.4.1 or greater."
> Protoc doesn't have a website you can link to (it's just mentioned offhand 
> when you Google it) -- is it really the case today that Hadoop has a 
> dependency on such a minor project?  At any rate, if you can have a link of 
> where one goes to get/install Protoc that would be good.
> 3.) Quote: "Assuming you have installed hadoop-common/hadoop-hdfs and 
> exported $HADOOP_COMMON_HOME/$HADOOP_HDFS_HOME, untar hadoop mapreduce 
> tarball and set environment variable $HADOOP_MAPRED_HOME to the untarred 
> directory."
> I'm not sure what you mean by the forward slashes: hadoop-common/hadoop-hdfs 
> and $HADOOP_COMMON_HOME/$HADOOP_HDFS_HOME -- do you mean & (install both) or 
> *or* just install one of the two?  This needs clarification--please remove 
> the forward slash and replace it with what you're trying to say.  The 
> audience here is complete newbie and they've been brought to this page from 
> here: http://hadoop.apache.org/docs/r0.23.5/ (same with r2.0.2-alpha/) 
> (quote: "Getting Started - The Hadoop documentation includes the information 
> you need to get started using Hadoop. Begin with the Single Node Setup which 
> shows you how to set up a single-node Hadoop installation."), they've 
> downloaded hadoop-0.23.5.tar.gz and want to know what to do next.  Why are 
> there potentially two applications -- hadoop-common and hadoop-hdfs and not 
> just one?  (The download doesn't appear to have two separate apps) -- if 
> there is indeed just one app and we remove the other from the above text to 
> avoid confusion?
> Again, I just downloaded hadoop-0.23.5.tar.gz -- do I need to download more?  
> If so, let us know in the docs here.
> Also, the fragment: "Assuming you have installed 
> hadoop-common/hadoop-hdfs..."  No, I haven't, that's what *this* page is 
> supposed to explain to me how to do -- how do I install these two (or just 
> one of these two)?
> Also, what do I set $HADOOP_COMMON_HOME and/or $HADOOP_HDFS_HOME to?
> 4.) Quote: "NOTE: The following instructions assume you have hdfs running."  
> No, I don't--how do I do this?  Again, this page is supposed to teach me that.
> 5.) Quote: "To start the ResourceManager and NodeManager, you will have to 
> update the configs. Assuming your $HADOOP_CONF_DIR is the configuration 
> directory..."
> Could you clarify here what the "configuration directory" is, it doesn't 
> exist in the 0.23.5 download.  I just see 
> bin,etc,include,lib,libexec,sbin,share folders but no "conf" one.)
> 6.) Quote: "Assuming that the environment variables $HADOOP_COMMON_HOME, 
> $HADOOP_HDFS_HOME, $HADOO_MAPRED_HOME, $YARN_HOME, $JAVA_HOME and 
> $HADOOP_CONF_DIR have been set appropriately."
> We'll need to know what to set YARN_HOME to here.
> Thanks!
> Glen

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9221) Convert remaining xdocs to APT

2013-01-18 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9221:
--

Attachment: hadoop9221-1.txt

Attaching git diff.  This version works around a Maven/APT bug, 
http://jira.codehaus.org/browse/DOXIASITETOOLS-68, that causes an NPE in Maven, 
and also fixes a bunch of formatting failures in the previous version.

> Convert remaining xdocs to APT
> --
>
> Key: HADOOP-9221
> URL: https://issues.apache.org/jira/browse/HADOOP-9221
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hadoop9221-1.txt, hadoop9221.txt
>
>
> The following Forrest XML documents are still present in trunk:
> {noformat}
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/Superusers.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/deployment_layout.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/native_libraries.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/single_node_setup.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/SLG_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hftp.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/libhdfs.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/webhdfs.xml
> {noformat}
> Several of them are leftover cruft, and all of them are out of date to one 
> degree or another, but it's easiest to simply convert them all to APT and 
> move forward with editing thereafter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9221) Convert remaining xdocs to APT

2013-01-17 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9221:
--

Attachment: hadoop9221.txt

Attaching a {{git diff}} output showing file deletions et cetera.

Converted:
{noformat}
NativeLibraries.apt.vm
ServiceLevelAuth.apt.vm 
SingleNodeSetup.apt.vm
Superusers.apt.vm
FaultInjectFramework.apt.vm
HdfsEditsViewer.apt.vm
HdfsImageViewer.apt.vm
HdfsPermissionsGuide.apt.vm
HdfsQuotaAdminGuide.apt.vm
HdfsUserGuide.apt.vm
Hftp.apt.vm
LibHdfs.apt.vm
SLGUserGuide.apt.vm
{noformat}

Discarded:

{{deployment_layout.xml}} does not seem to have any relevant information for 
users; it documented a planned reorganization which seems to have been 
subsumed/superseded by Bigtop.  Deleted.

> Convert remaining xdocs to APT
> --
>
> Key: HADOOP-9221
> URL: https://issues.apache.org/jira/browse/HADOOP-9221
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hadoop9221.txt
>
>
> The following Forrest XML documents are still present in trunk:
> {noformat}
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/Superusers.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/deployment_layout.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/native_libraries.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/single_node_setup.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/SLG_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hftp.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/libhdfs.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/webhdfs.xml
> {noformat}
> Several of them are leftover cruft, and all of them are out of date to one 
> degree or another, but it's easiest to simply convert them all to APT and 
> move forward with editing thereafter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9221) Convert remaining xdocs to APT

2013-01-17 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9221:
--

Affects Version/s: 3.0.0

> Convert remaining xdocs to APT
> --
>
> Key: HADOOP-9221
> URL: https://issues.apache.org/jira/browse/HADOOP-9221
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>
> The following Forrest XML documents are still present in trunk:
> {noformat}
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/Superusers.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/deployment_layout.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/native_libraries.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/single_node_setup.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/SLG_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hftp.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/libhdfs.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/webhdfs.xml
> {noformat}
> Several of them are leftover cruft, and all of them are out of date to one 
> degree or another, but it's easiest to simply convert them all to APT and 
> move forward with editing thereafter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9221) Convert remaining xdocs to APT

2013-01-17 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9221:
--

Affects Version/s: 2.0.2-alpha

> Convert remaining xdocs to APT
> --
>
> Key: HADOOP-9221
> URL: https://issues.apache.org/jira/browse/HADOOP-9221
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>
> The following Forrest XML documents are still present in trunk:
> {noformat}
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/Superusers.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/deployment_layout.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/native_libraries.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/single_node_setup.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/SLG_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hftp.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/libhdfs.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/webhdfs.xml
> {noformat}
> Several of them are leftover cruft, and all of them are out of date to one 
> degree or another, but it's easiest to simply convert them all to APT and 
> move forward with editing thereafter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HADOOP-9221) Convert remaining xdocs to APT

2013-01-17 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson reassigned HADOOP-9221:
-

Assignee: Andy Isaacson

> Convert remaining xdocs to APT
> --
>
> Key: HADOOP-9221
> URL: https://issues.apache.org/jira/browse/HADOOP-9221
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>
> The following Forrest XML documents are still present in trunk:
> {noformat}
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/Superusers.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/deployment_layout.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/native_libraries.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml
> hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/single_node_setup.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/SLG_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hftp.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/libhdfs.xml
> hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/webhdfs.xml
> {noformat}
> Several of them are leftover cruft, and all of them are out of date to one 
> degree or another, but it's easiest to simply convert them all to APT and 
> move forward with editing thereafter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HADOOP-9221) Convert remaining xdocs to APT

2013-01-17 Thread Andy Isaacson (JIRA)
Andy Isaacson created HADOOP-9221:
-

 Summary: Convert remaining xdocs to APT
 Key: HADOOP-9221
 URL: https://issues.apache.org/jira/browse/HADOOP-9221
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Andy Isaacson


The following Forrest XML documents are still present in trunk:
{noformat}
hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/Superusers.xml
hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/deployment_layout.xml
hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/native_libraries.xml
hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/service_level_auth.xml
hadoop-common-project/hadoop-common/src/main/docs/src/documentation/content/xdocs/single_node_setup.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/SLG_user_guide.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/faultinject_framework.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_editsviewer.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_imageviewer.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_permissions_guide.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_quota_admin_guide.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_user_guide.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hftp.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/libhdfs.xml
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/webhdfs.xml
{noformat}

Several of them are leftover cruft, and all of them are out of date to one 
degree or another, but it's easiest to simply convert them all to APT and move 
forward with editing thereafter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9190) packaging docs is broken

2013-01-17 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9190:
--

Attachment: hadoop9190.txt

Attaching a minimal patch that removes Forrest from the build process.  While 
Forrest does do some useful things in the build, its primary output is built 
documents, which we do not ship.

In a separate JIRA I'll convert the rest of the docs from Forrest to APT.

> packaging docs is broken
> 
>
> Key: HADOOP-9190
> URL: https://issues.apache.org/jira/browse/HADOOP-9190
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Thomas Graves
>Assignee: Andy Isaacson
> Attachments: hadoop9190.txt
>
>
> It looks like after the docs got converted to apt format in HADOOP-8427, mvn 
> site package -Pdist,docs no longer works.  If you run mvn site or mvn 
> site:stage by itself, they work fine; it's when you go to package it that it 
> breaks.
> The error is with broken links, here is one of them:
> <broken-links>
> message="hadoop-common-project/hadoop-common/target/docs-src/src/documentation/content/xdocs/HttpAuthentication.xml
>  (No such file or directory)" uri="HttpAuthentication.html">

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9190) packaging docs is broken

2013-01-17 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9190:
--

Status: Patch Available  (was: Open)

> packaging docs is broken
> 
>
> Key: HADOOP-9190
> URL: https://issues.apache.org/jira/browse/HADOOP-9190
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Thomas Graves
>Assignee: Andy Isaacson
> Attachments: hadoop9190.txt
>
>
> It looks like after the docs got converted to apt format in HADOOP-8427, mvn 
> site package -Pdist,docs no longer works.  If you run mvn site or mvn 
> site:stage by itself, they work fine; it's when you go to package it that it 
> breaks.
> The error is with broken links, here is one of them:
> <broken-links>
> message="hadoop-common-project/hadoop-common/target/docs-src/src/documentation/content/xdocs/HttpAuthentication.xml
>  (No such file or directory)" uri="HttpAuthentication.html">

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9193) hadoop script can inadvertently expand wildcard arguments when delegating to hdfs script

2013-01-15 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554284#comment-13554284
 ] 

Andy Isaacson commented on HADOOP-9193:
---

The TestZKFailoverController failure is unrelated.  There does not seem to be 
any existing test code for the {{hadoop dfs}} shell scripts, so adding tests 
for this condition is challenging.

> hadoop script can inadvertently expand wildcard arguments when delegating to 
> hdfs script
> 
>
> Key: HADOOP-9193
> URL: https://issues.apache.org/jira/browse/HADOOP-9193
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hadoop9193.diff
>
>
> The hadoop front-end script will print a deprecation warning and defer to the 
> hdfs front-end script for certain commands, like fsck, dfs.  If a wildcard 
> appears as an argument then it can be inadvertently expanded by the shell to 
> match a local filesystem path before being sent to the hdfs script, which can 
> be very confusing to the end user.
> For example, the following two commands usually perform very different 
> things, even though they should be equivalent:
> {code}
> hadoop fs -ls /tmp/\*
> hadoop dfs -ls /tmp/\*
> {code}
> The former lists everything in the default filesystem under /tmp, while the 
> latter expands /tmp/\* into everything in the *local* filesystem under /tmp 
> and passes those as arguments to try to list in the default filesystem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9210) bad mirror in download list

2013-01-15 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554269#comment-13554269
 ] 

Andy Isaacson commented on HADOOP-9210:
---

From IRC, "hadoop can't do anything about it, and we have an automated system 
that detects+fixes it".

> bad mirror in download list
> ---
>
> Key: HADOOP-9210
> URL: https://issues.apache.org/jira/browse/HADOOP-9210
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Reporter: Andy Isaacson
>Priority: Minor
>
> The http://hadoop.apache.org/releases.html page links to 
> http://www.apache.org/dyn/closer.cgi/hadoop/common/ which provides a list of 
> mirrors.  The first one on the list (for me) is 
> http://www.alliedquotes.com/mirrors/apache/hadoop/common/ which is 404.
> I checked the rest of the mirrors in the list and only alliedquotes is 404.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9210) bad mirror in download list

2013-01-15 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9210:
--

Release Note:   (was: From IRC, "hadoop can't do anything about it, and we 
have an automated system that detects+fixes it".)

> bad mirror in download list
> ---
>
> Key: HADOOP-9210
> URL: https://issues.apache.org/jira/browse/HADOOP-9210
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Reporter: Andy Isaacson
>Priority: Minor
>
> The http://hadoop.apache.org/releases.html page links to 
> http://www.apache.org/dyn/closer.cgi/hadoop/common/ which provides a list of 
> mirrors.  The first one on the list (for me) is 
> http://www.alliedquotes.com/mirrors/apache/hadoop/common/ which is 404.
> I checked the rest of the mirrors in the list and only alliedquotes is 404.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HADOOP-9210) bad mirror in download list

2013-01-15 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson resolved HADOOP-9210.
---

  Resolution: Not A Problem
Release Note: From IRC, "hadoop can't do anything about it, and we have an 
automated system that detects+fixes it".

> bad mirror in download list
> ---
>
> Key: HADOOP-9210
> URL: https://issues.apache.org/jira/browse/HADOOP-9210
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Reporter: Andy Isaacson
>Priority: Minor
>
> The http://hadoop.apache.org/releases.html page links to 
> http://www.apache.org/dyn/closer.cgi/hadoop/common/ which provides a list of 
> mirrors.  The first one on the list (for me) is 
> http://www.alliedquotes.com/mirrors/apache/hadoop/common/ which is 404.
> I checked the rest of the mirrors in the list and only alliedquotes is 404.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HADOOP-9210) bad mirror in download list

2013-01-14 Thread Andy Isaacson (JIRA)
Andy Isaacson created HADOOP-9210:
-

 Summary: bad mirror in download list
 Key: HADOOP-9210
 URL: https://issues.apache.org/jira/browse/HADOOP-9210
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation
Reporter: Andy Isaacson
Priority: Minor


The http://hadoop.apache.org/releases.html page links to 
http://www.apache.org/dyn/closer.cgi/hadoop/common/ which provides a list of 
mirrors.  The first one on the list (for me) is 
http://www.alliedquotes.com/mirrors/apache/hadoop/common/ which is 404.

I checked the rest of the mirrors in the list and only alliedquotes is 404.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9206) "Setting up a Single Node Cluster" instructions need improvement in 0.23.5/2.0.2-alpha branches

2013-01-14 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13553092#comment-13553092
 ] 

Andy Isaacson commented on HADOOP-9206:
---

Note that the docs are being converted from XDOC to APT; see HADOOP-8427 and 
HADOOP-9190. So please convert {{single_node_setup.xml}} to APT before editing 
the content, if at all possible.

> "Setting up a Single Node Cluster" instructions need improvement in 
> 0.23.5/2.0.2-alpha branches
> ---
>
> Key: HADOOP-9206
> URL: https://issues.apache.org/jira/browse/HADOOP-9206
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Glen Mazza
>
> Hi, in contrast to the easy-to-follow 1.0.4 instructions 
> (http://hadoop.apache.org/docs/r1.0.4/single_node_setup.html) the 0.23.5 and 
> 2.0.2-alpha instructions 
> (http://hadoop.apache.org/docs/r2.0.2-alpha/hadoop-yarn/hadoop-yarn-site/SingleCluster.html)
>  need more clarification -- it seems to be written for people who already 
> know and understand hadoop.  In particular, these points need clarification:
> 1.) Text: "You should be able to obtain the MapReduce tarball from the 
> release."
> Question: What is the MapReduce tarball?  What is its name?  I don't see such 
> an object within the hadoop-0.23.5.tar.gz download.
> 2.) Quote: "NOTE: You will need protoc installed of version 2.4.1 or greater."
> Protoc doesn't have a website you can link to (it's just mentioned offhand 
> when you Google it) -- is it really the case today that Hadoop has a 
> dependency on such a minor project?  At any rate, if you can have a link of 
> where one goes to get/install Protoc that would be good.
> 3.) Quote: "Assuming you have installed hadoop-common/hadoop-hdfs and 
> exported $HADOOP_COMMON_HOME/$HADOOP_HDFS_HOME, untar hadoop mapreduce 
> tarball and set environment variable $HADOOP_MAPRED_HOME to the untarred 
> directory."
> I'm not sure what you mean by the forward slashes: hadoop-common/hadoop-hdfs 
> and $HADOOP_COMMON_HOME/$HADOOP_HDFS_HOME -- do you mean & (install both) or 
> *or* just install one of the two?  This needs clarification--please remove 
> the forward slash and replace it with what you're trying to say.  The 
> audience here is complete newbie and they've been brought to this page from 
> here: http://hadoop.apache.org/docs/r0.23.5/ (same with r2.0.2-alpha/) 
> (quote: "Getting Started - The Hadoop documentation includes the information 
> you need to get started using Hadoop. Begin with the Single Node Setup which 
> shows you how to set up a single-node Hadoop installation."), they've 
> downloaded hadoop-0.23.5.tar.gz and want to know what to do next.  Why are 
> there potentially two applications -- hadoop-common and hadoop-hdfs and not 
> just one?  (The download doesn't appear to have two separate apps) -- if 
> there is indeed just one app and we remove the other from the above text to 
> avoid confusion?
> Again, I just downloaded hadoop-0.23.5.tar.gz -- do I need to download more?  
> If so, let us know in the docs here.
> Also, the fragment: "Assuming you have installed 
> hadoop-common/hadoop-hdfs..."  No, I haven't, that's what *this* page is 
> supposed to explain to me how to do -- how do I install these two (or just 
> one of these two)?
> Also, what do I set $HADOOP_COMMON_HOME and/or $HADOOP_HDFS_HOME to?
> 4.) Quote: "NOTE: The following instructions assume you have hdfs running."  
> No, I don't--how do I do this?  Again, this page is supposed to teach me that.
> 5.) Quote: "To start the ResourceManager and NodeManager, you will have to 
> update the configs. Assuming your $HADOOP_CONF_DIR is the configuration 
> directory..."
> Could you clarify here what the "configuration directory" is, it doesn't 
> exist in the 0.23.5 download.  I just see 
> bin,etc,include,lib,libexec,sbin,share folders but no "conf" one.)
> 6.) Quote: "Assuming that the environment variables $HADOOP_COMMON_HOME, 
> $HADOOP_HDFS_HOME, $HADOO_MAPRED_HOME, $YARN_HOME, $JAVA_HOME and 
> $HADOOP_CONF_DIR have been set appropriately."
> We'll need to know what to set YARN_HOME to here.
> Thanks!
> Glen

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9193) hadoop script can inadvertently expand wildcard arguments when delegating to hdfs script

2013-01-09 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549149#comment-13549149
 ] 

Andy Isaacson commented on HADOOP-9193:
---

Oops, sorry [~robsparker], I didn't mean to steal this JIRA from you (my 
display didn't even indicate that you'd assigned it).  Feel free to steal it 
back if you want.

> hadoop script can inadvertently expand wildcard arguments when delegating to 
> hdfs script
> 
>
> Key: HADOOP-9193
> URL: https://issues.apache.org/jira/browse/HADOOP-9193
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hadoop9193.diff
>
>
> The hadoop front-end script will print a deprecation warning and defer to the 
> hdfs front-end script for certain commands, like fsck, dfs.  If a wildcard 
> appears as an argument then it can be inadvertently expanded by the shell to 
> match a local filesystem path before being sent to the hdfs script, which can 
> be very confusing to the end user.
> For example, the following two commands usually perform very different 
> things, even though they should be equivalent:
> {code}
> hadoop fs -ls /tmp/\*
> hadoop dfs -ls /tmp/\*
> {code}
> The former lists everything in the default filesystem under /tmp, while the 
> latter expands /tmp/\* into everything in the *local* filesystem under /tmp 
> and passes those as arguments to try to list in the default filesystem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9193) hadoop script can inadvertently expand wildcard arguments when delegating to hdfs script

2013-01-09 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9193:
--

Status: Patch Available  (was: Open)

> hadoop script can inadvertently expand wildcard arguments when delegating to 
> hdfs script
> 
>
> Key: HADOOP-9193
> URL: https://issues.apache.org/jira/browse/HADOOP-9193
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 0.23.5, 2.0.2-alpha
>Reporter: Jason Lowe
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hadoop9193.diff
>
>
> The hadoop front-end script will print a deprecation warning and defer to the 
> hdfs front-end script for certain commands, like fsck, dfs.  If a wildcard 
> appears as an argument then it can be inadvertently expanded by the shell to 
> match a local filesystem path before being sent to the hdfs script, which can 
> be very confusing to the end user.
> For example, the following two commands usually perform very different 
> things, even though they should be equivalent:
> {code}
> hadoop fs -ls /tmp/\*
> hadoop dfs -ls /tmp/\*
> {code}
> The former lists everything in the default filesystem under /tmp, while the 
> latter expands /tmp/\* into everything in the *local* filesystem under /tmp 
> and passes those as arguments to try to list in the default filesystem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HADOOP-9193) hadoop script can inadvertently expand wildcard arguments when delegating to hdfs script

2013-01-09 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson reassigned HADOOP-9193:
-

Assignee: Andy Isaacson  (was: Robert Parker)

> hadoop script can inadvertently expand wildcard arguments when delegating to 
> hdfs script
> 
>
> Key: HADOOP-9193
> URL: https://issues.apache.org/jira/browse/HADOOP-9193
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hadoop9193.diff
>
>
> The hadoop front-end script will print a deprecation warning and defer to the 
> hdfs front-end script for certain commands, like fsck, dfs.  If a wildcard 
> appears as an argument then it can be inadvertently expanded by the shell to 
> match a local filesystem path before being sent to the hdfs script, which can 
> be very confusing to the end user.
> For example, the following two commands usually perform very different 
> things, even though they should be equivalent:
> {code}
> hadoop fs -ls /tmp/\*
> hadoop dfs -ls /tmp/\*
> {code}
> The former lists everything in the default filesystem under /tmp, while the 
> latter expands /tmp/\* into everything in the *local* filesystem under /tmp 
> and passes those as arguments to try to list in the default filesystem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9193) hadoop script can inadvertently expand wildcard arguments when delegating to hdfs script

2013-01-09 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9193:
--

Attachment: hadoop9193.diff

The wrapper script is clearly wrong to use {{$*}}, unquoted even.  The correct 
construct is {{"$@"}}.  Attaching patch.
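
To illustrate the difference, a schematic sketch (not the actual delegation
line in {{bin/hadoop}}):
{noformat}
hdfs dfs $*      # unquoted $*: the shell re-splits and re-globs the arguments,
                 # so a literal /tmp/* can expand against the local filesystem
hdfs dfs "$@"    # "$@": each argument is forwarded exactly as it was received
{noformat}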

> hadoop script can inadvertently expand wildcard arguments when delegating to 
> hdfs script
> 
>
> Key: HADOOP-9193
> URL: https://issues.apache.org/jira/browse/HADOOP-9193
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Assignee: Robert Parker
>Priority: Minor
> Attachments: hadoop9193.diff
>
>
> The hadoop front-end script will print a deprecation warning and defer to the 
> hdfs front-end script for certain commands, like fsck, dfs.  If a wildcard 
> appears as an argument then it can be inadvertently expanded by the shell to 
> match a local filesystem path before being sent to the hdfs script, which can 
> be very confusing to the end user.
> For example, the following two commands usually perform very different 
> things, even though they should be equivalent:
> {code}
> hadoop fs -ls /tmp/\*
> hadoop dfs -ls /tmp/\*
> {code}
> The former lists everything in the default filesystem under /tmp, while the 
> latter expands /tmp/\* into everything in the *local* filesystem under /tmp 
> and passes those as arguments to try to list in the default filesystem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-9193) hadoop script can inadvertently expand wildcard arguments when delegating to hdfs script

2013-01-09 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-9193:
--

Description: 
The hadoop front-end script will print a deprecation warning and defer to the 
hdfs front-end script for certain commands, like fsck, dfs.  If a wildcard 
appears as an argument then it can be inadvertently expanded by the shell to 
match a local filesystem path before being sent to the hdfs script, which can 
be very confusing to the end user.

For example, the following two commands usually perform very different things, 
even though they should be equivalent:
{code}
hadoop fs -ls /tmp/\*
hadoop dfs -ls /tmp/\*
{code}
The former lists everything in the default filesystem under /tmp, while the 
latter expands /tmp/\* into everything in the *local* filesystem under /tmp and 
passes those as arguments to try to list in the default filesystem.


  was:
The hadoop front-end script will print a deprecation warning and defer to the 
hdfs front-end script for certain commands, like fsck, dfs.  If a wildcard 
appears as an argument then it can be inadvertently expanded by the shell to 
match a local filesystem path before being sent to the hdfs script, which can 
be very confusing to the end user.

For example, the following two commands usually perform very different things, 
even though they should be equivalent:

hadoop fs -ls /tmp/\*
hadoop dfs -ls /tmp/\*

The former lists everything in the default filesystem under /tmp, while the 
latter expands /tmp/\* into everything in the *local* filesystem under /tmp and 
passes those as arguments to try to list in the default filesystem.



> hadoop script can inadvertently expand wildcard arguments when delegating to 
> hdfs script
> 
>
> Key: HADOOP-9193
> URL: https://issues.apache.org/jira/browse/HADOOP-9193
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.0.2-alpha, 0.23.5
>Reporter: Jason Lowe
>Priority: Minor
>
> The hadoop front-end script will print a deprecation warning and defer to the 
> hdfs front-end script for certain commands, like fsck, dfs.  If a wildcard 
> appears as an argument then it can be inadvertently expanded by the shell to 
> match a local filesystem path before being sent to the hdfs script, which can 
> be very confusing to the end user.
> For example, the following two commands usually perform very different 
> things, even though they should be equivalent:
> {code}
> hadoop fs -ls /tmp/\*
> hadoop dfs -ls /tmp/\*
> {code}
> The former lists everything in the default filesystem under /tmp, while the 
> latter expands /tmp/\* into everything in the *local* filesystem under /tmp 
> and passes those as arguments to try to list in the default filesystem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HADOOP-9190) packaging docs is broken

2013-01-08 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson reassigned HADOOP-9190:
-

Assignee: Andy Isaacson

> packaging docs is broken
> 
>
> Key: HADOOP-9190
> URL: https://issues.apache.org/jira/browse/HADOOP-9190
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Thomas Graves
>Assignee: Andy Isaacson
>
> It looks like after the docs got converted to apt format in HADOOP-8427, mvn 
> site package -Pdist,docs no longer works.   If you run mvn site or mvn 
> site:stage by itself they work fine, it's when you go to package it that it 
> breaks.
> The error is with broken links, here is one of them:
> broken-links>
>message="hadoop-common-project/hadoop-common/target/docs-src/src/documentation/content/xdocs/HttpAuthentication.xml
>  (No such file or directory)" uri="HttpAuthentication.html">

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9190) packaging docs is broken

2013-01-08 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547506#comment-13547506
 ] 

Andy Isaacson commented on HADOOP-9190:
---

Since I broke it, I'll take a shot at fixing it.

> packaging docs is broken
> 
>
> Key: HADOOP-9190
> URL: https://issues.apache.org/jira/browse/HADOOP-9190
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Thomas Graves
>Assignee: Andy Isaacson
>
> It looks like after the docs got converted to apt format in HADOOP-8427, mvn 
> site package -Pdist,docs no longer works.   If you run mvn site or mvn 
> site:stage by itself they work fine, it's when you go to package it that it 
> breaks.
> The error is with broken links, here is one of them:
> broken-links>
>message="hadoop-common-project/hadoop-common/target/docs-src/src/documentation/content/xdocs/HttpAuthentication.xml
>  (No such file or directory)" uri="HttpAuthentication.html">

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8427) Convert Forrest docs to APT, incremental

2013-01-04 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543952#comment-13543952
 ] 

Andy Isaacson commented on HADOOP-8427:
---

bq. Committed to branch-2 after verifying the site builds correctly

Thanks!

So ... how do we get the built site installed on hadoop.apache.org?  Currently 
we have broken links like http://hadoop.apache.org/docs/r2.0.2-alpha/ -> 
http://hadoop.apache.org/docs/r2.0.2-alpha/hadoop-project-dist/hadoop-hdfs/hdfs_user_guide.html
 which is 404.

I guess updating the r2.0.2-alpha directory would be a bit hinky.  Do we just 
wait until the 2.0.3 release?

> Convert Forrest docs to APT, incremental
> 
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Fix For: 2.0.3-alpha
>
> Attachments: hadoop8427-1.txt, hadoop8427-3.txt, hadoop8427-4.txt, 
> hadoop8427-5.txt, HADOOP-8427.sh, hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8427) Convert Forrest docs to APT

2012-12-19 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8427:
--

Attachment: hadoop8427-5.txt

Easy enough, I think the attached patch does the trick.  The incremental diff 
is tiny:
{noformat}
index da976f6..2cfe2e8 100644
--- a/hadoop-project/src/site/site.xml
+++ b/hadoop-project/src/site/site.xml
@@ -51,6 +51,8 @@
   
   
   
+  
+  
 
 
 
{noformat}

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427-1.txt, hadoop8427-3.txt, hadoop8427-4.txt, 
> hadoop8427-5.txt, HADOOP-8427.sh, hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8427) Convert Forrest docs to APT

2012-12-13 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8427:
--

Attachment: hadoop8427-4.txt

New patch moving files and images around and redoing several xdoc files as 
{{.apt.vm}}.

To apply this the following needs to be run manually:
{code}
mkdir -p hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfs-logo.jpg
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfs-logo.jpg
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsarchitecture.gif
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsarchitecture.gif
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsarchitecture.odg
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsarchitecture.odg
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsarchitecture.png
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsarchitecture.png
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsdatanodes.gif
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsdatanodes.gif
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsdatanodes.odg
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsdatanodes.odg
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsdatanodes.png
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsdatanodes.png
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsproxy-forward.jpg
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsproxy-forward.jpg
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsproxy-overview.jpg
 
hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsproxy-overview.jpg
mv 
hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/resources/images/hdfsproxy-server.jpg
 hadoop-hdfs-project/hadoop-hdfs/src/site/resources/images/hdfsproxy-server.jpg
{code}

(possibly making each of those an "svn mv" so svn tracks the renames.)


> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427-1.txt, hadoop8427-3.txt, hadoop8427-4.txt, 
> hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9088) Add Murmur3 hash

2012-12-13 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531885#comment-13531885
 ] 

Andy Isaacson commented on HADOOP-9088:
---

Current best practice is SipHash, https://131002.net/siphash/ rather than 
Murmur3.  If we're going to change hash functions we should probably use that.

> Add Murmur3 hash
> 
>
> Key: HADOOP-9088
> URL: https://issues.apache.org/jira/browse/HADOOP-9088
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Radim Kolar
>Assignee: Radim Kolar
> Attachments: murmur3-2.txt, murmur3-3.txt, murmur3-4.txt, 
> murmur3-5.txt, murmur3-6.txt, murmur3-7.txt, murmur3.txt
>
>
> faster and better than murmur2

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9041) FileSystem initialization can go into infinite loop

2012-12-13 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531882#comment-13531882
 ] 

Andy Isaacson commented on HADOOP-9041:
---

{code}
+FileSystem.getFileSystemClass("file",conf);
{code}
missing a space after the ",".

{code}
+public class TestFileSystemInitialization {
+
+   /**
{code}

Indentation is wonky, we do not use tabs and we use 2 space indents.

{code}
+* Check if FileSystem can be properly initialized if 
URLStreamHandlerFactory
+* is registered.
+*/
{code}
Since this is such an obscure failure condition, please mention the JIRA in the 
comment so that someone who is curious does not have to dig in the VCS history 
to figure out what's going on.
{code}
+   try {
+   FileSystem.getFileSystemClass("file:", conf);
+   }
+   catch (IOException ok) {}
+   }
{code}
Please add asserts as appropriate to ensure that we follow the expected code 
path.  For example, if we expect {{IOException}}, add {{assertFalse(true);}} 
after calling {{getFileSystemClass}}, with comments explaining the expected 
behavior.  See {{TestLocalDirAllocator.java}} for an example.
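
For illustration, the kind of structure I mean is roughly this (a sketch 
assuming JUnit 4; the method name is made up and this is not the actual patch):
{code}
import static org.junit.Assert.assertFalse;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.junit.Test;

public class TestFileSystemInitialization {
  /** Regression test sketch for HADOOP-9041; see the JIRA for background. */
  @Test
  public void testGetFileSystemClassRejectsBadScheme() throws Exception {
    Configuration conf = new Configuration();
    try {
      FileSystem.getFileSystemClass("file:", conf);
      // Reaching this line means the expected failure did not happen.
      assertFalse("expected an IOException for scheme \"file:\"", true);
    } catch (IOException expected) {
      // Expected code path: the malformed scheme is rejected.
    }
  }
}
{code}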

> FileSystem initialization can go into infinite loop
> ---
>
> Key: HADOOP-9041
> URL: https://issues.apache.org/jira/browse/HADOOP-9041
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.0.2-alpha
>Reporter: Radim Kolar
>Assignee: Radim Kolar
>Priority: Critical
> Attachments: fsinit2.txt, fsinit-unit.txt, fstest.groovy, 
> HADOOP-9041.patch, HADOOP-9041.patch, TestFileSystem.java
>
>
> More information is there: https://jira.springsource.org/browse/SHDP-111
> Referenced source code from example is: 
> https://github.com/SpringSource/spring-hadoop/blob/master/src/main/java/org/springframework/data/hadoop/configuration/ConfigurationFactoryBean.java
> from isolating that cause it looks like if you register: 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory before calling 
> FileSystem.loadFileSystems() then it goes into infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9041) FileSystem initialization can go into infinite loop

2012-12-11 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529328#comment-13529328
 ] 

Andy Isaacson commented on HADOOP-9041:
---

Thanks, that helps.  I was able to reproduce the infinite recursion using trunk 
on Linux with
{{javac -classpath `hadoop classpath` TestFileSystem.java}}
{{java -classpath `pwd`:`hadoop classpath` TestFileSystem}}

and the stack trace
{noformat}
...
at java.net.URL.getURLStreamHandler(URL.java:1107)
at java.net.URL.<init>(URL.java:572)
at java.net.URL.<init>(URL.java:464)
at java.net.URL.<init>(URL.java:413)
at java.net.JarURLConnection.parseSpecs(JarURLConnection.java:161)
at java.net.JarURLConnection.<init>(JarURLConnection.java:144)
at 
sun.net.www.protocol.jar.JarURLConnection.<init>(JarURLConnection.java:63)
at sun.net.www.protocol.jar.Handler.openConnection(Handler.java:24)
at java.net.URL.openConnection(URL.java:945)
at java.net.URL.openStream(URL.java:1010)
at java.util.ServiceLoader.parse(ServiceLoader.java:279)
at java.util.ServiceLoader.access$200(ServiceLoader.java:164)
at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:332)
at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:415)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2232)
at 
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2243)
at 
org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:67)
at java.net.URL.getURLStreamHandler(URL.java:1107)
...
{noformat}
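
For reference, a minimal standalone sketch of that reproduction (illustrative 
only; the attached TestFileSystem.java is the actual test, the class name here 
is made up):
{code}
import java.net.URL;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;

public class FsInitLoopRepro {
  public static void main(String[] args) throws Exception {
    // Register the Hadoop stream handler factory *before* anything has
    // triggered FileSystem.loadFileSystems().
    URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());

    // The first FileSystem lookup runs the ServiceLoader, which opens jar:
    // URLs, which asks the factory for a handler, which calls back into
    // FileSystem.getFileSystemClass() and recurses as in the trace above.
    FileSystem.get(new Configuration());
  }
}
{code}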

> FileSystem initialization can go into infinite loop
> ---
>
> Key: HADOOP-9041
> URL: https://issues.apache.org/jira/browse/HADOOP-9041
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.0.2-alpha
>Reporter: Radim Kolar
>Assignee: Radim Kolar
>Priority: Critical
> Attachments: fsinit2.txt, fsinit-unit.txt, fstest.groovy, 
> HADOOP-9041.patch, HADOOP-9041.patch, TestFileSystem.java
>
>
> More information is there: https://jira.springsource.org/browse/SHDP-111
> Referenced source code from example is: 
> https://github.com/SpringSource/spring-hadoop/blob/master/src/main/java/org/springframework/data/hadoop/configuration/ConfigurationFactoryBean.java
> from isolating that cause it looks like if you register: 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory before calling 
> FileSystem.loadFileSystems() then it goes into infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9041) FileSystem initialization can go into infinite loop

2012-12-10 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528598#comment-13528598
 ] 

Andy Isaacson commented on HADOOP-9041:
---

bq. So this bug may be difficult to reproduce in junit environment. But in 
regular development, it will be dangerous and hidden deep. Until here I think 
my patch can resolve this problem and looking forward any comments. 

[~yanbo] could you post instructions for how to reproduce the infinite loop 
failure using a simple java application?  I've tried to reproduce it using 
Eclipse+junit (using JDK 1.6.0_24 on Linux) and have not managed to cause the 
failure.  If you could upload or paste a standalone testcase that fails, plus 
instructions for how to run it (on whatever platform) that will help with 
manual testing.

> FileSystem initialization can go into infinite loop
> ---
>
> Key: HADOOP-9041
> URL: https://issues.apache.org/jira/browse/HADOOP-9041
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.0.2-alpha
>Reporter: Radim Kolar
>Assignee: Radim Kolar
>Priority: Critical
> Attachments: fsinit2.txt, fsinit-unit.txt, fstest.groovy, 
> HADOOP-9041.patch, HADOOP-9041.patch
>
>
> More information is there: https://jira.springsource.org/browse/SHDP-111
> Referenced source code from example is: 
> https://github.com/SpringSource/spring-hadoop/blob/master/src/main/java/org/springframework/data/hadoop/configuration/ConfigurationFactoryBean.java
> from isolating that cause it looks like if you register: 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory before calling 
> FileSystem.loadFileSystems() then it goes into infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8698) Do not call unneceseary setConf(null) in Configured constructor

2012-12-06 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13525998#comment-13525998
 ] 

Andy Isaacson commented on HADOOP-8698:
---

bq. because java do not generates no arg constructor for you if constructor 
with argument exists.

Ah, my incremental test build succeeded for irrelevant reasons; when I did "mvn 
clean test" it failed as expected.  Thanks for enlightening me!

Are you planning to post a patch with the proposed cleanup?
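
To spell out the language rule in play, a tiny standalone sketch (illustrative, 
not the Hadoop source):
{code}
class Configuration { }   // stand-in for org.apache.hadoop.conf.Configuration

class Configured {
  private Configuration conf;

  public Configured() { }                 // remove this and SomeTool breaks

  public Configured(Configuration conf) {
    this.conf = conf;
  }
}

class SomeTool extends Configured {
  // The implicit SomeTool() constructor calls super(); once a constructor
  // with arguments exists, the compiler no longer supplies a default one,
  // so deleting the explicit Configured() makes this fail to compile.
}
{code}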

> Do not call unneceseary setConf(null) in Configured constructor
> ---
>
> Key: HADOOP-8698
> URL: https://issues.apache.org/jira/browse/HADOOP-8698
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 0.23.3, 3.0.0
>Reporter: Radim Kolar
>Priority: Minor
> Fix For: 0.24.0, 3.0.0
>
> Attachments: setconf-null.txt
>
>
> no-arg constructor of /org/apache/hadoop/conf/Configured calls setConf(null). 
> This is unnecessary and it increases complexity of setConf() code because you 
> have to check for not null object reference before using it. Under normal 
> conditions setConf() is never called with null reference, so not null check 
> is unnecessary.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9041) FileSystem initialization can go into infinite loop

2012-12-06 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13525988#comment-13525988
 ] 

Andy Isaacson commented on HADOOP-9041:
---

Please post a roll-up patch containing both the code change and the unit test.  
Also, {{TestFileSystemInicialization}} should be spelled 
{{TestFileSystemInitialization}}.

> FileSystem initialization can go into infinite loop
> ---
>
> Key: HADOOP-9041
> URL: https://issues.apache.org/jira/browse/HADOOP-9041
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.0.2-alpha
>Reporter: Radim Kolar
>Assignee: Yanbo Liang
>Priority: Critical
> Attachments: fsinit-unit.txt, fstest.groovy, HADOOP-9041.patch, 
> HADOOP-9041.patch
>
>
> More information is there: https://jira.springsource.org/browse/SHDP-111
> Referenced source code from example is: 
> https://github.com/SpringSource/spring-hadoop/blob/master/src/main/java/org/springframework/data/hadoop/configuration/ConfigurationFactoryBean.java
> from isolating that cause it looks like if you register: 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory before calling 
> FileSystem.loadFileSystems() then it goes into infinite loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8698) Do not call unneceseary setConf(null) in Configured constructor

2012-12-06 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13525986#comment-13525986
 ] 

Andy Isaacson commented on HADOOP-8698:
---

Can you include the proposed cleanup to remove the unnecessary complexity 
caused by checking for null?  Also, after this patch Configured reads:
{code}
...
   public Configured() {
   }
{code}
Why not just delete the method entirely?

> Do not call unneceseary setConf(null) in Configured constructor
> ---
>
> Key: HADOOP-8698
> URL: https://issues.apache.org/jira/browse/HADOOP-8698
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Affects Versions: 0.23.3, 3.0.0
>Reporter: Radim Kolar
>Priority: Minor
> Fix For: 0.24.0, 3.0.0
>
> Attachments: setconf-null.txt
>
>
> no-arg constructor of /org/apache/hadoop/conf/Configured calls setConf(null). 
> This is unnecessary and it increases complexity of setConf() code because you 
> have to check for not null object reference before using it. Under normal 
> conditions setConf() is never called with null reference, so not null check 
> is unnecessary.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-9103) UTF8 class does not properly decode Unicode characters outside the basic multilingual plane

2012-11-29 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507039#comment-13507039
 ] 

Andy Isaacson commented on HADOOP-9103:
---

bq. It's not "buggy" it's just "different" 

It's buggy if we ever end up writing a CESU-8 bytestream where someone else 
expects UTF-8.  For example, {{dfs -ls}} writing CESU-8 to stdout wouldn't work 
properly, because other programs such as {{xterm}} or {{putty}} don't implement 
the CESU-8 decoding rules.  (This example doesn't happen currently, because the 
CESU-8 filename is deserialized into a String, where it's interpreted as a 
surrogate pair, which is then written, and the correct surrogate pair -> UTF-8 
encoding happens on the output side.)  Hopefully we haven't overlooked any such 
existing bugs and nobody accidentally uses UTF8.java in the future.  (At least 
it's marked @Deprecated.)

Agreed that as long as UTF8.java is the thing that reads the bytestream, we can 
continue to implement CESU-8 and it can remain partially backwards compatible 
with previous versions of UTF8.java.
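
To make the xterm/putty point concrete, a small standalone sketch (not from 
any patch) for one character outside the BMP, U+10400:
{code}
import java.nio.charset.StandardCharsets;

public class Cesu8VsUtf8 {
  public static void main(String[] args) {
    String s = "\uD801\uDC00";   // U+10400, stored as a surrogate pair

    // Proper UTF-8: one 4-byte sequence.
    print("UTF-8 ", s.getBytes(StandardCharsets.UTF_8));        // f0 90 90 80

    // CESU-8: each surrogate encoded as its own 3-byte sequence; strict
    // UTF-8 consumers (xterm, putty, ...) reject or mangle this.
    byte[] cesu8 = { (byte) 0xED, (byte) 0xA0, (byte) 0x81,
                     (byte) 0xED, (byte) 0xB0, (byte) 0x80 };
    print("CESU-8", cesu8);
  }

  static void print(String label, byte[] bytes) {
    StringBuilder sb = new StringBuilder(label + ":");
    for (byte b : bytes) {
      sb.append(String.format(" %02x", b));
    }
    System.out.println(sb);
  }
}
{code}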

> UTF8 class does not properly decode Unicode characters outside the basic 
> multilingual plane
> ---
>
> Key: HADOOP-9103
> URL: https://issues.apache.org/jira/browse/HADOOP-9103
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.20.1
> Environment: SUSE LINUX
>Reporter: yixiaohua
>Assignee: Todd Lipcon
> Attachments: FSImage.java, hadoop-9103.txt, hadoop-9103.txt, 
> hadoop-9103.txt, ProblemString.txt, TestUTF8AndStringGetBytes.java, 
> TestUTF8AndStringGetBytes.java
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> this the log information  of the  exception  from the SecondaryNameNode: 
> 2012-03-28 00:48:42,553 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: 
> java.io.IOException: Found lease for
>  non-existent file 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/@???
> ??tor.qzone.qq.com/keypart-00174
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFilesUnderConstruction(FSImage.java:1211)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:959)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:589)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:473)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:350)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:314)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:225)
> at java.lang.Thread.run(Thread.java:619)
> this is the log information  about the file from namenode:
> 2012-03-28 00:32:26,528 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=boss,boss 
> ip=/10.131.16.34cmd=create  
> src=/user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?tor.qzone.qq.com/keypart-00174 dst=null
> perm=boss:boss:rw-r--r--
> 2012-03-28 00:37:42,387 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.allocateBlock: 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?tor.qzone.qq.com/keypart-00174. 
> blk_2751836614265659170_184668759
> 2012-03-28 00:37:42,696 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.completeFile: file 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?tor.qzone.qq.com/keypart-00174 is closed by 
> DFSClient_attempt_201203271849_0016_r_000174_0
> 2012-03-28 00:37:50,315 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=boss,boss 
> ip=/10.131.16.34cmd=rename  
> src=/user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?tor.qzone.qq.com/keypart-00174 
> dst=/user/boss/pgv/fission/task16/split/  @?
> tor.qzone.qq.com/keypart-00174  perm=boss:boss:rw-r--r--
> after check the code that save FSImage,I found there are a problem that maybe 
> a bug of HDFS Code,I past below:
> -this is the saveFSImage method  in  FSImage.java, I make some 
> mark at the problem code
> /**
>* Save the contents of the FS image to the file.
>*/
>   void saveFSImage(File newFile) throws IOException {
> FSNamesystem fsNamesys = FSNamesystem.getFSNamesystem();
> FSDirectory fsDir = fsN

[jira] [Commented] (HADOOP-9103) UTF8 class does not properly decode Unicode characters outside the basic multilingual plane

2012-11-29 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506894#comment-13506894
 ] 

Andy Isaacson commented on HADOOP-9103:
---

bq. +   * This is a regression est for HDFS-3307.

test, not est.  Since this jira has moved to HADOOP-9103, update the reference.

{code}
+ * Note that this decodes UTF-8 but actually encodes CESU-8, a variant of
+ * UTF-8: see http://en.wikipedia.org/wiki/CESU-8
{code}
Rather than adding a comment saying "this code is buggy", how about we fix the 
bug?  Outputting proper 4-byte UTF8 sequences for a given UTF-16 surrogate pair 
is a much better solution than the current behavior.

So as far as it goes the patch looks good.  I'll look into the surrogate pair 
stuff.
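
Concretely, the conversion I have in mind is roughly this (a sketch, not the 
committed change):
{code}
public class SupplementaryUtf8 {
  /** Encode one UTF-16 surrogate pair as a proper 4-byte UTF-8 sequence. */
  static int writeSupplementary(char high, char low, byte[] out, int off) {
    int cp = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    out[off]     = (byte) (0xF0 |  (cp >> 18));
    out[off + 1] = (byte) (0x80 | ((cp >> 12) & 0x3F));
    out[off + 2] = (byte) (0x80 | ((cp >>  6) & 0x3F));
    out[off + 3] = (byte) (0x80 |  (cp        & 0x3F));
    return 4;   // bytes written
  }

  public static void main(String[] args) {
    byte[] buf = new byte[4];
    writeSupplementary('\uD801', '\uDC00', buf, 0);   // U+10400
    for (byte b : buf) {
      System.out.printf("%02x ", b);                  // f0 90 90 80
    }
    System.out.println();
  }
}
{code}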

> UTF8 class does not properly decode Unicode characters outside the basic 
> multilingual plane
> ---
>
> Key: HADOOP-9103
> URL: https://issues.apache.org/jira/browse/HADOOP-9103
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.20.1
> Environment: SUSE LINUX
>Reporter: yixiaohua
>Assignee: Todd Lipcon
> Attachments: FSImage.java, hadoop-9103.txt, hadoop-9103.txt, 
> ProblemString.txt, TestUTF8AndStringGetBytes.java, 
> TestUTF8AndStringGetBytes.java
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> this the log information  of the  exception  from the SecondaryNameNode: 
> 2012-03-28 00:48:42,553 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: 
> java.io.IOException: Found lease for
>  non-existent file 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/@???
> ??tor.qzone.qq.com/keypart-00174
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFilesUnderConstruction(FSImage.java:1211)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:959)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:589)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:473)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:350)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:314)
> at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:225)
> at java.lang.Thread.run(Thread.java:619)
> this is the log information  about the file from namenode:
> 2012-03-28 00:32:26,528 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=boss,boss 
> ip=/10.131.16.34cmd=create  
> src=/user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?tor.qzone.qq.com/keypart-00174 dst=null
> perm=boss:boss:rw-r--r--
> 2012-03-28 00:37:42,387 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.allocateBlock: 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?tor.qzone.qq.com/keypart-00174. 
> blk_2751836614265659170_184668759
> 2012-03-28 00:37:42,696 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.completeFile: file 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?tor.qzone.qq.com/keypart-00174 is closed by 
> DFSClient_attempt_201203271849_0016_r_000174_0
> 2012-03-28 00:37:50,315 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=boss,boss 
> ip=/10.131.16.34cmd=rename  
> src=/user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?tor.qzone.qq.com/keypart-00174 
> dst=/user/boss/pgv/fission/task16/split/  @?
> tor.qzone.qq.com/keypart-00174  perm=boss:boss:rw-r--r--
> after check the code that save FSImage,I found there are a problem that maybe 
> a bug of HDFS Code,I past below:
> -this is the saveFSImage method  in  FSImage.java, I make some 
> mark at the problem code
> /**
>* Save the contents of the FS image to the file.
>*/
>   void saveFSImage(File newFile) throws IOException {
> FSNamesystem fsNamesys = FSNamesystem.getFSNamesystem();
> FSDirectory fsDir = fsNamesys.dir;
> long startTime = FSNamesystem.now();
> //
> // Write out data
> //
> DataOutputStream out = new DataOutputStream(
> new BufferedOutputStream(
>  new 
> FileOutputStream(newFile)));
> try {
>  

[jira] [Commented] (HADOOP-8615) EOFException in DecompressorStream.java needs to be more verbose

2012-11-21 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502385#comment-13502385
 ] 

Andy Isaacson commented on HADOOP-8615:
---

The latest version of the patch looks great. +1.

> EOFException in DecompressorStream.java needs to be more verbose
> 
>
> Key: HADOOP-8615
> URL: https://issues.apache.org/jira/browse/HADOOP-8615
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.20.2
>Reporter: Jeff Lord
>  Labels: patch
> Attachments: HADOOP-8615.patch, HADOOP-8615-release-0.20.2.patch, 
> HADOOP-8615-ver2.patch, HADOOP-8615-ver3.patch
>
>
> In ./src/core/org/apache/hadoop/io/compress/DecompressorStream.java
> The following exception should at least pass back the file that it encounters 
> this error in relation to:
>   protected void getCompressedData() throws IOException {
> checkStream();
> int n = in.read(buffer, 0, buffer.length);
> if (n == -1) {
>   throw new EOFException("Unexpected end of input stream");
> }
> This would help greatly to debug bad/corrupt files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8427) Convert Forrest docs to APT

2012-11-08 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493368#comment-13493368
 ] 

Andy Isaacson commented on HADOOP-8427:
---

bq. my patch in HADOOP-8860 sets up the nav correctly so it would help if 
committed that one first. Does that patch look OK to you?

Yes, I reviewed HADOOP-8860, please go ahead and commit.  I'll take another 
shot at the conversions and post a new patch after 8860 is in.

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427-1.txt, hadoop8427-3.txt, hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8860) Split MapReduce and YARN sections in documentation navigation

2012-11-07 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492896#comment-13492896
 ] 

Andy Isaacson commented on HADOOP-8860:
---

The patch seems reasonable to me. +1.

> Split MapReduce and YARN sections in documentation navigation
> -
>
> Key: HADOOP-8860
> URL: https://issues.apache.org/jira/browse/HADOOP-8860
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.1-alpha
>Reporter: Tom White
>Assignee: Tom White
> Attachments: HADOOP-8860.patch, HADOOP-8860.sh
>
>
> This JIRA is to change the navigation on 
> http://hadoop.apache.org/docs/r2.0.1-alpha/ to reflect the fact that 
> MapReduce and YARN are separate modules/sub-projects.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8615) EOFException in DecompressorStream.java needs to be more verbose

2012-11-06 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491836#comment-13491836
 ] 

Andy Isaacson commented on HADOOP-8615:
---

BTW, the wiki page http://wiki.apache.org/hadoop/HowToContribute is supposed to 
answer these questions, but it doesn't currently answer them very well I think. 
 If you're willing to contribute to the wiki, some improvements there would be 
helpful to everybody! :)

> EOFException in DecompressorStream.java needs to be more verbose
> 
>
> Key: HADOOP-8615
> URL: https://issues.apache.org/jira/browse/HADOOP-8615
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.20.2
>Reporter: Jeff Lord
>  Labels: patch
> Attachments: HADOOP-8615.patch, HADOOP-8615-release-0.20.2.patch, 
> HADOOP-8615-ver2.patch
>
>
> In ./src/core/org/apache/hadoop/io/compress/DecompressorStream.java
> The following exception should at least pass back the file that it encounters 
> this error in relation to:
>   protected void getCompressedData() throws IOException {
> checkStream();
> int n = in.read(buffer, 0, buffer.length);
> if (n == -1) {
>   throw new EOFException("Unexpected end of input stream");
> }
> This would help greatly to debug bad/corrupt files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8615) EOFException in DecompressorStream.java needs to be more verbose

2012-11-06 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491828#comment-13491828
 ] 

Andy Isaacson commented on HADOOP-8615:
---

bq. To incorporate these fixes, should I take the latest from the trunk again,

Good question.  Your patch will be applied against trunk when it's committed, 
so in some cases you will need to make changes to merge with trunk, and there's 
never a downside to refreshing your patch against trunk.

From the patch file formatting it looks like you're using svn, so you can 
probably just "svn up" and resolve any merge conflicts.  I use git, and use 
"git pull --rebase" to get a similar effect on my working branches.

> EOFException in DecompressorStream.java needs to be more verbose
> 
>
> Key: HADOOP-8615
> URL: https://issues.apache.org/jira/browse/HADOOP-8615
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.20.2
>Reporter: Jeff Lord
>  Labels: patch
> Attachments: HADOOP-8615.patch, HADOOP-8615-release-0.20.2.patch, 
> HADOOP-8615-ver2.patch
>
>
> In ./src/core/org/apache/hadoop/io/compress/DecompressorStream.java
> The following exception should at least pass back the file that it encounters 
> this error in relation to:
>   protected void getCompressedData() throws IOException {
> checkStream();
> int n = in.read(buffer, 0, buffer.length);
> if (n == -1) {
>   throw new EOFException("Unexpected end of input stream");
> }
> This would help greatly to debug bad/corrupt files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8615) EOFException in DecompressorStream.java needs to be more verbose

2012-11-05 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490842#comment-13490842
 ] 

Andy Isaacson commented on HADOOP-8615:
---

bq. Please let me know for any feedback.

Sorry for the delay!

A few more whitespace fixups:
- make sure {{) throws}} has a space between the ) and "throws". (2 examples in 
this patch)
- still a few argument lists missing a space after "," for example {{return 
createInputStream(in, null,fileName);}}.
- also a few argument lists with extra spaces before "," for example 
{{Decompressor decompressor , String fileName}}
- extra space in {{protected  String fileName}}
- extra space in {{this.fileName =  fileName}}
- missing spaces in {{\+"file = "\+this.fileName}}, always put spaces on both 
sides of "\+" and other operators. also we generally put the "+" on the 
previous line for a string continuation like this one.
- missing space in {{if ((b1 | b2 | b3 | b4) < 0\)\{}} before "{"
- missing space in {{String fileName ="fileName";}} after "="

Thanks again for working on this enhancement!
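
Putting the above together, a compilable sketch of the intended style (the 
names echo the patch but the body is illustrative, not the actual change):
{code}
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class StyleSketch {
  protected String fileName;                        // no doubled spaces

  StyleSketch(String fileName) {
    this.fileName = fileName;                       // single space around "="
  }

  // Note the space between ")" and "throws" in the signature below.
  protected void getCompressedData(InputStream in, byte[] buffer) throws IOException {
    int n = in.read(buffer, 0, buffer.length);      // space after each ","
    if (n == -1) {                                  // spaces before "(" and "{"
      // "+" stays at the end of the previous line for string continuations.
      throw new EOFException("Unexpected end of input stream, file = " +
          this.fileName);
    }
  }
}
{code}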

> EOFException in DecompressorStream.java needs to be more verbose
> 
>
> Key: HADOOP-8615
> URL: https://issues.apache.org/jira/browse/HADOOP-8615
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.20.2
>Reporter: Jeff Lord
>  Labels: patch
> Attachments: HADOOP-8615.patch, HADOOP-8615-release-0.20.2.patch, 
> HADOOP-8615-ver2.patch
>
>
> In ./src/core/org/apache/hadoop/io/compress/DecompressorStream.java
> The following exception should at least pass back the file that it encounters 
> this error in relation to:
>   protected void getCompressedData() throws IOException {
> checkStream();
> int n = in.read(buffer, 0, buffer.length);
> if (n == -1) {
>   throw new EOFException("Unexpected end of input stream");
> }
> This would help greatly to debug bad/corrupt files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HADOOP-9001) libhadoop.so links against wrong OpenJDK libjvm.so

2012-10-31 Thread Andy Isaacson (JIRA)
Andy Isaacson created HADOOP-9001:
-

 Summary: libhadoop.so links against wrong OpenJDK libjvm.so
 Key: HADOOP-9001
 URL: https://issues.apache.org/jira/browse/HADOOP-9001
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Andy Isaacson
Priority: Minor


After building against OpenJDK 6b24-1.11.4-3 (Debian amd64) using
bq. {{mvn -Pnative,dist clean package -Dmaven.javadoc.skip=true -DskipTests 
-Dtar}}
the resulting binaries {{libhadoop.so}} and {{libhdfs.so}} are linked to the 
wrong {{libjvm.so}}:
{code}
% LD_LIBRARY_PATH=/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64/server ldd 
hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/lib/native/libhadoop.so.1.0.0
linux-vdso.so.1 =>  (0x7fff8c7ff000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x7f31df30e000)
libjvm.so.0 => not found
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7f31def86000)
/lib64/ld-linux-x86-64.so.2 (0x7f31df73d000)
{code}
Inspecting the build output it appears that {{JNIFlags.cmake}} decided, 
mysteriously, to link against 
{{/usr/lib/jvm/default-java/jre/lib/amd64/jamvm/libjvm.so}}, based on:
{code}
 [exec] JAVA_HOME=, 
JAVA_JVM_LIBRARY=/usr/lib/jvm/default-java/jre/lib/amd64/jamvm/libjvm.so
 [exec] JAVA_INCLUDE_PATH=/usr/lib/jvm/default-java/include, 
JAVA_INCLUDE_PATH2=/usr/lib/jvm/default-java/include/linux
 [exec] Located all JNI components successfully.
{code}

The "jamvm" is not mentioned anywhere in my environment or any symlinks in 
/usr, so apparently cmake iterated over the directories in 
{{/usr/lib/jvm/default-java/jre/lib/amd64}} to find it.  The following 
{{libjvm.so}} files are present on this machine:
{code}
-rw-r--r-- 1 root root  1050190 Sep  2 13:38 
/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64/cacao/libjvm.so
-rw-r--r-- 1 root root  1554628 Sep  2 11:21 
/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64/jamvm/libjvm.so
-rw-r--r-- 1 root root 12193850 Sep  2 13:38 
/usr/lib/jvm/java-6-openjdk-amd64/jre/lib/amd64/server/libjvm.so
{code}

Note the difference between {{libjvm.so}} and {{libjvm.so.0}}; the latter seems 
to come from the {{DT_SONAME}} in {{jamvm/libjvm.so}}, but that library seems 
to just be broken since there's no {{libjvm.so.0}} symlink anywhere on the 
filesystem.  I suspect *that* is a bug in OpenJDK but we should just avoid the 
issue by finding the right value for {{JAVA_JVM_LIBRARY}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8994) TestDFSShell creates file named "noFileHere", making further tests hard to understand

2012-10-29 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8994:
--

Affects Version/s: 3.0.0
   2.0.2-alpha

> TestDFSShell creates file named "noFileHere", making further tests hard to 
> understand
> -
>
> Key: HADOOP-8994
> URL: https://issues.apache.org/jira/browse/HADOOP-8994
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hadoop8994.txt
>
>
> While working on HDFS-1331 I added a test to {{TestDFSShell}} which used 
> {{noFileHere}} in the negative case.  This failed mysteriously because the 
> earlier tests run {{-touchz noFileHere}} for no good reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8994) TestDFSShell creates file named "noFileHere", making further tests hard to understand

2012-10-29 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8994:
--

Component/s: test

> TestDFSShell creates file named "noFileHere", making further tests hard to 
> understand
> -
>
> Key: HADOOP-8994
> URL: https://issues.apache.org/jira/browse/HADOOP-8994
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hadoop8994.txt
>
>
> While working on HDFS-1331 I added a test to {{TestDFSShell}} which used 
> {{noFileHere}} in the negative case.  This failed mysteriously because the 
> earlier tests run {{-touchz noFileHere}} for no good reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8994) TestDFSShell creates file named "noFileHere", making further tests hard to understand

2012-10-29 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8994:
--

Priority: Minor  (was: Major)

> TestDFSShell creates file named "noFileHere", making further tests hard to 
> understand
> -
>
> Key: HADOOP-8994
> URL: https://issues.apache.org/jira/browse/HADOOP-8994
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
>Priority: Minor
> Attachments: hadoop8994.txt
>
>
> While working on HDFS-1331 I added a test to {{TestDFSShell}} which used 
> {{noFileHere}} in the negative case.  This failed mysteriously because the 
> earlier tests run {{-touchz noFileHere}} for no good reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8994) TestDFSShell creates file named "noFileHere", making further tests hard to understand

2012-10-29 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8994:
--

Status: Patch Available  (was: Open)

> TestDFSShell creates file named "noFileHere", making further tests hard to 
> understand
> -
>
> Key: HADOOP-8994
> URL: https://issues.apache.org/jira/browse/HADOOP-8994
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: test
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hadoop8994.txt
>
>
> While working on HDFS-1331 I added a test to {{TestDFSShell}} which used 
> {{noFileHere}} in the negative case.  This failed mysteriously because the 
> earlier tests run {{-touchz noFileHere}} for no good reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8994) TestDFSShell creates file named "noFileHere", making further tests hard to understand

2012-10-29 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8994:
--

Attachment: hadoop8994.txt

Change name of the file that is created to {{isFileHere}}.

> TestDFSShell creates file named "noFileHere", making further tests hard to 
> understand
> -
>
> Key: HADOOP-8994
> URL: https://issues.apache.org/jira/browse/HADOOP-8994
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hadoop8994.txt
>
>
> While working on HDFS-1331 I added a test to {{TestDFSShell}} which used 
> {{noFileHere}} in the negative case.  This failed mysteriously because the 
> earlier tests run {{-touchz noFileHere}} for no good reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HADOOP-8994) TestDFSShell creates file named "noFileHere", making further tests hard to understand

2012-10-29 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson reassigned HADOOP-8994:
-

Assignee: Andy Isaacson

> TestDFSShell creates file named "noFileHere", making further tests hard to 
> understand
> -
>
> Key: HADOOP-8994
> URL: https://issues.apache.org/jira/browse/HADOOP-8994
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hadoop8994.txt
>
>
> While working on HDFS-1331 I added a test to {{TestDFSShell}} which used 
> {{noFileHere}} in the negative case.  This failed mysteriously because the 
> earlier tests run {{-touchz noFileHere}} for no good reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HADOOP-8994) TestDFSShell creates file named "noFileHere", making further tests hard to understand

2012-10-29 Thread Andy Isaacson (JIRA)
Andy Isaacson created HADOOP-8994:
-

 Summary: TestDFSShell creates file named "noFileHere", making 
further tests hard to understand
 Key: HADOOP-8994
 URL: https://issues.apache.org/jira/browse/HADOOP-8994
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Andy Isaacson


While working on HDFS-1331 I added a test to {{TestDFSShell}} which used 
{{noFileHere}} in the negative case.  This failed mysteriously because the 
earlier tests run {{-touchz noFileHere}} for no good reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8615) EOFException in DecompressorStream.java needs to be more verbose

2012-10-26 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485159#comment-13485159
 ] 

Andy Isaacson commented on HADOOP-8615:
---

Thomas,

Thank you for working on this!  I've been annoyed by this unhelpful error 
message before.

The findbugs complaint seems legit:
{code}
Correctness Warnings
CodeWarning
MF  Field BlockDecompressorStream.fileName masks field in superclass 
org.apache.hadoop.io.compress.DecompressorStream
{code}
Please fix the coding style throughout the patch:
* you have leftover unused comments like ";//" at the end of lines
* always put a space after , in argument lists, for example 
{{decompress(buf,0,10);}} but there are many occurrences in the patch.
* in {{if}} tests, always put exactly one space before ( and { and around 
operators.  For example {{if(null !=  this.fileName){ }} has one extra space 
after {{!=}} and is missing spaces before ( and {.
* properly indent continuation lines.  Use vim or emacs or eclipse for 
automatic indentation if necessary.
* exactly one space around {{else}}, you have }else{ in several places.
* in {{testBlockDecompress}} you want to {{fail("did not raise expected 
exception")}} after calling {{.decompress}}.
* please fill in javadoc {{\@param}} entries, or delete them.

The patch is looking good, almost all the above is just cosmetic.  Again, 
thanks for the code!

> EOFException in DecompressorStream.java needs to be more verbose
> 
>
> Key: HADOOP-8615
> URL: https://issues.apache.org/jira/browse/HADOOP-8615
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.20.2
>Reporter: Jeff Lord
>  Labels: patch
> Attachments: HADOOP-8615.patch, HADOOP-8615-release-0.20.2.patch
>
>
> In ./src/core/org/apache/hadoop/io/compress/DecompressorStream.java
> The following exception should at least pass back the file that it encounters 
> this error in relation to:
>   protected void getCompressedData() throws IOException {
> checkStream();
> int n = in.read(buffer, 0, buffer.length);
> if (n == -1) {
>   throw new EOFException("Unexpected end of input stream");
> }
> This would help greatly to debug bad/corrupt files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8427) Convert Forrest docs to APT

2012-10-25 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8427:
--

Attachment: hadoop8427-3.txt

Attaching new version, diffstat says
{code}
 CommandsManual.apt.vm |  490 ++
 FileSystemShell.apt.vm|  418 +++
 HttpAuthentication.apt.vm |   99 +
 3 files changed, 1007 insertions(+)
{code}

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427-1.txt, hadoop8427-3.txt, hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8427) Convert Forrest docs to APT

2012-10-23 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482599#comment-13482599
 ] 

Andy Isaacson commented on HADOOP-8427:
---

bq. There are two problems (1) converting the format to API and (2) removing 
the out-dated doc. (1) changes the entire file but (2) only removes some lines 
for the file.

I think you're suggesting that (1) be a completely mechanical conversion to APT 
with no editing, so that the rendered document should be word-for-word 
identical, right?

I can certainly do that.  That means doing a fair amount of work on documents 
that will immediately be deleted because they are out of date, but I agree that 
a mechanical change is easiest to validate.

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427-1.txt, hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8900) BuiltInGzipDecompressor throws IOException - stored gzip size doesn't match decompressed size

2012-10-22 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481990#comment-13481990
 ] 

Andy Isaacson commented on HADOOP-8900:
---

bq. Andy or Colin, can you please review the merged branch-1 patch.

hadoop-8900.branch-1.patch looks good to me.  Thanks for the backport!

> BuiltInGzipDecompressor throws IOException - stored gzip size doesn't match 
> decompressed size
> -
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 1-win, 2.0.1-alpha
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
>Assignee: Andy Isaacson
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: BuiltInGzipDecompressor2.patch, hadoop8900-2.txt, 
> hadoop-8900.branch-1.patch, hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8901) GZip and Snappy support may not work without unversioned libraries

2012-10-22 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481890#comment-13481890
 ] 

Andy Isaacson commented on HADOOP-8901:
---

bq. I'll test your suggestion on Linux to see if it has the needed behavior

I tested with
{noformat}
--- a/hadoop-common-project/hadoop-common/src/CMakeLists.txt
+++ b/hadoop-common-project/hadoop-common/src/CMakeLists.txt
@@ -55,7 +55,7 @@ if (NOT GENERATED_JAVAH)
 MESSAGE(FATAL_ERROR "You must set the cmake variable GENERATED_JAVAH")
 endif (NOT GENERATED_JAVAH)
 find_package(JNI REQUIRED)
-find_package(ZLIB REQUIRED)
+find_package(ZLIB 1 REQUIRED)
 
 set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -g -Wall -O2")
 set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -D_REENTRANT -D_FILE_OFFSET_BITS=64")
{noformat}

and found that the resulting value for {{HADOOP_ZLIB_LIBRARY}} remained 
{{"libz.so"}}.  This doesn't work on standard Linux installs because {{dlopen}} 
uses the string it is passed as an exact filename match, and without the 
{{-dev}} packages, Linux installations do not have {{libfoo.so}} symlinks 
installed.

Colin's fix which was committed on this Jira addresses the problem by ensuring 
that {{HADOOP_ZLIB_LIBRARY}} expands to a string that specifies the library ABI 
version, {{"libz.so.1"}}.

It seems like you've actually got the same problem on BSD, but the hardcoded 
"1" breaks because BSD considers zlib to have a larger ABI revision.  Possibly 
this is related to an issue in the upstream Linux system where the zlib ABI 
number is not properly being incremented when the ABI changes -- certainly BSD 
has been much more careful / aware of such issues over the years.

I think the right fix is to figure out how to have CMake determine the correct 
ABI revision number at build time, no?
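
For illustration only (not part of the patch): a minimal standalone sketch of 
the runtime behavior being discussed.  The library names are the usual Linux 
ones; exact paths on any given system are assumptions.
{code}
/* build: cc dlopen_demo.c -ldl */
#include <dlfcn.h>
#include <stdio.h>

static void try_load(const char *name) {
  void *handle = dlopen(name, RTLD_LAZY);
  if (handle) {
    printf("%s: loaded\n", name);
    dlclose(handle);
  } else {
    printf("%s: %s\n", name, dlerror());
  }
}

int main(void) {
  /* Without the zlib -dev package only the versioned soname is installed,
   * so dlopen("libz.so") typically fails while dlopen("libz.so.1") works. */
  try_load("libz.so");
  try_load("libz.so.1");
  return 0;
}
{code}
That is the behavior the committed fix relies on; the open question above is 
how to pick the version suffix portably at build time.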

> GZip and Snappy support may not work without unversioned libraries
> --
>
> Key: HADOOP-8901
> URL: https://issues.apache.org/jira/browse/HADOOP-8901
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: native
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HADOOP-8901.001.patch, HADOOP-8901.002.patch, 
> HADOOP-8901.003.patch
>
>
> Currently, we use {{dlopen}} to open {{libz.so}} and {{libsnappy.so}}, to get 
> Gzip and Snappy support, respectively.
> However, this is not correct; we should be dlopening {{libsnappy.so.1}} 
> instead.  The versionless form of the shared library is not commonly 
> installed except by development packages.  Also, we may run into subtle 
> compatibility problems if a new version of libsnappy comes out.
> Thanks to Brandon Vargo for reporting this bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8901) GZip and Snappy support may not work without unversioned libraries

2012-10-22 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481868#comment-13481868
 ] 

Andy Isaacson commented on HADOOP-8901:
---

Radim,

Thanks for noting this issue! With {{find_package(ZLIB 1 REQUIRED)}}, what 
path does dlopen end up using on BSD to open libz.so?  If the pathname is 
{{"/usr/lib/libz.so"}}, then there's a dependency on that symlink pointing to 
the correct library.

I'll test your suggestion on Linux to see if it has the needed behavior -- in 
order to support Linux systems without a {{"libz.so"}} symlink, we need to 
ensure that dlopen gets a pathname like {{"libz.so.1"}}.

Also, just out of curiosity, what BSD are you testing this on?

> GZip and Snappy support may not work without unversioned libraries
> --
>
> Key: HADOOP-8901
> URL: https://issues.apache.org/jira/browse/HADOOP-8901
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: native
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HADOOP-8901.001.patch, HADOOP-8901.002.patch, 
> HADOOP-8901.003.patch
>
>
> Currently, we use {{dlopen}} to open {{libz.so}} and {{libsnappy.so}}, to get 
> Gzip and Snappy support, respectively.
> However, this is not correct; we should be dlopening {{libsnappy.so.1}} 
> instead.  The versionless form of the shared library is not commonly 
> installed except by development packages.  Also, we may run into subtle 
> compatibility problems if a new version of libsnappy comes out.
> Thanks to Brandon Vargo for reporting this bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8427) Convert Forrest docs to APT

2012-10-17 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478605#comment-13478605
 ] 

Andy Isaacson commented on HADOOP-8427:
---

bq. do the convert and update in two patches/JIRAs

OK, that's what I'll end up doing on this jira -- a diff creating the new 
.apt.vm files, after which the old .xml files can simply be "svn rm"ed.

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427-1.txt, hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8427) Convert Forrest docs to APT

2012-10-17 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8427:
--

Attachment: hadoop8427-1.txt

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427-1.txt, hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8427) Convert Forrest docs to APT

2012-10-17 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478568#comment-13478568
 ] 

Andy Isaacson commented on HADOOP-8427:
---

bq. Hi Andy, could we first remove the out-of-date doc and then convert it to 
APT? It would be much easier to see the changes.

I'm not sure what you're asking for, could you be more specific?

Since all the filenames and all the content are changing, this diff is going to 
be monstrous in any case.  I can post a separate "here's the rm command to run" 
and then a "here's the new markup", but the new markup (in APT) is completely 
different from the old markup (in Forrest XML).

Pages that I'm planning to convert and fully update to 2.0:

http://hadoop.apache.org/docs/r1.0.3/file_system_shell.html
http://hadoop.apache.org/docs/r1.0.3/commands_manual.html

Pages that I'm planning to convert without editing for 2.0 content:

http://hadoop.apache.org/docs/r1.1.0/mapred_tutorial.html
http://hadoop.apache.org/docs/r1.0.3/cluster_setup.html
http://hadoop.apache.org/docs/r1.0.3/deployment_layout.html
http://hadoop.apache.org/docs/r1.0.3/native_libraries.html
http://hadoop.apache.org/docs/r1.0.3/service_level_auth.html
http://hadoop.apache.org/docs/r1.0.3/cluster_setup.html

Pages that I'm planning to delete as no longer relevant:

http://hadoop.apache.org/docs/r1.0.3/HttpAuthentication.html

I've moved mapred\_tutorial and cluster\_setup from "plan to delete" to "plan 
to convert without editing" since my last comment.  I'm not sure if 
HttpAuthentication is salvageable; I would appreciate input on that.  In any 
case, it's fairly easy to resurrect a page.

In fact, I suppose the right thing is simply to leave xdocs/*.xml in place 
(unused), adding apt.vm versions as they're converted, then deleting the unused 
.xml after they are completely redundant.  I'll post a new patch to that effect.

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-6311) Add support for unix domain sockets to JNI libs

2012-10-11 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474694#comment-13474694
 ] 

Andy Isaacson commented on HADOOP-6311:
---

Thinking about this further, it seems like the native code should be as 
minimal as possible, and the database of which FDs are accessible should be 
managed in Java code.  This would remove the need for the red-black tree in C, 
and it would make the whole patch much smaller.  Todd suggested a way to 
completely avoid needing the cookie, as well.

> Add support for unix domain sockets to JNI libs
> ---
>
> Key: HADOOP-6311
> URL: https://issues.apache.org/jira/browse/HADOOP-6311
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: native
>Affects Versions: 0.20.0
>Reporter: Todd Lipcon
>Assignee: Colin Patrick McCabe
> Attachments: 6311-trunk-inprogress.txt, HADOOP-6311.014.patch, 
> HADOOP-6311.016.patch, HADOOP-6311.018.patch, HADOOP-6311.020b.patch, 
> HADOOP-6311.020.patch, HADOOP-6311.021.patch, HADOOP-6311.022.patch, 
> HADOOP-6311-0.patch, HADOOP-6311-1.patch, hadoop-6311.txt
>
>
> For HDFS-347 we need to use unix domain sockets. This JIRA is to include a 
> library in common which adds a o.a.h.net.unix package based on the code from 
> Android (apache 2 license)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-6867) Using socket address for datanode registry breaks multihoming

2012-10-11 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-6867:
--

Description: 
Related: 
* https://issues.apache.org/jira/browse/HADOOP-985
* https://issues.apache.org/jira/secure/attachment/12350813/HADOOP-985-1.patch
* http://old.nabble.com/public-IP-for-datanode-on-EC2-td19336240.html
* 
http://www.cloudera.com/blog/2008/12/securing-a-hadoop-cluster-through-a-gateway/
 

Datanodes register using their dns name (even configurable with 
dfs.datanode.dns.interface). However, the Namenode only really uses the source 
address that the registration came from when sharing it to clients wanting to 
write to HDFS.

Specific environment that causes this problem:
* Datanode and Namenode multihomed on two networks.
* Datanode registers to namenode using dns name on network #1
* Client (distcp) connects to namenode on network #2 \(*) and is told to write 
to datanodes on network #1, which doesn't work for us.

\(*) Allowing contact to the namenode on multiple networks was achieved with a 
socat proxy hack that tunnels network#2 to network#1 port 8020. This is 
unrelated to the issue at hand.


The cloudera link above recommends proxying for reasons other than multihoming; 
it would work, but it doesn't sound like it would work well (bandwidth, 
multiplicity, multitenancy, etc.).

Our specific scenario is wanting to distcp over a different network interface 
than the datanodes register themselves on, but it would be nice if both (all) 
interfaces worked. We are internally going to patch hadoop to roll back parts 
of the patch mentioned above so that we rely on the datanode name rather than the 
socket address it uses to talk to the namenode. The alternate option is to push 
config changes to all nodes that force them to listen/register on one specific 
interface only. This helps us work around our specific problem, but doesn't 
really help with multihoming. 

I would propose that datanodes register all interface addresses during the 
registration/heartbeat/whatever process does this and hdfs clients would be 
given all addresses for a specific node to perform operations against and they 
could select accordingly (or 'whichever worked first') just like round-robin 
dns does.


  was:
Related: 
* https://issues.apache.org/jira/browse/HADOOP-985
* https://issues.apache.org/jira/secure/attachment/12350813/HADOOP-985-1.patch
* http://old.nabble.com/public-IP-for-datanode-on-EC2-td19336240.html
* 
http://www.cloudera.com/blog/2008/12/securing-a-hadoop-cluster-through-a-gateway/
 

Datanodes register using their dns name (even configurable with 
dfs.datanode.dns.interface). However, the Namenode only really uses the source 
address that the registration came from when sharing it to clients wanting to 
write to HDFS.

Specific environment that causes this problem:
* Datanode and Namenode multihomed on two networks.
* Datanode registers to namenode using dns name on network #1
* Client (distcp) connects to namenode on network #2 (*) and is told to write 
to datanodes on network #1, which doesn't work for us.

(*) Allowing contact to the namenode on multiple networks was achieved with a 
socat proxy hack that tunnels network#2 to network#1 port 8020. This is 
unrelated to the issue at hand.


The cloudera link above recommends proxying for other reasons than multihoming, 
but it would work, but it doesn't sound like it would well (bandwidth, 
multiplicity, multitenant, etc).

Our specific scenario is wanting to distcp over a different network interface 
than the datanodes register themselves on, but it would be nice if both (all) 
interfaces worked. We are internally going to patch hadoop to roll back parts 
of the patch mentioned above so that we rely the datanode name rather than the 
socket address it uses to talk to the namenode. The alternate option is to push 
config changes to all nodes that force them to listen/register on one specific 
interface only. This helps us work around our specific problem, but doesn't 
really help with multihoming. 

I would propose that datanodes register all interface addresses during the 
registration/heartbeat/whatever process does this and hdfs clients would be 
given all addresses for a specific node to perform operations against and they 
could select accordingly (or 'whichever worked first') just like round-robin 
dns does.



> Using socket address for datanode registry breaks multihoming
> -
>
> Key: HADOOP-6867
> URL: https://issues.apache.org/jira/browse/HADOOP-6867
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 0.20.2
> Environment: hadoop-0.20-0.20.2+228-1, centos 5, distcp
>Reporter: Jordan Sissel
>
> Related: 
> * https://issues.apache.org/jira/browse/HADOOP-985
> * https://issues.apache.org/j

[jira] [Commented] (HADOOP-6311) Add support for unix domain sockets to JNI libs

2012-10-10 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473757#comment-13473757
 ] 

Andy Isaacson commented on HADOOP-6311:
---

High level comments:
* We need a design doc explaining how all of these bits work together. This 
Jira has gone on long enough that it does not serve as documentation.
* I didn't review the red-black and splay tree implementations at all. I'm not 
sure why we expect this to be big/contended enough to deserve anything beyond a 
trivial hash table, which takes about 20 lines of C.  (ah, I see the code comes 
from \*BSD, so that's good at least.  We should document where and what version 
it came from for future maintainers' sanity.)

{code}
+++ 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/net/fd_server.h
...
+/**
+ * This file includes some common utilities 
+ * for all native code used in hadoop.
+ */
{code}
I don't think this comment is accurate.
{code}
+#include 
{code}
Please move {{#include}}s to the relevant {{.c}} unless they're needed in the 
{{.h}} directly. Doesn't look like it's needed here.
{code}
+memset(&addr, 0, sizeof(struct sockaddr_un));
{code}
I prefer to say {{memset(&x, 0, sizeof\(x\))}} so that the code is clearly 
using the correct size. I don't feel too strongly about this though.
{code}
+addr.sun_family = AF_UNIX;
+if (bind(ud->sock, (struct sockaddr*)&addr, sizeof(sa_family_t)) < 0) {
{code}
This seems to be using the Linux-proprietary "abstract namespace".  If we do 
this it should be a Linux-specific name, not "unixDomainSock" which implies 
that the code is portable to other UNIXes such as Darwin/Mac OS or Solaris or 
FreeBSD.

The abstract socket API is documented at 
http://www.kernel.org/doc/man-pages/online/pages/man7/unix.7.html

(If I'm wrong and the abstract sockets are supported by other OSes then great! 
but I'm pretty sure they're not.)

Talking to Colin offline we confirmed that abstract sockets are Linux-specific, 
but he pointed out that {{unixDomainSockCreateAndBind}} handles both abstract 
sockets and named sockets (in the {{if(jpath)}} branch).  So the name is OK but 
we need a comment calling out the abstract socket use case.  The Linux-specific 
code will compile OK on other OSes, but it might be useful if the exception 
message says "your OS requires an explicit path" on non-Linux (using an 
{{#ifndef __linux__}} perhaps).

The control flow is a little confusing but not too bad, it could use a comment 
perhaps something like {{/* Client requested abstract socket (see unix(7) for 
details) by setting path = null. */}} in the abstract path case.

{code}
+  if (!jpath) {
... 20 lines of code
+  } else {
... 10 lines of code
+  }
{code}

I'd reorder them to {code}if (jpath) { ... } else { /* !jpath */ ... } {code} 
as it's one less bit-flip to think about.

Could you explain the benefits of using abstract sockets?  Why is it better 
than a named socket?  Ideally in a code comment near the implementation, or in 
this Jira.

{code}
+  jthr = newNativeIOException(env, errno, "error autobinding "
+  "PF_UNIX socket: ");
{code}
I don't recognize the phrase "autobinding".  Is that something specific to 
abstract sockets?  If so, it can be described in the documentation.
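
To make the terminology concrete, here is a small standalone sketch (not taken 
from the patch) of what Linux autobind in the abstract namespace does; see 
unix(7).  Everything here is illustrative only.
{code}
/* Linux-only: abstract-namespace autobind demo. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void) {
  struct sockaddr_un addr;
  socklen_t len = sizeof(addr);
  int sock = socket(AF_UNIX, SOCK_STREAM, 0);
  if (sock < 0) { perror("socket"); return 1; }

  memset(&addr, 0, sizeof(addr));
  addr.sun_family = AF_UNIX;

  /* Passing only sizeof(sa_family_t) to bind() is "autobind": the kernel
   * picks a unique abstract-namespace name for the socket. */
  if (bind(sock, (struct sockaddr *)&addr, sizeof(sa_family_t)) < 0) {
    perror("bind");
    return 1;
  }

  /* The assigned name has a leading NUL byte and no filesystem path; an
   * application-chosen abstract name would instead set sun_path[0] = '\0'
   * and copy its own name starting at sun_path[1]. */
  if (getsockname(sock, (struct sockaddr *)&addr, &len) == 0) {
    printf("autobound abstract name: \\0%.*s\n",
           (int)(len - sizeof(sa_family_t) - 1), addr.sun_path + 1);
  }

  close(sock);
  return 0;
}
{code}
Abstract sockets vanish automatically when the last descriptor is closed, 
which is presumably the attraction over a named socket; stating that rationale 
in a comment would address the question above.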
{code}
+  jthr = newNativeIOException(env, EIO, "getsockname():  "
+   "autobound abstract socket identifier started with / ");
{code}
Most of your newNativeIOException texts end with {{: "}} but this one ends with 
{{/ "}}. Best to be consistent.
{code}
+jthrowable unixDomainSetupSockaddr(JNIEnv *env, const char *id,
{code}
I think this function can be static, right?
{code}
+#define RETRY_ON_EINTR(ret, expr) do { \
+  ret = expr; \
+} while ((ret == -1) && (errno == EINTR));
{code}
This probably wants a maximum retry count (hardcoding 100 or thereabouts should 
be fine).
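
One possible shape for that, purely as a sketch (the bound of 100 is 
arbitrary):
{code}
#include <errno.h>

/* Retry an expression that can fail with EINTR, but give up after a fixed
 * number of attempts so a pathological signal storm cannot spin forever. */
#define RETRY_ON_EINTR(ret, expr) do {                              \
  int _attempts = 0;                                                \
  do {                                                              \
    ret = expr;                                                     \
  } while ((ret) == -1 && errno == EINTR && ++_attempts < 100);     \
} while (0)
{code}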

{code}
+static ssize_t safe_write(int fd, const void *b, size_t c)
+{
+  int res;
+
+  while (c > 0) {
+res = write(fd, b, c);
+if (res < 0) {
+  if (errno != EINTR)
+return -errno;
+  continue;
+}
+c -= res;
+b = (char *)b + res;
{code}
* I'd use a local {{char *p = b}} rather than having the cast in the loop.
* if write returns too large a value (which "cannot happen", but bugs happen) 
and c underflows, since it's unsigned the loop will spin forever (2^63 is 
forever).  I'd explicitly check {{if (res > c) return;}} before decrementing c.
* {{write(2)}} returns the number of bytes written.  Seems like we should do 
that here too.  If the user calls safe_write(100), we write 50 and then get 
ENOSPC, we should return 50, I think.  But I'm not sure; maybe that's not the 
right interface contract here.  (A sketch combining these points is below.)
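
Putting those points together, a corrected version might look roughly like the 
following; this is not the patch's code, just one way to satisfy the comments 
above.
{code}
#include <errno.h>
#include <unistd.h>

/* Write exactly c bytes from b, retrying on EINTR.  Returns the number of
 * bytes actually written, or -errno if nothing could be written. */
static ssize_t safe_write(int fd, const void *b, size_t c)
{
  const char *p = b;
  size_t done = 0;

  while (done < c) {
    ssize_t res = write(fd, p + done, c - done);
    if (res < 0) {
      if (errno == EINTR)
        continue;
      return done > 0 ? (ssize_t)done : -errno;
    }
    /* Defensive: a return value larger than requested "cannot happen", but
     * trusting it would underflow the unsigned remaining count. */
    if ((size_t)res > c - done)
      res = (ssize_t)(c - done);
    done += (size_t)res;
  }
  return (ssize_t)done;
}
{code}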

{code}
+  if (!cmsg->cmsg_type == SCM_RIGHTS) {
{code}
Should be {{if (cmsg_type != SCM\_RIGHTS)}}.
{code}
+  if (setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, (char *)&timeou

[jira] [Updated] (HADOOP-8427) Convert Forrest docs to APT

2012-10-10 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8427:
--

Target Version/s:   (was: )
  Status: Patch Available  (was: Open)

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

2012-10-09 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472998#comment-13472998
 ] 

Andy Isaacson commented on HADOOP-8900:
---

bq. It's kind of annoying to have to use 4GB of temporary space

Nope, it only writes the compressed file to disk; {{gzip -1}} compresses 4GB of 
zeros to 18 MiB.

bq. Could you please port it to branch-1 that that we could integrate it to 
branch-1-win

Slavik, thanks for the review!

I don't have very much experience on branch-1; would you like to take a shot at 
the port?  In particular, I don't know very much about the test framework 
differences.  I will figure out the details and do the port later this week if 
you don't get to it first.

> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match 
> decompressed size (Slavik Krassovsky)
> ---
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 1-win, 2.0.1-alpha
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
>Assignee: Andy Isaacson
> Attachments: BuiltInGzipDecompressor2.patch, hadoop8900-2.txt, 
> hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

2012-10-09 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8900:
--

Attachment: hadoop8900-2.txt

New patch for trunk:
* fix all examples of {{long&0x}} in the tree (adds TestVLong).
* verified that 4GB+1 is the relevant edge case, 2GB+1 does not trigger the 
failure.

I'm still a bit unhappy at the long runtime, but 100 seconds is not *that* long 
by the standards of this test suite, so maybe it's worthwhile.

> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match 
> decompressed size (Slavik Krassovsky)
> ---
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 1-win, 2.0.1-alpha
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
>Assignee: Andy Isaacson
> Attachments: BuiltInGzipDecompressor2.patch, hadoop8900-2.txt, 
> hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

2012-10-08 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8900:
--

Affects Version/s: 2.0.1-alpha

> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match 
> decompressed size (Slavik Krassovsky)
> ---
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.1-alpha
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
>Assignee: Andy Isaacson
> Attachments: BuiltInGzipDecompressor2.patch, hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8901) GZip and Snappy support may not work without unversioned libraries

2012-10-08 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8901:
--

Description: 
Currently, we use {{dlopen}} to open {{libz.so}} and {{libsnappy.so}}, to get 
Gzip and Snappy support, respectively.

However, this is not correct; we should be dlopening {{libsnappy.so.1}} 
instead.  The versionless form of the shared library is not commonly installed 
except by development packages.  Also, we may run into subtle compatibility 
problems if a new version of libsnappy comes out.

Thanks to Brandon Vargo for reporting this bug.

  was:
Currently, we use {{dlopen}} to open {{libz.so}} and {{libsnappy.so}}, to get 
Gzip and Snappy support, respectively.

However, this is not correct; we should be dlopening {{libsnappy.so.1}} 
instead.  The versionless form of the shared library is not commonly installed 
except by development packages.  Also, we may run into subtle compatibility 
problems if a new version of libsnappy comes out.


> GZip and Snappy support may not work without unversioned libraries
> --
>
> Key: HADOOP-8901
> URL: https://issues.apache.org/jira/browse/HADOOP-8901
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: native
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HADOOP-8901.001.patch, HADOOP-8901.002.patch
>
>
> Currently, we use {{dlopen}} to open {{libz.so}} and {{libsnappy.so}}, to get 
> Gzip and Snappy support, respectively.
> However, this is not correct; we should be dlopening {{libsnappy.so.1}} 
> instead.  The versionless form of the shared library is not commonly 
> installed except by development packages.  Also, we may run into subtle 
> compatibility problems if a new version of libsnappy comes out.
> Thanks to Brandon Vargo for reporting this bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8901) GZip and Snappy support may not work without unversioned libraries

2012-10-08 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472011#comment-13472011
 ] 

Andy Isaacson commented on HADOOP-8901:
---

I agree with Todd, let's just fix the regression rather than adding zlib as a 
link-time dependency.

> GZip and Snappy support may not work without unversioned libraries
> --
>
> Key: HADOOP-8901
> URL: https://issues.apache.org/jira/browse/HADOOP-8901
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: native
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HADOOP-8901.001.patch, HADOOP-8901.002.patch
>
>
> Currently, we use {{dlopen}} to open {{libz.so}} and {{libsnappy.so}}, to get 
> Gzip and Snappy support, respectively.
> However, this is not correct; we should be dlopening {{libsnappy.so.1}} 
> instead.  The versionless form of the shared library is not commonly 
> installed except by development packages.  Also, we may run into subtle 
> compatibility problems if a new version of libsnappy comes out.
> For libz, I believe we should simply link normally against it, rather than 
> loading it using {{dlopen}}.  It is part of the base Linux install for every 
> distribution I'm aware of, so let's not overcomplicate things.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

2012-10-08 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8900:
--

Status: Patch Available  (was: Open)

> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match 
> decompressed size (Slavik Krassovsky)
> ---
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
>Assignee: Andy Isaacson
> Attachments: hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Work stopped] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

2012-10-08 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-8900 stopped by Andy Isaacson.

> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match 
> decompressed size (Slavik Krassovsky)
> ---
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
>Assignee: Andy Isaacson
> Attachments: hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

2012-10-08 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8900:
--

Attachment: hadoop8900.txt

Attaching patch which corrects this mask issue and adds a testcase which fails 
without the fix.

Unfortunately the testcase takes more than 30 seconds to run on my 2.5GHz Core 
i5, so I doubt that it should be run by default.  The total runtime for 
TestCodec goes from 16 seconds to 99 seconds with testGzipLongOverflow enabled.

> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match 
> decompressed size (Slavik Krassovsky)
> ---
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
> Attachments: hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

2012-10-08 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson reassigned HADOOP-8900:
-

Assignee: Andy Isaacson

> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match 
> decompressed size (Slavik Krassovsky)
> ---
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
>Assignee: Andy Isaacson
> Attachments: hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Work started] (HADOOP-8900) BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match decompressed size (Slavik Krassovsky)

2012-10-08 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-8900 started by Andy Isaacson.

> BuiltInGzipDecompressor : java.io.IOException: stored gzip size doesn't match 
> decompressed size (Slavik Krassovsky)
> ---
>
> Key: HADOOP-8900
> URL: https://issues.apache.org/jira/browse/HADOOP-8900
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: Encountered failure when processing large GZIP file
>Reporter: Slavik Krassovsky
>Assignee: Andy Isaacson
> Attachments: hadoop8900.txt
>
>
> Encountered failure when processing large GZIP file
> • Gz: Failed in 1hrs, 13mins, 57sec with the error:
>  ¸java.io.IOException: IO error in map input file 
> hdfs://localhost:9000/Halo4/json_m/gz/NewFileCat.txt.gz
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:242)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>  at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>  at org.apache.hadoop.mapred.Child.main(Child.java:260)
>  Caused by: java.io.IOException: stored gzip size doesn't match decompressed 
> size
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeTrailerState(BuiltInGzipDecompressor.java:389)
>  at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:224)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:82)
>  at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
>  at java.io.InputStream.read(InputStream.java:102)
>  at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:136)
>  at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:40)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
>  at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>  ... 9 more

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8427) Convert Forrest docs to APT

2012-10-05 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8427:
--

Attachment: hadoop8427.txt

Convert commands_manual.html and file_system_shell.html to APT.  Remove a bunch 
of out-of-date documentation that no longer correctly describes Hadoop 2.0.

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
> Attachments: hadoop8427.txt
>
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HADOOP-8427) Convert Forrest docs to APT

2012-10-05 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson reassigned HADOOP-8427:
-

Assignee: Andy Isaacson

> Convert Forrest docs to APT
> ---
>
> Key: HADOOP-8427
> URL: https://issues.apache.org/jira/browse/HADOOP-8427
> Project: Hadoop Common
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Andy Isaacson
>  Labels: newbie
>
> Some of the forrest docs content in src/docs/src/documentation/content/xdocs 
> has not yet been converted to APT and moved to src/site/apt. Let's convert 
> the forrest docs that haven't been converted yet to new APT content in 
> hadoop-common/src/site/apt (and link the new content into 
> hadoop-project/src/site/apt/index.apt.vm) and remove all forrest dependencies.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

2012-10-01 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467229#comment-13467229
 ] 

Andy Isaacson commented on HADOOP-8845:
---

(sorry for the markup mess-up in my last comment.)

The currently pending patch specifically checks in {{pTestClosure6}} that the 
case I mentioned is handled correctly, so I think we're all on the same page. :)

Code-wise, one minor comment:
{code}
+  public boolean apply(FileStatus input) {
+return input.isDirectory() ? true : false;
+  }
{code}

This is an anti-pattern; {{foo() ? true : false}} is the same as {{foo()}}.

Other than that, LGTM on the code level. I haven't carefully read the 
GlobFilter implementation to see if there's a cleaner/simpler way to implement 
this bugfix.
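
For reference, a minimal sketch of the simplified predicate (same behavior, just 
without the redundant conditional):
{code}
  public boolean apply(FileStatus input) {
    // isDirectory() already returns the boolean we want; no ternary needed.
    return input.isDirectory();
  }
{code}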

> When looking for parent paths info, globStatus must filter out non-directory 
> elements to avoid an AccessControlException
> 
>
> Key: HADOOP-8845
> URL: https://issues.apache.org/jira/browse/HADOOP-8845
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>  Labels: glob
> Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing 
> below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test 
> file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test 
> file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop  0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the 
> subdirectory '/tmp/testdir/1', and ignore the regular file 
> '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop  0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=testuser, access=EXECUTE, 
> inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile 
> cause we tried to access the /tmp/testdir/testfile/testfile as a path. This 
> shouldn't happen, as the testfile is a file and not a path parent to be 
> looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of 
> bypassing permissions, but that can be looked up on another JIRA - if it is 
> fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or 
> /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

2012-10-01 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467158#comment-13467158
 ] 

Andy Isaacson commented on HADOOP-8845:
---

bq. Since * can match the empty string, in other contexts it could be 
appropriate to return ""/tmp/testdir/testfile" for "/tmp/testdir/*/testfile".

That's not right for a Posix-style path glob: /usr/*/bin does match /usr/X11/bin 
but does not match /usr/bin, even though /usr//bin is a valid synonym for 
/usr/bin, and this is an important feature that scripts commonly depend on. For 
example, an admin might {{rm /var/www/user/*/.htaccess}} to remove the .htaccess 
files in each subdirectory while leaving {{/var/www/user/.htaccess}} intact.

So unless there's a specific need for that kind of funky glob, I don't think we 
need to support it?
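
To make that concrete, here is a minimal sketch (assuming a configured 
{{FileSystem}} handle {{fs}} and the directory layout from the description 
below; the names are illustrative only) of what the glob is expected to expand 
to once non-directory parents are filtered out:
{code}
// With the fix, the '*' component only expands through directories, so
// /tmp/testdir/testfile is never treated as a path parent.
FileStatus[] matches = fs.globStatus(new Path("/tmp/testdir/*/testfile"));
for (FileStatus st : matches) {
  System.out.println(st.getPath());  // expected: /tmp/testdir/1/testfile only
}
{code}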

> When looking for parent paths info, globStatus must filter out non-directory 
> elements to avoid an AccessControlException
> 
>
> Key: HADOOP-8845
> URL: https://issues.apache.org/jira/browse/HADOOP-8845
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>  Labels: glob
> Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch
>
>
> A brief description from my colleague Stephen Fritz who helped discover it:
> {code}
> [root@node1 ~]# su - hdfs
> -bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing 
> below
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
> -bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test 
> file in the subdirectory
> -bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test 
> file in the directory
> -bash-4.1$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop  0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/testfile
> All files are where we expect them...OK, let's try reading
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- success!
> -bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
> My Test String <-- success!  
> Note that we used an '*' in the cat command, and it correctly found the 
> subdirectory '/tmp/testdir/1', and ignore the regular file 
> '/tmp/testdir/testfile'
> -bash-4.1$ exit
> logout
> [root@node1 ~]# su - testuser <-- lets try it as a different user:
> [testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
> drwxr-xr-x   - hdfs hadoop  0 2012-09-25 06:52 /tmp/testdir/1
> -rw-r--r--   3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/1/testfile
> -rw-r--r--   3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/testfile
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
> My Test String <-- good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
> My Test String <-- so far so good
> [testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
> cat: org.apache.hadoop.security.AccessControlException: Permission denied: 
> user=testuser, access=EXECUTE, 
> inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
> {code}
> Essentially, we hit a ACE with access=EXECUTE on file /tmp/testdir/testfile 
> cause we tried to access the /tmp/testdir/testfile/testfile as a path. This 
> shouldn't happen, as the testfile is a file and not a path parent to be 
> looked up upon.
> {code}
> 2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
> {code}
> Surprisingly the superuser avoids hitting into the error, as a result of 
> bypassing permissions, but that can be looked up on another JIRA - if it is 
> fine to let it be like that or not.
> This JIRA targets a client-sided fix to not cause such /path/file/dir or 
> /path/file/file kinda lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8568) DNS#reverseDns fails on IPv6 addresses

2012-10-01 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467092#comment-13467092
 ] 

Andy Isaacson commented on HADOOP-8568:
---

{code}
+// rawaddr bytes are of type unsigned int - this converts the given
+// byte to a (signed) int "rawintaddr"
+int rawintaddr = rawaddr[i] & 0xff;
+// format "rawintaddr" into a hex String
+String addressbyte = String.format("%02x", rawintaddr);
{code}
This can be more simply and clearly written as
{code}
String addressbyte = String.format("%02x", rawaddr[i] & 0xff);
{code}
It's actually sufficient to say {{format("%02x", rawaddr[i])}} but I find that 
a little too magic; making the 8-bit truncation explicit seems to more clearly 
express the intent to me.  (The mask-free version only gives the correct 
two-nibble output because of the overspecified {{FormatSpecifier#print(byte, 
Locale)}} implementation in {{java.util.Formatter}}, and breaks if you change 
to a local int variable for example.)
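
A tiny illustration of why the mask matters (Java bytes are signed, so widening 
to an {{int}} without the mask changes the formatted output):
{code}
byte b = (byte) 0xAB;                                // -85 as a signed Java byte
String masked   = String.format("%02x", b & 0xff);   // "ab"
int widened = b;                                     // sign-extends to -85
String unmasked = String.format("%02x", widened);    // "ffffffab"
{code}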

> DNS#reverseDns fails on IPv6 addresses
> --
>
> Key: HADOOP-8568
> URL: https://issues.apache.org/jira/browse/HADOOP-8568
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Tony Kew
>  Labels: newbie
> Attachments: HADOOP-8568.patch
>
>
> DNS#reverseDns assumes hostIp is a v4 address (4 parts separated by dots), 
> blows up if given a v6 address:
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 3
> at org.apache.hadoop.net.DNS.reverseDns(DNS.java:79)
> at org.apache.hadoop.net.DNS.getHosts(DNS.java:237)
> at org.apache.hadoop.net.DNS.getDefaultHost(DNS.java:340)
> at org.apache.hadoop.net.DNS.getDefaultHost(DNS.java:358)
> at org.apache.hadoop.net.DNS.getDefaultHost(DNS.java:337)
> at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:235)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1649)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8568) DNS#reverseDns fails on IPv6 addresses

2012-10-01 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467037#comment-13467037
 ] 

Andy Isaacson commented on HADOOP-8568:
---

Tony,

no need to delete old patches when uploading a new one; it can be useful for 
reviewers to have the old patches available, either to use {{interdiff}} or 
similar tools or simply to review how a change has evolved.  I tend to use a new 
name (hdfs-123.patch, hdfs-123-1.patch, etc.) for each upload, but that's just 
for my convenience, since Jira keeps track of different uploads with the same 
name just fine.

{code}
+  byte rawaddr[] = hostIp.getAddress();
...
+  String[] parts = hostaddr.split("\\.");
+  reverseIP = parts[3] + "." + parts[2] + "." + parts[1] + "."
++ parts[0] + ".in-addr.arpa";
{code}
I think the {{byte[]}} version of this code, used for IPv6, is significantly 
superior to the regex-based string version used for IPv4.  Could you rewrite 
the IPv4 section of the code using {{getAddress()}}?  This may also result in 
greater code sharing between the two branches.
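
For illustration only, a rough sketch of how the IPv4 reverse name could be 
built from the raw bytes (this is not the patch; the variable names just echo 
the snippet quoted above):
{code}
byte[] rawaddr = hostIp.getAddress();                // 4 bytes for IPv4
StringBuilder reverseIP = new StringBuilder();
for (int i = rawaddr.length - 1; i >= 0; i--) {
  reverseIP.append(rawaddr[i] & 0xff).append('.');   // unsigned decimal octet
}
reverseIP.append("in-addr.arpa");                    // e.g. 2.1.168.192.in-addr.arpa
{code}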


> DNS#reverseDns fails on IPv6 addresses
> --
>
> Key: HADOOP-8568
> URL: https://issues.apache.org/jira/browse/HADOOP-8568
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Tony Kew
>  Labels: newbie
> Attachments: HADOOP-8568.patch
>
>
> DNS#reverseDns assumes hostIp is a v4 address (4 parts separated by dots), 
> blows up if given a v6 address:
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 3
> at org.apache.hadoop.net.DNS.reverseDns(DNS.java:79)
> at org.apache.hadoop.net.DNS.getHosts(DNS.java:237)
> at org.apache.hadoop.net.DNS.getDefaultHost(DNS.java:340)
> at org.apache.hadoop.net.DNS.getDefaultHost(DNS.java:358)
> at org.apache.hadoop.net.DNS.getDefaultHost(DNS.java:337)
> at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:235)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1649)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8756) Fix SEGV when libsnappy is in java.library.path but not LD_LIBRARY_PATH

2012-09-28 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465980#comment-13465980
 ] 

Andy Isaacson commented on HADOOP-8756:
---

+1

I've reviewed HADOOP-8756.004.patch and I have no further issues.  This is a 
good localized fix that is neither dependent on nor obsoleted by the other 
patches under discussion or already checked in. Without this patch, a local "mvn 
-Pnative,dist clean package -Dmaven.javadoc.skip=true -DskipTests; mvn test 
-Dtest=TestCodec" fails:
{code}
Running org.apache.hadoop.io.compress.TestCodec
Tests run: 21, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 15.956 sec <<< 
FAILURE!

Results :

Failed tests:   testSnappyCodec(org.apache.hadoop.io.compress.TestCodec): 
Snappy native available but Hadoop native not
{code}

> Fix SEGV when libsnappy is in java.library.path but not LD_LIBRARY_PATH
> ---
>
> Key: HADOOP-8756
> URL: https://issues.apache.org/jira/browse/HADOOP-8756
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: native
>Affects Versions: 2.0.2-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HADOOP-8756.002.patch, HADOOP-8756.003.patch, 
> HADOOP-8756.004.patch
>
>
> We use {{System.loadLibrary("snappy")}} from the Java side.  However in 
> libhadoop, we use {{dlopen}} to open libsnappy.so dynamically.  
> System.loadLibrary uses {{java.library.path}} to resolve libraries, and 
> {{dlopen}} uses {{LD_LIBRARY_PATH}} and the system paths to resolve 
> libraries.  Because of this, the two library loading functions can be at odds.
> We should fix this so we only load the library once, preferably using the 
> standard Java {{java.library.path}}.
> We should also log the search path(s) we use for {{libsnappy.so}} when 
> loading fails, so that it's easier to diagnose configuration issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8386) hadoop script doesn't work if 'cd' prints to stdout (default behavior in Ubuntu)

2012-09-28 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8386:
--

Attachment: hadoop-8386-1.diff

Let's try that again, against trunk rather than branch-1 this time.

> hadoop script doesn't work if 'cd' prints to stdout (default behavior in 
> Ubuntu)
> 
>
> Key: HADOOP-8386
> URL: https://issues.apache.org/jira/browse/HADOOP-8386
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 1.0.2
> Environment: Ubuntu
>Reporter: Christopher Berner
> Attachments: hadoop-8386-1.diff, hadoop-8386.diff, hadoop.diff
>
>
> if the 'hadoop' script is run as 'bin/hadoop' on a distro where the 'cd' 
> command prints to stdout, the script will fail due to this line: 'bin=`cd 
> "$bin"; pwd`'
> Workaround: execute from the bin/ directory as './hadoop'
> Fix: change that line to 'bin=`cd "$bin" > /dev/null; pwd`'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8386) hadoop script doesn't work if 'cd' prints to stdout (default behavior in Ubuntu)

2012-09-28 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8386:
--

Attachment: hadoop-8386.diff

Uploading a version of Christopher's patch that test-patch can apply.

> hadoop script doesn't work if 'cd' prints to stdout (default behavior in 
> Ubuntu)
> 
>
> Key: HADOOP-8386
> URL: https://issues.apache.org/jira/browse/HADOOP-8386
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 1.0.2
> Environment: Ubuntu
>Reporter: Christopher Berner
> Attachments: hadoop-8386.diff, hadoop.diff
>
>
> if the 'hadoop' script is run as 'bin/hadoop' on a distro where the 'cd' 
> command prints to stdout, the script will fail due to this line: 'bin=`cd 
> "$bin"; pwd`'
> Workaround: execute from the bin/ directory as './hadoop'
> Fix: change that line to 'bin=`cd "$bin" > /dev/null; pwd`'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8386) hadoop script doesn't work if 'cd' prints to stdout (default behavior in Ubuntu)

2012-09-28 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465962#comment-13465962
 ] 

Andy Isaacson commented on HADOOP-8386:
---

The GNU Autoconf manual has this to say:
http://www.gnu.org/software/autoconf/manual/autoconf-2.67/html_node/Special-Shell-Variables.html
{quote}
CDPATH
When this variable is set it specifies a list of directories to search when 
invoking cd with a relative file name that did not start with ‘./’ or ‘../’. 
Posix 1003.1-2001 says that if a nonempty directory name from CDPATH is used 
successfully, cd prints the resulting absolute file name. Unfortunately this 
output can break idioms like ‘abs=`cd src && pwd`’ because abs receives the 
name twice. Also, many shells do not conform to this part of Posix; for 
example, zsh prints the result only if a directory name other than . was chosen 
from CDPATH.

In practice the shells that have this problem also support unset, so you 
can work around the problem as follows:

  (unset CDPATH) >/dev/null 2>&1 && unset CDPATH

You can also avoid output by ensuring that your directory name is absolute 
or anchored at ‘./’, as in ‘abs=`cd ./src && pwd`’. 
{quote}
So the Bash behavior is specified by Posix, alas. It is specified to write to 
stdout, not stderr, so the patch is correct in that regard (I was concerned we 
might also need {{2>&1}} or similar).

So, LGTM.  I'll upload a patch that conforms to test-patch expectations. Thanks 
for the contribution, Christopher!

> hadoop script doesn't work if 'cd' prints to stdout (default behavior in 
> Ubuntu)
> 
>
> Key: HADOOP-8386
> URL: https://issues.apache.org/jira/browse/HADOOP-8386
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 1.0.2
> Environment: Ubuntu
>Reporter: Christopher Berner
> Attachments: hadoop.diff
>
>
> if the 'hadoop' script is run as 'bin/hadoop' on a distro where the 'cd' 
> command prints to stdout, the script will fail due to this line: 'bin=`cd 
> "$bin"; pwd`'
> Workaround: execute from the bin/ directory as './hadoop'
> Fix: change that line to 'bin=`cd "$bin" > /dev/null; pwd`'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8386) hadoop script doesn't work if 'cd' prints to stdout (default behavior in Ubuntu)

2012-09-26 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464385#comment-13464385
 ] 

Andy Isaacson commented on HADOOP-8386:
---

If it's not an alias, cd hackery is almost always done using a shell function.  
Also, a /bin/cd command cannot work -- running it would fork a child process, 
change the child process's working directory, and then exit, having no impact on 
the parent shell process.

To figure out what your cd is doing, in bash use "type cd".
{noformat}
# first, define a function foo
$ foo() { echo bar; }
# now, run it
$ foo
bar
$ type foo
foo is a function
foo () 
{ 
echo bar
}
$
{noformat}
In dash, {{type}} just says {{foo is a shell function}}.  I bet the original 
user is using bash though.

bq. Fixes the 'hadoop' script to work on Ubuntu distro and others where the 
'cd' command prints to stdout

My Ubuntu 12.04 install doesn't have any aliases or functions defined for cd; 
can you find out which package is installing the evil settings in 
/etc/bash_completion.d (most likely) and file an upstream bug?
{code}
ubuntu@ubu-cdh-0:~$ type cd
cd is a shell builtin
{code}

> hadoop script doesn't work if 'cd' prints to stdout (default behavior in 
> Ubuntu)
> 
>
> Key: HADOOP-8386
> URL: https://issues.apache.org/jira/browse/HADOOP-8386
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 1.0.2
> Environment: Ubuntu
>Reporter: Christopher Berner
> Attachments: hadoop.diff
>
>
> if the 'hadoop' script is run as 'bin/hadoop' on a distro where the 'cd' 
> command prints to stdout, the script will fail due to this line: 'bin=`cd 
> "$bin"; pwd`'
> Workaround: execute from the bin/ directory as './hadoop'
> Fix: change that line to 'bin=`cd "$bin" > /dev/null; pwd`'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8803) Make Hadoop running more secure public cloud envrionment

2012-09-26 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464369#comment-13464369
 ] 

Andy Isaacson commented on HADOOP-8803:
---

bq. In switched networks, which all reasonable clusters are configured, you 
only see the traffic to/from the compromised NIC.
Switch MAC tables are not a security measure; it's pretty easy to fool most 
switches into sending traffic to the wrong port.  Managed switches can often be 
configured to avoid this failure or to alarm when MAC spoofing happens, but 
that's additional admin overhead.

So yeah, it's a real threat that a compromised machine can observe and MITM 
traffic to other hosts in the same broadcast domain.

I'm not sure the block token is the right solution to this problem, but it is a 
real problem.

> Make Hadoop running more secure public cloud envrionment
> 
>
> Key: HADOOP-8803
> URL: https://issues.apache.org/jira/browse/HADOOP-8803
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, ipc, security
>Affects Versions: 0.20.204.0
>Reporter: Xianqing Yu
>  Labels: hadoop
>   Original Estimate: 2m
>  Remaining Estimate: 2m
>
> I am a Ph.D student in North Carolina State University. I am modifying the 
> Hadoop's code (which including most parts of Hadoop, e.g. JobTracker, 
> TaskTracker, NameNode, DataNode) to achieve better security.
>  
> My major goal is that make Hadoop running more secure in the Cloud 
> environment, especially for public Cloud environment. In order to achieve 
> that, I redesign the currently security mechanism and achieve following 
> proprieties:
> 1. Bring byte-level access control to Hadoop HDFS. Based on 0.20.204, HDFS 
> access control is based on user or block granularity, e.g. HDFS Delegation 
> Token only check if the file can be accessed by certain user or not, Block 
> Token only proof which block or blocks can be accessed. I make Hadoop can do 
> byte-granularity access control, each access party, user or task process can 
> only access the bytes she or he least needed.
> 2. I assume that in the public Cloud environment, only Namenode, secondary 
> Namenode, JobTracker can be trusted. A large number of Datanode and 
> TaskTracker may be compromised due to some of them may be running under less 
> secure environment. So I re-design the secure mechanism to make the damage 
> the hacker can do to be minimized.
>  
> a. Re-design the Block Access Token to solve wildly shared-key problem of 
> HDFS. In original Block Access Token design, all HDFS (Namenode and Datanode) 
> share one master key to generate Block Access Token, if one DataNode is 
> compromised by hacker, the hacker can get the key and generate any  Block 
> Access Token he or she want.
>  
> b. Re-design the HDFS Delegation Token to do fine-grain access control for 
> TaskTracker and Map-Reduce Task process on HDFS. 
>  
> In the Hadoop 0.20.204, all TaskTrackers can use their kerberos credentials 
> to access any files for MapReduce on HDFS. So they have the same privilege as 
> JobTracker to do read or write tokens, copy job file, etc.. However, if one 
> of them is compromised, every critical thing in MapReduce directory (job 
> file, Delegation Token) is exposed to attacker. I solve the problem by making 
> JobTracker to decide which TaskTracker can access which file in MapReduce 
> Directory on HDFS.
>  
> For Task process, once it get HDFS Delegation Token, it can access everything 
> belong to this job or user on HDFS. By my design, it can only access the 
> bytes it needed from HDFS.
>  
> There are some other improvement in the security, such as TaskTracker can not 
> know some information like blockID from the Block Token (because it is 
> encrypted by my way), and HDFS can set up secure channel to send data as a 
> option.
>  
> By those features, Hadoop can run much securely under uncertain environment 
> such as Public Cloud. I already start to test my prototype. I want to know 
> that whether community is interesting about my work? Is that a value work to 
> contribute to production Hadoop?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8855) SSL-based image transfer does not work when Kerberos is disabled

2012-09-26 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464261#comment-13464261
 ] 

Andy Isaacson commented on HADOOP-8855:
---

Further to the above, I've also verified that using the patched JAR, the 2NN is 
able to retrieve the fsimage from the NN with config
{code}
hadoop.ssl.enabled
true
...
hadoop.security.authorization
false
...
hadoop.security.authentication
simple
{code}

Thanks for the fix, Todd.

> SSL-based image transfer does not work when Kerberos is disabled
> 
>
> Key: HADOOP-8855
> URL: https://issues.apache.org/jira/browse/HADOOP-8855
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hadoop-8855.txt, hadoop-8855.txt, hadoop-8855.txt
>
>
> In SecurityUtil.openSecureHttpConnection, we first check 
> {{UserGroupInformation.isSecurityEnabled()}}. However, this only checks the 
> kerberos config, which is independent of {{hadoop.ssl.enabled}}. Instead, we 
> should check {{HttpConfig.isSecure()}}.
> Credit to Wing Yew Poon for discovering this bug

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8855) SSL-based image transfer does not work when Kerberos is disabled

2012-09-26 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464217#comment-13464217
 ] 

Andy Isaacson commented on HADOOP-8855:
---

I tested Todd's patch on a cluster with various permutations of krb5 and SSL. 
With the patched JAR, all of my tests passed.
- hadoop.security.authentication=kerberos hadoop.ssl.enabled=true: dfsadmin 
-fetchImage works.
- hadoop.security.authentication=simple hadoop.ssl.enabled=true: fetchImage 
works.
- hadoop.security.authentication=kerberos hadoop.ssl.enabled=false: fetchImage 
works.

I also duplicated Todd's observation that {{dfsadmin -fetchImage}} does not 
work on krb5 without the doAs.

> SSL-based image transfer does not work when Kerberos is disabled
> 
>
> Key: HADOOP-8855
> URL: https://issues.apache.org/jira/browse/HADOOP-8855
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hadoop-8855.txt, hadoop-8855.txt, hadoop-8855.txt
>
>
> In SecurityUtil.openSecureHttpConnection, we first check 
> {{UserGroupInformation.isSecurityEnabled()}}. However, this only checks the 
> kerberos config, which is independent of {{hadoop.ssl.enabled}}. Instead, we 
> should check {{HttpConfig.isSecure()}}.
> Credit to Wing Yew Poon for discovering this bug

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8855) SSL-based image transfer does not work when Kerberos is disabled

2012-09-26 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464077#comment-13464077
 ] 

Andy Isaacson commented on HADOOP-8855:
---

LGTM.

> SSL-based image transfer does not work when Kerberos is disabled
> 
>
> Key: HADOOP-8855
> URL: https://issues.apache.org/jira/browse/HADOOP-8855
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Attachments: hadoop-8855.txt
>
>
> In SecurityUtil.openSecureHttpConnection, we first check 
> {{UserGroupInformation.isSecurityEnabled()}}. However, this only checks the 
> kerberos config, which is independent of {{hadoop.ssl.enabled}}. Instead, we 
> should check {{HttpConfig.isSecure()}}.
> Credit to Wing Yew Poon for discovering this bug

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8806) libhadoop.so: dlopen should be better at locating libsnappy.so, etc.

2012-09-17 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457236#comment-13457236
 ] 

Andy Isaacson commented on HADOOP-8806:
---

I didn't intend my exploratory tarball to derail my Friday +1.  +1.

> libhadoop.so: dlopen should be better at locating libsnappy.so, etc.
> 
>
> Key: HADOOP-8806
> URL: https://issues.apache.org/jira/browse/HADOOP-8806
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HADOOP-8806.003.patch, rpathtest2.tar.gz, 
> rpathtest.tar.gz
>
>
> libhadoop calls {{dlopen}} to load {{libsnappy.so}} and {{libz.so}}.  These 
> libraries can be bundled in the {{$HADOOP_ROOT/lib/native}} directory.  For 
> example, the {{-Dbundle.snappy}} build option copies {{libsnappy.so}} to this 
> directory.  However, snappy can't be loaded from this directory unless 
> {{LD_LIBRARY_PATH}} is set to include this directory.
> Can we make this configuration "just work" without needing to rely on 
> {{LD_LIBRARY_PATH}}?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8615) EOFException in DecompressorStream.java needs to be more verbose

2012-09-17 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457199#comment-13457199
 ] 

Andy Isaacson commented on HADOOP-8615:
---

Thomas,
Thank you for the patch!

bq. Please let me know if there is any procedure for making this tested only on 
hadoop common
release 0.20.2

Don't worry about the robots testing against the wrong branch; you're not doing 
anything wrong.

It seems to me this change would be a good thing on trunk as well. Can you port 
the patch to trunk?

bq. When the user uses this method and pass the filename, it would be printed 
in the EOF exception thrown, if any. So I believe the test cases may not be 
necessary. I was able to test it locally by forcefully creating an EOF 
Exception and verifying the new message as "java.io.EOFException: Unexpected 
end of input stream in the file = filename"

I think this should be fairly easy to test -- just write a compressed stream, 
truncate the compressed stream, then try to read it, catch the EOFException and 
verify that the filename shows up in the exception text.  Or am I missing 
something?
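
Roughly something like this, as a sketch inside a JUnit test (it assumes the new 
{{createInputStream(in, decompressor, fileName)}} overload added by this patch; 
everything else is stock Hadoop/JUnit):
{code}
Configuration conf = new Configuration();
GzipCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);
File f = new File(System.getProperty("test.build.data", "/tmp"), "truncated.gz");

// write a small compressed stream
OutputStream out = codec.createOutputStream(new FileOutputStream(f));
out.write("some test data for truncation".getBytes("UTF-8"));
out.close();

// chop a few bytes off the end to force an unexpected EOF while decompressing
RandomAccessFile raf = new RandomAccessFile(f, "rw");
raf.setLength(raf.length() - 4);
raf.close();

// read it back and check that the file name shows up in the exception message
InputStream in = codec.createInputStream(
    new FileInputStream(f), codec.createDecompressor(), f.getName());
try {
  while (in.read() != -1) { /* drain */ }
  fail("expected EOFException");
} catch (EOFException e) {
  assertTrue(e.getMessage().contains(f.getName()));
} finally {
  in.close();
}
{code}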

I'm a little worried about the places where your {{fileName}}-using methods add 
new default values, for example:
{code}
+  public CompressionInputStream createInputStream(InputStream in, 
+Decompressor decompressor, String fileName) 
+  throws IOException {
+return new DecompressorStream(in, decompressor, 
+   conf.getInt("io.file.buffer.size", 4*1024),fileName);
+  }
{code}
I'll have to think about it longer, but having a default value of 4k hidden in 
this method seems wrong to me at first glance.  There are a few other 
instances of this as well.

> EOFException in DecompressorStream.java needs to be more verbose
> 
>
> Key: HADOOP-8615
> URL: https://issues.apache.org/jira/browse/HADOOP-8615
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.20.2
>Reporter: Jeff Lord
>  Labels: patch
> Attachments: HADOOP-8615-release-0.20.2.patch
>
>
> In ./src/core/org/apache/hadoop/io/compress/DecompressorStream.java
> The following exception should at least pass back the file that it encounters 
> this error in relation to:
>   protected void getCompressedData() throws IOException {
> checkStream();
> int n = in.read(buffer, 0, buffer.length);
> if (n == -1) {
>   throw new EOFException("Unexpected end of input stream");
> }
> This would help greatly to debug bad/corrupt files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HADOOP-8806) libhadoop.so: dlopen should be better at locating libsnappy.so, etc.

2012-09-14 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HADOOP-8806:
--

Attachment: rpathtest.tar.gz

Attaching rpathtest.tar.gz which contains a test program for various rpath 
scenarios.

> libhadoop.so: dlopen should be better at locating libsnappy.so, etc.
> 
>
> Key: HADOOP-8806
> URL: https://issues.apache.org/jira/browse/HADOOP-8806
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HADOOP-8806.003.patch, rpathtest.tar.gz
>
>
> libhadoop calls {{dlopen}} to load {{libsnappy.so}} and {{libz.so}}.  These 
> libraries can be bundled in the {{$HADOOP_ROOT/lib/native}} directory.  For 
> example, the {{-Dbundle.snappy}} build option copies {{libsnappy.so}} to this 
> directory.  However, snappy can't be loaded from this directory unless 
> {{LD_LIBRARY_PATH}} is set to include this directory.
> Can we make this configuration "just work" without needing to rely on 
> {{LD_LIBRARY_PATH}}?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8806) libhadoop.so: dlopen should be better at locating libsnappy.so, etc.

2012-09-14 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456291#comment-13456291
 ] 

Andy Isaacson commented on HADOOP-8806:
---

I ran a standalone test of {{DT_RPATH $ORIGIN}}.  There's good news and bad 
news which is probably ok.

On the plus side, setting DT_RPATH=$ORIGIN on libhadoop.so does allow it to 
find libsnappy.so in the same directory.  This is good.

On the downside, setting DT_RPATH=$ORIGIN on libhadoop.so pollutes the search 
path for later dlopens in the main executable.  In my test,
{noformat}
  main.c
-> dlopens /path/to/libbar.so with DT_RPATH=$ORIGIN
  -> dlopens libfoo.so
-> dlopens libfoo.so
{noformat}
Since main.c is supposed to be unaffected by the behavior of libbar, the final 
dlopen should fail, because main's search path does not include libfoo.

Unfortunately the final dlopen succeeds.  This means that the libfoo opened by 
libbar, while it stays loaded, is available for dlopen by main.

Modifying the test so that libbar.so dlclose()s libfoo before returning fixes 
the problem; the final open of libfoo fails again.  So it's not a problem of 
libbar's DT_RPATH polluting the main executable's search path, but rather of a 
single table of currently open objects.

What does this mean?  Suppose we have a libsnappy.1.0.14 in $ORIGIN, and the 
system has a libsnappy.1.0.20 in /usr/lib.  A program which uses libhadoop will 
get 1.0.14 if libhadoop is opened before libsnappy, and 1.0.20 if libsnappy is 
opened before libhadoop.  This kinda sucks. But, it's definitely a corner case, 
so maybe it's OK.

If we used RUNPATH rather than RPATH, the user could work around the above 
problem by setting LD_LIBRARY_PATH.  Since it's probably safer from a build 
perspective to use RPATH (I couldn't even figure out how to get ld-2.22 to set 
RUNPATH), LD_LIBRARY_PATH will not work.

> libhadoop.so: dlopen should be better at locating libsnappy.so, etc.
> 
>
> Key: HADOOP-8806
> URL: https://issues.apache.org/jira/browse/HADOOP-8806
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HADOOP-8806.003.patch
>
>
> libhadoop calls {{dlopen}} to load {{libsnappy.so}} and {{libz.so}}.  These 
> libraries can be bundled in the {{$HADOOP_ROOT/lib/native}} directory.  For 
> example, the {{-Dbundle.snappy}} build option copies {{libsnappy.so}} to this 
> directory.  However, snappy can't be loaded from this directory unless 
> {{LD_LIBRARY_PATH}} is set to include this directory.
> Can we make this configuration "just work" without needing to rely on 
> {{LD_LIBRARY_PATH}}?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8806) libhadoop.so: dlopen should be better at locating libsnappy.so, etc.

2012-09-14 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456261#comment-13456261
 ] 

Andy Isaacson commented on HADOOP-8806:
---

This seems like a reasonable stopgap to me. +1.

Note that {{$ORIGIN}} is documented in {{ld-linux(8)}}:
{noformat}
RPATH TOKEN EXPANSION
   The  runtime  linker provides a number of tokens that can be used in an
   rpath specification (DT_RPATH or DT_RUNPATH).

   $ORIGIN
  ld.so understands the string $ORIGIN (or equivalently ${ORIGIN})
  in  an  rpath specification to mean the directory containing the
  application  executable.
{noformat}

> libhadoop.so: dlopen should be better at locating libsnappy.so, etc.
> 
>
> Key: HADOOP-8806
> URL: https://issues.apache.org/jira/browse/HADOOP-8806
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HADOOP-8806.003.patch
>
>
> libhadoop calls {{dlopen}} to load {{libsnappy.so}} and {{libz.so}}.  These 
> libraries can be bundled in the {{$HADOOP_ROOT/lib/native}} directory.  For 
> example, the {{-Dbundle.snappy}} build option copies {{libsnappy.so}} to this 
> directory.  However, snappy can't be loaded from this directory unless 
> {{LD_LIBRARY_PATH}} is set to include this directory.
> Can we make this configuration "just work" without needing to rely on 
> {{LD_LIBRARY_PATH}}?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8791) rm "Only deletes non empty directory and files."

2012-09-14 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456116#comment-13456116
 ] 

Andy Isaacson commented on HADOOP-8791:
---

bq. we could add a new rmdir in FsShell

There's already {{hdfs dfs -rmdir}} in trunk.  It behaves like Unix rmdir: it 
only deletes empty directories, failing with "Directory is not empty" if there 
are files or subdirectories.

> rm "Only deletes non empty directory and files."
> 
>
> Key: HADOOP-8791
> URL: https://issues.apache.org/jira/browse/HADOOP-8791
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.0.3, 3.0.0
>Reporter: Bertrand Dechoux
>Assignee: Jing Zhao
>  Labels: documentation
> Attachments: HADOOP-8791-branch-1.patch, HADOOP-8791-trunk.patch
>
>
> The documentation (1.0.3) is describing the opposite of what rm does.
> It should be  "Only delete files and empty directories."
> With regards to file, the size of the file should not matter, should it?
> OR I am totally misunderstanding the semantic of this command and I am not 
> the only one.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HADOOP-8806) libhadoop.so: search java.library.path when calling dlopen

2012-09-13 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455531#comment-13455531
 ] 

Andy Isaacson commented on HADOOP-8806:
---

Another potential issue -- there is plenty of fun debugging waiting for the 
first developer who tries to have a dynamic libsnappy.so and a static 
snappy.a-in-libhadoop.so in the same executable.  Supposedly that scenario can 
be made to work, but I've had no end of trouble with similar scenarios 
previously.

> libhadoop.so: search java.library.path when calling dlopen
> --
>
> Key: HADOOP-8806
> URL: https://issues.apache.org/jira/browse/HADOOP-8806
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Priority: Minor
>
> libhadoop calls {{dlopen}} to load {{libsnappy.so}} and {{libz.so}}.  These 
> libraries can be bundled in the {{$HADOOP_ROOT/lib/native}} directory.  For 
> example, the {{-Dbundle.snappy}} build option copies {{libsnappy.so}} to this 
> directory.  However, snappy can't be loaded from this directory unless 
> {{LD_LIBRARY_PATH}} is set to include this directory.
> Should we also search {{java.library.path}} when loading these libraries?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

