[
https://issues.apache.org/jira/browse/WHIRR-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125229#comment-13125229
]
John Conwell commented on WHIRR-392:
------------------------------------
Correct, whirr creates a hadoop-site.xml file with enough of the settings for
me to run hadoop commands pointed at the correct namenode.
But I'm using whirr to fully deploy my application to EC2, which is made up of
multiple clusters of resources, hadoop being just one. One of my clusters uses
the hadoop API to launch jobs that run in the hadoop cluster. When this
cluster needs to run a hadoop job, it creates a new hadoop Configuration()
instance, which looks for the hadoop config on the VM it's running from. It
can't find one (because that VM is not one of the hadoop nodes), so it falls
back to the default hadoop config values, e.g. a single reducer, and none of
the custom settings from my whirr config file are applied.
So even though the hadoop cluster is configured correctly, the namenode uses
the job configuration that gets passed to it (the one with the default values),
and my jobs run incorrectly. So what I need is a way to get the three
*-site.xml files that whirr created and wrote to each hadoop node. If those
three files are on the classpath of the process that launches the hadoop jobs,
it picks up the custom configuration values and my hadoop jobs run correctly.
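To make the mechanism concrete: putting the *-site.xml files on the driver's classpath works because Configuration reads the <property><name>/<value> pairs out of each resource and overlays them on the built-in defaults. Below is a minimal JDK-only sketch of that parsing step (the class name, file contents, and the mapred.reduce.tasks sample are illustrative, not whirr's actual output):

```java
import java.io.ByteArrayInputStream;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative sketch of how Hadoop-style *-site.xml property pairs are
// read, mimicking what Configuration does with resources it finds on the
// classpath. The sample XML in main() is hypothetical.
public class SiteXmlSketch {
    static Map<String, String> parseSiteXml(String xml) throws Exception {
        Map<String, String> props = new HashMap<>();
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent();
            String value = p.getElementsByTagName("value").item(0).getTextContent();
            props.put(name, value); // later resources override earlier ones
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        String mapredSite =
              "<configuration>"
            + "  <property><name>mapred.reduce.tasks</name><value>10</value></property>"
            + "</configuration>";
        // Without this overlay, a driver off-cluster sees the default of 1 reducer.
        System.out.println(parseSiteXml(mapredSite).get("mapred.reduce.tasks"));
    }
}
```

In the real driver the equivalent is simply constructing new Configuration() with core-site.xml, hdfs-site.xml, and mapred-site.xml on the classpath, so the job submitted to the cluster carries the cluster's settings rather than the defaults.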
> Save *-site.xml files configured for hadoop cluster to local machine.
> ---------------------------------------------------------------------
>
> Key: WHIRR-392
> URL: https://issues.apache.org/jira/browse/WHIRR-392
> Project: Whirr
> Issue Type: New Feature
> Components: service/hadoop
> Reporter: John Conwell
>
> When using whirr to create a hadoop cluster, whirr should save copies of the
> three *-site.xml files it configures for the hadoop cluster to the local
> machine. This way when launching a job from the local machine, the job
> driver can read these config files when creating a new Configuration object,
> and the hadoop job will be properly configured to execute on the target
> cluster.