[jira] [Updated] (IGNITE-483) Simplify Hadoop "ignition"

Ivan Veselovsky (JIRA) Tue, 17 Nov 2015 02:59:28 -0800

     [ https://issues.apache.org/jira/browse/IGNITE-483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Ivan Veselovsky updated IGNITE-483:
-----------------------------------
    Description: 
Currently in Ignite we have a setup-hadoop script plus a Java class that
replace configs and create symlinks to Ignite libraries.
This setup script and the Ignite launcher script have many dependencies on the
Hadoop distribution layout (files, configs, etc.).

I suggest simplifying this:
1) It seems to me that *no* symlinking or library copying is needed at all.
Instead, to make the Hadoop client work via Ignite, we can use the following
wrapper around the default hadoop client launcher script:
file "hadoop-ignited":

{code}
# Ignite home is needed to allow Ignite libraries to find the logger config
# (it is resolved relative to the Ignite home):
export IGNITE_HOME=/home/ignite/ignite-hadoop-1.0.0-RC3-SNAPSHOT

# Add necessary Ignite libraries to the Hadoop client classpath:
export HADOOP_CLASSPATH=${IGNITE_HOME}/libs/ignite-core-1.0.0-RC3-SNAPSHOT.jar:${IGNITE_HOME}/libs/ignite-hadoop/ignite-hadoop-1.0.0-RC3-SNAPSHOT.jar:${IGNITE_HOME}/libs/ignite-shmem-1.0.0.jar

hadoop --config "${IGNITE_HOME}/ignite-conf" "${@}"
{code}

where {code}${IGNITE_HOME}/ignite-conf{code} is a folder inside the Ignite
distribution where the 2 custom configs are located: core-site.xml and
mapred-site.xml.

(If the user wants only the Ignite MapRed engine but does not need the Ignite
filesystem, they symlink the default mapred-site.xml into this directory.)

This way, in order to use a fully pre-configured Hadoop cluster, the user only
needs to add the "hadoop-ignited" script to PATH.
After that, instead of running hadoop jar examples.jar pi 10 10, the user can
run hadoop-ignited jar examples.jar pi 10 10 to run the sample on fully ignited
Hadoop.
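
As a rough sketch only (not part of the existing scripts), the wrapper could
avoid hard-coded jar versions by collecting the jars with globs. The helper
name build_ignite_classpath is hypothetical, and the sketch assumes the three
jars keep the directory layout shown above. If several versions of a jar are
present, all of them end up on the classpath.

```shell
#!/bin/sh
# Sketch: build the Hadoop client classpath from whatever Ignite jars exist,
# so the wrapper does not need to name exact versions.

# build_ignite_classpath IGNITE_HOME  (hypothetical helper)
# Prints a colon-separated list of the ignite-core, ignite-hadoop and
# ignite-shmem jars found under IGNITE_HOME/libs. The body runs in a
# subshell, so its variables do not leak into the caller.
build_ignite_classpath() (
    home="$1"
    cp=""
    for jar in "$home"/libs/ignite-core-*.jar \
               "$home"/libs/ignite-hadoop/ignite-hadoop-*.jar \
               "$home"/libs/ignite-shmem-*.jar; do
        # A glob that matches nothing stays literal; skip it.
        [ -f "$jar" ] || continue
        cp="${cp:+$cp:}$jar"
    done
    printf '%s\n' "$cp"
)

# Possible use inside "hadoop-ignited" (commented out so that sourcing this
# sketch does not launch anything):
# export IGNITE_HOME=/home/ignite/ignite-hadoop-1.0.0-RC3-SNAPSHOT
# export HADOOP_CLASSPATH="$(build_ignite_classpath "$IGNITE_HOME")"
# exec hadoop --config "$IGNITE_HOME/ignite-conf" "$@"
```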

2) The only thing that still needs attention is how the Ignite node should
find the Hadoop libraries. My suggestion is to provide 3 versions of config
scripts depending on the supported layouts, like "layout-apache",
"layout-cloudera", and "layout-bigtop-hortonworks" (I suppose the latter 2 are
identical). After unzipping the Ignite archive, the user would write 1 line
into the startup script, like LAYOUT=cloudera, and that's it: the corresponding
env script will be picked up to set up the env variables needed by the user's
Hadoop distribution.
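
To illustrate the idea, here is a minimal sketch of how the startup script
might map the LAYOUT value to an env script. The function name
resolve_layout_script, the ".sh" file names, and the directory passed in are
assumptions made for the example, not existing Ignite files.

```shell
#!/bin/sh
# Sketch: choose the env script that matches the user's Hadoop layout.

# resolve_layout_script LAYOUT SCRIPT_DIR  (hypothetical helper)
# Prints the path of the env script for the given layout and returns
# non-zero for an unknown layout name.
resolve_layout_script() {
    case "$1" in
        apache)             echo "$2/layout-apache.sh" ;;
        cloudera)           echo "$2/layout-cloudera.sh" ;;
        bigtop|hortonworks) echo "$2/layout-bigtop-hortonworks.sh" ;;
        *) echo "Unknown LAYOUT: '$1'" >&2; return 1 ;;
    esac
}

# Possible use in the startup script (commented out; the directory is an
# assumption):
# LAYOUT=cloudera
# . "$(resolve_layout_script "$LAYOUT" "$IGNITE_HOME/bin/include")"
```

Failing on an unknown name, rather than falling back to a default layout,
makes a typo in the LAYOUT line visible at startup.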

(An earlier attempt to solve this problem is described in
https://issues.apache.org/jira/browse/IGNITE-372 .)

This approach works with Hive as well.

  was:
Currently in Ignite we have a setup-hadoop script plus a Java class that
replace configs and create symlinks to Ignite libraries.
This setup script and the Ignite launcher script have many dependencies on the
Hadoop distribution layout (files, configs, etc.).

I suggest simplifying this:
1) It seems to me that *no* symlinking or library copying is needed at all.
Instead, to make the Hadoop client work via Ignite, we can use the following
wrapper around the default hadoop client launcher script:
file "hadoop-ignited":

{code}
# Ignite home is needed to allow Ignite libraries to find the logger config
# (it is resolved relative to the Ignite home):
export IGNITE_HOME=/home/ignite/ignite-hadoop-1.0.0-RC3-SNAPSHOT

# Add necessary Ignite libraries to the Hadoop client classpath:
export HADOOP_CLASSPATH=${IGNITE_HOME}/libs/ignite-core-1.0.0-RC3-SNAPSHOT.jar:${IGNITE_HOME}/libs/ignite-hadoop/ignite-hadoop-1.0.0-RC3-SNAPSHOT.jar

hadoop --config "${IGNITE_HOME}/ignite-conf" "${@}"
{code}

where {code}${IGNITE_HOME}/ignite-conf{code} is a folder inside the Ignite
distribution where the 2 custom configs are located: core-site.xml and
mapred-site.xml.

(If the user wants only the Ignite MapRed engine but does not need the Ignite
filesystem, they symlink the default mapred-site.xml into this directory.)

This way, in order to use a fully pre-configured Hadoop cluster, the user only
needs to add the "hadoop-ignited" script to PATH.
After that, instead of running hadoop jar examples.jar pi 10 10, the user can
run hadoop-ignited jar examples.jar pi 10 10 to run the sample on fully ignited
Hadoop.

2) The only thing that still needs attention is how the Ignite node should
find the Hadoop libraries. My suggestion is to provide 3 versions of config
scripts depending on the supported layouts, like "layout-apache",
"layout-cloudera", and "layout-bigtop-hortonworks" (I suppose the latter 2 are
identical). After unzipping the Ignite archive, the user would write 1 line
into the startup script, like LAYOUT=cloudera, and that's it: the corresponding
env script will be picked up to set up the env variables needed by the user's
Hadoop distribution.

(An earlier attempt to solve this problem is described in
https://issues.apache.org/jira/browse/IGNITE-372 .)

TODO: will this approach work with Hive, etc.?


> Simplify Hadoop "ignition" 
> ---------------------------
>
>                 Key: IGNITE-483
>                 URL: https://issues.apache.org/jira/browse/IGNITE-483
>             Project: Ignite
>          Issue Type: Task
>    Affects Versions: sprint-1
>            Reporter: Ivan Veselovsky
>            Assignee: Ivan Veselovsky
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)