mapred-site.xml:
<!--Autogenerated by Cloudera CM on 2013-12-04T22:38:07.943Z-->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>som-dmsandbox01.humedica.net:8021</value>
  </property>
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>0.0.0.0:50030</value>
  </property>
  <property>
    <name>mapreduce.job.counters.max</name>
    <value>120</value>
  </property>
  <property>
    <name>mapred.output.compress</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.output.compression.type</name>
    <value>BLOCK</value>
  </property>
  <property>
    <name>mapred.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>zlib.compress.level</name>
    <value>DEFAULT_COMPRESSION</value>
  </property>
  <property>
    <name>io.sort.factor</name>
    <value>64</value>
  </property>
  <property>
    <name>io.sort.record.percent</name>
    <value>0.05</value>
  </property>
  <property>
    <name>io.sort.spill.percent</name>
    <value>0.8</value>
  </property>
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>10</value>
  </property>
  <property>
    <name>mapred.submit.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.userlog.retain.hours</name>
    <value>24</value>
  </property>
  <property>
    <name>io.sort.mb</name>
    <value>71</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx298061516</value>
  </property>
  <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.8</value>
  </property>
</configuration>
From: Abraham Elmahrek <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>"
<[email protected]<mailto:[email protected]>>
Date: Thursday, January 30, 2014 2:13 PM
To: "[email protected]<mailto:[email protected]>"
<[email protected]<mailto:[email protected]>>
Subject: Re: Sqoop to HDFS error Cannot initialize cluster
Hmmm, could you provide your mapred-site.xml? It seems like you need to set
mapreduce.framework.name to "classic" if you're using MR1.
-Abe
On Thu, Jan 30, 2014 at 11:02 AM, Brenden Cobb <[email protected]> wrote:
Hi Abe-
Sqoop 1.4.3 was installed as part of CDH 4.5
Using the server domain instead of localhost did push things along a bit, but
the job is now complaining about the LocalJobRunner being handed the "master"
node's address:
14/01/30 13:49:18 INFO mapreduce.Cluster: Failed to use
org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid
"mapreduce.jobtracker.address" configuration value for LocalJobRunner :
"som-dmsandbox01.humedica.net:8021<http://som-dmsandbox01.humedica.net:8021>"
14/01/30 13:49:18 ERROR security.UserGroupInformation:
PriviledgedActionException as:oracle (auth:SIMPLE) cause:java.io.IOException:
Cannot initialize Cluster. Please check your configuration for
mapreduce.framework.name and the correspond
server addresses.
14/01/30 13:49:18 ERROR tool.ImportTool: Encountered IOException running import
job: java.io.IOException: Cannot initialize Cluster. Please check your
configuration for mapreduce.framework.name and
the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:122)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:84)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:77)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)
The instance above mentions som-dmsandbox01.humedica.net (the master) while the
machine I'm executing on is som-dmsandbox03.humedica.net.
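In case it helps, here's how I'm checking which settings the client on this
node actually reads (assuming the CDH client configs live under
/etc/hadoop/conf):
$ grep -A1 -e 'mapreduce.framework.name' -e 'mapred.job.tracker' /etc/hadoop/conf/mapred-site.xml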
-BC
From: Abraham Elmahrek <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>"
<[email protected]<mailto:[email protected]>>
Date: Thursday, January 30, 2014 1:49 PM
To: "[email protected]<mailto:[email protected]>"
<[email protected]<mailto:[email protected]>>
Subject: Re: Sqoop to HDFS error Cannot initialize cluster
Hey there,
Sqoop1 is actually just a really heavy client. It creates jobs in MapReduce to
do the data transfer.
With that being said, I'm curious: how was Sqoop installed? What version of
Sqoop1 are you running? It might be as simple as setting the HADOOP_HOME
environment variable or updating one of the configs.
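For example (a sketch, assuming the standard CDH4 package layout under
/usr/lib; adjust the paths to your install):
export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce  # MR1 client libs on CDH4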
-Abe
On Thu, Jan 30, 2014 at 10:36 AM, Brenden Cobb <[email protected]> wrote:
I think I have part of the answer: I'm specifying localhost when I should be
using the actual domain; otherwise Sqoop thinks it's not in distributed mode?
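For example (a sketch; the host below is hypothetical, substitute whichever
machine actually runs the Oracle listener):
$ sqoop import --connect jdbc:oracle:thin:@som-dmsandbox01.humedica.net:1521/DB11G --username sqoop --password xx --table sqoop.test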
-BC
From: Brenden Cobb <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>"
<[email protected]<mailto:[email protected]>>
Date: Thursday, January 30, 2014 12:34 PM
To: "[email protected]<mailto:[email protected]>"
<[email protected]<mailto:[email protected]>>
Subject: Sqoop to HDFS error Cannot initialize cluster
Hello-
I'm trying to sqoop data from Oracle to HDFS but I'm getting the following error:
$ sqoop import --connect jdbc:oracle:thin:@localhost:1521/DB11G --username
sqoop --password xx --table sqoop.test
…
14/01/30 10:58:10 INFO orm.CompilationManager: Writing jar file:
/tmp/sqoop-oracle/compile/fa0ce9acd6ac6d0c349389a6dbfee62b/sqoop.test.jar
14/01/30 10:58:10 INFO mapreduce.ImportJobBase: Beginning import of sqoop.test
14/01/30 10:58:10 WARN conf.Configuration: mapred.job.tracker is deprecated.
Instead, use mapreduce.jobtracker.address
14/01/30 10:58:10 WARN conf.Configuration: mapred.jar is deprecated. Instead,
use mapreduce.job.jar
14/01/30 10:58:10 INFO manager.SqlManager: Executing SQL statement: SELECT
FIRST,LAST,EMAIL FROM sqoop.test WHERE 1=0
14/01/30 10:58:11 WARN conf.Configuration: mapred.map.tasks is deprecated.
Instead, use mapreduce.job.maps
14/01/30 10:58:11 ERROR security.UserGroupInformation:
PriviledgedActionException as:oracle (auth:SIMPLE) cause:java.io.IOException:
Cannot initialize Cluster. Please check your configuration for
mapreduce.framework.name and the correspond
server addresses.
14/01/30 10:58:11 ERROR tool.ImportTool: Encountered IOException running import
job: java.io.IOException: Cannot initialize Cluster. Please check your
configuration for mapreduce.framework.name and
the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:122)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:84)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:77)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1235)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1234)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1263)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:247)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:606)
at com.quest.oraoop.OraOopConnManager.importTable(OraOopConnManager.java:260)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
Checking just the database side works OK:
$ sqoop list-tables --connect jdbc:oracle:thin:@localhost:1521:DB11G --username
sqoop --password xx
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/01/30 12:12:20 INFO sqoop.Sqoop: Running Sqoop version: 1.4.3-cdh4.5.0
14/01/30 12:12:20 WARN tool.BaseSqoopTool: Setting your password on the
command-line is insecure. Consider using -P instead.
14/01/30 12:12:20 INFO manager.SqlManager: Using default fetchSize of 1000
14/01/30 12:12:21 INFO manager.OracleManager: Time zone has been set to GMT
TEST
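Side note: per the warning above, -P makes sqoop prompt for the password
instead of taking it on the command line, e.g.:
$ sqoop list-tables --connect jdbc:oracle:thin:@localhost:1521:DB11G --username sqoop -P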
Any thoughts?
Thanks,
BC