Thanks very much for the detailed instructions. However, I am still receiving
the error:

14/01/30 18:07:58 ERROR security.UserGroupInformation: 
PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: 
Cannot initialize Cluster. Please check your configuration for 
mapreduce.framework.name and the correspond server addresses.
14/01/30 18:07:58 ERROR tool.ImportTool: Encountered IOException running import 
job: java.io.IOException: Cannot initialize Cluster. Please check your 
configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:122)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:84)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:77)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1235)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1234)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1263)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:247)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:606)
at com.quest.oraoop.OraOopConnManager.importTable(OraOopConnManager.java:260)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)


Any other thoughts?

Thanks

From: Abraham Elmahrek <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, January 30, 2014 3:01 PM
To: "[email protected]" <[email protected]>
Subject: Re: Sqoop to HDFS error Cannot initialize cluster

It seems like mapreduce.framework.name is missing from this configuration. You
should be able to use a safety valve to manually add it in Cloudera Manager.
The correct value here is "classic", I believe, since you don't have YARN
deployed.

To add a safety valve configuration for MapReduce, go to Services -> Mapreduce
-> Configuration -> View and Edit -> Service Wide -> Advanced -> Safety valve
configuration for mapred-site.xml. You should be able to add the entry:

<property>
  <name>mapreduce.framework.name</name>
  <value>classic</value>
</property>

Then save and restart MR. Let us know how it goes.
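
After the restart, it may also be worth confirming the property made it into
the deployed client configuration. A quick sanity check, assuming Cloudera
Manager deploys client configs under /etc/hadoop/conf (that path is an
assumption; adjust for your setup):

# Show the property and its value in the deployed client config
grep -A1 "mapreduce.framework.name" /etc/hadoop/conf/mapred-site.xml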

-Abe


On Thu, Jan 30, 2014 at 11:18 AM, Brenden Cobb 
<[email protected]<mailto:[email protected]>> wrote:
Mapred-site.xml:

<!--Autogenerated by Cloudera CM on 2013-12-04T22:38:07.943Z-->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>som-dmsandbox01.humedica.net:8021</value>
  </property>
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>0.0.0.0:50030</value>
  </property>
  <property>
    <name>mapreduce.job.counters.max</name>
    <value>120</value>
  </property>
  <property>
    <name>mapred.output.compress</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.output.compression.type</name>
    <value>BLOCK</value>
  </property>
  <property>
    <name>mapred.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>
  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <property>
    <name>zlib.compress.level</name>
    <value>DEFAULT_COMPRESSION</value>
  </property>
  <property>
    <name>io.sort.factor</name>
    <value>64</value>
  </property>
  <property>
    <name>io.sort.record.percent</name>
    <value>0.05</value>
  </property>
  <property>
    <name>io.sort.spill.percent</name>
    <value>0.8</value>
  </property>
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>10</value>
  </property>
  <property>
    <name>mapred.submit.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.userlog.retain.hours</name>
    <value>24</value>
  </property>
  <property>
    <name>io.sort.mb</name>
    <value>71</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value> -Xmx298061516</value>
  </property>
  <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.reduce.slowstart.completed.maps</name>
    <value>0.8</value>
  </property>
</configuration>

From: Abraham Elmahrek <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, January 30, 2014 2:13 PM
To: "[email protected]" <[email protected]>
Subject: Re: Sqoop to HDFS error Cannot initialize cluster

Hmm, could you provide your mapred-site.xml? It seems like you need to update
mapreduce.framework.name to "classic" if you're using MR1.

-Abe


On Thu, Jan 30, 2014 at 11:02 AM, Brenden Cobb <[email protected]> wrote:
Hi Abe-

Sqoop 1.4.3 was installed as part of CDH 4.5

Using the server domain instead of localhost did push things along a bit, but
the job is complaining that the LocalJobRunner is on the "master" node in the
cluster:

14/01/30 13:49:18 INFO mapreduce.Cluster: Failed to use
org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid
"mapreduce.jobtracker.address" configuration value for LocalJobRunner :
"som-dmsandbox01.humedica.net:8021"
14/01/30 13:49:18 ERROR security.UserGroupInformation:
PriviledgedActionException as:oracle (auth:SIMPLE) cause:java.io.IOException:
Cannot initialize Cluster. Please check your configuration for
mapreduce.framework.name and the correspond server addresses.
14/01/30 13:49:18 ERROR tool.ImportTool: Encountered IOException running import
job: java.io.IOException: Cannot initialize Cluster. Please check your
configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:122)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:84)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:77)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)


The instance above mentions som-dmsandbox01.humedica.net (the master), while
the machine I'm executing on is som-dmsandbox03.humedica.net.

-BC

From: Abraham Elmahrek <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, January 30, 2014 1:49 PM
To: "[email protected]" <[email protected]>
Subject: Re: Sqoop to HDFS error Cannot initialize cluster

Hey there,

Sqoop1 is actually just a really heavy client. It will create jobs in MapReduce
for data transferring.

With that being said, I'm curious: how was Sqoop installed? What version of
Sqoop1 are you running? It might be as simple as setting the HADOOP_HOME
environment variable or updating one of the configs.
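
For example, if the MR2/YARN client jars end up on Sqoop's classpath while the
cluster is actually running MR1, you can hit exactly this "Cannot initialize
Cluster" error. A rough sketch, assuming the standard CDH4 package layout
(adjust the path to your install):

# Point Sqoop at the MR1 client libraries before running the import
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
sqoop import --connect jdbc:oracle:thin:@localhost:1521/DB11G \
    --username sqoop --password xx --table sqoop.test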

-Abe


On Thu, Jan 30, 2014 at 10:36 AM, Brenden Cobb <[email protected]> wrote:
I think I have part of the answer: I'm specifying localhost when I think I
should be using the actual domain, otherwise Sqoop thinks it's not in
distributed mode?
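
In other words, something along these lines, assuming the database actually
lives on this node rather than wherever "localhost" resolves:

sqoop import --connect jdbc:oracle:thin:@som-dmsandbox03.humedica.net:1521/DB11G \
    --username sqoop --password xx --table sqoop.test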

-BC

From: Brenden Cobb <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, January 30, 2014 12:34 PM
To: "[email protected]" <[email protected]>
Subject: Sqoop to HDFS error Cannot initialize cluster

Hello-

I'm trying to sqoop data from Oracle to HDFS but I'm getting the following error:

$ sqoop import --connect jdbc:oracle:thin:@localhost:1521/DB11G --username 
sqoop --password xx --table sqoop.test

…
14/01/30 10:58:10 INFO orm.CompilationManager: Writing jar file: 
/tmp/sqoop-oracle/compile/fa0ce9acd6ac6d0c349389a6dbfee62b/sqoop.test.jar
14/01/30 10:58:10 INFO mapreduce.ImportJobBase: Beginning import of sqoop.test
14/01/30 10:58:10 WARN conf.Configuration: mapred.job.tracker is deprecated. 
Instead, use mapreduce.jobtracker.address
14/01/30 10:58:10 WARN conf.Configuration: mapred.jar is deprecated. Instead, 
use mapreduce.job.jar
14/01/30 10:58:10 INFO manager.SqlManager: Executing SQL statement: SELECT 
FIRST,LAST,EMAIL FROM sqoop.test WHERE 1=0
14/01/30 10:58:11 WARN conf.Configuration: mapred.map.tasks is deprecated. 
Instead, use mapreduce.job.maps
14/01/30 10:58:11 ERROR security.UserGroupInformation:
PriviledgedActionException as:oracle (auth:SIMPLE) cause:java.io.IOException:
Cannot initialize Cluster. Please check your configuration for
mapreduce.framework.name and the correspond server addresses.
14/01/30 10:58:11 ERROR tool.ImportTool: Encountered IOException running import
job: java.io.IOException: Cannot initialize Cluster. Please check your
configuration for mapreduce.framework.name and the correspond server addresses.

at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:122)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:84)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:77)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1235)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1234)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1263)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:247)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:606)
at com.quest.oraoop.OraOopConnManager.importTable(OraOopConnManager.java:260)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
at org.apache.sqoop.Sqoop.main(Sqoop.java:240)


Checking just the database side works OK:
$ sqoop list-tables --connect jdbc:oracle:thin:@localhost:1521:DB11G --username 
sqoop --password xx
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
14/01/30 12:12:20 INFO sqoop.Sqoop: Running Sqoop version: 1.4.3-cdh4.5.0
14/01/30 12:12:20 WARN tool.BaseSqoopTool: Setting your password on the 
command-line is insecure. Consider using -P instead.
14/01/30 12:12:20 INFO manager.SqlManager: Using default fetchSize of 1000
14/01/30 12:12:21 INFO manager.OracleManager: Time zone has been set to GMT
TEST
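
As an aside, the password warning above can be avoided by letting Sqoop prompt
for the password instead of passing it on the command line:

sqoop list-tables --connect jdbc:oracle:thin:@localhost:1521:DB11G --username sqoop -P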


Any thoughts?

Thanks,
BC


