Bharath created SQOOP-3234:
------------------------------
Summary: Using Sqoop API with hive import option fails while loading imported data into hive
Key: SQOOP-3234
URL: https://issues.apache.org/jira/browse/SQOOP-3234
Project: Sqoop
Issue Type: Bug
Affects Versions: 1.4.6
Reporter: Bharath
I am trying to execute Sqoop from a Java program, leveraging the *SqoopOptions*
and *ImportTool* classes, to import a table from a PostgreSQL database into a
Hive table. Running the same Sqoop command from the command line works
perfectly and imports the table into Hive. The problem I am facing with the
Sqoop APIs is that after the map tasks finish and load the data into the HDFS
directory, the step that loads the data into the Hive managed table fails:
Hive complains that the directory "file:/user/hive/warehouse/lineitem" doesn't
exist. From the error message it is clear that Hive is looking for the
directory "/user/hive/warehouse/lineitem" on my local filesystem instead of
HDFS, even though I provided all the necessary conf files before invoking
Sqoop.
Here is a miniature version of the Sqoop program I am using:
{code:java}
import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.tool.ImportTool;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class sqoopexperiments {

    // Load the Hadoop and Hive client configs explicitly, since this runs
    // outside the hadoop/hive launcher scripts.
    protected static Configuration getConfiguration() {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/core-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/yarn-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/hdfs-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hadoop/2.7.2/libexec/etc/hadoop/mapred-site.xml"));
        conf.addResource(new Path("/usr/local/Cellar/hive/1.2.1/libexec/conf/hive-site.xml"));
        return conf;
    }

    private static SqoopOptions SqoopOptions = new SqoopOptions(getConfiguration());
    private static final String connectionString = "jdbc:postgresql://127.0.0.1:5432/sales";
    private static final String username = "unifi";
    private static final String password = "unifi";

    private static int executeSqoop() {
        int retCode = new ImportTool().run(SqoopOptions);
        if (retCode != 0) {
            throw new RuntimeException("Sqoop execution failure. Return code : " + retCode);
        }
        return retCode;
    }

    public static void main(String[] args) {
        SqoopOptions.setConnectString(connectionString);
        SqoopOptions.setUsername(username);
        SqoopOptions.setPassword(password);
        SqoopOptions.setTableName("lineitem");
        SqoopOptions.setTargetDir("/user/unifi/tmp/testsqoop/1");
        SqoopOptions.setHiveImport(true);
        executeSqoop();
    }
}
{code}
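As a sanity check (this is only a minimal sketch, not part of the failing
program above), printing fs.defaultFS and the URI resolved by FileSystem.get
confirms whether the assembled Configuration actually points at HDFS before it
is handed to Sqoop; both are standard Hadoop APIs:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Hypothetical check, assuming it lives in the same package as
// sqoopexperiments so the protected getConfiguration() is accessible.
public class ConfCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = sqoopexperiments.getConfiguration();
        // Should print hdfs://localhost:9000 (or similar), not file:///
        System.out.println("fs.defaultFS   = " + conf.get("fs.defaultFS"));
        System.out.println("FileSystem URI = " + FileSystem.get(conf).getUri());
    }
}
{code}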
Here is the output of my program execution:
{code}
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/bin/java
-agentlib:jdwp=transport=dt_socket,address=127.0.0.1:64590,suspend=y,server=n
-Dfile.encoding=UTF-8 -classpath
"/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/charsets.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/deploy.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/cldrdata.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/dnsns.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/jfxrt.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/localedata.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/nashorn.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunec.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/ext/zipfs.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/javaws.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jce.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jfr.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jfxswt.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/jsse.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/management-agent.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/plugin.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/resources.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/rt.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/ant-javafx.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/dt.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/javafx-mx.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/jconsole.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/packager.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/sa-jdi.jar:
/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/lib/tools.jar:
/Users/bharath/dev/sqoopexperiments/out/production/sqoopexperiments:
/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/hadoop-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/activation-1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/asm-3.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/avro-1.7.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-collections-3.2.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-io-2.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/commons-net-3.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-client-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-framework-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/gson-2.2.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/guava-11.0.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hadoop-annotations-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hadoop-auth-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/hamcrest-core-1.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-core
-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jettison-1.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/jsr305-3.0.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/junit-4.11.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/paranamer-2.3.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/xz-1.0.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/common/lib/zookeeper-3.4.6.jar:/usr/local/Cellar/sqoop/1.4.6/libexec/sqoop-1.4.6.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/hdfs/hadoop-hdfs-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-api-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-client-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-registry-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2
/libexec/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-common-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.2.jar:/usr/local/Cellar/hadoop/2.7.2/libexec/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.2.jar:/usr/local/Cellar/sqoop/1.4.6/libexec/lib/postgresql-9.3-1102.jdbc4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-core-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-fate-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-start-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/accumulo-trace-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/activation-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ant-1.9.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ant-launcher-1.9.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/antlr-2.7.7.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/antlr-runtime-3.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/apache-log4j-extras-1.2.17.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/asm-commons-3.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/asm-tree-3.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/avro-1.7.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/bonecp-0.8.0.RELEASE.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-avatica-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-core-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/calcite-linq4j-1.2.0-incubating.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-beanutils-1.7.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-beanutils-core-1.8.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-cli-1.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-codec-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-collections-3.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-compiler-2.7.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-compress-1.4.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-configuration-1.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-dbcp-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-digester-1.8.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-httpclient-3.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-io-2.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-lang-2.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-logging-1.1.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-math-2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-pool-1.5.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/commons-vfs2-2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-client-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-framework-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/curator-recipes-2.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-api-jdo-3.2.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-core-3.2.10.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/datanucleus-rdbms-3.2.9.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/derby-10.10.2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/eigenbase-properties-1.1.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geroni
mo-annotation_1.0_spec-1.1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geronimo-jaspic_1.0_spec-1.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/geronimo-jta_1.1_spec-1.1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/groovy-all-2.1.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/guava-14.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hamcrest-core-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-accumulo-handler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-ant-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-beeline-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-cli-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-common-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-contrib-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-exec-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-hbase-handler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-hwi-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-jdbc-1.2.1-standalone.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-jdbc-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-metastore-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-serde-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-service-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-0.20S-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-0.23-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-common-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-shims-scheduler-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-testutils-1.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/httpclient-4.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/httpcore-4.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ivy-2.4.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/janino-2.7.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jcommander-1.32.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jdo-api-3.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jetty-all-7.6.0.v20120127.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jetty-all-server-7.6.0.v20120127.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jline-2.12.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/joda-time-2.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jpam-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/json-20090211.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jsr305-3.0.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/jta-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/junit-4.11.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/libfb303-0.9.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/libthrift-0.9.2.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/log4j-1.2.16.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/mail-1.4.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-api-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-provider-svn-commons-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/maven-scm-provider-svnexe-1.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/netty-3.7.0.Final.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/opencsv-2.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/oro-2.0.8.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/paranamer-2.3.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/parquet-hadoop-bundle-1.6.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/plexus-utils-1.5.6.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/postgresql-9.3-1102.jdbc4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/regexp-1.3.jar:/usr/local/Cellar
/hive/1.2.1/libexec/lib/servlet-api-2.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/snappy-java-1.0.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/ST4-4.0.4.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/stax-api-1.0.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/stringtemplate-3.2.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/super-csv-2.2.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/tempus-fugit-1.1.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/velocity-1.5.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/xz-1.0.jar:/usr/local/Cellar/hive/1.2.1/libexec/lib/zookeeper-3.4.6.jar:/Applications/IntelliJ
IDEA.app/Contents/lib/idea_rt.jar" sqoopexperiments
Connected to the target VM, address: '127.0.0.1:64590', transport: 'socket'
2017-09-03 12:25:28,072 WARN [main] sqoop.ConnFactory
(ConnFactory.java:loadManagersFromConfDir(273)) - $SQOOP_CONF_DIR has not been
set in the environment. Cannot check for additional configuration.
2017-09-03 12:25:28,199 INFO [main] manager.SqlManager
(SqlManager.java:initOptionDefaults(98)) - Using default fetchSize of 1000
2017-09-03 12:25:30,085 INFO [main] tool.CodeGenTool
(CodeGenTool.java:generateORM(92)) - Beginning code generation
2017-09-03 12:25:30,198 INFO [main] manager.SqlManager
(SqlManager.java:execute(757)) - Executing SQL statement: SELECT t.* FROM
"lineitem" AS t LIMIT 1
2017-09-03 12:25:30,232 INFO [main] orm.CompilationManager
(CompilationManager.java:findHadoopJars(85)) - $HADOOP_MAPRED_HOME is not set
Note: /tmp/sqoop-bharath/compile/0df909eb6973527d155c8c591a072c5e/lineitem.java
uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-09-03 12:25:31,464 INFO [main] orm.CompilationManager
(CompilationManager.java:jar(330)) - Writing jar file:
/tmp/sqoop-bharath/compile/0df909eb6973527d155c8c591a072c5e/lineitem.jar
2017-09-03 12:25:31,471 WARN [main] manager.PostgresqlManager
(PostgresqlManager.java:importTable(119)) - It looks like you are importing
from postgresql.
2017-09-03 12:25:31,471 WARN [main] manager.PostgresqlManager
(PostgresqlManager.java:importTable(120)) - This transfer can be faster! Use
the --direct
2017-09-03 12:25:31,471 WARN [main] manager.PostgresqlManager
(PostgresqlManager.java:importTable(121)) - option to exercise a
postgresql-specific fast path.
2017-09-03 12:25:31,475 WARN [main] manager.CatalogQueryManager
(CatalogQueryManager.java:getPrimaryKey(239)) - The table lineitem contains a
multi-column primary key. Sqoop will default to the column l_orderkey only for
this job.
2017-09-03 12:25:31,476 WARN [main] manager.CatalogQueryManager
(CatalogQueryManager.java:getPrimaryKey(239)) - The table lineitem contains a
multi-column primary key. Sqoop will default to the column l_orderkey only for
this job.
2017-09-03 12:25:31,476 INFO [main] mapreduce.ImportJobBase
(ImportJobBase.java:runImport(235)) - Beginning import of lineitem
2017-09-03 12:25:31,651 WARN [main] util.NativeCodeLoader
(NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for
your platform... using builtin-java classes where applicable
2017-09-03 12:25:31,680 INFO [main] Configuration.deprecation
(Configuration.java:warnOnceIfDeprecated(1173)) - mapred.jar is deprecated.
Instead, use mapreduce.job.jar
2017-09-03 12:25:32,243 INFO [main] Configuration.deprecation
(Configuration.java:warnOnceIfDeprecated(1173)) - mapred.map.tasks is
deprecated. Instead, use mapreduce.job.maps
2017-09-03 12:25:32,244 WARN [main] mapreduce.JobBase
(JobBase.java:cacheJars(179)) - SQOOP_HOME is unset. May not be able to find
all job dependencies.
2017-09-03 12:25:32,308 INFO [main] client.RMProxy
(RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at
/0.0.0.0:8032
2017-09-03 12:25:32,676 WARN [main] mapreduce.JobResourceUploader
(JobResourceUploader.java:uploadFiles(64)) - Hadoop command-line option parsing
not performed. Implement the Tool interface and execute your application with
ToolRunner to remedy this.
2017-09-03 12:25:32,916 INFO [main] db.DBInputFormat
(DBInputFormat.java:setTxIsolation(192)) - Using read commited transaction
isolation
2017-09-03 12:25:32,917 INFO [main] db.DataDrivenDBInputFormat
(DataDrivenDBInputFormat.java:getSplits(147)) - BoundingValsQuery: SELECT
MIN("l_orderkey"), MAX("l_orderkey") FROM "lineitem"
2017-09-03 12:25:32,947 INFO [main] mapreduce.JobSubmitter
(JobSubmitter.java:submitJobInternal(198)) - number of splits:4
2017-09-03 12:25:33,022 INFO [main] mapreduce.JobSubmitter
(JobSubmitter.java:printTokens(287)) - Submitting tokens for job:
job_1504463611328_0004
2017-09-03 12:25:33,291 INFO [main] impl.YarnClientImpl
(YarnClientImpl.java:submitApplication(273)) - Submitted application
application_1504463611328_0004
2017-09-03 12:25:33,334 INFO [main] mapreduce.Job (Job.java:submit(1294)) -
The url to track the job:
http://Bharaths-MacBook-Pro.local:8088/proxy/application_1504463611328_0004/
2017-09-03 12:25:33,335 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1339)) - Running job: job_1504463611328_0004
2017-09-03 12:25:39,430 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1360)) - Job job_1504463611328_0004 running in
uber mode : false
2017-09-03 12:25:39,431 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1367)) - map 0% reduce 0%
2017-09-03 12:25:43,479 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1367)) - map 25% reduce 0%
2017-09-03 12:25:45,495 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1367)) - map 50% reduce 0%
2017-09-03 12:25:46,504 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1367)) - map 75% reduce 0%
2017-09-03 12:25:47,513 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1367)) - map 100% reduce 0%
2017-09-03 12:25:47,520 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1378)) - Job job_1504463611328_0004 completed
successfully
2017-09-03 12:25:47,597 INFO [main] mapreduce.Job
(Job.java:monitorAndPrintJob(1385)) - Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=486880
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=485
HDFS: Number of bytes written=8508828
HDFS: Number of read operations=16
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=12656
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=12656
Total vcore-milliseconds taken by all map tasks=12656
Total megabyte-milliseconds taken by all map tasks=12959744
Map-Reduce Framework
Map input records=60175
Map output records=60175
Input split bytes=485
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=204
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=553648128
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=8508828
2017-09-03 12:25:47,602 INFO [main] mapreduce.ImportJobBase
(ImportJobBase.java:runJob(184)) - Transferred 8.1147 MB in 15.3528 seconds
(541.2293 KB/sec)
2017-09-03 12:25:47,604 INFO [main] mapreduce.ImportJobBase
(ImportJobBase.java:runJob(186)) - Retrieved 60175 records.
2017-09-03 12:25:47,611 INFO [main] manager.SqlManager
(SqlManager.java:execute(757)) - Executing SQL statement: SELECT t.* FROM
"lineitem" AS t LIMIT 1
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter
(TableDefWriter.java:getCreateTableStmt(188)) - Column l_quantity had to be
cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter
(TableDefWriter.java:getCreateTableStmt(188)) - Column l_extendedprice had to
be cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter
(TableDefWriter.java:getCreateTableStmt(188)) - Column l_discount had to be
cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter
(TableDefWriter.java:getCreateTableStmt(188)) - Column l_tax had to be cast to
a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter
(TableDefWriter.java:getCreateTableStmt(188)) - Column l_shipdate had to be
cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter
(TableDefWriter.java:getCreateTableStmt(188)) - Column l_commitdate had to be
cast to a less precise type in Hive
2017-09-03 12:25:47,615 WARN [main] hive.TableDefWriter
(TableDefWriter.java:getCreateTableStmt(188)) - Column l_receiptdate had to be
cast to a less precise type in Hive
2017-09-03 12:25:47,637 INFO [main] hive.HiveImport
(HiveImport.java:importTable(194)) - Loading uploaded data into Hive
Logging initialized using configuration in
jar:file:/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-common-1.2.1.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask.
MetaException(message:file:/user/hive/warehouse/lineitem is not a directory or
unable to create one)
Disconnected from the target VM, address: '127.0.0.1:64590', transport: 'socket'
Exception in thread "main" java.lang.RuntimeException: Sqoop execution failure.
Return code : 1
at sqoopexperiments.executeSqoop(sqoopexperiments.java:30)
at sqoopexperiments.main(sqoopexperiments.java:44)
Process finished with exit code 1
{code}
Notice that everything works perfectly fine until the "Loading uploaded data
into Hive" stage. Stepping through the call stack with the debugger, I
discovered the reason for the failure. The problem is in the executeScript
method of org.apache.sqoop.hive.HiveImport. When control reaches that point,
the temporary Hive script contains exactly the expected contents:
{code}
CREATE TABLE IF NOT EXISTS `lineitem` ( `l_orderkey` INT, `l_partkey` INT,
`l_suppkey` INT, `l_linenumber` INT, `l_quantity` DOUBLE, `l_extendedprice`
DOUBLE, `l_discount` DOUBLE, `l_tax` DOUBLE, `l_returnflag` STRING,
`l_linestatus` STRING, `l_shipdate` STRING, `l_commitdate` STRING,
`l_receiptdate` STRING, `l_shipinstruct` STRING, `l_shipmode` STRING,
`l_comment` STRING) COMMENT 'Imported by sqoop on 2017/09/03 12:25:47' ROW
FORMAT DELIMITED FIELDS TERMINATED BY '\054' LINES TERMINATED BY '\012' STORED
AS TEXTFILE;
LOAD DATA INPATH 'hdfs://localhost:9000/user/unifi/tmp/testsqoop/1' INTO TABLE
`lineitem`;
{code}
This block of code in the executeScript method of HiveImport decides how to
execute that temporary Hive script:
{code:java}
try {
    // If the Hive CLI driver is on the classpath, run the script in-process
    // by reflectively invoking CliDriver.main("-f", filename).
    Class ite = Class.forName("org.apache.hadoop.hive.cli.CliDriver");
    LOG.debug("Using in-process Hive instance.");
    subprocessSM = new SubprocessSecurityManager();
    subprocessSM.install();
    String[] cause1 = new String[]{"-f", filename};
    Method ese1 = ite.getMethod("main", new Class[]{cause1.getClass()});
    ese1.invoke((Object) null, new Object[]{cause1});
} catch (ClassNotFoundException var14) {
    // Otherwise fall back to the external `hive` binary.
    LOG.debug("Using external Hive process.");
    this.executeExternalHiveScript(filename, env);
}
{code}
If the Hive CLI driver jars are on my classpath, the program invokes the line
"ese1.invoke((Object)null, new Object[]{cause1});", and from that point on it
loses all of the Hadoop configuration context (hdfs-site, yarn-site,
mapred-site, hive-site) it carried up to this point and falls back to a
default configuration, because of this code in
/usr/local/Cellar/hive/1.2.1/libexec/lib/hive-cli-1.2.1.jar!/org/apache/hadoop/hive/cli/CliDriver.class:
{code:java}
public CliDriver() {
    SessionState ss = SessionState.get();
    // If no SessionState has been started on this thread, Hive falls back
    // to a brand-new default Configuration, dropping the caller's conf.
    this.conf = (Configuration) (ss != null ? ss.getConf() : new Configuration());
    Log LOG = LogFactory.getLog("CliDriver");
    if (LOG.isDebugEnabled()) {
        LOG.debug("CliDriver inited with classpath " + System.getProperty("java.class.path"));
    }
    this.console = new LogHelper(LOG);
}
{code}
I don't fully understand what this "SessionState" is or why it is null here.
Because it is null, a new Configuration() is created, all of my Hadoop
configuration is lost, and Hive therefore looks for the
"/user/hive/warehouse/lineitem" directory on my local file system instead of
HDFS.
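If the root cause really is that SessionState.get() returns null inside
CliDriver, one possible workaround (untested here, just a sketch against the
Hive 1.2.1 API) would be to start a Hive SessionState bound to the same
configuration on the calling thread before invoking the ImportTool, so that
the in-process CliDriver picks up our conf instead of creating a default one:
{code:java}
// Hedged workaround sketch: start a thread-local Hive SessionState carrying
// our Configuration before running Sqoop, so CliDriver's SessionState.get()
// returns it instead of null. HiveConf(Configuration, Class) and
// SessionState.start(HiveConf) exist in Hive 1.2.1, but whether this
// actually fixes the in-process import here is an assumption.
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.session.SessionState;

public class SessionStateWorkaround {
    public static void main(String[] args) {
        HiveConf hiveConf = new HiveConf(sqoopexperiments.getConfiguration(),
                SessionStateWorkaround.class);
        SessionState.start(hiveConf);  // binds the conf to this thread
        // ...then configure SqoopOptions and run the ImportTool exactly as
        // in the program at the top of this report.
    }
}
{code}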
If I remove "hive-cli-1.2.1.jar" from my classpath, HiveImport takes the route
of executing the Hive script using the hive binary on my system, and in that
mode the Hive table gets created properly:
{code:java}
private void executeExternalHiveScript(String filename, List<String> env)
        throws IOException {
    // Run the generated script with the external binary: hive -f <filename>
    String hiveExec = this.getHiveBinPath();
    ArrayList args = new ArrayList();
    args.add(hiveExec);
    args.add("-f");
    args.add(filename);
    LoggingAsyncSink logSink = new LoggingAsyncSink(LOG);
    int ret = Executor.exec((String[]) args.toArray(new String[0]),
            (String[]) env.toArray(new String[0]), logSink, logSink);
    if (0 != ret) {
        throw new IOException("Hive exited with status " + ret);
    }
}
{code}
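When this external-process path is taken, getHiveBinPath still has to locate
the hive binary. A sketch of making that location explicit, under the
assumption (which I have not verified) that SqoopOptions.setHiveHome in 1.4.x
is what getHiveBinPath consults:
{code:java}
// Hedged sketch: point Sqoop at the Hive install so the external `hive`
// binary can be found. setHiveHome exists on SqoopOptions in 1.4.x; that it
// feeds getHiveBinPath is my assumption.
SqoopOptions.setHiveHome("/usr/local/Cellar/hive/1.2.1/libexec");
{code}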
My intention is to run the Sqoop import into Hive from Java, prepackaging all
the necessary Hadoop jars, without needing the Hadoop binaries (hdfs, mapred,
hive, etc.) to be present on my system. Stepping through with the debugger, it
appears that there is a bug causing all of the Hadoop configs to be lost when
Sqoop reaches the Hive execution stage and runs it via
org.apache.hadoop.hive.cli.CliDriver.
Hoping that somebody has attempted Hive imports via Sqoop in this fashion and
has found a solution or a workaround.