[
https://issues.apache.org/jira/browse/SQOOP-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Kemper updated SQOOP-3030:
---------------------------------
Description:
It appears that if the column names supplied to the Sqoop --columns option do
not match the case of the columns in the Oracle RDBMS, then with (export +
--hcatalog + --direct) the Sqoop map task fails with an NPE, whereas the
non-direct path fails correctly with a clear ORA-00904 error. My test case is
below; please review and let me know if you have any questions.
Additionally, it would be nice if we could detect this column-case mismatch at
compile time and not submit the job only to have it fail in YARN.
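As a rough sketch of what such a pre-submit check might look like (the class
and method names here are illustrative, not actual Sqoop APIs), the client
could compare the user-supplied --columns values against the column names
reported by the JDBC metadata and fail fast on a case-only mismatch:
{code:java}
// Hypothetical pre-submit check; names are illustrative, not actual Sqoop APIs.
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class ColumnCaseValidator {

  /** Fails fast if a user-supplied column matches a database column
      in spelling but not in case. */
  public static void validate(Connection conn, String table,
      String[] userColumns) throws SQLException {
    List<String> dbColumns = new ArrayList<>();
    // JDBC metadata reports column names in the case the database stores them.
    try (ResultSet rs =
        conn.getMetaData().getColumns(null, null, table, null)) {
      while (rs.next()) {
        dbColumns.add(rs.getString("COLUMN_NAME"));
      }
    }
    for (String userCol : userColumns) {
      if (dbColumns.contains(userCol)) {
        continue; // exact match, including case
      }
      for (String dbCol : dbColumns) {
        if (dbCol.equalsIgnoreCase(userCol)) {
          throw new IllegalArgumentException("Column '" + userCol
              + "' does not match the case of database column '" + dbCol
              + "'; Oracle treats quoted identifiers as case sensitive.");
        }
      }
      throw new IllegalArgumentException("Unknown column: " + userCol);
    }
  }
}
{code}
A check along these lines could run right after the table metadata is read,
before the MapReduce job is submitted to YARN.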
{noformat}
TEST CASE:
STEP 01 : Create Oracle Source/Target Table
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query
"drop table t1"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query
"create table t1 (c1 int, c2 varchar(10))"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query
"insert into t1 values (1, 'some data')"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query
"select * from t1"
-------------------------------------
| C1 | C2 |
-------------------------------------
| 1 | some data |
-------------------------------------
STEP 02 : Import Oracle Table using Sqoop --hcatalog options
beeline -u jdbc:hive2:// -e "use default; drop table t1_text;"
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1
--num-mappers 1 --hcatalog-database default --hcatalog-table t1_text
--create-hcatalog-table --hcatalog-storage-stanza 'stored as textfile'
beeline -u jdbc:hive2:// -e "use default; select * from t1_text;"
Output:
<SNIP>
16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Database column names projected
: [c1, c2]
16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Database column name - info map
:
c1 : [Type : 2,Precision : 38,Scale : 0]
c2 : [Type : 12,Precision : 10,Scale : 0]
16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: Creating HCatalog table
default.t1_text for import
16/10/18 09:11:31 INFO hcat.SqoopHCatUtilities: HCatalog Create table
statement:
create table `default`.`t1_text` (
`c1` decimal(38),
`c2` varchar(10))
stored as textfile
<SNIP>
16/10/18 09:12:14 INFO mapreduce.ImportJobBase: Transferred 12 bytes in 34.4222
seconds (0.3486 bytes/sec)
16/10/18 09:12:14 INFO mapreduce.ImportJobBase: Retrieved 1 records.
---
+-------------+-------------+--+
| t1_text.c1 | t1_text.c2 |
+-------------+-------------+--+
| 1 | some data |
+-------------+-------------+--+
STEP 03 : Attempt Export to Oracle using Sqoop --hcatalog and --columns options
sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T2
--num-mappers 2 --hcatalog-database default --hcatalog-table t1_text --columns
"c1,c2"
Output: (failure; this is the correct behavior)
<SNIP>
16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: Database column names projected
: [c1, c2]
16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: Database column name - info map
:
c1 : [Type : 2,Precision : 38,Scale : 0]
c2 : [Type : 12,Precision : 10,Scale : 0]
<SNIP>
16/10/18 09:13:48 INFO hive.metastore: Connected to metastore.
16/10/18 09:13:48 INFO hcat.SqoopHCatUtilities: HCatalog full table schema
fields = [c1, c2]
16/10/18 09:13:49 INFO Configuration.deprecation: mapred.output.dir is
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
16/10/18 09:13:49 INFO hcat.SqoopHCatUtilities: HCatalog table partitioning key
fields = []
16/10/18 09:13:49 INFO hcat.SqoopHCatUtilities: HCatalog projected schema
fields = [c1, c2]
<SNIP>
16/10/18 09:14:22 INFO mapreduce.ExportJobBase: Exported 0 records.
16/10/18 09:14:22 ERROR tool.ExportTool: Error during export: Export job failed!
---
2016-10-18 09:14:19,418 INFO [main]
org.apache.hive.hcatalog.mapreduce.InternalUtil: Initializing
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe with properties
{transient_lastDdlTime=1476807097, name=default.t1_text,
serialization.null.format=\N, columns=c1,c2,
serialization.lib=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
serialization.format=1, columns.types=decimal(38,0),varchar(10)}
2016-10-18 09:14:19,660 INFO [Thread-12]
org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is
finished. keepGoing=false
2016-10-18 09:14:19,952 ERROR [Thread-11]
org.apache.sqoop.mapreduce.AsyncSqlOutputFormat: Got exception in update
thread: java.sql.SQLSyntaxErrorException: ORA-00904: "c2": invalid identifier
STEP 04 : Attempt Export to Oracle using Sqoop --hcatalog and --columns options
and Oracle --direct
sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T2
--num-mappers 2 --hcatalog-database default --hcatalog-table t1_text --columns
"c1,c2" --direct
Output: (failure; the NPE is not the correct behavior)
<SNIP>
16/10/18 09:17:47 INFO oracle.OraOopManagerFactory:
**************************************************
*** Using Data Connector for Oracle and Hadoop ***
**************************************************
16/10/18 09:17:47 INFO oracle.OraOopManagerFactory: Oracle Database version:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
16/10/18 09:17:47 INFO oracle.OraOopManagerFactory: This Oracle database is not
a RAC.
16/10/18 09:17:47 INFO Configuration.deprecation: mapred.map.max.attempts is
deprecated. Instead, use mapreduce.map.maxattempts
16/10/18 09:17:47 INFO tool.CodeGenTool: Beginning code generation
16/10/18 09:17:47 INFO manager.SqlManager: Executing SQL statement: SELECT
"C1","C2" FROM T2 WHERE 0=1
<SNIP>
16/10/18 09:17:50 INFO mapreduce.ExportJobBase: Configuring HCatalog for export
job
16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific
details for job
16/10/18 09:17:50 INFO manager.SqlManager: Executing SQL statement: SELECT
"C1","C2" FROM "T2" WHERE 1=0
16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Database column names projected
: [c1, c2]
16/10/18 09:17:50 INFO hcat.SqoopHCatUtilities: Database column name - info map
:
c1 : [Type : 2,Precision : 38,Scale : 0]
c2 : [Type : 12,Precision : 10,Scale : 0]
<SNIP>
16/10/18 09:17:51 INFO hive.metastore: Connected to metastore.
16/10/18 09:17:52 INFO hcat.SqoopHCatUtilities: HCatalog full table schema
fields = [c1, c2]
<SNIP>
16/10/18 09:18:25 INFO mapreduce.ExportJobBase: Exported 0 records.
16/10/18 09:18:25 ERROR tool.ExportTool: Error during export: Export job failed!
---
2016-10-18 09:18:23,561 INFO [main]
org.apache.sqoop.manager.oracle.OraOopOutputFormatBase: The number of rows per
batch is: 100
2016-10-18 09:18:23,561 INFO [main]
org.apache.sqoop.manager.oracle.OraOopOutputFormatBase: The number of batches
per commit is: 100
2016-10-18 09:18:23,721 INFO [main]
org.apache.sqoop.manager.oracle.OraOopOutputFormatBase: This record writer is
connected to Oracle via the JDBC URL:
"oracle.jdbc.driver.T4CConnection@34133979"
to the Oracle instance: "ORCL"
2016-10-18 09:18:23,828 INFO [main]
org.apache.sqoop.manager.oracle.OraOopOracleQueries: Session Time Zone set to
GMT
2016-10-18 09:18:23,883 INFO [main]
org.apache.sqoop.manager.oracle.OracleConnectionFactory: Initializing Oracle
session with SQL :
begin
dbms_application_info.set_module(module_name => 'Data Connector for Oracle
and Hadoop', action_name => 'export 20161018091747PDT');
end;
2016-10-18 09:18:23,883 WARN [main]
org.apache.sqoop.manager.oracle.OracleConnectionFactory: No Oracle 'session
initialization' statements were found to execute.
Check that your oraoop-site-template.xml and/or oraoop-site.xml files are
correctly installed in the ${SQOOP_HOME}/conf directory.
2016-10-18 09:18:24,116 INFO [main]
org.apache.hive.hcatalog.mapreduce.InternalUtil: Initializing
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe with properties
{transient_lastDdlTime=1476807097, name=default.t1_text,
serialization.null.format=\N, columns=c1,c2,
serialization.lib=org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
serialization.format=1, columns.types=decimal(38,0),varchar(10)}
2016-10-18 09:18:24,345 INFO [Thread-12]
org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is
finished. keepGoing=false
2016-10-18 09:18:24,357 INFO [main]
org.apache.sqoop.manager.oracle.OraOopOutputFormatBase: Batch-Mode insert
statement:
insert into "SQOOP"."T2"
("C1"
,"C2")
values
(:C1
,:C2)
2016-10-18 09:18:24,358 ERROR [main]
org.apache.sqoop.manager.oracle.OraOopOutputFormatInsert: The following error
occurred during configurePreparedStatement()
java.lang.NullPointerException
at
org.apache.sqoop.manager.oracle.OraOopOutputFormatBase$OraOopDBRecordWriterBase.setBindValueAtName(OraOopOutputFormatBase.java:432)
STEP 05 : Verify that Export without --direct and with correct column case works
sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1
--num-mappers 2 --hcatalog-database default --hcatalog-table t1_text --columns
"C1,C2"
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query
"select * from t1"
Output:
16/10/18 09:26:46 INFO mapreduce.ExportJobBase: Transferred 7.79 KB in 30.9303
seconds (257.9025 bytes/sec)
16/10/18 09:26:46 INFO mapreduce.ExportJobBase: Exported 1 records.
---
-------------------------------------
| C1 | C2 |
-------------------------------------
| 1 | some data |
| 1 | some data |
-------------------------------------
STEP 06 : Verify that Export with --direct and with correct column case works
sqoop export --connect $MYCONN --username $MYUSER --password $MYPSWD --table T1
--num-mappers 2 --hcatalog-database default --hcatalog-table t1_text --columns
"C1,C2" --direct
sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query
"select * from t1"
Output:
<SNIP>
16/10/18 09:30:40 INFO oracle.OraOopManagerFactory:
**************************************************
*** Using Data Connector for Oracle and Hadoop ***
**************************************************
<SNIP>
16/10/18 09:31:16 INFO mapreduce.ExportJobBase: Transferred 7.79 KB in 30.9987
seconds (257.333 bytes/sec)
16/10/18 09:31:16 INFO mapreduce.ExportJobBase: Exported 1 records.
---
-------------------------------------
| C1 | C2 |
-------------------------------------
| 1 | some data |
| 1 | some data |
| 1 | some data |
-------------------------------------
{noformat}
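For background, Oracle folds unquoted identifiers to upper case and treats
quoted identifiers as case sensitive, which is why the non-direct path above
fails cleanly: Sqoop quotes the user-supplied names, and "c2" is then an
invalid identifier (ORA-00904). My reading of the direct-path NPE is that the
generated batch insert uses the Oracle-cased bind names (:C1, :C2) while the
record fields remain keyed by the user-supplied lower-case names, so a
case-sensitive lookup inside setBindValueAtName() comes back null. A minimal
sketch of that suspected mechanism (not the actual OraOop code):
{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only; not the actual OraOop implementation, just the
// suspected failure mode reduced to a few lines.
public class BindByNameSketch {
  public static void main(String[] args) {
    // Field metadata keyed by the --columns values the user passed ("c1,c2").
    Map<String, Integer> fieldTypes = new HashMap<>();
    fieldTypes.put("c1", java.sql.Types.NUMERIC);
    fieldTypes.put("c2", java.sql.Types.VARCHAR);

    // The generated batch insert uses the Oracle-cased bind name ":C2".
    String bindName = "C2";

    // The case-sensitive lookup misses and returns null ...
    Integer sqlType = fieldTypes.get(bindName);

    // ... and auto-unboxing the null throws a NullPointerException,
    // analogous to the failure in setBindValueAtName().
    int jdbcType = sqlType;
    System.out.println(jdbcType);
  }
}
{code}
If that reading is right, normalizing both sides to the database's identifier
case, or failing fast as sketched earlier, would avoid the NPE in the map task.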
> Export to Oracle using (--direct + --hcatalog + --columns) with columns
> having the wrong case sensitivity fails with NPE
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: SQOOP-3030
> URL: https://issues.apache.org/jira/browse/SQOOP-3030
> Project: Sqoop
> Issue Type: Bug
> Components: connectors/oracle, hive-integration
> Reporter: Markus Kemper
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)