Hi
The more I tried various options to get the sqoop export to work, the more I
felt that this could be a bug. I started digging into the Sqoop internal code
to debug and found this logic used in the code:
method getTableColumns:
It tries to gather column names and column (Oracle & JDBC) data types by
Step 1) First retrieving column_name and data_type FROM dba_tab_columns table (
)
And
Step 2) Then by using the column_name returned from Step 1 to retrieve the JDBC
data types ( )
The details below are from me navigating through the code .
- Step 1) succeeds and Oracle returns column names as a, x, y -> (all lower
case as expected)
- Step 2) Sqoop tries to make the following SQL to retrieve JDBC column
metadata "select a,x,y from "TestV500"."Xy" where 0=1;"
It fails here with the error "ORA-00904: "Y": invalid identifier".
This is because Sqoop is not enclosing the column names in quotes in the select
statement as it should for cases where there are lower case column names.
The code that is problematic is in the class
org.apache.sqoop.manager.oracle.OraOopOracleQueries at Line Number 683.
This is incorrect:
columnList.append(result.get(idx).getName());
It should be something like this:
if(!result.get(idx).getName().equals(result.get(idx).getName().toUpperCase()))
columnList.append("\"" + result.get(idx).getName() + "\"");
else
columnList.append(result.get(idx).getName());
Thank you
RV
From: Ramya Vasudevan
Sent: Wednesday, November 18, 2015 2:56 PM
To: '[email protected]'
Subject: RE: SQOOP export to Oracle failing
Hi
I am still stuck in this issue. Any suggestion will be very helpful.
Thank you
RV
From: Ramya Vasudevan
Sent: Monday, November 16, 2015 3:41 PM
To: [email protected]<mailto:[email protected]>
Subject: RE: SQOOP export to Oracle failing
Thank You David.
I can see a little more information this time.
Export with direct option:
/usr/bin/sqoop export --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl
--direct --num-mappers 2 --username test --password password --export-dir
"/hdfs_nfs_mount/tmp/oracle/TestV100/Xy.txt" --table "\"\"TestV500\".\"Xy\"\""
--fields-terminated-by "\t" --input-null-string null -m 1 -verbose
Output:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/11/16 15:27:31 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.4.1
15/11/16 15:27:31 DEBUG tool.BaseSqoopTool: Enabled debug logging.
15/11/16 15:27:31 WARN tool.BaseSqoopTool: Setting your password on the
command-line is insecure. Consider using -P instead.
15/11/16 15:27:31 DEBUG sqoop.ConnFactory: Loaded manager factory:
org.apache.sqoop.manager.oracle.OraOopManagerFactory
15/11/16 15:27:31 DEBUG sqoop.ConnFactory: Loaded manager factory:
com.cloudera.sqoop.manager.DefaultManagerFactory
15/11/16 15:27:31 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
org.apache.sqoop.manager.oracle.OraOopManagerFactory
15/11/16 15:27:31 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle
and Hadoop can be called by Sqoop!
15/11/16 15:27:31 INFO manager.SqlManager: Using default fetchSize of 1000
15/11/16 15:27:32 INFO oracle.OraOopOracleQueries: Current schema is: TEST
15/11/16 15:27:32 DEBUG oracle.OraOopUtilities: The Oracle table context has
been derived from:
oracleConnectionUserName = test
tableStr = "TestV500"."Xy"
as:
owner : TestV500
table : Xy
15/11/16 15:27:32 INFO oracle.OraOopManagerFactory:
**************************************************
*** Using Data Connector for Oracle and Hadoop ***
**************************************************
15/11/16 15:27:32 DEBUG oracle.OraOopUtilities: Updated mapred.child.java.opts
from "null" to "-Djava.security.egd=file:///dev/urandom"
15/11/16 15:27:32 INFO oracle.OraOopManagerFactory: Oracle Database version:
Oracle Database 11g Release 11.1.0.6.0 - Production
15/11/16 15:27:32 INFO oracle.OraOopManagerFactory: This Oracle database is not
a RAC.
15/11/16 15:27:32 DEBUG oracle.OraOopManagerFactory: Setting mapper url
oraoop.mapper.jdbc.url.0 =
jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=adhoc-oracle11)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orcl)))
15/11/16 15:27:32 DEBUG oracle.OraOopManagerFactory: Setting mapper url
oraoop.mapper.jdbc.url.1 =
jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=adhoc-oracle11)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SID=orcl)))
15/11/16 15:27:32 INFO Configuration.deprecation: mapred.map.max.attempts is
deprecated. Instead, use mapreduce.map.maxattempts
15/11/16 15:27:32 DEBUG sqoop.ConnFactory: Instantiated ConnManager
org.apache.sqoop.manager.oracle.OraOopConnManager@7db63b8f<mailto:org.apache.sqoop.manager.oracle.OraOopConnManager@7db63b8f>
15/11/16 15:27:32 INFO tool.CodeGenTool: Beginning code generation
15/11/16 15:27:32 DEBUG oracle.OraOopOracleQueries: getTableColumns() : sql =
SELECT column_name, data_type FROM dba_tab_columns WHERE owner = ? and
table_name = ? and (DATA_TYPE IN
('BINARY_DOUBLE','BINARY_FLOAT','CHAR','DATE','FLOAT','NCHAR','NUMBER','NVARCHAR2','ROWID','URITYPE','VARCHAR2')
OR DATA_TYPE LIKE 'INTERVAL YEAR(%) TO MONTH' OR DATA_TYPE LIKE 'INTERVAL
DAY(%) TO SECOND(%)' OR DATA_TYPE LIKE 'TIMESTAMP(%)' OR DATA_TYPE LIKE
'TIMESTAMP(%) WITH TIME ZONE' OR DATA_TYPE LIKE 'TIMESTAMP(%) WITH LOCAL TIME
ZONE') ORDER BY column_id
15/11/16 15:27:32 ERROR sqoop.Sqoop: Got exception running Sqoop:
java.lang.RuntimeException: java.sql.SQLSyntaxErrorException: ORA-00904: "Y":
invalid identifier
java.lang.RuntimeException: java.sql.SQLSyntaxErrorException: ORA-00904: "Y":
invalid identifier
at
org.apache.sqoop.manager.oracle.OraOopConnManager.getColumnNamesInOracleTable(OraOopConnManager.java:133)
at
org.apache.sqoop.manager.oracle.OraOopConnManager.getSelectedColumnNamesInOracleTable(OraOopConnManager.java:144)
at
org.apache.sqoop.manager.oracle.OraOopConnManager.getColTypesQuery(OraOopConnManager.java:216)
at
org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226)
at
org.apache.sqoop.manager.oracle.OraOopConnManager.getColumnTypes(OraOopConnManager.java:504)
at
org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
at
org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1833)
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.sql.SQLSyntaxErrorException: ORA-00904: "Y": invalid identifier
at
oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:91)
at
oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:112)
at
oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:173)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:455)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:413)
at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:1030)
at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:183)
at
oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:774)
at
oracle.jdbc.driver.T4CStatement.executeMaybeDescribe(T4CStatement.java:849)
at
oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1186)
at
oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1377)
at
oracle.jdbc.driver.OracleStatementWrapper.executeQuery(OracleStatementWrapper.java:386)
at
org.apache.sqoop.manager.oracle.OraOopOracleQueries.getTableColumns(OraOopOracleQueries.java:689)
at
org.apache.sqoop.manager.oracle.OraOopOracleQueries.getTableColumns(OraOopOracleQueries.java:767)
at
org.apache.sqoop.manager.oracle.OraOopOracleQueries.getTableColumnNames(OraOopOracleQueries.java:624)
at
org.apache.sqoop.manager.oracle.OraOopConnManager.getColumnNamesInOracleTable(OraOopConnManager.java:124)
... 16 more
It looks like it is resolving the column "y" as Y and failing saying column Y
doesn't exist:
This is the table DDL:
CREATE TABLE "TestV500"."Xy"
( "a" NVARCHAR2(255) DEFAULT NULL,
"x" NUMBER(10,0) DEFAULT NULL,
"y" NUMBER(10,0) DEFAULT NULL
) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
TABLESPACE "USERS" ;
Any idea on how I can get past this error?
Thank You
RV
From: David Robson [mailto:[email protected]]
Sent: Monday, November 16, 2015 2:49 PM
To: [email protected]<mailto:[email protected]>
Subject: RE: SQOOP export to Oracle failing
Hi Ramya,
Have you tried the "-direct" option? I have tested mixed case table and schema
names with this before and it should work. We even added a section to the
documentation on quoting them which you can read here:
http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_quote_oracle_owners_and_tables
The only thing you will need to do is use more mappers - the direct connector
only works with 2 or more mappers. So add "-direct" to your command and
increase the number of mappers to at least 2.
David
From: Ramya Vasudevan [mailto:[email protected]]
Sent: Tuesday, 17 November 2015 5:20 AM
To: [email protected]<mailto:[email protected]>
Subject: SQOOP export to Oracle failing
Hi
I am facing an issue with export to Oracle:
Scenario:
sqoop export from hdfs to oracle where the schema and objects in oracle has to
be mixed case.
Issue:
I created a schema (TestV500) and table (Xy) in oracle 11g using sqoop export
commands:
sqoop eval --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl -username test
-password password -query "CREATE USER \"TestV500\" identified by \"password\""
sqoop eval --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl -username test
-password password -query "GRANT CONNECT TO \"TestV500\""
sqoop eval --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl -username test
-password password -query "ALTER USER \"TestV500\" QUOTA UNLIMITED ON USERS"
sqoop eval --connect jdbc:oracle:thin:@adhoc-oracle11:1521:orcl -username test
-password password -query "CREATE TABLE \"TestV500\".\"Xy\"( \"a\"
NVARCHAR2(255) DEFAULT NULL, \"x\" NUMBER(10,0) DEFAULT NULL, \"y\"
NUMBER(10,0) DEFAULT NULL )"
Installations used:
Oracle Version: 11.1.0.6.0
Sqoop Version: 1.4.5-cdh5.4.1
Ojdbc version: ojdbc6.jar
I am now trying to load the table using a tab delimited file:
/usr/bin/sudo -u hdfs sqoop export --connect
jdbc:oracle:thin:@oracle11:1521:orcl -username test -password password
"/hdfs_nfs_mount/tmp/oracle/TestV500/Xy.txt" --table "\"\"TestV500\".\"Xy\"\""
--fields-terminated-by "\t" --input-null-string null -m 1
I get this error :
15/11/12 09:33:03 INFO tool.CodeGenTool: Beginning code generation
15/11/12 09:33:03 DEBUG manager.OracleManager: Using column names query: SELECT
t.* FROM "TestV500"."Xy" t WHERE 1=0
15/11/12 09:33:03 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery :
SELECT t.* FROM "TestV500"."Xy" t WHERE 1=0
15/11/12 09:33:03 DEBUG manager.OracleManager: Creating a new connection for
jdbc:oracle:thin:@adhoc-oracle11:1521:orcl, using username: test
15/11/12 09:33:03 DEBUG manager.OracleManager: No connection paramenters
specified. Using regular API for making connection.
15/11/12 09:33:03 INFO manager.OracleManager: Time zone has been set to GMT
15/11/12 09:33:03 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
15/11/12 09:33:03 INFO manager.SqlManager: Executing SQL statement: SELECT t.*
FROM "TestV500"."Xy" t WHERE 1=0
15/11/12 09:33:04 DEBUG manager.SqlManager: Found column a of type [12, 255, 0]
15/11/12 09:33:04 DEBUG manager.SqlManager: Found column x of type [2, 10, 0]
15/11/12 09:33:04 DEBUG manager.SqlManager: Found column y of type [2, 10, 0]
15/11/12 09:33:04 DEBUG manager.OracleManager$ConnCache: Caching released
connection for jdbc:oracle:thin:@adhoc-oracle11:1521:orcl/test
15/11/12 09:33:04 ERROR sqoop.Sqoop: Got exception running Sqoop:
java.lang.IllegalArgumentException: There is no column found in the target
table "TestV500"."Xy". Please ensure that your table name is correct.
java.lang.IllegalArgumentException: There is no column found in the target
table "TestV500"."Xy". Please ensure that your table name is correct.
at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1658)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)
at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
This is the input file Xy.txt:
a 0 0
a 1 4
a 2 2
Observations:
1.
Schema and table was created successfully in oracle.
2.
If I connect to the database directly as 'test' user and run this command, it
works fine (it shows zero rows as nothing is loaded yet)
SELECT t.* FROM "TestV500"."Xy" t WHERE 1=0;
3.
If I create a schema called testv500 with a table called xy (without mixed
case), the exact same Sqoop export command, the input file loads the data in
the table.
Unless I am missing something here, I wonder if I hit a bug not being able to
handle mixed case.
Any help will be great!
Thank You,
RV