[jira] [Commented] (SQOOP-2748) Remove parentheses from WHERE clause

Markus Kemper (JIRA) Thu, 25 Aug 2016 05:33:58 -0700

    [ 
https://issues.apache.org/jira/browse/SQOOP-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436758#comment-15436758
 ]


Markus Kemper commented on SQOOP-2748:
--------------------------------------

Hello Warren,

I will defer to others on how to best address this for the IBM IMS use case 
however to the best of my knowledge the SQL is valid.  If the IMS JDBC driver 
does not allow this syntax perhaps an IBM IMS specific option via a --direct 
path would be a good approach.

As a workaround, does using >1 map task allow  your job to run?

The snips below suggest that Sqoop may generate different SQL with >1 map tasks 
possibly avoiding the condition that is impacting you.

sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table t1 
--target-dir /user/user1/t1 --delete-target-dir --as-textfile  --split-by c1 
--num-mappers 1 --verbose
<SNIP>
2016-08-24 05:12:08,403 INFO [main] org.apache.hadoop.mapred.MapTask: 
Processing split: 1=1 AND 1=1
2016-08-24 05:12:08,520 INFO [main] 
org.apache.sqoop.mapreduce.db.DBRecordReader: Working on split: 1=1 AND 1=1
2016-08-24 05:12:08,539 INFO [main] 
org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: SELECT `c1`, 
`c2` FROM `t1` AS `t1` WHERE ( 1=1 ) AND ( 1=1 )

sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --table t1 
--target-dir /user/user1/t1 --delete-target-dir --as-textfile  --split-by c1 
--num-mappers 2 --verbose
<SNIP>
2016-08-24 05:03:12,669 INFO [main] org.apache.hadoop.mapred.MapTask: 
Processing split: `c1` >= 1 AND `c1` < 5001
2016-08-24 05:03:12,756 INFO [main] 
org.apache.sqoop.mapreduce.db.DBRecordReader: Working on split: `c1` >= 1 AND 
`c1` < 5001
2016-08-24 05:03:12,774 INFO [main] 
org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: SELECT `c1`, 
`c2` FROM `t1` AS `t1` WHERE ( `c1` >= 1 ) AND ( `c1` < 5001 )

>From your output do I am curious if you know why the IBM IMS JDBC Driver 
>generates "SELECT ... FROM MYIMSTABLE AS MYIMSTABLE WHERE ..." instead of 
>"SELECT ... FROM MYIMSTABLE WHERE ..."?

> Remove parentheses from WHERE clause
> ------------------------------------
>
>                 Key: SQOOP-2748
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2748
>             Project: Sqoop
>          Issue Type: Wish
>          Components: codegen
>    Affects Versions: 1.4.6
>         Environment: Redhat 6
> Cloudera 5.5.0
>            Reporter: Warren Chisholm
>            Priority: Minor
>              Labels: import, ims, oneequalone, parentheses, sqoop
>
> When performing a sqoop import with option "-m 1" the following where clause 
> is added by sqoop:
> WHERE ( 1=1 ) AND ( 1=1 )
> This where clause is redundant and creates an incompatibility with the IBM 
> IMS Universal JDBC driver.
> The IMS JDBC Driver does not allow the use of parentheses in the where 
> clause. The below quote was taken from 2nd dot point here: 
> https://www-01.ibm.com/support/knowledgecenter/SSEPH2_13.1.0/com.ibm.ims13.doc.apg/ims_odbjdbcusagewhere.htm.
> "Do not use parentheses. Qualification statements are evaluated from left to 
> right. The order of evaluation for operators is the IMS evaluation order for 
> segment search arguments."
> I have modified the below commands and output to anonymize the details.
> Sample sqoop command:
> {code}
> sqoop import --connect 
> 'jdbc:ims://myserver:9999/MYDATABASE:currentSchema=MYSCHEMA;' -username 
> ???????? -password ???????? --null-string '\\N' --null-non-string '\\N' 
> --hive-import --create-hive-table --table MYIMSTABLE --hive-table MYHIVETABLE 
> --hive-drop-import-delims --driver com.ibm.ims.jdbc.IMSDriver 
> --delete-target-dir -m 1
> {code}
> Extract of log produced:
> {code}
> ...
> 15/12/16 17:20:05 INFO mapreduce.Job:  map 0% reduce 0%
> 15/12/16 17:20:16 INFO mapreduce.Job: Task Id : 
> attempt_1449625205240_0003_m_000000_0, Status : FAILED
> Error: java.io.IOException: SQLException in nextKeyValue
>         at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
>         at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
>         at 
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>         at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at 
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.sql.SQLException: Error parsing the following SQL statement: 
> SELECT FIELD1 FROM MYIMSTABLE AS MYIMSTABLE WHERE ( 1=1 ) AND ( 1=1 ). 
> Encountered " <ONEEQUALONE> "1=1 "" at line 1, column 1167.
> Was expecting:
>     <ID> ...
>         at com.ibm.ims.jdbc.SQLForDLIParser.parse(SQLForDLIParser.java:651)
>         at 
> com.ibm.ims.jdbc.PreparedStatementImpl.parseSQLQuery(PreparedStatementImpl.java:133)
>         at 
> com.ibm.ims.jdbc.PreparedStatementImpl.<init>(PreparedStatementImpl.java:87)
>         at 
> com.ibm.ims.jdbc.ConnectionImpl.prepareStatement(ConnectionImpl.java:978)
>         at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:101)
>         at 
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
>         ... 12 more
> {code}
> In summary, for sqoop to be compatible with the IBM IMS JDBC driver, I am 
> asking for either the where clause "( 1=1 ) AND ( 1=1 )" to be removed or for 
> the parentheses to be removed.
> Workaround
> The only workaround I have been able to come up with is to switch to using 
> the --query option instead of --table and hack the $CONDITIONS variable into 
> the select part of the query - to get rid of the parentheses from the where 
> clause.
> For example:
> {code}
> sqoop import --connect 
> 'jdbc:ims://myserver:9999/MYDATABASE:currentSchema=MYSCHEMA;' -username 
> ???????? -password ???????? --null-string '\\N' --null-non-string '\\N' 
> --hive-import --create-hive-table --query "select FIELD1,'\$CONDITIONS' as 
> CONDITIONS from MYIMSTABLE" --hive-table MYHIVETABLE 
> --hive-drop-import-delims --driver com.ibm.ims.jdbc.IMSDriver 
> --delete-target-dir -m 1 --target-dir test
> {code}
> Obviously this is not desirable as the commands to retrieve the metadata 
> (WHERE 1 = 0) will retrieve the data with it. The sql will look like this:
> select FIELD1,' (1 = 0) ' as CONDITIONS from MYIMSTABLE
> however it does import the data successfully (albeit inefficiently).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (SQOOP-2748) Remove parentheses from WHERE clause

Reply via email to