[ https://issues.apache.org/jira/browse/OOZIE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahesh Balakrishnan updated OOZIE-3218:
---------------------------------------
    Description: 
When running an Oozie Sqoop action whose command contains --query, the query is split into multiple parts, causing "Unrecognized argument:" errors, and the action in turn fails.

<sqoop xmlns="uri:oozie:sqoop-action:0.4">
    <job-tracker>${resourceManager}</job-tracker>
    <name-node>${nameNode}</name-node>
    <command>import --verbose --connect jdbc:mysql://test.openstacklocal/db --query select * from abc where $CONDITIONS --username test --password test --driver com.mysql.jdbc.Driver -m 1 </command>
</sqoop>
<ok to="end"/>

Oozie Launcher logs:
++++++++++++++++++++++++++++++++
Sqoop command arguments :
    import
    --verbose
    --connect
    jdbc:mysql://test.openstacklocal/db
    --query
    "select
    *
    from
    abc
    where
    $CONDITIONS"
    --username
    hive
    --password
    ********
    --driver
    com.mysql.jdbc.Driver
    -m
    1
Fetching child yarn jobs
tag id : oozie-a1bbe03a0983b9e822d12ae7bb269ee3
2791 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hdp263-3.openstacklocal/172.26.105.248:8050
Child yarn jobs are found -
=================================================================
>>> Invoking Sqoop command line now >>>
3172 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3172 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3218 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6.2.6.4.0-91
3218 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6.2.6.4.0-91
3287 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging.
3287 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging.
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments for import:
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments for import:
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: *
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: *
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: from
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: from
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: where
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: where
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: $CONDITIONS"
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: $CONDITIONS"
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --username
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --username
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --password
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --password
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: com.mysql.jdbc.Driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: com.mysql.jdbc.Driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: -m
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: -m
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1
3289 [main] DEBUG org.apache.sqoop.Sqoop -
Try --help for usage instructions.
+++++++++++++++++++++++++++++++++++++++++++

The code piece which causes the issue is (SqoopActionExecutor.java):
++++++++++++++++++++++++++++++++++
    String[] args;
    if (actionXml.getChild("command", ns) != null) {
        String command = actionXml.getChild("command", ns).getTextTrim();
        // Splits on every space, so a multi-word --query value becomes several arguments
        StringTokenizer st = new StringTokenizer(command, " ");
        List<String> l = new ArrayList<String>();
        while (st.hasMoreTokens()) {
            l.add(st.nextToken());
        }
        args = l.toArray(new String[l.size()]);
    }
    else {
        List<Element> eArgs = (List<Element>) actionXml.getChildren("arg", ns);
        args = new String[eArgs.size()];
        for (int i = 0; i < eArgs.size(); i++) {
            args[i] = eArgs.get(i).getTextTrim();
        }
    }

    setSqoopCommand(actionConf, args);
    return actionConf;
}
++++++++++++++++++++++++++++++++++

Since the delimiter is a space, the code splits "select * from table" into select, *, from, table (one nextToken() each) and adds them separately to the array, causing the issue.

I have made a code change locally to address this issue and did some testing around it, and it seems to work fine, hence I am submitting the code change for this.

  was:
When running an Oozie Sqoop action whose command contains --query, the query is split into multiple parts, causing "Unrecognized argument:" errors, and the action in turn fails.

<sqoop xmlns="uri:oozie:sqoop-action:0.4">
    <job-tracker>${resourceManager}</job-tracker>
    <name-node>${nameNode}</name-node>
    <command>import --verbose --connect jdbc:mysql://test.openstacklocal/db --query select * from abc where $CONDITIONS --username test --password test --driver com.mysql.jdbc.Driver -m 1 </command>
</sqoop>
<ok to="end"/>

Oozie Launcher logs:
++++++++++++++++++++++++++++++++
Sqoop command arguments :
    import
    --verbose
    --connect
    jdbc:mysql://test.openstacklocal/db
    --query
    "select
    *
    from
    abc
    where
    $CONDITIONS"
    --username
    hive
    --password
    ********
    --driver
    com.mysql.jdbc.Driver
    -m
    1
Fetching child yarn jobs
tag id : oozie-a1bbe03a0983b9e822d12ae7bb269ee3
2791 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hdp263-3.openstacklocal/172.26.105.248:8050
Child yarn jobs are found -
=================================================================
>>> Invoking Sqoop command line now >>>
3172 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3172 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3218 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6.2.6.4.0-91
3218 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6.2.6.4.0-91
3287 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging.
3287 [main] DEBUG org.apache.sqoop.tool.BaseSqoopTool - Enabled debug logging.
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments for import:
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments for import:
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: *
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: *
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: from
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: from
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: where
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: where
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: $CONDITIONS"
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: $CONDITIONS"
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --username
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --username
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --password
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --password
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: abc
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: com.mysql.jdbc.Driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: com.mysql.jdbc.Driver
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: -m
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: -m
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1
3288 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1
3289 [main] DEBUG org.apache.sqoop.Sqoop -
Try --help for usage instructions.
+++++++++++++++++++++++++++++++++++++++++++

The code piece which causes the issue is (SqoopActionExecutor.java):
++++++++++++++++++++++++++++++++++
    String[] args;
    if (actionXml.getChild("command", ns) != null) {
        String command = actionXml.getChild("command", ns).getTextTrim();
        // Splits on every space, so a multi-word --query value becomes several arguments
        StringTokenizer st = new StringTokenizer(command, " ");
        List<String> l = new ArrayList<String>();
        while (st.hasMoreTokens()) {
            l.add(st.nextToken());
        }
        args = l.toArray(new String[l.size()]);
    }
    else {
        List<Element> eArgs = (List<Element>) actionXml.getChildren("arg", ns);
        args = new String[eArgs.size()];
        for (int i = 0; i < eArgs.size(); i++) {
            args[i] = eArgs.get(i).getTextTrim();
        }
    }

    setSqoopCommand(actionConf, args);
    return actionConf;
}
++++++++++++++++++++++++++++++++++

Since the delimiter is a space, the code splits "select * from table" into select, *, from, table (one nextToken() each) and adds them separately to the array, causing the issue.
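
For reference, the splitting described above can be reproduced outside Oozie. The sketch below is not the attached patch: the class name SqoopCommandSplit and the method quoteAwareSplit are hypothetical names used only for illustration. It contrasts the StringTokenizer behaviour with one possible quote-aware split, assuming the query is wrapped in double quotes inside the command text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class SqoopCommandSplit {

    // Mirrors the splitting in the snippet above: every space starts a new argument.
    public static List<String> naiveSplit(String command) {
        List<String> args = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(command, " ");
        while (st.hasMoreTokens()) {
            args.add(st.nextToken());
        }
        return args;
    }

    // Hypothetical alternative (an assumption, not Oozie code): spaces inside
    // double quotes do not split, and the quote characters are dropped.
    // Limitation: an empty quoted string ("") produces no argument at all.
    public static List<String> quoteAwareSplit(String command) {
        List<String> args = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : command.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;          // toggle quoted state, drop the quote
            } else if (c == ' ' && !inQuotes) {
                if (current.length() > 0) {    // end of an argument
                    args.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {            // flush the last argument
            args.add(current.toString());
        }
        return args;
    }

    public static void main(String[] cmdLine) {
        String command = "import --query \"select * from abc where $CONDITIONS\" -m 1";
        System.out.println(naiveSplit(command));
        System.out.println(quoteAwareSplit(command));
    }
}
```

With the sample command, naiveSplit returns ten arguments (the query alone accounts for six of them), while quoteAwareSplit returns five and keeps the query as a single argument. The else branch of the code above also shows another route: arguments supplied as separate arg elements are read one element at a time and are not tokenized.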

I have made a code change locally to address this issue and did some testing around it, and it seems to work fine, hence I am submitting the code change for this:

    String[] args;
    if (actionXml.getChild("command", ns) != null) {
        String command = actionXml.getChild("command", ns).getTextTrim();
        // Added this to get the value for select clause to be appended
        String QueryAppendStr = "";
        StringTokenizer st = new StringTokenizer(command, " ");
        List<String> l = new ArrayList<String>();
        while (st.hasMoreTokens()) {
            // added to get the command delimited value to check and see if it needs
            // to be appended or could it be directly added to the list.
            String QueryStr = (String) st.nextToken();
            if (!(QueryStr.contains("--") || QueryStr.contains("-") || QueryStr.contains("-D"))) {
                QueryAppendStr = QueryAppendStr + QueryStr + " ";
            }
            else {
                if (!(QueryAppendStr.trim().equals(null) || QueryAppendStr.trim().equals(""))) {
                    LOG.debug("Append : [{0}]", QueryAppendStr.trim());
                    l.add(QueryAppendStr.trim());
                    QueryAppendStr = "";
                }
                LOG.debug("Actual : [{0}]", QueryStr);
                //l.add(st.nextToken());
                l.add(QueryStr);
            }
        }
        l.add(QueryAppendStr.trim());
        args = l.toArray(new String[l.size()]);
    }
    else {
        List<Element> eArgs = (List<Element>) actionXml.getChildren("arg", ns);
        args = new String[eArgs.size()];
        for (int i = 0; i < eArgs.size(); i++) {
            args[i] = eArgs.get(i).getTextTrim();
        }
    }

    setSqoopCommand(actionConf, args);
    return actionConf;
}


> Oozie Sqoop action with command splits the select clause into multiple parts due to delimiter being space
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-3218
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3218
>             Project: Oozie
>          Issue Type: Bug
>          Components: action, workflow
>    Affects Versions: 3.3.2, 4.1.0, 4.2.0, 4.3.0
>         Environment: Hortonworks Hadoop HDP-2.6.4.x release
> oozie admin -version: Oozie server build version: 4.2.0.2.6.4.0-91
>            Reporter: Mahesh Balakrishnan
>            Priority: Major
>         Attachments: OOZIE-3218.patch
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)