Alejandro,

Thank you for the advice :)

Also, while I agree Hive integration is not supported, I have not had any issues with it as long as you supply a flattened site and the Hive CLI is on the node. We are breaking it into separate steps shortly, but for the sake of our ETL it is easier to refactor this later.

--
Matt

On Thu, Aug 16, 2012 at 12:10 PM, Alejandro Abdelnur <[email protected]> wrote:

> Matt,
>
> For such an invocation, instead of using <command> you should use multiple
> <arg> elements in the <sqoop> action; see the Sqoop docs for details:
>
> http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/DG_SqoopActionExtension.html
>
> BTW, Sqoop Hive integration is not supported and it does not work
> when invoking sqoop as an Oozie action.
>
> Thx
>
> On Thu, Aug 16, 2012 at 12:06 PM, Matt Goeke <[email protected]> wrote:
> > Johannes,
> >
> > That definitely fixed the parsing issue, and now it seems to be purely
> > an issue with the sqoop action. I can execute the statement in
> > SQOOP_STATEMENT below from the command line, but for some reason I get
> > the following error when I submit it to the action:
> >
> > 986 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Error parsing arguments for import:
> > 986 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: >=
> > 986 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: '2012-08-14
> > 986 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: 00:00:00'
> > 986 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: and
> > 986 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: created
> > 986 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: >
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: '2012-08-15
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: 00:00:00'"
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --compress
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --hive-import
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --warehouse-dir
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: hdfs://ip:port/tmp/dir
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --hive-drop-import-delims
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --hive-table
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: schema.dir
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --hive-overwrite
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --null-string
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: NULL
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --null-non-string
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: NULL
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: --num-mappers
> > 987 [main] ERROR com.cloudera.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1
> >
> > It looks as if it is incorrectly parsing the where clause even though it
> > is enclosed as a literal. I will continue to dig into this, but if anyone
> > has suggestions I am game.
> >
> > Thanks.
> > --
> > Matt
> >
> > On Thu, Aug 16, 2012 at 2:20 AM, Johannes Schwenk <[email protected]> wrote:
> >
> >> Just out of the blue:
> >>
> >> Did you try wrapping the content with <![CDATA[ ... ]]> ?
> >>
> >> On 16.08.2012 02:31, Matt Goeke wrote:
> >> > Alejandro,
> >> >
> >> > Here is a more verbose example.
> >> > The workflow itself submits correctly but fails when it hits the sqoop
> >> > action. What's funny is the character entity references make it through
> >> > a subworkflow transition and then in the action itself I see it get
> >> > translated correctly in the action configuration tab. It almost looks
> >> > like Oozie itself is not the issue but the way the Sqoop action is
> >> > referencing the SQOOP_STATEMENT variable.
> >> >
> >> > job.properties-----------------------------------
> >> > <property>
> >> > <name>SQOOP_STATEMENT</name>
> >> > <value>import --connect jdbc:mysql://ip:port/schema --username user
> >> > --password password --table table --where "created >= '2012-08-14
> >> > 00:00:00' and created < '2012-08-15 00:00:00'" --compress --hive-import
> >> > --warehouse-dir hdfs://ip:port/tmp/directory --hive-drop-import-delims
> >> > --hive-table schema.table --hive-overwrite --null-string NULL
> >> > --null-non-string NULL --num-mappers 1 </value>
> >> > </property>
> >> >
> >> > Workflow.xml------------------------------------
> >> > <action name="sqoop-node">
> >> > <sqoop xmlns="uri:oozie:sqoop-action:0.2">
> >> > <job-tracker>${JOB_TRACKER}</job-tracker>
> >> > <name-node>${NAME_NODE}</name-node>
> >> > <prepare>
> >> > <delete path="${NAME_NODE}${STAGING_EXTRACTION_DIR}"/>
> >> > </prepare>
> >> > <configuration>
> >> > <property>
> >> > <name>mapred.job.queue.name</name>
> >> > <value>${QUEUE_NAME}</value>
> >> > </property>
> >> > </configuration>
> >> > <command>${SQOOP_STATEMENT}</command>
> >> > </sqoop>
> >> > <ok to="hive-node"/>
> >> > <error to="fail"/>
> >> > </action>
> >> >
> >> > Error-------------------------------------------
> >> > JA007: Error on line 13: The content of elements must consist of
> >> > well-formed character data or markup.
> >> >
> >> > On Wed, Aug 15, 2012 at 2:45 PM, Alejandro Abdelnur <[email protected]> wrote:
> >> >
> >> >> Matt,
> >> >>
> >> >> Would you share an example of the job.properties & workflow.xml you
> >> >> are having trouble with? Please obfuscate all confidential/unneeded
> >> >> stuff.
> >> >>
> >> >> thx.
> >> >>
> >> >> On Wed, Aug 15, 2012 at 2:29 PM, Matt Goeke <[email protected]> wrote:
> >> >>> All,
> >> >>>
> >> >>> Does anyone know how to propagate symbols through the workflow that
> >> >>> normally would be escaped in XML? I am trying to push sqoop / hive
> >> >>> statements through a workflow xml that have greater-than and less-than
> >> >>> symbols, but I keep getting an error stating that the xml is unchecked.
> >> >>> We have also tried letting a DOM parser handle the translation, but I
> >> >>> get the same error even when it is shown as &gt; and &lt;
> >> >>>
> >> >>> Thanks for any advice!
> >> >>>
> >> >>> --
> >> >>> Matt
> >> >>
> >> >> --
> >> >> Alejandro
> >> >>
> >> >
> >>
> >> Johannes Schwenk
> >>
> >> --
> >> Software Developer (Reporting)
> >> ________________________________________________________
> >>
> >> ADITION technologies AG
> >> Schwarzwaldstraße 78b
> >> 79117 Freiburg
> >>
> >> http://www.adition.com
> >>
> >> T +49 / (0)761 / 88147 - 30
> >> F +49 / (0)761 / 88147 - 77
> >> SUPPORT +49 / (0)1805 - ADITION
> >>
> >> (Landline price 14 ct/min; mobile prices at most 42 ct/min)
> >>
> >> Registered at the Düsseldorf District Court under HRB 54076
> >> Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
> >> Chairman of the Supervisory Board: Daniel Raimer, attorney
> >> VAT ID No.: DE 218 858 434
> >>
> >
>
> --
> Alejandro
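
For reference, the <arg> form suggested in this thread might look roughly like the fragment below. It is a minimal sketch assembled from the workflow.xml and SQOOP_STATEMENT quoted above, not a tested configuration: the connection values are the same obfuscated placeholders, only part of the original statement is shown, and the remaining flags would follow the same one-token-per-<arg> pattern. Per the Oozie Sqoop action documentation, <command> is split on every space, which matches the "Unrecognized argument" errors for the pieces of the where clause. Each <arg> is passed through as a single argument, so the sketch assumes the where clause can go in one <arg> without the outer double quotes (no shell is involved to strip them), with the comparison operators written as entity references so the XML stays well-formed.

```xml
<action name="sqoop-node">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${JOB_TRACKER}</job-tracker>
        <name-node>${NAME_NODE}</name-node>
        <prepare>
            <delete path="${NAME_NODE}${STAGING_EXTRACTION_DIR}"/>
        </prepare>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${QUEUE_NAME}</value>
            </property>
        </configuration>
        <!-- One arg element per token; spaces inside an arg are kept. -->
        <arg>import</arg>
        <arg>--connect</arg>
        <arg>jdbc:mysql://ip:port/schema</arg>
        <arg>--username</arg>
        <arg>user</arg>
        <arg>--password</arg>
        <arg>password</arg>
        <arg>--table</arg>
        <arg>table</arg>
        <arg>--where</arg>
        <!-- Whole clause in a single arg; operators escaped as entities. -->
        <arg>created &gt;= '2012-08-14 00:00:00' and created &lt; '2012-08-15 00:00:00'</arg>
        <arg>--num-mappers</arg>
        <arg>1</arg>
    </sqoop>
    <ok to="hive-node"/>
    <error to="fail"/>
</action>
```

Wrapping the clause in <![CDATA[ ... ]]> inside the <arg>, as suggested earlier in the thread, should be an equivalent alternative to the entity references.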
