Thanks for the suggestions, Gwen Shapira.
-----Original Message-----
From: Gwen Shapira [mailto:[email protected]]
Sent: Thursday, August 14, 2014 12:24 PM
To: [email protected]
Subject: Re: Sqoop Import parallel sessions - Question

Sqoop needs to write to a directory that doesn't exist yet. Since both of your jobs try to write to the same directory, one will complain that the directory already exists.

You can use the --warehouse-dir or --target-dir parameters to make sure each job writes to its own directory. Or, you can use the --hive-partition-key and --hive-partition-value parameters to import the data into separate Hive partitions (which makes sense from a table design perspective, too).

On Thu, Aug 14, 2014 at 9:12 AM, Sethuramaswamy, Suresh <[email protected]> wrote:
> Sure.
>
> This is my command. When I run 2 commands in parallel, I get the exception
> mentioned below.
>
> sqoop import --connect jdbc:oracle:thin:@<<ORACLE DB DETAILS>> --table
> <Table_name> --where "date between '01-JAN-2013' and '30-JAN-2013'" -m 1
> --hive-import --hive-table <hive tablename> --compression-codec
> org.apache.hadoop.io.compress.SnappyCodec --null-string '\\N'
> --null-non-string '\\N' --hive-drop-import-delims;
>
> ...
>
> sqoop import --connect jdbc:oracle:thin:@<<ORACLE DB DETAILS>> --table
> <Table_name> --where "date between '01-DEC-2013' and '31-DEC-2013'" -m 1
> --hive-import --hive-table <hive tablename> --compression-codec
> org.apache.hadoop.io.compress.SnappyCodec --null-string '\\N'
> --null-non-string '\\N' --hive-drop-import-delims;
>
> Exception:
>
> 14/08/14 12:04:57 ERROR tool.ImportTool: Encountered IOException running
> import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output
> directory <SCHEMA>.<TABLENAME> already exists
>         at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:987)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:948)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:948)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:582)
>         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:612)
>         at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
>         at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
>         at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:247)
>         at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:614)
>         at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:436)
>         at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
>         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
>         at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
>         at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
>
> -----Original Message-----
> From: Jarek Jarcec Cecho [mailto:[email protected]] On Behalf Of Jarek Jarcec Cecho
> Sent: Thursday, August 14, 2014 11:41 AM
> To: [email protected]
> Subject: Re: Sqoop Import parallel sessions - Question
>
> It would be helpful if you could share your entire Sqoop commands and the
> exact exception with its stack trace.
>
> Jarcec
>
> On Aug 14, 2014, at 7:57 AM, Sethuramaswamy, Suresh <[email protected]> wrote:
>
>> Team,
>>
>> We initiate a Sqoop import for one month's worth of records per session,
>> so I need to run 12 such statements in parallel in order to read a full
>> year's worth of data. While I do this, I keep getting the error that the
>> <SCHEMA>.<TABLENAME> folder already exists. This is because all of these
>> sessions are initiated with the same uid and so share the same temporary
>> mapred HDFS folder under the user's home directory until each completes.
>>
>> Is there a better option for me to accomplish this?
>>
>> Thanks
>> Suresh Sethuramaswamy
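
For reference, a minimal sketch of Gwen's first suggestion: one import per month, each staging into its own --target-dir so the parallel jobs can never race for the same output path. This is an untested illustration, not a verified command set; the /user/suresh/staging paths, the import loop, and the half-open date ranges are assumptions introduced here, while <<ORACLE DB DETAILS>>, <Table_name>, and <hive tablename> are the placeholders carried over from the commands above and must be filled in.

    #!/bin/bash
    # Sketch only: launch twelve monthly imports in parallel, each with
    # its own staging directory. Half-open ranges (>= start, < next
    # month's start) avoid per-month end-of-month arithmetic.
    MONTHS=(JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC)
    for i in $(seq 0 11); do
        start="01-${MONTHS[$i]}-2013"
        if [ "$i" -eq 11 ]; then
            end="01-JAN-2014"
        else
            end="01-${MONTHS[$((i + 1))]}-2013"
        fi
        # A distinct --target-dir per job prevents the
        # FileAlreadyExistsException seen above.
        sqoop import --connect jdbc:oracle:thin:@<<ORACLE DB DETAILS>> \
            --table <Table_name> \
            --where "date >= '$start' and date < '$end'" -m 1 \
            --target-dir /user/suresh/staging/month_$((i + 1)) \
            --hive-import --hive-table <hive tablename> \
            --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
            --null-string '\\N' --null-non-string '\\N' \
            --hive-drop-import-delims &
    done
    wait   # block until all twelve imports finish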

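Gwen's second suggestion lands each month in its own Hive partition instead. A sketch for a single month, again with assumed names: import_month is a hypothetical partition column (it must not already exist among the imported source columns) and the staging path is illustrative; the flags are Sqoop's --hive-partition-key and --hive-partition-value.

    # Sketch only: import January 2013 into partition import_month=2013-01.
    sqoop import --connect jdbc:oracle:thin:@<<ORACLE DB DETAILS>> \
        --table <Table_name> \
        --where "date >= '01-JAN-2013' and date < '01-FEB-2013'" -m 1 \
        --target-dir /user/suresh/staging/month_01 \
        --hive-import --hive-table <hive tablename> \
        --hive-partition-key import_month \
        --hive-partition-value 2013-01 \
        --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
        --null-string '\\N' --null-non-string '\\N' \
        --hive-drop-import-delims

Note that the FileAlreadyExistsException above is raised at the MapReduce staging step, before the Hive load, so partitioning alone presumably would not keep parallel runs from colliding; each job would still get its own --target-dir, as in the first sketch.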