Hi Eric, would you mind sharing with us your entire data flow, starting with the exact Sqoop import command, then any Hive transformations you are doing, and finally the Sqoop export command?
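For context, a data flow of the shape being asked about typically looks something like the sketch below. The connection string, table name, and warehouse directory are hypothetical placeholders rather than Eric's actual setup, and the import uses Sqoop's default text format, which is the case the export tool reads directly.

    # Hypothetical import into Hive (default text format, fields delimited by \001):
    sqoop import --connect 'jdbc:mysql://dbhost:3306/hadoop' --username hadoop -P \
        --table tableA --hive-import -m 1

    # ...any Hive transformations on the imported table happen here...

    # Hypothetical export of the Hive-backed HDFS directory back to MySQL:
    sqoop export --connect 'jdbc:mysql://dbhost:3306/hadoop' --username hadoop -P \
        --table tableA --export-dir /user/hive/warehouse/tablea -m 1 \
        --input-fields-terminated-by '\001'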
Importing data into Hive using the SequenceFile format is not supported by Sqoop, so I would like to make sure that we are understanding your use case correctly.

Jarcec

On Wed, Jul 24, 2013 at 05:17:30PM -0700, Eric Hernandez wrote:
> Here are my logs
>
> sqoop export --connect 'jdbc:mysql://mysqlServer:3306/hadoop' --username=hadoop -P --table=dbo_tablea --export-dir /hive/dbo_tablea -m 1 --input-fields-terminated-by '\001'
> Enter password:
> 13/07/24 17:07:58 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
> 13/07/24 17:07:58 INFO tool.CodeGenTool: Beginning code generation
> 13/07/24 17:07:58 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dbo_tablea` AS t LIMIT 1
> 13/07/24 17:07:58 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dbo_tablea` AS t LIMIT 1
> 13/07/24 17:07:58 INFO orm.CompilationManager: HADOOP_HOME is /usr/lib/hadoop
> Note: /tmp/sqoop-erich/compile/5287b2ea7807ccef31ae33420fbbb7a0/dbo_tablea.java uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 13/07/24 17:08:00 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-erich/compile/5287b2ea7807ccef31ae33420fbbb7a0/dbo_tablea.jar
> 13/07/24 17:08:00 INFO mapreduce.ExportJobBase: Beginning export of dbo_tablea
> 13/07/24 17:08:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 13/07/24 17:08:02 INFO input.FileInputFormat: Total input paths to process : 1
> 13/07/24 17:08:02 INFO input.FileInputFormat: Total input paths to process : 1
> 13/07/24 17:08:03 INFO mapred.JobClient: Running job: job_201302261137_303267
> 13/07/24 17:08:04 INFO mapred.JobClient: map 0% reduce 0%
> 13/07/24 17:08:20 INFO mapred.JobClient: Task Id : attempt_201302261137_303267_m_000000_0, Status : FAILED
> java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.LongWritable
>     at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:95)
>     at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:38)
>     at org.apache.sqoop.mapreduce.CombineFileRecordReader.getCurrentKey(CombineFileRecordReader.java:77)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getCurrentKey(MapTask.java:436)
>     at org.apache.hadoop.mapreduce.task.MapContextImpl.getCurrentKey(MapContextImpl.java:66)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.getCurrentKey(WrappedMapper.java:75)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>     at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>     at org.apache.hadoop.mapred.Child$4.run(Child.ja
> 13/07/24 17:08:30 INFO mapred.JobClient: Task Id : attempt_201302261137_303267_m_000000_1, Status : FAILED
> java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.LongWritable
>     at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:95)
>     at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:38)
>     at org.apache.sqoop.mapreduce.CombineFileRecordReader.getCurrentKey(CombineFileRecordReader.java:77)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getCurrentKey(MapTask.java:436)
>     at org.apache.hadoop.mapreduce.task.MapContextImpl.getCurrentKey(MapContextImpl.java:66)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.getCurrentKey(WrappedMapper.java:75)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>     at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>     at org.apache.hadoop.mapred.Child$4.run(Child.ja
> 13/07/24 17:08:38 INFO mapred.JobClient: Task Id : attempt_201302261137_303267_m_000000_2, Status : FAILED
> java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.LongWritable
>     at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:95)
>     at org.apache.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:38)
>     at org.apache.sqoop.mapreduce.CombineFileRecordReader.getCurrentKey(CombineFileRecordReader.java:77)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getCurrentKey(MapTask.java:436)
>     at org.apache.hadoop.mapreduce.task.MapContextImpl.getCurrentKey(MapContextImpl.java:66)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.getCurrentKey(WrappedMapper.java:75)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
>     at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>     at org.apache.hadoop.mapred.Child$4.run(Child.ja
> 13/07/24 17:08:48 INFO mapred.JobClient: Job complete: job_201302261137_303267
> 13/07/24 17:08:48 INFO mapred.JobClient: Counters: 8
> 13/07/24 17:08:48 INFO mapred.JobClient:   Job Counters
> 13/07/24 17:08:48 INFO mapred.JobClient:     Failed map tasks=1
> 13/07/24 17:08:48 INFO mapred.JobClient:     Launched map tasks=4
> 13/07/24 17:08:48 INFO mapred.JobClient:     Data-local map tasks=1
> 13/07/24 17:08:48 INFO mapred.JobClient:     Rack-local map tasks=2
> 13/07/24 17:08:48 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=30893
> 13/07/24 17:08:48 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
> 13/07/24 17:08:48 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
> 13/07/24 17:08:48 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
> 13/07/24 17:08:48 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 46.3496 seconds (0 bytes/sec)
> 13/07/24 17:08:48 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
> 13/07/24 17:08:48 INFO mapreduce.ExportJobBase: Exported 0 records.
> 13/07/24 17:08:48 ERROR tool.ExportTool: Error during export: Export job failed!
>
>
> On Jul 24, 2013, at 4:19 PM, Eric wrote:
>
> Yes, I can get the logs, but first I am going to have to mock it up in my lab with some dummy data and credentials. I should be able to provide full logs tomorrow.
>
> My darn signature leaked out on my last reply. If anybody can scrub my last post and remove my signature, that would be awesome.
>
> Thanks,
> -Eric
>
>
> On Jul 24, 2013, at 3:51 PM, Abraham wrote:
>
> Eric,
>
> The middle command seems right.
> Could you provide the rest of your logs? It will help us understand where in the process Sqoop fails.
>
> -Abe
>
>
> I have tried many different variations, all with the same result:
>
> sqoop export --connect 'jdbc:mysql://mysqlIP:3306/hadoop' --username=hadoop --password='sanitized' --table=tableA --export-dir /hive/tableA -m 1 --fields-terminated-by '\001'
>
> sqoop export --connect 'jdbc:mysql://mysqlIP:3306/hadoop' --username=hadoop --password='sanitized' --table=tableA --export-dir /hive/tableA -m 1 --input-fields-terminated-by '\001'
>
> sqoop export --connect 'jdbc:mysql://mysqlIP:3306/hadoop' --username=hadoop --password='sanitized' --table=tableA --export-dir /hive/tableA -m 1
>
>
> Hey Eric,
>
> I believe it's possible. Can you provide the command you are using?
>
> -Abe
>
>
> On Wed, Jul 24, 2013 at 2:54 PM, Eric Hernandez wrote:
> Hi,
> Is it possible to sqoop data out of Hive back into an RDBMS like MySQL or SQL Server when it has been imported via Sqoop as a sequence file?
>
> I have been trying all day to get data back out of Hive and I keep getting this error no matter what I try:
>
> "java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.LongWritable"
>
> I am using Sqoop 1.4.1-cdh4.1.2
>
> Thanks,
>
> Eric H
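The ClassCastException in the logs above is the export map task choking on the table's underlying files: the record reader expects text-style LongWritable keys, but Hive-written SequenceFiles carry BytesWritable keys, which fits the table having been imported as a sequence file. One commonly suggested workaround is to stage the data into a TEXTFILE-backed Hive table and export that directory instead. The sketch below is hypothetical: the staging table name is made up and the export directory assumes the default warehouse path, so both would need to be adjusted to the real layout.

    # Hypothetical staging step: copy the SequenceFile-backed table into a plain-text
    # table that the Sqoop export MapReduce job can read as delimited text.
    hive -e "CREATE TABLE dbo_tablea_export
             ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
             STORED AS TEXTFILE
             AS SELECT * FROM dbo_tablea;"

    # Export the staged text directory; the delimiter matches the staging table above.
    sqoop export --connect 'jdbc:mysql://mysqlServer:3306/hadoop' --username=hadoop -P \
        --table=dbo_tablea --export-dir /user/hive/warehouse/dbo_tablea_export -m 1 \
        --input-fields-terminated-by '\001'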
