I believe that Pig's SequenceFileLoader (the piggybank loader you are using) is not compatible with custom Writables at the moment. Per the docs, it can only work with the following types:
Text, IntWritable, LongWritable, FloatWritable, DoubleWritable, BooleanWritable, ByteWritable

Jarcec

On Mon, Nov 04, 2013 at 07:18:43PM +1100, Andre Araujo wrote:
> Hi, all,
>
> I've loaded some data with Sqoop from Oracle onto HDFS, storing it as
> SequenceFiles and I'm having problems loading it with Pig.
> I'm using Sqoop 1.4.3 and used the following steps (simplified example
> using the DUAL table).
>
> Any ideas of why it loads incorrectly? Am I missing any steps?
>
> Thanks,
> Andre
>
> *1. Imported data from the table onto HDFS (the DUAL table has only 1 row
> with 1 field containing the string "X")*
>
> sqoop import -D mapred.child.java.opts="$JDBC_JAVA_OPTS" --connect $CONNSTR \
>   -m 1 --query "select DUMMY from dual where \$CONDITIONS" --target-dir test \
>   --as-sequencefile --class-name com.acme.Dual
>
> The Dual.java file is attached.
>
> *2. Generated the Dual.jar file:*
>
> javac -cp /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/sqoop/sqoop-1.4.3-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop/client-0.20/hadoop-core-2.0.0-mr1-cdh4.3.0.jar:/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop/hadoop-common.jar:. com/acme/Dual.java
> jar cf /tmp/Dual.jar com/acme/Dual.class
>
> *3. Tried to load the data with Pig, however, the field value is read as 0
> (zero) instead of the string "X":*
>
> REGISTER /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/pig/piggybank.jar;
> REGISTER /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/sqoop/sqoop-1.4.3-cdh4.3.0.jar
> REGISTER /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop/client-0.20/hadoop-core-2.0.0-mr1-cdh4.3.0.jar
> REGISTER /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/lib/hadoop/hadoop-common.jar
> REGISTER /tmp/Dual.jar
> DEFINE SequenceFileLoader org.apache.pig.piggybank.storage.SequenceFileLoader();
> log = LOAD 'test' USING SequenceFileLoader AS (DUMMY:chararray);
> DUMP log;
>
> ...
> 2013-11-04 03:21:32,325 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
>
> HadoopVersion   PigVersion       UserId  StartedAt            FinishedAt           Features
> 2.0.0-cdh4.3.0  0.11.0-cdh4.3.0  araujo  2013-11-04 03:21:12  2013-11-04 03:21:32  UNKNOWN
>
> Success!
>
> Job Stats (time in seconds):
> JobId  Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReducetime  Alias  Feature  Outputs
> job_201310230912_0065  1  0  6  6  6  6  0  0  0  0  log  MAP_ONLY  hdfs://n1.hadoop.cto.pythian.com:8020/tmp/temp-805635901/tmp-702886222,
>
> Input(s):
> Successfully read 1 records (479 bytes) from: "hdfs://n1.hadoop.cto.pythian.com:8020/user/araujo/test"
>
> Output(s):
> Successfully stored 1 records (8 bytes) in: "hdfs://n1.hadoop.cto.pythian.com:8020/tmp/temp-805635901/tmp-702886222"
>
> Counters:
> Total records written : 1
> Total bytes written : 8
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 0
> Total records proactively spilled: 0
>
> Job DAG:
> job_201310230912_0065
>
> 2013-11-04 03:21:32,338 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2013-11-04 03:21:32,342 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
> 2013-11-04 03:21:32,350 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2013-11-04 03:21:32,350 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> *(0) <--- THIS SHOULD SHOW "X"*
>
> --
> André Araújo
> Database Administrator / SDM
> The Pythian Group - Australia - www.pythian.com
>
> Office (calls from within Australia): 1300 366 021 x1270
> Office (international): +61 2 8016 7000 x270 *OR* +1 613 565 8696 x1270
> Mobile: +61 410 323 559
> Fax: +61 2 9805 0544
> IM: pythianaraujo @ AIM/MSN/Y! or [email protected] @ GTalk
>
> “Success is not about standing at the top, it's the steps you leave behind.”
> - Iker Pou (rock climber)
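
To make the limitation above concrete, here is a minimal sketch (not taken from piggybank itself) that checks a SequenceFile value class name against the types listed in this reply. The class SupportedWritables and its method isSupported are hypothetical names used only for illustration, and the set of names is simply the documented list qualified with the org.apache.hadoop.io package. The Sqoop-generated com.acme.Dual is not in that list, which is the suspected reason the load does not return "X".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, for illustration only: it mirrors the list of
// Writable types quoted in this reply and is not part of Pig or Sqoop.
public class SupportedWritables {

    // The types named in the loader documentation, fully qualified.
    private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
            "org.apache.hadoop.io.Text",
            "org.apache.hadoop.io.IntWritable",
            "org.apache.hadoop.io.LongWritable",
            "org.apache.hadoop.io.FloatWritable",
            "org.apache.hadoop.io.DoubleWritable",
            "org.apache.hadoop.io.BooleanWritable",
            "org.apache.hadoop.io.ByteWritable"));

    // Returns true only if the class name is one of the documented types.
    public static boolean isSupported(String className) {
        return SUPPORTED.contains(className);
    }

    public static void main(String[] args) {
        String[] candidates = {"org.apache.hadoop.io.Text", "com.acme.Dual"};
        for (String name : candidates) {
            String verdict = isSupported(name) ? "supported" : "not supported";
            System.out.println(name + " -> " + verdict);
        }
    }
}
```

The value class name of an existing file is recorded in the SequenceFile header, so it can be checked this way before pointing Pig at the directory.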
