I'm happy to create a JIRA for this. Just an FYI, you can also create a JIRA even if you're not a developer.
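For reference, the stack trace below points at TypeUtil.castPigTypeToPhoenix handing the Pig DataByteArray straight to PDataType.toBytes(), which casts its argument to byte[] ("[B"). A DataByteArray *wraps* a byte[] rather than being one, so that cast fails. Here is a minimal, self-contained sketch of the failure mode and the shape of a fix; DataByteArrayLike is a hypothetical stand-in for org.apache.pig.data.DataByteArray (it only mimics the get() accessor), and this is illustrative, not the actual Phoenix patch:

```java
// Hypothetical stand-in for org.apache.pig.data.DataByteArray:
// a wrapper around byte[], exposing the bytes via get().
class DataByteArrayLike {
    private final byte[] data;
    DataByteArrayLike(byte[] data) { this.data = data; }
    byte[] get() { return data; }
}

public class CastDemo {
    public static void main(String[] args) {
        Object value = new DataByteArrayLike(new byte[]{10});

        // Mirrors what the trace shows PDataType.toBytes doing:
        // a blind cast of the incoming object to byte[].
        // A wrapper object is not a byte[], so this throws.
        boolean castFailed = false;
        try {
            byte[] raw = (byte[]) value;
        } catch (ClassCastException e) {
            castFailed = true;
        }

        // The fix, in sketch form: unwrap the DataByteArray first,
        // then hand the raw byte[] to Phoenix.
        byte[] raw = ((DataByteArrayLike) value).get();

        System.out.println("castFailed=" + castFailed + ", len=" + raw.length);
    }
}
```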
From: Daniel Rodriguez <[email protected]>
Reply-To: <[email protected]>
Date: Tuesday, June 24, 2014 7:24 PM
To: <[email protected]>
Subject: Re: Pig StoreFunc using VARBINARY

Yes, in that simple example changing it to VARCHAR or INTEGER works fine, but my main objective is to use Pig to read a binary Avro file and store it in an HBase table managed by Phoenix. Are you a Phoenix developer? Could you create an issue in JIRA for this?

Daniel Rodriguez

On Tue, Jun 24, 2014 at 8:01 PM, Jeffrey Zhong <[email protected]> wrote:

> This seems like a bug to me. Could you try changing your binary column type
> from VARBINARY to VARCHAR to work around this issue?
>
> From: Daniel Rodriguez <[email protected]>
> Reply-To: <[email protected]>
> Date: Tuesday, June 24, 2014 8:49 AM
> To: <[email protected]>
> Subject: Pig StoreFunc using VARBINARY
>
> Hello,
>
> I was able to successfully insert "basic" data types (INT and VARCHAR) using
> the Pig StoreFunc, but I have not been able to insert a Pig bytearray into a
> Phoenix VARBINARY column.
>
> Example:
>
> CREATE TABLE IF NOT EXISTS binary (id BIGINT NOT NULL, binary VARBINARY
>     CONSTRAINT my_pk PRIMARY KEY (id));
>
> phoenix> select * from binary;
> +------------+------------+
> |     ID     |   BINARY   |
> +------------+------------+
> +------------+------------+
>
> $ cat testdata.tdf
> 1	10
> 2	20
> 3	30
>
> grunt> A = load 'testdata.tdf' USING PigStorage('\t') AS (id:long, avro:bytearray);
> grunt> describe A;
> A: {id: long,avro: bytearray}
> grunt> STORE A into 'hbase://BINARY' using
>        org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 1000');
>
> This throws a ClassCastException:
>
> java.lang.Exception: java.lang.ClassCastException:
> org.apache.pig.data.DataByteArray cannot be cast to [B
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray
> cannot be cast to [B
>     at org.apache.phoenix.schema.PDataType$23.toBytes(PDataType.java:2976)
>     at org.apache.phoenix.schema.PDataType$23.toObject(PDataType.java:3022)
>     at org.apache.phoenix.pig.TypeUtil.castPigTypeToPhoenix(TypeUtil.java:131)
>     at org.apache.phoenix.pig.hadoop.PhoenixRecord.convertTypeSpecificValue(PhoenixRecord.java:87)
>     at org.apache.phoenix.pig.hadoop.PhoenixRecord.write(PhoenixRecord.java:68)
>     at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:71)
>     at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:41)
>     at org.apache.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:151)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646)
>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
> Pig version: 0.12, Phoenix version 3.0, on EMR AMI 3.1.
>
> I appreciate any help/ideas.
>
> Thanks,
> Daniel Rodriguez
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law.
> If the reader of this message is not the intended recipient, you are hereby
> notified that any printing, copying, dissemination, distribution, disclosure
> or forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
