I'm happy to create a JIRA for this. Just an FYI, you can also create a JIRA even if you're not a developer.
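For reference, the stack trace below points at TypeUtil.castPigTypeToPhoenix handing the Pig DataByteArray straight to PDataType.toBytes(), which casts its argument to byte[] ("[B"). A DataByteArray *wraps* a byte[] rather than being one, so that cast fails. Here is a minimal, self-contained sketch of the failure mode and the shape of a fix; DataByteArrayLike is a hypothetical stand-in for org.apache.pig.data.DataByteArray (it only mimics the get() accessor), and this is illustrative, not the actual Phoenix patch:

```java
// Hypothetical stand-in for org.apache.pig.data.DataByteArray:
// a wrapper around byte[], exposing the bytes via get().
class DataByteArrayLike {
    private final byte[] data;
    DataByteArrayLike(byte[] data) { this.data = data; }
    byte[] get() { return data; }
}

public class CastDemo {
    public static void main(String[] args) {
        Object value = new DataByteArrayLike(new byte[]{10});

        // Mirrors what the trace shows PDataType.toBytes doing:
        // a blind cast of the incoming object to byte[].
        // A wrapper object is not a byte[], so this throws.
        boolean castFailed = false;
        try {
            byte[] raw = (byte[]) value;
        } catch (ClassCastException e) {
            castFailed = true;
        }

        // The fix, in sketch form: unwrap the DataByteArray first,
        // then hand the raw byte[] to Phoenix.
        byte[] raw = ((DataByteArrayLike) value).get();

        System.out.println("castFailed=" + castFailed + ", len=" + raw.length);
    }
}
```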
From: Daniel Rodriguez <[email protected]>
Reply-To: <[email protected]>
Date: Tuesday, June 24, 2014 7:24 PM
To: <[email protected]>
Subject: Re: Pig StoreFunc using VARBINARY

Yes, in that simple example changing it to VARCHAR or INTEGER works fine, but my main objective is to use Pig to read a binary Avro file and store it in an HBase table managed by Phoenix. Are you a Phoenix developer? Could you create an issue in JIRA for this?

Daniel Rodriguez

On Tue, Jun 24, 2014 at 8:01 PM, Jeffrey Zhong <[email protected]> wrote:

> This seems like a bug to me. Could you try changing your binary column type
> from VARBINARY to VARCHAR to work around this issue?
>
> From: Daniel Rodriguez <[email protected]>
> Reply-To: <[email protected]>
> Date: Tuesday, June 24, 2014 8:49 AM
> To: <[email protected]>
> Subject: Pig StoreFunc using VARBINARY
>
> Hello,
>
> I was able to successfully insert "basic" data types (INT and VARCHAR) using
> the Pig StoreFunc, but I have not been able to insert a Pig bytearray into a
> Phoenix VARBINARY column.
>
> Example:
>
> CREATE TABLE IF NOT EXISTS binary (id BIGINT NOT NULL, binary VARBINARY
>     CONSTRAINT my_pk PRIMARY KEY (id));
>
> phoenix> select * from binary;
> +------------+------------+
> |     ID     |   BINARY   |
> +------------+------------+
> +------------+------------+
>
> $ cat testdata.tdf
> 1	10
> 2	20
> 3	30
>
> grunt> A = load 'testdata.tdf' USING PigStorage('\t') AS (id:long, avro:bytearray);
> grunt> describe A;
> A: {id: long,avro: bytearray}
> grunt> STORE A into 'hbase://BINARY' using
>        org.apache.phoenix.pig.PhoenixHBaseStorage('localhost','-batchSize 1000');
>
> This throws a ClassCastException:
>
> java.lang.Exception: java.lang.ClassCastException:
> org.apache.pig.data.DataByteArray cannot be cast to [B
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray
> cannot be cast to [B
>     at org.apache.phoenix.schema.PDataType$23.toBytes(PDataType.java:2976)
>     at org.apache.phoenix.schema.PDataType$23.toObject(PDataType.java:3022)
>     at org.apache.phoenix.pig.TypeUtil.castPigTypeToPhoenix(TypeUtil.java:131)
>     at org.apache.phoenix.pig.hadoop.PhoenixRecord.convertTypeSpecificValue(PhoenixRecord.java:87)
>     at org.apache.phoenix.pig.hadoop.PhoenixRecord.write(PhoenixRecord.java:68)
>     at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:71)
>     at org.apache.phoenix.pig.hadoop.PhoenixRecordWriter.write(PhoenixRecordWriter.java:41)
>     at org.apache.phoenix.pig.PhoenixHBaseStorage.putNext(PhoenixHBaseStorage.java:151)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646)
>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:284)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:775)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
> Pig version: 0.12, Phoenix version 3.0, on EMR AMI 3.1.
>
> I appreciate any help/ideas.
>
> Thanks,
> Daniel Rodriguez
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law.
> If the reader of this message is not the intended recipient, you are hereby
> notified that any printing, copying, dissemination, distribution, disclosure
> or forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
