[
https://issues.apache.org/jira/browse/PIG-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996845#comment-13996845
]
Daniel Dai commented on PIG-3558:
---------------------------------
bq. I don't see hive binary being any different than pig bytearray
Technically it is different: a Pig bytearray means the data type is unknown.
Consider the following UDF and script:
{code}
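import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class MapGenerate extends EvalFunc<Map> {
    @Override
    public Map exec(Tuple input) throws IOException {
        // The map value is an Integer, but the declared schema below does not say so.
        Map<String, Object> m = new HashMap<String, Object>();
        m.put("key", Integer.valueOf(input.size()));
        return m;
    }

    @Override
    public Schema outputSchema(Schema input) {
        // Declare an untyped map: Pig treats its values as bytearray (unknown type).
        return new Schema(new Schema.FieldSchema(null, DataType.MAP));
    }
}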
{code}
{code}
a = load '1.txt' as (a0);
b = foreach a generate a0, MapGenerate(*) as m:map[];
c = group b by m#'key';
dump c;
{code}
The group key will be of data type bytearray (since the map's value type is
unknown), even though the value is actually an Integer at runtime, and the map
output key is a NullableBytesWritable. NullableBytesWritable takes any Object
instead of just DataByteArray to accommodate this case.
It is possible to map Pig bytearray to Hive binary, but we must deal with the
fact that the data may not be a DataByteArray.
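To illustrate what "dealing with" such values could look like, here is a minimal sketch of a coercion helper that a storer might apply before writing a bytearray field as binary. It is not taken from the patch: the class name ByteArrayCoercion, the method toBinary, and the UTF-8 toString() fallback are all assumptions made for illustration, and the nested DataByteArray is a stand-in for org.apache.pig.data.DataByteArray so that the example compiles without Pig on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteArrayCoercion {

    /** Hypothetical stand-in for org.apache.pig.data.DataByteArray (only get() is modeled). */
    public static final class DataByteArray {
        private final byte[] data;

        public DataByteArray(byte[] data) {
            this.data = data;
        }

        public byte[] get() {
            return data;
        }
    }

    /**
     * Converts a value whose declared Pig type is bytearray into raw bytes.
     * The runtime class may be anything, so only DataByteArray and byte[] are
     * passed through; everything else falls back to its string form (an
     * assumed policy, chosen here only to keep the sketch short).
     */
    public static byte[] toBinary(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof DataByteArray) {
            return ((DataByteArray) value).get();
        }
        if (value instanceof byte[]) {
            return (byte[]) value;
        }
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // An Integer, like the map value produced by MapGenerate above.
        System.out.println(Arrays.toString(toBinary(Integer.valueOf(1))));
        // A genuine byte-array wrapper is passed through unchanged.
        System.out.println(Arrays.toString(toBinary(new DataByteArray(new byte[] {1, 2}))));
    }
}
```

Falling back to the string form is only one option; throwing an exception for unexpected classes would be a stricter alternative.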
> ORC support for Pig
> -------------------
>
> Key: PIG-3558
> URL: https://issues.apache.org/jira/browse/PIG-3558
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Labels: porc
> Fix For: 0.13.0
>
> Attachments: PIG-3558-1.patch, PIG-3558-2.patch, PIG-3558-3.patch,
> PIG-3558-4.patch, PIG-3558-5.patch
>
>
> Adding LoadFunc and StoreFunc for ORC.