[
https://issues.apache.org/jira/browse/PIG-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060835#comment-13060835
]
Pradeep Kamath commented on PIG-2153:
-------------------------------------
Based on my (old) knowledge, the tuple returned by LoadFunc (LoadFunc always
has to return a tuple) simply stands for the record and the schema deals with
the types of the fields inside it. So if the schema is A:
{t:tuple(i:int,c:char)} that means each record contains one field of type tuple
which has an int and char). I would think this means the LoadFunc returns an
outer tuple (for the record), with a tuple inside (standing for the field)
which has int and char subfields. I will let the more active committers comment
on whether anything with respect to LoadFunc tuple handling has changed.
Hopefully I am not giving wrong information here based my old knowledge,
apologies in advance if so.
> POProject throws an error with tuples containing a single non-tuple field
> -------------------------------------------------------------------------
>
> Key: PIG-2153
> URL: https://issues.apache.org/jira/browse/PIG-2153
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.1
> Reporter: Ken Goodhope
>
> When POProject.getNext(tuple) processes a tuple with one field, the field is
> pulled out. If that field is not a tuple, a cast exception is thrown. This
> is happening in the folliwing block of code at line 401.
> if(columns.size() == 1) {
> try{
> ret = inpValue.get(columns.get(0));
> ...
> res.result = (Tuple)ret;
> I am seeing this error in a unit test that is loading an array of floats.
> The LoadFunc is converting the array to bag, and wrapping the bag in a tuple.
>
> ({(3.3),(1.2),(5.6)})
> This results on POProject attempting to cast the bag to a tuple. Looking at
> the code, it appears that if I wrapped the previous tuple in another tuple,
> then it would work.
> (({(3.3),(1.2),(5.6)}))
> In this case it would work because POProject would extract the first inner
> tuple and return it. But this would require the LoadFunc to check for tuples
> with a single non-tuple field and only wrap those.
> This could be fixed by first checking that the tuple does actually wrap
> another tuple.
> if(columns.size() == 1 && inpValue.getType(0) == DataType.TUPLE)
> {...
> I don't know the original intent of this code well enough to say this is the
> appropriate fix or not. Hoping someone with more Pig experience can help
> here. Right now this is preventing the unit tests in AvroStorage from
> working. I can change the unit test, but I think in this case the unit test
> is catching a real bug.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira