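No, you can join on bytearrays. What you can't do is have Pig think
you are joining on bytearrays when the values are actually strings
under the covers -- that's what causes the error you are seeing. Pig
picks the key class from the declared type (bytearray, hence
NullableBytesWritable), but your UDF hands back strings
(NullableText), so the two disagree.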
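
For reference, a minimal sketch of one way the explicit cast could be
placed, assuming the field names from the DESCRIBE output quoted
below. The aliases AA2, BB2 and CC2 are made up for illustration, and
whether CC needs the cast at all depends on how it was loaded:

```pig
-- Sketch only: AA2, BB2 and CC2 are hypothetical aliases.
-- Cast the join keys so the declared type (chararray) matches
-- the strings that actually flow through the job.
AA2 = FOREACH AA GENERATE (chararray)rid AS rid, roname, asid, asname;
BB2 = FOREACH BB GENERATE (chararray)rid AS rid, roname,
                          (chararray)cid AS cid, csname;
CC2 = FOREACH CC GENERATE (chararray)cid AS cid, csname, chid, chname;

A1 = JOIN AA2 BY rid, BB2 BY rid;
A2 = JOIN A1 BY BB2::cid, CC2 BY cid;
DESCRIBE A2;
```

Having the UDF declare its return types by overriding outputSchema()
should also let Pig see chararray fields without the casts.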

On Wed, Apr 11, 2012 at 7:09 AM, shan s <[email protected]> wrote:
> Hi Dmitriy
> It works after explicit casting to chararray.
> So does it mean a bytearray field can't be used in JOIN, or is there more to
> it?
> How can this behaviour be explained?
>
> Thanks!
> On Wed, Apr 11, 2012 at 8:45 AM, shan s <[email protected]> wrote:
>
>> When I load my data, I define all fields as chararray in the schema. I
>> can afford to treat everything as chararray.
>>
>> rid could be chararray (but no real expectations from my side; it's a
>> GUID coming from the db).
>> AA and BB do come from a UDF; the UDF does some string processing and
>> returns substrings as tuples.
>> Also, when I tried to convert rid to chararray in A3, I got an error,
>> "can't convert to chararray." without further explanation.
>>
>> Thank You....
>> On Wed, Apr 11, 2012 at 4:09 AM, Dmitriy Ryaboy <[email protected]> wrote:
>>
>>> What type do you expect rid to be?
>>> Where did AA and BB come from?
>>>
>>> D
>>>
>>> On Tue, Apr 10, 2012 at 12:03 PM, shan s <[email protected]> wrote:
>>> > I am currently getting “Type mismatch in key from map: expected
>>> > org.apache.pig.impl.io.NullableBytesWritable, recieved
>>> > org.apache.pig.impl.io.NullableText”
>>> >
>>> >
>>> > I looked up PIG-919 and the related comments, but could not
>>> > understand the reason or the workaround for this problem.
>>> >
>>> > Could you please kindly explain this further?
>>> >
>>> >
>>> >
>>> > I am getting this even before my GROUP, when I do my 3 way JOIN.
>>> >
>>> >
>>> >
>>> > A1 = JOIN AA BY rid, BB BY rid;
>>> > A2 = JOIN A1 BY BB::cid, CC BY cid;
>>> > DESCRIBE A2;
>>> > A3 = FOREACH A2 GENERATE FLATTEN((TOTUPLE(BB::rid)));
>>> > DESCRIBE A3;
>>> > DUMP A3;
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > DESCRIBE looks like below.
>>> >
>>> >
>>> >
>>> > A2: {A1::AA::rid: bytearray,A1::AA::roname: bytearray,A1::AA::asid:
>>> > bytearray,A1::AA::asname: bytearray,A1::BB::rid: bytearray,A1::BB::roname:
>>> > bytearray,A1::BB::cid: bytearray,A1::BB::csname: bytearray,CC::cid:
>>> > bytearray,CC::csname: bytearray,CC::chid: bytearray,CC::chname: bytearray}
>>> >
>>> > A3: {org.apache.pig.builtin.totuple_A1::BB::rid_3::A1::BB::rid: bytearray}
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > If the map is the problem: I tried to convert it to a tuple (for A3)
>>> > above, but it still does not work. In fact, A3 still describes it as
>>> > a map (with a {}, I guess). Why is that?
>>> >
>>> >
>>> >
>>> > Appreciate your help! Thanks!!
>>>
>>
>>
