Illustrate and Dump do not seem to work correctly for files containing utf8
---------------------------------------------------------------------------

                 Key: PIG-504
                 URL: https://issues.apache.org/jira/browse/PIG-504
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
         Environment: Hadoop 18
            Reporter: Viraj Bhat


The following snippet, run on the latest types branch:
{code}
A = load 'utf8.txt' using PigStorage() as (text: chararray);
illustrate A;
{code}

produces this output:

---------------------------------
| A     | text: bytearray cn: 1 | 
---------------------------------
|       | ????????????????      | 
---------------------------------

Three observations:
1) text should be chararray, not bytearray.
2) cn: 1 should be removed from the display.
3) The value for text is shown as a run of question marks instead of the UTF-8 characters in the file.
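
For reference, the question marks in observation 3 look like what Java emits when text is encoded with a charset that cannot represent it. The sketch below is standalone Java, not Pig code, and it is only an assumption that something similar happens inside illustrate and dump; this report does not confirm the cause. The class name QuestionMarkDemo and the sample string are hypothetical stand-ins for the contents of utf8.txt.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Encode the text with the given charset, then decode the bytes
    // with that same charset. String.getBytes(Charset) replaces any
    // character the charset cannot represent with the charset's
    // replacement byte, which is '?' for US-ASCII.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Three non-ASCII characters, written as escapes so the source
        // file encoding does not matter (hypothetical sample data).
        String sample = "\u65e5\u672c\u8a9e";

        // UTF-8 can represent every character, so the text survives.
        System.out.println(roundTrip(sample, StandardCharsets.UTF_8).equals(sample));

        // US-ASCII cannot, so each character comes back as '?'.
        System.out.println(roundTrip(sample, StandardCharsets.US_ASCII));
    }
}
```

If this is what is happening, a useful check would be whether the text is converted to bytes, or printed, using the platform default charset rather than UTF-8.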

Replacing illustrate with dump:
{code}
A = load 'utf8.txt' using PigStorage() as (text: chararray);
dump A;
{code}

produces (??????) instead of the UTF-8 text.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
