Tim Wood created IMPALA-5961:
--------------------------------

             Summary: Snapshot data for TPC-DS schema contains a non-Unicode 
character
                 Key: IMPALA-5961
                 URL: https://issues.apache.org/jira/browse/IMPALA-5961
             Project: IMPALA
          Issue Type: Task
          Components: Backend
    Affects Versions: Impala 2.10.0
            Reporter: Tim Wood
            Assignee: Tim Wood
         Attachments: ttq-50.out



The customer table contains rows whose c_birth_country values contain the byte 
0xd4 (capital O-circumflex, "Ô", in Latin-1), which is not valid UTF-8 in that 
position. This causes tpcds-q30 to fail. Either the tests need to change to 
accommodate the different character set, or the test data should change to 
contain the properly encoded Unicode character.
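
As a rough illustration of why a lone 0xd4 byte is rejected, here is a minimal 
Python sketch. The sample value is an assumption: this report only states that 
the affected values contain 0xd4 and match '%IVOIRE%', so the exact bytes below 
are illustrative, and is_valid_utf8 is a hypothetical helper, not part of 
Impala or its test framework.

```python
# Assumed sample value (not taken from the snapshot): a Latin-1 encoded
# country name in which 0xd4 stands for capital O-circumflex.
raw = b"C\xd4TE D'IVOIRE"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# In UTF-8, 0xd4 is the lead byte of a two-byte sequence and must be
# followed by a continuation byte (0x80-0xbf). Here it is followed by
# ASCII "T" (0x54), so strict decoding fails.
print(is_valid_utf8(raw))

# Reading the same bytes as Latin-1 recovers the intended text, which can
# then be re-encoded as valid UTF-8 (U+00D4 becomes the bytes 0xc3 0x94).
as_latin1 = raw.decode("latin-1")
fixed = as_latin1.encode("utf-8")
print(as_latin1)
print(fixed)
print(is_valid_utf8(fixed))
```

Re-encoding this way corresponds to the second option above (changing the test 
data); it assumes the original bytes really are Latin-1, which this report 
does not confirm.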

To reproduce, build a mini-cluster, load it with the current snapshot data, 
then run the query from the attached file. Find the affected rows with:
SELECT * FROM customer WHERE c_birth_country LIKE '%IVOIRE%';
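
The affected rows could also be located outside Impala by scanning the raw 
data for lines that are not valid UTF-8. The following is only a sketch: 
find_invalid_utf8 is a hypothetical helper, and the pipe-delimited sample rows 
are made up for illustration rather than copied from the snapshot.

```python
from typing import Iterable, Iterator, Tuple


def find_invalid_utf8(lines: Iterable[bytes]) -> Iterator[Tuple[int, int, int]]:
    """Yield (line_number, byte_offset, byte_value) for the first
    invalid UTF-8 byte on each line that fails to decode."""
    for lineno, line in enumerate(lines, start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start is the offset of the first offending byte.
            yield lineno, err.start, line[err.start]


# Made-up sample rows; the real snapshot layout may differ.
sample = [
    b"1|CHILE\n",
    b"2|C\xd4TE D'IVOIRE\n",
]

bad_rows = list(find_invalid_utf8(sample))
for lineno, offset, value in bad_rows:
    print(f"line {lineno}: invalid byte 0x{value:02x} at offset {offset}")
```

To scan an actual data file, open it in binary mode (for example 
open(path, "rb"), where path is whatever file holds the customer data) and 
pass the file object to the helper, since iterating it yields bytes lines.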




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
