Tim Wood created IMPALA-5961:
--------------------------------

             Summary: Snapshot data for TPC-DS schema contains a non-Unicode character
                 Key: IMPALA-5961
                 URL: https://issues.apache.org/jira/browse/IMPALA-5961
             Project: IMPALA
          Issue Type: Task
          Components: Backend
    Affects Versions: Impala 2.10.0
            Reporter: Tim Wood
            Assignee: Tim Wood
         Attachments: ttq-50.out

The customer table contains rows whose c_birth_country values contain the byte 0xd4 (O-circumflex in ISO-8859-1) in a position where it does not form a valid UTF-8 sequence. This causes tpcds-q30 to fail. Either the tests need to change to accommodate the different character set, or the test data should change to contain the proper Unicode character.

To reproduce, build a mini-cluster, load it with the current snapshot data, then run the query from the attached file. Find the affected rows with:

    SELECT * FROM customer WHERE c_birth_country LIKE '%IVOIRE%';

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
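
The encoding problem described in this issue can be illustrated with a minimal Python sketch. It assumes the affected value is stored as the ISO-8859-1 bytes of "CÔTE D'IVOIRE", which is inferred from the 0xd4 byte and the LIKE pattern above rather than read from the snapshot. RAW_VALUE, is_valid_utf8 and repair_latin1 are hypothetical names used only for illustration; they are not part of Impala or its test framework.

```python
# Assumption: the affected c_birth_country value is the ISO-8859-1 encoding
# of "CÔTE D'IVOIRE". The real snapshot data is not inspected here.
RAW_VALUE = b"C\xd4TE D'IVOIRE"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def repair_latin1(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8; leave valid UTF-8 untouched."""
    if is_valid_utf8(data):
        return data
    # Every byte maps to a code point in ISO-8859-1, so this decode cannot fail.
    return data.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    # 0xd4 is a UTF-8 lead byte that must be followed by a continuation
    # byte (0x80-0xbf); here it is followed by "T" (0x54), so decoding fails.
    print(is_valid_utf8(RAW_VALUE))
    fixed = repair_latin1(RAW_VALUE)
    print(fixed, is_valid_utf8(fixed))
```

A conversion along these lines corresponds to the second option in the description (changing the test data to contain the proper Unicode character). Whether it is appropriate for the actual snapshot depends on how the data was generated, which this sketch does not address.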