[ https://issues.apache.org/jira/browse/IMPALA-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Apple resolved IMPALA-5961.
-------------------------------
    Resolution: Not A Bug

I think this is not about Apache Impala, since Apache Impala doesn't host or release snapshot data. If you can reproduce this from Apache Impala, please re-open. In that case it would also be helpful to explain how the snapshot is made and to demonstrate that the issue lies with Apache Impala rather than with the snapshot procedure, whether that procedure is an external script or some other method.

> Snapshot data for TPC-DS schema contains a non-Unicode character
> ----------------------------------------------------------------
>
>                 Key: IMPALA-5961
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5961
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>    Affects Versions: Impala 2.10.0
>            Reporter: Tim Wood
>            Assignee: Tim Wood
>         Attachments: ttq-50.out
>
>
> The customer table contains rows whose c_birth_country values contain
> character 0xd4 (o-circumflex) in an illegal position for Unicode. This causes
> tpcds-q30 to fail. Either tests need to change to accommodate the different
> character set, or the test data should change to contain the proper Unicode
> character.
> To reproduce, build a mini-cluster and load with current snapshot data, then
> run the query from the attached file. Find the affected rows with:
> SELECT * FROM customer WHERE c_birth_country LIKE '%IVOIRE%';

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
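
For readers unfamiliar with why a lone 0xd4 byte is a problem, here is a minimal Python sketch. It assumes the stored value is the Latin-1 spelling of the country name; the sample bytes in `raw` and the helper `classify` are hypothetical illustrations, not copied from the snapshot or from Impala. In UTF-8, 0xd4 is a lead byte that must be followed by a continuation byte, so an ASCII letter right after it makes the sequence invalid.

```python
# Hypothetical sample value (assumption): the Latin-1 encoding of the
# country name, where 0xd4 is the single byte for O-circumflex.
raw = b"C\xd4TE D'IVOIRE"


def classify(value: bytes) -> str:
    """Report whether the bytes decode cleanly as UTF-8."""
    try:
        value.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 0xd4 starts a two-byte UTF-8 sequence, but the next byte
        # ('T', 0x54) is not a continuation byte (0x80-0xbf).
        return "not-utf-8"


print(classify(raw))

# If the data really is Latin-1, decoding and re-encoding yields
# well-formed UTF-8 for the same text.
repaired = raw.decode("latin-1").encode("utf-8")
print(classify(repaired))
```

Whether the snapshot data is actually Latin-1 is not established in this thread; the sketch only shows how such a byte would fail UTF-8 decoding and how a re-encoding step could look if that assumption holds.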