Grant Henke created IMPALA-9583:
-----------------------------------

             Summary: Add automated tests for Kudu VARCHAR multibyte truncation
                 Key: IMPALA-9583
                 URL: https://issues.apache.org/jira/browse/IMPALA-9583
             Project: IMPALA
          Issue Type: Improvement
            Reporter: Grant Henke


Kudu VARCHAR support was added in IMPALA-5092; however, no automated test was 
added to validate that multibyte characters are truncated when they are too 
wide. Instead, a manual test was performed. 

 

Something like the following should be added to _test_kudu.py_, along with 
updates to the test framework to support non-ASCII characters:
{code:java}
  @SkipIfKudu.no_hybrid_clock
  def test_kudu_multibyte_vc(self, vector, cursor, kudu_client, unique_database):
    """Test multibyte Kudu VARCHAR values that are wider than Impala's Varchar length."""
    cursor.execute("""CREATE TABLE %s.multibyte (a INT PRIMARY KEY, vc VARCHAR(8))
        PARTITION BY HASH(a) PARTITIONS 3 STORED AS KUDU""" % unique_database)
    assert kudu_client.table_exists(
        KuduTestSuite.to_kudu_table_name(unique_database, "multibyte"))
    table = kudu_client.table(
        KuduTestSuite.to_kudu_table_name(unique_database, "multibyte"))
    session = kudu_client.new_session()
    # Not truncated: 1 character in Kudu, 4 bytes in Impala.
    session.apply(table.new_insert((0, "测")))
    # Truncated: 2 characters in Kudu, 8 bytes in Impala.
    session.apply(table.new_insert((1, "测试")))
    session.flush()
    self.run_test_case('QueryTest/kudu_multibyte_vc', vector, use_db=unique_database)
{code}
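
The test's comments rely on the difference between a string's character count and its UTF-8 byte count. That difference can be checked outside Impala with a minimal Python 3 sketch. The helper names {{utf8_lengths}} and {{truncate_to_bytes}} are hypothetical and not part of Impala or the Kudu client, the sample strings are illustrative only, and the truncation shown (dropping a trailing partial character) is an assumption for illustration, not a statement of how Impala truncates:

```python
def utf8_lengths(value):
    """Return (character count, UTF-8 byte count) for a string."""
    return len(value), len(value.encode("utf-8"))


def truncate_to_bytes(value, max_bytes):
    """Hypothetical helper: keep the longest prefix of whole characters
    whose UTF-8 encoding fits within max_bytes."""
    encoded = value.encode("utf-8")[:max_bytes]
    # errors="ignore" drops a trailing partial character if the byte cut
    # landed in the middle of a multibyte sequence.
    return encoded.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    # Illustrative samples: ASCII, a 2-byte character, a 4-byte character.
    for sample in ["a", "\u00e9", "\U0001F600"]:
        chars, nbytes = utf8_lengths(sample)
        print(repr(sample), "characters:", chars, "bytes:", nbytes)
    # Three 4-byte characters cut to an 8-byte limit.
    print(repr(truncate_to_bytes("\U0001F600" * 3, 8)))
```

A check like this can help when choosing test values: a value is only useful for a truncation test if its byte length exceeds the declared width while its character count does not.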


