Jark Wu created FLINK-14567:
-------------------------------

             Summary: Aggregate query with more than two group fields can't be 
written into HBase sink
                 Key: FLINK-14567
                 URL: https://issues.apache.org/jira/browse/FLINK-14567
             Project: Flink
          Issue Type: Bug
          Components: Connectors / HBase, Table SQL / Legacy Planner, Table SQL 
/ Planner
            Reporter: Jark Wu
             Fix For: 1.10.0


Suppose we have an HBase table sink whose rowkey is a varchar (also the primary 
key) plus a bigint column, and we want to write the result of the following 
query into the sink in upsert mode. The query fails the primary key check with 
the exception "UpsertStreamTableSink requires that Table has a full primary keys 
if it is updated."

{code:java}
select concat(f0, '-', f1) as key, sum(f2)
from T1
group by f0, f1
{code}

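This happens in both the blink planner and the old planner. When a query runs 
in update mode, a primary key must exist so that it can be extracted and passed 
to {{UpsertStreamTableSink#setKeyFields}}. In this query the group fields f0 
and f1 reach the output only through concat, so no primary key can be derived.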

That is why we wanted to derive a primary key for concat in FLINK-14539. 
However, we found that the primary key is not preserved after concatenation. 
For example, given a primary key (f0, f1, f2) whose fields are all of varchar 
type, the two distinct records ('a', 'b', 'c') and ('ab', '', 'c') both yield 
'abc' for concat(f0, f1, f2), so the concat result is no longer a primary key.
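
The collision described above can be reproduced outside Flink with plain 
string concatenation. The following is only an illustrative sketch: the class 
{{ConcatKeyDemo}} and its helpers {{concatKey}} and {{joinKey}} are hypothetical 
names and not part of Flink. It also shows that a fixed separator, as in 
concat(f0, '-', f1), does not by itself rule out collisions when the fields may 
contain the separator.

```java
// Hypothetical demo class, not part of Flink: shows why concatenating
// key fields does not preserve uniqueness.
class ConcatKeyDemo {

    // Plain concatenation, mirroring concat(f0, f1, f2).
    static String concatKey(String... fields) {
        return String.join("", fields);
    }

    // Concatenation with a separator, mirroring concat(f0, '-', f1).
    static String joinKey(String separator, String... fields) {
        return String.join(separator, fields);
    }

    public static void main(String[] args) {
        // Two distinct primary keys (f0, f1, f2) from the example above.
        String k1 = concatKey("a", "b", "c");
        String k2 = concatKey("ab", "", "c");
        System.out.println(k1 + " vs " + k2 + " collide: " + k1.equals(k2));

        // A separator still collides if a field itself contains it.
        String s1 = joinKey("-", "a-", "b");
        String s2 = joinKey("-", "a", "-b");
        System.out.println(s1 + " vs " + s2 + " collide: " + s1.equals(s2));
    }
}
```

Both comparisons report a collision, so neither form of the concatenated 
value can be treated as a key without further assumptions about the field 
contents.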

So the question is: how can we properly support the HBase sink?




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
