Re: Data inconsistency when writing to MySQL with the JDBC connector

111 Thu, 16 Apr 2020 03:06:03 -0700

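Hi,
Suggestions from an amateur:
1. Change the GROUP BY so that its key matches the database primary key, which keeps the writes idempotent.
2. Wrap the query in an outer query to sidestep the upsert.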
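
To make suggestion 1 concrete, here is a minimal sketch of why the key matters. It is not Flink code: it uses Python's built-in sqlite3 as a stand-in for MySQL, and INSERT OR REPLACE as a rough analogue of an upsert on the primary key. The table names, columns, and sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical aggregated rows: (order_date, company_code, device, room_cnt).
# Each row is one group of a GROUP BY on (order_date, company_code, device).
rows = [
    ("2020-04-15", "C1", "app", 3),
    ("2020-04-15", "C1", "web", 5),
    ("2020-04-16", "C1", "app", 2),
]

COLUMNS = "order_date TEXT, company_code TEXT, device TEXT, room_cnt INTEGER"


def write(conn, table, data, passes=2):
    """Upsert the rows `passes` times and return the final table contents."""
    for _ in range(passes):
        # INSERT OR REPLACE overwrites any row that has the same primary key.
        conn.executemany(f"INSERT OR REPLACE INTO {table} VALUES (?, ?, ?, ?)", data)
    return conn.execute(f"SELECT * FROM {table} ORDER BY 1, 2, 3").fetchall()


conn = sqlite3.connect(":memory:")

# Primary key narrower than the grouping key: distinct groups collide.
conn.execute(f"CREATE TABLE narrow_pk ({COLUMNS}, "
             "PRIMARY KEY (order_date, company_code))")

# Primary key equal to the grouping key: one row per group, rerun-safe.
conn.execute(f"CREATE TABLE matching_pk ({COLUMNS}, "
             "PRIMARY KEY (order_date, company_code, device))")

narrow = write(conn, "narrow_pk", rows)
matching = write(conn, "matching_pk", rows)

print(len(narrow), "rows with the narrow key")      # groups overwrote each other
print(len(matching), "rows with the matching key")  # every group kept
```

With the narrow key, the "app" and "web" groups for the same date and company overwrite each other, so the table ends up with fewer rows than the query produced. With the matching key, writing the same result twice leaves the table identical to the query output.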



Best,
xinghalo


On 2020-04-16 at 17:46, wldd<wldd1...@163.com> wrote:
Scenario: read data from Hive, run a computation, and write the result to MySQL.


Demo SQL:
insert into data_hotel_day
select order_date,play_date,company_code,company_name,company_region,device,
cast(coalesce(sum(current_amt),0) as decimal(38,2)) current_amt,
cast(coalesce(sum(order_amt),0) as decimal(38,2)) order_amt,
coalesce(sum(room_cnt),0) room_cnt,
cast(coalesce(sum(refund_amt),0) as decimal(38,2)) refund_amt,
coalesce(sum(budget_room_cnt),0) budget_room_cnt
from db.table where plate_type='hotel'
group by order_date,play_date,company_code,company_name,company_region,device;


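Problem: after a GROUP BY statement, the JDBC connector uses an upsert sink by default.
The upsert sink extracts a unique key from the query, and it usually takes the combination of the GROUP BY fields as that key.
In my scenario the combination of GROUP BY fields is not unique, so the data written to MySQL
does not match what the query actually returns. Is there a fix for this, or an alternative approach?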