Unsubscribe
On 2022-03-05 09:46:46, "guanyq" <dlgua...@163.com> wrote:
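>A question about cache refresh when joining a Kafka real-time stream with the latest partition of a Hive table
>
>
>'streaming-source.monitor-interval'='12 h'
>My understanding of this parameter: counting from the job's start time, the data of the latest partition is read once every 12 hours. Is that right?
>Another question: if, between two reads of the latest partition, the real-time stream receives records that are expected to join with a new partition, are those records still joined against the latest partition as of the previous read?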
>
>
>SET table.sql-dialect=hive;
>CREATE TABLE dimension_table (
>  product_id STRING,
>  product_name STRING,
>  unit_price DECIMAL(10, 4),
>  pv_count BIGINT,
>  like_count BIGINT,
>  comment_count BIGINT,
>  update_time TIMESTAMP(3),
>  update_user STRING,
>  ...
>) PARTITIONED BY (pt_year STRING, pt_month STRING, pt_day STRING) TBLPROPERTIES (
>  -- using default partition-name order to load the latest partition every 12h (the most recommended and convenient way)
>  'streaming-source.enable' = 'true',
>  'streaming-source.partition.include' = 'latest',
>  'streaming-source.monitor-interval' = '12 h',
>  'streaming-source.partition-order' = 'partition-name', -- option with default value, can be ignored.
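
For reference, the join in question would look roughly like the sketch below. This is only an illustration under assumptions: `orders_table`, its columns, the topic name, and the broker address are hypothetical and do not come from the original message. As far as I understand the Flink documentation, a temporal join against `dimension_table` uses whatever partition was loaded at the most recent refresh, so records that arrive between two refreshes would be matched against the previously loaded partition.

```sql
-- Hypothetical Kafka-backed stream table; names and connection values are assumptions.
SET table.sql-dialect=default;
CREATE TABLE orders_table (
  order_id STRING,
  order_amount DOUBLE,
  product_id STRING,
  log_ts TIMESTAMP(3),
  proctime AS PROCTIME()  -- processing-time attribute required by the temporal join
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',                                   -- assumed topic name
  'properties.bootstrap.servers' = 'localhost:9092',    -- assumed broker address
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);

-- Temporal join: each order is enriched from the Hive partition that is
-- currently cached, which is reloaded at 'streaming-source.monitor-interval'.
SELECT o.order_id, o.product_id, dim.product_name, dim.unit_price
FROM orders_table AS o
JOIN dimension_table FOR SYSTEM_TIME AS OF o.proctime AS dim
  ON o.product_id = dim.product_id;
```

If fresher dimension data is needed, a shorter 'streaming-source.monitor-interval' is the obvious knob, at the cost of more frequent reloads of the partition.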