Large intervaljoin related question

Chen Qin Fri, 13 Dec 2019 11:25:57 -0800

Hi there,

We have seen growing interest in using large windows and interval join operations.
What is the recommended way of handling these use cases (e.g. DeltaLake in Spark)?
After some benchmarking, we found that performance still seems to be a bottleneck
in supporting those use cases.
How is the performance improvement in
https://issues.apache.org/jira/browse/FLINK-7001 going?
 
On the tuning side, we plan to test giving RocksDB a larger block cache (~4 GB).
Will this help?
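
For concreteness, here is a minimal sketch of the setting we have in mind. It assumes the cache in question is RocksDB's block cache and that the `state.backend.rocksdb.block.cache-size` option applies to our Flink version. The 4gb value is only the figure mentioned above, not a tested recommendation.

```yaml
# flink-conf.yaml (sketch, untested)
# Assumption: the RocksDB state backend is in use.
state.backend: rocksdb
# Assumption: raises the RocksDB block cache from its small default to ~4 GB.
state.backend.rocksdb.block.cache-size: 4gb
```

As far as we understand, newer Flink versions that manage RocksDB memory themselves may override this value, so it would need checking against the version in use.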
Otherwise, we plan to write to an external Hive table (partitioning does not seem
to be supported yet) and run frequent ETL jobs there.



Thanks,
Chen
