I use the RocksDB state backend to record whether an element has appeared
within the last day. Here is the code:
    public class DedupFunction extends KeyedProcessFunction<Long, IN, OUT> {
        private ValueState<Boolean> isExist;

        public void open(Configuration parameters) throws Exception {
            ValueStateDescriptor<Boolean> desc = new ........
            StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(Time.hours(24)).setUpdateType......
            desc.enableTimeToLive(ttlConfig);
            isExist = getRuntimeContext().getState(desc);
        }

        public void processElement(IN in, .... ) {
            // Emit only the first occurrence of each key within the TTL window.
            if (null == isExist.value()) {
                out.collect(in);
                isExist.update(true);
            }
        }
    }
Because the number of distinct keys is very large (about 10 billion per day),
this operator has become a performance bottleneck.
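
For a rough sense of scale, the sketch below converts the daily key count into an average rate of first-seen keys, each of which costs one state read and one state write. It assumes keys arrive evenly over the day, which real traffic will not do. The class name, method names, and the parallelism of 100 are hypothetical and used only for illustration.

```java
public class DedupScale {
    private static final long SECONDS_PER_DAY = 24L * 60 * 60;

    // Average number of new (first-seen) keys per second across the whole job,
    // assuming arrivals are spread evenly over the day.
    static long keysPerSecond(long distinctKeysPerDay) {
        return distinctKeysPerDay / SECONDS_PER_DAY;
    }

    // Same rate divided across operator subtasks, assuming keys are
    // distributed evenly. The parallelism is a hypothetical input.
    static long keysPerSecondPerSubtask(long distinctKeysPerDay, int parallelism) {
        return keysPerSecond(distinctKeysPerDay) / parallelism;
    }

    public static void main(String[] args) {
        long distinctKeysPerDay = 10_000_000_000L; // "about 10 billion" per day
        int parallelism = 100;                     // hypothetical value

        System.out.println(keysPerSecond(distinctKeysPerDay));                        // prints 115740
        System.out.println(keysPerSecondPerSubtask(distinctKeysPerDay, parallelism)); // prints 1157
    }
}
```

So on average the operator has to absorb on the order of 100,000 new state entries per second, before counting lookups for keys that were already seen.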
How can I optimize the performance?
Thanks,
Lei