Hi Kostas,
Thank you very much for your answer.
Yes, in https://github.com/apache/flink/pull/4997 I proposed changing the
comparison to modificationTime < globalModificationTime (strictly less than,
so that files with an equal timestamp are no longer skipped). Later, however,
I realized, as you correctly point out, that this creates duplicates.
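
To illustrate the duplicate issue, here is a minimal, self-contained sketch. It is a simplified simulation and not Flink's actual monitoring code: the class StrictComparisonDemo, its scan method, and the file name are hypothetical, and only the comparison itself is taken from this thread. It assumes the monitor forwards every file that is not ignored and then advances the global timestamp to the largest timestamp it has seen.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StrictComparisonDemo {
    // Simplified stand-in for the monitor's state (hypothetical, not Flink's class).
    private long globalModificationTime = Long.MIN_VALUE;

    // One directory scan: returns the files that would be forwarded for processing.
    List<String> scan(Map<String, Long> directory) {
        List<String> forwarded = new ArrayList<>();
        long maxSeen = globalModificationTime;
        for (Map.Entry<String, Long> file : directory.entrySet()) {
            // Strict variant: only files strictly older than the global timestamp are ignored.
            boolean ignore = file.getValue() < globalModificationTime;
            if (!ignore) {
                forwarded.add(file.getKey());
                maxSeen = Math.max(maxSeen, file.getValue());
            }
        }
        globalModificationTime = maxSeen;
        return forwarded;
    }

    public static void main(String[] args) {
        StrictComparisonDemo monitor = new StrictComparisonDemo();
        Map<String, Long> directory = new LinkedHashMap<>();
        directory.put("a.txt", 1000L);
        System.out.println(monitor.scan(directory)); // prints [a.txt]
        System.out.println(monitor.scan(directory)); // prints [a.txt] again: a duplicate
    }
}
```

In this sketch the second scan forwards a.txt again, because its timestamp equals the global timestamp and the strict comparison no longer filters it out.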
The
Hi Juan,
The problem is that once a file with a given modification timestamp has been
processed and the global modification timestamp has been updated to that
value, all files with that timestamp are considered processed.
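
A small sketch of this effect, under stated assumptions: InclusiveComparisonDemo, shouldIgnore, eligible, and the file names are hypothetical names chosen for illustration, and the code only mimics the inclusive check discussed in this thread rather than reproducing Flink's implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class InclusiveComparisonDemo {
    // Inclusive check: a file is ignored if it is not newer than the global timestamp.
    static boolean shouldIgnore(long modificationTime, long globalModificationTime) {
        return modificationTime <= globalModificationTime;
    }

    // Names of the files a scan would still pick up, sorted for stable output.
    static List<String> eligible(Map<String, Long> directory, long globalModificationTime) {
        return directory.entrySet().stream()
                .filter(e -> !shouldIgnore(e.getValue(), globalModificationTime))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assume an earlier scan saw only a.txt, so the global timestamp is now 1000.
        long globalModificationTime = 1000L;
        // b.txt arrived after that scan but carries the same timestamp; c.txt is newer.
        Map<String, Long> directory =
                Map.of("a.txt", 1000L, "b.txt", 1000L, "c.txt", 1001L);
        System.out.println(eligible(directory, globalModificationTime)); // prints [c.txt]
    }
}
```

Here b.txt is never picked up, although it was not part of the earlier scan, because it shares its timestamp with a file that was already processed.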
The solution is not to remove the = from the modificationTime <=
globalModificationTime; in
Hi there,
I’m trying to watch a directory for newly arriving files (with
StreamExecutionEnvironment#readFile) at sub-second latency (a watch
interval of ~100 ms, using FileProcessingMode.PROCESS_CONTINUOUSLY).
If many files arrive within a single watch interval, Flink