Your connector sounds a lot like this one https://github.com/jcustenborder/kafka-connect-spooldir
I do not think you can run such a connector in distributed mode though. Typically something like this runs in standalone mode to avoid conflicts. -hans On Wed, Apr 24, 2019 at 1:08 AM Venkata S A <asaid...@gmail.com> wrote: > Hello Team, > > I am developing a custom Source Connector that watches a > given directory for any new files. My question is in a Distributed > environment, how will the tasks in different nodes handle the file Queue? > > Referring to this sample > < > https://github.com/DataReply/kafka-connect-directory-source/tree/master/src/main/java/org/apache/kafka/connect/directory > > > , > poll() in SourceTask is polling the directory at specified interval for a > new files and fetching the files in a Queue as below: > > Queue<File> queue = ((DirWatcher) task).getFilesQueue(); > > > > So, When in a 3 node cluster, this is run individually by each task. But > then, How is the synchronization happening between all the tasks in > different nodes to avoid duplication of file reading to kafka ? > > > Thank you, > Venkata S >