If multiple Drillbits on different servers are coordinating via ZooKeeper, and some files are duplicated across the servers (with identical filenames), will the cluster of distributed Drillbits avoid returning the duplicated data more than once in query results?
I’m specifically interested in aggregating CSV data that is spread across multiple servers rather than stored in HDFS.
