If multiple Drillbits on different servers are coordinating via ZooKeeper,
and some files across the servers are duplicates (with identical
filenames), will the cluster of distributed Drillbits avoid returning
duplicate data in query results?

I’m specifically interested in aggregating CSV data spread across multiple
servers, but not stored in HDFS.
