Hi,

How do we choose between a single large Avro file (much larger than the HDFS
block size) and multiple smaller Avro files (each close to the HDFS block size)?

Since Avro is splittable, is there any need to split a very large Avro
file into smaller files?

I’m assuming that a single large Avro file can also be split across multiple
mappers/reducers/executors during processing.
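
To make that assumption concrete, here is a rough back-of-the-envelope sketch
of how many input splits (and so parallel tasks) I would expect for one
splittable file. It assumes each split is about one HDFS block and that the
block size is 128 MiB. The function name is hypothetical and this is not the
actual Hadoop split logic, just the arithmetic behind my question.

```python
def estimated_splits(file_size_bytes: int, split_size_bytes: int) -> int:
    """Rough number of input splits for a single splittable file.

    Assumes every split is about split_size_bytes long (hypothetical helper,
    not a Hadoop or Avro API).
    """
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division: a partial trailing chunk still needs a split.
    return (file_size_bytes + split_size_bytes - 1) // split_size_bytes


BLOCK_SIZE = 128 * 1024 * 1024  # assumed HDFS block size (128 MiB)

# One 10 GiB file vs. one file of exactly one block.
print(estimated_splits(10 * 1024**3, BLOCK_SIZE))
print(estimated_splits(BLOCK_SIZE, BLOCK_SIZE))
```

If that estimate is roughly right, one large file and many block-sized files
would give a similar number of tasks, which is why I am unsure whether
splitting the file up front buys anything.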

Thanks.
