Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21837#discussion_r204762586

    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala ---
    @@ -68,4 +70,25 @@ class AvroOptions(
           .map(_.toBoolean)
           .getOrElse(!ignoreFilesWithoutExtension)
       }
    +
    +  /**
    +   * The `compression` option allows specifying a compression codec used on write.
    +   * Currently supported codecs are `uncompressed`, `snappy` and `deflate`.
    +   * If the option is not set, `snappy` compression is used by default.
    +   */
    +  val compression: String = parameters.get("compression").getOrElse(sqlConf.avroCompressionCodec)
    +
    +  /**
    +   * Level of compression in the range 1..9 inclusive: 1 for fastest, 9 for best compression.
    +   * If the compression level is not set for `deflate` compression, the current value of the SQL
    +   * config `spark.sql.avro.deflate.level` is used by default. For other compressions, the default
    +   * value is `6`.
    +   */
    +  val compressionLevel: Int = {
    --- End diff --
    
    Yea, I know that could be useful in some ways, but I was thinking we'd better not add this for now. The thing is, it currently sounds too specific to one compression option in Avro. There are many options we could expose this way in other datasources too, for example in the CSV datasource.
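For context, the pattern under discussion is a per-write option that falls back to a session-level SQL config when unset. A minimal, self-contained sketch of that resolution logic (using a plain `Map` and a hard-coded `"snappy"` default in place of the real `sqlConf.avroCompressionCodec`, which are assumptions for illustration only):

```scala
// Hypothetical sketch: resolve a write option with a session-default fallback,
// mirroring the shape of AvroOptions.compression discussed in the diff above.
object CompressionOptionSketch {
  // Stand-in for sqlConf.avroCompressionCodec (assumed default: "snappy").
  val sessionDefaultCodec: String = "snappy"

  // Explicit "compression" option wins; otherwise use the session default.
  def resolveCompression(parameters: Map[String, String]): String =
    parameters.getOrElse("compression", sessionDefaultCodec)

  def main(args: Array[String]): Unit = {
    println(resolveCompression(Map("compression" -> "deflate"))) // deflate
    println(resolveCompression(Map.empty))                       // snappy
  }
}
```

The reviewer's point is that a second, codec-specific option like `compressionLevel` would add this fallback machinery for just one codec (`deflate`), which is why it is being pushed back on here.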