Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21837#discussion_r204762586

    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroOptions.scala ---
    @@ -68,4 +70,25 @@ class AvroOptions(
           .map(_.toBoolean)
           .getOrElse(!ignoreFilesWithoutExtension)
       }
    +
    +  /**
    +   * The `compression` option allows specifying a compression codec used on write.
    +   * Currently supported codecs are `uncompressed`, `snappy` and `deflate`.
    +   * If the option is not set, `snappy` compression is used by default.
    +   */
    +  val compression: String = parameters.get("compression").getOrElse(sqlConf.avroCompressionCodec)
    +
    +  /**
    +   * Level of compression in the range 1..9 inclusive: 1 for fastest, 9 for best compression.
    +   * If the compression level is not set for `deflate` compression, the current value of the SQL
    +   * config `spark.sql.avro.deflate.level` is used by default. For other compressions, the default
    +   * value is `6`.
    +   */
    +  val compressionLevel: Int = {
    --- End diff --
    
    Yea, I know that could be useful in some ways, but I was thinking we'd better not add this for now. The thing is, it currently sounds too specific to one compression option in Avro. There are many options we could expose this way in other datasources too, for example in the CSV datasource.
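For context, the pattern under discussion is a per-write option that falls back to a session-level SQL config when unset. A minimal, self-contained sketch of that resolution logic (using a plain `Map` and a hard-coded `"snappy"` default in place of the real `sqlConf.avroCompressionCodec`, which are assumptions for illustration only):

```scala
// Hypothetical sketch: resolve a write option with a session-default fallback,
// mirroring the shape of AvroOptions.compression discussed in the diff above.
object CompressionOptionSketch {
  // Stand-in for sqlConf.avroCompressionCodec (assumed default: "snappy").
  val sessionDefaultCodec: String = "snappy"

  // Explicit "compression" option wins; otherwise use the session default.
  def resolveCompression(parameters: Map[String, String]): String =
    parameters.getOrElse("compression", sessionDefaultCodec)

  def main(args: Array[String]): Unit = {
    println(resolveCompression(Map("compression" -> "deflate"))) // deflate
    println(resolveCompression(Map.empty))                       // snappy
  }
}
```

The reviewer's point is that a second, codec-specific option like `compressionLevel` would add this fallback machinery for just one codec (`deflate`), which is why it is being pushed back on here.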