Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20361#discussion_r164685543

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
    @@ -377,6 +377,12 @@ object SQLConf {
         .booleanConf
         .createWithDefault(true)

    +  val PARQUET_VECTORIZED_READER_BATCH_SIZE = buildConf("spark.sql.parquet.batchSize")
    --- End diff --

I'd say it's very hard. If we need to satisfy a sizeInBytes limitation, we would have to load data record by record and stop loading once we hit the limit. But for performance reasons we want to load the data in batches, which requires knowing the batch size ahead of time.
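The trade-off above can be sketched as follows. This is a minimal illustration, not Spark's actual vectorized reader: `Record`, `loadUntilLimit`, and `loadBatch` are hypothetical names introduced here to contrast the two loading strategies.

```scala
// Hypothetical record type; sizeInBytes stands in for a per-record size estimate.
case class Record(payload: Array[Byte]) {
  def sizeInBytes: Long = payload.length.toLong
}

object BatchLoadingSketch {
  // Record-by-record loading: can honor a byte limit exactly, because we can
  // stop as soon as the limit is reached -- but it pays per-record overhead.
  def loadUntilLimit(records: Iterator[Record], limitBytes: Long): Seq[Record] = {
    val buf = scala.collection.mutable.ArrayBuffer.empty[Record]
    var loaded = 0L
    while (records.hasNext && loaded < limitBytes) {
      val r = records.next()
      buf += r
      loaded += r.sizeInBytes
    }
    buf.toSeq
  }

  // Batched loading: fast, but the batch size (a row count) must be fixed
  // before any data is read, so the bytes loaded are only known afterwards.
  def loadBatch(records: Iterator[Record], batchSize: Int): Seq[Record] =
    records.take(batchSize).toSeq

  def main(args: Array[String]): Unit = {
    val data = (1 to 100).map(_ => Record(Array.fill(10)(0: Byte)))

    // Stops once 55 bytes are reached: 6 records of 10 bytes each get loaded.
    val limited = loadUntilLimit(data.iterator, limitBytes = 55)
    assert(limited.size == 6)

    // Batch size is chosen up front (here 64 rows), independent of byte size.
    val batch = loadBatch(data.iterator, batchSize = 64)
    assert(batch.size == 64)
  }
}
```

This is why a row-count config like the proposed `spark.sql.parquet.batchSize` is the practical knob: the batched path cannot react to a byte limit mid-batch.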