[GitHub] spark pull request #22643: [SPARK-25630][TEST] Reduce test time of HadoopFsR...

dongjoon-hyun Fri, 05 Oct 2018 12:21:42 -0700

Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22643#discussion_r223115147
  
    --- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/sources/HadoopFsRelationTest.scala 
---
    @@ -114,10 +118,21 @@ abstract class HadoopFsRelationTest extends QueryTest 
with SQLTestUtils with Tes
         new UDT.MyDenseVectorUDT()
       ).filter(supportsDataType)
     
    +  private val parquetDictionaryEncodingEnabledConfs = if 
(isParquetDataSource) {
    +    // Run with/without Parquet dictionary encoding enabled for Parquet 
data source.
    +    Seq(true, false)
    +  } else {
    +    Seq(false)
    +  }
    +
       for (dataType <- supportedDataTypes) {
    -    for (parquetDictionaryEncodingEnabled <- Seq(true, false)) {
    -      test(s"test all data types - $dataType with 
parquet.enable.dictionary = " +
    -        s"$parquetDictionaryEncodingEnabled") {
    +    for (parquetDictionaryEncodingEnabled <- 
parquetDictionaryEncodingEnabledConfs) {
    +      val extraMessage = if (isParquetDataSource) {
    +        s" with parquet.enable.dictionary = 
$parquetDictionaryEncodingEnabled"
    +      } else {
    +        ""
    +      }
    +      test(s"test all data types - $dataType$extraMessage") {
    --- End diff --
    
    This PR accidentally seems to disable `parquet.enable.dictionary = true` 
cases even in `ParquetHadoopFsRelationSuite`. Could you fix that? After fixing 
that, we need to measure the time redunction again.
    - 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97000/consoleFull
    ```scala
    [info] ParquetHadoopFsRelationSuite:
    [info] - test all data types - StringType (830 milliseconds)
    ...
    ```



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22643: [SPARK-25630][TEST] Reduce test time of HadoopFsR...

Reply via email to