Github user dilipbiswal commented on the issue:

    https://github.com/apache/spark/pull/22641
  
    @srowen Thank you for your comments. From a cursory look, I agree that it 
does not look very pretty. I also agree that it looks like we are not testing 
as much as we used to.
    
    This test case was added as part of SPARK-21786. If we look at what it is 
trying to test, it is basically testing the precedence with which the 
compression codec is chosen:
    
    ```
      val codecName = parameters
          .get("compression")
          .orElse(parquetCompressionConf)
          .getOrElse(sqlConf.parquetCompressionCodec)
          .toLowerCase(Locale.ROOT)
    ```
    Basically, what we are trying to test is: if both table-level and 
session-level codecs are specified, which one wins? We could have tested this 
with just one value for the table codec (say snappy) and one value for the 
session codec (say gzip). But we are being extra cautious and testing a cross 
product of 3 * 3 combinations. It seems to me the 3 values that we have chosen 
are probably the most commonly used ones, so I wanted to preserve this input 
set but decide randomly which combination to test. Also, as I mentioned above, 
we have a 6-way loop on top, which means that in one run we would probably 
pick 6 out of the 9 codec combinations. Over many Jenkins runs, we will 
eventually test all the combinations that we wanted to test in the first 
place, thereby catching any regression that occurs.
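
    As a rough sketch of the random selection described above (the object and 
method names are hypothetical and the three codec values are an assumption; 
this is not the actual test code in the PR):

    ```scala
    import scala.util.Random

    // Hypothetical illustration only; names do not come from the Spark test suite.
    object CodecCombinationSketch {
      // Assumed input set of three commonly used codecs.
      val codecs: Seq[String] = Seq("uncompressed", "snappy", "gzip")

      // Full cross product of (table-level, session-level) codecs: 3 * 3 = 9 pairs.
      val allCombinations: Seq[(String, String)] =
        for (table <- codecs; session <- codecs) yield (table, session)

      // Pick one pair uniformly at random from the cross product.
      def pickCombination(rng: Random): (String, String) =
        allCombinations(rng.nextInt(allCombinations.size))

      // One random pick per iteration of the outer 6-way loop.
      // Picks are independent, so a single run covers at most 6 distinct pairs.
      def combinationsForRun(rng: Random, outerIterations: Int = 6): Seq[(String, String)] =
        Seq.fill(outerIterations)(pickCombination(rng))
    }
    ```

    Passing a seeded `Random` (and logging the seed) would make a failing 
combination reproducible, which is a design choice assumed here rather than 
something the PR is known to do.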
     
    Given the code that we are trying to test, it would be extremely rare for 
it to work for one codec combination but fail for another, as the logic is 
agnostic to the codec value and is merely a precedence check. However, we are 
taking a hit on every run, both for developers who run the tests on their 
laptops and for all the Jenkins runs that happen automatically.
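
    To illustrate why the check is agnostic to the codec value, here is a 
minimal standalone sketch of the precedence chain quoted above. The function 
and parameter names are hypothetical stand-ins for the Spark values, and plain 
`Option`/`String` arguments replace the real configuration objects:

    ```scala
    import java.util.Locale

    // Hypothetical illustration; mirrors the orElse/getOrElse chain, not Spark's API.
    object CodecPrecedenceSketch {
      def resolveCodec(
          compressionOption: Option[String],      // stand-in for parameters.get("compression")
          parquetCompressionConf: Option[String], // stand-in for the table-level setting
          sessionCodec: String                    // stand-in for sqlConf.parquetCompressionCodec
      ): String =
        compressionOption
          .orElse(parquetCompressionConf) // used only when the option is absent
          .getOrElse(sessionCodec)        // session value is the final fallback
          .toLowerCase(Locale.ROOT)
    }
    ```

    The function never inspects the codec string itself; it only returns the 
first value that is defined, so any pair of distinct codecs exercises the same 
code path.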
    
    So we have a few options:
    1) Reduce the input codec list from 3 to 2 (or 1) => the number of test 
combinations goes down to 24 (6 * 2 * 2) from 54 (6 * 3 * 3)
    2) Do what the PR is doing - pick 1 at random => the number of test 
combinations is 6 in a given run
    3) Do nothing
    
    I will do whatever you guys prefer here. Please advise.

