Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/22641

@srowen Thank you for your comments. From a cursory look, I would agree that it does not look that pretty, and also that it looks like we are not testing as much as we used to. This test case was added as part of SPARK-21786, and what it is really testing is the precedence used when choosing the compression codec:

```
val codecName = parameters
  .get("compression")
  .orElse(parquetCompressionConf)
  .getOrElse(sqlConf.parquetCompressionCodec)
  .toLowerCase(Locale.ROOT)
```

Basically, what we are trying to test is: if both a table-level and a session-level codec are specified, which one wins? We could have tested this with a single table codec (say snappy) and a single session codec (say gzip), but to be extra cautious we test the full cross product of 3 * 3 combinations. The 3 values we have chosen are probably the most commonly used ones, so I wanted to preserve this input set but decide which combination to test randomly. Also, as I mentioned above, there is a 6-way loop on top, which means that in one run we would probably pick 6 of the 9 codec combinations. Over many Jenkins runs we will eventually cover all the combinations we wanted to test in the first place, thereby catching any regression. Given the code under test, it would be extremely rare for one codec combination to work while another fails, since the logic is agnostic to the codec value and is merely a precedence check. However, we take the cost of the full cross product on every run, both for developers running the suite on their laptops and for all the Jenkins runs that happen automatically.
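To make the precedence concrete, here is a minimal standalone sketch of what the test is asserting. The `chooseCodec` helper and its parameter names are hypothetical, mirroring the snippet above rather than the actual Spark code:

```scala
import java.util.Locale

// Hypothetical standalone version of the precedence logic under test:
// the table-level "compression" option wins over the session-level
// compression conf, which in turn wins over the default codec.
def chooseCodec(
    tableOption: Option[String],  // parameters.get("compression")
    sessionConf: Option[String],  // parquetCompressionConf
    default: String): String =    // sqlConf.parquetCompressionCodec
  tableOption
    .orElse(sessionConf)
    .getOrElse(default)
    .toLowerCase(Locale.ROOT)

// Table option beats session conf:
assert(chooseCodec(Some("SNAPPY"), Some("gzip"), "lzo") == "snappy")
// Session conf beats the default:
assert(chooseCodec(None, Some("gzip"), "lzo") == "gzip")
// Default applies when neither is set:
assert(chooseCodec(None, None, "LZO") == "lzo")
```

Since the precedence is value-agnostic, these three assertions capture the behavior regardless of which concrete codecs are fed in.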
So we have a few options:
1) Reduce the input codec list from 3 to 2 (or 1) => the number of test combinations goes down from 54 to 24
2) Do what the PR is doing, i.e. pick 1 at random => 6 test combinations in a given run
3) Do nothing

I will do whatever you guys prefer here. Please advise.
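For illustration, option 2 amounts to something like the following sketch. The codec list and variable names are assumptions for illustration, not the PR's exact code:

```scala
import scala.util.Random

// The three commonly used codec values from the existing test input set.
val codecs = Seq("snappy", "gzip", "none")

// Option 2: instead of iterating over all 9 table/session pairs,
// pick one pair at random on each run; repeated CI runs eventually
// cover the whole cross product.
val tableCodec = codecs(Random.nextInt(codecs.length))
val sessionCodec = codecs(Random.nextInt(codecs.length))
```

Each run still exercises the same precedence check; only the sampled codec pair varies between runs.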