mattf-apache opened a new pull request #25482: [SPARK-28749][TEST][BRANCH-2.4] 
Fix PySpark tests not to require kafka-0-8 in branch-2.4
URL: https://github.com/apache/spark/pull/25482
 
 
   ### What changes were proposed in this pull request?
   Simple fix of https://issues.apache.org/jira/browse/SPARK-28749
   
   ### Why are the changes needed?
   As discussed in the referenced Jira, currently, the PySpark tests invoked by 
`python/run-tests` demand the presence of kafka-0-8 libraries. If not present, 
a failure message will be generated regardless of whether the tests are enabled 
by env variables. Since Kafka-0-8 libraries are not compatible with Scala-2.12, 
this means we can’t successfully run pyspark tests on a Scala-2.12 build.  
These proposed changes fix the problem. 
   
   ### Does this PR introduce any user-facing change?
   No. It only changes a test behavior. 
   
   ### How was this patch tested?
   This is a fix of a test bug. The current behavior is demonstrably wrong 
under Scala-2.12, as stated above. The corrected behavior allows the tests to 
run to completion, with results that are consistent with the expected results, 
similar to the successful Scala-2.11 results.
   
   We performed the following:
   - Full mvn build under Scala-2.11, with kafka-0-8 profile
   - Full mvn build under Scala-2.12, without kafka-0-8 profile
   - Full maven Unit Test of both, with no change (as expected)
   - PySpark tests via `python/run-tests` with both.  Both complete 
successfully.
   
   Former behavior before this patch was that the last step would post a 
Failure on the Scala-2.12 build.
   
   ### Notes for reviewers
   There are of course many ways to fix the problem. I chose to follow the 
pattern established by the Kinesis testing routines that were right “next to” 
the Kafka-0-8 test routines in the same file. They showed a presumably 
acceptable way of dealing with missing jars. 
   
   I ignored the env variable, because the presence or absence of the jar 
seemed a sufficient and more important determinant. But if you want me to use 
the env variable exactly the same way as the Kinesis testing code, I’ll be 
happy to do so. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to