[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-22 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r637489244



##
File path: dev/run-tests.py
##
@@ -687,6 +689,10 @@ def main():
 test_modules = determine_modules_to_test(changed_modules)
 excluded_tags = determine_tags_to_exclude(changed_modules)
 
+# Run docker-integration-tests only if it's explicitly specified.
+if not opts.modules or modules.docker_integration_tests.name not in 
opts.modules:
+excluded_tags.extend(modules.docker_integration_tests.test_tags)

Review comment:
   @sarutak, this way seems okay but how about doing it with similarly with 
other tests like Kinesis just for consistency?  (see `git grep -r 
"ENABLE_KINESIS_TESTS"`).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640210059



##
File path: dev/run-tests.py
##
@@ -687,6 +689,10 @@ def main():
 test_modules = determine_modules_to_test(changed_modules)
 excluded_tags = determine_tags_to_exclude(changed_modules)
 
+# With this script, disable docker integration by default.

Review comment:
   Could we leverage environ in modules? 
https://github.com/apache/spark/blob/620af5bcdc1fbc446f5cbb539ce79e6163a8d425/dev/sparktestsupport/modules.py#L306-L308
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640210831



##
File path: 
external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerKrbJDBCIntegrationSuite.scala
##
@@ -107,7 +109,7 @@ abstract class DockerKrbJDBCIntegrationSuite extends 
DockerJDBCIntegrationSuite
 conn.prepareStatement("INSERT INTO bar VALUES ('hello')").executeUpdate()
   }
 
-  test("Basic read test in query option") {

Review comment:
   @sarutak, there's actually a better way with less changes by overriding 
`test` method. e.g.) 
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLCommandTestUtils.scala#L52-L55




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640211218



##
File path: 
external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala
##
@@ -115,69 +116,72 @@ abstract class DockerJDBCIntegrationSuite extends 
SharedSparkSession with Eventu
   protected var jdbcUrl: String = _
 
   override def beforeAll(): Unit = {

Review comment:
   I would do:
   
   ```scala
   override def beforeAll(): Unit = runIfTestsEnabled(s"Prepare for 
$this.getClass.getName") {
   ```
   
   or
   
   ```scala
   override def beforeAll(): Unit = runIfTestsEnabled(
   s"Prepare for $this.getClass.getName") {
   ```
   
   to minimise the diff because of the indentation




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640211658



##
File path: .github/workflows/build_and_test.yml
##
@@ -625,3 +625,84 @@ jobs:
   with:
 name: unit-tests-log-tpcds--8-hadoop3.2-hive2.3
 path: "**/target/unit-tests.log"
+
+  docker-integration-tests:
+name: Run docker integration tests
+runs-on: ubuntu-20.04
+env:
+  HADOOP_PROFILE: hadoop3.2
+  HIVE_PROFILE: hive2.3
+  GITHUB_PREV_SHA: ${{ github.event.before }}
+  SPARK_LOCAL_IP: localhost
+  ORACLE_DOCKER_IMAGE_NAME: oracle/database:18.4.0-xe
+  ENABLE_DOCKER_INTEGRATION_TESTS: 1
+steps:
+- name: Checkout Spark repository
+  uses: actions/checkout@v2
+  with:
+fetch-depth: 0
+repository: apache/spark
+ref: master
+- name: Sync the current branch with the latest in Apache Spark
+  if: github.repository != 'apache/spark'
+  id: sync-branch
+  run: |
+git fetch https://github.com/$GITHUB_REPOSITORY.git 
${GITHUB_REF#refs/heads/}
+git -c user.name='Apache Spark Test Account' -c 
user.email='sparktest...@gmail.com' merge --no-commit --progress --squash 
FETCH_HEAD
+git -c user.name='Apache Spark Test Account' -c 
user.email='sparktest...@gmail.com' commit -m "Merged commit"
+- name: Cache Scala, SBT and Maven
+  uses: actions/cache@v2
+  with:
+path: |
+  build/apache-maven-*
+  build/scala-*
+  build/*.jar
+  ~/.sbt
+key: build-${{ hashFiles('**/pom.xml', 'project/build.properties', 
'build/mvn', 'build/sbt', 'build/sbt-launch-lib.bash', 
'build/spark-build-info') }}
+restore-keys: |
+  build-
+- name: Cache Coursier local repository
+  uses: actions/cache@v2
+  with:
+path: ~/.cache/coursier
+key: docker-integration-coursier-${{ hashFiles('**/pom.xml', 
'**/plugins.sbt') }}
+restore-keys: |
+  docker-integration-coursier-
+- name: Install Java 8
+  uses: actions/setup-java@v1
+  with:
+java-version: 8
+- name: Cache Oracle docker-images repository
+  id: cache-oracle-docker-images
+  uses: actions/cache@v2
+  with:
+path: ./oracle/docker-images
+# key should contains the commit hash of the Oracle docker images to 
be checkout.
+key: oracle-docker-images-3f422c4a35b423dfcdbcc57a84f01db6c82eb6c1
+- name: Checkout Oracle docker-images repository
+  uses: actions/checkout@v2
+  with:
+fetch-depth: 0
+repository: oracle/docker-images
+ref: 3f422c4a35b423dfcdbcc57a84f01db6c82eb6c1
+path: ./oracle/docker-images
+- name: Install Oracle docker image

Review comment:
   nit but docker -> Docker




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640253426



##
File path: dev/run-tests.py
##
@@ -687,6 +689,10 @@ def main():
 test_modules = determine_modules_to_test(changed_modules)
 excluded_tags = determine_tags_to_exclude(changed_modules)
 
+# With this script, disable docker integration by default.

Review comment:
   oh @sarutak, if `ENABLE_DOCKER_INTEGRATION_TESTS` is set to `1`, it will 
be only set when the module is explicitly selected. That's what 
`ENABLE_KINESIS_TESTS` does (because Kinesis test cases requires AWS resources 
which we shouldn't run too often) also see: 
https://github.com/apache/spark/commit/76f2e393a5fad0db8b56c4b8dad5ef686bf140a4
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640253426



##
File path: dev/run-tests.py
##
@@ -687,6 +689,10 @@ def main():
 test_modules = determine_modules_to_test(changed_modules)
 excluded_tags = determine_tags_to_exclude(changed_modules)
 
+# With this script, disable docker integration by default.

Review comment:
   oh @sarutak, if `ENABLE_DOCKER_INTEGRATION_TESTS` is set to `1`, it will 
be only set when the module is explicitly selected by `run-tests.py`. That's 
what `ENABLE_KINESIS_TESTS` does (because Kinesis test cases requires AWS 
resources which we shouldn't run too often) also see: 
https://github.com/apache/spark/commit/76f2e393a5fad0db8b56c4b8dad5ef686bf140a4
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640253426



##
File path: dev/run-tests.py
##
@@ -687,6 +689,10 @@ def main():
 test_modules = determine_modules_to_test(changed_modules)
 excluded_tags = determine_tags_to_exclude(changed_modules)
 
+# With this script, disable docker integration by default.

Review comment:
   oh @sarutak, if `ENABLE_DOCKER_INTEGRATION_TESTS` is set to `1`, it will 
be only set when the module is explicitly selected by changed modules of 
`run-tests.py`. That's what `ENABLE_KINESIS_TESTS` does (because Kinesis test 
cases requires AWS resources which we shouldn't run too often) also see: 
https://github.com/apache/spark/commit/76f2e393a5fad0db8b56c4b8dad5ef686bf140a4
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640254469



##
File path: dev/run-tests.py
##
@@ -687,6 +689,10 @@ def main():
 test_modules = determine_modules_to_test(changed_modules)
 excluded_tags = determine_tags_to_exclude(changed_modules)
 
+# With this script, disable docker integration by default.

Review comment:
   Oh, okay, maybe `run-tests.py --modules` might have a different logic 
from that. I will double check




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-26 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640253426



##
File path: dev/run-tests.py
##
@@ -687,6 +689,10 @@ def main():
 test_modules = determine_modules_to_test(changed_modules)
 excluded_tags = determine_tags_to_exclude(changed_modules)
 
+# With this script, disable docker integration by default.

Review comment:
   oh @sarutak, if `ENABLE_DOCKER_INTEGRATION_TESTS` is set to `1`, it will 
be only set when the module is explicitly selected by changed modules at 
`run-tests.py`. That's what `ENABLE_KINESIS_TESTS` does (because Kinesis test 
cases requires AWS resources which we shouldn't run too often) also see: 
https://github.com/apache/spark/commit/76f2e393a5fad0db8b56c4b8dad5ef686bf140a4
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640497898



##
File path: dev/sparktestsupport/modules.py
##
@@ -743,6 +743,20 @@ def __hash__(self):
 ]
 )
 
+docker_integration_tests = Module(
+name="docker-integration-tests",
+dependencies=[sql],

Review comment:
   I think it's fine to make this dependencies empty like Kinesis one.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640571109



##
File path: dev/run-tests.py
##
@@ -687,6 +689,10 @@ def main():
 test_modules = determine_modules_to_test(changed_modules)
 excluded_tags = determine_tags_to_exclude(changed_modules)
 
+# With this script, disable docker integration by default.

Review comment:
   Ohh yeah .. I see we should probably run this only when we're in GitHub 
Action.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640573670



##
File path: dev/sparktestsupport/modules.py
##
@@ -743,6 +743,20 @@ def __hash__(self):
 ]
 )
 
+docker_integration_tests = Module(
+name="docker-integration-tests",
+dependencies=[],
+build_profile_flags=["-Pdocker-integration-tests"],
+source_file_regexes=["external/docker-integration-tests"],
+sbt_test_goals=["docker-integration-tests/test"],
+environ={
+"ENABLE_DOCKER_INTEGRATION_TESTS": "1"
+},

Review comment:
   @sarutak can we add this environment variable if `GITHUB_ACTION` 
environment variable is set? e.g.)
   
   ```suggestion
   environ=None if "GITHUB_ACTIONS" in os.environ else {
   "ENABLE_DOCKER_INTEGRATION_TESTS": "1"
   },
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640573670



##
File path: dev/sparktestsupport/modules.py
##
@@ -743,6 +743,20 @@ def __hash__(self):
 ]
 )
 
+docker_integration_tests = Module(
+name="docker-integration-tests",
+dependencies=[],
+build_profile_flags=["-Pdocker-integration-tests"],
+source_file_regexes=["external/docker-integration-tests"],
+sbt_test_goals=["docker-integration-tests/test"],
+environ={
+"ENABLE_DOCKER_INTEGRATION_TESTS": "1"
+},

Review comment:
   @sarutak can we add this environment variable if `GITHUB_ACTION` 
environment variable is set? e.g.)
   
   ```suggestion
   # Run Docker integration tests only in GitHub Actions because Jenkins 
doesn't have Docker.
   environ=None if "GITHUB_ACTIONS" in os.environ else {
   "ENABLE_DOCKER_INTEGRATION_TESTS": "1"
   },
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640573670



##
File path: dev/sparktestsupport/modules.py
##
@@ -743,6 +743,20 @@ def __hash__(self):
 ]
 )
 
+docker_integration_tests = Module(
+name="docker-integration-tests",
+dependencies=[],
+build_profile_flags=["-Pdocker-integration-tests"],
+source_file_regexes=["external/docker-integration-tests"],
+sbt_test_goals=["docker-integration-tests/test"],
+environ={
+"ENABLE_DOCKER_INTEGRATION_TESTS": "1"
+},

Review comment:
   @sarutak, I just read 
https://github.com/apache/spark/pull/32631#discussion_r640427313. Can we add 
this environment variable if `GITHUB_ACTION` environment variable is set? e.g.)
   
   ```suggestion
   # Run Docker integration tests only in GitHub Actions because Jenkins 
doesn't have Docker.
   environ=None if "GITHUB_ACTIONS" in os.environ else {
   "ENABLE_DOCKER_INTEGRATION_TESTS": "1"
   },
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640573670



##
File path: dev/sparktestsupport/modules.py
##
@@ -743,6 +743,20 @@ def __hash__(self):
 ]
 )
 
+docker_integration_tests = Module(
+name="docker-integration-tests",
+dependencies=[],
+build_profile_flags=["-Pdocker-integration-tests"],
+source_file_regexes=["external/docker-integration-tests"],
+sbt_test_goals=["docker-integration-tests/test"],
+environ={
+"ENABLE_DOCKER_INTEGRATION_TESTS": "1"
+},

Review comment:
   @sarutak, I just read 
https://github.com/apache/spark/pull/32631#discussion_r640427313. Can we add 
this environment variable if `GITHUB_ACTION` environment variable is set? e.g.)
   
   ```suggestion
   # Run Docker integration tests only in GitHub Actions because Jenkins 
doesn't have Docker.
   environ=None if "GITHUB_ACTIONS" not in os.environ else {
   "ENABLE_DOCKER_INTEGRATION_TESTS": "1"
   },
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640579425



##
File path: dev/run-tests.py
##
@@ -122,19 +122,21 @@ def determine_modules_to_test(changed_modules, 
deduplicated=True):
 ['graphx', 'examples']
 >>> [x.name for x in determine_modules_to_test([modules.sql])]
 ... # doctest: +NORMALIZE_WHITESPACE
-['sql', 'avro', 'hive', 'mllib', 'sql-kafka-0-10', 'examples', 
'hive-thriftserver',
- 'pyspark-sql', 'repl', 'sparkr', 'pyspark-mllib', 'pyspark-pandas', 
'pyspark-ml']
+['sql', 'avro', 'docker-integration-tests', 'hive', 'mllib', 
'sql-kafka-0-10', 'examples',
+ 'hive-thriftserver', 'pyspark-sql', 'repl', 'sparkr',
+ 'pyspark-mllib', 'pyspark-pandas', 'pyspark-ml']
 >>> sorted([x.name for x in determine_modules_to_test(
 ... [modules.sparkr, modules.sql], deduplicated=False)])
 ... # doctest: +NORMALIZE_WHITESPACE
-['avro', 'examples', 'hive', 'hive-thriftserver', 'mllib', 'pyspark-ml',
- 'pyspark-mllib', 'pyspark-pandas', 'pyspark-sql', 'repl', 'sparkr', 
'sql', 'sql-kafka-0-10']
+['avro', 'docker-integration-tests', 'examples', 'hive', 
'hive-thriftserver', 'mllib',
+ 'pyspark-ml', 'pyspark-mllib', 'pyspark-pandas', 'pyspark-sql',
+ 'repl', 'sparkr', 'sql', 'sql-kafka-0-10']
 >>> sorted([x.name for x in determine_modules_to_test(
 ... [modules.sql, modules.core], deduplicated=False)])
 ... # doctest: +NORMALIZE_WHITESPACE
-['avro', 'catalyst', 'core', 'examples', 'graphx', 'hive', 
'hive-thriftserver',
- 'mllib', 'mllib-local', 'pyspark-core', 'pyspark-ml', 'pyspark-mllib', 
'pyspark-pandas',
- 'pyspark-resource', 'pyspark-sql', 'pyspark-streaming', 'repl', 'root',
+['avro', 'catalyst', 'core', 'docker-integration-tests', 'examples', 
'graphx', 'hive',

Review comment:
   Oops, I think this test will fail because `docker-integration-tests` 
won't be selected anymore.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640580266



##
File path: 
external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/DockerIntegrationFunSuite.scala
##
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.jdbc
+
+import org.scalactic.source.Position
+import org.scalatest.Tag
+
+import org.apache.spark.SparkFunSuite
+
+/**
+ * Helper class that runs docker integration tests.
+ * Ignores them based on env variable is set or not.
+ */
+trait DockerIntegrationFunSuite extends SparkFunSuite {
+  private val envVarNameForEnablingTests = "ENABLE_DOCKER_INTEGRATION_TESTS"
+  private val shouldRunTests = sys.env.getOrElse(envVarNameForEnablingTests, 
"0") match {
+case "1" => true
+case _ => false
+  }
+
+  /** Run the test if environment variable is set or ignore the test */
+  override def test(testName: String, testTags: Tag*)(testBody: => Any)
+(implicit pos: Position): Unit = {
+if (shouldRunTests) {
+  super.test(testName)(testBody)

Review comment:
   nit but maybe:
   ```suggestion
 super.test(testName, testTags: _*)(testBody)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r640584635



##
File path: dev/sparktestsupport/modules.py
##
@@ -743,6 +743,20 @@ def __hash__(self):
 ]
 )
 
+docker_integration_tests = Module(
+name="docker-integration-tests",
+dependencies=[],
+build_profile_flags=["-Pdocker-integration-tests"],
+source_file_regexes=["external/docker-integration-tests"],
+sbt_test_goals=["docker-integration-tests/test"],
+environ={
+"ENABLE_DOCKER_INTEGRATION_TESTS": "1"
+},

Review comment:
   @sarutak BTW, I added this suggestion a bit after a quick testing .. now 
this should work correctly ... hopefully ..




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r641276796



##
File path: .github/workflows/build_and_test.yml
##
@@ -625,3 +625,83 @@ jobs:
   with:
 name: unit-tests-log-tpcds--8-hadoop3.2-hive2.3
 path: "**/target/unit-tests.log"
+
+  docker-integration-tests:
+name: Run docker integration tests
+runs-on: ubuntu-20.04
+env:
+  HADOOP_PROFILE: hadoop3.2
+  HIVE_PROFILE: hive2.3
+  GITHUB_PREV_SHA: ${{ github.event.before }}
+  SPARK_LOCAL_IP: localhost
+  ORACLE_DOCKER_IMAGE_NAME: oracle/database:18.4.0-xe
+steps:
+- name: Checkout Spark repository
+  uses: actions/checkout@v2
+  with:
+fetch-depth: 0
+repository: apache/spark
+ref: master
+- name: Sync the current branch with the latest in Apache Spark
+  if: github.repository != 'apache/spark'
+  id: sync-branch
+  run: |
+git fetch https://github.com/$GITHUB_REPOSITORY.git 
${GITHUB_REF#refs/heads/}
+git -c user.name='Apache Spark Test Account' -c 
user.email='sparktest...@gmail.com' merge --no-commit --progress --squash 
FETCH_HEAD
+git -c user.name='Apache Spark Test Account' -c 
user.email='sparktest...@gmail.com' commit -m "Merged commit"

Review comment:
   Oh, we should add `echo "::set-output 
name=APACHE_SPARK_REF::$apache_spark_ref"` after this line because we're 
running tests with `run-tests.py`.

##
File path: .github/workflows/build_and_test.yml
##
@@ -625,3 +625,83 @@ jobs:
   with:
 name: unit-tests-log-tpcds--8-hadoop3.2-hive2.3
 path: "**/target/unit-tests.log"
+
+  docker-integration-tests:
+name: Run docker integration tests
+runs-on: ubuntu-20.04
+env:
+  HADOOP_PROFILE: hadoop3.2
+  HIVE_PROFILE: hive2.3
+  GITHUB_PREV_SHA: ${{ github.event.before }}
+  SPARK_LOCAL_IP: localhost
+  ORACLE_DOCKER_IMAGE_NAME: oracle/database:18.4.0-xe
+steps:
+- name: Checkout Spark repository
+  uses: actions/checkout@v2
+  with:
+fetch-depth: 0
+repository: apache/spark
+ref: master
+- name: Sync the current branch with the latest in Apache Spark
+  if: github.repository != 'apache/spark'
+  id: sync-branch
+  run: |
+git fetch https://github.com/$GITHUB_REPOSITORY.git 
${GITHUB_REF#refs/heads/}
+git -c user.name='Apache Spark Test Account' -c 
user.email='sparktest...@gmail.com' merge --no-commit --progress --squash 
FETCH_HEAD
+git -c user.name='Apache Spark Test Account' -c 
user.email='sparktest...@gmail.com' commit -m "Merged commit"

Review comment:
   I will revert this for now .. seems like it breaks other tests.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox


HyukjinKwon commented on a change in pull request #32631:
URL: https://github.com/apache/spark/pull/32631#discussion_r641277034



##
File path: .github/workflows/build_and_test.yml
##
@@ -625,3 +625,83 @@ jobs:
   with:
 name: unit-tests-log-tpcds--8-hadoop3.2-hive2.3
 path: "**/target/unit-tests.log"
+
+  docker-integration-tests:
+name: Run docker integration tests
+runs-on: ubuntu-20.04
+env:
+  HADOOP_PROFILE: hadoop3.2
+  HIVE_PROFILE: hive2.3
+  GITHUB_PREV_SHA: ${{ github.event.before }}
+  SPARK_LOCAL_IP: localhost
+  ORACLE_DOCKER_IMAGE_NAME: oracle/database:18.4.0-xe
+steps:
+- name: Checkout Spark repository
+  uses: actions/checkout@v2
+  with:
+fetch-depth: 0
+repository: apache/spark
+ref: master
+- name: Sync the current branch with the latest in Apache Spark
+  if: github.repository != 'apache/spark'
+  id: sync-branch
+  run: |
+git fetch https://github.com/$GITHUB_REPOSITORY.git 
${GITHUB_REF#refs/heads/}
+git -c user.name='Apache Spark Test Account' -c 
user.email='sparktest...@gmail.com' merge --no-commit --progress --squash 
FETCH_HEAD
+git -c user.name='Apache Spark Test Account' -c 
user.email='sparktest...@gmail.com' commit -m "Merged commit"

Review comment:
   @sarutak would you mind opening a Pr again for this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org