[GitHub] [spark] HyukjinKwon commented on issue #23946: [SPARK-26860][PySpark] [SparkR] Fix for RangeBetween and RowsBetween docs to be in sync with spark documentation
HyukjinKwon commented on issue #23946: [SPARK-26860][PySpark] [SparkR] Fix for RangeBetween and RowsBetween docs to be in sync with spark documentation URL: https://github.com/apache/spark/pull/23946#issuecomment-470839794 If the docs are consistent, this should be good to go. Let me compare them later if no one reviews in a few days. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #23946: [SPARK-26860][PySpark] [SparkR] Fix for RangeBetween and RowsBetween docs to be in sync with spark documentation
HyukjinKwon commented on a change in pull request #23946: [SPARK-26860][PySpark] [SparkR] Fix for RangeBetween and RowsBetween docs to be in sync with spark documentation
URL: https://github.com/apache/spark/pull/23946#discussion_r263692570

## File path: python/pyspark/sql/window.py ##
@@ -97,6 +97,33 @@ def rowsBetween(start, end):
         and ``Window.currentRow`` to specify special boundary values, rather than using integral
         values directly.
 
+        A row based boundary is based on the position of the row within the partition.
+        An offset indicates the number of rows above or below the current row, the frame for the
+        current row starts or ends. For instance, given a row based sliding frame with a lower bound
+        offset of -1 and a upper bound offset of +2. The frame for row with index 5 would range from
+        index 4 to index 6.
+
+        >>> from pyspark.sql import Window
+        >>> from pyspark.sql import functions as func
+        >>> from pyspark.sql import SQLContext
+        >>> sc = SparkContext.getOrCreate()
+        >>> sqlContext = SQLContext(sc)
+        >>> tup = [(1, "a"), (1, "a"), (2, "a"), (1, "b"), (2, "b"), (3, "b")]
+        >>> df = sqlContext.createDataFrame(tup, ["id", "category"])
+        >>> window = Window.partitionBy("category").orderBy("id").rowsBetween(Window.currentRow, 1)
+        >>> df.withColumn("sum", func.sum("id").over(window)).show()
+        +---+--------+---+
+        | id|category|sum|
+        +---+--------+---+
+        |  1|       b|  3|
+        |  2|       b|  5|
+        |  3|       b|  3|
+        |  1|       a|  2|
+        |  1|       a|  3|
+        |  2|       a|  2|
+        +---+--------+---+
+

Review comment:
Nope, it's not necessary. `optionflags=doctest.NORMALIZE_WHITESPACE` is needed just to make the doc prettier (by getting rid of ``).

```python
(failure_count, test_count) = doctest.testmod(
    pyspark.sql.window,
```

is just to make the module path pretty. In the console, it is possible for the module path to show up like `__main__.bla.bla`. This way, it shows up like `pyspark.sql.window.bla.bla`. Not a big deal at all.
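The row-based frame described in the quoted docstring can be illustrated without a Spark session. The following is a minimal pure-Python sketch, not PySpark's implementation; `frame_indices` and `rows_between_sum` are hypothetical helper names, and the clipping of the frame at the partition edges is an assumption made for illustration. Under this reading, offsets of -1 and +2 applied to the row at index 5 cover indices 4 through 7.

```python
def frame_indices(index, start, end, partition_size):
    """Return the row positions covered by a row-based frame.

    `start` and `end` are offsets relative to `index` (0 means the current row).
    The frame is clipped to the partition (an assumption of this sketch).
    """
    lo = max(0, index + start)
    hi = min(partition_size - 1, index + end)
    return list(range(lo, hi + 1))


def rows_between_sum(values, start, end):
    """Sum `values` over the row-based frame of every row in one ordered partition."""
    size = len(values)
    return [
        sum(values[i] for i in frame_indices(pos, start, end, size))
        for pos in range(size)
    ]


# Lower bound offset -1, upper bound offset +2, current row at index 5.
print(frame_indices(5, -1, 2, partition_size=10))

# The two partitions from the quoted example, ordered by id,
# with a frame from the current row to one row after it.
print(rows_between_sum([1, 2, 3], 0, 1))  # category "b"
print(rows_between_sum([1, 1, 2], 0, 1))  # category "a"
```

The per-partition sums agree with the `sum` column in the quoted `show()` output (3, 5, 3 for "b" and 2, 3, 2 for "a").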
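The effect of `doctest.NORMALIZE_WHITESPACE` mentioned in the review comment can be seen in isolation with the standard library alone. This is a small sketch, not the Spark test harness: `show_table` is a hypothetical stand-in for a call that prints a table followed by a blank line, and `run_doctest` is a helper written for this example.

```python
import doctest


def show_table():
    """Print a small table followed by a trailing blank line.

    >>> show_table()
    +---+
    | id|
    +---+
    |  1|
    +---+
    """
    print("+---+\n| id|\n+---+\n|  1|\n+---+\n")


def run_doctest(optionflags=0):
    """Run the doctest in show_table's docstring and return the TestResults."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        show_table.__doc__, {"show_table": show_table}, "show_table", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report; only the counts matter here.
    return runner.run(test, out=lambda text: None)


strict = run_doctest()
relaxed = run_doctest(doctest.NORMALIZE_WHITESPACE)
print(strict.failed, relaxed.failed)
```

Without the flag, the extra blank line in the actual output makes the example fail unless the expected output spells it out; with the flag, runs of whitespace compare as equal, so the docstring can stay clean.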
[GitHub] [spark] SongYadong commented on issue #23985: [SQL][MINOR] List SparkSql reserved keywords in alphabet order
SongYadong commented on issue #23985: [SQL][MINOR] List SparkSql reserved keywords in alphabet order URL: https://github.com/apache/spark/pull/23985#issuecomment-470838962 @viirya thanks. I will do that.
[GitHub] [spark] SparkQA commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
SparkQA commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470838043 **[Test build #103197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103197/testReport)** for PR 23942 at commit [`2c02777`](https://github.com/apache/spark/commit/2c0277748f2d20ee7fc3bf0a85279b4c72cd24e4).
[GitHub] [spark] AmplabJenkins removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
AmplabJenkins removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470837603 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8642/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
AmplabJenkins removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470837601 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
AmplabJenkins commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470837601 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
AmplabJenkins commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470837603 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8642/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470834979 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103195/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
SparkQA commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470836549 **[Test build #103196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103196/testReport)** for PR 24023 at commit [`1384234`](https://github.com/apache/spark/commit/1384234edee0387f6e6152cdd9d16204006385f4).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470836167 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8641/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470836162 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470836167 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8641/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470836162 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470834976 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470834976 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470834979 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103195/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
AmplabJenkins removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470834101 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8639/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
SparkQA commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470834061 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8639/
[GitHub] [spark] AmplabJenkins commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
AmplabJenkins commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470834101 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8639/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
AmplabJenkins commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470834097 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
AmplabJenkins removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470834097 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470833089 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8640/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins removed a comment on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470833085 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
SparkQA commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470833473 **[Test build #103195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103195/testReport)** for PR 24023 at commit [`537e5a9`](https://github.com/apache/spark/commit/537e5a94476ec5da809725accc58b7a159150541).
[GitHub] [spark] AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470833085 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
AmplabJenkins commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470833089 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8640/ Test PASSed.
[GitHub] [spark] HyukjinKwon commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
HyukjinKwon commented on issue #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes URL: https://github.com/apache/spark/pull/24023#issuecomment-470832429 cc @vanzin, @felixcheung, @squito
[GitHub] [spark] HyukjinKwon commented on a change in pull request #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
HyukjinKwon commented on a change in pull request #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
URL: https://github.com/apache/spark/pull/24023#discussion_r263686438

## File path: core/src/main/scala/org/apache/spark/security/SocketAuthServer.scala ##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.security
+
+import java.net._
+
+import scala.concurrent.Promise
+import scala.concurrent.duration.Duration
+import scala.language.existentials
+import scala.util.Try
+
+import org.apache.spark.SparkEnv
+import org.apache.spark.network.util.JavaUtils
+import org.apache.spark.util._
+
+
+/**
+ * Creates a server in the jvm to communicate with python for handling one batch of data, with
+ * authentication and error handling.
+ */
+private[spark] abstract class SocketAuthServer[T](
+    authHelper: SocketAuthHelper,
+    threadName: String) {
+
+  def this(env: SparkEnv, threadName: String) = this(new SocketAuthHelper(env.conf), threadName)
+  def this(threadName: String) = this(SparkEnv.get, threadName)
+
+  private val promise = Promise[T]()

Review comment:
All moved code is unchanged, except that I made this `private`.
[GitHub] [spark] HyukjinKwon opened a new pull request #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
HyukjinKwon opened a new pull request #24023: [SPARK-27102][R][PYTHON][CORE] Remove the references to Python's Scala codes in R's Scala codes
URL: https://github.com/apache/spark/pull/24023

## What changes were proposed in this pull request?

Currently, R's Scala code happens to refer to Python's Scala code for the sake of code deduplication. It's a bit odd. For instance, when we face an exception from R, it shows a Python-related code path, which makes debugging confusing. There should rather be one code base that R and Python share.

This PR proposes:

1. Make a `SocketAuthServer` and move `PythonServer` so that `PythonRDD` and `RRDD` can share it.
2. Move `readRDDFromFile` and `readRDDFromInputStream` into `JavaRDD`.
3. Reuse `RAuthHelper` and remove `RSocketAuthHelper` in `RRDD`.
4. Rename `getEncryptionEnabled` to `isEncryptionEnabled` while I am here.

So, now, the places below:

- `sql/core/src/main/scala/org/apache/spark/sql/api/r`
- `core/src/main/scala/org/apache/spark/api/r`
- `mllib/src/main/scala/org/apache/spark/ml/r`

don't refer to Python's Scala code.

## How was this patch tested?

Existing tests should cover this.
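The idea behind a shared `SocketAuthServer` (serve one authenticated connection and hand the result back through a promise) can be sketched in a language-neutral way. The following Python analogue is only an illustration under stated assumptions, not Spark's Scala implementation: `OneShotAuthServer`, `send`, the newline-delimited secret, and the 15-second timeouts are all hypothetical choices made for this example.

```python
import secrets
import socket
import threading
from concurrent.futures import Future


class OneShotAuthServer:
    """Hypothetical analogue: accept one connection, check a secret, run a handler."""

    def __init__(self, handler, thread_name="one-shot-auth-server"):
        self.secret = secrets.token_hex(16)
        self._handler = handler
        self._result = Future()  # plays the role of the Scala Promise[T]
        self._server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._server.bind(("127.0.0.1", 0))  # loopback only, ephemeral port
        self._server.listen(1)
        self._server.settimeout(15)
        self.port = self._server.getsockname()[1]
        threading.Thread(target=self._serve, name=thread_name, daemon=True).start()

    def _serve(self):
        try:
            conn, _ = self._server.accept()
            with conn:
                conn.settimeout(15)
                reader = conn.makefile("r", encoding="utf-8", newline="\n")
                token = reader.readline().rstrip("\n")
                if not secrets.compare_digest(token, self.secret):
                    raise PermissionError("client sent a wrong secret")
                self._result.set_result(self._handler(reader))
        except Exception as exc:
            # Errors surface to whoever waits on the result.
            self._result.set_exception(exc)
        finally:
            self._server.close()

    def get_result(self, timeout=None):
        return self._result.result(timeout)


def send(port, secret, payload):
    """Client side: send the secret on the first line, then the payload."""
    with socket.create_connection(("127.0.0.1", port), timeout=15) as client:
        client.sendall(f"{secret}\n{payload}\n".encode("utf-8"))


server = OneShotAuthServer(lambda reader: reader.readline().strip().upper())
send(server.port, server.secret, "hello")
print(server.get_result(timeout=15))
```

Because the server class knows nothing about the caller's language, two front ends could reuse it by passing different handlers, which is the kind of sharing the PR description argues for.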
[GitHub] [spark] AmplabJenkins removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
AmplabJenkins removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470830713 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103189/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
AmplabJenkins removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470830710 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
AmplabJenkins commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470830710 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
AmplabJenkins commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470830713 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103189/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
SparkQA commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470830619 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8639/
[GitHub] [spark] SparkQA removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
SparkQA removed a comment on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470804588 **[Test build #103189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103189/testReport)** for PR 23942 at commit [`3927dec`](https://github.com/apache/spark/commit/3927decc5182e2007d303457800cf54a8b1c69ed).
[GitHub] [spark] SparkQA removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
SparkQA removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470827372 **[Test build #103194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103194/testReport)** for PR 23990 at commit [`b7c4e66`](https://github.com/apache/spark/commit/b7c4e660418919f965f85b27a8e202d54fb1d709).
[GitHub] [spark] AmplabJenkins removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
AmplabJenkins removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470830312 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103194/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn
SparkQA commented on issue #23942: [SPARK-27033][SQL]Add Optimize rule RewriteArithmeticFiltersOnIntegralColumn URL: https://github.com/apache/spark/pull/23942#issuecomment-470830534 **[Test build #103189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103189/testReport)** for PR 23942 at commit [`3927dec`](https://github.com/apache/spark/commit/3927decc5182e2007d303457800cf54a8b1c69ed). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24022: [k8s] Add memory limit for kubernetes
AmplabJenkins removed a comment on issue #24022: [k8s] Add memory limit for kubernetes URL: https://github.com/apache/spark/pull/24022#issuecomment-470829861 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
AmplabJenkins removed a comment on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470830307 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24022: [k8s] Add memory limit for kubernetes
AmplabJenkins commented on issue #24022: [k8s] Add memory limit for kubernetes URL: https://github.com/apache/spark/pull/24022#issuecomment-470830298 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
AmplabJenkins commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470830312 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103194/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
SparkQA commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470830218 **[Test build #103194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103194/testReport)** for PR 23990 at commit [`b7c4e66`](https://github.com/apache/spark/commit/b7c4e660418919f965f85b27a8e202d54fb1d709). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
AmplabJenkins commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470830307 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24022: [k8s] Add memory limit for kubernetes
AmplabJenkins removed a comment on issue #24022: [k8s] Add memory limit for kubernetes URL: https://github.com/apache/spark/pull/24022#issuecomment-470829762 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #24022: [k8s] Add memory limit for kubernetes
AmplabJenkins commented on issue #24022: [k8s] Add memory limit for kubernetes URL: https://github.com/apache/spark/pull/24022#issuecomment-470829861 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #24022: [k8s] Add memory limit for kubernetes
AmplabJenkins commented on issue #24022: [k8s] Add memory limit for kubernetes URL: https://github.com/apache/spark/pull/24022#issuecomment-470829762 Can one of the admins verify this patch?
[GitHub] [spark] hehuiyuan opened a new pull request #24022: [k8s] Add memory limit for kubernetes
hehuiyuan opened a new pull request #24022: [k8s] Add memory limit for kubernetes URL: https://github.com/apache/spark/pull/24022 ## What changes were proposed in this pull request? Add a memory limit to the pod, so that the memory limit is distinguished from the memory request.
[GitHub] [spark] SparkQA commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
SparkQA commented on issue #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#issuecomment-470827372 **[Test build #103194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103194/testReport)** for PR 23990 at commit [`b7c4e66`](https://github.com/apache/spark/commit/b7c4e660418919f965f85b27a8e202d54fb1d709).
[GitHub] [spark] chandulal commented on a change in pull request #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access …
chandulal commented on a change in pull request #23990: [SPARK-27061][K8S] Expose Driver UI port on driver service to access … URL: https://github.com/apache/spark/pull/23990#discussion_r263681599 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/DriverServiceFeatureStep.scala ## @@ -54,6 +56,8 @@ private[spark] class DriverServiceFeatureStep( config.DRIVER_PORT.key, DEFAULT_DRIVER_PORT) private val driverBlockManagerPort = kubernetesConf.sparkConf.getInt( config.DRIVER_BLOCK_MANAGER_PORT.key, DEFAULT_BLOCKMANAGER_PORT) + private val driverUIPort = kubernetesConf.sparkConf.getInt( Review comment: Fixed this.
[GitHub] [spark] AmplabJenkins commented on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery
AmplabJenkins commented on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery URL: https://github.com/apache/spark/pull/23783#issuecomment-470825590 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8638/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery
AmplabJenkins removed a comment on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery URL: https://github.com/apache/spark/pull/23783#issuecomment-470825590 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8638/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery
AmplabJenkins removed a comment on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery URL: https://github.com/apache/spark/pull/23783#issuecomment-470825586 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery
AmplabJenkins commented on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery URL: https://github.com/apache/spark/pull/23783#issuecomment-470825586 Merged build finished. Test PASSed.
[GitHub] [spark] LantaoJin commented on a change in pull request #23951: [SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time
LantaoJin commented on a change in pull request #23951: [SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time URL: https://github.com/apache/spark/pull/23951#discussion_r263655647 ## File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ## @@ -69,18 +69,48 @@ class FakeDAGScheduler(sc: SparkContext, taskScheduler: FakeTaskScheduler) // Get the rack for a given host object FakeRackUtil { private val hostToRack = new mutable.HashMap[String, String]() + var loopCount = 0 def cleanUp() { hostToRack.clear() +loopCount = 0 } def assignHostToRack(host: String, rack: String) { hostToRack(host) = rack } def getRackForHost(host: String): Option[String] = { +loopCount = simulateRunResolveCommand(Seq(host)) hostToRack.get(host) } + + def getRacksForHosts(hosts: List[String]): List[Option[String]] = { +loopCount = simulateRunResolveCommand(hosts) +hosts.map(hostToRack.get) + } + + /** + * This is a simulation of building and executing the resolution command. + * Simulate function `runResolveCommand()` in [[org.apache.hadoop.net.ScriptBasedMapping]]. + * If Seq has 100 elements, it returns 4. If Seq has 1 elements, it returns 1. + * @param args a list of arguments + * @return script execution times + */ + private def simulateRunResolveCommand(args: Seq[String]): Int = { +val maxArgs = 30 // Simulate NET_TOPOLOGY_SCRIPT_NUMBER_ARGS_DEFAULT +var numProcessed = 0 +var loopCount = 0 +while (numProcessed != args.size) { + var start = maxArgs * loopCount + numProcessed = start + while (numProcessed < (start + maxArgs) && numProcessed < args.size) { +numProcessed += 1 + } + loopCount += 1 +} +loopCount Review comment: The reason I added this more complex code to the unit test is that, after this patch is applied, the initialization of `TaskSetManager` no longer invokes `getRackForHost()`, so there is no place to compare the number of `getRackForHost()` invocations against the number of `getRacksForHosts()` invocations. I therefore added this simulator to count how many times the script would be executed.
If this patch is reverted, `assert(FakeRackUtil.loopCount === 4)` will fail. For (3), I will add this kind of test.
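The batch counting that `simulateRunResolveCommand` performs in the diff above can be sketched in Python as follows. This is only an illustrative transliteration, not code from the pull request: the snake_case names and the `max_args` parameter are hypothetical, and the default of 30 is taken from the `maxArgs` value in the diff. It shows why 100 hosts resolve in 4 simulated script runs while a single host needs 1.

```python
def simulate_run_resolve_command(args, max_args=30):
    """Count how many simulated script runs are needed to resolve `args`.

    Hypothetical Python transliteration of the Scala test helper in the
    diff above; each run handles at most `max_args` arguments.
    """
    num_processed = 0
    loop_count = 0
    while num_processed != len(args):
        # Each iteration stands for one script execution over the next batch.
        start = max_args * loop_count
        num_processed = start
        while num_processed < start + max_args and num_processed < len(args):
            num_processed += 1
        loop_count += 1
    return loop_count


hosts = ["host%d" % i for i in range(100)]
batched_runs = simulate_run_resolve_command(hosts)
single_run = simulate_run_resolve_command(["host0"])
```

With these inputs the loop is equivalent to a ceiling division of the number of hosts by `max_args`, which is what the assertion on `loopCount` in the review comment relies on.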
[GitHub] [spark] SparkQA commented on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition
SparkQA commented on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition URL: https://github.com/apache/spark/pull/23964#issuecomment-470824584 **[Test build #103192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103192/testReport)** for PR 23964 at commit [`ec2c5ee`](https://github.com/apache/spark/commit/ec2c5ee0253ca2732a465d24d3df6ce2aecfd703).
[GitHub] [spark] SparkQA commented on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery
SparkQA commented on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery URL: https://github.com/apache/spark/pull/23783#issuecomment-470824585 **[Test build #103193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103193/testReport)** for PR 23783 at commit [`fd0db03`](https://github.com/apache/spark/commit/fd0db03c80edd9334c8079abd6815aefe44094a3).
[GitHub] [spark] AmplabJenkins removed a comment on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition
AmplabJenkins removed a comment on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition URL: https://github.com/apache/spark/pull/23964#issuecomment-470824201 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8637/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition
AmplabJenkins removed a comment on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition URL: https://github.com/apache/spark/pull/23964#issuecomment-470824197 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition
AmplabJenkins commented on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition URL: https://github.com/apache/spark/pull/23964#issuecomment-470824197 Merged build finished. Test PASSed.
[GitHub] [spark] dilipbiswal commented on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery
dilipbiswal commented on issue #23783: [SPARK-26854][SQL] Support ANY/SOME subquery URL: https://github.com/apache/spark/pull/23783#issuecomment-470824260 retest this please
[GitHub] [spark] AmplabJenkins commented on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition
AmplabJenkins commented on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition URL: https://github.com/apache/spark/pull/23964#issuecomment-470824201 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8637/ Test PASSed.
[GitHub] [spark] hehuiyuan commented on a change in pull request #24009: [k8s]Unify the three variables' name : pod name prefix in kubernetes / spark.app.name in spark ui / spark-app-name in pod's ann
hehuiyuan commented on a change in pull request #24009: [k8s]Unify the three variables' name : pod name prefix in kubernetes / spark.app.name in spark ui / spark-app-name in pod's annotations URL: https://github.com/apache/spark/pull/24009#discussion_r263679533 ## File path: core/src/main/scala/org/apache/spark/SparkConf.scala ## @@ -115,9 +115,13 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging with Seria set("spark.master", master) } - /** Set a name for your application. Shown in the Spark web UI. */ + /** Set a name for your application. Shown in the Spark web UI. +* For spark on kubernetes,Unify the three variables' name : Review comment: Thank you for your reply.
[GitHub] [spark] maropu commented on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition
maropu commented on issue #23964: [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition URL: https://github.com/apache/spark/pull/23964#issuecomment-470823410 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases
AmplabJenkins removed a comment on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases URL: https://github.com/apache/spark/pull/24021#issuecomment-470820126 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases
AmplabJenkins removed a comment on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases URL: https://github.com/apache/spark/pull/24021#issuecomment-470819364 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases
AmplabJenkins commented on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases URL: https://github.com/apache/spark/pull/24021#issuecomment-470820210 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases
AmplabJenkins commented on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases URL: https://github.com/apache/spark/pull/24021#issuecomment-470820126 Can one of the admins verify this patch?
[GitHub] [spark] sandeep-katta commented on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases
sandeep-katta commented on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases URL: https://github.com/apache/spark/pull/24021#issuecomment-470819616 cc @HyukjinKwon @ueshin
[GitHub] [spark] AmplabJenkins commented on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases
AmplabJenkins commented on issue #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases URL: https://github.com/apache/spark/pull/24021#issuecomment-470819364 Can one of the admins verify this patch?
[GitHub] [spark] sandeep-katta opened a new pull request #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases
sandeep-katta opened a new pull request #24021: [SPARK-27101][Pyspark]Cleaning up the python testcases
URL: https://github.com/apache/spark/pull/24021
## What changes were proposed in this pull request?
Clean up the test case: drop the database after use.
## How was this patch tested?
Existing UTs.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470818165 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103185/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470818165 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103185/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
SparkQA removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470774374 **[Test build #103185 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103185/testReport)** for PR 23982 at commit [`a873a98`](https://github.com/apache/spark/commit/a873a98010fdb7bd3a3f0f50151659f834a33df3).
[GitHub] [spark] AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470818162 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470818162 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
SparkQA commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470817813 **[Test build #103185 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103185/testReport)** for PR 23982 at commit [`a873a98`](https://github.com/apache/spark/commit/a873a98010fdb7bd3a3f0f50151659f834a33df3).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] dilipbiswal commented on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc
dilipbiswal commented on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc URL: https://github.com/apache/spark/pull/24020#issuecomment-470817258 Looks good to me. cc @maropu
[GitHub] [spark] SparkQA commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
SparkQA commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470816756 **[Test build #103191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103191/testReport)** for PR 23982 at commit [`79fc151`](https://github.com/apache/spark/commit/79fc151d8adae01043c1b2b5b5a4cc474a6bd176).
[GitHub] [spark] AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470816413 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8636/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470816413 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/8636/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470816410 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470816410 Merged build finished. Test PASSed.
[GitHub] [spark] hddong commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
hddong commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline URL: https://github.com/apache/spark/pull/23952#issuecomment-470815927 @HyukjinKwon @felixcheung @vanzin, could you help review this?
[GitHub] [spark] AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470814309 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103184/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470814306 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470814306 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
AmplabJenkins commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470814309 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/103184/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc
AmplabJenkins commented on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc URL: https://github.com/apache/spark/pull/24020#issuecomment-470814012 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
SparkQA removed a comment on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470772941 **[Test build #103184 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103184/testReport)** for PR 23982 at commit [`0a78cb7`](https://github.com/apache/spark/commit/0a78cb7ee8b12d6bd7afa50b9545ff7cc16bc220).
[GitHub] [spark] SparkQA commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface
SparkQA commented on issue #23982: [SPARK-27096][SQL] Reconcile the join types between data frame and sql interface URL: https://github.com/apache/spark/pull/23982#issuecomment-470813956 **[Test build #103184 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/103184/testReport)** for PR 23982 at commit [`0a78cb7`](https://github.com/apache/spark/commit/0a78cb7ee8b12d6bd7afa50b9545ff7cc16bc220).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc
AmplabJenkins removed a comment on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc URL: https://github.com/apache/spark/pull/24020#issuecomment-470813663 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc
AmplabJenkins removed a comment on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc URL: https://github.com/apache/spark/pull/24020#issuecomment-470813566 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc
AmplabJenkins commented on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc URL: https://github.com/apache/spark/pull/24020#issuecomment-470813663 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc
AmplabJenkins commented on issue #24020: [MINOR][SQL]Fix the typo in the spark.sql.extensions conf doc URL: https://github.com/apache/spark/pull/24020#issuecomment-470813566 Can one of the admins verify this patch?
[GitHub] [spark] skambha opened a new pull request #24020: [MINOR][SQL]Fix the typo in the SparkSessionExtensions class name
skambha opened a new pull request #24020: [MINOR][SQL]Fix the typo in the SparkSessionExtensions class name
URL: https://github.com/apache/spark/pull/24020
## What changes were proposed in this pull request?
Fix the typo (missing the s) in the class name (SparkSessionExtensions) in the doc for Spark conf spark.sql.extensions.
## How was this patch tested?
Verified by checking that the configuration doc shows up correctly in spark-shell using `SET -v`.
[GitHub] [spark] LantaoJin commented on a change in pull request #23951: [SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time
LantaoJin commented on a change in pull request #23951: [SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time
URL: https://github.com/apache/spark/pull/23951#discussion_r263671020
## File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ##
@@ -1602,4 +1637,27 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logg
     verify(sched.dagScheduler).taskEnded(manager.tasks(3), Success, result.value(),
       result.accumUpdates, info3)
   }
+
+  test("SPARK-27038: Verify the rack resolving time has been reduced") {
+    sc = new SparkContext("local", "test")
+    for (i <- 1 to 100) {
+      FakeRackUtil.assignHostToRack("host" + i, "rack" + i)
+    }
+    sched = new FakeTaskScheduler(sc,
+      ("execA", "host1"), ("execB", "host2"), ("execC", "host3"))
+    sched.slowRackResolve = true
+    val locations = new ArrayBuffer[Seq[TaskLocation]]()
+    for (i <- 1 to 100) {
+      locations += Seq(TaskLocation("host" + i))
+    }
+    val taskSet = FakeTask.createTaskSet(100, locations: _*)
+    val clock = new ManualClock
+    val manager = new TaskSetManager(sched, taskSet, MAX_TASK_FAILURES, clock = clock)
+    var total = 0
+    for (i <- 1 to 100) {
+      total += manager.getPendingTasksForRack("rack" + i).length
Review comment: Yes. I will add.
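The test in this hunk registers 100 hosts and expects rack resolution to be cheaper than one lookup per host. A rough Scala sketch of the arithmetic behind that expectation follows. It assumes a limit of 30 host arguments per script invocation, the value used for `maxArgs` in the `FakeRackUtil` helper of this PR; the object and method names (`RackResolveCount`, `invocationsForBatch`, `invocationsPerHost`) are hypothetical and are not part of Spark or Hadoop.

```scala
object RackResolveCount {
  // Assumed batch limit, mirroring maxArgs = 30 in the PR's FakeRackUtil helper.
  val MaxArgsPerInvocation = 30

  // Invocations needed when all hosts are handed over in one batched call:
  // the hosts are split into chunks of at most maxArgs (ceiling division).
  def invocationsForBatch(numHosts: Int, maxArgs: Int = MaxArgsPerInvocation): Int =
    if (numHosts <= 0) 0 else (numHosts + maxArgs - 1) / maxArgs

  // Invocations needed when every host is resolved by its own call.
  def invocationsPerHost(numHosts: Int): Int = math.max(numHosts, 0)

  def main(args: Array[String]): Unit = {
    val hosts = (1 to 100).map(i => "host" + i)
    println(s"per-host lookups: ${invocationsPerHost(hosts.size)} invocations")
    println(s"batched lookup:   ${invocationsForBatch(hosts.size)} invocations")
  }
}
```

Under these assumptions a batched lookup of 100 hosts needs 4 invocations instead of 100, which is the kind of reduction the test is meant to guard.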
[GitHub] [spark] LantaoJin commented on a change in pull request #23951: [SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time
LantaoJin commented on a change in pull request #23951: [SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time
URL: https://github.com/apache/spark/pull/23951#discussion_r263670931
## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ##
@@ -184,11 +184,23 @@ private[spark] class TaskSetManager(
     t.epoch = epoch
   }
+  // An array to store preferred location and its task index
+  private val locationWithTaskIndex: ArrayBuffer[(String, Int)] = new ArrayBuffer[(String, Int)]()
+  private val addTaskStartTime = System.nanoTime()
   // Add all our tasks to the pending lists. We do this in reverse order
   // of task index so that tasks with low indices get launched first.
   for (i <- (0 until numTasks).reverse) {
-    addPendingTask(i)
+    addPendingTask(i, true)
   }
+  // Convert preferred location list to rack list in one invocation and zip with the origin index
+  private val rackWithTaskIndex = sched.getRacksForHosts(locationWithTaskIndex.map(_._1).toList)
Review comment:
> The de-duping thing is minor, but I am concerned that the `locationWithTaskIndex` variable is going to be confusing if its left around as a private member variable, even though its only meaningful in this limited context.

Yes, I don't want to do any de-duping here. I will refactor this part to make it more readable.
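The idea in the hunk's last comment, resolving all preferred locations in one call and then pairing each rack with its original task index, can be sketched in isolation as below. This is a minimal illustration, not the PR's code: `BatchRackLookup`, its fixed `hostToRack` map, and the `lookupCalls` counter are hypothetical stand-ins, and dropping hosts with no known rack is an assumption of this sketch.

```scala
object BatchRackLookup {
  // Hypothetical stand-in for the scheduler's rack table.
  private val hostToRack = Map("host1" -> "rackA", "host2" -> "rackB")

  // Counts how many times the (potentially slow) batched lookup is called.
  var lookupCalls = 0

  // Stand-in for a batched lookup: one call resolves every host in the list.
  def getRacksForHosts(hosts: List[String]): List[Option[String]] = {
    lookupCalls += 1
    hosts.map(hostToRack.get)
  }

  // Resolve all hosts with a single lookup call, then zip the racks back
  // with the task indices. Hosts with no known rack are dropped (assumption).
  def racksWithTaskIndex(locationWithTaskIndex: Seq[(String, Int)]): Seq[(String, Int)] = {
    val racks = getRacksForHosts(locationWithTaskIndex.map(_._1).toList)
    racks.zip(locationWithTaskIndex.map(_._2)).collect {
      case (Some(rack), index) => (rack, index)
    }
  }

  def main(args: Array[String]): Unit = {
    val locations = Seq(("host1", 0), ("host2", 1), ("host3", 2), ("host1", 3))
    println(racksWithTaskIndex(locations))
    println(s"lookup calls: $lookupCalls")
  }
}
```

Because `zip` pairs elements by position, the batched lookup has to return exactly one entry per input host, in input order; that is why the stand-in returns `Option` values rather than filtering unknown hosts itself.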
[GitHub] [spark] LantaoJin commented on a change in pull request #23951: [SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time
LantaoJin commented on a change in pull request #23951: [SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time
URL: https://github.com/apache/spark/pull/23951#discussion_r263670454
## File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ##
@@ -69,18 +69,48 @@ class FakeDAGScheduler(sc: SparkContext, taskScheduler: FakeTaskScheduler)
 // Get the rack for a given host
 object FakeRackUtil {
   private val hostToRack = new mutable.HashMap[String, String]()
+  var loopCount = 0
   def cleanUp() {
     hostToRack.clear()
+    loopCount = 0
   }
   def assignHostToRack(host: String, rack: String) {
     hostToRack(host) = rack
   }
   def getRackForHost(host: String): Option[String] = {
+    loopCount = simulateRunResolveCommand(Seq(host))
     hostToRack.get(host)
   }
+
+  def getRacksForHosts(hosts: List[String]): List[Option[String]] = {
+    loopCount = simulateRunResolveCommand(hosts)
+    hosts.map(hostToRack.get)
+  }
+
+  /**
+   * This is a simulation of building and executing the resolution command.
+   * Simulate function `runResolveCommand()` in [[org.apache.hadoop.net.ScriptBasedMapping]].
+   * If Seq has 100 elements, it returns 4. If Seq has 1 elements, it returns 1.
+   * @param args a list of arguments
+   * @return script execution times
+   */
+  private def simulateRunResolveCommand(args: Seq[String]): Int = {
+    val maxArgs = 30 // Simulate NET_TOPOLOGY_SCRIPT_NUMBER_ARGS_DEFAULT
+    var numProcessed = 0
+    var loopCount = 0
+    while (numProcessed != args.size) {
+      var start = maxArgs * loopCount
+      numProcessed = start
+      while (numProcessed < (start + maxArgs) && numProcessed < args.size) {
+        numProcessed += 1
+      }
+      loopCount += 1
+    }
+    loopCount
Review comment:
> Would it be worth also adding a test for (2) somehow? I'm not sure what you could do there without it being tied to the internals of the hadoop logic.
> You could test the instance of the created

This part is already tested in Hadoop, and I just create an instance of `CachedDNSToSwitchMapping`:
```
dnsToSwitchMapping = newInstance match {
  case _: CachedDNSToSwitchMapping => newInstance
  case _ => new CachedDNSToSwitchMapping(newInstance)
}
```
So I don't think a test for the cache needs to be added.
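The wrapping step in the snippet above follows a common pattern: decorate a mapping with a cache unless it is already a cached one. The Scala sketch below illustrates that pattern with simplified stand-ins. `SwitchMapping`, `ScriptMapping`, `CachedMapping`, and `MappingFactory` are hypothetical types written for this example; they are not Hadoop's `DNSToSwitchMapping` or `CachedDNSToSwitchMapping`, whose actual behavior may differ.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a host-to-rack mapping interface.
trait SwitchMapping {
  def resolve(hosts: List[String]): List[String]
}

// Stand-in for an expensive, script-backed mapping; counts its invocations.
class ScriptMapping extends SwitchMapping {
  var calls = 0
  def resolve(hosts: List[String]): List[String] = {
    calls += 1
    hosts.map(h => "/rack-" + h)
  }
}

// Stand-in for a caching decorator: only unseen hosts reach the underlying mapping.
class CachedMapping(underlying: SwitchMapping) extends SwitchMapping {
  private val cache = mutable.HashMap[String, String]()
  def resolve(hosts: List[String]): List[String] = {
    val missing = hosts.filterNot(cache.contains).distinct
    if (missing.nonEmpty) {
      missing.zip(underlying.resolve(missing)).foreach { case (h, r) => cache(h) = r }
    }
    hosts.map(cache)
  }
}

object MappingFactory {
  // Wrap the instance in a cache unless it is already a cached mapping,
  // so that a mapping is never cached twice.
  def withCache(newInstance: SwitchMapping): SwitchMapping = newInstance match {
    case _: CachedMapping => newInstance
    case _ => new CachedMapping(newInstance)
  }
}

object CacheWrapDemo {
  def main(args: Array[String]): Unit = {
    val raw = new ScriptMapping
    val mapping = MappingFactory.withCache(raw)
    mapping.resolve(List("host1", "host2"))
    mapping.resolve(List("host1", "host2")) // served from the cache
    println(s"underlying calls: ${raw.calls}")
  }
}
```

In this sketch the second `resolve` call never reaches the underlying mapping, which is the effect the cache is meant to provide; whether the real Hadoop class needs an extra Spark-side test is the question the review thread settles in the negative.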