[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...
Github user liufengdb commented on the issue: https://github.com/apache/spark/pull/20702 lgtm! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20807: SPARK-23660: Fix exception in yarn cluster mode w...
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20807#discussion_r174027869

--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -496,7 +497,7 @@ private[yarn] class YarnAllocator(
       executorIdCounter += 1
       val executorHostname = container.getNodeId.getHost
       val containerId = container.getId
-      val executorId = executorIdCounter.toString
+      val executorId = (initialExecutorIdCounter + executorIdCounter).toString
--- End diff --

It seems a bit strange to me to "add" the IDs? @vanzin @jerryshao
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88192/ Test FAILed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Merged build finished. Test FAILed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20433

**[Test build #88192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88192/testReport)** for PR 20433 at commit [`1eec819`](https://github.com/apache/spark/commit/1eec81935a0c32de67e7981d74bfb15dbb041917).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19675: [SPARK-14540][BUILD] Support Scala 2.12 closures and Jav...
Github user ShaneDelmore commented on the issue: https://github.com/apache/spark/pull/19675 Does that mean no chance of 2.12 support for ~6 months?
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20802 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1481/ Test PASSed.
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20802 Merged build finished. Test PASSed.
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20802 **[Test build #88199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88199/testReport)** for PR 20802 at commit [`f9aa019`](https://github.com/apache/spark/commit/f9aa019378180d83b372d0934d543ebd8497b616).
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20802 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88190/ Test FAILed.
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20802 Merged build finished. Test FAILed.
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20802

**[Test build #88190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88190/testReport)** for PR 20802 at commit [`d0e724f`](https://github.com/apache/spark/commit/d0e724f030df830268bac727e83a799c127a5dfd).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Merged build finished. Test PASSed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1480/ Test PASSed.
[GitHub] spark issue #20692: [SPARK-23531][SQL] Show attribute type in explain
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20692 @mgaido91 Thanks for your investigation! I am swamped these two weeks. Will get back to you next week. Sorry for the delay.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20433 **[Test build #88198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88198/testReport)** for PR 20433 at commit [`39fe5dc`](https://github.com/apache/spark/commit/39fe5dc179004946b378c250d3a89132a4fad444).
[GitHub] spark issue #20808: [SPARK-23662][SQL] Support selective tests in SQLQueryTe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20808 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1479/ Test PASSed.
[GitHub] spark issue #20808: [SPARK-23662][SQL] Support selective tests in SQLQueryTe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20808 **[Test build #88197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88197/testReport)** for PR 20808 at commit [`093f405`](https://github.com/apache/spark/commit/093f40555ba9dd169ab2fed8f2ecf4f08dd66627).
[GitHub] spark issue #20808: [SPARK-23662][SQL] Support selective tests in SQLQueryTe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20808 Merged build finished. Test PASSed.
[GitHub] spark issue #20808: [SPARK-23662][SQL] Support selective tests in SQLQueryTe...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20808 cc: @gatorsmile
[GitHub] spark pull request #20808: [SPARK-23662][SQL] Support selective tests in SQL...
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/20808

[SPARK-23662][SQL] Support selective tests in SQLQueryTestSuite

## What changes were proposed in this pull request?

This PR supports selective tests in `SQLQueryTestSuite`, e.g.,

```
SPARK_SQL_QUERY_TEST_FILTER=limit.sql,random.sql build/sbt "sql/test-only *SQLQueryTestSuite"
```

## How was this patch tested?

Manually checked.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark RunSelectiveTests

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20808.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20808

commit 093f40555ba9dd169ab2fed8f2ecf4f08dd66627
Author: Takeshi Yamamuro
Date: 2018-03-13T05:23:54Z

    Fix
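The PR text shows only how the filter is passed, not how it is applied. The following is a minimal sketch of one way a comma-separated value such as `limit.sql,random.sql` could select test files. The object `TestFilterSketch` and its helpers `parseFilter` and `selectTests` are hypothetical names for illustration and are not taken from the patch; only the environment variable name comes from the PR.

```scala
import java.util.Locale

object TestFilterSketch {
  // Hypothetical helper: parse a value such as "limit.sql,random.sql".
  // Returns None when the variable is unset (null) or blank.
  def parseFilter(raw: String): Option[Set[String]] =
    Option(raw)
      .map(_.trim)
      .filter(_.nonEmpty)
      .map(_.toLowerCase(Locale.ROOT).split(",").map(_.trim).filter(_.nonEmpty).toSet)

  // Hypothetical helper: keep every file when no filter is set,
  // otherwise keep only the files named in the filter (case-insensitive).
  def selectTests(files: Seq[String], filter: Option[Set[String]]): Seq[String] =
    filter match {
      case Some(names) => files.filter(f => names.contains(f.toLowerCase(Locale.ROOT)))
      case None => files
    }

  def main(args: Array[String]): Unit = {
    val filter = parseFilter(System.getenv("SPARK_SQL_QUERY_TEST_FILTER"))
    // Illustrative file names only.
    val all = Seq("limit.sql", "random.sql", "interval.sql")
    selectTests(all, filter).foreach(println)
  }
}
```

Lower-casing both the filter and the file name keeps the match case-insensitive, which mirrors the `toLowerCase(Locale.ROOT)` call visible in the review diff for `SQLQueryTestSuite`.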
[GitHub] spark issue #20208: [SPARK-23007][SQL][TEST] Add schema evolution test suite...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20208 Before adding the test cases: is schema evolution officially supported? Could you describe it in detail?
[GitHub] spark issue #20803: [SPARK-23653][SQL] Show sql statement in spark SQL UI
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/20803

```bash
cat <<EOF > test.sql
select '\${a}', '\${b}';
EOF

spark-sql --hiveconf a=avalue --hivevar b=bvalue -f test.sql
```

Is the SQL text `select ${a}, ${b}` or `select avalue, bvalue`?
[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20702 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88195/ Test PASSed.
[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20702

**[Test build #88195 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88195/testReport)** for PR 20702 at commit [`93e87f5`](https://github.com/apache/spark/commit/93e87f5f138a758dec8ba5f2d3f888da9a04fb67).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20702 Merged build finished. Test PASSed.
[GitHub] spark pull request #20795: [SPARK-23486]cache the function name from the cat...
Github user kevinyu98 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20795#discussion_r174019434

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1192,11 +1195,23 @@ class Analyzer(
    * @see https://issues.apache.org/jira/browse/SPARK-19737
    */
   object LookupFunctions extends Rule[LogicalPlan] {
-    override def apply(plan: LogicalPlan): LogicalPlan = plan.transformAllExpressions {
-      case f: UnresolvedFunction if !catalog.functionExists(f.name) =>
-        withPosition(f) {
-          throw new NoSuchFunctionException(f.name.database.getOrElse("default"), f.name.funcName)
-        }
+    override def apply(plan: LogicalPlan): LogicalPlan = {
+      val catalogFunctionNameSet = new mutable.HashSet[FunctionIdentifier]()
+      plan.transformAllExpressions {
+        case f: UnresolvedFunction if catalogFunctionNameSet.contains(f.name) => f
+        case f: UnresolvedFunction if catalog.functionExists(f.name) =>
+          catalogFunctionNameSet.add(normalizeFuncName(f.name))
+          f
+        case f: UnresolvedFunction =>
+          withPosition(f) {
+            throw new NoSuchFunctionException(f.name.database.getOrElse("default"),
+              f.name.funcName)
+          }
+      }
+    }
+
+    private def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier = {
+      FunctionIdentifier(name.funcName.toLowerCase(Locale.ROOT), name.database)
--- End diff --

The `database` field in `FunctionIdentifier` is an Option, not a String. Since the identifier is only used in this local cache, I think it is OK not to convert it to the "default" string. I saw that when we do the registerFunction in FunctionRegistry.scala, we didn't put "default" in normalizeFuncName either. What do you think? Thanks.
[GitHub] spark pull request #20705: [SPARK-23553][TESTS] Tests should not assume the ...
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20705#discussion_r174019305

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -526,7 +526,7 @@ object SQLConf {
   val DEFAULT_DATA_SOURCE_NAME = buildConf("spark.sql.sources.default")
     .doc("The default data source to use in input/output.")
     .stringConf
-    .createWithDefault("parquet")
+    .createWithDefault("orc")
--- End diff --

Can you change it back?
[GitHub] spark issue #20796: [SPARK-23649][SQL] Prevent crashes on schema inferring o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20796 **[Test build #88196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88196/testReport)** for PR 20796 at commit [`d6c5f02`](https://github.com/apache/spark/commit/d6c5f02ea1a08513a54ea9f3b30986dd92188b3e).
[GitHub] spark issue #20796: [SPARK-23649][SQL] Prevent crashes on schema inferring o...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20796 add to whitelist
[GitHub] spark issue #20795: [SPARK-23486]cache the function name from the catalog fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20795 Merged build finished. Test PASSed.
[GitHub] spark issue #20795: [SPARK-23486]cache the function name from the catalog fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20795 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88189/ Test PASSed.
[GitHub] spark issue #20795: [SPARK-23486]cache the function name from the catalog fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20795

**[Test build #88189 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88189/testReport)** for PR 20795 at commit [`211abcb`](https://github.com/apache/spark/commit/211abcb979787a22b76d05b47d2f21a98991f702).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class LookupFunctionsSuite extends PlanTest`
   * `class CustomInMemoryCatalog extends InMemoryCatalog`
[GitHub] spark issue #20795: [SPARK-23486]cache the function name from the catalog fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20795 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88188/ Test PASSed.
[GitHub] spark issue #20795: [SPARK-23486]cache the function name from the catalog fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20795 Merged build finished. Test PASSed.
[GitHub] spark issue #20795: [SPARK-23486]cache the function name from the catalog fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20795

**[Test build #88188 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88188/testReport)** for PR 20795 at commit [`99cc3b3`](https://github.com/apache/spark/commit/99cc3b394845d364f8e99de9ba136a2068fa76c6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20807: SPARK-23660: Fix exception in yarn cluster mode when app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20807 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88193/ Test PASSed.
[GitHub] spark issue #20807: SPARK-23660: Fix exception in yarn cluster mode when app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20807 Merged build finished. Test PASSed.
[GitHub] spark issue #20807: SPARK-23660: Fix exception in yarn cluster mode when app...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20807

**[Test build #88193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88193/testReport)** for PR 20807 at commit [`114ac05`](https://github.com/apache/spark/commit/114ac05102c9d563c922447423ec8445bb37e9ef).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #20795: [SPARK-23486]cache the function name from the cat...
Github user kevinyu98 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20795#discussion_r174017514

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1192,11 +1195,23 @@ class Analyzer(
    * @see https://issues.apache.org/jira/browse/SPARK-19737
    */
   object LookupFunctions extends Rule[LogicalPlan] {
-    override def apply(plan: LogicalPlan): LogicalPlan = plan.transformAllExpressions {
-      case f: UnresolvedFunction if !catalog.functionExists(f.name) =>
-        withPosition(f) {
-          throw new NoSuchFunctionException(f.name.database.getOrElse("default"), f.name.funcName)
-        }
+    override def apply(plan: LogicalPlan): LogicalPlan = {
+      val catalogFunctionNameSet = new mutable.HashSet[FunctionIdentifier]()
+      plan.transformAllExpressions {
+        case f: UnresolvedFunction if catalogFunctionNameSet.contains(f.name) => f
--- End diff --

I will normalize the lookup too. Thanks.
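The point about normalizing the lookup can be illustrated without Spark. In the diff, names are normalized when added to the set but not when checked with `contains`, so `UPPER` and `upper` would miss each other in the cache. The sketch below applies the same normalization on both paths. `FunctionIdentifier` here is a simplified stand-in written for this example, and `CachedLookup` and `catalogCalls` are hypothetical names, not part of the patch.

```scala
import java.util.Locale
import scala.collection.mutable

object LookupCacheSketch {
  // Simplified stand-in for Spark's FunctionIdentifier (assumption for this sketch).
  final case class FunctionIdentifier(funcName: String, database: Option[String] = None)

  // Lower-case the function name so "Upper" and "UPPER" share one cache entry.
  def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier =
    FunctionIdentifier(name.funcName.toLowerCase(Locale.ROOT), name.database)

  // Wraps an expensive existence check (for example a metastore call) with a local cache.
  final class CachedLookup(existsInCatalog: FunctionIdentifier => Boolean) {
    private val known = mutable.HashSet.empty[FunctionIdentifier]
    var catalogCalls = 0 // counts how often the expensive check actually runs

    def exists(name: FunctionIdentifier): Boolean = {
      val key = normalizeFuncName(name) // same normalization for lookup and insert
      if (known.contains(key)) {
        true
      } else {
        catalogCalls += 1
        val found = existsInCatalog(name)
        if (found) known += key
        found
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val lookup = new CachedLookup(id => id.funcName.equalsIgnoreCase("upper"))
    println(lookup.exists(FunctionIdentifier("upper")))
    println(lookup.exists(FunctionIdentifier("UPPER")))
    println(lookup.catalogCalls)
  }
}
```

Only successful lookups are cached in this sketch, so a missing function still reaches the catalog each time, which matches the diff's behavior of throwing on the first miss.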
[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20702 **[Test build #88195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88195/testReport)** for PR 20702 at commit [`93e87f5`](https://github.com/apache/spark/commit/93e87f5f138a758dec8ba5f2d3f888da9a04fb67).
[GitHub] spark pull request #20800: [SPARK-23627][SQL] Provide isEmpty in DataSet
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20800#discussion_r174016939

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -511,6 +511,14 @@ class Dataset[T] private[sql](
    */
   def isLocal: Boolean = logicalPlan.isInstanceOf[LocalRelation]

+  /**
+   * Returns true if the `DataSet` is empty
--- End diff --

Dataset
[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20702 ok to test
[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20702 cc @liufengdb
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20433 I added related tests from hive `clientpositive` in `interval.sql`.
[GitHub] spark pull request #20433: [SPARK-23264][SQL] Support interval values withou...
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20433#discussion_r174016181

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -83,6 +83,15 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
   private val regenerateGoldenFiles: Boolean = System.getenv("SPARK_GENERATE_GOLDEN_FILES") == "1"

+  private val testFilter: Option[String] = {
+    val testFilter = System.getenv("SPARK_SQL_QUERY_TEST_FILTER")
+    if (testFilter != null && !testFilter.isEmpty) {
+      Some(testFilter.toLowerCase(Locale.ROOT))
+    } else {
+      None
+    }
+  }
+
--- End diff --

OK, I'll do it later.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20433 [This](https://github.com/apache/spark/pull/20433#issuecomment-370726439) sounds good to me
[GitHub] spark pull request #20433: [SPARK-23264][SQL] Support interval values withou...
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20433#discussion_r174016110

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -83,6 +83,15 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
   private val regenerateGoldenFiles: Boolean = System.getenv("SPARK_GENERATE_GOLDEN_FILES") == "1"

+  private val testFilter: Option[String] = {
+    val testFilter = System.getenv("SPARK_SQL_QUERY_TEST_FILTER")
+    if (testFilter != null && !testFilter.isEmpty) {
+      Some(testFilter.toLowerCase(Locale.ROOT))
+    } else {
+      None
+    }
+  }
+
--- End diff --

Let us create a separate PR?
[GitHub] spark issue #20806: [SPARK-23661][SQL] Implement treeAggregate on Dataset AP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20806 **[Test build #88194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88194/testReport)** for PR 20806 at commit [`a254d15`](https://github.com/apache/spark/commit/a254d1501c0119b4881c0443f28c263f0c9dec0e).
[GitHub] spark issue #20807: SPARK-23660: Fix exception in yarn cluster mode when app...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20807 **[Test build #88193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88193/testReport)** for PR 20807 at commit [`114ac05`](https://github.com/apache/spark/commit/114ac05102c9d563c922447423ec8445bb37e9ef).
[GitHub] spark issue #20806: [SPARK-23661][SQL] Implement treeAggregate on Dataset AP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20806 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1478/ Test PASSed.
[GitHub] spark issue #20806: [SPARK-23661][SQL] Implement treeAggregate on Dataset AP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20806 Merged build finished. Test PASSed.
[GitHub] spark issue #20763: [SPARK-23523] [SQL] [BACKPORT-2.3] Fix the incorrect res...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20763 Thanks! Merged to 2.3
[GitHub] spark issue #20807: SPARK-23660: Fix exception in yarn cluster mode when app...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20807 Can one of the admins verify this patch?
[GitHub] spark pull request #20807: SPARK-23660: Fix exception in yarn cluster mode w...
GitHub user gaborgsomogyi opened a pull request:

https://github.com/apache/spark/pull/20807

SPARK-23660: Fix exception in yarn cluster mode when application ended fast

## What changes were proposed in this pull request?

Yarn throws the following exception in cluster mode when the application is really small:

```
18/03/07 23:34:22 WARN netty.NettyRpcEnv: Ignored failure: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7c974942 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@1eea9d2d[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
18/03/07 23:34:22 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
    at org.apache.spark.deploy.yarn.YarnAllocator.<init>(YarnAllocator.scala:102)
    at org.apache.spark.deploy.yarn.YarnRMClient.register(YarnRMClient.scala:77)
    at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:450)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:493)
    at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:345)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:810)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:809)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:834)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: org.apache.spark.rpc.RpcEnvStoppedException: RpcEnv already stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:158)
    at org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:135)
    at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:229)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:523)
    at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:91)
    ... 17 more
18/03/07 23:34:22 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: org.apache.spark.SparkException: Exception thrown in awaitResult: )
```

Example application:

```
object ExampleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ExampleApp")
    val sc = new SparkContext(conf)
    try {
      // Do nothing
    } finally {
      sc.stop()
    }
  }
}
```

This PR makes `initialExecutorIdCounter` lazy. This way `YarnAllocator` can be instantiated even if the driver has already ended.

## How was this patch tested?

Automated: Additional unit test added
Manual: Application submitted into small cluster

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gaborgsomogyi/spark SPARK-23660

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20807.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20807

commit 114ac05102c9d563c922447423ec8445bb37e9ef
Author: Gabor Somogyi
Date: 2018-03-13T04:23:59Z

    SPARK-23660: Fix exception in yarn cluster mode when application ended fast
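The effect of making the field lazy can be sketched in isolation. The snippet below is a minimal illustration, not the patch itself: `DriverStub`, `EagerAllocator` and `LazyAllocator` are hypothetical stand-ins, and the stub only imitates an RPC call that fails once the driver has stopped.

```scala
// Hypothetical stand-in for the driver endpoint queried during construction.
final class DriverStub(val stopped: Boolean) {
  def askLastAllocatedExecutorId(): Int = {
    if (stopped) {
      throw new IllegalStateException("RpcEnv already stopped.")
    } else {
      0
    }
  }
}

// Eager field: the call runs inside the constructor, so construction
// itself fails when the driver has already stopped.
final class EagerAllocator(driver: DriverStub) {
  val initialExecutorIdCounter: Int = driver.askLastAllocatedExecutorId()
}

// Lazy field: the call is deferred until the field is first read.
final class LazyAllocator(driver: DriverStub) {
  lazy val initialExecutorIdCounter: Int = driver.askLastAllocatedExecutorId()
}

object LazyInitDemo {
  def main(args: Array[String]): Unit = {
    val stoppedDriver = new DriverStub(stopped = true)
    val eagerFailed = scala.util.Try(new EagerAllocator(stoppedDriver)).isFailure
    val lazyBuilt = scala.util.Try(new LazyAllocator(stoppedDriver)).isSuccess
    println(s"eager construction failed: $eagerFailed, lazy construction succeeded: $lazyBuilt")
  }
}
```

Note that the lazy variant only postpones the call: reading the field after the driver has stopped would still throw, so this helps only when the value is never needed on that path.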
[GitHub] spark issue #20806: [SPARK-23661][SQL] Implement treeAggregate on Dataset AP...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20806 cc @dbtsai @cloud-fan
[GitHub] spark pull request #20806: [SPARK-23661][SQL] Implement treeAggregate on Dat...
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/20806

[SPARK-23661][SQL] Implement treeAggregate on Dataset API

## What changes were proposed in this pull request?

Many algorithms in MLlib have still not migrated their internal computing workload from RDD to DataFrame. `treeAggregate` is one of the obstacles we need to address in order to complete the migration.

This patch provides `treeAggregate` on the Dataset API. For now this should be a private API used by the ML component.

The approach to tree aggregation imitates RDD's `treeAggregate`.

## How was this patch tested?

Added unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 treeAggregate

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20806.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20806

commit a254d1501c0119b4881c0443f28c263f0c9dec0e
Author: Liang-Chi Hsieh
Date: 2018-03-12T08:41:20Z

    Implement treeAggregate on Dataset API.
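For readers unfamiliar with the idea, tree aggregation can be sketched on plain Scala collections. This is a simplified illustration under stated assumptions, not the code in the PR: partitions are modelled as nested `Seq`s, the fan-in is fixed at 2, and `TreeAggregateSketch` is a hypothetical name. RDD's `treeAggregate` takes a `depth` parameter instead.

```scala
object TreeAggregateSketch {
  // Folds each partition with seqOp, then merges the partial results
  // pairwise, level by level, rather than combining them all in one step.
  def treeAggregate[T, U](partitions: Seq[Seq[T]], zero: U)(
      seqOp: (U, T) => U, combOp: (U, U) => U): U = {
    var partials: Vector[U] = partitions.map(_.foldLeft(zero)(seqOp)).toVector
    while (partials.size > 1) {
      // Each level halves the number of partial results.
      partials = partials.grouped(2).map(_.reduce(combOp)).toVector
    }
    partials.headOption.getOrElse(zero)
  }

  def main(args: Array[String]): Unit = {
    val parts = Seq(Seq(1, 2), Seq(3), Seq(4, 5, 6), Seq.empty[Int])
    println(treeAggregate(parts, 0)(_ + _, _ + _))
  }
}
```

In a distributed setting the point of the intermediate levels is to spread the merge work over executors instead of sending every partial result to the driver at once.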
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1477/ Test PASSed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20433 **[Test build #88192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88192/testReport)** for PR 20433 at commit [`1eec819`](https://github.com/apache/spark/commit/1eec81935a0c32de67e7981d74bfb15dbb041917).
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Merged build finished. Test PASSed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Merged build finished. Test FAILed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88191/ Test FAILed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20433 **[Test build #88191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88191/testReport)** for PR 20433 at commit [`c0710d6`](https://github.com/apache/spark/commit/c0710d6967caf1e3acc18201ecf54dc3bc98def6). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #20433: [SPARK-23264][SQL] Support interval values withou...
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20433#discussion_r174011447

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -83,6 +83,15 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
       private val regenerateGoldenFiles: Boolean = System.getenv("SPARK_GENERATE_GOLDEN_FILES") == "1"
    +  private val testFilter: Option[String] = {
    +    val testFilter = System.getenv("SPARK_SQL_QUERY_TEST_FILTER")
    +    if (testFilter != null && !testFilter.isEmpty) {
    +      Some(testFilter.toLowerCase(Locale.ROOT))
    +    } else {
    +      None
    +    }
    +  }
    +
    --- End diff --

This is not related to this PR, but I think it is useful to be able to run tests selectively in `SQLQueryTestSuite` (because the number of tests there has grown recently...). If possible, could we add this feature in a separate PR? Otherwise, I'll drop this.
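The selective-run idea in the diff above can be sketched as a self-contained snippet. The diff only shows how the filter is read; how it is applied is not shown, so the `select` function below (a case-insensitive substring match) is an assumption, and `TestFilterSketch` is a hypothetical name.

```scala
import java.util.Locale

object TestFilterSketch {
  // Reads an optional filter; a null or empty value means "run everything".
  def readFilter(env: String => String, key: String): Option[String] =
    Option(env(key)).filter(_.nonEmpty).map(_.toLowerCase(Locale.ROOT))

  // Assumed matching rule: keep test names that contain the filter,
  // ignoring case. The real suite may apply the filter differently.
  def select(testNames: Seq[String], filter: Option[String]): Seq[String] =
    filter match {
      case Some(f) => testNames.filter(_.toLowerCase(Locale.ROOT).contains(f))
      case None => testNames
    }

  def main(args: Array[String]): Unit = {
    val filter = readFilter(key => System.getenv(key), "SPARK_SQL_QUERY_TEST_FILTER")
    println(select(Seq("interval.sql", "literals.sql"), filter))
  }
}
```

Wrapping the nullable `System.getenv` result in `Option` expresses the same null and empty checks as the `if`/`else` in the diff.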
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1476/ Test PASSed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20433 Merged build finished. Test PASSed.
[GitHub] spark issue #20433: [SPARK-23264][SQL] Support interval values without INTER...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20433 **[Test build #88191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88191/testReport)** for PR 20433 at commit [`c0710d6`](https://github.com/apache/spark/commit/c0710d6967caf1e3acc18201ecf54dc3bc98def6).
[GitHub] spark issue #20744: [SPARK-23608][CORE][WebUI] Add synchronization in SHS be...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20744 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88187/ Test PASSed.
[GitHub] spark issue #20744: [SPARK-23608][CORE][WebUI] Add synchronization in SHS be...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20744 Merged build finished. Test PASSed.
[GitHub] spark issue #20744: [SPARK-23608][CORE][WebUI] Add synchronization in SHS be...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20744 **[Test build #88187 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88187/testReport)** for PR 20744 at commit [`cd7e1f6`](https://github.com/apache/spark/commit/cd7e1f63e6f6614ed3efcc70df53cde41ffb6ff2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20669 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1459/
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1475/ Test PASSed.
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20669 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1459/
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Merged build finished. Test PASSed.
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user ifilonenko commented on the issue:

https://github.com/apache/spark/pull/20669

Newest push passes all tests (with this merged I will then merge in [this](https://github.com/apache-spark-on-k8s/spark-integration/pull/42/files))

```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- Run SparkPi with a test secret mounted into the driver and executor pods
- Run FileCheck using a Remote Data File
Run completed in 2 minutes, 37 seconds.
Total number of tests run: 7
Suites: completed 2, aborted 0
Tests: succeeded 7, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

I welcome the community's opinion on the strategy for passing `spark.driver.extraJavaOptions` to the driver, as I am currently pointing `SPARK_CONF_DIR` at the JAVA_PROPERTIES file. Open to any better suggestions.
[GitHub] spark issue #20803: [SPARK-23653][SQL] Show sql statement in spark SQL UI
Github user LantaoJin commented on the issue:

https://github.com/apache/spark/pull/20803

> What if this SQL statement contains --hiveconf or --hivevar?

What does this mean? Can you give an example?
[GitHub] spark issue #20803: [SPARK-23653][SQL] Show sql statement in spark SQL UI
Github user LantaoJin commented on the issue: https://github.com/apache/spark/pull/20803 @cloud-fan One SQL execution has only one SQL statement, no matter how many jobs it triggers.
[GitHub] spark pull request #20800: [SPARK-23627][SQL] Provide isEmpty in DataSet
Github user goungoun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20800#discussion_r174002184

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -511,6 +511,12 @@ class Dataset[T] private[sql](
        */
       def isLocal: Boolean = logicalPlan.isInstanceOf[LocalRelation]
    +  /**
    +   * Returns true if the `DataSet` is empty
    +   *
    --- End diff --

@mgaido91 Thanks, I modified the comment.
[GitHub] spark pull request #20797: [SPARK-23583][SQL] Invoke should support interpre...
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20797#discussion_r174001998

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -266,8 +266,26 @@ case class Invoke(
       override def nullable: Boolean = targetObject.nullable || needNullCheck || returnNullable
       override def children: Seq[Expression] = targetObject +: arguments
    -  override def eval(input: InternalRow): Any =
    -    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
    +  override def eval(input: InternalRow): Any = {
    +    val obj = targetObject.eval(input)
    +    val args = arguments.map(e => e.eval(input).asInstanceOf[Object])
    +    val argClasses = CallMethodViaReflection.expressionJavaClasses(arguments)
    +    val method = obj.getClass.getDeclaredMethod(functionName, argClasses : _*)
    +    if (needNullCheck && args.exists(_ == null)) {
    +      // return null if one of arguments is null
    +      null
    +    } else {
    +      val ret = method.invoke(obj, args: _*)
    +
    +      if (CodeGenerator.defaultValue(dataType) == "null") {
    +        ret
    +      } else {
    +        // cast a primitive value using Boxed class
    +        val boxedClass = CallMethodViaReflection.typeBoxedJavaMapping(dataType)
    +        boxedClass.cast(ret)
    --- End diff --

The point is where we want the cast exception to be raised. With this code, the exception occurs in this class. Without it, the exception could occur anywhere later in the code.
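The trade-off discussed here, failing at the call site rather than somewhere downstream, can be shown with plain Java reflection. This is a minimal sketch and not Spark code: `ReflectiveInvokeSketch` and its `invoke` helper are hypothetical, and it uses `getMethod` (public methods only) where the diff uses `getDeclaredMethod`.

```scala
object ReflectiveInvokeSketch {
  // Looks up a public method by name and argument classes, invokes it,
  // and casts the result to the expected boxed class immediately, so a
  // type mismatch surfaces here as a ClassCastException.
  def invoke[R](obj: AnyRef, name: String, args: Seq[AnyRef],
      argClasses: Seq[Class[_]], boxedResult: Class[R]): R = {
    val method = obj.getClass.getMethod(name, argClasses: _*)
    boxedResult.cast(method.invoke(obj, args: _*))
  }

  def main(args: Array[String]): Unit = {
    // String.length() returns an int, which reflection hands back boxed.
    val len = invoke("spark", "length", Nil, Nil, classOf[java.lang.Integer])
    println(len)
  }
}
```

Without the `cast`, a wrongly typed result would be returned as a plain object and would only fail later, wherever it is first used as the expected type.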
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20802 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1474/ Test PASSed.
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20802 **[Test build #88190 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88190/testReport)** for PR 20802 at commit [`d0e724f`](https://github.com/apache/spark/commit/d0e724f030df830268bac727e83a799c127a5dfd).
[GitHub] spark issue #20802: [SPARK-23651][core]Add a check for host name
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20802 Merged build finished. Test PASSed.
[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...
Github user zuotingbing commented on the issue: https://github.com/apache/spark/pull/20702 @cloud-fan @felixcheung would you please review this? Thanks!
[GitHub] spark pull request #20686: [SPARK-22915][MLlib] Streaming tests for spark.ml...
Github user attilapiros commented on a diff in the pull request:

https://github.com/apache/spark/pull/20686#discussion_r173999280

    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala ---
    @@ -86,16 +94,19 @@ class RFormulaSuite extends MLTest with DefaultReadWriteTest {
         }
       }
    -  test("label column already exists but is not numeric type") {
    +  ignore("label column already exists but is not numeric type") {
    --- End diff --

Thanks, this is a very good catch.
[GitHub] spark pull request #20686: [SPARK-22915][MLlib] Streaming tests for spark.ml...
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20686#discussion_r173999125

    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala ---
    @@ -58,14 +57,16 @@ class VectorAssemblerSuite
         assert(v2.isInstanceOf[DenseVector])
       }
    -  test("VectorAssembler") {
    +  ignore("VectorAssembler") {
    --- End diff --

@attilapiros You need to revert the code here and keep the old `VectorAssembler` test suite. `VectorAssembler` does not support streaming mode unless you pipeline a `VectorSizeHint` before it.
[GitHub] spark pull request #20797: [SPARK-23583][SQL] Invoke should support interpre...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20797#discussion_r173998493

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -266,8 +266,26 @@ case class Invoke(
       override def nullable: Boolean = targetObject.nullable || needNullCheck || returnNullable
       override def children: Seq[Expression] = targetObject +: arguments
    -  override def eval(input: InternalRow): Any =
    -    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
    +  override def eval(input: InternalRow): Any = {
    +    val obj = targetObject.eval(input)
    +    val args = arguments.map(e => e.eval(input).asInstanceOf[Object])
    +    val argClasses = CallMethodViaReflection.expressionJavaClasses(arguments)
    +    val method = obj.getClass.getDeclaredMethod(functionName, argClasses : _*)
    +    if (needNullCheck && args.exists(_ == null)) {
    +      // return null if one of arguments is null
    +      null
    +    } else {
    +      val ret = method.invoke(obj, args: _*)
    +
    +      if (CodeGenerator.defaultValue(dataType) == "null") {
    +        ret
    +      } else {
    +        // cast a primitive value using Boxed class
    +        val boxedClass = CallMethodViaReflection.typeBoxedJavaMapping(dataType)
    +        boxedClass.cast(ret)
    --- End diff --

For interpreted execution, I think it is less meaningful to check that. cc @hvanhovell
[GitHub] spark issue #20781: [SPARK-23637][YARN]Yarn might allocate more resource if ...
Github user jinxing64 commented on the issue:

https://github.com/apache/spark/pull/20781

@vanzin Thanks for the review.

1. I spent some time on this but could not find the reason why the same executor is killed multiple times, and I cannot reproduce it either.
2. I found that the same completed container can be processed multiple times. It happens now and then. It seems YARN does not guarantee that a completed container is returned in only one response (https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L268)
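If completed containers can indeed be reported more than once, one generic way to make their handling idempotent is to remember which IDs were already processed. The sketch below is a hypothetical illustration of that idea only; it is not taken from the PR, `CompletionTracker` is an invented name, and container IDs are modelled as plain strings.

```scala
import scala.collection.mutable

// Remembers container IDs that were already handled so that a repeated
// report of the same completed container is ignored.
final class CompletionTracker {
  private val processed = mutable.HashSet.empty[String]

  // Returns only the IDs that have not been seen in an earlier report.
  // HashSet.add returns true only when the element was newly inserted.
  def newlyCompleted(reported: Seq[String]): Seq[String] =
    reported.filter(id => processed.add(id))
}

object CompletionTrackerDemo {
  def main(args: Array[String]): Unit = {
    val tracker = new CompletionTracker
    println(tracker.newlyCompleted(Seq("container_1", "container_2")))
    println(tracker.newlyCompleted(Seq("container_2", "container_3")))
  }
}
```

A real allocator would also need to bound or prune such a set, which this sketch leaves out.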
[GitHub] spark pull request #20686: [SPARK-22915][MLlib] Streaming tests for spark.ml...
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20686#discussion_r173998034

    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
    @@ -299,18 +310,17 @@ class StringIndexerSuite
           .setInputCol("label")
           .setOutputCol("labelIndex")
    -    val expected = Seq(Set((0, 0.0), (1, 0.0), (2, 2.0), (3, 1.0), (4, 1.0), (5, 0.0)),
    -      Set((0, 2.0), (1, 2.0), (2, 0.0), (3, 1.0), (4, 1.0), (5, 2.0)),
    -      Set((0, 1.0), (1, 1.0), (2, 0.0), (3, 2.0), (4, 2.0), (5, 1.0)),
    -      Set((0, 1.0), (1, 1.0), (2, 2.0), (3, 0.0), (4, 0.0), (5, 1.0)))
    +    val expected = Seq(Seq((0, 0.0), (1, 0.0), (2, 2.0), (3, 1.0), (4, 1.0), (5, 0.0)),
    --- End diff --

I confirmed this with @cloud-fan. If the test uses the pattern:

```
Seq(...).toDF().select(...).collect()
```

it will use `LocalRelation` and will always use one partition for the computation, so the output row order stays the same as in the input seq. Many other test cases seem to rely on the same pattern.
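The reason the switch from `Set` to `Seq` needs this ordering guarantee can be seen with plain collections: comparing as sets ignores row order, comparing as sequences does not. A minimal sketch, with `OrderSensitivitySketch` as a hypothetical name and the first three expected pairs from the diff as sample data:

```scala
object OrderSensitivitySketch {
  val rows: Seq[(Int, Double)] = Seq((0, 0.0), (1, 0.0), (2, 2.0))
  // Same rows, different order, as could happen after a shuffle.
  val reordered: Seq[(Int, Double)] = rows.reverse

  def main(args: Array[String]): Unit = {
    println(rows.toSet == reordered.toSet) // true: sets ignore order
    println(rows == reordered)             // false: sequences compare element by element
  }
}
```

So an expectation written as `Seq` passes only if the collected rows come back in input order, which is the property confirmed in the comment above.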
[GitHub] spark pull request #20795: [SPARK-23486]cache the function name from the cat...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20795#discussion_r173997939

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1192,11 +1195,23 @@ class Analyzer(
        * @see https://issues.apache.org/jira/browse/SPARK-19737
        */
       object LookupFunctions extends Rule[LogicalPlan] {
    -    override def apply(plan: LogicalPlan): LogicalPlan = plan.transformAllExpressions {
    -      case f: UnresolvedFunction if !catalog.functionExists(f.name) =>
    -        withPosition(f) {
    -          throw new NoSuchFunctionException(f.name.database.getOrElse("default"), f.name.funcName)
    -        }
    +    override def apply(plan: LogicalPlan): LogicalPlan = {
    +      val catalogFunctionNameSet = new mutable.HashSet[FunctionIdentifier]()
    +      plan.transformAllExpressions {
    +        case f: UnresolvedFunction if catalogFunctionNameSet.contains(f.name) => f
    +        case f: UnresolvedFunction if catalog.functionExists(f.name) =>
    +          catalogFunctionNameSet.add(normalizeFuncName(f.name))
    +          f
    +        case f: UnresolvedFunction =>
    +          withPosition(f) {
    +            throw new NoSuchFunctionException(f.name.database.getOrElse("default"),
    +              f.name.funcName)
    +          }
    +      }
    +    }
    +
    +    private def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier = {
    +      FunctionIdentifier(name.funcName.toLowerCase(Locale.ROOT), name.database)
    --- End diff --

`name.database.getOrElse("default")`?
[GitHub] spark pull request #20795: [SPARK-23486]cache the function name from the cat...
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20795#discussion_r173997911

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1192,11 +1195,23 @@ class Analyzer(
        * @see https://issues.apache.org/jira/browse/SPARK-19737
        */
       object LookupFunctions extends Rule[LogicalPlan] {
    -    override def apply(plan: LogicalPlan): LogicalPlan = plan.transformAllExpressions {
    -      case f: UnresolvedFunction if !catalog.functionExists(f.name) =>
    -        withPosition(f) {
    -          throw new NoSuchFunctionException(f.name.database.getOrElse("default"), f.name.funcName)
    -        }
    +    override def apply(plan: LogicalPlan): LogicalPlan = {
    +      val catalogFunctionNameSet = new mutable.HashSet[FunctionIdentifier]()
    +      plan.transformAllExpressions {
    +        case f: UnresolvedFunction if catalogFunctionNameSet.contains(f.name) => f
    --- End diff --

Normalize the `FunctionIdentifier` when looking it up too?
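The concern raised here is that the cache stores normalized names but is queried with the raw name, so a differently cased reference misses the cache. A minimal sketch of a cache that normalizes on both insert and lookup follows. `FuncId` and `FunctionLookupCache` are hypothetical stand-ins for `FunctionIdentifier` and the rule's local set, and the `catalogCalls` counter exists only to make the effect observable.

```scala
import java.util.Locale
import scala.collection.mutable

// Hypothetical local stand-in for Spark's FunctionIdentifier.
final case class FuncId(funcName: String, database: Option[String] = None)

final class FunctionLookupCache(existsInCatalog: FuncId => Boolean) {
  private val known = mutable.HashSet.empty[FuncId]
  var catalogCalls = 0

  private def normalize(id: FuncId): FuncId =
    FuncId(id.funcName.toLowerCase(Locale.ROOT), id.database)

  // The same normalized key is used for the lookup and for the insert.
  def exists(id: FuncId): Boolean = {
    val key = normalize(id)
    if (known.contains(key)) {
      true
    } else {
      catalogCalls += 1
      val found = existsInCatalog(id)
      if (found) {
        known += key
      }
      found
    }
  }
}

object FunctionLookupCacheDemo {
  def main(args: Array[String]): Unit = {
    val cache = new FunctionLookupCache(id => id.funcName.equalsIgnoreCase("upper"))
    cache.exists(FuncId("UPPER"))
    cache.exists(FuncId("upper"))
    println(cache.catalogCalls)
  }
}
```

The sketch leaves `database` as `None` when unset; whether it should be defaulted, as the other review comment on this diff suggests, is a separate question.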
[GitHub] spark issue #20795: [SPARK-23486]cache the function name from the catalog fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20795 **[Test build #88189 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88189/testReport)** for PR 20795 at commit [`211abcb`](https://github.com/apache/spark/commit/211abcb979787a22b76d05b47d2f21a98991f702).
[GitHub] spark issue #20795: [SPARK-23486]cache the function name from the catalog fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20795 **[Test build #88188 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88188/testReport)** for PR 20795 at commit [`99cc3b3`](https://github.com/apache/spark/commit/99cc3b394845d364f8e99de9ba136a2068fa76c6).
[GitHub] spark issue #20763: [SPARK-23523] [SQL] [BACKPORT-2.3] Fix the incorrect res...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20763 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88186/ Test PASSed.
[GitHub] spark issue #20763: [SPARK-23523] [SQL] [BACKPORT-2.3] Fix the incorrect res...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20763 Merged build finished. Test PASSed.
[GitHub] spark issue #20763: [SPARK-23523] [SQL] [BACKPORT-2.3] Fix the incorrect res...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20763 **[Test build #88186 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88186/testReport)** for PR 20763 at commit [`c0ac5ef`](https://github.com/apache/spark/commit/c0ac5ef3a1f00eee44dd50be925f983be852fe96). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #20686: [SPARK-22915][MLlib] Streaming tests for spark.ml...
Github user WeichenXu123 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20686#discussion_r173995919

    --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala ---
    @@ -299,18 +310,17 @@ class StringIndexerSuite
           .setInputCol("label")
           .setOutputCol("labelIndex")
    -    val expected = Seq(Set((0, 0.0), (1, 0.0), (2, 2.0), (3, 1.0), (4, 1.0), (5, 0.0)),
    -      Set((0, 2.0), (1, 2.0), (2, 0.0), (3, 1.0), (4, 1.0), (5, 2.0)),
    -      Set((0, 1.0), (1, 1.0), (2, 0.0), (3, 2.0), (4, 2.0), (5, 1.0)),
    -      Set((0, 1.0), (1, 1.0), (2, 2.0), (3, 0.0), (4, 0.0), (5, 1.0)))
    +    val expected = Seq(Seq((0, 0.0), (1, 0.0), (2, 2.0), (3, 1.0), (4, 1.0), (5, 0.0)),
    --- End diff --

I'd argue that, without shuffling, a DataFrame transformation will keep the row ordering. :)
[GitHub] spark issue #20804: [SPARK-23656][Test] Perform assertions in XXH64Suite.tes...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20804 @hvanhovell could you please review this?
[GitHub] spark issue #20742: [SPARK-23572][docs] Bring "security.md" up to date.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20742 Merged build finished. Test PASSed.
[GitHub] spark issue #20742: [SPARK-23572][docs] Bring "security.md" up to date.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20742 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88184/ Test PASSed.
[GitHub] spark issue #20742: [SPARK-23572][docs] Bring "security.md" up to date.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20742 **[Test build #88184 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88184/testReport)** for PR 20742 at commit [`832d871`](https://github.com/apache/spark/commit/832d87130ede866e4d877ca407e7a621282d4612). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.