[GitHub] [spark] liucht-inspur commented on issue #25994: [SPARK-29323][WEBUI] Add tooltip for The Executors Tab's column names in the Spark history server Page
liucht-inspur commented on issue #25994: [SPARK-29323][WEBUI] Add tooltip for The Executors Tab's column names in the Spark history server Page URL: https://github.com/apache/spark/pull/25994#issuecomment-541521437 > @liucht-inspur Did you handle the tooltip for the Live UI page for Spark? Yeah, exactly. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] dilipbiswal edited a comment on issue #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation
dilipbiswal edited a comment on issue #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation URL: https://github.com/apache/spark/pull/26011#issuecomment-541519499 The idea looks reasonable to me. cc @cloud-fan
[GitHub] [spark] dilipbiswal commented on issue #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation
dilipbiswal commented on issue #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation URL: https://github.com/apache/spark/pull/26011#issuecomment-541519499 looks reasonable to me. cc @cloud-fan
[GitHub] [spark] zhengruifeng commented on issue #26057: [SPARK-29377][PYTHON][ML] Parity between Scala ML tuning and Python ML tuning
zhengruifeng commented on issue #26057: [SPARK-29377][PYTHON][ML] Parity between Scala ML tuning and Python ML tuning URL: https://github.com/apache/spark/pull/26057#issuecomment-541518393 merged to master, thanks all
[GitHub] [spark] zhengruifeng closed pull request #26057: [SPARK-29377][PYTHON][ML] Parity between Scala ML tuning and Python ML tuning
zhengruifeng closed pull request #26057: [SPARK-29377][PYTHON][ML] Parity between Scala ML tuning and Python ML tuning URL: https://github.com/apache/spark/pull/26057
[GitHub] [spark] AbhishekNew commented on issue #25994: [SPARK-29323][WEBUI] Add tooltip for The Executors Tab's column names in the Spark history server Page
AbhishekNew commented on issue #25994: [SPARK-29323][WEBUI] Add tooltip for The Executors Tab's column names in the Spark history server Page URL: https://github.com/apache/spark/pull/25994#issuecomment-541516449 @liucht-inspur Did you handle the tooltip for the Live UI page for Spark?
[GitHub] [spark] dilipbiswal commented on a change in pull request #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation
dilipbiswal commented on a change in pull request #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation URL: https://github.com/apache/spark/pull/26011#discussion_r334335996

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveSortInSubquery.scala

## @@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.expressions.{NamedExpression, PredicateHelper}
+import org.apache.spark.sql.catalyst.expressions.aggregate.{AggregateExpression, OrderIrrelevantAggs}
+import org.apache.spark.sql.catalyst.plans.logical._
+import org.apache.spark.sql.catalyst.rules.Rule
+
+/**
+ * [[Sort]] without [[Limit]] in a subquery is useless. For example,
+ *
+ * {{{
+ *   SELECT * FROM
+ *     (SELECT f1 FROM tbl1 ORDER BY f2) temp1
+ *   JOIN
+ *     (SELECT f3 FROM tbl2) temp2
+ *   ON temp1.f1 = temp2.f3
+ * }}}
+ *
+ * is equivalent to
+ *
+ * {{{
+ *   SELECT * FROM
+ *     (SELECT f1 FROM tbl1) temp1
+ *   JOIN
+ *     (SELECT f3 FROM tbl2) temp2
+ *   ON temp1.f1 = temp2.f3
+ * }}}
+ *
+ * This rule tries to remove this kind of [[Sort]] operator.
+ */
+object RemoveSortInSubquery extends Rule[LogicalPlan] with PredicateHelper {

Review comment: Should the existing RemoveRedundantSorts handle this as well? The reason I ask is, I don't see anything subquery-specific in the new rule.
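The rewrite being reviewed above can be sketched on a toy plan algebra. This is an illustrative sketch only, not Spark's actual `RemoveSortInSubquery` rule: the `Plan` node classes and `remove_useless_sorts` below are hypothetical stand-ins, while the real rule works on Catalyst's `LogicalPlan` and must also account for order-sensitive aggregates.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical toy plan nodes -- stand-ins for Catalyst's LogicalPlan,
# used only to illustrate the idea of the rule under review.
@dataclass(frozen=True)
class Scan:
    table: str

@dataclass(frozen=True)
class Sort:
    child: Any

@dataclass(frozen=True)
class Limit:
    n: int
    child: Any

@dataclass(frozen=True)
class Join:
    left: Any
    right: Any

def remove_useless_sorts(plan):
    """Drop Sort nodes feeding a Join, since a join does not preserve
    its children's ordering -- unless the Sort sits under a Limit,
    where removing it would change which rows are returned."""
    if isinstance(plan, Join):
        return Join(_strip(plan.left), _strip(plan.right))
    return plan

def _strip(plan):
    if isinstance(plan, Sort):
        return _strip(plan.child)   # ordering is irrelevant to the join
    if isinstance(plan, Limit):
        return plan                 # Sort under Limit is semantically significant
    if isinstance(plan, Join):
        return Join(_strip(plan.left), _strip(plan.right))
    return plan
```

For instance, `remove_useless_sorts(Join(Sort(Scan("tbl1")), Scan("tbl2")))` drops the `Sort`, mirroring the ORDER BY elimination in the doc comment, while a `Sort` guarded by a `Limit` is left untouched.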
[GitHub] [spark] LantaoJin commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
LantaoJin commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#discussion_r334335548

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SQLHadoopMapReduceCommitProtocol.scala

## @@ -66,4 +91,37 @@ class SQLHadoopMapReduceCommitProtocol(
     logInfo(s"Using output committer class ${committer.getClass.getCanonicalName}")
     committer
   }
+
+  /**
+   * Called on the driver after a task commits. This can be used to access task commit messages
+   * before the job has finished. These same task commit messages will be passed to commitJob()
+   * if the entire job succeeds.
+   * Override it to check the dynamic partition limitation on the driver side.
+   */
+  override def onTaskCommit(taskCommit: TaskCommitMessage): Unit = {

Review comment: `SQLHadoopMapReduceCommitProtocol.onTaskCommit` overrides `FileCommitProtocol.onTaskCommit` on purpose.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
AmplabJenkins removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541514738 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112005/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
AmplabJenkins removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541514733 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
AmplabJenkins commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541514733 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
AmplabJenkins commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541514738 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112005/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
SparkQA removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541512741 **[Test build #112005 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112005/testReport)** for PR 25894 at commit [`0cb396b`](https://github.com/apache/spark/commit/0cb396bc50d0c3e9a3c6528f99ac58bc5fdc3901).
[GitHub] [spark] SparkQA commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
SparkQA commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541514654 **[Test build #112005 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112005/testReport)** for PR 25894 at commit [`0cb396b`](https://github.com/apache/spark/commit/0cb396bc50d0c3e9a3c6528f99ac58bc5fdc3901). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] LantaoJin commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
LantaoJin commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#discussion_r334335126

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SQLHadoopMapReduceCommitProtocol.scala

## @@ -66,4 +91,37 @@ class SQLHadoopMapReduceCommitProtocol(
     logInfo(s"Using output committer class ${committer.getClass.getCanonicalName}")
     committer
   }
+
+  /**
+   * Called on the driver after a task commits. This can be used to access task commit messages
+   * before the job has finished. These same task commit messages will be passed to commitJob()
+   * if the entire job succeeds.
+   * Override it to check the dynamic partition limitation on the driver side.
+   */
+  override def onTaskCommit(taskCommit: TaskCommitMessage): Unit = {

Review comment: > this implementation completely hides org.apache.spark.internal.io.HadoopMapReduceCommitProtocol#commitTask

No. `onTaskCommit` doesn't hide `commitTask`. Actually, `commitTask` is called on the executor side, but `onTaskCommit` is called on the driver side.
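The executor/driver split described in this exchange can be illustrated with a small sketch. Everything here is hypothetical (the `LimitingCommitProtocol` class, the `max_dynamic_partitions` parameter, and the dict-shaped commit message are illustrative, not Spark's API); the point is only that a `commitTask`-style hook runs per task on executors and returns a message, while an `onTaskCommit`-style hook runs on the driver and can enforce a global limit across all tasks.

```python
# Hypothetical sketch of a driver-side commit hook enforcing a cap on
# dynamic partitions; names and message shapes are made up for illustration.
class TooManyDynamicPartitions(Exception):
    pass

class LimitingCommitProtocol:
    def __init__(self, max_dynamic_partitions):
        self.max_dynamic_partitions = max_dynamic_partitions
        self._partitions = set()  # accumulated on the driver only

    def commit_task(self, task_partitions):
        # Runs on an executor: finalize the task's output and report
        # which partitions it wrote via a commit message.
        return {"partitions": set(task_partitions)}

    def on_task_commit(self, task_commit):
        # Runs on the driver after each task commits: accumulate the
        # partitions seen so far and fail fast once the cap is exceeded.
        self._partitions |= task_commit["partitions"]
        if len(self._partitions) > self.max_dynamic_partitions:
            raise TooManyDynamicPartitions(
                f"{len(self._partitions)} > {self.max_dynamic_partitions}")
```

Because only the driver sees every task's commit message, the global check naturally belongs in `on_task_commit` rather than in the per-task `commit_task`.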
[GitHub] [spark] SparkQA commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
SparkQA commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541512741 **[Test build #112005 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112005/testReport)** for PR 25894 at commit [`0cb396b`](https://github.com/apache/spark/commit/0cb396bc50d0c3e9a3c6528f99ac58bc5fdc3901).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
AmplabJenkins removed a comment on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541512163 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112004/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
SparkQA removed a comment on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541508496 **[Test build #112004 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112004/testReport)** for PR 26107 at commit [`83d87bd`](https://github.com/apache/spark/commit/83d87bdea517ef3fcb8f5bee4a2b8692dbd3dd64).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
AmplabJenkins removed a comment on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541512157 Merged build finished. Test FAILed.
[GitHub] [spark] dilipbiswal commented on a change in pull request #26039: [SPARK-29366][SQL] Subqueries created for DPP are not printed in EXPLAIN FORMATTED
dilipbiswal commented on a change in pull request #26039: [SPARK-29366][SQL] Subqueries created for DPP are not printed in EXPLAIN FORMATTED URL: https://github.com/apache/spark/pull/26039#discussion_r334333472

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/ExplainUtils.scala

## @@ -199,8 +199,8 @@ object ExplainUtils {
       case s: BaseSubqueryExec =>
         subqueries += ((p, e, s))
         getSubqueries(s, subqueries)
+      case _ =>

Review comment: @cloud-fan Got it... I will send a small follow-up. Thank you.
[GitHub] [spark] SparkQA commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
SparkQA commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541512138 **[Test build #112004 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112004/testReport)** for PR 26107 at commit [`83d87bd`](https://github.com/apache/spark/commit/83d87bdea517ef3fcb8f5bee4a2b8692dbd3dd64). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
AmplabJenkins commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541512163 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112004/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
AmplabJenkins commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541512157 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
AmplabJenkins removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541511410 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17017/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
AmplabJenkins removed a comment on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541511406 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
AmplabJenkins commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541511410 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17017/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
AmplabJenkins commented on issue #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#issuecomment-541511406 Merged build finished. Test PASSed.
[GitHub] [spark] dilipbiswal commented on a change in pull request #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
dilipbiswal commented on a change in pull request #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#discussion_r334332626

## File path: docs/sql-getting-started.md

## @@ -346,6 +346,9 @@ For example:
+## Scalar Functions
+(to be filled soon)

Review comment: @gatorsmile created [here](https://issues.apache.org/jira/browse/SPARK-29458)
[GitHub] [spark] gatorsmile commented on a change in pull request #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
gatorsmile commented on a change in pull request #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#discussion_r334331120

## File path: docs/sql-getting-started.md

## @@ -346,6 +346,9 @@ For example:
+## Scalar Functions
+(to be filled soon)

Review comment: Create a JIRA?
[GitHub] [spark] dilipbiswal commented on a change in pull request #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
dilipbiswal commented on a change in pull request #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#discussion_r334331187

## File path: docs/sql-ref-syntax-ddl-create-function.md

## @@ -19,4 +19,154 @@ license: |
 limitations under the License.
 ---

-**This page is under construction**
+### Description
+The `CREATE FUNCTION` statement is used to create a temporary or permanent function
+in Spark. Temporary functions are scoped at a session level, whereas permanent
+functions are created in the persistent catalog and are made available to
+all sessions. The resources specified in the `USING` clause are made available
+to all executors when they are executed for the first time. In addition to the
+SQL interface, Spark allows users to create custom user-defined scalar and
+aggregate functions using the Scala, Python and Java APIs. Please refer to
+[scalar functions](sql-getting-started.html#scalar-functions) and
+[aggregate functions](sql-getting-started#aggregations) for more information.
+
+### Syntax
+{% highlight sql %}
+CREATE [ OR REPLACE ] [ TEMPORARY ] FUNCTION [ IF NOT EXISTS ]
+    function_name AS class_name [ resource_locations ]
+{% endhighlight %}
+
+### Parameters
+
+OR REPLACE
+  If specified, the resources for the function are reloaded. This is mainly useful
+  to pick up any changes made to the implementation of the function. This
+  parameter is mutually exclusive with IF NOT EXISTS; the two cannot
+  be specified together.
+
+TEMPORARY
+  Indicates the scope of the function being created. When TEMPORARY is specified, the
+  created function is valid and visible in the current session. No persistent
+  entry is made in the catalog for this kind of function.
+
+IF NOT EXISTS
+  If specified, creates the function only when it does not exist. The creation
+  of the function succeeds (no error is thrown) if the specified function already
+  exists in the system. This parameter is mutually exclusive with OR REPLACE;
+  the two cannot be specified together.
+
+function_name
+  Specifies the name of the function to be created. The function name may be
+  optionally qualified with a database name.
+  Syntax: [database_name.]function_name
+
+class_name
+  Specifies the name of the class that provides the implementation for the function
+  to be created. The implementing class should extend one of the following base classes:
+  - UDF or UDAF in the org.apache.hadoop.hive.ql.exec package.
+  - AbstractGenericUDAFResolver, GenericUDF, or GenericUDTF in the
+    org.apache.hadoop.hive.ql.udf.generic package.
+  - UserDefinedAggregateFunction in the org.apache.spark.sql.expressions package.
+
+resource_locations
+  Specifies the list of resources that contain the implementation of the function
+  along with its dependencies.
+  Syntax: USING { { (JAR | FILE) resource_uri } , ... }
+
+### Examples
+{% highlight sql %}
+-- 1. Create a simple UDF `SimpleUdf` that adds 10 to the supplied integer value.
+--    import org.apache.hadoop.hive.ql.exec.UDF;
+--    public class SimpleUdf extends UDF {
+--      public int evaluate(int value) {
+--        return value + 10;
+--      }
+--    }
+-- 2. Compile and place it in a jar file called `SimpleUdf.jar` in /tmp.
+
+-- Create a table called `test` and insert two rows.
+CREATE TABLE test(c1 INT);
+INSERT INTO test VALUES (1), (2);
+
+-- Create a permanent function called `simple_udf`.
+CREATE FUNCTION simple_udf AS 'SimpleUdf'
+    USING JAR '/tmp/SimpleUdf.jar';
+
+-- Verify that the function is in the registry.
+SHOW USER FUNCTIONS;
+  +------------------+
+  |          function|
+  +------------------+
+  |default.simple_udf|
+  +------------------+
+
+-- Invoke the function. Every selected value should be incremented by 10.
+SELECT simple_udf(c1) AS function_return_value FROM test;
+  +---------------------+
+  |function_return_value|
+  +---------------------+
+  |                   11|
+  |                   12|
+  +---------------------+
+
+-- Create a temporary function.
+CREATE TEMPORARY FUNCTION simple_temp_udf AS 'SimpleUdf'
+    USING JAR '/tmp/SimpleUdf.jar';
+
+-- Verify that the newly created temporary function is in the registry.
+-- Please note that the temporary function does not have a qualified
+-- database associated with it.
+SHOW USER FUNCTIONS;
+  +------------------+
+  |          function|
+  +------------------+
+  |default.simple_udf|
+  |   simple_temp_udf|
+  +------------------+
+
+-- 1. Modify `SimpleUdf`'s implementation to add 20 to the supplied integer value.
+--    import org.apache.hadoop.hive.ql.exec.UDF;
+--    public class SimpleUdfR extends UDF {
+--      public int evaluate(int value) {
+
[GitHub] [spark] gatorsmile commented on a change in pull request #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference
gatorsmile commented on a change in pull request #25894: [SPARK-28793][DOC][SQL] Document CREATE FUNCTION in SQL Reference URL: https://github.com/apache/spark/pull/25894#discussion_r334330780 ## File path: docs/sql-ref-syntax-ddl-create-function.md ## @@ -19,4 +19,154 @@ license: | limitations under the License. --- -**This page is under construction** +### Description +The `CREATE FUNCTION` statement is used to create a temporary or permanent function +in Spark. Temporary functions are scoped at a session level where as permanent +functions are created in the persistent catalog and are made available to +all sessions. The resources specified in the `USING` clause are made available +to all executors when they are executed for the first time. In addition to the +SQL interface, spark allows users to create custom user defined scalar and +aggregate functions using Scala, Python and Java APIs. Please refer to +[scalar_functions](sql-getting-started.html#scalar-functions) and +[aggregate functions](sql-getting-started#aggregations) for more information. + +### Syntax +{% highlight sql %} +CREATE [ OR REPLACE ] [ TEMPORARY ] FUNCTION [ IF NOT EXISTS ] +function_name AS class_name [ resource_locations ] +{% endhighlight %} + +### Parameters + + OR REPLACE + +If specified, the resources for function are reloaded. This is mainly useful +to pick up any changes made to the implementation of the function. This +parameter is mutually exclusive to IF NOT EXISTS and can not +be specified together. + + TEMPORARY + +Indicates the scope of function being created. When TEMPORARY is specified, the +created function is valid and visible in the current session. No persistent +entry is made in the catalog for these kind of functions. + + IF NOT EXISTS + +If specified, creates the function only when it does not exist. The creation +of function succeeds (no error is thrown), if the specified function already +exists in the system. 
This parameter is mutually exclusive to OR REPLACE +and can not be specified together. + + function_name + +Specifies a name of funnction to be created. The function name may be +optionally qualified with a database name. +Syntax: + +[database_name.]function_name + + + class_name + +Specifies the name of the class that provides the implementation for function to be created. +The implementing class should extend from one of the base classes as follows: + + Should extend UDF or UDAF in org.apache.hadoop.hive.ql.exec package. + Should extend AbstractGenericUDAFResolver, GenericUDF, or + GenericUDTF in org.apache.hadoop.hive.ql.udf.generic package. + Should extend UserDefinedAggregateFunction in org.apache.spark.sql.expressions package. + + + resource_locations + +Specifies the list of resources that contain the implementation of the function +along with its dependencies. +Syntax: + +USING { { (JAR | FILE ) resource_uri} , ...} + + + + +### Examples +{% highlight sql %} +-- 1. Create a simple UDF `SimpleUdf` that adds the supplied integral value by 10. +--import org.apache.hadoop.hive.ql.exec.UDF; +--public class SimpleUdf extends UDF { +-- public int evaluate(int value) { +-- return value + 10; +-- } +--} +-- 2. Compile and place it in a jar file called `SimpleUdf.jar` in /tmp. + +-- Create a table called `test` and insert two rows. +CREATE TABLE test(c1 INT); +INSERT INTO test VALUES (1), (2); + +-- Create a permanent function called `simple_udf`. +CREATE FUNCTION simple_udf AS 'SimpleUdf' + USING JAR '/tmp/SimpleUdf.jar'; + +-- Verify that the function is in the registry. +SHOW USER FUNCTIONS; + +--+ + | function| + +--+ + |default.simple_udf| + +--+ + +-- Invoke the function. Every selected value should be incremented by 10. +SELECT simple_udf(c1) AS function_return_value FROM t1; + +-+ + |function_return_value| + +-+ + | 11| + | 12| + +-+ + +-- Created a temporary function. 
+CREATE TEMPORARY FUNCTION simple_temp_udf AS 'SimpleUdf' + USING JAR '/tmp/SimpleUdf.jar'; + +-- Verify that the newly created temporary function is in the registry. +-- Please note that the temporary function does not have a qualified +-- database associated with it. +SHOW USER FUNCTIONS; + +--+ + | function| + +--+ + |default.simple_udf| + | simple_temp_udf| + +--+ + +-- 1. Mofify `SimpleUdf`'s implementation to add supplied integral value by 20. +--import org.apache.hadoop.hive.ql.exec.UDF; + +--public class SimpleUdfR extends UDF { +-- public int evaluate(int value) { +-
[GitHub] [spark] AmplabJenkins commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
AmplabJenkins commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541508728 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
AmplabJenkins commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541508732 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17016/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
SparkQA commented on issue #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107#issuecomment-541508496 **[Test build #112004 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112004/testReport)** for PR 26107 at commit [`83d87bd`](https://github.com/apache/spark/commit/83d87bdea517ef3fcb8f5bee4a2b8692dbd3dd64).
[GitHub] [spark] gengliangwang opened a new pull request #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default
gengliangwang opened a new pull request #26107: [SPARK-28885][SQL] Follow ANSI store assignment rules in table insertion by default URL: https://github.com/apache/spark/pull/26107 ### What changes were proposed in this pull request? When inserting a value into a column of a different data type, Spark performs type coercion. Currently, we support 3 policies for the store assignment rules: ANSI, legacy and strict, which can be set via the option "spark.sql.storeAssignmentPolicy": 1. ANSI: Spark performs the type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL. It disallows certain unreasonable type conversions, such as converting `string` to `int` and `double` to `boolean`. It will throw a runtime exception if the value is out-of-range (overflow). 2. Legacy: Spark allows the type coercion as long as it is a valid `Cast`, which is very loose. E.g., converting either `string` to `int` or `double` to `boolean` is allowed. It is the current behavior in Spark 2.x for compatibility with Hive. When inserting an out-of-range value into an integral field, the low-order bits of the value are inserted (the same as Java/Scala numeric type casting). For example, if 257 is inserted into a field of Byte type, the result is 1. 3. Strict: Spark doesn't allow any possible precision loss or data truncation in store assignment, e.g., converting either `double` to `int` or `decimal` to `double` is not allowed. These rules were originally for the Dataset encoder. As far as I know, no mainstream DBMS uses this policy by default. Currently, the V1 data source uses the "Legacy" policy by default, while V2 uses "Strict". This proposal is to use the "ANSI" policy by default for both V1 and V2 in Spark 3.0. ### Why are the changes needed? Following the ANSI SQL standard is the most reasonable among the 3 policies. ### Does this PR introduce any user-facing change? Yes. The default store assignment policy is ANSI for both V1 and V2 data sources. ### How was this patch tested?
Unit test
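The legacy policy's low-order-bits behavior described in the PR text can be reproduced outside Spark. The following is a plain-Python sketch of a Java/Scala-style narrowing `(byte)` cast; the helper name is invented for illustration and is not a Spark API:

```python
def legacy_byte_cast(value: int) -> int:
    # Keep only the low-order 8 bits and reinterpret them as a signed byte,
    # which is what a Java/Scala narrowing numeric cast does.
    low = value & 0xFF
    return low - 256 if low >= 128 else low

print(legacy_byte_cast(257))  # 1, matching the example in the PR description
print(legacy_byte_cast(128))  # -128: the sign bit flips on overflow
```

Under the ANSI policy proposed here, the same out-of-range insert would instead raise a runtime error rather than silently wrap.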
[GitHub] [spark] JkSelf commented on issue #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
JkSelf commented on issue #25295: [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution URL: https://github.com/apache/spark/pull/25295#issuecomment-541502159 @cloud-fan Can you help review the updated patch? Thanks
[GitHub] [spark] sandeep-katta commented on a change in pull request #26095: [SPARK-29435][Core]Shuffle is not working when spark.shuffle.useOldFetchProtocol=true
sandeep-katta commented on a change in pull request #26095: [SPARK-29435][Core]Shuffle is not working when spark.shuffle.useOldFetchProtocol=true URL: https://github.com/apache/spark/pull/26095#discussion_r334324057 ## File path: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala ## @@ -47,8 +47,7 @@ private[spark] class BlockStoreShuffleReader[K, C]( context, blockManager.blockStoreClient, blockManager, - mapOutputTracker.getMapSizesByExecutorId(handle.shuffleId, startPartition, endPartition, -SparkEnv.get.conf.get(config.SHUFFLE_USE_OLD_FETCH_PROTOCOL)), + mapOutputTracker.getMapSizesByExecutorId(handle.shuffleId, startPartition, endPartition), Review comment: This is redundant code: since the shuffle writer writes the mapId based on the `spark.shuffle.useOldFetchProtocol` flag, `MapStatus.mapTaskId` always gives the mapId set by the ShuffleWriter
[GitHub] [spark] cloud-fan commented on issue #26092: [SPARK-29440][SQL] Support java.time.Duration as an external type of CalendarIntervalType
cloud-fan commented on issue #26092: [SPARK-29440][SQL] Support java.time.Duration as an external type of CalendarIntervalType URL: https://github.com/apache/spark/pull/26092#issuecomment-541499455 My opinion on this is to expose the `CalendarInterval` class directly, with 2 new methods `extractPeriod` and `extractDuration`. Semantically, `CalendarInterval` is java `Period` + `Duration`. I don't think we can map `CalendarInterval` to `Duration` as it's kind of a truncation. Like @MaxGekk said, we can also separate the interval type to year-month interval and day-time interval. But it's a lot of effort to change the type system and is not compatible with parquet.
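The objection that mapping `CalendarInterval` to `Duration` is "kind of a truncation" comes down to calendar months having no fixed length in seconds. A small stand-alone Python sketch makes the point; the `add_one_month` helper is illustrative only (not Spark or java.time API), and it ignores leap years for brevity:

```python
from datetime import date

DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def add_one_month(d: date) -> date:
    # Advance the calendar month, clamping the day-of-month; leap years ignored.
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    return date(year, month, min(d.day, DAYS_IN_MONTH[month - 1]))

# The same "1 month" interval spans a different number of days depending on
# the anchor date, so no single fixed-length Duration can represent it.
print((add_one_month(date(2019, 1, 15)) - date(2019, 1, 15)).days)  # 31
print((add_one_month(date(2019, 6, 15)) - date(2019, 6, 15)).days)  # 30
```

This is why java.time itself splits the concept into `Period` (calendar-based) and `Duration` (exact seconds), matching the `extractPeriod`/`extractDuration` proposal above.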
[GitHub] [spark] huaxingao commented on issue #26103: [SPARK-29381][PYTHON][ML] Add _ before the XXXParams classes
huaxingao commented on issue #26103: [SPARK-29381][PYTHON][ML] Add _ before the XXXParams classes URL: https://github.com/apache/spark/pull/26103#issuecomment-541498973 Almost all of these quasi-internal `_XXXParams` classes were newly added in these parity JIRAs, with very few exceptions. One of them is ```LSHParams```. If a user has subclassed ```LSHParams```, then with the name changed to ```_LSHParams``` they have to import the class explicitly, like ```from pyspark.ml.feature import _LSHParams```, because `import *` does not import objects whose names start with an underscore.
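The `import *` behavior cited above is standard Python: a star import skips underscore-prefixed names unless the module defines `__all__`. A self-contained sketch using a throwaway module (the module and class names are illustrative, not the real pyspark ones):

```python
import sys
import types

# Build a throwaway module with a public class and an underscore-prefixed one.
mod = types.ModuleType("params_demo")
exec("class LSHParams: pass\nclass _LSHParams: pass", mod.__dict__)
sys.modules["params_demo"] = mod

ns = {}
exec("from params_demo import *", ns)
print("LSHParams" in ns)    # True
print("_LSHParams" in ns)   # False: star import skips underscore names

# Subclassing the renamed class requires an explicit import:
exec("from params_demo import _LSHParams", ns)
print("_LSHParams" in ns)   # True
```

This is exactly the compatibility cost being weighed in the PR discussion: code that previously relied on `from pyspark.ml.feature import *` would stop seeing the renamed class.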
[GitHub] [spark] cloud-fan commented on a change in pull request #26095: [SPARK-29435][Core]Shuffle is not working when spark.shuffle.useOldFetchProtocol=true
cloud-fan commented on a change in pull request #26095: [SPARK-29435][Core]Shuffle is not working when spark.shuffle.useOldFetchProtocol=true URL: https://github.com/apache/spark/pull/26095#discussion_r334322988 ## File path: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala ## @@ -47,8 +47,7 @@ private[spark] class BlockStoreShuffleReader[K, C]( context, blockManager.blockStoreClient, blockManager, - mapOutputTracker.getMapSizesByExecutorId(handle.shuffleId, startPartition, endPartition, -SparkEnv.get.conf.get(config.SHUFFLE_USE_OLD_FETCH_PROTOCOL)), + mapOutputTracker.getMapSizesByExecutorId(handle.shuffleId, startPartition, endPartition), Review comment: This is the shuffle read side and we need to know the value of `SHUFFLE_USE_OLD_FETCH_PROTOCOL`. I think the bug is in the shuffle write side which is fixed in this PR. Do we really need to change the shuffle read side?
[GitHub] [spark] itskals commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
itskals commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#discussion_r334320998 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SQLHadoopMapReduceCommitProtocol.scala ## @@ -32,10 +35,14 @@ import org.apache.spark.sql.internal.SQLConf class SQLHadoopMapReduceCommitProtocol( jobId: String, path: String, -dynamicPartitionOverwrite: Boolean = false) +dynamicPartitionOverwrite: Boolean = false, Review comment: thanks.
[GitHub] [spark] itskals commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
itskals commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#discussion_r334321860 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SQLHadoopMapReduceCommitProtocol.scala ## @@ -66,4 +91,37 @@ class SQLHadoopMapReduceCommitProtocol( logInfo(s"Using output committer class ${committer.getClass.getCanonicalName}") committer } + + /** + * Called on the driver after a task commits. This can be used to access task commit messages + * before the job has finished. These same task commit messages will be passed to commitJob() + * if the entire job succeeds. + * Override it to check dynamic partition limitation on driver side. + */ + override def onTaskCommit(taskCommit: TaskCommitMessage): Unit = { Review comment: this implementation completely hides org.apache.spark.internal.io.HadoopMapReduceCommitProtocol#commitTask, which was the behaviour earlier. Is this intentional?
[GitHub] [spark] itskals commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
itskals commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#discussion_r334320842 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SQLHadoopMapReduceCommitProtocol.scala ## @@ -63,7 +70,29 @@ class SQLHadoopMapReduceCommitProtocol( committer = ctor.newInstance() } } +totalPartitions = new AtomicInteger(0) logInfo(s"Using output committer class ${committer.getClass.getCanonicalName}") committer } + + override def newTaskTempFile( + taskContext: TaskAttemptContext, dir: Option[String], ext: String): String = { +val path = super.newTaskTempFile(taskContext, dir, ext) +totalPartitions.incrementAndGet() +if (dynamicPartitionOverwrite) { + if (totalPartitions.get > maxDynamicPartitions) { Review comment: oh thanks, that makes things clearer now.
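The quoted `newTaskTempFile` override boils down to a thread-safe counter that fails fast once a cap is exceeded (the Scala diff uses an `AtomicInteger`). A minimal Python sketch of the same pattern; the class name and error message are invented for illustration and are not Spark code:

```python
import threading

class DynamicPartitionLimiter:
    """Thread-safe counter that raises once a partition cap is exceeded,
    sketching the check performed in the quoted newTaskTempFile override."""

    def __init__(self, max_dynamic_partitions: int):
        self._max = max_dynamic_partitions
        self._count = 0
        self._lock = threading.Lock()

    def register_partition(self) -> int:
        # Equivalent to totalPartitions.incrementAndGet() followed by the cap check.
        with self._lock:
            self._count += 1
            if self._count > self._max:
                raise RuntimeError(
                    f"Exceeded limit of {self._max} dynamic partitions")
            return self._count

limiter = DynamicPartitionLimiter(max_dynamic_partitions=2)
print(limiter.register_partition())  # 1
print(limiter.register_partition())  # 2
# A third call would raise RuntimeError.
```

Note that in the real patch the counter lives in the commit protocol on each task, which is why reviewers also discuss moving the check to the driver via `onTaskCommit`.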
[GitHub] [spark] itskals commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
itskals commented on a change in pull request #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#discussion_r334320752 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/SQLHadoopMapReduceCommitProtocol.scala ## @@ -66,4 +68,18 @@ class SQLHadoopMapReduceCommitProtocol( logInfo(s"Using output committer class ${committer.getClass.getCanonicalName}") committer } + + override def newTaskTempFile( Review comment: Did we have any limitation like this before? Can you help me recollect a case where the config is only for SQL and not for other modes of operation?
[GitHub] [spark] AmplabJenkins commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
AmplabJenkins commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541493412 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
SparkQA commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541493408 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/17015/
[GitHub] [spark] AmplabJenkins commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
AmplabJenkins commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541493414 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17015/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26106: [SPARK-29454][SQL]Reduce one unsafeProjection call when read parquet file
AmplabJenkins commented on issue #26106: [SPARK-29454][SQL]Reduce one unsafeProjection call when read parquet file URL: https://github.com/apache/spark/pull/26106#issuecomment-541492632 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #26106: [SPARK-29454][SQL]Reduce one unsafeProjection call when read parquet file
AmplabJenkins commented on issue #26106: [SPARK-29454][SQL]Reduce one unsafeProjection call when read parquet file URL: https://github.com/apache/spark/pull/26106#issuecomment-541492452 Can one of the admins verify this patch?
[GitHub] [spark] LuciferYang opened a new pull request #26106: [SPARK-29454][SQL]Reduce one unsafeProjection call when read parquet file
LuciferYang opened a new pull request #26106: [SPARK-29454][SQL]Reduce one unsafeProjection call when read parquet file URL: https://github.com/apache/spark/pull/26106 ### What changes were proposed in this pull request? ParquetGroupConverter calls the unsafeProjection function to convert a SpecificInternalRow to an UnsafeRow every time a Parquet data file is read through ParquetRecordReader. ParquetFileFormat then calls unsafeProjection again to convert that UnsafeRow to another UnsafeRow when partitionSchema is not empty; likewise, PartitionReaderWithPartitionValues always performs this conversion when DataSourceV2 is used. I think the first conversion in ParquetGroupConverter is redundant, and it is enough for ParquetRecordReader to return a SpecificInternalRow. ### How was this patch tested? Existing test cases are sufficient.
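The redundant conversion described above can be illustrated with a small, hypothetical Python sketch (the function names are illustrative stand-ins, not Spark's actual code path): converting the reader's row to an "unsafe" form in the converter and then again in the file format produces the same result as converting once at the end.

```python
def to_unsafe_row(row):
    # Stand-in for an unsafeProjection call: materialize the row once.
    return tuple(row)

def read_with_double_conversion(specific_row, partition_values):
    # Before: the group converter already produces an "unsafe" row...
    unsafe = to_unsafe_row(specific_row)  # first conversion (redundant)
    # ...and the file format converts again to append partition values.
    return to_unsafe_row(list(unsafe) + partition_values)

def read_with_single_conversion(specific_row, partition_values):
    # After: the reader hands back the specific row and we convert once.
    return to_unsafe_row(list(specific_row) + partition_values)
```

Both paths yield the same final row, e.g. `read_with_single_conversion([1, 2], [3])` gives `(1, 2, 3)`, which is why the PR argues the first projection can be dropped.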
[GitHub] [spark] SparkQA commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
SparkQA commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541491201 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/17015/
[GitHub] [spark] srowen commented on a change in pull request #26088: [SPARK-29436][K8S] Support executor for selecting scheduler through scheduler name in the case of k8s multi-scheduler scenario
srowen commented on a change in pull request #26088: [SPARK-29436][K8S] Support executor for selecting scheduler through scheduler name in the case of k8s multi-scheduler scenario URL: https://github.com/apache/spark/pull/26088#discussion_r334317158

## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala

```diff
@@ -142,6 +142,12 @@ private[spark] object Config extends Logging {
     .stringConf
     .createOptional

+  val KUBERNETES_EXECUTOR_SCHEDULER_NAME =
+    ConfigBuilder("spark.kubernetes.executor.scheduler.name")
+      .doc("Specify the scheduler name for each executor pod")
+      .stringConf
+      .createWithDefault("")
```

Review comment: Can you make this an optional config and only set it if present?
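The reviewer's suggestion, setting schedulerName on the executor pod only when the user actually configured it, can be sketched in Python (a hypothetical illustration; the config key mirrors the one above, but this is not Spark's builder API or pod-construction code):

```python
def build_executor_pod(conf: dict) -> dict:
    """Return a minimal pod spec, adding schedulerName only when configured."""
    pod = {"kind": "Pod", "spec": {}}
    # dict.get returns None for an absent key, so an unset (or empty)
    # config leaves the pod spec untouched instead of writing "".
    scheduler = conf.get("spark.kubernetes.executor.scheduler.name")
    if scheduler:
        pod["spec"]["schedulerName"] = scheduler
    return pod
```

With an optional config, Kubernetes falls back to its default scheduler when the field is absent, which is why an empty-string default is worth avoiding.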
[GitHub] [spark] SparkQA removed a comment on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
SparkQA removed a comment on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541486166 **[Test build #112003 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112003/testReport)** for PR 25648 at commit [`f8eeaae`](https://github.com/apache/spark/commit/f8eeaae7152bd8fc8112a8a83ed8bbc97ba815ea).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
AmplabJenkins removed a comment on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541487339 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112003/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
SparkQA commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541487306 **[Test build #112003 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112003/testReport)** for PR 25648 at commit [`f8eeaae`](https://github.com/apache/spark/commit/f8eeaae7152bd8fc8112a8a83ed8bbc97ba815ea). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
AmplabJenkins commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541487337 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
AmplabJenkins removed a comment on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541487337 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
AmplabJenkins commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541487339 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112003/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
SparkQA commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541486166 **[Test build #112003 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112003/testReport)** for PR 25648 at commit [`f8eeaae`](https://github.com/apache/spark/commit/f8eeaae7152bd8fc8112a8a83ed8bbc97ba815ea).
[GitHub] [spark] yaooqinn commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness
yaooqinn commented on issue #25648: [SPARK-28947][K8S] Status logging not happens at an interval for liveness URL: https://github.com/apache/spark/pull/25648#issuecomment-541486004 retest this please
[GitHub] [spark] merrily01 commented on issue #26088: [SPARK-29436][K8S] Support executor for selecting scheduler through scheduler name in the case of k8s multi-scheduler scenario
merrily01 commented on issue #26088: [SPARK-29436][K8S] Support executor for selecting scheduler through scheduler name in the case of k8s multi-scheduler scenario URL: https://github.com/apache/spark/pull/26088#issuecomment-541485438 Could you please kindly review? @dongjoon-hyun @srowen
[GitHub] [spark] AmplabJenkins removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
AmplabJenkins removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541484029 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112002/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
AmplabJenkins removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541484028 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
SparkQA removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541481939 **[Test build #112002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112002/testReport)** for PR 25960 at commit [`4aa2dd2`](https://github.com/apache/spark/commit/4aa2dd2b7441b35fe80c1656dc8487bf4afef1c7).
[GitHub] [spark] AmplabJenkins commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
AmplabJenkins commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541484028 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
AmplabJenkins commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541484029 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112002/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
SparkQA commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541483996 **[Test build #112002 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112002/testReport)** for PR 25960 at commit [`4aa2dd2`](https://github.com/apache/spark/commit/4aa2dd2b7441b35fe80c1656dc8487bf4afef1c7). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] WangGuangxin commented on issue #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation
WangGuangxin commented on issue #26011: [SPARK-29343][SQL] Eliminate sorts without limit in the subquery of Join/Aggregation URL: https://github.com/apache/spark/pull/26011#issuecomment-541482908 @dongjoon-hyun @dilipbiswal @maropu @gatorsmile Could you please take a look at this PR when you have time?
[GitHub] [spark] AmplabJenkins removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
AmplabJenkins removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541482108 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17014/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
AmplabJenkins removed a comment on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541482104 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
AmplabJenkins commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541482108 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17014/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
AmplabJenkins commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541482104 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
AmplabJenkins removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541481992 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
AmplabJenkins commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541481992 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
AmplabJenkins commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541481998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112001/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
AmplabJenkins removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541481998 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112001/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
SparkQA commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541481939 **[Test build #112002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112002/testReport)** for PR 25960 at commit [`4aa2dd2`](https://github.com/apache/spark/commit/4aa2dd2b7441b35fe80c1656dc8487bf4afef1c7).
[GitHub] [spark] SparkQA removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
SparkQA removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541465420 **[Test build #112001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112001/testReport)** for PR 26105 at commit [`7fc8117`](https://github.com/apache/spark/commit/7fc8117c30555046de313e19b78c445986b0297e).
[GitHub] [spark] SparkQA commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
SparkQA commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541481840 **[Test build #112001 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112001/testReport)** for PR 26105 at commit [`7fc8117`](https://github.com/apache/spark/commit/7fc8117c30555046de313e19b78c445986b0297e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] LantaoJin commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution
LantaoJin commented on issue #25960: [SPARK-29283][SQL] Error message is hidden when query from JDBC, especially enabled adaptive execution URL: https://github.com/apache/spark/pull/25960#issuecomment-541481584 Retest this please.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26094: [SPARK-29442][SQL][PYSPARK] Set `default` mode should override the existing mode
dongjoon-hyun commented on a change in pull request #26094: [SPARK-29442][SQL][PYSPARK] Set `default` mode should override the existing mode URL: https://github.com/apache/spark/pull/26094#discussion_r334309190

## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala

```diff
@@ -77,7 +77,7 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
  * `overwrite`: overwrite the existing data.
  * `append`: append the data.
  * `ignore`: ignore the operation (i.e. no-op).
- * `error` or `errorifexists`: default option, throw an exception at runtime.
+ * `error`, `errorifexists`, or `default`: default option, throw an exception at runtime.
```

Review comment: I also want to hide this `default` from the document. In any case, we need to fix the code.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26094: [SPARK-29442][SQL][PYSPARK] Set `default` mode should override the existing mode
dongjoon-hyun commented on a change in pull request #26094: [SPARK-29442][SQL][PYSPARK] Set `default` mode should override the existing mode URL: https://github.com/apache/spark/pull/26094#discussion_r334309160

## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala

```diff
@@ -87,10 +87,9 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
       case "overwrite" => mode(SaveMode.Overwrite)
       case "append" => mode(SaveMode.Append)
       case "ignore" => mode(SaveMode.Ignore)
-      case "error" | "errorifexists" => mode(SaveMode.ErrorIfExists)
-      case "default" => this
```

Review comment: Yes. It seems that we didn't document it before.
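The behavior change under review, treating `default` like `error`/`errorifexists` rather than as a no-op that keeps the previously set mode, can be sketched as a small Python lookup (a hypothetical illustration only; Spark's real implementation is the Scala match in DataFrameWriter shown above):

```python
# Hypothetical normalization table for the discussed behavior; the values
# name Spark's SaveMode variants, but this is not Spark's actual code.
SAVE_MODES = {
    "overwrite": "Overwrite",
    "append": "Append",
    "ignore": "Ignore",
    "error": "ErrorIfExists",
    "errorifexists": "ErrorIfExists",
    # After the change, "default" overrides any earlier mode and behaves
    # like "error" instead of silently leaving the writer unchanged.
    "default": "ErrorIfExists",
}

def normalize_save_mode(mode: str) -> str:
    try:
        return SAVE_MODES[mode.lower()]
    except KeyError:
        raise ValueError(f"Unknown save mode: {mode}")
```

For example, `normalize_save_mode("Default")` now resolves to `ErrorIfExists` just like `normalize_save_mode("error")`, which is the override semantics the PR title describes.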
[GitHub] [spark] HeartSaVioR commented on issue #24936: [SPARK-24634][SS] Add a new metric regarding number of rows later than watermark plus allowed delay
HeartSaVioR commented on issue #24936: [SPARK-24634][SS] Add a new metric regarding number of rows later than watermark plus allowed delay URL: https://github.com/apache/spark/pull/24936#issuecomment-541469976 Also cc @srowen, as this PR helps identify the correctness problem described in #24890 in a running query. The background knowledge required for this patch is similar to #24890; we may need to evaluate which is the better approach between #21617 and this.
[GitHub] [spark] HeartSaVioR commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
HeartSaVioR commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-541469585 cc @tdas @zsxwing @jose-torres @gaborgsomogyi Kind reminder.
[GitHub] [spark] HeartSaVioR edited a comment on issue #24936: [SPARK-24634][SS] Add a new metric regarding number of rows later than watermark plus allowed delay
HeartSaVioR edited a comment on issue #24936: [SPARK-24634][SS] Add a new metric regarding number of rows later than watermark plus allowed delay URL: https://github.com/apache/spark/pull/24936#issuecomment-541469251 cc @tdas @zsxwing @jose-torres @gaborgsomogyi Kind reminder.
[GitHub] [spark] HeartSaVioR commented on issue #24936: [SPARK-24634][SS] Add a new metric regarding number of rows later than watermark plus allowed delay
HeartSaVioR commented on issue #24936: [SPARK-24634][SS] Add a new metric regarding number of rows later than watermark plus allowed delay URL: https://github.com/apache/spark/pull/24936#issuecomment-541469251 cc @tdas @zsxwing @jose-torres @arunmahadevan @gaborgsomogyi Kind reminder.
[GitHub] [spark] HeartSaVioR commented on issue #26032: [SPARK-29361][SQL] Enable DataFrame with streaming source support on DSv1
HeartSaVioR commented on issue #26032: [SPARK-29361][SQL] Enable DataFrame with streaming source support on DSv1 URL: https://github.com/apache/spark/pull/26032#issuecomment-541468753 Closing this, as it was explained that DSv2 should be the first to support streaming sources. https://lists.apache.org/thread.html/2684c0fd155a21ef100377a23e135feeabd0b0a7a098ca5e40f20e37@%3Cdev.spark.apache.org%3E https://lists.apache.org/thread.html/3f0f5306b8d61a43023114bbcae7bb404ca5a0ddc3ba56f01876f8f6@
[GitHub] [spark] HeartSaVioR closed pull request #26032: [SPARK-29361][SQL] Enable DataFrame with streaming source support on DSv1
HeartSaVioR closed pull request #26032: [SPARK-29361][SQL] Enable DataFrame with streaming source support on DSv1 URL: https://github.com/apache/spark/pull/26032
[GitHub] [spark] AmplabJenkins removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
AmplabJenkins removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541465527 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17013/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
AmplabJenkins commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541465525 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
AmplabJenkins commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541465527 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17013/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
AmplabJenkins removed a comment on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541465525 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
SparkQA commented on issue #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105#issuecomment-541465420 **[Test build #112001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112001/testReport)** for PR 26105 at commit [`7fc8117`](https://github.com/apache/spark/commit/7fc8117c30555046de313e19b78c445986b0297e).
[GitHub] [spark] viirya opened a new pull request #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles
viirya opened a new pull request #26105: [SPARK-26570][SQL] Prevent OOM when transforming very many filestatus in InMemoryFileIndex.bulkListLeafFiles URL: https://github.com/apache/spark/pull/26105
### What changes were proposed in this pull request?
This PR proposes to wrap the collected SerializableFileStatus in SoftReference when collecting file statuses in InMemoryFileIndex.bulkListLeafFiles. Later, when we transform SerializableFileStatus back to FileStatus, already-traversed SerializableFileStatus objects become candidates for GC if there is memory pressure.
### Why are the changes needed?
We get an array of (String, Seq[SerializableFileStatus]) when collecting file statuses in InMemoryFileIndex.bulkListLeafFiles. Later we transform it back to a sequence of (Path, Seq[FileStatus]). During the transformation, the items in the array are held and cannot be released by GC, so we effectively double memory consumption. With very many file statuses, this can cause OOM. This change wraps each sequence of SerializableFileStatus in the array with SoftReference, so that while transforming SerializableFileStatus back to FileStatus, traversed SerializableFileStatus objects can be reclaimed by GC under memory pressure. If a SerializableFileStatus that has not yet been traversed is cleared by GC, we log a message suggesting that users increase driver memory, and we re-list the file statuses for that path.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Unit test.
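The wrap-with-SoftReference-then-re-list approach described in the PR can be sketched in a few lines. This is a minimal, self-contained illustration, not the actual Spark code: `listFiles` is a hypothetical stand-in for the real Hadoop listing call, and plain `Seq[String]` stands in for `Seq[SerializableFileStatus]`.

```scala
import java.lang.ref.SoftReference

// Hypothetical stand-in for the expensive file-listing call.
def listFiles(path: String): Seq[String] =
  Seq(s"$path/part-0", s"$path/part-1")

// Collect phase: wrap each listed sequence in a SoftReference so the GC
// may reclaim it under memory pressure instead of OOMing.
val collected: Array[(String, SoftReference[Seq[String]])] =
  Array("dir1", "dir2").map(p => (p, new SoftReference(listFiles(p))))

// Transform phase: if the referent was cleared by GC (get returns null),
// fall back to re-listing the path, as the PR description suggests.
val resolved = collected.map { case (path, ref) =>
  val statuses = Option(ref.get).getOrElse(listFiles(path))
  (path, statuses)
}
```

The design trade-off is that a cleared reference costs an extra listing pass for that path, but the job survives memory pressure instead of failing with OOM while both representations are held at once.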