[GitHub] [spark] AmplabJenkins removed a comment on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881996770






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881996772









[GitHub] [spark] SparkQA removed a comment on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


SparkQA removed a comment on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881984416


   **[Test build #141206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141206/testReport)** for PR 33405 at commit [`e50a8d2`](https://github.com/apache/spark/commit/e50a8d2508f598fe69fafe5515fae590cce57991).





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881994653


   **[Test build #141206 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141206/testReport)** for PR 33405 at commit [`e50a8d2`](https://github.com/apache/spark/commit/e50a8d2508f598fe69fafe5515fae590cce57991).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


SparkQA removed a comment on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881950484


   **[Test build #141204 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141204/testReport)** for PR 33405 at commit [`304ed04`](https://github.com/apache/spark/commit/304ed0493c02d33a7d660d0647e5de6db6b3f4c6).





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881993795


   **[Test build #141204 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141204/testReport)** for PR 33405 at commit [`304ed04`](https://github.com/apache/spark/commit/304ed0493c02d33a7d660d0647e5de6db6b3f4c6).
* This patch **fails from timeout after a configured wait of `500m`**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33361: [SPARK-36155][SQL] Eliminate outer join base uniqueness

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33361:
URL: https://github.com/apache/spark/pull/33361#issuecomment-881993395


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45720/
   





[GitHub] [spark] SparkQA commented on pull request #33361: [SPARK-36155][SQL] Eliminate outer join base uniqueness

2021-07-17 Thread GitBox


SparkQA commented on pull request #33361:
URL: https://github.com/apache/spark/pull/33361#issuecomment-881993393


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45720/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33361: [SPARK-36155][SQL] Eliminate outer join base uniqueness

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33361:
URL: https://github.com/apache/spark/pull/33361#issuecomment-881993395


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45720/
   





[GitHub] [spark] SparkQA commented on pull request #33361: [SPARK-36155][SQL] Eliminate outer join base uniqueness

2021-07-17 Thread GitBox


SparkQA commented on pull request #33361:
URL: https://github.com/apache/spark/pull/33361#issuecomment-881991431


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45720/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881989761


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45719/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881989761


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45719/
   





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881989755


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45719/
   





[GitHub] [spark] SparkQA commented on pull request #33361: [SPARK-36155][SQL] Eliminate outer join base uniqueness

2021-07-17 Thread GitBox


SparkQA commented on pull request #33361:
URL: https://github.com/apache/spark/pull/33361#issuecomment-881988727


   **[Test build #141207 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141207/testReport)** for PR 33361 at commit [`6339da0`](https://github.com/apache/spark/commit/6339da03b69ac2d0dad37b6aee5d356a84dd92e2).





[GitHub] [spark] viirya commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


viirya commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671771189



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {

Review comment:
   Either `RemoveRedundantAggregatesInLeftSemiAntiJoin`/`RemoveRedundantAggsInLeftSemiAntiJoin` or `RemoveAggregatesInLeftSemiAntiJoin`/`RemoveAggsInLeftSemiAntiJoin` is fine with me.







[GitHub] [spark] dongjoon-hyun commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


dongjoon-hyun commented on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881987742


   Also, cc @cloud-fan , @maropu , @viirya 





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671770195



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+    _.containsAllPatterns(AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN), ruleId) {
+    case agg @ Aggregate(grouping, aggExps, j @ Join(left: Aggregate, _, LeftSemi | LeftAnti, _, _))
+      if agg.groupOnly && left.groupOnly &&
+        aggExps.forall(e => left.aggregateExpressions.exists(_.semanticEquals(e))) &&

Review comment:
   Yes~
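
For readers following the thread, the idea behind the rule quoted above can be sketched on a toy plan tree. This is only an illustration under stated assumptions: `Plan`, `Relation`, `GroupOnlyAgg`, `Join` and `RemoveRedundantAggToy` are hypothetical stand-ins, not Catalyst classes, and the guard is simplified to "same grouping columns in the same order" because the rest of the real guard is cut off in the hunk.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
sealed trait JoinType
case object LeftSemi extends JoinType
case object LeftAnti extends JoinType
case object Inner extends JoinType

sealed trait Plan
case class Relation(name: String) extends Plan
// A group-only aggregate: it outputs exactly its grouping columns (a DISTINCT).
case class GroupOnlyAgg(groupBy: Seq[String], child: Plan) extends Plan
case class Join(left: Plan, right: Plan, joinType: JoinType) extends Plan

object RemoveRedundantAggToy {
  // Top-down rewrite: try the current node first, then recurse into children.
  def apply(plan: Plan): Plan = {
    val rewritten: Plan = plan match {
      // A semi/anti join only filters left rows, so if the left side is already
      // distinct on the same columns, the outer DISTINCT changes nothing.
      // Simplifying assumption: grouping columns must match exactly.
      case GroupOnlyAgg(outer, j @ Join(GroupOnlyAgg(inner, _), _, LeftSemi | LeftAnti))
          if outer == inner =>
        j
      case other => other
    }
    rewritten match {
      case GroupOnlyAgg(g, c) => GroupOnlyAgg(g, apply(c))
      case Join(l, r, t)      => Join(apply(l), apply(r), t)
      case r: Relation        => r
    }
  }
}
```

With this sketch, `GroupOnlyAgg(Seq("a", "b"), Join(GroupOnlyAgg(Seq("a", "b"), Relation("t1")), Relation("t2"), LeftSemi))` is rewritten to the inner `Join`, while an outer aggregate over a different column list, or one above an `Inner` join, is left untouched.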







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671770115



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+    _.containsAllPatterns(AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN), ruleId) {
+    case agg @ Aggregate(grouping, aggExps, j @ Join(left: Aggregate, _, LeftSemi | LeftAnti, _, _))

Review comment:
   It's only a one-line addition here, isn't it? Lines 36 and 37 look like they have enough extra space for the renaming.







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671769998



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {

Review comment:
   What about `RemoveRedundantAggregatesInLeftSemiAntiJoin`?







[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881987488


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45719/
   





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671769829



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.

Review comment:
   Thanks. If you don't mind, can we have a UT for that in addition to 
TPCDS q38?







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671769310



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {

Review comment:
   Since this is not a `pushdown-like` optimizer, I'd drop `Through`. This is a kind of no-op removal, similar to `RemoveRedundantAggregates`, isn't it?







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671769310



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {

Review comment:
   Since this is not a `pushdown-like` optimizer, I'd drop `Through`. This is a kind of no-op removal, isn't it?







[GitHub] [spark] wangyum commented on pull request #33361: [SPARK-36155][SQL] Eliminate join base uniqueness

2021-07-17 Thread GitBox


wangyum commented on pull request #33361:
URL: https://github.com/apache/spark/pull/33361#issuecomment-881985013


   I plan to remove support for **Elimination of left semi -> inner if uniqueness can be guaranteed on the right side** because it may introduce a sort-merge join (SMJ), as [it cannot estimate the `EqualNullSafe` join condition](https://issues.apache.org/jira/browse/SPARK-36162).





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881984416


   **[Test build #141206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141206/testReport)** for PR 33405 at commit [`e50a8d2`](https://github.com/apache/spark/commit/e50a8d2508f598fe69fafe5515fae590cce57991).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #31905:
URL: https://github.com/apache/spark/pull/31905#issuecomment-881984040


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141205/
   





[GitHub] [spark] AmplabJenkins commented on pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #31905:
URL: https://github.com/apache/spark/pull/31905#issuecomment-881984040


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141205/
   





[GitHub] [spark] SparkQA removed a comment on pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-17 Thread GitBox


SparkQA removed a comment on pull request #31905:
URL: https://github.com/apache/spark/pull/31905#issuecomment-881957803


   **[Test build #141205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141205/testReport)** for PR 31905 at commit [`ec7f727`](https://github.com/apache/spark/commit/ec7f7275ca7902db83aef86d174890bd1e240fe7).





[GitHub] [spark] SparkQA commented on pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-17 Thread GitBox


SparkQA commented on pull request #31905:
URL: https://github.com/apache/spark/pull/31905#issuecomment-881981056


   **[Test build #141205 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141205/testReport)** for PR 31905 at commit [`ec7f727`](https://github.com/apache/spark/commit/ec7f7275ca7902db83aef86d174890bd1e240fe7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] wangyum commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


wangyum commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671763303



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+    _.containsAllPatterns(AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN), ruleId) {
+    case agg @ Aggregate(grouping, aggExps, j @ Join(left: Aggregate, _, LeftSemi | LeftAnti, _, _))
+      if agg.groupOnly && left.groupOnly &&
+        aggExps.forall(e => left.aggregateExpressions.exists(_.semanticEquals(e))) &&

Review comment:
   Do you mean this condition?
   ```scala
   aggExps.forall(e => left.aggregateExpressions.exists(_.semanticEquals(e)))
   ```
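
Read in isolation, the condition says that every aggregate expression of the outer `Aggregate` has a semantically equal counterpart among the left child's aggregate expressions. A minimal sketch of that forall/exists shape in plain Java, assuming strings as stand-in expressions and case-insensitive comparison as a stand-in for `semanticEquals` (the class and method names are hypothetical and not part of Spark):

```java
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical illustration only; not Catalyst code.
final class CoveredBySketch {
  // True when every element of `outer` has an equivalent element in `inner`.
  static <T> boolean coveredBy(List<T> outer, List<T> inner, BiPredicate<T, T> equivalent) {
    return outer.stream()
        .allMatch(e -> inner.stream().anyMatch(l -> equivalent.test(l, e)));
  }
}
```

Under these assumptions, an empty `outer` list is trivially covered, which matches how `forall` behaves on an empty collection.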







[GitHub] [spark] wangyum commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


wangyum commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671763262



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {

Review comment:
   Do you have a recommended name?







[GitHub] [spark] wangyum commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


wangyum commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671762429



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+    _.containsAllPatterns(AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN), ruleId) {
+    case agg @ Aggregate(grouping, aggExps, j @ Join(left: Aggregate, _, LeftSemi | LeftAnti, _, _))

Review comment:
   I also like `groupingExprs`, but it would exceed 100 characters, so we would need a new line.







[GitHub] [spark] wangyum commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


wangyum commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671761972



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.

Review comment:
   TPC-DS q38 is such a case:
   
![image](https://user-images.githubusercontent.com/5399861/126052401-1d072f77-e584-45c2-939d-f87deda989e1.png)
   







[GitHub] [spark] venkata91 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-07-17 Thread GitBox


venkata91 commented on a change in pull request #33078:
URL: https://github.com/apache/spark/pull/33078#discussion_r671759804



##
File path: 
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/protocol/PushBlockStream.java
##
@@ -85,12 +96,13 @@ public boolean equals(Object other) {
 
   @Override
   public int encodedLength() {
-    return Encoders.Strings.encodedLength(appId) + 16;
+    return Encoders.Strings.encodedLength(appId) + 20;

Review comment:
   nit: maybe `16 + 4`, similar to the other place, or `4 + 4 + 4 + 4 + 4`?
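
The nit is about keeping the origin of the constant visible. A minimal sketch of the spelled-out form, assuming the fixed-width part of the message is five 4-byte ints and that a string is encoded as a 4-byte length prefix plus its UTF-8 bytes (the class, the helper, and the field breakdown are hypothetical and not taken from `PushBlockStream`):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a message with one string field and five int fields.
final class EncodedLengthSketch {
  // Assumed string layout: 4-byte length prefix followed by the UTF-8 bytes.
  static int stringLength(String s) {
    return 4 + s.getBytes(StandardCharsets.UTF_8).length;
  }

  // Writing one term per 4-byte field documents where the total of 20 comes from.
  static int encodedLength(String appId) {
    return stringLength(appId) + 4 + 4 + 4 + 4 + 4;
  }
}
```

Either spelling compiles to the same constant; the choice only affects how easily a reader can match each term to a field.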







[GitHub] [spark] github-actions[bot] closed pull request #30930: [SPARK-33070][SQL] Optimize higher order functions

2021-07-17 Thread GitBox


github-actions[bot] closed pull request #30930:
URL: https://github.com/apache/spark/pull/30930


   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881959523


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45717/
   





[GitHub] [spark] dongjoon-hyun commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


dongjoon-hyun commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881975762


   I'm observing the following:
   ```
   java.lang.OutOfMemoryError: Metaspace
   Error: Exception in thread "dispatcher-event-loop-110" java.lang.OutOfMemoryError: Metaspace
   ```
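
When chasing an error like this, it can help to confirm which Metaspace limit the JVM actually picked up. A minimal sketch using the standard `java.lang.management` API, assuming a HotSpot JVM where the pool is named `Metaspace` (the class and method names are hypothetical):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper; the pool name "Metaspace" is a HotSpot convention.
final class MetaspaceProbe {
  // Returns the Metaspace limit in bytes, or -1 if the pool is missing or has no defined maximum.
  static long maxMetaspaceBytes() {
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if ("Metaspace".equals(pool.getName())) {
        MemoryUsage usage = pool.getUsage();
        return usage == null ? -1L : usage.getMax();
      }
    }
    return -1L;
  }
}
```

A result of `-1` means no limit is visible through this API, which is what one would expect when `-XX:MaxMetaspaceSize` is not set.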





[GitHub] [spark] xuanyuanking commented on pull request #33220: [WIP][SPARK-35993][TESTS] Fix flaky tests for RocksDBSuite

2021-07-17 Thread GitBox


xuanyuanking commented on pull request #33220:
URL: https://github.com/apache/spark/pull/33220#issuecomment-881974743


   Deleted in https://github.com/apache/spark/pull/33401





[GitHub] [spark] xuanyuanking closed pull request #33220: [WIP][SPARK-35993][TESTS] Fix flaky tests for RocksDBSuite

2021-07-17 Thread GitBox


xuanyuanking closed pull request #33220:
URL: https://github.com/apache/spark/pull/33220


   





[GitHub] [spark] xuanyuanking commented on pull request #33401: [SPARK-35785][SS][FOLLOWUP] Remove ignored test from RocksDBSuite

2021-07-17 Thread GitBox


xuanyuanking commented on pull request #33401:
URL: https://github.com/apache/spark/pull/33401#issuecomment-881974736


   Agreed, let's delete this first.
   Refer to https://github.com/apache/spark/pull/33220: even with the exception var fix, you may still find the test flaky in the Jenkins environment.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #31905:
URL: https://github.com/apache/spark/pull/31905#issuecomment-881963631


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45718/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881963630


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141202/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881963630


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141202/
   





[GitHub] [spark] AmplabJenkins commented on pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #31905:
URL: https://github.com/apache/spark/pull/31905#issuecomment-881963631


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45718/
   





[GitHub] [spark] SparkQA commented on pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-17 Thread GitBox


SparkQA commented on pull request #31905:
URL: https://github.com/apache/spark/pull/31905#issuecomment-881962345


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45718/
   





[GitHub] [spark] SparkQA removed a comment on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


SparkQA removed a comment on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881924781


   **[Test build #141202 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141202/testReport)** for PR 33404 at commit [`5486d64`](https://github.com/apache/spark/commit/5486d649ea4760a81726993c7a186bfd85d41480).





[GitHub] [spark] SparkQA commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


SparkQA commented on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881961913


   **[Test build #141202 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141202/testReport)** for PR 33404 at commit [`5486d64`](https://github.com/apache/spark/commit/5486d649ea4760a81726993c7a186bfd85d41480).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881959518


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45717/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881959523


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45717/
   





[GitHub] [spark] SparkQA commented on pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-07-17 Thread GitBox


SparkQA commented on pull request #31905:
URL: https://github.com/apache/spark/pull/31905#issuecomment-881957803


   **[Test build #141205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141205/testReport)** for PR 31905 at commit [`ec7f727`](https://github.com/apache/spark/commit/ec7f7275ca7902db83aef86d174890bd1e240fe7).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881957463









[GitHub] [spark] AmplabJenkins commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881957464









[GitHub] [spark] viirya commented on pull request #33401: [SPARK-35785][SS][FOLLOWUP] Remove ignored test from RocksDBSuite

2021-07-17 Thread GitBox


viirya commented on pull request #33401:
URL: https://github.com/apache/spark/pull/33401#issuecomment-881956895


   Thanks @dongjoon-hyun 





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881956008


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45717/
   





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881954944


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45716/
   





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881954507


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45715/
   





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881950518


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45716/
   





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881950484


   **[Test build #141204 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141204/testReport)**
 for PR 33405 at commit 
[`304ed04`](https://github.com/apache/spark/commit/304ed0493c02d33a7d660d0647e5de6db6b3f4c6).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881950205


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141203/
   





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881950227


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45715/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881950205


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141203/
   





[GitHub] [spark] SparkQA removed a comment on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA removed a comment on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881945012


   **[Test build #141203 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141203/testReport)**
 for PR 33405 at commit 
[`8d7a987`](https://github.com/apache/spark/commit/8d7a9878d1c421126acfd9951e74a3c7fcffac27).





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881946927


   **[Test build #141203 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141203/testReport)**
 for PR 33405 at commit 
[`8d7a987`](https://github.com/apache/spark/commit/8d7a9878d1c421126acfd9951e74a3c7fcffac27).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] dongjoon-hyun commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


dongjoon-hyun commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881945408


   Hi, @kbendick . Thank you for your contribution. I made this PR and added 
you as a co-author. You will be marked as one of the authors of this commit when 
this PR is merged.





[GitHub] [spark] SparkQA commented on pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


SparkQA commented on pull request #33405:
URL: https://github.com/apache/spark/pull/33405#issuecomment-881945012


   **[Test build #141203 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141203/testReport)**
 for PR 33405 at commit 
[`8d7a987`](https://github.com/apache/spark/commit/8d7a9878d1c421126acfd9951e74a3c7fcffac27).





[GitHub] [spark] dongjoon-hyun opened a new pull request #33405: [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 1g

2021-07-17 Thread GitBox


dongjoon-hyun opened a new pull request #33405:
URL: https://github.com/apache/spark/pull/33405


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





[GitHub] [spark] dongjoon-hyun commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


dongjoon-hyun commented on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881941519


   Thank you, @wangyum !





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671729749



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+    _.containsAllPatterns(AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN), ruleId) {
+    case agg @ Aggregate(grouping, aggExps, j @ Join(left: Aggregate, _, LeftSemi | LeftAnti, _, _))

Review comment:
   If you don't mind, `grouping` -> `groupingExprs`?







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671729433



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.

Review comment:
   Is there a case where we can remove 2+ aggregates in the descendant child 
plans via this optimizer?

##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.

Review comment:
   I'm wondering if there is a case where we can remove 2+ aggregates in 
the descendant child plans via this optimizer?
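
For illustration, a minimal sketch of why the traversal order could matter for this question. It uses toy plan nodes (`Relation`, `Agg`, `SemiJoin`) and hand-written `transformDown`/`transformUp` helpers; all of these are hypothetical stand-ins, not Catalyst classes, and the rule is simplified to compare grouping keys only. It is not taken from the PR.

```scala
// Toy model only: these are NOT Catalyst classes, just small stand-ins.
object TransformOrderSketch {
  sealed trait Plan
  case class Relation(name: String) extends Plan
  case class Agg(keys: Seq[String], child: Plan) extends Plan // group-only aggregate
  case class SemiJoin(left: Plan, right: Plan) extends Plan

  type Rule = PartialFunction[Plan, Plan]

  // Simplified rule: drop an Agg sitting directly on a semi join whose left child
  // is an Agg that already groups by every one of the outer keys.
  val removeRedundantAgg: Rule = {
    case Agg(keys, j @ SemiJoin(Agg(leftKeys, _), _))
        if keys.forall(k => leftKeys.contains(k)) => j
  }

  private def mapChildren(plan: Plan)(f: Plan => Plan): Plan = plan match {
    case Agg(keys, child) => Agg(keys, f(child))
    case SemiJoin(left, right) => SemiJoin(f(left), f(right))
    case leaf: Relation => leaf
  }

  // Pre-order: rewrite the node first, then descend into the rewritten node's children.
  def transformDown(plan: Plan)(rule: Rule): Plan =
    mapChildren(rule.applyOrElse(plan, identity[Plan]))(transformDown(_)(rule))

  // Post-order: rewrite the children first, then the node itself.
  def transformUp(plan: Plan)(rule: Rule): Plan =
    rule.applyOrElse(mapChildren(plan)(transformUp(_)(rule)), identity[Plan])

  def countAggs(plan: Plan): Int = plan match {
    case Agg(_, child) => 1 + countAggs(child)
    case SemiJoin(left, right) => countAggs(left) + countAggs(right)
    case _: Relation => 0
  }

  val keys = Seq("a")

  // Three nested group-only aggregates, each on the left side of a semi join.
  val nested: Plan =
    Agg(keys, SemiJoin(
      Agg(keys, SemiJoin(Agg(keys, Relation("t")), Relation("u"))),
      Relation("v")))

  val topDown: Plan = transformDown(nested)(removeRedundantAgg)
  val bottomUp: Plan = transformUp(nested)(removeRedundantAgg)
}
```

In this toy plan the top-down pass removes two of the three aggregates in a single pass, because removing the outer one exposes the next `Agg` over `SemiJoin` pair to the rule. The bottom-up pass removes only the middle one: after that rewrite, the left child of the outer join is no longer an `Agg`, so the outer aggregate no longer matches.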







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671729433



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.

Review comment:
   Is there a case where we can remove 2+ aggregates via this optimizer?







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671729299



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {
+  // Transform down to remove more Aggregates.
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformDownWithPruning(
+    _.containsAllPatterns(AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN), ruleId) {
+    case agg @ Aggregate(grouping, aggExps, j @ Join(left: Aggregate, _, LeftSemi | LeftAnti, _, _))
+      if agg.groupOnly && left.groupOnly &&
+        aggExps.forall(e => left.aggregateExpressions.exists(_.semanticEquals(e))) &&

Review comment:
   This seems to assume `RemoveRepetitionFromGroupExpressions` always, but 
the test suite doesn't cover that.
   Can we add a more complex test case which needs 
`RemoveRepetitionFromGroupExpressions` explicitly?
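
To make the concern concrete, a minimal sketch of the deduplication this comment refers to. In Spark, `RemoveRepetitionFromGroupExpressions` drops repeated grouping expressions from an `Aggregate`. The toy `Expr` type, the `canonical` ordering trick, and the helper names below are hypothetical stand-ins, not Catalyst code, and `covers` mirrors only the visible `forall`/`exists` part of the condition in the diff above.

```scala
// Toy model only: a tiny expression type with a hand-rolled notion of semantic equality.
object GroupingDedupSketch {
  sealed trait Expr
  case class Col(name: String) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr

  // Canonical form: order the operands of the commutative Add so that a + b and b + a agree.
  def canonical(e: Expr): Expr = e match {
    case Add(l, r) =>
      val (cl, cr) = (canonical(l), canonical(r))
      if (cl.toString <= cr.toString) Add(cl, cr) else Add(cr, cl)
    case col: Col => col
  }

  def semanticEquals(a: Expr, b: Expr): Boolean = canonical(a) == canonical(b)

  // Keep the first occurrence of each semantically distinct grouping expression.
  def dedup(grouping: Seq[Expr]): Seq[Expr] =
    grouping.foldLeft(Vector.empty[Expr]) { (kept, e) =>
      if (kept.exists(k => semanticEquals(k, e))) kept else kept :+ e
    }

  // Every expression in `outer` has a semantically equal counterpart in `inner`.
  def covers(outer: Seq[Expr], inner: Seq[Expr]): Boolean =
    outer.forall(e => inner.exists(i => semanticEquals(i, e)))

  val grouping: Seq[Expr] =
    Seq(Col("a"), Add(Col("a"), Col("b")), Add(Col("b"), Col("a")), Col("a"))
  val deduped: Seq[Expr] = dedup(grouping)
}
```

Here a grouping list with a repeated column and two semantically equal sums shrinks from four entries to two. The `forall`/`exists` check itself is unaffected by the repetition, but any check that depends on the number or position of grouping expressions would see different inputs before and after deduplication, which is the kind of case a dedicated test could pin down.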






[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left si

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33404:
URL: https://github.com/apache/spark/pull/33404#discussion_r671729083



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveAggsThroughLeftSemiAntiJoin.scala
##
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, LeftSemi}
+import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.catalyst.trees.TreePattern.{AGGREGATE, LEFT_SEMI_OR_ANTI_JOIN}
+
+/**
+ * Remove the aggregation from left semi/anti join if the same aggregation has already been done
+ * on left side.
+ */
+object RemoveAggsThroughLeftSemiAntiJoin extends Rule[LogicalPlan] {

Review comment:
   `Remove ... Through ..` sounds a little strange. Do we have a similar 
rule like `Remove ... Through ..`? Otherwise, can we change the name?







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33350:
URL: https://github.com/apache/spark/pull/33350#discussion_r671725543



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitionsSuite.scala
##
@@ -42,35 +43,27 @@ class PruneFileSourcePartitionsSuite extends PrunePartitionSuiteBase {
 
   test("PruneFileSourcePartitions should not change the output of LogicalRelation") {
     withTable("test") {
-      withTempDir { dir =>
-        sql(
-          s"""
-            |CREATE EXTERNAL TABLE test(i int)
-            |PARTITIONED BY (p int)
-            |STORED AS parquet
-            |LOCATION '${dir.toURI}'""".stripMargin)
-
-        val tableMeta = spark.sharedState.externalCatalog.getTable("default", "test")
-        val catalogFileIndex = new CatalogFileIndex(spark, tableMeta, 0)
-
-        val dataSchema = StructType(tableMeta.schema.filterNot { f =>
-          tableMeta.partitionColumnNames.contains(f.name)
-        })
-        val relation = HadoopFsRelation(
-          location = catalogFileIndex,
-          partitionSchema = tableMeta.partitionSchema,
-          dataSchema = dataSchema,
-          bucketSpec = None,
-          fileFormat = new ParquetFileFormat(),
-          options = Map.empty)(sparkSession = spark)
-
-        val logicalRelation = LogicalRelation(relation, tableMeta)
-        val query = Project(Seq(Symbol("i"), Symbol("p")),
-          Filter(Symbol("p") === 1, logicalRelation)).analyze
-
-        val optimized = Optimize.execute(query)
-        assert(optimized.missingInput.isEmpty)
-      }
+      spark.range(10).selectExpr("id", "id % 3 as p").write.partitionBy("p").saveAsTable("test")

Review comment:
   If you are not sure, why don't we keep the existing one as-is, @sunchao? Refactoring is good but always needs verification.







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33350:
URL: https://github.com/apache/spark/pull/33350#discussion_r671725543



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitionsSuite.scala
##
@@ -42,35 +43,27 @@ class PruneFileSourcePartitionsSuite extends PrunePartitionSuiteBase {
 
   test("PruneFileSourcePartitions should not change the output of LogicalRelation") {
     withTable("test") {
-      withTempDir { dir =>
-        sql(
-          s"""
-            |CREATE EXTERNAL TABLE test(i int)
-            |PARTITIONED BY (p int)
-            |STORED AS parquet
-            |LOCATION '${dir.toURI}'""".stripMargin)
-
-        val tableMeta = spark.sharedState.externalCatalog.getTable("default", "test")
-        val catalogFileIndex = new CatalogFileIndex(spark, tableMeta, 0)
-
-        val dataSchema = StructType(tableMeta.schema.filterNot { f =>
-          tableMeta.partitionColumnNames.contains(f.name)
-        })
-        val relation = HadoopFsRelation(
-          location = catalogFileIndex,
-          partitionSchema = tableMeta.partitionSchema,
-          dataSchema = dataSchema,
-          bucketSpec = None,
-          fileFormat = new ParquetFileFormat(),
-          options = Map.empty)(sparkSession = spark)
-
-        val logicalRelation = LogicalRelation(relation, tableMeta)
-        val query = Project(Seq(Symbol("i"), Symbol("p")),
-          Filter(Symbol("p") === 1, logicalRelation)).analyze
-
-        val optimized = Optimize.execute(query)
-        assert(optimized.missingInput.isEmpty)
-      }
+      spark.range(10).selectExpr("id", "id % 3 as p").write.partitionBy("p").saveAsTable("test")

Review comment:
   If you are not sure, why don't we keep the existing one as-is, @sunchao?







[GitHub] [spark] dongjoon-hyun commented on pull request #33341: [SPARK-36091][SQL] Support TimestampNTZ type in expression TimeWindow

2021-07-17 Thread GitBox


dongjoon-hyun commented on pull request #33341:
URL: https://github.com/apache/spark/pull/33341#issuecomment-881932237


   @beliefer, please resolve the conflict.
   @gengliangwang, is this targeting Apache Spark 3.2?





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33341: [SPARK-36091][SQL] Support TimestampNTZ type in expression TimeWindow

2021-07-17 Thread GitBox


dongjoon-hyun commented on a change in pull request #33341:
URL: https://github.com/apache/spark/pull/33341#discussion_r671723488



##
File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameTimeWindowingSuite.scala
##
@@ -352,4 +463,31 @@ class DataFrameTimeWindowingSuite extends QueryTest with SharedSparkSession {
       )
     }
   }
+
+  test("SPARK-36091: Support TimestampNTZ type in expression TimeWindow") {
+    val df1 = Seq(
+      ("2016-03-27 19:39:30", 1, "a"),
+      ("2016-03-27 19:39:25", 2, "a")).toDF("time", "value", "id")
+    val df2 = Seq((LocalDateTime.parse("2016-03-27T19:39:30"), 1, "a"),
+      (LocalDateTime.parse("2016-03-27T19:39:25"), 2, "a")).toDF("time", "value", "id")
+    val type1 = StructType(
+      Seq(StructField("start", TimestampType), StructField("end", TimestampType)))
+    val type2 = StructType(
+      Seq(StructField("start", TimestampNTZType), StructField("end", TimestampNTZType)))
+
+      Seq((df1, type1), (df2, type2)).foreach { tuple =>

Review comment:
   indentation?







[GitHub] [spark] imback82 commented on a change in pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier

2021-07-17 Thread GitBox


imback82 commented on a change in pull request #33200:
URL: https://github.com/apache/spark/pull/33200#discussion_r671595283



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala
##
@@ -229,22 +228,13 @@ case class ReplaceTableAsSelectStatement(
  * Column data as parsed by ALTER TABLE ... ADD COLUMNS.
  */
 case class QualifiedColType(
-    name: Seq[String],
+    fieldName: FieldName,

Review comment:
   For `AlterTableAddColumns`, we need to 1) resolve the "parent" name if the column being added is a nested one, and 2) check whether the column name already exists.
   
   For `AlterTableReplaceColumns`, it seems that we do not need to check anything with this new change (I removed the check in the recent [commit](https://github.com/apache/spark/pull/33200/commits/1eab11f13030c76a500d6d2b46fc8e39a3617d71)).
   
   The reason I was using `FieldName` is so that I can check whether `QualifiedColType` is resolved, so that the rule doesn't run if it is already resolved:
   ```
   private def hasUnresolvedColumns(cols: Seq[QualifiedColType]): Boolean = {
     cols.exists(col => !col.fieldName.resolved || col.position.exists(!_.resolved))
   }
   ```
   That said, I agree that using `ResolvedFieldName` is a bit weird since the field name is being "added". Maybe turn `QualifiedColType` into an `Expression`?
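
The resolved check discussed in this comment can be sketched in isolation as follows. This is a toy model under stated assumptions, not Spark's code: every type below is a hypothetical stand-in defined only for illustration, and only the body of `hasUnresolvedColumns` is taken from the comment.

```scala
object ResolvedCheckSketch {
  // Hypothetical stand-ins for the field-name types mentioned in the comment.
  sealed trait FieldName { def resolved: Boolean }
  final case class UnresolvedFieldName(name: Seq[String]) extends FieldName {
    val resolved = false
  }
  final case class ResolvedFieldName(path: Seq[String], field: String) extends FieldName {
    val resolved = true
  }

  // Hypothetical stand-ins for an optional column position (e.g. FIRST / AFTER col).
  sealed trait FieldPosition { def resolved: Boolean }
  final case class UnresolvedFieldPosition(after: String) extends FieldPosition {
    val resolved = false
  }
  final case class ResolvedFieldPosition(index: Int) extends FieldPosition {
    val resolved = true
  }

  // Simplified column type: only the two fields the check inspects.
  final case class QualifiedColType(fieldName: FieldName, position: Option[FieldPosition])

  // Same predicate as in the comment: a column list still needs the resolution rule
  // if any field name, or any position that is present, is unresolved.
  def hasUnresolvedColumns(cols: Seq[QualifiedColType]): Boolean =
    cols.exists(col => !col.fieldName.resolved || col.position.exists(!_.resolved))
}
```

With this shape, a rule can return early when `hasUnresolvedColumns` is false, which is the idempotence property the comment is after.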







[GitHub] [spark] SparkQA commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


SparkQA commented on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881924781


   **[Test build #141202 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141202/testReport)** for PR 33404 at commit [`5486d64`](https://github.com/apache/spark/commit/5486d649ea4760a81726993c7a186bfd85d41480).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881924636


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45714/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881924636


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45714/
   





[GitHub] [spark] SparkQA commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


SparkQA commented on pull request #33404:
URL: https://github.com/apache/spark/pull/33404#issuecomment-881924186


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45714/
   





[GitHub] [spark] wangyum opened a new pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-07-17 Thread GitBox


wangyum opened a new pull request #33404:
URL: https://github.com/apache/spark/pull/33404


   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   Unit test and benchmark test.
   
   TPCDS 5T
   





[GitHub] [spark] sarutak edited a comment on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-17 Thread GitBox


sarutak edited a comment on pull request #33253:
URL: https://github.com/apache/spark/pull/33253#issuecomment-881910368


   Some tests seem to fail. Could you fix them?





[GitHub] [spark] sarutak commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-17 Thread GitBox


sarutak commented on pull request #33253:
URL: https://github.com/apache/spark/pull/33253#issuecomment-881910368


   Some tests seem to fail. Could you fix them?





[GitHub] [spark] venkata91 commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-17 Thread GitBox


venkata91 commented on pull request #33253:
URL: https://github.com/apache/spark/pull/33253#issuecomment-881909197


   @sarutak @AngersZh can you please take a look again?





[GitHub] [spark] srowen commented on pull request #33301: [SPARK-36122][CORE] Passing on needClientAuth to Jetty SSLContextFactory

2021-07-17 Thread GitBox


srowen commented on pull request #33301:
URL: https://github.com/apache/spark/pull/33301#issuecomment-881903788


   Merged to master





[GitHub] [spark] srowen closed pull request #33301: [SPARK-36122][CORE] Passing on needClientAuth to Jetty SSLContextFactory

2021-07-17 Thread GitBox


srowen closed pull request #33301:
URL: https://github.com/apache/spark/pull/33301


   





[GitHub] [spark] AmplabJenkins commented on pull request #33388: [SPARK-36176][PYTHON] Expose tableExists in pyspark.sql.catalog

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33388:
URL: https://github.com/apache/spark/pull/33388#issuecomment-881902160


   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #33392: [SPARK-36178][PYTHON] List pyspark.sql.catalog APIs in documentation

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33392:
URL: https://github.com/apache/spark/pull/33392#issuecomment-881902148


   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #33394: [SPARK-36181][PYTHON] updating pyspark sql readwriter documentation

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33394:
URL: https://github.com/apache/spark/pull/33394#issuecomment-881901283


   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33379: [SPARK-35810][PYTHON] Deprecate ps.broadcast API

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33379:
URL: https://github.com/apache/spark/pull/33379#issuecomment-881895142


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45713/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33379: [SPARK-35810][PYTHON] Deprecate ps.broadcast API

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33379:
URL: https://github.com/apache/spark/pull/33379#issuecomment-881895142


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45713/
   





[GitHub] [spark] SparkQA commented on pull request #33379: [SPARK-35810][PYTHON] Deprecate ps.broadcast API

2021-07-17 Thread GitBox


SparkQA commented on pull request #33379:
URL: https://github.com/apache/spark/pull/33379#issuecomment-881895138


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45713/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33379: [SPARK-35810][PYTHON] Deprecate ps.broadcast API

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33379:
URL: https://github.com/apache/spark/pull/33379#issuecomment-881892662


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141201/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33403: [SPARK-36193][CORE] Recover SparkSubmit.runMain not to stop SparkContext in non-K8s env

2021-07-17 Thread GitBox


AmplabJenkins removed a comment on pull request #33403:
URL: https://github.com/apache/spark/pull/33403#issuecomment-881892664


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141200/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33379: [SPARK-35810][PYTHON] Deprecate ps.broadcast API

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33379:
URL: https://github.com/apache/spark/pull/33379#issuecomment-881892662


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141201/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33403: [SPARK-36193][CORE] Recover SparkSubmit.runMain not to stop SparkContext in non-K8s env

2021-07-17 Thread GitBox


AmplabJenkins commented on pull request #33403:
URL: https://github.com/apache/spark/pull/33403#issuecomment-881892664


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141200/
   





[GitHub] [spark] SparkQA commented on pull request #33379: [SPARK-35810][PYTHON] Deprecate ps.broadcast API

2021-07-17 Thread GitBox


SparkQA commented on pull request #33379:
URL: https://github.com/apache/spark/pull/33379#issuecomment-881890542


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45713/
   





[GitHub] [spark] SparkQA removed a comment on pull request #33379: [SPARK-35810][PYTHON] Deprecate ps.broadcast API

2021-07-17 Thread GitBox


SparkQA removed a comment on pull request #33379:
URL: https://github.com/apache/spark/pull/33379#issuecomment-881885077


   **[Test build #141201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141201/testReport)** for PR 33379 at commit [`80c0d88`](https://github.com/apache/spark/commit/80c0d8829f7753c35db86dedcceb043ff5e49e5d).





[GitHub] [spark] SparkQA removed a comment on pull request #33403: [SPARK-36193][CORE] Recover SparkSubmit.runMain not to stop SparkContext in non-K8s env

2021-07-17 Thread GitBox


SparkQA removed a comment on pull request #33403:
URL: https://github.com/apache/spark/pull/33403#issuecomment-881868933


   **[Test build #141200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141200/testReport)** for PR 33403 at commit [`c216516`](https://github.com/apache/spark/commit/c2165168509ff71c92bbc2c74c29fb19c91df97d).




