[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-07-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/6810


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-07-20 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-123187284
  
I'm working on a patch to enable Unsafe mode by default and ran into this 
problem as well, so I've opened #7560 to bring this patch up to date and merge 
it.  I'll still credit you with the primary patch authorship.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-07-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-118639190
  
Now that you've moved the second bug to #7225, do you mind cleaning this PR 
up to reflect only the first bug?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-07-04 Thread navis
Github user navis commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-118584139
  
Thanks, @JoshRosen. Actually, it's two bugs which is
* memory leak on empty input
* CCE in some cases (codeGen=false && (groupbyException.isEmpty || unsafe = 
false))

I've booked second bug to SPARK-8826(#7225).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-30 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-117362141
  
As I understand it, there are two separate bugs here:

- A memory leak fix.
- A bug related to mis-handling of empty inputs.

If this is the case, can we fix these separately?  Combining them into the 
same PR / test code makes them harder to understand.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-116950964
  
Build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-116950828
  
  [Test build #36072 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36072/console)
 for   PR 6810 at commit 
[`c5419b3`](https://github.com/apache/spark/commit/c5419b37f2ece4842d8c2bd7463a50b61245fbcf).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
  * `class InterpretedProjection(expressions: Seq[Expression], mutableRow: 
Boolean = false)`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-116920493
  
  [Test build #36072 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/36072/consoleFull)
 for   PR 6810 at commit 
[`c5419b3`](https://github.com/apache/spark/commit/c5419b37f2ece4842d8c2bd7463a50b61245fbcf).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-116918598
  
Build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-116918513
  
 Build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33535200
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ---
@@ -153,13 +153,14 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] 
with Logging with Serializ
   }
 
   protected def newProjection(
-  expressions: Seq[Expression], inputSchema: Seq[Attribute]): 
Projection = {
+  expressions: Seq[Expression],
+  inputSchema: Seq[Attribute], mutableRow: Boolean = false): 
Projection = {
--- End diff --

done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33535187
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.SQLConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val input0 = Seq.empty[(String, Int, Double)]
+val input1 = Seq(("Hello", 4, 2.0))
+
+// hack : current default parallelism of test local backend is two
--- End diff --

In the case that null buffer should be forwarded as described above, it 
should be handled in single task to make single row. But the parallelism for 
test is fixed as two("local[2]") and not easy to change the value. So expected 
output is two rows, instead of one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33533948
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.SQLConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val input0 = Seq.empty[(String, Int, Double)]
+val input1 = Seq(("Hello", 4, 2.0))
+
+// hack : current default parallelism of test local backend is two
+val x0 = Seq(Tuple1(0L), Tuple1(0L))
+val y0 = Seq.empty[Tuple1[Long]]
+
+val x1 = Seq(Tuple1(0L), Tuple1(1L))
+val y1 = Seq(Tuple1(1L))
+
+val codegenDefault = TestSQLContext.getConf(SQLConf.CODEGEN_ENABLED)
+try {
+  for ((input, x, y) <- Seq((input0, x0, y0), (input1, x1, y1))) {
+val df = input.toDF("a", "b", "c")
+val colB = df.col("b").expr
+val colC = df.col("c").expr
+val aggrExpr = Alias(Count(Cast(colC, LongType)), "Count")()
+
+for ((codegen, unsafe) <- Seq((false, false), (true, false), 
(true, true));
+ partial <- Seq(false, true); groupExpr <- Seq(colB :: Nil, 
Seq.empty)) {
--- End diff --

nit = fixed

It seemed related to the question below : "(partial || 
groupingExpressions.nonEmpty)". Handling  some test fails, I've recognized that 
even with empty input iterator, if the group-by operator is for 
global(groupingExpression.isEmpty) and final(partial=false), we still need to 
make a null row.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33533643
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.SQLConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val input0 = Seq.empty[(String, Int, Double)]
+val input1 = Seq(("Hello", 4, 2.0))
+
+// hack : current default parallelism of test local backend is two
+val x0 = Seq(Tuple1(0L), Tuple1(0L))
+val y0 = Seq.empty[Tuple1[Long]]
+
+val x1 = Seq(Tuple1(0L), Tuple1(1L))
+val y1 = Seq(Tuple1(1L))
+
+val codegenDefault = TestSQLContext.getConf(SQLConf.CODEGEN_ENABLED)
+try {
+  for ((input, x, y) <- Seq((input0, x0, y0), (input1, x1, y1))) {
+val df = input.toDF("a", "b", "c")
+val colB = df.col("b").expr
+val colC = df.col("c").expr
+val aggrExpr = Alias(Count(Cast(colC, LongType)), "Count")()
+
+for ((codegen, unsafe) <- Seq((false, false), (true, false), 
(true, true));
--- End diff --

Just wanted to see all is well. I'll remove the condition.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33524986
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala 
---
@@ -270,7 +270,10 @@ case class GeneratedAggregate(
 
   val joinedRow = new JoinedRow3
 
-  if (groupingExpressions.isEmpty) {
+  if (!iter.hasNext && (partial || groupingExpressions.nonEmpty)) {
--- End diff --

Can you explain the `(partial || groupingExpressions.nonEmpty)` here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33524575
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.SQLConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val input0 = Seq.empty[(String, Int, Double)]
+val input1 = Seq(("Hello", 4, 2.0))
+
+// hack : current default parallelism of test local backend is two
+val x0 = Seq(Tuple1(0L), Tuple1(0L))
+val y0 = Seq.empty[Tuple1[Long]]
+
+val x1 = Seq(Tuple1(0L), Tuple1(1L))
+val y1 = Seq(Tuple1(1L))
+
+val codegenDefault = TestSQLContext.getConf(SQLConf.CODEGEN_ENABLED)
+try {
+  for ((input, x, y) <- Seq((input0, x0, y0), (input1, x1, y1))) {
+val df = input.toDF("a", "b", "c")
+val colB = df.col("b").expr
+val colC = df.col("c").expr
+val aggrExpr = Alias(Count(Cast(colC, LongType)), "Count")()
+
+for ((codegen, unsafe) <- Seq((false, false), (true, false), 
(true, true));
+ partial <- Seq(false, true); groupExpr <- Seq(colB :: Nil, 
Seq.empty)) {
--- End diff --

AFAIK we also don't need to test `partial` here, since that only has an 
effect on the `requiredChildDistribution`, which would matter if we were 
running this through the physical plan optimizer but which shouldn't matter 
here because we're explicitly constructing the plan physical plan by hand.  
Have I overlooked something here, though?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33523617
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ---
@@ -153,13 +153,14 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] 
with Logging with Serializ
   }
 
   protected def newProjection(
-  expressions: Seq[Expression], inputSchema: Seq[Attribute]): 
Projection = {
+  expressions: Seq[Expression],
+  inputSchema: Seq[Attribute], mutableRow: Boolean = false): 
Projection = {
--- End diff --

Style nit: `mutableRow` needs to go on its own line (one parameter per 
line).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33523568
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.SQLConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val input0 = Seq.empty[(String, Int, Double)]
+val input1 = Seq(("Hello", 4, 2.0))
+
+// hack : current default parallelism of test local backend is two
--- End diff --

Can you elaborate a bit on this comment?  Why does the parallelism of the 
test backend matter here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33523518
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.SQLConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val input0 = Seq.empty[(String, Int, Double)]
+val input1 = Seq(("Hello", 4, 2.0))
+
+// hack : current default parallelism of test local backend is two
+val x0 = Seq(Tuple1(0L), Tuple1(0L))
+val y0 = Seq.empty[Tuple1[Long]]
+
+val x1 = Seq(Tuple1(0L), Tuple1(1L))
+val y1 = Seq(Tuple1(1L))
+
+val codegenDefault = TestSQLContext.getConf(SQLConf.CODEGEN_ENABLED)
+try {
+  for ((input, x, y) <- Seq((input0, x0, y0), (input1, x1, y1))) {
+val df = input.toDF("a", "b", "c")
+val colB = df.col("b").expr
+val colC = df.col("c").expr
+val aggrExpr = Alias(Count(Cast(colC, LongType)), "Count")()
+
+for ((codegen, unsafe) <- Seq((false, false), (true, false), 
(true, true));
+ partial <- Seq(false, true); groupExpr <- Seq(colB :: Nil, 
Seq.empty)) {
--- End diff --

Style nit: put each nested loop on its own line (e.g. one <- per line, 
since that's easier to read).  Also, I think `Seq(Seq(colB), Seq.empty)` is a 
bit more readable.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-29 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33523492
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.SQLConf
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val input0 = Seq.empty[(String, Int, Double)]
+val input1 = Seq(("Hello", 4, 2.0))
+
+// hack : current default parallelism of test local backend is two
+val x0 = Seq(Tuple1(0L), Tuple1(0L))
+val y0 = Seq.empty[Tuple1[Long]]
+
+val x1 = Seq(Tuple1(0L), Tuple1(1L))
+val y1 = Seq(Tuple1(1L))
+
+val codegenDefault = TestSQLContext.getConf(SQLConf.CODEGEN_ENABLED)
+try {
+  for ((input, x, y) <- Seq((input0, x0, y0), (input1, x1, y1))) {
+val df = input.toDF("a", "b", "c")
+val colB = df.col("b").expr
+val colC = df.col("c").expr
+val aggrExpr = Alias(Count(Cast(colC, LongType)), "Count")()
+
+for ((codegen, unsafe) <- Seq((false, false), (true, false), 
(true, true));
--- End diff --

AFAIK we only need to test the `(true, true)` case, since this bug only 
occurred when using the unsafe aggregation path?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115647116
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115647008
  
  [Test build #35850 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35850/console)
 for   PR 6810 at commit 
[`19907a5`](https://github.com/apache/spark/commit/19907a5984936bb6a15a92b98a5d418576624fe3).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class StreamingLinearAlgorithm(object):`
  * `class StreamingLogisticRegressionWithSGD(StreamingLinearAlgorithm):`
  * `class LinearDataGenerator(object):`
  * `class InterpretedProjection(expressions: Seq[Expression], mutableRow: 
Boolean = false)`
  * `case class CountFunction(expr: Expression, base: AggregateExpression) 
extends AggregateFunction `
  * `case class CountDistinctFunction(`
  * `case class ApproxCountDistinctPartitionFunction(`
  * `case class ApproxCountDistinctMergeFunction(`
  * `case class Sum(child: Expression) extends PartialAggregate with 
trees.UnaryNode[Expression] `
  * `case class CombineSum(child: Expression) extends AggregateExpression `
  * `case class SumDistinct(child: Expression)`
  * `case class CombineSetsAndSum(inputSet: Expression, base: Expression) 
extends AggregateExpression `
  * `case class CombineSetsAndSumFunction(`
  * `case class First(child: Expression) extends PartialAggregate with 
trees.UnaryNode[Expression] `
  * `case class Last(child: Expression) extends PartialAggregate with 
trees.UnaryNode[Expression] `
  * `case class Sha2(left: Expression, right: Expression)`
  * `case class BroadcastHint(child: LogicalPlan) extends UnaryNode `
  * `case class PrecisionInfo(precision: Int, scale: Int) `
  * `case class TakeOrderedAndProject(`
  * `s"Using output committer class $`
  * `logInfo(s"Using user defined output committer class $`
  * `  s"Using output committer class $`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115591371
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115591322
  
  [Test build #35849 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35849/console)
 for   PR 6810 at commit 
[`c4b1f32`](https://github.com/apache/spark/commit/c4b1f32dbd0f890c989ee8b6e2038320a8a2e1b4).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class InterpretedProjection(expressions: Seq[Expression], mutableRow: 
Boolean = false)`
  * `case class Sha2(left: Expression, right: Expression)`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115588321
  
  [Test build #35850 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35850/consoleFull)
 for   PR 6810 at commit 
[`19907a5`](https://github.com/apache/spark/commit/19907a5984936bb6a15a92b98a5d418576624fe3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115587446
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115587563
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115586038
  
  [Test build #35849 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35849/consoleFull)
 for   PR 6810 at commit 
[`c4b1f32`](https://github.com/apache/spark/commit/c4b1f32dbd0f890c989ee8b6e2038320a8a2e1b4).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115585527
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-115585420
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r5692
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala
 ---
@@ -22,9 +22,11 @@ package org.apache.spark.sql.catalyst.expressions
  * @param expressions a sequence of expressions that determine the value 
of each column of the
  *output row.
  */
-class InterpretedProjection(expressions: Seq[Expression]) extends 
Projection {
-  def this(expressions: Seq[Expression], inputSchema: Seq[Attribute]) =
-this(expressions.map(BindReferences.bindReference(_, inputSchema)))
+class InterpretedProjection(expressions: Seq[Expression], mutableRow: 
Boolean = false)
+  extends Projection {
+  def this(expressions: Seq[Expression],
+   inputSchema: Seq[Attribute], mutableRow: Boolean = false) =
--- End diff --

Sound like there should another test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-26 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r5666
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala
 ---
@@ -22,9 +22,11 @@ package org.apache.spark.sql.catalyst.expressions
  * @param expressions a sequence of expressions that determine the value 
of each column of the
  *output row.
  */
-class InterpretedProjection(expressions: Seq[Expression]) extends 
Projection {
-  def this(expressions: Seq[Expression], inputSchema: Seq[Attribute]) =
-this(expressions.map(BindReferences.bindReference(_, inputSchema)))
+class InterpretedProjection(expressions: Seq[Expression], mutableRow: 
Boolean = false)
+  extends Projection {
+  def this(expressions: Seq[Expression],
+   inputSchema: Seq[Attribute], mutableRow: Boolean = false) =
--- End diff --

Sorry for late reply. It happened when I changed empty input in 
AggregateSuite test to something like ("Hello", 4, 2.0). In GeneratedAggreate, 
whenever the grouping expression is empty or unsafeEnabled is not applied for 
some reason spark tries to case GenericRow to MutableRow.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33059743
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala
 ---
@@ -22,9 +22,11 @@ package org.apache.spark.sql.catalyst.expressions
  * @param expressions a sequence of expressions that determine the value 
of each column of the
  *output row.
  */
-class InterpretedProjection(expressions: Seq[Expression]) extends 
Projection {
-  def this(expressions: Seq[Expression], inputSchema: Seq[Attribute]) =
-this(expressions.map(BindReferences.bindReference(_, inputSchema)))
+class InterpretedProjection(expressions: Seq[Expression], mutableRow: 
Boolean = false)
+  extends Projection {
+  def this(expressions: Seq[Expression],
+   inputSchema: Seq[Attribute], mutableRow: Boolean = false) =
--- End diff --

I see that your comment mentioned that this is avoiding a potential 
ClassCastException.  Did you actually run into that exception somewhere?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33059434
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala
 ---
@@ -22,9 +22,11 @@ package org.apache.spark.sql.catalyst.expressions
  * @param expressions a sequence of expressions that determine the value 
of each column of the
  *output row.
  */
-class InterpretedProjection(expressions: Seq[Expression]) extends 
Projection {
-  def this(expressions: Seq[Expression], inputSchema: Seq[Attribute]) =
-this(expressions.map(BindReferences.bindReference(_, inputSchema)))
+class InterpretedProjection(expressions: Seq[Expression], mutableRow: 
Boolean = false)
+  extends Projection {
+  def this(expressions: Seq[Expression],
+   inputSchema: Seq[Attribute], mutableRow: Boolean = false) =
--- End diff --

Could you explain this `mutableRow` change?  If we do need to keep it, the 
style here isn't quite right; see 
https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide for 
how to properly wrap long parameter lists.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114422982
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114422960
  
  [Test build #35532 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35532/console)
 for   PR 6810 at commit 
[`f1859be`](https://github.com/apache/spark/commit/f1859bee95bbd688e42e8d1953acc6fd121eb487).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class InterpretedProjection(expressions: Seq[Expression], mutableRow: 
Boolean = false)`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114395415
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114394069
  
  [Test build #35532 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35532/consoleFull)
 for   PR 6810 at commit 
[`f1859be`](https://github.com/apache/spark/commit/f1859bee95bbd688e42e8d1953acc6fd121eb487).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114393939
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114393960
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33016547
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableOperations.scala
 ---
@@ -49,7 +49,7 @@ import org.apache.spark.sql.SQLConf
 import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression, 
InternalRow, _}
 import org.apache.spark.sql.execution.{LeafNode, SparkPlan, UnaryNode}
 import org.apache.spark.sql.types.StructType
-import org.apache.spark.{Logging, TaskContext}
--- End diff --

Seemed mistakenly included. Will be removed in next update.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33016397
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val df = Seq.empty[(String, Int, Double)].toDF("a", "b", "c")
+
+val groupExpr = df.col("b").expr
+val aggrExpr = Alias(Count(Cast(groupExpr, LongType)), "Count")()
+
+for ((codegen, unsafe) <- Seq((false, false), (true, false), (true, 
true));
+ partial <- Seq(false, true)) {
+  TestSQLContext.conf.setConfString("spark.sql.codegen", 
String.valueOf(codegen))
--- End diff --

Done. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114393003
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114392946
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-23 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r33015113
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val df = Seq.empty[(String, Int, Double)].toDF("a", "b", "c")
+
+val groupExpr = df.col("b").expr
+val aggrExpr = Alias(Count(Cast(groupExpr, LongType)), "Count")()
+
+for ((codegen, unsafe) <- Seq((false, false), (true, false), (true, 
true));
+ partial <- Seq(false, true)) {
+  TestSQLContext.conf.setConfString("spark.sql.codegen", 
String.valueOf(codegen))
--- End diff --

Could you use the new SparkConf API? This line will be 
`TestSQLContext.setConf(SQLConf.CODEGEN_ENABLED, codegen)`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114335130
  
Also, if we do introduce a more general way of handling this sort of state 
cleanup as part of the test runner, then I suspect we'll want to both remove a 
lot of the old try-finally code and will have to fix any bugs due to tests that 
implicitly relied on the old behavior.  Since that has the potential to grow to 
cover a lot more changes than this bugfix, I'd like to tackle it separately.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114334861
  
I appreciate you trying to add a more general save / restore / backup 
mechanism for SqlConf state, but I'd rather not introduce that change as part 
of a bugfix.  I think that we might want to be able to introduce / test / 
backport that test infra change independent of the change here.

Do you mind just mimicking the simple try-finally save/restore that's used 
by other SQL tests and rolling back the wider test infra changes?  Happy to 
include them, just not here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114333710
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114333653
  
  [Test build #35507 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35507/console)
 for   PR 6810 at commit 
[`d20f8f9`](https://github.com/apache/spark/commit/d20f8f9ee075e94541f33760b9f45db01d8ff1d2).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114328832
  
  [Test build #35507 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35507/consoleFull)
 for   PR 6810 at commit 
[`d20f8f9`](https://github.com/apache/spark/commit/d20f8f9ee075e94541f33760b9f45db01d8ff1d2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114327877
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114327952
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114325568
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114325567
  
  [Test build #35504 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35504/console)
 for   PR 6810 at commit 
[`eb86c67`](https://github.com/apache/spark/commit/eb86c67265f283e56408ca4b30ba3a5dddffdd86).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114325431
  
  [Test build #35504 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35504/consoleFull)
 for   PR 6810 at commit 
[`eb86c67`](https://github.com/apache/spark/commit/eb86c67265f283e56408ca4b30ba3a5dddffdd86).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114325291
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114325255
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r32998574
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.SparkEnv
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val df = Seq.empty[(String, Int, Double)].toDF("a", "b", "c")
+
+val groupExpr = df.col("b").expr
+val aggrExpr = Alias(Count(Cast(groupExpr, LongType)), "Count")()
+
+SparkEnv.get.conf.set("spark.unsafe.exceptionOnMemoryLeak", "true")
--- End diff --

Ah, I can see how this would be an issue for IntelliJ. If we do want to 
modify the conf, though, we should do it elsewhere (when creating the 
SparkContext that's used to create the TestSQLContext).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r32998276
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.SparkEnv
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val df = Seq.empty[(String, Int, Double)].toDF("a", "b", "c")
+
+val groupExpr = df.col("b").expr
+val aggrExpr = Alias(Count(Cast(groupExpr, LongType)), "Count")()
+
+SparkEnv.get.conf.set("spark.unsafe.exceptionOnMemoryLeak", "true")
+for (codegen <- Seq(false, true); partial <- Seq(false, true); unsafe 
<- Seq(false, true)) {
--- End diff --

Ok, I'll do that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread navis
Github user navis commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r32998221
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.SparkEnv
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val df = Seq.empty[(String, Int, Double)].toDF("a", "b", "c")
+
+val groupExpr = df.col("b").expr
+val aggrExpr = Alias(Count(Cast(groupExpr, LongType)), "Count")()
+
+SparkEnv.get.conf.set("spark.unsafe.exceptionOnMemoryLeak", "true")
--- End diff --

Yes, I know that. Maybe I'm wrong but when running the test in 
IDE(intelliJ), the policy seemed not applied. I'll remove this line, anyway.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r32984080
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.SparkEnv
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val df = Seq.empty[(String, Int, Double)].toDF("a", "b", "c")
+
+val groupExpr = df.col("b").expr
+val aggrExpr = Alias(Count(Cast(groupExpr, LongType)), "Count")()
+
+SparkEnv.get.conf.set("spark.unsafe.exceptionOnMemoryLeak", "true")
+for (codegen <- Seq(false, true); partial <- Seq(false, true); unsafe 
<- Seq(false, true)) {
--- End diff --

Actually, some of these cross-product options don't make sense: if codegen 
is off, then the planner won't use GeneratedAggregate.  I think that we should 
change this test to store the old configuration values, then enable unsafe, run 
the regression test, then switch back to the old value.  We should use a 
try-finally block to ensure that the reset of the settings is done properly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r32970075
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.SparkEnv
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val df = Seq.empty[(String, Int, Double)].toDF("a", "b", "c")
+
+val groupExpr = df.col("b").expr
+val aggrExpr = Alias(Count(Cast(groupExpr, LongType)), "Count")()
+
+SparkEnv.get.conf.set("spark.unsafe.exceptionOnMemoryLeak", "true")
+for (codegen <- Seq(false, true); partial <- Seq(false, true); unsafe 
<- Seq(false, true)) {
--- End diff --

I don't think that this is necessarily a safe way to implement this kind of 
configuration cross-product logic in SQL because this test's modifications to 
the global `TestSqlContext` won't be cleaned up at the end of this test.

Ping @mambrus, is there a right way to handle this sort of cleanup / 
rollback of configuration changes?  In Spark core, we have a similar problem 
with system properties and use a test suite mixin to automatically snapshot the 
configuration before the tests and roll back to it after they've run.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r32969602
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/AggregateSuite.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.SparkEnv
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.test.TestSQLContext
+import org.apache.spark.sql.types.DataTypes._
+
+class AggregateSuite extends SparkPlanTest {
+
+  test("SPARK-8357 Memory leakage on unsafe aggregation path with empty 
input") {
+
+val df = Seq.empty[(String, Int, Double)].toDF("a", "b", "c")
+
+val groupExpr = df.col("b").expr
+val aggrExpr = Alias(Count(Cast(groupExpr, LongType)), "Count")()
+
+SparkEnv.get.conf.set("spark.unsafe.exceptionOnMemoryLeak", "true")
--- End diff --

This option is automatically enabled when running tests in Maven or SBT, so 
we don't need to put this here.  Additionally, mutating the SparkConf after 
it's already been used to create a SparkContext is not always safe, so I think 
that we should remove this line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114154895
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114154839
  
  [Test build #35450 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35450/console)
 for   PR 6810 at commit 
[`0a5bf17`](https://github.com/apache/spark/commit/0a5bf170a1ed89344f2a57bd891154c17a6c44f8).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class Module(object):`
  * `class PCAModel(JavaVectorTransformer):`
  * `class PCA(object):`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114108709
  
  [Test build #35450 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35450/consoleFull)
 for   PR 6810 at commit 
[`0a5bf17`](https://github.com/apache/spark/commit/0a5bf170a1ed89344f2a57bd891154c17a6c44f8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114107943
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114108019
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114057184
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114057166
  
  [Test build #35440 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35440/console)
 for   PR 6810 at commit 
[`36ecd1f`](https://github.com/apache/spark/commit/36ecd1f4d9ef5d49be4f4826adf0778c70b82395).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114053435
  
  [Test build #35440 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35440/consoleFull)
 for   PR 6810 at commit 
[`36ecd1f`](https://github.com/apache/spark/commit/36ecd1f4d9ef5d49be4f4826adf0778c70b82395).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114052812
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114038705
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-22 Thread navis
Github user navis commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114038518
  
@JoshRosen Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-21 Thread navis
Github user navis commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-114017606
  
@JoshRosen Sure, I'll review your patch first. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-18 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-113321764
  
Now that we've merged #6885, it should be possible to write a regression 
test for this.  Take a look and let me know if you're interested in doing that. 
 If not, I can merge this as-is.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111859376
  
  [Test build #34889 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34889/console)
 for   PR 6810 at commit 
[`993c289`](https://github.com/apache/spark/commit/993c289afbee58b9cb82d7c828d8658c25baab91).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111859431
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111844186
  
  [Test build #34889 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/34889/consoleFull)
 for   PR 6810 at commit 
[`993c289`](https://github.com/apache/spark/commit/993c289afbee58b9cb82d7c828d8658c25baab91).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-14 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111844007
  
I'm actually working on some tooling that will make it very easy to write 
regressions tests for these kinds of issues (basically, a way to instantiate 
individual operators and run them with canned inputs, which is much easier than 
trying to rig a high-level query that triggers the code path under test).  This 
should not block on that, though, but I'll ping this thread later today with an 
example of how we could write such a test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111843956
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111843953
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-14 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111843931
  
Jenkins, this is ok to test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-13 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111777483
  
I guess there's not a super easy way to add a regression test for this 
since it's kind of hard to just instantiate the operator and run it with some 
empty input.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-13 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/6810#discussion_r32375944
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/GeneratedAggregate.scala 
---
@@ -270,7 +270,9 @@ case class GeneratedAggregate(
 
   val joinedRow = new JoinedRow3
 
-  if (groupingExpressions.isEmpty) {
+  if (!iter.hasNext) {
+Iterator[InternalRow]()
--- End diff --

Can we add a comment here to explain why we're doing this?  Just needs to 
mention the memory leak issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6810#issuecomment-111776267
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8357] [SQL] Memory leakage on unsafe ag...

2015-06-13 Thread navis
GitHub user navis opened a pull request:

https://github.com/apache/spark/pull/6810

[SPARK-8357] [SQL] Memory leakage on unsafe aggregation path with empty 
input

Currently, unsafe-based hash is released on 'next' call but if input is 
empty, it would not be called ever. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/navis/spark SPARK-8357

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6810.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6810


commit b67b8aad2f4ebdcbfeba173bfaf626d73e4f2355
Author: navis.ryu 
Date:   2015-06-14T03:10:03Z

[SPARK-8357] [SQL] Memory leakage on unsafe aggregation path with empty 
input




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org