[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130519454
  
  [Test build #40724 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40724/consoleFull)
 for   PR 8113 at commit 
[`f828bdf`](https://github.com/apache/spark/commit/f828bdf1612a5fc9466b9a7e80700d0dd94faaf5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130519356
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130519340
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130512114
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130512083
  
  [Test build #40705 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40705/console)
 for   PR 8113 at commit 
[`ad0ac67`](https://github.com/apache/spark/commit/ad0ac67874a9e2a7d266f4a792e7fcfe0f5617ce).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class First(child: Expression, ignoreNullsExpr: Expression) 
extends AlgebraicAggregate `
  * `case class Last(child: Expression, ignoreNullsExpr: Expression) 
extends AlgebraicAggregate `
  * `case class First(`
  * `case class FirstFunction(`
  * `case class Last(`
  * `case class LastFunction(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130493011
  
  [Test build #40705 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40705/consoleFull)
 for   PR 8113 at commit 
[`ad0ac67`](https://github.com/apache/spark/commit/ad0ac67874a9e2a7d266f4a792e7fcfe0f5617ce).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130493005
  
@hvanhovell Since we already pass the feature freeze deadline, I will not 
add new interfaces to DataFrame API (this PR is mainly about fixing the default 
behavior). If users request adding DF functions that expose `ignoreNulls` flag, 
we can do it in 1.6.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130492880
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130492894
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/8113#discussion_r36878009
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
 ---
@@ -130,24 +147,67 @@ case class First(child: Expression) extends 
AlgebraicAggregate {
 
   private val first = AttributeReference("first", child.dataType)()
 
-  override val bufferAttributes = first :: Nil
+  private val valueSet = AttributeReference("valueSet", BooleanType)()
+
+  override val bufferAttributes = first :: valueSet :: Nil
 
   override val initialValues = Seq(
-/* first = */ Literal.create(null, child.dataType)
+/* first = */ Literal.create(null, child.dataType),
+/* valueSet = */ Literal.create(false, BooleanType)
   )
 
-  override val updateExpressions = Seq(
-/* first = */ If(IsNull(first), child, first)
-  )
+  override val updateExpressions = {
+val litTrue = Literal.create(true, BooleanType)
+if (ignoreNulls) {
+  Seq(
+/* first = */ If(Or(valueSet, IsNull(child)), first, child),
+/* valueSet = */ If(Or(valueSet, IsNull(child)), valueSet, litTrue)
+  )
+} else {
+  Seq(
+/* first = */ If(valueSet, first, child),
+/* valueSet = */ litTrue
+  )
+}
+  }
 
-  override val mergeExpressions = Seq(
-/* first = */ If(IsNull(first.left), first.right, first.left)
-  )
+  override val mergeExpressions = {
+val litTrue = Literal.create(true, BooleanType)
+if (ignoreNulls) {
+  Seq(
+/* first = */ If(Or(valueSet.left, IsNull(first.right)), 
first.left, first.right),
--- End diff --

I don't think we need to check ```IsNull(first.right)```.

On a more general note: I think we don't need to separate the 
```ignoreNulls = false``` and ```ignoreNulls = true``` paths. This should be 
fine for both:
```
/* first = */ If(valueSet.left, first.left, first.right),
/* valueSet = */ Or(valueSet.left, valueSet.right)
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/8113#discussion_r36877100
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
 ---
@@ -130,24 +147,67 @@ case class First(child: Expression) extends 
AlgebraicAggregate {
 
   private val first = AttributeReference("first", child.dataType)()
 
-  override val bufferAttributes = first :: Nil
+  private val valueSet = AttributeReference("valueSet", BooleanType)()
+
+  override val bufferAttributes = first :: valueSet :: Nil
 
   override val initialValues = Seq(
-/* first = */ Literal.create(null, child.dataType)
+/* first = */ Literal.create(null, child.dataType),
+/* valueSet = */ Literal.create(false, BooleanType)
   )
 
-  override val updateExpressions = Seq(
-/* first = */ If(IsNull(first), child, first)
-  )
+  override val updateExpressions = {
+val litTrue = Literal.create(true, BooleanType)
+if (ignoreNulls) {
+  Seq(
+/* first = */ If(Or(valueSet, IsNull(child)), first, child),
+/* valueSet = */ If(Or(valueSet, IsNull(child)), valueSet, litTrue)
+  )
+} else {
+  Seq(
+/* first = */ If(valueSet, first, child),
+/* valueSet = */ litTrue
+  )
+}
+  }
 
-  override val mergeExpressions = Seq(
-/* first = */ If(IsNull(first.left), first.right, first.left)
-  )
+  override val mergeExpressions = {
+val litTrue = Literal.create(true, BooleanType)
+if (ignoreNulls) {
+  Seq(
+/* first = */ If(Or(valueSet.left, IsNull(first.right)), 
first.left, first.right),
+/* valueSet = */ If(Or(valueSet.left, IsNull(first.right)), 
valueSet.left, litTrue)
--- End diff --

Nit: ```Or(valueSet.left. valueSet.right)``` is shorter.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread hvanhovell
Github user hvanhovell commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130343660
  
One more small thing. We should probably also add the ```ignoreNulls``` 
option to the ```first``` and ```last``` dataframe functions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/8113#discussion_r36873218
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
 ---
@@ -130,24 +147,67 @@ case class First(child: Expression) extends 
AlgebraicAggregate {
 
   private val first = AttributeReference("first", child.dataType)()
 
-  override val bufferAttributes = first :: Nil
+  private val valueSet = AttributeReference("valueSet", BooleanType)()
+
+  override val bufferAttributes = first :: valueSet :: Nil
 
   override val initialValues = Seq(
-/* first = */ Literal.create(null, child.dataType)
+/* first = */ Literal.create(null, child.dataType),
+/* valueSet = */ Literal.create(false, BooleanType)
   )
 
-  override val updateExpressions = Seq(
-/* first = */ If(IsNull(first), child, first)
-  )
+  override val updateExpressions = {
+val litTrue = Literal.create(true, BooleanType)
+if (ignoreNulls) {
+  Seq(
+/* first = */ If(Or(valueSet, IsNull(child)), first, child),
+/* valueSet = */ If(Or(valueSet, IsNull(child)), valueSet, litTrue)
+  )
+} else {
+  Seq(
+/* first = */ If(valueSet, first, child),
+/* valueSet = */ litTrue
+  )
+}
+  }
 
-  override val mergeExpressions = Seq(
-/* first = */ If(IsNull(first.left), first.right, first.left)
-  )
+  override val mergeExpressions = {
+val litTrue = Literal.create(true, BooleanType)
+if (ignoreNulls) {
+  Seq(
+/* first = */ If(Or(valueSet.left, IsNull(first.right)), 
first.left, first.right),
+/* valueSet = */ If(Or(valueSet.left, IsNull(first.right)), 
valueSet.left, litTrue)
+  )
+} else {
+  Seq(
+/* first = */ If(valueSet.left, first.left, first.right),
+/* valueSet = */ litTrue
--- End diff --

Is it possible that two non-updated aggregates get merged? If it is, then 
this should be: ```Or(valueSet.left. valueSet.right)```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/8113#discussion_r36873052
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
 ---
@@ -130,24 +147,67 @@ case class First(child: Expression) extends 
AlgebraicAggregate {
 
   private val first = AttributeReference("first", child.dataType)()
 
-  override val bufferAttributes = first :: Nil
+  private val valueSet = AttributeReference("valueSet", BooleanType)()
+
+  override val bufferAttributes = first :: valueSet :: Nil
 
   override val initialValues = Seq(
-/* first = */ Literal.create(null, child.dataType)
+/* first = */ Literal.create(null, child.dataType),
+/* valueSet = */ Literal.create(false, BooleanType)
   )
 
-  override val updateExpressions = Seq(
-/* first = */ If(IsNull(first), child, first)
-  )
+  override val updateExpressions = {
+val litTrue = Literal.create(true, BooleanType)
+if (ignoreNulls) {
+  Seq(
+/* first = */ If(Or(valueSet, IsNull(child)), first, child),
+/* valueSet = */ If(Or(valueSet, IsNull(child)), valueSet, litTrue)
--- End diff --

Nit: ```Or(valueSet, Not(IsNull(child)))``` is a bit shorter.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130240478
  
  [Test build #1488 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1488/console)
 for   PR 8113 at commit 
[`2d93d89`](https://github.com/apache/spark/commit/2d93d891c24d3b661b68991fef6672095692ac1b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-12 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130233416
  
  [Test build #1488 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1488/consoleFull)
 for   PR 8113 at commit 
[`2d93d89`](https://github.com/apache/spark/commit/2d93d891c24d3b661b68991fef6672095692ac1b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130185947
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130185862
  
  [Test build #40589 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40589/console)
 for   PR 8113 at commit 
[`15c9e31`](https://github.com/apache/spark/commit/15c9e3122d49743ec5c6df268083289156e22541).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class First(child: Expression, ignoreNulls: Boolean) extends 
AlgebraicAggregate `
  * `case class Last(child: Expression, ignoreNulls: Boolean) extends 
AlgebraicAggregate `
  * `case class First(`
  * `case class FirstFunction(`
  * `case class Last(`
  * `case class LastFunction(`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130172240
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130171233
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread yhuai
Github user yhuai commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130168995
  
@ggupta81 Yeah. We can close #7928. Can you open a pr to fix branch 1.4, or 
reopen #7233 and update that? Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130167835
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130167841
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread ggupta81
Github user ggupta81 commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130167793
  
Should i close my original pull request now?
https://github.com/apache/spark/pull/7928

On Wed, Aug 12, 2015 at 10:38 AM, UCB AMPLab 
wrote:

> Merged build started.
>
> —
> Reply to this email directly or view it on GitHub
> .
>



-- 

*Gaurav Gupta*Engineering Manager @ Adobe
9971666780



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130167342
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8113#issuecomment-130167350
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/8113#discussion_r36826885
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala
 ---
@@ -630,59 +632,114 @@ case class CombineSetsAndSumFunction(
   }
 }
 
-case class First(child: Expression) extends UnaryExpression with 
PartialAggregate1 {
+case class First(
+child: Expression,
+ignoreNulls: Boolean)
+  extends UnaryExpression with PartialAggregate1 {
+
+  def this(child: Expression) = this(child, false)
+
+  def this(child: Expression, ignoreNulls: Expression) = this(child, 
ignoreNulls match {
+case Literal(b: Boolean, BooleanType) => b
+case _ =>
+  throw new AnalysisException("The second argument of First should be 
a boolean literal.")
+  })
+
   override def nullable: Boolean = true
   override def dataType: DataType = child.dataType
-  override def toString: String = s"FIRST($child)"
+  override def toString: String = s"FIRST(${child}${if (ignoreNulls) " 
IGNORE NULLS"})"
 
   override def asPartial: SplitEvaluation = {
-val partialFirst = Alias(First(child), "PartialFirst")()
+val partialFirst = Alias(First(child, ignoreNulls), "PartialFirst")()
 SplitEvaluation(
-  First(partialFirst.toAttribute),
+  First(partialFirst.toAttribute, ignoreNulls),
   partialFirst :: Nil)
   }
-  override def newInstance(): FirstFunction = new FirstFunction(child, 
this)
+  override def newInstance(): FirstFunction = new FirstFunction(child, 
ignoreNulls, this)
 }
 
-case class FirstFunction(expr: Expression, base: AggregateExpression1) 
extends AggregateFunction1 {
-  def this() = this(null, null) // Required for serialization.
+object First {
+  def apply(child: Expression): First = First(child, ignoreNulls = false)
+}
 
-  var result: Any = null
+case class FirstFunction(
+expr: Expression,
+ignoreNulls: Boolean,
+base: AggregateExpression1)
+  extends AggregateFunction1 {
+
+  def this() = this(null, null.asInstanceOf[Boolean], null) // Required 
for serialization.
+
+  private[this] var result: Any = null
+
+  private[this] var valueSet: Boolean = false
 
   override def update(input: InternalRow): Unit = {
-if (result == null) {
-  result = expr.eval(input)
+if (!valueSet) {
+  val value = expr.eval(input)
+  // When we have not set the result, we will set the result if we 
respect nulls
+  // (i.e. ignoreNulls is false), or we ignore nulls and the evaluated 
value is not null.
+  if (!ignoreNulls || (ignoreNulls && value != null)) {
+result = value
+valueSet = true
+  }
 }
   }
 
   override def eval(input: InternalRow): Any = result
 }
 
-case class Last(child: Expression) extends UnaryExpression with 
PartialAggregate1 {
+case class Last(
+child: Expression,
+ignoreNulls: Boolean)
+  extends UnaryExpression with PartialAggregate1 {
+
+  def this(child: Expression) = this(child, false)
+
+  def this(child: Expression, ignoreNulls: Expression) = this(child, 
ignoreNulls match {
+case Literal(b: Boolean, BooleanType) => b
+case _ =>
+  throw new AnalysisException("The second argument of Last should be a 
boolean literal.")
+  })
+
   override def references: AttributeSet = child.references
   override def nullable: Boolean = true
   override def dataType: DataType = child.dataType
-  override def toString: String = s"LAST($child)"
+  override def toString: String = s"LAST($child)${if (ignoreNulls) " 
IGNORE NULLS"}"
 
   override def asPartial: SplitEvaluation = {
-val partialLast = Alias(Last(child), "PartialLast")()
+val partialLast = Alias(Last(child, ignoreNulls), "PartialLast")()
 SplitEvaluation(
-  Last(partialLast.toAttribute),
+  Last(partialLast.toAttribute, ignoreNulls),
   partialLast :: Nil)
   }
-  override def newInstance(): LastFunction = new LastFunction(child, this)
+  override def newInstance(): LastFunction = new LastFunction(child, 
ignoreNulls, this)
 }
 
-case class LastFunction(expr: Expression, base: AggregateExpression1) 
extends AggregateFunction1 {
-  def this() = this(null, null) // Required for serialization.
+object Last {
+  def apply(child: Expression): Last = Last(child, ignoreNulls = false)
+}
+
+case class LastFunction(
+expr: Expression,
+ignoreNulls: Boolean,
+base: AggregateExpression1)
+  extends AggregateFunction1 {
+
+  def this() = this(null, null.asInstanceOf[Boolean], null) // Required 
for serializ

[GitHub] spark pull request: [SPARK-9740] [SPARK-9592] [SQL] Change the def...

2015-08-11 Thread ggupta81
Github user ggupta81 commented on a diff in the pull request:

https://github.com/apache/spark/pull/8113#discussion_r36826491
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregates.scala
 ---
@@ -630,59 +632,114 @@ case class CombineSetsAndSumFunction(
   }
 }
 
-case class First(child: Expression) extends UnaryExpression with 
PartialAggregate1 {
+case class First(
+child: Expression,
+ignoreNulls: Boolean)
+  extends UnaryExpression with PartialAggregate1 {
+
+  def this(child: Expression) = this(child, false)
+
+  def this(child: Expression, ignoreNulls: Expression) = this(child, 
ignoreNulls match {
+case Literal(b: Boolean, BooleanType) => b
+case _ =>
+  throw new AnalysisException("The second argument of First should be 
a boolean literal.")
+  })
+
   override def nullable: Boolean = true
   override def dataType: DataType = child.dataType
-  override def toString: String = s"FIRST($child)"
+  override def toString: String = s"FIRST(${child}${if (ignoreNulls) " 
IGNORE NULLS"})"
 
   override def asPartial: SplitEvaluation = {
-val partialFirst = Alias(First(child), "PartialFirst")()
+val partialFirst = Alias(First(child, ignoreNulls), "PartialFirst")()
 SplitEvaluation(
-  First(partialFirst.toAttribute),
+  First(partialFirst.toAttribute, ignoreNulls),
   partialFirst :: Nil)
   }
-  override def newInstance(): FirstFunction = new FirstFunction(child, 
this)
+  override def newInstance(): FirstFunction = new FirstFunction(child, 
ignoreNulls, this)
 }
 
-case class FirstFunction(expr: Expression, base: AggregateExpression1) 
extends AggregateFunction1 {
-  def this() = this(null, null) // Required for serialization.
+object First {
+  def apply(child: Expression): First = First(child, ignoreNulls = false)
+}
 
-  var result: Any = null
+case class FirstFunction(
+expr: Expression,
+ignoreNulls: Boolean,
+base: AggregateExpression1)
+  extends AggregateFunction1 {
+
+  def this() = this(null, null.asInstanceOf[Boolean], null) // Required 
for serialization.
+
+  private[this] var result: Any = null
+
+  private[this] var valueSet: Boolean = false
 
   override def update(input: InternalRow): Unit = {
-if (result == null) {
-  result = expr.eval(input)
+if (!valueSet) {
+  val value = expr.eval(input)
+  // When we have not set the result, we will set the result if we 
respect nulls
+  // (i.e. ignoreNulls is false), or we ignore nulls and the evaluated 
value is not null.
+  if (!ignoreNulls || (ignoreNulls && value != null)) {
+result = value
+valueSet = true
+  }
 }
   }
 
   override def eval(input: InternalRow): Any = result
 }
 
-case class Last(child: Expression) extends UnaryExpression with 
PartialAggregate1 {
+case class Last(
+child: Expression,
+ignoreNulls: Boolean)
+  extends UnaryExpression with PartialAggregate1 {
+
+  def this(child: Expression) = this(child, false)
+
+  def this(child: Expression, ignoreNulls: Expression) = this(child, 
ignoreNulls match {
+case Literal(b: Boolean, BooleanType) => b
+case _ =>
+  throw new AnalysisException("The second argument of Last should be a 
boolean literal.")
+  })
+
   override def references: AttributeSet = child.references
   override def nullable: Boolean = true
   override def dataType: DataType = child.dataType
-  override def toString: String = s"LAST($child)"
+  override def toString: String = s"LAST($child)${if (ignoreNulls) " 
IGNORE NULLS"}"
 
   override def asPartial: SplitEvaluation = {
-val partialLast = Alias(Last(child), "PartialLast")()
+val partialLast = Alias(Last(child, ignoreNulls), "PartialLast")()
 SplitEvaluation(
-  Last(partialLast.toAttribute),
+  Last(partialLast.toAttribute, ignoreNulls),
   partialLast :: Nil)
   }
-  override def newInstance(): LastFunction = new LastFunction(child, this)
+  override def newInstance(): LastFunction = new LastFunction(child, 
ignoreNulls, this)
 }
 
-case class LastFunction(expr: Expression, base: AggregateExpression1) 
extends AggregateFunction1 {
-  def this() = this(null, null) // Required for serialization.
+object Last {
+  def apply(child: Expression): Last = Last(child, ignoreNulls = false)
+}
+
+case class LastFunction(
+expr: Expression,
+ignoreNulls: Boolean,
+base: AggregateExpression1)
+  extends AggregateFunction1 {
+
+  def this() = this(null, null.asInstanceOf[Boolean], null) // Required 
for seria