[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111712646
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -368,6 +369,8 @@ case class NullPropagation(conf: SQLConf) extends 
Rule[LogicalPlan] {
   case EqualNullSafe(Literal(null, _), r) => IsNull(r)
   case EqualNullSafe(l, Literal(null, _)) => IsNull(l)
 
+  case AssertNotNull(c, _) if !c.nullable => c
--- End diff --

Actually, I checked all the usages of `AssertNotNull`: we never use 
`AssertNotNull` to check a non-nullable column/field, so it seems the 
documentation of `AssertNotNull` is wrong. Can you double-check?
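
As a reading aid for the diff above, here is a minimal sketch of what the proposed rewrite does, assuming a toy expression model. `Expr`, `Column`, the one-argument `AssertNotNull`, and `ToyNullPropagation` are hypothetical stand-ins written for this illustration; they are not Catalyst's classes, and the real rule lives inside `NullPropagation`.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst API.
sealed trait Expr { def nullable: Boolean }

// A column reference whose nullability is known from the schema.
case class Column(name: String, nullable: Boolean) extends Expr

// Throws at runtime if the child evaluates to null, so its own result is never null.
case class AssertNotNull(child: Expr) extends Expr {
  val nullable: Boolean = false
}

object ToyNullPropagation {
  // Drop the runtime check when the child is already known to be non-nullable;
  // keep it when the child may still produce null.
  def simplify(e: Expr): Expr = e match {
    case AssertNotNull(c) if !c.nullable => simplify(c)
    case AssertNotNull(c)                => AssertNotNull(simplify(c))
    case other                           => other
  }
}
```

In this toy model the check survives only around a nullable child, which is the situation the comment above says is the one that actually occurs in practice.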


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17568
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75850/
Test PASSed.





[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17568
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17568
  
**[Test build #75850 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75850/testReport)**
 for PR 17568 at commit 
[`f695e50`](https://github.com/apache/spark/commit/f695e50e38bd329db3b75951dd7af52fea3b3dde).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111711655
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -368,6 +369,8 @@ case class NullPropagation(conf: SQLConf) extends 
Rule[LogicalPlan] {
   case EqualNullSafe(Literal(null, _), r) => IsNull(r)
   case EqualNullSafe(l, Literal(null, _)) => IsNull(l)
 
+  case AssertNotNull(c, _) if !c.nullable => c
--- End diff --

Ah, good catch! Sorry, it was my mistake. But then it seems we cannot remove 
`MapObjects`, as the null check has to be done.





[GitHub] spark issue #17596: [SPARK-12837][CORE] Do not send the accumulator name to ...

2017-04-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17596
  
It seems this breaks the Python accumulator. Does anyone know how the Python 
accumulator works?





[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-16 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/17640
  
Based on my understanding, it does not directly solve 12360. This one 
just solves the serialization of a specific type, `bigint`, in a struct field. 





[GitHub] spark issue #17640: [SPARK-17608][SPARKR]:Long type has incorrect serializat...

2017-04-16 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/17640
  
For the `Inf` case, I used a very large number:


1380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013
 
80742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240138074279341524013807427934152401380742793415240





[GitHub] spark issue #17620: [SPARK-20305][Spark Core]Master may keep in the state of...

2017-04-16 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/17620
  
Excuse me, can this issue be closed, or are there some other problems? 
@jerryshao 





[GitHub] spark issue #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operations in...

2017-04-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17540
  
Yeah, let's remove that test.





[GitHub] spark issue #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIKE' patt...

2017-04-16 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/15398
  
I re-checked the current change and I think it is in good shape. Are there 
any unresolved issues or pending decisions on this?

ping @jodersky Would you like to update this with master? Thanks.






[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17568
  
**[Test build #75850 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75850/testReport)**
 for PR 17568 at commit 
[`f695e50`](https://github.com/apache/spark/commit/f695e50e38bd329db3b75951dd7af52fea3b3dde).





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111704431
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -368,6 +369,8 @@ case class NullPropagation(conf: SQLConf) extends 
Rule[LogicalPlan] {
   case EqualNullSafe(Literal(null, _), r) => IsNull(r)
   case EqualNullSafe(l, Literal(null, _)) => IsNull(l)
 
+  case AssertNotNull(c, _) if !c.nullable => c
--- End diff --

I am not sure whether @cloud-fan's no-op `AssertNotNull` is the same as the 
case in `AssertNotNull`'s description.





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15435
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15435
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75847/
Test PASSed.





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111704129
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/objects.scala
 ---
@@ -96,3 +98,30 @@ object CombineTypedFilters extends Rule[LogicalPlan] {
 }
   }
 }
+
+/**
+ * Removes MapObjects when the following conditions are satisfied
+ *   1. Mapobject(e) where e is lambdavariable(), which means types for 
input output
+ *  are primitive types
+ *   2. no custom collection class specified
+ * representation of data item.  For example back to back map operations.
+ */
+object EliminateMapObjects extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case _ @ DeserializeToObject(Invoke(
+MapObjects(_, _, _, Cast(LambdaVariable(_, _, dataType, _), 
castDataType, _),
+  inputData, None),
+funcName, returnType: ObjectType, arguments, propagateNull, 
returnNullable),
+outputObjAttr, child) if dataType == castDataType =>
--- End diff --

I see





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17655
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17655
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75849/
Test PASSed.





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111704118
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -368,6 +369,8 @@ case class NullPropagation(conf: SQLConf) extends 
Rule[LogicalPlan] {
   case EqualNullSafe(Literal(null, _), r) => IsNull(r)
   case EqualNullSafe(l, Literal(null, _)) => IsNull(l)
 
+  case AssertNotNull(c, _) if !c.nullable => c
--- End diff --

I think that this is what @cloud-fan suggested in [his 
comment](https://github.com/apache/spark/pull/17568#discussion_r111521892).
Is my interpretation wrong?





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15435
  
**[Test build #75847 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75847/testReport)**
 for PR 15435 at commit 
[`053284d`](https://github.com/apache/spark/commit/053284da60d72a79eb1f94da6d2c7dda74a21af8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17655
  
**[Test build #75849 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75849/testReport)**
 for PR 17655 at commit 
[`65b0ff7`](https://github.com/apache/spark/commit/65b0ff76a2af83053e45948d1df60092fae118fd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIK...

2017-04-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/15398#discussion_r111704017
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
 ---
@@ -68,7 +68,30 @@ trait StringRegexExpression extends 
ImplicitCastInputTypes {
  * Simple RegEx pattern matching function
  */
 @ExpressionDescription(
-  usage = "str _FUNC_ pattern - Returns true if `str` matches `pattern`, 
or false otherwise.")
+  usage = "str _FUNC_ pattern - Returns true if str matches pattern, " +
+"null if any arguments are null, false otherwise.",
+  extended = """
+Arguments:
+  str - a string expression
+  pattern - a string expression. The pattern is a string which is 
matched literally, with
+exception to the following special symbols:
+
+  _ matches any one character in the input (similar to . in posix 
regular expressions)
+
+  % matches zero ore more characters in the input (similar to .* 
in posix regular
--- End diff --

ore -> or?
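
The wildcard semantics described in the docstring above can be sketched as a translation from a LIKE pattern to a Java regular expression. This is only an illustration under stated assumptions: `LikeSketch`, `likeToRegex`, and `like` are hypothetical names, not the code in `regexpExpressions.scala`, and the backslash-escape handling that this pull request is about is deliberately left out.

```scala
import java.util.regex.Pattern

// Hypothetical helper for illustration; not Spark's implementation.
object LikeSketch {
  // `_` becomes `.`, `%` becomes `.*`, and every other character is quoted
  // so that it is matched literally. Escape sequences are not handled here.
  def likeToRegex(pattern: String): String =
    pattern.map {
      case '_' => "."
      case '%' => ".*"
      case c   => Pattern.quote(c.toString)
    }.mkString

  // DOTALL lets the wildcards match newline characters as well.
  def like(str: String, pattern: String): Boolean =
    Pattern.compile(likeToRegex(pattern), Pattern.DOTALL).matcher(str).matches()
}
```

With this sketch, `LikeSketch.like("spark", "sp_rk")` and `LikeSketch.like("spark", "s%")` both hold, while `LikeSketch.like("abc", "a.c")` does not, because `.` in a LIKE pattern is a literal character.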





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15435
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75846/
Test PASSed.





[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r111703865
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SQLExecution.scala ---
@@ -39,6 +39,32 @@ object SQLExecution {
 executionIdToQueryExecution.get(executionId)
   }
 
+  private val testing = sys.props.contains("spark.testing")
+
+  private[sql] def checkSQLExecutionId(sparkSession: SparkSession): Unit = 
{
--- End diff --

This is only called in `FileFormatWriter`. Are there any other places we 
need to consider?





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15435
  
Build finished. Test PASSed.





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15435
  
**[Test build #75846 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75846/testReport)**
 for PR 15435 at commit 
[`bd40098`](https://github.com/apache/spark/commit/bd40098912e28a42e2a9011c4a5d298ca737dc69).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request #17540: [SPARK-20213][SQL][UI] Fix DataFrameWriter operat...

2017-04-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/17540#discussion_r111703744
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -180,9 +180,9 @@ class Dataset[T] private[sql](
 // to happen right away to let these side effects take place eagerly.
 queryExecution.analyzed match {
   case c: Command =>
-LocalRelation(c.output, 
queryExecution.executedPlan.executeCollect())
+LocalRelation(c.output, withAction("collect", 
queryExecution)(_.executeCollect()))
   case u @ Union(children) if children.forall(_.isInstanceOf[Command]) 
=>
-LocalRelation(u.output, 
queryExecution.executedPlan.executeCollect())
+LocalRelation(u.output, withAction("collect", 
queryExecution)(_.executeCollect()))
--- End diff --

Shall we only add an execution id for commands that will trigger execution? 
AFAIK there are 3 such commands: `CreateDataSourceTableAsSelectCommand`, 
`CreateHiveTableAsSelectCommand` and `CacheTable`. We can call 
`SQLExecution.withNewExecutionId` inside these 3 commands.

Then we don't need to worry about nested execution.





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17655
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17655
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75848/
Test PASSed.





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17655
  
**[Test build #75848 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75848/testReport)**
 for PR 17655 at commit 
[`47771e1`](https://github.com/apache/spark/commit/47771e1ce11107b62057c7bc4e9909c008b3fe58).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17641: [SPARK-20329][SQL] Make timezone aware expression...

2017-04-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17641#discussion_r111703152
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveInlineTables.scala
 ---
@@ -99,12 +99,9 @@ case class ResolveInlineTables(conf: SQLConf) extends 
Rule[LogicalPlan] {
   val castedExpr = if (e.dataType.sameType(targetType)) {
 e
   } else {
-Cast(e, targetType)
+Cast(e, targetType, Some(conf.sessionLocalTimeZone))
   }
-  castedExpr.transform {
-case e: TimeZoneAwareExpression if e.timeZoneId.isEmpty =>
-  e.withTimeZone(conf.sessionLocalTimeZone)
-  }.eval()
+  castedExpr.eval()
--- End diff --

oh, right. I saw the changes to `TimeZoneAwareExpression`. :-)





[GitHub] spark pull request #17641: [SPARK-20329][SQL] Make timezone aware expression...

2017-04-16 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/17641#discussion_r111702719
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveInlineTables.scala
 ---
@@ -99,12 +99,9 @@ case class ResolveInlineTables(conf: SQLConf) extends 
Rule[LogicalPlan] {
   val castedExpr = if (e.dataType.sameType(targetType)) {
 e
   } else {
-Cast(e, targetType)
+Cast(e, targetType, Some(conf.sessionLocalTimeZone))
   }
-  castedExpr.transform {
-case e: TimeZoneAwareExpression if e.timeZoneId.isEmpty =>
-  e.withTimeZone(conf.sessionLocalTimeZone)
-  }.eval()
+  castedExpr.eval()
--- End diff --

I guess a `TimeZoneAwareExpression` is now resolved only if it has a 
`timeZoneId`, so we don't need to transform the child.





[GitHub] spark issue #17623: [SPARK-20292][SQL] Clean up string representation of Tre...

2017-04-16 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/17623
  
> What are the external impacts of these changes? Which commands are 
impacted?

This patch mainly cleans up the definitions of two string representation 
methods: `simpleString` and `verboseString`. `simpleString` no longer shows 
argument info. `verboseString` no longer shows children info.

Because of the above change, `Expression.treeString` changes too. Previously 
it showed duplicate children information, as in the example in the PR 
description. Now the children info is shown only once, as a tree representation.

`QueryPlan` has less of this mess. Previously, `QueryPlan.verboseString` was 
an alias of `QueryPlan.simpleString`. Following the definition above, 
`QueryPlan.simpleString` now shows a simple string representation without 
argument info.

After this patch, to see the arguments of an expression/query plan, a user 
should use `verboseString` instead of `simpleString`. To see the children of 
an expression/query plan, a user should use `treeString` instead of 
`verboseString`.

I think the only command that uses these string representations is 
`explain`. This patch won't change its output.
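
As a rough illustration of the three representations described above, here is a toy Java sketch. `NodeSketch` and its fields are hypothetical and are not Spark's `TreeNode`; printing each node's `verboseString` inside the tree is an arbitrary choice made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy node type (hypothetical), far simpler than Spark's TreeNode.
class NodeSketch {
    final String name;
    final String args;
    final List<NodeSketch> children;

    NodeSketch(String name, String args, NodeSketch... children) {
        this.name = name;
        this.args = args;
        this.children = new ArrayList<>(Arrays.asList(children));
    }

    // Node name only: no arguments, no children.
    String simpleString() {
        return name;
    }

    // Node name plus arguments: still no children.
    String verboseString() {
        return name + "(" + args + ")";
    }

    // One line per node, with children indented beneath their parent,
    // so each child appears exactly once.
    String treeString() {
        StringBuilder sb = new StringBuilder();
        appendTo(sb, 0);
        return sb.toString();
    }

    private void appendTo(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(verboseString()).append('\n');
        for (NodeSketch child : children) {
            child.appendTo(sb, depth + 1);
        }
    }

    public static void main(String[] argv) {
        NodeSketch plan = new NodeSketch("Add", "nullable=false",
            new NodeSketch("Literal", "1"),
            new NodeSketch("Literal", "2"));
        System.out.println(plan.simpleString());
        System.out.println(plan.verboseString());
        System.out.print(plan.treeString());
    }
}
```

In this toy model the children show up only in `treeString`, which mirrors the split of responsibilities the comment describes.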









[GitHub] spark pull request #17641: [SPARK-20329][SQL] Make timezone aware expression...

2017-04-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17641#discussion_r111701221
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveInlineTables.scala
 ---
@@ -99,12 +99,9 @@ case class ResolveInlineTables(conf: SQLConf) extends 
Rule[LogicalPlan] {
   val castedExpr = if (e.dataType.sameType(targetType)) {
 e
   } else {
-Cast(e, targetType)
+Cast(e, targetType, Some(conf.sessionLocalTimeZone))
   }
-  castedExpr.transform {
-case e: TimeZoneAwareExpression if e.timeZoneId.isEmpty =>
-  e.withTimeZone(conf.sessionLocalTimeZone)
-  }.eval()
+  castedExpr.eval()
--- End diff --

If there are nested expressions that are timezone-aware, I think we still 
need to attach a time zone to them?





[GitHub] spark issue #17149: [SPARK-19257][SQL]location for table/partition/database ...

2017-04-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17149
  
@gatorsmile, Thanks for your pointer. There is a good discussion there.





[GitHub] spark issue #17149: [SPARK-19257][SQL]location for table/partition/database ...

2017-04-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17149
  
Our parser might need a change regarding escape handling. We are having a 
related discussion in another PR: https://github.com/apache/spark/pull/15398





[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-04-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17644
  
I'll review it after branch 2.2 is cut





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111699305
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/objects.scala
 ---
@@ -96,3 +98,30 @@ object CombineTypedFilters extends Rule[LogicalPlan] {
 }
   }
 }
+
+/**
+ * Removes MapObjects when the following conditions are satisfied
+ *   1. Mapobject(e) where e is lambdavariable(), which means types for 
input output
+ *  are primitive types
+ *   2. no custom collection class specified
+ * representation of data item.  For example back to back map operations.
--- End diff --

Is this comment broken?





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111699178
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -368,6 +369,8 @@ case class NullPropagation(conf: SQLConf) extends 
Rule[LogicalPlan] {
   case EqualNullSafe(Literal(null, _), r) => IsNull(r)
   case EqualNullSafe(l, Literal(null, _)) => IsNull(l)
 
+  case AssertNotNull(c, _) if !c.nullable => c
--- End diff --

Is this safe to do? According to the description of `AssertNotNull`, even 
if `c` is non-nullable, we still need this assertion in some cases.





[GitHub] spark issue #17375: [SPARK-19019][PYTHON][BRANCH-1.6] Fix hijacked `collecti...

2017-04-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17375
  
gentle ping ...





[GitHub] spark issue #17622: [SPARK-20300][ML][PYSPARK] Python API for ALSModel.recom...

2017-04-16 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/17622
  
LGTM except for a doc comment.





[GitHub] spark pull request #17622: [SPARK-20300][ML][PYSPARK] Python API for ALSMode...

2017-04-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/17622#discussion_r111698372
  
--- Diff: python/pyspark/ml/recommendation.py ---
@@ -384,6 +392,28 @@ def itemFactors(self):
 """
 return self._call_java("itemFactors")
 
+@since("2.2.0")
+def recommendForAllUsers(self, numItems):
+"""
+Returns top `numItems` items recommended for each user, for all 
users.
+
+:param numItems: max number of recommendations for each user
+:return: a DataFrame of (userCol, recommendations), where 
recommendations are
+ stored as an array of (itemCol, rating) Rows.
+"""
+return self._call_java("recommendForAllUsers", numItems)
+
+@since("2.2.0")
+def recommendForAllItems(self, numUsers):
+"""
+Returns top `numUsers` users recommended for each item, for all 
items.
+
+:param numItems: max number of recommendations for each item
--- End diff --

numItems -> numUsers





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111697961
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/objects.scala
 ---
@@ -96,3 +98,30 @@ object CombineTypedFilters extends Rule[LogicalPlan] {
 }
   }
 }
+
+/**
+ * Removes MapObjects when the following conditions are satisfied
+ *   1. Mapobject(e) where e is lambdavariable(), which means types for 
input output
+ *  are primitive types
+ *   2. no custom collection class specified
+ * representation of data item.  For example back to back map operations.
+ */
+object EliminateMapObjects extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case _ @ DeserializeToObject(Invoke(
+MapObjects(_, _, _, Cast(LambdaVariable(_, _, dataType, _), 
castDataType, _),
+  inputData, None),
+funcName, returnType: ObjectType, arguments, propagateNull, 
returnNullable),
+outputObjAttr, child) if dataType == castDataType =>
+  DeserializeToObject(Invoke(
+inputData, funcName, returnType, arguments, propagateNull, 
returnNullable),
+outputObjAttr, child)
+case _ @ DeserializeToObject(Invoke(
+MapObjects(_, _, _, LambdaVariable(_, _, dataType, _), inputData, 
None),
--- End diff --

Ok, for safety, we can keep it. 





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111697946
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/objects.scala
 ---
@@ -96,3 +98,30 @@ object CombineTypedFilters extends Rule[LogicalPlan] {
 }
   }
 }
+
+/**
+ * Removes MapObjects when the following conditions are satisfied
+ *   1. Mapobject(e) where e is lambdavariable(), which means types for 
input output
+ *  are primitive types
+ *   2. no custom collection class specified
+ * representation of data item.  For example back to back map operations.
+ */
+object EliminateMapObjects extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case _ @ DeserializeToObject(Invoke(
+MapObjects(_, _, _, Cast(LambdaVariable(_, _, dataType, _), 
castDataType, _),
+  inputData, None),
+funcName, returnType: ObjectType, arguments, propagateNull, 
returnNullable),
+outputObjAttr, child) if dataType == castDataType =>
--- End diff --

The order does not matter. The batch will be run multiple times. 
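
To make the point about repeated batch runs concrete, here is a toy Java sketch of fixed-point rule execution. It is not Catalyst code: plans are plain strings, and `DROP_CAST`, `DROP_MAP`, and `runBatch` are hypothetical names. It only shows that, for this one input, both rule orders converge to the same result once the batch is repeated; it does not prove order-independence in general.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy model, not Catalyst: a "plan" is a string and a "rule" rewrites it.
class FixedPointSketch {
    // Hypothetical rules: each fires only when its wrapper encloses the bare variable.
    static final UnaryOperator<String> DROP_CAST = p -> p.replace("Cast(v)", "v");
    static final UnaryOperator<String> DROP_MAP = p -> p.replace("Map(v)", "v");

    // Applies all rules in order, repeating the pass until nothing changes
    // or maxIterations passes have run.
    static String runBatch(String plan, List<UnaryOperator<String>> rules, int maxIterations) {
        String current = plan;
        for (int i = 0; i < maxIterations; i++) {
            String next = current;
            for (UnaryOperator<String> rule : rules) {
                next = rule.apply(next);
            }
            if (next.equals(current)) {
                break; // fixed point reached
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        String plan = "Map(Cast(v))";
        System.out.println(runBatch(plan, Arrays.asList(DROP_CAST, DROP_MAP), 10));
        System.out.println(runBatch(plan, Arrays.asList(DROP_MAP, DROP_CAST), 10));
    }
}
```

With a single pass (`maxIterations` of 1), the second ordering would stop at `Map(v)`, which is why the repetition matters in this toy example.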





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17655
  
**[Test build #75849 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75849/testReport)**
 for PR 17655 at commit 
[`65b0ff7`](https://github.com/apache/spark/commit/65b0ff76a2af83053e45948d1df60092fae118fd).





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17655
  
**[Test build #75848 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75848/testReport)**
 for PR 17655 at commit 
[`47771e1`](https://github.com/apache/spark/commit/47771e1ce11107b62057c7bc4e9909c008b3fe58).





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111697079
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/objects.scala
 ---
@@ -96,3 +98,30 @@ object CombineTypedFilters extends Rule[LogicalPlan] {
 }
   }
 }
+
+/**
+ * Removes MapObjects when the following conditions are satisfied
+ *   1. Mapobject(e) where e is lambdavariable(), which means types for 
input output
+ *  are primitive types
+ *   2. no custom collection class specified
+ * representation of data item.  For example back to back map operations.
+ */
+object EliminateMapObjects extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case _ @ DeserializeToObject(Invoke(
+MapObjects(_, _, _, Cast(LambdaVariable(_, _, dataType, _), 
castDataType, _),
+  inputData, None),
+funcName, returnType: ObjectType, arguments, propagateNull, 
returnNullable),
+outputObjAttr, child) if dataType == castDataType =>
+  DeserializeToObject(Invoke(
+inputData, funcName, returnType, arguments, propagateNull, 
returnNullable),
+outputObjAttr, child)
+case _ @ DeserializeToObject(Invoke(
+MapObjects(_, _, _, LambdaVariable(_, _, dataType, _), inputData, 
None),
--- End diff --

As @cloud-fan pointed out in [this 
comment](https://github.com/apache/spark/pull/17568#discussion_r110510575), it 
is necessary. `customCollectionCls` was introduced by #16541.
It is not `None` when `Seq()` is used.





[GitHub] spark issue #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase ...

2017-04-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17655
  
cc @srowen @HyukjinKwon @cloud-fan @nihavend 





[GitHub] spark pull request #17655: [SPARK-20156] [SQL] [FOLLOW-UP] Java String toLow...

2017-04-16 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/17655

[SPARK-20156] [SQL] [FOLLOW-UP] Java String toLowerCase "Turkish locale 
bug" in Database and Table DDLs

### What changes were proposed in this pull request?
Database and table names conform to the Hive standard ("[a-zA-Z_0-9]+"), i.e. 
a name may contain only letters, digits, and `_`.

When calling `toLowerCase` on these names, we should pass `Locale.ROOT` to 
`toLowerCase` to avoid inadvertent locale-sensitive variation in behavior 
(aka the "Turkish locale problem").
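
The locale-sensitive behavior mentioned above can be reproduced with a minimal Java sketch. `LocaleLowerCaseSketch` and its `lower` helper are hypothetical names used for illustration and are not part of this patch; only `String.toLowerCase(Locale)`, `Locale.ROOT`, and `Locale.forLanguageTag` are standard JDK APIs.

```java
import java.util.Locale;

// Hypothetical helper, not Spark code: compares locale-sensitive and
// locale-independent lower-casing of an identifier.
class LocaleLowerCaseSketch {
    // Lower-cases a name using an explicit locale instead of the JVM default.
    static String lower(String name, Locale locale) {
        return name.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        // Under Turkish casing rules, 'I' maps to dotless 'ı' (U+0131),
        // so the result no longer matches an ASCII identifier.
        System.out.println(lower("TITLE", turkish));
        // Locale.ROOT applies locale-neutral rules, so 'I' maps to 'i'.
        System.out.println(lower("TITLE", Locale.ROOT));
    }
}
```

A bare `toLowerCase()` uses the JVM's default locale, so the same table name could be normalized differently on a machine configured for Turkish; passing `Locale.ROOT` removes that dependency.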

### How was this patch tested?
Added a test case

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark locale

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17655.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17655


commit 47771e1ce11107b62057c7bc4e9909c008b3fe58
Author: Xiao Li 
Date:   2017-04-17T01:33:54Z

fix.







[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15435
  
**[Test build #75847 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75847/testReport)**
 for PR 15435 at commit 
[`053284d`](https://github.com/apache/spark/commit/053284da60d72a79eb1f94da6d2c7dda74a21af8).





[GitHub] spark pull request #17568: [SPARK-20254][SQL] Remove unnecessary data conver...

2017-04-16 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/17568#discussion_r111696732
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/objects.scala
 ---
@@ -96,3 +98,30 @@ object CombineTypedFilters extends Rule[LogicalPlan] {
 }
   }
 }
+
+/**
+ * Removes MapObjects when the following conditions are satisfied
+ *   1. Mapobject(e) where e is lambdavariable(), which means types for 
input output
+ *  are primitive types
+ *   2. no custom collection class specified
+ * representation of data item.  For example back to back map operations.
+ */
+object EliminateMapObjects extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case _ @ DeserializeToObject(Invoke(
+MapObjects(_, _, _, Cast(LambdaVariable(_, _, dataType, _), 
castDataType, _),
+  inputData, None),
+funcName, returnType: ObjectType, arguments, propagateNull, 
returnNullable),
+outputObjAttr, child) if dataType == castDataType =>
--- End diff --

For now, as you pointed out, the `Cast` is already removed by `SimplifyCasts`.
I am leaving this case in for robustness. In the future, this optimization may 
be executed before `SimplifyCasts` if the rules are reordered.
What do you think? cc: @cloud-fan





[GitHub] spark pull request #17649: [SPARK-20023][SQL][follow up] Output table commen...

2017-04-16 Thread sujith71955
Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17649#discussion_r111696709
  
--- Diff: 
sql/core/src/test/resources/sql-tests/inputs/describe_tbleproperty_validation.sql
 ---
@@ -0,0 +1,24 @@
+CREATE TABLE table_with_comment (a STRING, b INT) COMMENT 'actual comment';
+
+DESC formatted table_with_comment;
+
+-- ALTER TABLE BY MODIFYING COMMENT
+ALTER TABLE table_with_comment set tblproperties(comment = "modified 
comment");
+
+DESC formatted table_with_comment;
+
+-- DROP TEST TABLE
+DROP TABLE table_with_comment;
+
+-- CREATE TABLE WITHOUT COMMENT
+CREATE TABLE table_comment (a STRING, b INT);
+
+DESC formatted table_comment;
+
+-- ALTER TABLE BY ADDING COMMENT
+ALTER TABLE table_comment set tblproperties(comment = "added comment");
+
+DESC formatted table_comment;
+
+-- DROP TEST TABLE
+DROP TABLE table_comment;
--- End diff --

Sure, I will file a new JIRA for this problem and update the test suite 
name as suggested. I will also verify ALTER TABLE UNSET, as you suggested.





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/15435
  
@sethah Thanks! I have merged your updates and fixed the MiMa file conflicts.
@yanboliang has just come back from a trip and will help review and merge it 
into 2.2, so don't worry about it!





[GitHub] spark pull request #17649: [SPARK-20023][SQL][follow up] Output table commen...

2017-04-16 Thread sujith71955
Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/spark/pull/17649#discussion_r111696418
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -232,7 +232,9 @@ case class AlterTableSetPropertiesCommand(
 val table = catalog.getTableMetadata(tableName)
 DDLUtils.verifyAlterTableType(catalog, table, isView)
 // This overrides old properties
-val newTable = table.copy(properties = table.properties ++ properties)
+val newTable = table.copy(
+  properties = table.properties ++ properties,
+  comment = properties.get("comment"))
--- End diff --

I will add a comment





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15435
  
**[Test build #75846 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75846/testReport)**
 for PR 15435 at commit 
[`bd40098`](https://github.com/apache/spark/commit/bd40098912e28a42e2a9011c4a5d298ca737dc69).





[GitHub] spark issue #17654: [SPARK-20351] [ML] Add trait hasTrainingSummary to repla...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17654
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17654: [SPARK-20351] [ML] Add trait hasTrainingSummary to repla...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17654
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75845/
Test PASSed.





[GitHub] spark issue #17654: [SPARK-20351] [ML] Add trait hasTrainingSummary to repla...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17654
  
**[Test build #75845 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75845/testReport)**
 for PR 17654 at commit 
[`3bca3b1`](https://github.com/apache/spark/commit/3bca3b1429fe6da01b17c74634952009250457da).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17651: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17651
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75843/
Test PASSed.





[GitHub] spark issue #17651: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17651
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17651: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17651
  
**[Test build #75843 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75843/testReport)**
 for PR 17651 at commit 
[`0031804`](https://github.com/apache/spark/commit/00318043d0a5c6d1eb1404402fc390904d2ba2dd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17654: [SPARK-20351] [ML] Add trait hasTrainingSummary to repla...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17654
  
**[Test build #75845 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75845/testReport)**
 for PR 17654 at commit 
[`3bca3b1`](https://github.com/apache/spark/commit/3bca3b1429fe6da01b17c74634952009250457da).





[GitHub] spark pull request #17654: [SPARK-20351] [ML] Add trait hasTrainingSummary t...

2017-04-16 Thread hhbyyh
GitHub user hhbyyh opened a pull request:

https://github.com/apache/spark/pull/17654

[SPARK-20351] [ML] Add trait hasTrainingSummary to replace the duplicate 
code

## What changes were proposed in this pull request?

Add a trait HasTrainingSummary to avoid duplicated code related to the training 
summary.


## How was this patch tested?

existing Java and Scala unit tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hhbyyh/spark hassummary

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17654.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17654


commit 3bca3b1429fe6da01b17c74634952009250457da
Author: Yuhao Yang 
Date:   2017-04-16T23:29:27Z

has summary trait







[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17527
  
Yes, the code has a bug. For example, when the locale is TR, users are 
unable to create a table whose name contains `I`. This does not make 
sense to me. I believe we have more issues like this. I can submit a PR to fix 
this one, but I do not think it is the only one.
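
A minimal sketch of how a Turkish default locale can change an identifier, assuming the failure comes from locale-sensitive case conversion (the truncated PR title hints at this, but the exact code path is not shown in this thread). `LocaleSketch` and its method names are hypothetical; only `String.toLowerCase(Locale)`, `Locale.forLanguageTag`, and `Locale.ROOT` are standard JDK APIs.

```scala
import java.util.Locale

// Hypothetical helper, not Spark code: compares locale-sensitive and
// locale-independent lowercasing of an identifier.
object LocaleSketch {
  private val turkish: Locale = Locale.forLanguageTag("tr-TR")

  // Under Turkish rules, uppercase 'I' lowercases to dotless 'ı' (U+0131).
  def lowerTurkish(s: String): String = s.toLowerCase(turkish)

  // Locale.ROOT gives the same result regardless of the JVM default locale.
  def lowerRoot(s: String): String = s.toLowerCase(Locale.ROOT)
}
```

With these definitions, `LocaleSketch.lowerTurkish("ID")` and `LocaleSketch.lowerRoot("ID")` differ in their first character, so a name normalized under one locale no longer matches the same name normalized under the other.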





[GitHub] spark issue #16722: [SPARK-19591][ML][MLlib] Add sample weights to decision ...

2017-04-16 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/16722
  
Btw, I've been working on this and just posted some thoughts about one 
design choice here: https://issues.apache.org/jira/browse/SPARK-9478





[GitHub] spark issue #17653: [SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17653
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17653: [SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17653
  
**[Test build #75844 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75844/testReport)**
 for PR 17653 at commit 
[`17d8190`](https://github.com/apache/spark/commit/17d819022de875777e158e94ad3ef1c8d6d2f3aa).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17653: [SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17653
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75844/
Test PASSed.





[GitHub] spark issue #17652: [SPARK-20335] [SQL] [BACKPORT-2.1] Children expressions ...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17652
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75842/
Test PASSed.





[GitHub] spark issue #17652: [SPARK-20335] [SQL] [BACKPORT-2.1] Children expressions ...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17652
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17652: [SPARK-20335] [SQL] [BACKPORT-2.1] Children expressions ...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17652
  
**[Test build #75842 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75842/testReport)**
 for PR 17652 at commit 
[`68d1e4d`](https://github.com/apache/spark/commit/68d1e4d47c6479e899e59777c7a6e86f2d6e75dd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-04-16 Thread sethah
Github user sethah commented on the issue:

https://github.com/apache/spark/pull/15435
  
@WeichenXu123 I made a PR to your branch. Can you check it? I think you'll 
still need to update the Mima file. Also, this may not make 2.2, so then you'd 
have to update the since tags.





[GitHub] spark issue #17653: [SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17653
  
**[Test build #75844 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75844/testReport)**
 for PR 17653 at commit 
[`17d8190`](https://github.com/apache/spark/commit/17d819022de875777e158e94ad3ef1c8d6d2f3aa).





[GitHub] spark issue #17653: [SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json...

2017-04-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17653
  
cc @felixcheung, this simply renames it to `as.json.array`.





[GitHub] spark pull request #17653: [SPARK-19828][R][FOLLOWUP] Rename asJsonArray to ...

2017-04-16 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/17653

[SPARK-19828][R][FOLLOWUP] Rename asJsonArray to as.json.array in from_json 
function in R

## What changes were proposed in this pull request?

This was originally suggested to be `as.json.array` in the PR for 
SPARK-19828, but we could not do that because the lint check emitted an error for 
multiple dots in variable names.

After SPARK-20278, we are able to use `multiple.dots.in.names`. 
`asJsonArray` in the `from_json` function can still be changed because 2.2 has not 
been released yet.

So, this PR proposes to rename `asJsonArray` to `as.json.array`.

## How was this patch tested?

Jenkins tests, local tests with `./R/run-tests.sh` and manual 
`./dev/lint-r`. Existing tests should cover this.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-19828-followup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17653.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17653


commit 17d819022de875777e158e94ad3ef1c8d6d2f3aa
Author: hyukjinkwon 
Date:   2017-04-16T21:39:28Z

Rename asJsonArray to as.json.array in from_json function in R







[GitHub] spark issue #17651: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17651
  
**[Test build #75843 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75843/testReport)**
 for PR 17651 at commit 
[`0031804`](https://github.com/apache/spark/commit/00318043d0a5c6d1eb1404402fc390904d2ba2dd).





[GitHub] spark issue #17651: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17651
  
cc @srowen, could you check if it makes sense to you?





[GitHub] spark pull request #17650: [SPARK-20350] Add optimization rules to apply Com...

2017-04-16 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/17650#discussion_r111692337
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -153,6 +153,11 @@ object BooleanSimplification extends Rule[LogicalPlan] 
with PredicateHelper {
   case TrueLiteral Or _ => TrueLiteral
   case _ Or TrueLiteral => TrueLiteral
 
+  case a And b if Not(a).semanticEquals(b) => FalseLiteral
+  case a Or b if Not(a).semanticEquals(b) => TrueLiteral
+  case a And b if a.semanticEquals(Not(b)) => FalseLiteral
--- End diff --

I meant something like this for `Not`:

```
  override def semanticEquals(other: Expression): Boolean = other match {
    case Not(otherChild) => child.semanticEquals(otherChild)
    case _ => child match {
      case Not(innerChild) =>
        // eliminate double negation
        innerChild.semanticEquals(other)
      case _ =>
        super.semanticEquals(other)
    }
  }
```
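
The override suggested above can be tried out in a small self-contained model. This is only a sketch under stated assumptions: `Expr`, `Attr`, `Lit`, and `ComplementationSketch` are hypothetical stand-ins rather than Catalyst classes, and `semanticEquals` falls back to plain case-class equality here, which is simpler than what Catalyst does.

```scala
// Toy expression tree (hypothetical; not Catalyst's Expression hierarchy).
sealed trait Expr {
  // Simplified default: structural equality stands in for semantic equality.
  def semanticEquals(other: Expr): Boolean = this == other
}
case class Attr(name: String) extends Expr
case class Lit(value: Boolean) extends Expr
case class And(left: Expr, right: Expr) extends Expr
case class Or(left: Expr, right: Expr) extends Expr
case class Not(child: Expr) extends Expr {
  // Mirrors the suggested override: compare children of two negations,
  // and treat Not(Not(x)) as equal to x.
  override def semanticEquals(other: Expr): Boolean = other match {
    case Not(otherChild) => child.semanticEquals(otherChild)
    case _ => child match {
      case Not(innerChild) => innerChild.semanticEquals(other)
      case _ => super.semanticEquals(other)
    }
  }
}

object ComplementationSketch {
  // One rule per operator; the mirrored rules become unnecessary
  // because Not(Not(x)).semanticEquals(x) now holds.
  def simplify(e: Expr): Expr = e match {
    case And(a, b) if Not(a).semanticEquals(b) => Lit(false)
    case Or(a, b) if Not(a).semanticEquals(b) => Lit(true)
    case other => other
  }
}
```

In this model both `And(Attr("a"), Not(Attr("a")))` and `And(Not(Attr("a")), Attr("a"))` simplify to `Lit(false)` through the same rule, while `And(Attr("a"), Not(Not(Attr("a"))))` is left unchanged because it is not a complement pair.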





[GitHub] spark pull request #17650: [SPARK-20350] Add optimization rules to apply Com...

2017-04-16 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/17650#discussion_r111692327
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/BooleanSimplificationSuite.scala
 ---
@@ -160,4 +166,12 @@ class BooleanSimplificationSuite extends PlanTest with 
PredicateHelper {
   testRelation.where('a > 2 || ('b > 3 && 'b < 5)))
 comparePlans(actual, expected)
   }
+
+  test("Complementation Laws") {
--- End diff --

How about double negation, i.e. `'a && !(!'a)`?





[GitHub] spark issue #17590: [SPARK-20278][R] Disable 'multiple_dots_linter' lint rul...

2017-04-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17590
  
Sure, thank you.





[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17527
  
Ah, sorry, it was only about fixing tests. I thought we had bugs in the 
main code.





[GitHub] spark pull request #17650: [SPARK-20350] Add optimization rules to apply Com...

2017-04-16 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/17650#discussion_r111692175
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
 ---
@@ -153,6 +153,11 @@ object BooleanSimplification extends Rule[LogicalPlan] 
with PredicateHelper {
   case TrueLiteral Or _ => TrueLiteral
   case _ Or TrueLiteral => TrueLiteral
 
+  case a And b if Not(a).semanticEquals(b) => FalseLiteral
+  case a Or b if Not(a).semanticEquals(b) => TrueLiteral
+  case a And b if a.semanticEquals(Not(b)) => FalseLiteral
--- End diff --

Logically this feels like a duplicate of line 156, but 
unfortunately `Not` is not smart enough to realise that. I think if you 
override `semanticEquals` in `Not`, you should be able to get rid of 
this line. The advantage is that the expression itself becomes smart enough to 
figure this out, rather than this being handled in outside code (possibly in 
several places).

The same applies to line 159.





[GitHub] spark issue #17651: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17651
  
I left a useless comment and then removed it (I had misunderstood). Yes, I will 
add a small comment.





[GitHub] spark issue #17652: [SPARK-20335] [SQL] [BACKPORT-2.1] Children expressions ...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17652
  
**[Test build #75842 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75842/testReport)**
 for PR 17652 at commit 
[`68d1e4d`](https://github.com/apache/spark/commit/68d1e4d47c6479e899e59777c7a6e86f2d6e75dd).





[GitHub] spark issue #17651: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17651
  
Yes, changing the pom was the first attempt, and it more or less failed. Please check out 
the discussion in https://github.com/apache/spark/pull/17642





[GitHub] spark pull request #17652: [SPARK-20335] [SQL] [BACKPORT-2.1] Children expre...

2017-04-16 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/17652

[SPARK-20335] [SQL] [BACKPORT-2.1] Children expressions of Hive UDF impacts 
the determinism of Hive UDF

### What changes were proposed in this pull request?

This PR is to backport https://github.com/apache/spark/pull/17635 to Spark 
2.1

---
```JAVA
  /**
   * Certain optimizations should not be applied if UDF is not deterministic.
   * Deterministic UDF returns same result each time it is invoked with a
   * particular input. This determinism just needs to hold within the context of
   * a query.
   *
   * @return true if the UDF is deterministic
   */
  boolean deterministic() default true;
```

Based on the definition of 
[UDFType](https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java#L42-L50),
 when a Hive UDF's children are non-deterministic, the Hive UDF is also 
non-deterministic.
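
The rule stated above can be sketched in a small model. The types below (`DetExpr`, `Column`, `Rand`, `UdfCall`) are hypothetical stand-ins and not Spark or Hive classes; `annotatedDeterministic` is an assumed name modelling the value of the `deterministic()` annotation element quoted above.

```scala
// Hypothetical stand-ins for expressions; not Spark's or Hive's classes.
sealed trait DetExpr {
  def children: Seq[DetExpr]
  // By default an expression is deterministic when all of its children are.
  def deterministic: Boolean = children.forall(_.deterministic)
}

// A plain column reference: no children, so it is deterministic.
case class Column(name: String) extends DetExpr {
  val children: Seq[DetExpr] = Nil
}

// A non-deterministic leaf, standing in for something like a random function.
case object Rand extends DetExpr {
  val children: Seq[DetExpr] = Nil
  override def deterministic: Boolean = false
}

// A UDF call is deterministic only if its annotation says so
// AND every argument expression is deterministic.
case class UdfCall(annotatedDeterministic: Boolean, children: Seq[DetExpr])
    extends DetExpr {
  override def deterministic: Boolean =
    annotatedDeterministic && children.forall(_.deterministic)
}
```

In this model a UDF annotated as deterministic still reports `deterministic == false` when any argument, directly or through a nested call, is `Rand`.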

### How was this patch tested?
Added test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark backport-17635

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17652.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17652


commit 68d1e4d47c6479e899e59777c7a6e86f2d6e75dd
Author: Xiao Li 
Date:   2017-04-16T20:34:34Z

fix.







[GitHub] spark issue #17651: [SPARK-20343][BUILD] Force Avro 1.7.7 in sbt build to re...

2017-04-16 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/17651
  
Perhaps add a reference to this in pom.xml so that they both change together 
next time?






[GitHub] spark pull request #17633: [SPARK-20331][SQL] Enhanced Hive partition prunin...

2017-04-16 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/17633#discussion_r111691537
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
@@ -589,18 +590,34 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
 col.getType.startsWith(serdeConstants.CHAR_TYPE_NAME))
   .map(col => col.getName).toSet
 
-filters.collect {
-  case op @ BinaryComparison(a: Attribute, Literal(v, _: 
IntegralType)) =>
-s"${a.name} ${op.symbol} $v"
-  case op @ BinaryComparison(Literal(v, _: IntegralType), a: 
Attribute) =>
-s"$v ${op.symbol} ${a.name}"
-  case op @ BinaryComparison(a: Attribute, Literal(v, _: StringType))
-  if !varcharKeys.contains(a.name) =>
-s"""${a.name} ${op.symbol} ${quoteStringLiteral(v.toString)}"""
-  case op @ BinaryComparison(Literal(v, _: StringType), a: Attribute)
-  if !varcharKeys.contains(a.name) =>
-s"""${quoteStringLiteral(v.toString)} ${op.symbol} ${a.name}"""
-}.mkString(" and ")
+def isFoldable(expr: Expression): Boolean =
+  (expr.dataType.isInstanceOf[IntegralType] || 
expr.dataType.isInstanceOf[StringType]) &&
--- End diff --

Can this support all `AtomicType`s? From my understanding, these are 
partition columns and can have other types besides int and string.





[GitHub] spark issue #17524: [SPARK-19235] [SQL] [TEST] [FOLLOW-UP] Enable Test Cases...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17524
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75841/
Test FAILed.





[GitHub] spark issue #17524: [SPARK-19235] [SQL] [TEST] [FOLLOW-UP] Enable Test Cases...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17524
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17524: [SPARK-19235] [SQL] [TEST] [FOLLOW-UP] Enable Test Cases...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17524
  
**[Test build #75841 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75841/testReport)** for PR 17524 at commit [`427741f`](https://github.com/apache/spark/commit/427741f548ff4469d62906546655f7ec96564ced).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class JavaImputerExample `





[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17527
  
Yes, you have a point. It is minor in that it is just a test that is now 
locale-sensitive, and supporting the locale in tests is much less important. 
However, ideally whatever fails should be fixed, as I suspect it is some 
trivial piece we missed.





[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17527
  
Sorry, my previous comment was addressed to @HyukjinKwon.





[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-04-16 Thread tejasapatil
Github user tejasapatil commented on the issue:

https://github.com/apache/spark/pull/17644
  
cc @cloud-fan @hvanhovell @sameeragarwal for review





[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-16 Thread nihavend
Github user nihavend commented on the issue:

https://github.com/apache/spark/pull/17527
  
 maybe





[GitHub] spark issue #17650: [SPARK-20350] Add optimization rules to apply Complement...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17650
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75839/
Test PASSed.





[GitHub] spark issue #17650: [SPARK-20350] Add optimization rules to apply Complement...

2017-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17650
  
Merged build finished. Test PASSed.





[GitHub] spark issue #17650: [SPARK-20350] Add optimization rules to apply Complement...

2017-04-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17650
  
**[Test build #75839 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75839/testReport)** for PR 17650 at commit [`688b2f0`](https://github.com/apache/spark/commit/688b2f0696f1d1d867e872f43506e52f95f46362).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #17557: [SPARK-20208][R][DOCS] Document R fpGrowth suppor...

2017-04-16 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17557#discussion_r111690507
  
--- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
@@ -906,6 +910,37 @@ predicted <- predict(model, df)
 head(predicted)
 ```
 
+ FP-growth
+
+`spark.fpGrowth` executes FP-growth algorithm to mine frequent itemsets on a `SparkDataFrame`. `itemsCol` should be an array of values.
+
+```{r}
+items <- selectExpr(createDataFrame(data.frame(items = c(
+  "T,R,U", "T,S", "V,R", "R,U,T,V", "R,S", "V,S,U", "U,R", "S,T", "V,R", "V,U,S",
+  "T,V,U", "R,V", "T,S", "T,S", "S,T", "S,U", "T,R", "V,R", "S,V", "T,S,U"
+))), "split(items, ',') AS items")
--- End diff --

Perhaps it's slightly less clear, since there are 3 references to "items" 
(or really, just the SparkDataFrame and its column name). Which "items" is 
L923 referring to?






[GitHub] spark pull request #17557: [SPARK-20208][R][DOCS] Document R fpGrowth suppor...

2017-04-16 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/17557#discussion_r111690515
  
--- Diff: R/pkg/vignettes/sparkr-vignettes.Rmd ---
@@ -906,6 +910,37 @@ predicted <- predict(model, df)
 head(predicted)
 ```
 
+ FP-growth
+
+`spark.fpGrowth` executes FP-growth algorithm to mine frequent itemsets on a `SparkDataFrame`. `itemsCol` should be an array of values.
+
+```{r}
+items <- selectExpr(createDataFrame(data.frame(items = c(
+  "T,R,U", "T,S", "V,R", "R,U,T,V", "R,S", "V,S,U", "U,R", "S,T", "V,R", "V,U,S",
+  "T,V,U", "R,V", "T,S", "T,S", "S,T", "S,U", "T,R", "V,R", "S,V", "T,S,U"
+))), "split(items, ',') AS items")
--- End diff --

I like the approach you have there

https://github.com/apache/spark/pull/17557/files#diff-1d0d34d8ea18a9340f0a02c6befe6269R30





