date:20160627

[GitHub] spark issue #13378: [SPARK-15643] [Doc] [ML] Update spark.ml and spark.mllib...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13378
  
**[Test build #61364 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61364/consoleFull)** for PR 13378 at commit [`5472fb9`](https://github.com/apache/spark/commit/5472fb9e4d1158644c0c4fc22cc02083acc4576f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13906
  
**[Test build #61363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61363/consoleFull)** for PR 13906 at commit [`c06ae60`](https://github.com/apache/spark/commit/c06ae6011985d3c839bea372333b0d5a6491f55d).



[GitHub] spark issue #13931: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's con...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13931
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13931: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's con...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13931
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61360/
Test PASSed.



[GitHub] spark issue #13931: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's con...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13931
  
**[Test build #61360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61360/consoleFull)** for PR 13931 at commit [`793afb9`](https://github.com/apache/spark/commit/793afb91f5e573e40d78bb2aa8a9bf89154396f2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class HiveSparkSubmitTests(SparkSubmitTests):`



[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13680
  
**[Test build #61362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61362/consoleFull)** for PR 13680 at commit [`9c113aa`](https://github.com/apache/spark/commit/9c113aa6e0a914ce8dfa571df68db603e3e42140).



[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13758
  
**[Test build #61361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61361/consoleFull)** for PR 13758 at commit [`280d97e`](https://github.com/apache/spark/commit/280d97e718f5e5ac2b1cbf6628905fe63c2334ad).



[GitHub] spark issue #13940: [SPARK-16241] [ML] model loading backward compatibility ...

2016-06-27 Thread hhbyyh

Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/13940
  
LGTM. 



[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13906#discussion_r68703430
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class CollapseEmptyPlanSuite extends PlanTest {
--- End diff --

I'll update the suite to add more cases.



[GitHub] spark issue #13858: [SPARK-16148] [Scheduler] Allow for underscores in TaskL...

2016-06-27 Thread srowen

Github user srowen commented on the issue:

https://github.com/apache/spark/pull/13858
  
I had an outstanding comment from the previous PR too: 
https://github.com/apache/spark/pull/13857#discussion_r68134544



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68702737
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+    try {
+      subLookupFunction(name, children)
+    } catch {
--- End diff --

Thank you, @hvanhovell .



[GitHub] spark issue #13940: [SPARK-16241] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13940
  
Can one of the admins verify this patch?



[GitHub] spark pull request #13940: [SPARK-16241] [ML] model loading backward compati...

2016-06-27 Thread zlpmichelle

GitHub user zlpmichelle opened a pull request:

https://github.com/apache/spark/pull/13940

[SPARK-16241] [ML] model loading backward compatibility for ml NaiveBayes #16241

## What changes were proposed in this pull request?

model loading backward compatibility for ml NaiveBayes


## How was this patch tested?

Existing unit tests, plus a manual test for loading models saved by Spark 1.6.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zlpmichelle/spark naivebayes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13940.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13940


commit c957730e60cc237ce684a94e0b4867ebadd938c7
Author: zlpmichelle 
Date:   2016-06-28T06:00:30Z

[SPARK-16241] [ML] model loading backward compatibility for ml NaiveBayes #16241





[GitHub] spark pull request #13903: [SPARK-16202] [SQL] [DOC] Correct The Description...

2016-06-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13903



[GitHub] spark issue #13903: [SPARK-16202] [SQL] [DOC] Correct The Description of Cre...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13903
  
Merging in master.




[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68702096
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+    try {
+      subLookupFunction(name, children)
+    } catch {
--- End diff --

@dongjoon-hyun Never mind. We use the datatypes of the arguments passed to
the HiveUDF/UDAF/UDTF to determine which object inspectors to use for
conversion, so there is no way we can fix this using `ExpectsInputTypes`; sorry
about the confusion...

We have only changed the default datatype for decimal conversion, so I
guess your fix is ok.



[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13906
  
Anyway, thank you for the review again, @rxin!



[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13938
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13938
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61355/
Test PASSed.



[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13938
  
**[Test build #61355 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61355/consoleFull)** for PR 13938 at commit [`7455a49`](https://github.com/apache/spark/commit/7455a4925ea0f859ea3978930f03e972a7e07929).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13906#discussion_r68701978
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1053,6 +1055,41 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
 }
 
 /**
+ * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
+ * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
+ * {{{
+ *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
+ *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
+ * }}}
+ */
+object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
+  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
+    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
+    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
+
+    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
+    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
+      LocalRelation(a.output, data = Seq.empty)
+
+    // Case 2: General aggregations can generate non-empty results.
+    case a: Aggregate => a
+
+    // Case 3: The following non-leaf plans having only empty relations return empty results.
+    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
+      p match {
+        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
--- End diff --

Yep, right! I'll add them explicitly.



[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-27 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13906#discussion_r68701768
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1053,6 +1055,41 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
 }
 
 /**
+ * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
+ * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
+ * {{{
+ *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
+ *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
+ * }}}
+ */
+object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
+  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
+    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
+    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
+
+    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
+    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
+      LocalRelation(a.output, data = Seq.empty)
+
+    // Case 2: General aggregations can generate non-empty results.
+    case a: Aggregate => a
+
+    // Case 3: The following non-leaf plans having only empty relations return empty results.
+    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
+      p match {
+        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
--- End diff --

Actually, for intersect you only need one child to be empty.

For join, if it is an inner join, you just need one child to be empty too.
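
A minimal sketch of this point, using a hypothetical toy plan type rather than Catalyst's classes (`Rel`, `Intersect`, `InnerJoin`, `Union` and `collapse` are illustrative names only). An intersect or inner join is empty as soon as either child is empty, whereas a union is empty only when both children are:

```scala
// Hypothetical toy plan type for illustration; these are not Catalyst's classes.
object EmptyChildSketch {
  sealed trait Plan
  case class Rel(rows: Seq[Int]) extends Plan
  case class Intersect(left: Plan, right: Plan) extends Plan
  case class InnerJoin(left: Plan, right: Plan) extends Plan
  case class Union(left: Plan, right: Plan) extends Plan

  val Empty: Plan = Rel(Seq.empty)

  private def isEmpty(p: Plan): Boolean = p match {
    case Rel(rows) => rows.isEmpty
    case _ => false
  }

  // Bottom-up rewrite: collapse the children first, then the node itself.
  def collapse(plan: Plan): Plan = plan match {
    case Intersect(l, r) =>
      val (cl, cr) = (collapse(l), collapse(r))
      // One empty side is enough for an empty intersection.
      if (isEmpty(cl) || isEmpty(cr)) Empty else Intersect(cl, cr)
    case InnerJoin(l, r) =>
      val (cl, cr) = (collapse(l), collapse(r))
      // An inner join has no matches if either side has no rows.
      if (isEmpty(cl) || isEmpty(cr)) Empty else InnerJoin(cl, cr)
    case Union(l, r) =>
      val (cl, cr) = (collapse(l), collapse(r))
      // A union is empty only when every child is empty.
      if (isEmpty(cl) && isEmpty(cr)) Empty else Union(cl, cr)
    case leaf => leaf
  }
}
```

Outer joins are left out of the sketch on purpose, since the side that is preserved can still produce rows.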



[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13906#discussion_r68701793
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class CollapseEmptyPlanSuite extends PlanTest {
--- End diff --

Er, are there any other scenarios besides the existing ones below?
- test("one non-empty local relation")
- test("one non-empty and one empty local relations")
- test("aggregating expressions on empty plan")



[GitHub] spark issue #13931: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's con...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13931
  
**[Test build #61360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61360/consoleFull)** for PR 13931 at commit [`793afb9`](https://github.com/apache/spark/commit/793afb91f5e573e40d78bb2aa8a9bf89154396f2).



[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-27 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13906#discussion_r68701620
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class CollapseEmptyPlanSuite extends PlanTest {
--- End diff --

you should test something that shouldn't have been converted too
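
One kind of plan that should not be converted is a global aggregate over empty input. The sketch below uses plain Scala collections rather than Catalyst, and the object and value names are illustrative assumptions. A grouped aggregate over no rows has no groups and so returns no rows, but a global aggregate such as `SELECT COUNT(*) FROM t WHERE 1=0` still returns one row:

```scala
object EmptyAggregateSketch {
  // (key, value) rows of an empty relation, e.g. after WHERE 1=0.
  val rows: Seq[(Int, Int)] = Seq.empty

  // GROUP BY key: one output row per group, so empty input gives no rows.
  val grouped: Map[Int, Int] =
    rows.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

  // No GROUP BY: exactly one output row, even for empty input.
  val globalCount: Seq[Int] = Seq(rows.size)
}
```

A negative test along these lines would check that the optimized plan for the global aggregate is left unchanged.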



[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13906
  
Hi, @rxin .
I just remembered this PR while looking at your whitelist PR. :)
Any advice for this PR?



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61356/
Test FAILed.



[GitHub] spark issue #13939: [SPARK-16248][SQL] Whitelist the list of Hive fallback f...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13939
  
**[Test build #61358 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61358/consoleFull)** for PR 13939 at commit [`ef5db42`](https://github.com/apache/spark/commit/ef5db42b6630c7c891c9f0e5252daf4a37ddca91).



[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11863
  
**[Test build #61359 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61359/consoleFull)**
 for PR 11863 at commit 
[`db95290`](https://github.com/apache/spark/commit/db9529066e9c9dab145f09f2332284f6869ed312).



[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13939#discussion_r68701105
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -221,4 +214,18 @@ private[sql] class HiveSessionCatalog(
 }
 }
   }
+
+  /** List of functions we pass over to Hive. Note that over time this list should go to 0. */
+  // We have a list of Hive built-in functions that we do not support. So, we will check
+  // Hive's function registry and lazily load needed functions into our own function registry.
+  // Those Hive built-in functions are
+  // compute_stats, context_ngrams, create_union,
+  // current_user ,elt, ewah_bitmap, ewah_bitmap_and, ewah_bitmap_empty, ewah_bitmap_or, field,
+  // histogram_numeric, in_file, index, inline, java_method, map_keys, map_values,
+  // matchpath, ngrams, noop, noopstreaming, noopwithmap, noopwithmapstreaming,
+  // parse_url, parse_url_tuple, percentile, percentile_approx, posexplode, reflect, reflect2,
+  // regexp, sentences, stack, std, str_to_map, windowingtablefunction, xpath, xpath_boolean,
+  // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
+  // xpath_short, and xpath_string.
+  private val hiveFunctions = Seq("percentile", "percentile_approx")
--- End diff --

Oh.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...

2016-06-27 Thread mengxr

Github user mengxr commented on the issue:

https://github.com/apache/spark/pull/13921
  
I think the error occurred because this PR left `predict`, `write.ml`, etc. 
documented without a title. So this PR has to be combined with SPARK-16144. 
Basically, let us add some docs to the function declarations under `generics.R`.

cc: @yinxusen 



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61356 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61356/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700984
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

Oh, @rxin, I misunderstood your question. Yes, we did not register the Hive 
function beforehand.



[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...

2016-06-27 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13939#discussion_r68700956
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -162,17 +162,6 @@ private[sql] class HiveSessionCatalog(
 }
   }
 
-  // We have a list of Hive built-in functions that we do not support. So, we will check
-  // Hive's function registry and lazily load needed functions into our own function registry.
-  // Those Hive built-in functions are
-  // assert_true, collect_list, collect_set, compute_stats, context_ngrams, create_union,
--- End diff --

assert_true, collect_list, collect_set are supported already
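
The whitelist idea behind this PR can be sketched roughly as follows. This is a toy model under stated assumptions, not the code in `HiveSessionCatalog`: `nativeRegistry`, `externalRegistry`, `fallbackWhitelist`, and `lookup` are hypothetical names, the registries are plain `Map`s, and the function bodies are placeholders.

```scala
// Toy model only: these registries are plain Maps, not Spark's FunctionRegistry.
object HiveFallbackSketch {
  type Fn = Seq[Double] => Double

  // Functions assumed to be implemented natively (hypothetical contents).
  val nativeRegistry: Map[String, Fn] = Map(
    "sum" -> ((xs: Seq[Double]) => xs.sum)
  )

  // Stand-in for the external (Hive) registry (hypothetical contents).
  val externalRegistry: Map[String, Fn] = Map(
    "percentile" -> ((xs: Seq[Double]) => xs.sorted.apply(xs.length / 2)),
    "histogram_numeric" -> ((xs: Seq[Double]) => xs.length.toDouble)
  )

  // Only names on this list may fall back to the external registry.
  val fallbackWhitelist: Set[String] = Set("percentile", "percentile_approx")

  // Native functions win; otherwise fall back only for whitelisted names.
  def lookup(name: String): Either[String, Fn] = {
    val key = name.toLowerCase
    nativeRegistry.get(key)
      .orElse(if (fallbackWhitelist.contains(key)) externalRegistry.get(key) else None)
      .toRight(s"Undefined function: $name")
  }
}
```

In this sketch a function that exists in the external registry but is not whitelisted (here `histogram_numeric`) is reported as undefined, which makes the set of fallback functions explicit and easy to shrink over time.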



[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...

2016-06-27 Thread rxin

GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/13939

[SPARK-16248][SQL] Whitelist the list of Hive fallback functions - WIP

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)


## How was this patch tested?
N/A



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark hive-whitelist

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13939.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13939


commit ef5db42b6630c7c891c9f0e5252daf4a37ddca91
Author: Reynold Xin 
Date:   2016-06-28T05:53:22Z

[SPARK-16248][SQL] Whitelist the list of Hive fallback functions





[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61351 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61351/consoleFull)**
 for PR 13937 at commit 
[`ce04e08`](https://github.com/apache/spark/commit/ce04e08e5fff17ecdf47a1934ae8a453d051b67e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61357 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61357/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13937
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61351/
Test PASSed.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13937
  
Merged build finished. Test PASSed.



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700695
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

I mean we need to call `createTempFunction` with `double` children instead 
of `decimal` children.



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700636
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

@rxin, actually, we do call `createTempFunction` for the Hive function on the 
fly, but with a **different** signature (Decimal).
`makeFunctionBuilder` indeed uses `children` implicitly. That's the reason 
why I renamed `lookupFunction` to `subLookupFunction` and repeat the same 
process with different children.
```
  val builder = makeFunctionBuilder(functionName, className)
  // Put this Hive built-in function to our function registry.
  val info = new ExpressionInfo(className, functionName)
  createTempFunction(functionName, info, builder, ignoreIfExists = false)
```
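
The retry described above can be illustrated with a small self-contained sketch. It is only a toy model of the idea (try the lookup with the given children, then repeat it with decimal children replaced by doubles); `Arg`, `buildPercentile`, `decimalsToDoubles`, and `lookup` are hypothetical names and do not correspond to Spark's `Expression` or `SessionCatalog` APIs.

```scala
object RetryLookupSketch {
  sealed trait Arg
  final case class DecimalArg(value: BigDecimal) extends Arg
  final case class DoubleArg(value: Double) extends Arg

  // Hypothetical stand-in for a function builder that only accepts doubles.
  def buildPercentile(args: Seq[Arg]): String = {
    require(args.forall(_.isInstanceOf[DoubleArg]), "only double arguments are supported")
    s"percentile(${args.length} double args)"
  }

  // Replace every decimal argument with a double; leave other arguments alone.
  def decimalsToDoubles(args: Seq[Arg]): Seq[Arg] = args.map {
    case DecimalArg(v) => DoubleArg(v.toDouble)
    case other => other
  }

  // First try the arguments as given; on failure, retry once with doubles.
  def lookup(args: Seq[Arg]): String =
    try buildPercentile(args)
    catch {
      case _: IllegalArgumentException => buildPercentile(decimalsToDoubles(args))
    }
}
```

Calling `RetryLookupSketch.lookup` with a single `DecimalArg` fails on the first attempt and succeeds on the retry, which mirrors the "same process with different children" approach in spirit only.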



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61356 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61356/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
Again, I think the error message is not related to this change. I will 
retest this and meanwhile try to build it locally.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
retest this please



[GitHub] spark pull request #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL pro...

2016-06-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13938



[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13938
  
LGTM - merging in master/2.0.




[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13938
  
**[Test build #61355 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61355/consoleFull)**
 for PR 13938 at commit 
[`7455a49`](https://github.com/apache/spark/commit/7455a4925ea0f859ea3978930f03e972a7e07929).



[GitHub] spark issue #9183: [SPARK-11215] [ML] Add multiple columns support to String...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/9183
  
I think @yanboliang just needs to push this forward and get people to review 
it.




[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700193
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

Regarding the following point, I think this is exactly how Spark 1.6 and 
earlier versions behaved, so I don't think it is a problem.
> this will fail again as soon as we pass in an argument with a slightly different value



[GitHub] spark issue #13839: [SPARK-16128][SQL] Allow setting length of characters to...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13839
  
LGTM pending tests.




[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700137
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

yea i think the problem is that we don't register the hive function?



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700034
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

Hi, @hvanhovell .
I tried again, but, as you saw in my first commit, this happens while 
resolving `UnresolvedFunction`.


https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L884

IMHO, we cannot do this in `ExpectsInputTypes`.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61349 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61349/consoleFull)**
 for PR 13937 at commit 
[`5246bcf`](https://github.com/apache/spark/commit/5246bcfa1ba510c281c456b0f61bf32f70d10174).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13839: [SPARK-16128][SQL] Allow setting length of characters to...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13839
  
**[Test build #61354 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61354/consoleFull)**
 for PR 13839 at commit 
[`b170741`](https://github.com/apache/spark/commit/b170741c4b286893e20b8894f20812af1d6e6fd4).



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68699881
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

@dongjoon-hyun the current fix is quite brittle; this will fail again as 
soon as we pass in an argument with a slightly different value. The Analyzer 
will create casts to the proper type if we implement `ExpectsInputTypes`. So 
this seems like the best course of action. It might not be the easiest fix, or 
entirely possible; but I'd prefer to try this first.
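
The casting idea can be sketched in isolation. The following is a toy model, not Spark's analyzer: `DataType`, `Expr`, `Literal`, `Cast`, `FunctionCall`, and `coerce` are hypothetical stand-ins defined here only to show how declared input types let a rule insert casts instead of relying on a retry.

```scala
object ImplicitCastSketch {
  sealed trait DataType
  case object DecimalType extends DataType
  case object DoubleType extends DataType

  sealed trait Expr { def dataType: DataType }
  final case class Literal(dataType: DataType) extends Expr
  final case class Cast(child: Expr, dataType: DataType) extends Expr

  // A call that declares the argument types it expects (toy analogue of the idea).
  final case class FunctionCall(name: String, inputTypes: Seq[DataType], children: Seq[Expr])

  // Wrap each mismatching child in a Cast to the declared type.
  // Note: zip silently drops extras if the two sequences differ in length.
  def coerce(call: FunctionCall): FunctionCall = {
    val fixed = call.children.zip(call.inputTypes).map {
      case (child, expected) if child.dataType != expected => Cast(child, expected)
      case (child, _) => child
    }
    call.copy(children = fixed)
  }
}
```

Because the cast is driven by the declared types rather than by a failed first attempt, any argument type that can be cast is handled the same way, which is the robustness argument made above.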



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13937
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61349/
Test PASSed.



[GitHub] spark issue #9183: [SPARK-11215] [ML] Add multiple columns support to String...

2016-06-27 Thread pkch

Github user pkch commented on the issue:

https://github.com/apache/spark/pull/9183
  
What needs to happen to move this forward? This was a PR that would have 
been the first iteration of a significant improvement in handling of wide 
datasets.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13937
  
Merged build finished. Test PASSed.



[GitHub] spark pull request #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL pro...

2016-06-27 Thread yhuai

GitHub user yhuai opened a pull request:

https://github.com/apache/spark/pull/13938

[SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programming guide.

## What changes were proposed in this pull request?
This PR makes several updates to SQL programming guide.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yhuai/spark doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13938.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13938


commit ce0f54e074099f2c416169d5f62f93b23587f43a
Author: Yin Huai 
Date:   2016-06-28T04:20:12Z

wip

commit 7455a4925ea0f859ea3978930f03e972a7e07929
Author: Yin Huai 
Date:   2016-06-28T05:26:33Z

[SPARK-15863][SQL][DOC] Update SQL programming guide.





[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread hhbyyh

Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/13937
  
LGTM



[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13933
  
LGTM -- cc @tdas to take a look since he wrote the original patch.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61353/
Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61353 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61353/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13517#discussion_r68699390
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -435,6 +434,37 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
   }
 
   /**
+   * Parse a key-value map from a [[OptionParameterListContext]], assuming all values are
+   * specified. This allows string, boolean, decimal and integer literals which are converted
+   * to strings.
+   */
+  override def visitOptionParameterList(ctx: OptionParameterListContext): Map[String, String] = {
+    // TODO: Currently it does not treat null. Hive does not allow null for metadata and
+    // throws an exception.
+    val properties = ctx.optionParameter.asScala.map { property =>
+      val key = visitTablePropertyKey(property.key)
+      val value = if (property.value.STRING != null) {
+        string(property.value.STRING)
+      } else if (property.value.booleanValue != null) {
+        property.value.getText.toLowerCase
+      } else {
+        property.value.getText
+      }
+      key -> value
+    }
+
+    // Check for duplicate property names.
+    checkDuplicateKeys(properties, ctx)
+    val props = properties.toMap
+    val badKeys = props.filter { case (_, v) => v == null }.keys
--- End diff --

NIT (not your code): `val badKeys = props.collect { case (key, null) => key }`
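
To illustrate the nit: on a `Map[String, String]`, `collect` with a `case (key, null)` pattern picks out the keys whose value is `null` in a single step and should give the same keys as `filter { ... }.keys`. A minimal sketch follows; the `props` map and the object name are made up for illustration and are not part of the PR.

```scala
object BadKeysSketch {
  // Hypothetical option map; `null` stands for an option that was given without a value.
  val props: Map[String, String] = Map(
    "path" -> "/tmp/data",
    "header" -> (null: String),
    "sep" -> (null: String)
  )

  // Form used in the diff: filter on the value, then take the keys.
  val viaFilter: Set[String] = props.filter { case (_, v) => v == null }.keys.toSet

  // Suggested form: the literal pattern `null` matches only entries whose value is null.
  val viaCollect: Set[String] = props.collect { case (key, null) => key }.toSet
}
```

Both values hold the same set of keys; the `collect` form simply avoids building the intermediate filtered map.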



[GitHub] spark issue #13936: [SPARK-16243][ML] model loading backward compatibility f...

2016-06-27 Thread hhbyyh

Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/13936
  
Just saw @yanboliang opened a jira for this too. I'll close the PR and 
resolve the jira.



[GitHub] spark pull request #13936: [SPARK-16243][ML] model loading backward compatib...

2016-06-27 Thread hhbyyh

Github user hhbyyh closed the pull request at:

https://github.com/apache/spark/pull/13936



[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13517#discussion_r68699131
  
--- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -45,11 +45,11 @@ statement
     | ALTER DATABASE identifier SET DBPROPERTIES tablePropertyList  #setDatabaseProperties
     | DROP DATABASE (IF EXISTS)? identifier (RESTRICT | CASCADE)?  #dropDatabase
     | createTableHeader ('(' colTypeList ')')? tableProvider
-        (OPTIONS tablePropertyList)?
+        (OPTIONS optionParameterList)?
         (PARTITIONED BY partitionColumnNames=identifierList)?
         bucketSpec?  #createTableUsing
     | createTableHeader tableProvider
-        (OPTIONS tablePropertyList)?
--- End diff --

Why not generalize the `tableProperty` rule and use `optionValue` (rename 
it to something more consistent) as its value rule? Seems easier.
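
One way to picture a shared value rule: every literal kind the grammar accepts is reduced to a string before it lands in the options map. The sketch below is not Spark's parser code. `OptionValue` and its three cases are hypothetical stand-ins for the parse-tree alternatives, and only the lowercasing of boolean text is taken from the `visitOptionParameterList` change in this PR.

```scala
object OptionValueSketch {
  // Hypothetical model of the literal kinds a generalized value rule could accept.
  sealed trait OptionValue
  final case class StringLit(value: String) extends OptionValue
  final case class BooleanLit(text: String) extends OptionValue // raw text, e.g. "TRUE"
  final case class NumberLit(text: String) extends OptionValue  // integer or decimal text

  // Reduce a literal to the string form stored in the options map.
  def asString(v: OptionValue): String = v match {
    case StringLit(s)  => s
    case BooleanLit(t) => t.toLowerCase
    case NumberLit(t)  => t
  }

  // Convert parsed key/value pairs into a plain string-to-string map.
  def toOptions(pairs: Seq[(String, OptionValue)]): Map[String, String] =
    pairs.map { case (k, v) => k -> asString(v) }.toMap
}
```

With a single conversion point like `asString`, table properties and data source options could share one value rule and differ only in which keys they allow.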



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61353 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61353/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
retest this please



[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-27 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/13933
  
cc @rxin The code is ready for review. Thanks!



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
Hm... am I doing something wrong here?



[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13517#discussion_r68698738
  
--- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -252,6 +252,21 @@ tablePropertyKey
     | STRING
     ;
 
+optionParameterList
+    : '(' optionParameter (',' optionParameter)* ')'
+    ;
+
+optionParameter
+    : key=tablePropertyKey (EQ? value=optionValue)?
--- End diff --

We could remove `EQ?` here. This is actually not supported by data source 
tables. 



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61352/
Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61352 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61352/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13918
  
Yea it's good to have this in branch-2.0.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61352 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61352/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
retest this please



[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13918
  
Thank you for merging, @liancheng ! :)



[GitHub] spark issue #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordReader in ...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13914
  
Thank you for merging, @rxin .



[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...

2016-06-27 Thread liancheng

Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/13918
  
Thanks, merged to master.

@rxin Shall we have this in branch-2.0 at this stage?



[GitHub] spark issue #13915: [SPARK-16081][BUILD] Disallow using `l` as variable name

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13915
  
@mengxr 's idea sounds good to me, too.
May I update this PR, @rxin ?



[GitHub] spark pull request #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger vi...

2016-06-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13918



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61351 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61351/consoleFull)**
 for PR 13937 at commit 
[`ce04e08`](https://github.com/apache/spark/commit/ce04e08e5fff17ecdf47a1934ae8a453d051b67e).



[GitHub] spark pull request #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordRea...

2016-06-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13914



[GitHub] spark issue #13517: [SPARK-14839][SQL] Support for other types as option in ...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13517
  
cc @hvanhovell for this one



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61337/
Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61337 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61337/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61350 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61350/consoleFull)**
 for PR 13937 at commit 
[`8be63d5`](https://github.com/apache/spark/commit/8be63d5dbd8e3e62fd23248efa6be826e09e3ce3).



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68697699
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+    try {
+      subLookupFunction(name, children)
+    } catch {

Thank you for the advice, @hvanhovell.
Do you mean adding `ExpectsInputTypes` to `HiveSimpleUDF`, 
`HiveGenericUDF`, and `HiveUDAFFunction`?
We only have 4 expressions to handle all generic Hive functions, so 
currently `makeFunctionBuilder` seems to do its type checking by calling 
`udf.dataType` on the fly.



[GitHub] spark issue #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordReader in ...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13914
  
Merging in master/2.0.




[GitHub] spark issue #13915: [SPARK-16081][BUILD] Disallow using `l` as variable name

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13915
  
Yea, I think you can argue this should be discouraged, but that does not 
necessarily justify banning it.




[GitHub] spark issue #13891: [SPARK-6685][MLLIB]Use DSYRK to compute AtA in ALS

2016-06-27 Thread hqzizania

Github user hqzizania commented on the issue:

https://github.com/apache/spark/pull/13891
  
@mengxr this is a simple imitation of the loop that ALS uses in 
`computeFactors[ID]()`. It runs on a bare-metal node with 4 cores. All tests 
use all cores by splitting the RDD into multiple partitions.
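
For readers unfamiliar with the operation being benchmarked: the PR title refers to computing AtA (A-transpose times A). That product is symmetric, so only one triangle needs to be accumulated, and BLAS `DSYRK` is the level-3 routine for this kind of symmetric rank-k update. The plain-Scala sketch below only shows what is being computed, assuming each row of A arrives as an `Array[Double]` of length `k`. It does not call BLAS and is not the ALS code; `AtASketch` and `upperAtA` are illustrative names.

```scala
object AtASketch {
  // Accumulate the upper triangle of A^T * A, where each element of `rows` is one row of A.
  // Entries below the diagonal are left at 0.0 because the result is symmetric.
  def upperAtA(rows: Seq[Array[Double]], k: Int): Array[Array[Double]] = {
    val ata = Array.ofDim[Double](k, k)
    for (row <- rows; i <- 0 until k; j <- i until k) {
      ata(i)(j) += row(i) * row(j)
    }
    ata
  }
}
```

A blocked routine such as `DSYRK` performs the same accumulation over many rows at once, which is where a speedup over one-row-at-a-time updates would be expected to come from.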



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61336/
Test FAILed.



[GitHub] spark pull request #13937: [SPARK-16245] [ML] model loading backward compati...

2016-06-27 Thread yanboliang

Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/13937#discussion_r68697383
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala ---
@@ -206,24 +206,21 @@ object PCAModel extends MLReadable[PCAModel] {
 override def load(path: String): PCAModel = {
   val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
 
-  // explainedVariance field is not present in Spark <= 1.6
-  val versionRegex = "([0-9]+)\\.([0-9]+).*".r
-  val hasExplainedVariance = metadata.sparkVersion match {
-case versionRegex(major, minor) =>
-  major.toInt >= 2 || (major.toInt == 1 && minor.toInt > 6)
-case _ => false
-  }
+  val versionRegex = "([0-9]+)\\.(.+)".r
+  val versionRegex(major, _) = metadata.sparkVersion
 
   val dataPath = new Path(path, "data").toString
-  val model = if (hasExplainedVariance) {
+  val model = if (major.toInt >= 2) {
 val Row(pc: DenseMatrix, explainedVariance: DenseVector) =
   sparkSession.read.parquet(dataPath)
 .select("pc", "explainedVariance")
 .head()
 new PCAModel(metadata.uid, pc, explainedVariance)
   } else {
-val Row(pc: DenseMatrix) = sparkSession.read.parquet(dataPath).select("pc").head()
-new PCAModel(metadata.uid, pc, Vectors.dense(Array.empty[Double]).asInstanceOf[DenseVector])
+// explainedVariance field is not present and we use the old matrix in Spark <= 2.0
+val Row(pc: OldDenseMatrix) = sparkSession.read.parquet(dataPath).select("pc").head()
+new PCAModel(metadata.uid, pc.asML,
+  Vectors.dense(Array.empty[Double]).asInstanceOf[DenseVector])
--- End diff --

Here we handle the `explainedVariance` field issue and the old matrix issue 
together, so that one version check covers both backward-compatibility cases.
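
A minimal sketch of the major-version check from the diff above, written as a standalone snippet rather than as the PR's actual code. The names `VersionCheck`, `majorVersion`, and `usesNewLayout` are hypothetical and exist only for illustration. One deliberate difference: the diff destructures with `val versionRegex(major, _) = ...`, which throws a `MatchError` on an unexpected version string, while this sketch returns `None` (and therefore takes the old-layout branch) instead.

```scala
object VersionCheck {
  // Same pattern as in the diff: "<major>.<anything>", e.g. "1.6.2" or "2.0.0-SNAPSHOT".
  private val versionRegex = "([0-9]+)\\.(.+)".r

  /** Major version parsed from a Spark version string, or None if it does not match. */
  def majorVersion(sparkVersion: String): Option[Int] = sparkVersion match {
    case versionRegex(major, _) => Some(major.toInt)
    case _                      => None
  }

  /**
   * True when the saved model is assumed to carry `explainedVariance`
   * and the new matrix type, i.e. it was written by major version 2 or later.
   */
  def usesNewLayout(sparkVersion: String): Boolean =
    majorVersion(sparkVersion).exists(_ >= 2)
}
```

With this shape, a loader could branch on `VersionCheck.usesNewLayout(metadata.sparkVersion)` in place of `major.toInt >= 2`.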



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61336 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61336/consoleFull)** for PR 13806 at commit [`2a55091`](https://github.com/apache/spark/commit/2a550912f1194e9c212d9f4f78824eaf375ddccc).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.


