[GitHub] spark issue #13378: [SPARK-15643] [Doc] [ML] Update spark.ml and spark.mllib...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13378 **[Test build #61364 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61364/consoleFull)** for PR 13378 at commit [`5472fb9`](https://github.com/apache/spark/commit/5472fb9e4d1158644c0c4fc22cc02083acc4576f). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13906 **[Test build #61363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61363/consoleFull)** for PR 13906 at commit [`c06ae60`](https://github.com/apache/spark/commit/c06ae6011985d3c839bea372333b0d5a6491f55d).
[GitHub] spark issue #13931: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's con...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13931 Merged build finished. Test PASSed.
[GitHub] spark issue #13931: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's con...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13931 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61360/ Test PASSed.
[GitHub] spark issue #13931: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's con...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13931 **[Test build #61360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61360/consoleFull)** for PR 13931 at commit [`793afb9`](https://github.com/apache/spark/commit/793afb91f5e573e40d78bb2aa8a9bf89154396f2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class HiveSparkSubmitTests(SparkSubmitTests):`
[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61362/consoleFull)** for PR 13680 at commit [`9c113aa`](https://github.com/apache/spark/commit/9c113aa6e0a914ce8dfa571df68db603e3e42140).
[GitHub] spark issue #13758: [SPARK-16043][SQL] Prepare GenericArrayData implementati...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13758 **[Test build #61361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61361/consoleFull)** for PR 13758 at commit [`280d97e`](https://github.com/apache/spark/commit/280d97e718f5e5ac2b1cbf6628905fe63c2334ad).
[GitHub] spark issue #13940: [SPARK-16241] [ML] model loading backward compatibility ...
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/13940 LGTM.
[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r68703430
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class CollapseEmptyPlanSuite extends PlanTest {
--- End diff --
I'll update to have more.
[GitHub] spark issue #13858: [SPARK-16148] [Scheduler] Allow for underscores in TaskL...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/13858 I had an outstanding comment from the previous PR too: https://github.com/apache/spark/pull/13857#discussion_r68134544
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68702737
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+    try {
+      subLookupFunction(name, children)
+    } catch {
--- End diff --
Thank you, @hvanhovell.
[GitHub] spark issue #13940: [SPARK-16241] [ML] model loading backward compatibility ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13940 Can one of the admins verify this patch?
[GitHub] spark pull request #13940: [SPARK-16241] [ML] model loading backward compati...
GitHub user zlpmichelle opened a pull request: https://github.com/apache/spark/pull/13940

[SPARK-16241] [ML] model loading backward compatibility for ml NaiveBayes #16241

## What changes were proposed in this pull request?
Model loading backward compatibility for ml NaiveBayes.

## How was this patch tested?
Existing unit tests, plus a manual test loading models saved by Spark 1.6.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zlpmichelle/spark naivebayes

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/13940.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #13940

commit c957730e60cc237ce684a94e0b4867ebadd938c7
Author: zlpmichelle
Date: 2016-06-28T06:00:30Z
[SPARK-16241] [ML] model loading backward compatibility for ml NaiveBayes #16241
[GitHub] spark pull request #13903: [SPARK-16202] [SQL] [DOC] Correct The Description...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13903
[GitHub] spark issue #13903: [SPARK-16202] [SQL] [DOC] Correct The Description of Cre...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13903 Merging in master.
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68702096
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+    try {
+      subLookupFunction(name, children)
+    } catch {
--- End diff --
@dongjoon-hyun Never mind. We use the datatypes of the arguments passed to the HiveUDF/UDAF/UDTF to determine which object inspectors to use for conversion. So there is no way we can fix this using `ExpectsInputTypes`; sorry about the confusion... We have only changed the default datatype for decimal conversion, so I guess your fix is OK.
[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13906 Anyway, thank you for the review again, @rxin!
[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13938 Merged build finished. Test PASSed.
[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13938 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61355/ Test PASSed.
[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13938 **[Test build #61355 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61355/consoleFull)** for PR 13938 at commit [`7455a49`](https://github.com/apache/spark/commit/7455a4925ea0f859ea3978930f03e972a7e07929).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r68701978
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1053,6 +1055,41 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
 }
 /**
+ * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
+ * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
+ * {{{
+ * SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
+ * SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
+ * }}}
+ */
+object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
+  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
+    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
+    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
+
+    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
+    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
+      LocalRelation(a.output, data = Seq.empty)
+
+    // Case 2: General aggregations can generate non-empty results.
+    case a: Aggregate => a
+
+    // Case 3: The following non-leaf plans having only empty relations return empty results.
+    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
+      p match {
+        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
--- End diff --
Yep, right! I'll add them explicitly.
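One reason an aggregate over an empty relation cannot always be collapsed (the "Case 2" comment in the diff above) can be sketched with plain Scala collections. This is a toy model, not Catalyst: `EmptyAggregateSketch`, `groupedSum`, and `globalCount` are hypothetical names introduced only for illustration.

```scala
object EmptyAggregateSketch {
  // Grouped aggregate: one output row per distinct key,
  // so an empty input produces no rows at all.
  def groupedSum(rows: Seq[(String, Int)]): Map[String, Int] =
    rows.groupBy(_._1).map { case (key, group) => key -> group.map(_._2).sum }

  // Global aggregate (no GROUP BY): always exactly one output row,
  // even when the input is empty (the row then holds 0).
  def globalCount(rows: Seq[(String, Int)]): Seq[Long] =
    Seq(rows.size.toLong)
}
```

With an empty input, `groupedSum` yields no groups while `globalCount` still yields a single row, which is why a rule of this kind has to treat aggregates more carefully than projections or filters.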
[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r68701768
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1053,6 +1055,41 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
 }
 /**
+ * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
+ * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
+ * {{{
+ * SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
+ * SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
+ * }}}
+ */
+object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
+  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
+    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
+
+  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
+    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
+
+    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
+    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
+      LocalRelation(a.output, data = Seq.empty)
+
+    // Case 2: General aggregations can generate non-empty results.
+    case a: Aggregate => a
+
+    // Case 3: The following non-leaf plans having only empty relations return empty results.
+    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
+      p match {
+        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
--- End diff --
Actually, for intersect you only need one child to be empty. For join, if it is an inner join, you just need one child to be empty too.
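The review point above (one empty child is enough for intersect and for inner join) can be sketched on a toy plan tree. This is a minimal illustration under stated assumptions, not Spark's API: `Plan`, `Relation`, `Intersect`, `Join`, `JoinType`, and `collapse` are hypothetical stand-ins, and non-inner joins are handled conservatively by requiring both children to be empty.

```scala
object OneEmptyChildSketch {
  sealed trait JoinType
  case object Inner extends JoinType
  case object LeftOuter extends JoinType

  // Hypothetical, simplified plan nodes (not Catalyst's LogicalPlan).
  sealed trait Plan
  final case class Relation(rows: Seq[Int]) extends Plan
  final case class Intersect(left: Plan, right: Plan) extends Plan
  final case class Join(left: Plan, right: Plan, joinType: JoinType) extends Plan

  val Empty: Plan = Relation(Seq.empty)

  private def isEmpty(p: Plan): Boolean = p match {
    case Relation(rows) => rows.isEmpty
    case _              => false
  }

  // Bottom-up rewrite: collapse the children first, then the node itself.
  def collapse(plan: Plan): Plan = plan match {
    case r: Relation => r
    case Intersect(l, r) =>
      val (cl, cr) = (collapse(l), collapse(r))
      // An intersection with an empty side is empty.
      if (isEmpty(cl) || isEmpty(cr)) Empty else Intersect(cl, cr)
    case Join(l, r, jt) =>
      val (cl, cr) = (collapse(l), collapse(r))
      val collapsible = jt match {
        case Inner => isEmpty(cl) || isEmpty(cr) // one empty side suffices
        case _     => isEmpty(cl) && isEmpty(cr) // conservative for other join types
      }
      if (collapsible) Empty else Join(cl, cr, jt)
  }
}
```

In this sketch a left outer join with a non-empty left side and an empty right side is left untouched, since it can still return the left rows.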
[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r68701793
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class CollapseEmptyPlanSuite extends PlanTest {
--- End diff --
Er, any other scenarios besides the following existing ones?
- test("one non-empty local relation")
- test("one non-empty and one empty local relations")
- test("aggregating expressions on empty plan")
[GitHub] spark issue #13931: [SPARK-16224] [SQL] [PYSPARK] SparkSession builder's con...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13931 **[Test build #61360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61360/consoleFull)** for PR 13931 at commit [`793afb9`](https://github.com/apache/spark/commit/793afb91f5e573e40d78bb2aa8a9bf89154396f2).
[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r68701620
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.catalyst.dsl.expressions._
+import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.rules.RuleExecutor
+
+class CollapseEmptyPlanSuite extends PlanTest {
--- End diff --
You should also test something that shouldn't be converted.
[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13906 Hi, @rxin. I just remembered this PR while looking at your whitelist PR. :) Any advice for this PR?
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61356/ Test FAILed.
[GitHub] spark issue #13939: [SPARK-16248][SQL] Whitelist the list of Hive fallback f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13939 **[Test build #61358 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61358/consoleFull)** for PR 13939 at commit [`ef5db42`](https://github.com/apache/spark/commit/ef5db42b6630c7c891c9f0e5252daf4a37ddca91).
[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11863 **[Test build #61359 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61359/consoleFull)** for PR 11863 at commit [`db95290`](https://github.com/apache/spark/commit/db9529066e9c9dab145f09f2332284f6869ed312).
[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13939#discussion_r68701105
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -221,4 +214,18 @@ private[sql] class HiveSessionCatalog(
 }
 }
 }
+
+  /** List of functions we pass over to Hive. Note that over time this list should go to 0. */
+  // We have a list of Hive built-in functions that we do not support. So, we will check
+  // Hive's function registry and lazily load needed functions into our own function registry.
+  // Those Hive built-in functions are
+  // compute_stats, context_ngrams, create_union,
+  // current_user ,elt, ewah_bitmap, ewah_bitmap_and, ewah_bitmap_empty, ewah_bitmap_or, field,
+  // histogram_numeric, in_file, index, inline, java_method, map_keys, map_values,
+  // matchpath, ngrams, noop, noopstreaming, noopwithmap, noopwithmapstreaming,
+  // parse_url, parse_url_tuple, percentile, percentile_approx, posexplode, reflect, reflect2,
+  // regexp, sentences, stack, std, str_to_map, windowingtablefunction, xpath, xpath_boolean,
+  // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
+  // xpath_short, and xpath_string.
+  private val hiveFunctions = Seq("percentile", "percentile_approx")
--- End diff --
Oh.
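As a rough illustration of how a whitelist like `hiveFunctions` could gate the fallback to Hive, here is a self-contained sketch. It is not the PR's actual lookup logic: `HiveFallbackSketch`, `builtinFunctions`, `Resolution`, and `resolve` are hypothetical names, the built-in set is made up, and the case-insensitive matching is an assumption.

```scala
object HiveFallbackSketch {
  // Stand-in for the native function registry; contents are illustrative only.
  private val builtinFunctions = Set("abs", "upper")

  // The whitelist quoted in the diff above.
  private val hiveFunctions = Seq("percentile", "percentile_approx")

  sealed trait Resolution
  case object Native extends Resolution
  case object HiveFallback extends Resolution

  // Resolve natively first; fall back to Hive only for whitelisted names.
  // Lower-casing the name is an assumption of this sketch.
  def resolve(name: String): Either[String, Resolution] = {
    val key = name.toLowerCase(java.util.Locale.ROOT)
    if (builtinFunctions.contains(key)) Right(Native)
    else if (hiveFunctions.contains(key)) Right(HiveFallback)
    else Left(s"Undefined function: $name")
  }
}
```

Under these assumptions, any Hive built-in that is not on the whitelist is reported as undefined instead of being loaded lazily from Hive's registry.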
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Merged build finished. Test FAILed.
[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/13921 I think the error occurred because this PR left `predict`, `write.ml`, etc. documented without a title. So this PR has to be combined with SPARK-16144. Basically, let us add some doc to the function declarations under `generics.R`. cc: @yinxusen
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61356/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68700984

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
       // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
       // xpath_short, and xpath_string.
       override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
    +    try {
    +      subLookupFunction(name, children)
    +    } catch {
--- End diff --

Oh, @rxin. I misunderstood your question. Yes. We don't register the Hive function before.
[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13939#discussion_r68700956

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -162,17 +162,6 @@ private[sql] class HiveSessionCatalog(
         }
       }
    -  // We have a list of Hive built-in functions that we do not support. So, we will check
    -  // Hive's function registry and lazily load needed functions into our own function registry.
    -  // Those Hive built-in functions are
    -  // assert_true, collect_list, collect_set, compute_stats, context_ngrams, create_union,
--- End diff --

assert_true, collect_list, collect_set are supported already
[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/13939

[SPARK-16248][SQL] Whitelist the list of Hive fallback functions - WIP

## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark hive-whitelist

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13939.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #13939

commit ef5db42b6630c7c891c9f0e5252daf4a37ddca91
Author: Reynold Xin
Date: 2016-06-28T05:53:22Z

    [SPARK-16248][SQL] Whitelist the list of Hive fallback functions
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13937 **[Test build #61351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61351/consoleFull)** for PR 13937 at commit [`ce04e08`](https://github.com/apache/spark/commit/ce04e08e5fff17ecdf47a1934ae8a453d051b67e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61357 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61357/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13937 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61351/ Test PASSed.
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13937 Merged build finished. Test PASSed.
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68700695

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
       // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
       // xpath_short, and xpath_string.
       override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
    +    try {
    +      subLookupFunction(name, children)
    +    } catch {
--- End diff --

I mean we need to call `createTempFunction` with `double` children instead of `decimal` children.
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68700636

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
       // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
       // xpath_short, and xpath_string.
       override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
    +    try {
    +      subLookupFunction(name, children)
    +    } catch {
--- End diff --

@rxin. Actually, we do call `createTempFunction` for the Hive function on the fly, but with a **different** signature (Decimal). `makeFunctionBuilder` indeed uses `children` implicitly. That's why I renamed `lookupFunction` to `subLookupFunction` and repeat the same process with different children.

```
val builder = makeFunctionBuilder(functionName, className)
// Put this Hive built-in function to our function registry.
val info = new ExpressionInfo(className, functionName)
createTempFunction(functionName, info, builder, ignoreIfExists = false)
```
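The try-then-retry shape described in this comment can be sketched outside Spark. This is a toy model only: `Arg`, `DecimalArg`, `DoubleArg`, `subLookup`, and `lookup` are hypothetical names, and the decimal-to-double conversion in the retry is an assumption drawn from the neighbouring comments in this thread, not the PR's actual code.

```scala
import scala.util.control.NonFatal

object FallbackLookupSketch {
  // Hypothetical stand-ins for Catalyst expressions; none of this is Spark's API.
  sealed trait Arg
  final case class DecimalArg(value: BigDecimal) extends Arg
  final case class DoubleArg(value: Double) extends Arg

  // Pretend function builder that only accepts double arguments.
  def subLookup(name: String, children: Seq[Arg]): String = {
    if (children.exists(_.isInstanceOf[DecimalArg])) {
      throw new IllegalArgumentException("No matching signature for " + name)
    }
    val args = children.collect { case DoubleArg(v) => v.toString }.mkString(", ")
    name + "(" + args + ")"
  }

  // Try the arguments as given; on failure, retry once with decimals converted to doubles.
  def lookup(name: String, children: Seq[Arg]): String = {
    try {
      subLookup(name, children)
    } catch {
      case NonFatal(_) =>
        val retried = children.map {
          case DecimalArg(v) => DoubleArg(v.toDouble)
          case other => other
        }
        subLookup(name, retried)
    }
  }
}
```

The retry covers only the one conversion it hard-codes, so any other type mismatch would still fail on the second attempt.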
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61356/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13806 Again, I think the error message is not related to this change. I will retest this and meanwhile try to build locally.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13806 retest this please
[GitHub] spark pull request #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL pro...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13938
[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13938 LGTM - merging in master/2.0.
[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13938 **[Test build #61355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61355/consoleFull)** for PR 13938 at commit [`7455a49`](https://github.com/apache/spark/commit/7455a4925ea0f859ea3978930f03e972a7e07929).
[GitHub] spark issue #9183: [SPARK-11215] [ML] Add multiple columns support to String...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/9183 I think @yanboliang just needs to push this forward and get people to review it.
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68700193

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
       // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
       // xpath_short, and xpath_string.
       override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
    +    try {
    +      subLookupFunction(name, children)
    +    } catch {
--- End diff --

Regarding the following point, I think that is exactly the same behavior as in Spark 1.6 and earlier, so I don't think it is a problem.

> this will fail again as soon as we pass in an argument with a slightly different value
[GitHub] spark issue #13839: [SPARK-16128][SQL] Allow setting length of characters to...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13839 LGTM pending tests.
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68700137

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
       // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
       // xpath_short, and xpath_string.
       override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
    +    try {
    +      subLookupFunction(name, children)
    +    } catch {
--- End diff --

yea i think the problem is that we don't register the hive function?
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68700034

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
       // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
       // xpath_short, and xpath_string.
       override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
    +    try {
    +      subLookupFunction(name, children)
    +    } catch {
--- End diff --

Hi, @hvanhovell. I tried again, but, as you saw in my first commit, this happens while resolving `UnresolvedFunction`.

https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L884

IMHO, we cannot do this in `ExpectsInputTypes`.
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13937 **[Test build #61349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61349/consoleFull)** for PR 13937 at commit [`5246bcf`](https://github.com/apache/spark/commit/5246bcfa1ba510c281c456b0f61bf32f70d10174).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #13839: [SPARK-16128][SQL] Allow setting length of characters to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13839 **[Test build #61354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61354/consoleFull)** for PR 13839 at commit [`b170741`](https://github.com/apache/spark/commit/b170741c4b286893e20b8894f20812af1d6e6fd4).
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68699881

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
       // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
       // xpath_short, and xpath_string.
       override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
    +    try {
    +      subLookupFunction(name, children)
    +    } catch {
--- End diff --

@dongjoon-hyun the current fix is quite brittle; this will fail again as soon as we pass in an argument with a slightly different value. The Analyzer will create casts to the proper type if we implement `ExpectsInputTypes`. So this seems like the best course of action. It might not be the easiest fix, or entirely possible; but I'd prefer to try this first.
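The idea behind this suggestion, that a function declares the input types it expects and an analysis rule inserts casts where the actual types differ, can be illustrated with a toy model. Everything below (`DataType`, `Literal`, `Cast`, `Func`, `coerce`) is a hypothetical, simplified stand-in written for this note; it only borrows familiar names and is not Catalyst's `ExpectsInputTypes` or its type-coercion rules.

```scala
object ImplicitCastSketch {
  sealed trait DataType
  case object DoubleType extends DataType
  case object DecimalType extends DataType

  sealed trait Expr { def dataType: DataType }
  final case class Literal(value: Any, dataType: DataType) extends Expr
  final case class Cast(child: Expr, dataType: DataType) extends Expr

  // Hypothetical function node that declares the type it expects for each child.
  final case class Func(name: String, children: Seq[Expr], inputTypes: Seq[DataType])
      extends Expr {
    val dataType: DataType = DoubleType
  }

  // Analyzer-style rule: wrap every child whose type differs from the declared one in a Cast.
  def coerce(f: Func): Func = {
    require(f.children.length == f.inputTypes.length, "one declared type per child")
    val fixed = f.children.zip(f.inputTypes).map {
      case (child, expected) if child.dataType != expected => Cast(child, expected)
      case (child, _) => child
    }
    f.copy(children = fixed)
  }
}
```

In this model the cast is driven by the declared types rather than by catching a failure, so any argument whose type differs from the declaration gets the same treatment.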
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13937 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61349/ Test PASSed.
[GitHub] spark issue #9183: [SPARK-11215] [ML] Add multiple columns support to String...
Github user pkch commented on the issue: https://github.com/apache/spark/pull/9183 What needs to happen to move this forward? This was a PR that would have been the first iteration of a significant improvement in handling of wide datasets.
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13937 Merged build finished. Test PASSed.
[GitHub] spark pull request #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL pro...
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/13938

[SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programming guide.

## What changes were proposed in this pull request?
This PR makes several updates to the SQL programming guide.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yhuai/spark doc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13938.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #13938

commit ce0f54e074099f2c416169d5f62f93b23587f43a
Author: Yin Huai
Date: 2016-06-28T04:20:12Z

    wip

commit 7455a4925ea0f859ea3978930f03e972a7e07929
Author: Yin Huai
Date: 2016-06-28T05:26:33Z

    [SPARK-15863][SQL][DOC] Update SQL programming guide.
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/13937 LGTM
[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13933 LGTM -- cc @tdas to take a look since he wrote the original patch.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61353/ Test FAILed.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61353/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Merged build finished. Test FAILed.
[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13517#discussion_r68699390

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
    @@ -435,6 +434,37 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
       }
     
       /**
    +   * Parse a key-value map from a [[OptionParameterListContext]], assuming all values are
    +   * specified. This allows string, boolean, decimal and integer literals which are converted
    +   * to strings.
    +   */
    +  override def visitOptionParameterList(ctx: OptionParameterListContext): Map[String, String] = {
    +    // TODO: Currently it does not treat null. Hive does not allow null for metadata and
    +    // throws an exception.
    +    val properties = ctx.optionParameter.asScala.map { property =>
    +      val key = visitTablePropertyKey(property.key)
    +      val value = if (property.value.STRING != null) {
    +        string(property.value.STRING)
    +      } else if (property.value.booleanValue != null) {
    +        property.value.getText.toLowerCase
    +      } else {
    +        property.value.getText
    +      }
    +      key -> value
    +    }
    +
    +    // Check for duplicate property names.
    +    checkDuplicateKeys(properties, ctx)
    +    val props = properties.toMap
    +    val badKeys = props.filter { case (_, v) => v == null }.keys
    --- End diff --

NIT (not your code): `val badKeys = props.collect { case (key, null) => key }`
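For readers following the NIT above, here is a minimal, self-contained sketch of why the two expressions select the same keys. The object name `NullValueKeys`, the method names, and the sample option map are hypothetical and are not part of the PR; only the two one-line expressions come from the diff and the review comment.

```scala
object NullValueKeys {
  // Variant from the diff: keep the entries whose value is null, then take their keys.
  def viaFilter(props: Map[String, String]): Iterable[String] =
    props.filter { case (_, v) => v == null }.keys

  // Variant from the review comment: match the null literal directly in the pattern,
  // so selecting and projecting the key happen in a single step.
  def viaCollect(props: Map[String, String]): Iterable[String] =
    props.collect { case (key, null) => key }

  def main(args: Array[String]): Unit = {
    // Hypothetical option map in which "path" was given without a value.
    val props = Map[String, String]("header" -> "true", "path" -> null, "delimiter" -> ",")
    println(viaFilter(props).toSet)
    println(viaCollect(props).toSet)
  }
}
```

Both variants return only the keys whose value is `null`; `collect` simply avoids building the intermediate filtered map.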
[GitHub] spark issue #13936: [SPARK-16243][ML] model loading backward compatibility f...
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/13936 Just saw @yanboliang opened a jira for this too. I'll close the PR and resolve the jira.
[GitHub] spark pull request #13936: [SPARK-16243][ML] model loading backward compatib...
Github user hhbyyh closed the pull request at: https://github.com/apache/spark/pull/13936
[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13517#discussion_r68699131

    --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
    @@ -45,11 +45,11 @@ statement
         | ALTER DATABASE identifier SET DBPROPERTIES tablePropertyList #setDatabaseProperties
         | DROP DATABASE (IF EXISTS)? identifier (RESTRICT | CASCADE)? #dropDatabase
         | createTableHeader ('(' colTypeList ')')? tableProvider
    -        (OPTIONS tablePropertyList)?
    +        (OPTIONS optionParameterList)?
             (PARTITIONED BY partitionColumnNames=identifierList)?
             bucketSpec? #createTableUsing
         | createTableHeader tableProvider
    -        (OPTIONS tablePropertyList)?
    --- End diff --

Why not generalize the `tableProperty` rule and use `optionValue` (rename it to something more consistent) as its value rule? Seems easier.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61353 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61353/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13806 retest this please
[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13933 cc @rxin The code is ready for review. Thanks!
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13806 Hm... am I doing something wrong here?
[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13517#discussion_r68698738

    --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
    @@ -252,6 +252,21 @@ tablePropertyKey
         | STRING
         ;
     
    +optionParameterList
    +    : '(' optionParameter (',' optionParameter)* ')'
    +    ;
    +
    +optionParameter
    +    : key=tablePropertyKey (EQ? value=optionValue)?
    --- End diff --

We could remove `EQ?` here. This is actually not supported by data source tables.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61352/ Test FAILed.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61352/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Merged build finished. Test FAILed.
[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13918 Yea it's good to have this in branch-2.0.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61352/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13806 retest this please
[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13918 Thank you for merging, @liancheng ! :)
[GitHub] spark issue #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordReader in ...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13914 Thank you for merging, @rxin .
[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13918 Thanks, merged to master. @rxin Shall we have this in branch-2.0 at this stage?
[GitHub] spark issue #13915: [SPARK-16081][BUILD] Disallow using `l` as variable name
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13915 @mengxr 's idea sounds good to me, too. May I update this PR, @rxin ?
[GitHub] spark pull request #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger vi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13918
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13937 **[Test build #61351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61351/consoleFull)** for PR 13937 at commit [`ce04e08`](https://github.com/apache/spark/commit/ce04e08e5fff17ecdf47a1934ae8a453d051b67e).
[GitHub] spark pull request #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordRea...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13914
[GitHub] spark issue #13517: [SPARK-14839][SQL] Support for other types as option in ...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13517 cc @hvanhovell for this one
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61337/ Test FAILed.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Merged build finished. Test FAILed.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61337 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61337/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13937 **[Test build #61350 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61350/consoleFull)** for PR 13937 at commit [`8be63d5`](https://github.com/apache/spark/commit/8be63d5dbd8e3e62fd23248efa6be826e09e3ce3).
[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/13930#discussion_r68697699

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
    @@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
       // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
       // xpath_short, and xpath_string.
       override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
    +    try {
    +      subLookupFunction(name, children)
    +    } catch {
    --- End diff --

Thank you for the advice, @hvanhovell . Do you mean adding `ExpectsInputTypes` to `HiveSimpleUDF`, `HiveGenericUDF`, `HiveUDAFFunction`? We only have 4 expressions to handle all generic Hive functions. So, currently, `makeFunctionBuilder` seems to do the type checking by calling `udf.dataType` on the fly.
[GitHub] spark issue #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordReader in ...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13914 Merging in master/2.0.
[GitHub] spark issue #13915: [SPARK-16081][BUILD] Disallow using `l` as variable name
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13915 Yea, I think you can argue this should be discouraged, but that does not necessarily justify banning it.
[GitHub] spark issue #13891: [SPARK-6685][MLLIB]Use DSYRK to compute AtA in ALS
Github user hqzizania commented on the issue: https://github.com/apache/spark/pull/13891 @mengxr this is a simple imitation of the loop that ALS uses in `computeFactors[ID]()`. It runs on a bare-metal node with 4 cores. All tests use all cores via multiple RDD partitions.
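As background for the comparison discussed in this PR, the sketch below contrasts two ways of forming AtA: accumulating one rank-1 term per row, and stacking the rows first and forming the product in a single rank-k pass, which is the shape of work that DSYRK performs. This is an illustration under assumptions only: plain Scala loops stand in for the BLAS routines, and the object and method names are hypothetical. It is neither the benchmark described above nor the PR's code.

```scala
object AtASketch {
  // Accumulate A^T A one row at a time: each row a contributes the rank-1 term a * a^T.
  def rankOneUpdates(rows: Seq[Array[Double]], k: Int): Array[Array[Double]] = {
    val ata = Array.ofDim[Double](k, k)
    for (a <- rows; i <- 0 until k; j <- 0 until k) {
      ata(i)(j) += a(i) * a(j)
    }
    ata
  }

  // Stack all rows into A first, then form A^T A in one pass over the stacked matrix.
  def rankKUpdate(rows: Seq[Array[Double]], k: Int): Array[Array[Double]] = {
    val a = rows.toArray
    Array.tabulate(k, k) { (i, j) =>
      var sum = 0.0
      var r = 0
      while (r < a.length) {
        sum += a(r)(i) * a(r)(j)
        r += 1
      }
      sum
    }
  }

  def main(args: Array[String]): Unit = {
    // Two hypothetical factor rows of rank 2.
    val rows = Seq(Array(1.0, 2.0), Array(3.0, 4.0))
    println(rankOneUpdates(rows, 2).map(_.mkString(" ")).mkString("\n"))
    println(rankKUpdate(rows, 2).map(_.mkString(" ")).mkString("\n"))
  }
}
```

Both functions produce the same matrix; the difference that matters in practice is that the stacked form can be handed to a single level-3 BLAS call instead of many small level-2 calls.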
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Merged build finished. Test FAILed.
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13806 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61336/ Test FAILed.
[GitHub] spark pull request #13937: [SPARK-16245] [ML] model loading backward compati...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/13937#discussion_r68697383

    --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala ---
    @@ -206,24 +206,21 @@ object PCAModel extends MLReadable[PCAModel] {
         override def load(path: String): PCAModel = {
           val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
     
    -      // explainedVariance field is not present in Spark <= 1.6
    -      val versionRegex = "([0-9]+)\\.([0-9]+).*".r
    -      val hasExplainedVariance = metadata.sparkVersion match {
    -        case versionRegex(major, minor) =>
    -          major.toInt >= 2 || (major.toInt == 1 && minor.toInt > 6)
    -        case _ => false
    -      }
    +      val versionRegex = "([0-9]+)\\.(.+)".r
    +      val versionRegex(major, _) = metadata.sparkVersion
     
           val dataPath = new Path(path, "data").toString
    -      val model = if (hasExplainedVariance) {
    +      val model = if (major.toInt >= 2) {
             val Row(pc: DenseMatrix, explainedVariance: DenseVector) =
               sparkSession.read.parquet(dataPath)
                 .select("pc", "explainedVariance")
                 .head()
             new PCAModel(metadata.uid, pc, explainedVariance)
           } else {
    -        val Row(pc: DenseMatrix) = sparkSession.read.parquet(dataPath).select("pc").head()
    -        new PCAModel(metadata.uid, pc, Vectors.dense(Array.empty[Double]).asInstanceOf[DenseVector])
    +        // explainedVariance field is not present and we use the old matrix in Spark <= 2.0
    +        val Row(pc: OldDenseMatrix) = sparkSession.read.parquet(dataPath).select("pc").head()
    +        new PCAModel(metadata.uid, pc.asML,
    +          Vectors.dense(Array.empty[Double]).asInstanceOf[DenseVector])
    --- End diff --

Here we combine the `explainedVariance` field issue and the old matrix issue together to handle backward compatibility.
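The version check in the diff above can be tried in isolation. The sketch below reuses the regex from the added lines but wraps it in a `match` that returns an `Option`, because a bare pattern binding such as `val versionRegex(major, _) = ...` throws `scala.MatchError` when the string does not match. The object `SparkVersionMajor` and the helper `usesNewFormat` are hypothetical names for illustration, not code from the PR.

```scala
object SparkVersionMajor {
  // Same shape as the regex in the diff: leading digits, a dot, then a non-empty remainder.
  private val versionRegex = "([0-9]+)\\.(.+)".r

  // Returns the major version, or None when the string does not match the pattern.
  def major(sparkVersion: String): Option[Int] = sparkVersion match {
    case versionRegex(m, _) => Some(m.toInt)
    case _ => None
  }

  // Hypothetical helper mirroring the branch condition `major.toInt >= 2` in the diff.
  def usesNewFormat(sparkVersion: String): Boolean =
    major(sparkVersion).exists(_ >= 2)

  def main(args: Array[String]): Unit = {
    println(major("2.0.0-SNAPSHOT"))
    println(major("1.6.1"))
    println(usesNewFormat("1.6.1"))
  }
}
```

Whether an unparseable version should fall back to the old format, as `usesNewFormat` does here, or fail loudly is a design choice that this sketch does not settle.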
[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13806 **[Test build #61336 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61336/consoleFull)** for PR 13806 at commit [`2a55091`](https://github.com/apache/spark/commit/2a550912f1194e9c212d9f4f78824eaf375ddccc).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.