[GitHub] spark pull request: [SPARK-13992][Core][PySpark][FollowUp] Update ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12126#issuecomment-209240468 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13992][Core][PySpark][FollowUp] Update ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12126#issuecomment-209240472 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55684/ Test PASSed.
[GitHub] spark pull request: [SPARK-13992][Core][PySpark][FollowUp] Update ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12126#issuecomment-209240279
**[Test build #55684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55684/consoleFull)** for PR 12126 at commit [`273c0aa`](https://github.com/apache/spark/commit/273c0aa6c1e17ba6853e418f4d3e6035aee2fc2f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14573][PYSPARK][BUILD] Fix PyDoc Makefi...
Github user BryanCutler commented on the pull request: https://github.com/apache/spark/pull/12336#issuecomment-209238027 `?=` seems to do the trick, LGTM
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209233666 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209233668 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55686/ Test FAILed.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209233403
**[Test build #55686 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55686/consoleFull)** for PR 12342 at commit [`099b668`](https://github.com/apache/spark/commit/099b66886a7a4650f6eea2cb2061cb7ebcfad3d3).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13419] Update SubquerySuite to use chec...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12269#issuecomment-209232767 **[Test build #55691 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55691/consoleFull)** for PR 12269 at commit [`e58cf23`](https://github.com/apache/spark/commit/e58cf23605955ab18f23971b7ab615e10ae2886e).
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209232569 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55685/ Test FAILed.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209232566 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209232491
**[Test build #55685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55685/consoleFull)** for PR 12342 at commit [`3c11e4c`](https://github.com/apache/spark/commit/3c11e4c2084475b2cb3c7542fd519f58b92dc178).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13992][Core][PySpark][FollowUp] Update ...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/12126#issuecomment-209232413 LGTM pending Jenkins. Thanks!
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209232165 btw https://spark-prs.appspot.com/ already does this for you
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209232172 **[Test build #55689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55689/consoleFull)** for PR 12349 at commit [`9a8b509`](https://github.com/apache/spark/commit/9a8b509d66d0f0c9f9fd2075f6023975fc7ea1fe).
[GitHub] spark pull request: [SPARK-14589][SQL] Enhance DB2 JDBC Dialect do...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12348#issuecomment-209232158 **[Test build #55690 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55690/consoleFull)** for PR 12348 at commit [`7250fce`](https://github.com/apache/spark/commit/7250fce8205690ae8265c33e319257660b3687b2).
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209229451 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209229455 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55682/ Test FAILed.
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209229345
**[Test build #55682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55682/consoleFull)** for PR 12349 at commit [`d927d13`](https://github.com/apache/spark/commit/d927d13e6d99e36828d980f091e462c707f76986).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209228701 **[Test build #55688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55688/consoleFull)** for PR 12342 at commit [`21c8482`](https://github.com/apache/spark/commit/21c848267346e974b1845198083da070537f558e).
[GitHub] spark pull request: [SPARK-14589][SQL] Enhance DB2 JDBC Dialect do...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12348#issuecomment-209228513 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55681/ Test FAILed.
[GitHub] spark pull request: [SPARK-14589][SQL] Enhance DB2 JDBC Dialect do...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12348#issuecomment-209228510 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14589][SQL] Enhance DB2 JDBC Dialect do...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12348#issuecomment-209228412
**[Test build #55681 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55681/consoleFull)** for PR 12348 at commit [`1f9d6fe`](https://github.com/apache/spark/commit/1f9d6fe089335fe33342d9b974cab1f8e9fb0bd5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `assert(types(0).equals("class java.lang.Integer"))`
   * `assert(types(1).equals("class java.lang.Integer"))`
   * `assert(types(2).equals("class java.lang.Long"))`
   * `assert(types(3).equals("class java.math.BigDecimal"))`
   * `assert(types(4).equals("class java.lang.Double"))`
   * `assert(types(5).equals("class java.lang.Double"))`
   * `assert(types(3).equals("class [B"))`
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209228199 LGTM
[GitHub] spark pull request: [SPARK-14499] [SQL] [TEST] Drop Partition Does...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12350#issuecomment-209227366 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14499] [SQL] [TEST] Drop Partition Does...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12350#issuecomment-209227367 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55683/ Test PASSed.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59492783
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -975,6 +940,68 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel
         } else {
           filter
         }
+
+    case filter @ Filter(condition, child)
+      if child.isInstanceOf[Union] || child.isInstanceOf[Intersect] =>
+      // Union/Intersect could change the rows, so non-deterministic predicate can't be pushed down
+      val (pushDown, stayUp) = splitConjunctivePredicates(condition).partition { cond =>
+        cond.deterministic
+      }
+      if (pushDown.nonEmpty) {
+        val pushDownCond = pushDown.reduceLeft(And)
+        val output = child.output
+        val newGrandChildren = child.children.map { grandchild =>
+          val newCond = pushDownCond transform {
+            case e if output.exists(_.semanticEquals(e)) =>
--- End diff --
This can be simplified to: `case a: Attribute => grandchild.output(output.indexWhere(_.semanticEquals(a)))`. The filter condition can only reference the child's output, so every attribute should exist in `child.output`.
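The positional rewrite suggested in the review comment above can be illustrated with a minimal, self-contained sketch. It does not use the Spark API: `Expr`, `Attr`, `Lit`, `EqualTo`, `And` and `remap` are hypothetical toy stand-ins, and attribute identity is modelled by a plain integer id rather than Catalyst's `semanticEquals`. The idea is that column *i* of the Union's output corresponds to column *i* of each child's output, so every attribute in the pushed-down condition can be replaced by index.

```scala
// Toy stand-ins for Catalyst expressions; these are NOT the real Spark classes.
object AttributeRemapSketch {
  sealed trait Expr
  final case class Attr(name: String, id: Int) extends Expr
  final case class Lit(value: Long) extends Expr
  final case class EqualTo(left: Expr, right: Expr) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr

  // Replace every attribute of the union's output with the attribute found
  // at the same position in one child's output.
  def remap(cond: Expr, unionOutput: Seq[Attr], childOutput: Seq[Attr]): Expr =
    cond match {
      case a: Attr =>
        // Assumption: the condition only references the union's own output.
        val i = unionOutput.indexWhere(_.id == a.id)
        require(i >= 0, s"$a is not part of the union output")
        childOutput(i)
      case EqualTo(l, r) =>
        EqualTo(remap(l, unionOutput, childOutput), remap(r, unionOutput, childOutput))
      case And(l, r) =>
        And(remap(l, unionOutput, childOutput), remap(r, unionOutput, childOutput))
      case other => other // literals are left untouched
    }
}
```

With a union output of `(a, b)` and a second child whose output is `(d, e)`, calling `remap` on `a = 2` yields `d = 2` for that child. The `require` makes the reviewer's assumption explicit: an attribute that is missing from the union output is treated as a bug rather than silently skipped.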
[GitHub] spark pull request: [SPARK-14499] [SQL] [TEST] Drop Partition Does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12350#issuecomment-209227265
**[Test build #55683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55683/consoleFull)** for PR 12350 at commit [`f282a8d`](https://github.com/apache/spark/commit/f282a8d5a4cc86fa7c46287cbb9ebc48cc2845c8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14388][SQL] Implement CREATE TABLE
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/12271#discussion_r59492629 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveSqlParser.scala --- @@ -121,84 +123,115 @@ class HiveSqlAstBuilder extends SparkSqlAstBuilder { } /** - * Create a [[CatalogStorageFormat]]. This is part of the [[CreateTableAsSelect]] command. + * Create a [[CatalogStorageFormat]] for creating tables. */ override def visitCreateFileFormat( ctx: CreateFileFormatContext): CatalogStorageFormat = withOrigin(ctx) { -if (ctx.storageHandler == null) { - typedVisit[CatalogStorageFormat](ctx.fileFormat) -} else { - visitStorageHandler(ctx.storageHandler) +(ctx.fileFormat, ctx.storageHandler) match { + // Expected format: INPUTFORMAT input_format OUTPUTFORMAT output_format + case (c: TableFileFormatContext, null) => + visitTableFileFormat(c) + // Expected format: SEQUENCEFILE | TEXTFILE | RCFILE | ORC | PARQUET | AVRO + case (c: GenericFileFormatContext, null) => +visitGenericFileFormat(c) + case (null, storageHandler) => +throw new ParseException("Operation not allowed: ... STORED BY storage_handler ...", ctx) + case _ => +throw new ParseException("expected either STORED AS or STORED BY, not both", ctx) } } /** - * Create a [[CreateTableAsSelect]] command. + * Create a table, returning either a [[CreateTable]] or a [[CreateTableAsSelect]]. + * + * This is not used to create datasource tables, which is handled through + * "CREATE TABLE ... USING ...". + * + * Note: several features are currently not supported - temporary tables, bucketing, + * skewed columns and storage handlers (STORED BY). + * + * Expected format: + * {{{ + * CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name + * [(col1 data_type [COMMENT col_comment], ...)] + * [COMMENT table_comment] + * [PARTITIONED BY (col3 data_type [COMMENT col_comment], ...)] + * [CLUSTERED BY (col1, ...) 
[SORTED BY (col1 [ASC|DESC], ...)] INTO num_buckets BUCKETS] + * [SKEWED BY (col1, col2, ...) ON ((col_value, col_value, ...), ...) [STORED AS DIRECTORIES]] + * [ROW FORMAT row_format] + * [STORED AS file_format | STORED BY storage_handler_class [WITH SERDEPROPERTIES (...)]] + * [LOCATION path] + * [TBLPROPERTIES (property_name=property_value, ...)] + * [AS select_statement]; + * }}} */ - override def visitCreateTable(ctx: CreateTableContext): LogicalPlan = { -if (ctx.query == null) { - HiveNativeCommand(command(ctx)) + override def visitCreateTable(ctx: CreateTableContext): LogicalPlan = withOrigin(ctx) { +val (name, temp, ifNotExists, external) = visitCreateTableHeader(ctx.createTableHeader) +// TODO: implement temporary tables +if (temp) { + throw new ParseException( +"CREATE TEMPORARY TABLE is not supported yet. " + +"Please use registerTempTable as an alternative.", ctx) +} +if (ctx.skewSpec != null) { + throw new ParseException("Operation not allowed: CREATE TABLE ... SKEWED BY ...", ctx) +} +if (ctx.bucketSpec != null) { + throw new ParseException("Operation not allowed: CREATE TABLE ... CLUSTERED BY ...", ctx) +} +val tableType = if (external) { + CatalogTableType.EXTERNAL_TABLE } else { - // Get the table header. - val (table, temp, ifNotExists, external) = visitCreateTableHeader(ctx.createTableHeader) - val tableType = if (external) { -CatalogTableType.EXTERNAL_TABLE - } else { -CatalogTableType.MANAGED_TABLE - } - - // Unsupported clauses. - if (temp) { -throw new ParseException(s"Unsupported operation: TEMPORARY clause.", ctx) - } - if (ctx.bucketSpec != null) { -// TODO add this - we need cluster columns in the CatalogTable for this to work. -throw new ParseException("Unsupported operation: " + - "CLUSTERED BY ... [ORDERED BY ...] INTO ... BUCKETS clause.", ctx) - } - if (ctx.skewSpec != null) { -throw new ParseException("Operation not allowed: " + - "SKEWED BY ... ON ... [STORED AS DIRECTORIES] clause.", ctx) - } - - // Create the schema. 
- val schema = Option(ctx.columns).toSeq.flatMap(visitCatalogColumns(_, _.toLowerCase)) - - // Get the column by which the table is partitioned. - val partitionCols =
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59492551
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FilterPushdownSuite.scala ---
@@ -681,4 +679,67 @@ class FilterPushdownSuite extends PlanTest {
     comparePlans(optimized, correctAnswer)
   }
+
+  test("broadcast hint") {
+    val originalQuery = BroadcastHint(testRelation)
+      .where('a === 2L && 'b + Rand(10).as("rnd") === 3)
+
+    val optimized = Optimize.execute(originalQuery.analyze)
+
+    val correctAnswer = BroadcastHint(testRelation.where('a === 2L))
+      .where('b + Rand(10).as("rnd") === 3)
+      .analyze
+
+    comparePlans(optimized, correctAnswer)
+  }
+
+  test("union") {
+    val testRelation2 = LocalRelation('d.int, 'e.int, 'f.int)
+
+    val originalQuery = Union(Seq(testRelation, testRelation2))
+      .where('a === 2L && 'b + Rand(10).as("rnd") === 3)
+
+    val optimized = Optimize.execute(originalQuery.analyze)
+
+    val correctAnswer = Union(Seq(
+      testRelation.where('a === 2L && 'b + Rand(10).as("rnd") === 3),
--- End diff --
The non-deterministic expression shouldn't be pushed.
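The point of the comment above is that only the deterministic conjuncts of a filter may move below the Union. The non-deterministic ones (here the one containing `Rand(10)`) have to stay in a filter above it. A minimal sketch of that split follows, again with hypothetical toy types (`Pred`, `And`, `splitConjuncts`, `partitionForPushdown`) rather than the real Catalyst classes. The `deterministic` flag is supplied by hand here, whereas Catalyst derives it from the expression tree.

```scala
// Toy model of predicate splitting; names and types are illustrative only.
object DeterministicSplitSketch {
  sealed trait Expr { def deterministic: Boolean }
  final case class Pred(sql: String, deterministic: Boolean) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr {
    // A conjunction is deterministic only if both sides are.
    def deterministic: Boolean = left.deterministic && right.deterministic
  }

  // Flatten nested ANDs into the list of their conjuncts.
  def splitConjuncts(e: Expr): Seq[Expr] = e match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => Seq(other)
  }

  // Returns (condition to push below the union, condition that must stay above).
  def partitionForPushdown(cond: Expr): (Option[Expr], Option[Expr]) = {
    val (pushDown, stayUp) = splitConjuncts(cond).partition(_.deterministic)
    val combine = (l: Expr, r: Expr) => And(l, r)
    (pushDown.reduceLeftOption(combine), stayUp.reduceLeftOption(combine))
  }
}
```

For the condition in the `union` test, `a = 2 AND b + rand(10) = 3`, this returns `a = 2` as the part to push into each child and `b + rand(10) = 3` as the part to keep above the Union, which is the shape the expected plan in the test would need to have.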
[GitHub] spark pull request: [SPARK-14388][SQL] Implement CREATE TABLE
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/12271#discussion_r59492242 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveSqlParser.scala --- @@ -121,84 +123,115 @@ class HiveSqlAstBuilder extends SparkSqlAstBuilder { } /** - * Create a [[CatalogStorageFormat]]. This is part of the [[CreateTableAsSelect]] command. + * Create a [[CatalogStorageFormat]] for creating tables. */ override def visitCreateFileFormat( ctx: CreateFileFormatContext): CatalogStorageFormat = withOrigin(ctx) { -if (ctx.storageHandler == null) { - typedVisit[CatalogStorageFormat](ctx.fileFormat) -} else { - visitStorageHandler(ctx.storageHandler) +(ctx.fileFormat, ctx.storageHandler) match { + // Expected format: INPUTFORMAT input_format OUTPUTFORMAT output_format + case (c: TableFileFormatContext, null) => + visitTableFileFormat(c) + // Expected format: SEQUENCEFILE | TEXTFILE | RCFILE | ORC | PARQUET | AVRO + case (c: GenericFileFormatContext, null) => +visitGenericFileFormat(c) + case (null, storageHandler) => +throw new ParseException("Operation not allowed: ... STORED BY storage_handler ...", ctx) + case _ => +throw new ParseException("expected either STORED AS or STORED BY, not both", ctx) } } /** - * Create a [[CreateTableAsSelect]] command. + * Create a table, returning either a [[CreateTable]] or a [[CreateTableAsSelect]]. + * + * This is not used to create datasource tables, which is handled through + * "CREATE TABLE ... USING ...". + * + * Note: several features are currently not supported - temporary tables, bucketing, + * skewed columns and storage handlers (STORED BY). + * + * Expected format: + * {{{ + * CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name + * [(col1 data_type [COMMENT col_comment], ...)] + * [COMMENT table_comment] + * [PARTITIONED BY (col3 data_type [COMMENT col_comment], ...)] + * [CLUSTERED BY (col1, ...) 
[SORTED BY (col1 [ASC|DESC], ...)] INTO num_buckets BUCKETS] + * [SKEWED BY (col1, col2, ...) ON ((col_value, col_value, ...), ...) [STORED AS DIRECTORIES]] + * [ROW FORMAT row_format] + * [STORED AS file_format | STORED BY storage_handler_class [WITH SERDEPROPERTIES (...)]] + * [LOCATION path] + * [TBLPROPERTIES (property_name=property_value, ...)] + * [AS select_statement]; + * }}} */ - override def visitCreateTable(ctx: CreateTableContext): LogicalPlan = { -if (ctx.query == null) { - HiveNativeCommand(command(ctx)) + override def visitCreateTable(ctx: CreateTableContext): LogicalPlan = withOrigin(ctx) { +val (name, temp, ifNotExists, external) = visitCreateTableHeader(ctx.createTableHeader) +// TODO: implement temporary tables +if (temp) { + throw new ParseException( +"CREATE TEMPORARY TABLE is not supported yet. " + +"Please use registerTempTable as an alternative.", ctx) +} +if (ctx.skewSpec != null) { + throw new ParseException("Operation not allowed: CREATE TABLE ... SKEWED BY ...", ctx) +} +if (ctx.bucketSpec != null) { + throw new ParseException("Operation not allowed: CREATE TABLE ... CLUSTERED BY ...", ctx) +} +val tableType = if (external) { + CatalogTableType.EXTERNAL_TABLE } else { - // Get the table header. - val (table, temp, ifNotExists, external) = visitCreateTableHeader(ctx.createTableHeader) - val tableType = if (external) { -CatalogTableType.EXTERNAL_TABLE - } else { -CatalogTableType.MANAGED_TABLE - } - - // Unsupported clauses. - if (temp) { -throw new ParseException(s"Unsupported operation: TEMPORARY clause.", ctx) - } - if (ctx.bucketSpec != null) { -// TODO add this - we need cluster columns in the CatalogTable for this to work. -throw new ParseException("Unsupported operation: " + - "CLUSTERED BY ... [ORDERED BY ...] INTO ... BUCKETS clause.", ctx) - } - if (ctx.skewSpec != null) { -throw new ParseException("Operation not allowed: " + - "SKEWED BY ... ON ... [STORED AS DIRECTORIES] clause.", ctx) - } - - // Create the schema. 
- val schema = Option(ctx.columns).toSeq.flatMap(visitCatalogColumns(_, _.toLowerCase)) - - // Get the column by which the table is partitioned. - val partitionCols =
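The new `visitCreateFileFormat` in the diff above dispatches on a pair of possibly-null parser contexts. A minimal, self-contained sketch of that shape follows. The types `FileFormatCtx`, `TableFileFormatCtx`, `GenericFileFormatCtx` and `StorageFormat` are hypothetical stand-ins invented for illustration (they are not Spark's or ANTLR's classes), and `IllegalArgumentException` stands in for `ParseException`.

```scala
// Hypothetical stand-ins for the parser contexts in the diff; not Spark's real classes.
sealed trait FileFormatCtx
final case class TableFileFormatCtx(inputFormat: String, outputFormat: String) extends FileFormatCtx
final case class GenericFileFormatCtx(name: String) extends FileFormatCtx

// Simplified stand-in for a storage format description.
final case class StorageFormat(inputFormat: Option[String], outputFormat: Option[String])

object CreateFileFormatSketch {
  // Matches on the (fileFormat, storageHandler) pair, where either side may be null,
  // mirroring the structure of the match expression in the diff.
  def resolve(fileFormat: FileFormatCtx, storageHandler: String): StorageFormat =
    (fileFormat, storageHandler) match {
      // INPUTFORMAT input_format OUTPUTFORMAT output_format
      case (TableFileFormatCtx(in, out), null) =>
        StorageFormat(Some(in), Some(out))
      // A named format such as TEXTFILE or ORC; the derived names are placeholders only.
      case (GenericFileFormatCtx(name), null) =>
        StorageFormat(Some(s"$name-input"), Some(s"$name-output"))
      // STORED BY is rejected, as in the diff.
      case (null, _) =>
        throw new IllegalArgumentException(
          "Operation not allowed: ... STORED BY storage_handler ...")
      // Both clauses were given.
      case _ =>
        throw new IllegalArgumentException(
          "expected either STORED AS or STORED BY, not both")
    }
}
```

Matching on the tuple keeps all four combinations visible in one place, which is arguably easier to audit than the nested `if`/`else` it replaces.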
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user lresende commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209225537 OK, I will leave it open for a day in case anyone else is interested in the change; otherwise I will close it.
[GitHub] spark pull request: [SPARK-13432][SQL] add the source file name an...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11301#issuecomment-209224318 **[Test build #55687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55687/consoleFull)** for PR 11301 at commit [`d22d610`](https://github.com/apache/spark/commit/d22d6104326e9c74b94140bb4c1a9e81658edd74).
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209222504 I'm not sure if we want this. People are already complaining the template is too long. This creates more work to create a pr, and does not add any extra information.
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user lresende commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209221983 @rxin It is, but you then have to search for the JIRA. This makes it much easier to access the JIRA directly from the link, similar to how the JIRA has a link to the PR. I have seen this in other projects and found it very convenient and useful.
[GitHub] spark pull request: [SPARK-14388][SQL] Implement CREATE TABLE
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/12271#discussion_r59491556 --- Diff: sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala --- @@ -790,14 +821,13 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter { "nullscript", "optional_outer", "orc_dictionary_threshold", -"orc_empty_files", "order", "order2", "outer_join_ppr", "parallel", "parenthesis_star_by", "part_inherit_tbl_props", -"part_inherit_tbl_props_empty", +//"part_inherit_tbl_props_empty", // TODO: results don't match --- End diff -- This one and the others commented out below fail because of differences in the output of describe table. There are three main differences. First, for managed tables, we will have EXTERNAL set to false in the table properties. Second, for tables that are not bucketed, we will have numBuckets set to 0 instead of -1. Third, for tables that do not specify a file format, the output format will be org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat instead of org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat (Hive somehow replaces it, even though the explain output still shows IgnoreKeyTextOutputFormat). We do not need to block this PR on any of them.
[GitHub] spark pull request: [SPARK-14388][SQL] Implement CREATE TABLE
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/12271#discussion_r59491359 --- Diff: sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala --- @@ -490,15 +538,13 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter { "count", "cp_mj_rc", "create_insert_outputformat", -"create_like_tbl_props", +//"create_like_tbl_props", // TODO: results don't match --- End diff -- https://issues.apache.org/jira/browse/SPARK-14592
[GitHub] spark pull request: [SPARK-14388][SQL] Implement CREATE TABLE
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/12271#discussion_r59491266 --- Diff: sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala --- @@ -509,7 +555,7 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter { "date_comparison", "date_join1", "date_serde", -"decimal_1", +//"decimal_1", // TODO: cannot parse column decimal(5) --- End diff -- https://issues.apache.org/jira/browse/SPARK-14591
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209219838 **[Test build #55686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55686/consoleFull)** for PR 12342 at commit [`099b668`](https://github.com/apache/spark/commit/099b66886a7a4650f6eea2cb2061cb7ebcfad3d3).
[GitHub] spark pull request: [SPARK-14388][SQL] Implement CREATE TABLE
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/12271#discussion_r59491102 --- Diff: sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala --- @@ -509,7 +555,7 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter { "date_comparison", "date_join1", "date_serde", -"decimal_1", +//"decimal_1", // TODO: cannot parse column decimal(5) --- End diff -- Yea. We should support decimal(5). Here, the scale is 0.
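To make the point about `decimal(5)` concrete, here is a small sketch of a type-string parser that accepts both `decimal(p, s)` and `decimal(p)`, filling in a scale of 0 in the second case as the comment above describes. The object `DecimalTypeSketch` and its `Decimal` case class are hypothetical and exist only for this illustration; this is not how Spark's parser is implemented.

```scala
object DecimalTypeSketch {
  // Hypothetical result type for the sketch.
  final case class Decimal(precision: Int, scale: Int)

  // decimal(p, s): precision and scale both given (case-insensitive, whitespace tolerated).
  private val WithScale = """(?i)\s*decimal\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*""".r
  // decimal(p): precision only.
  private val PrecisionOnly = """(?i)\s*decimal\s*\(\s*(\d+)\s*\)\s*""".r

  // Returns None for anything that is not one of the two forms above.
  def parse(typeString: String): Option[Decimal] = typeString match {
    case WithScale(p, s)  => Some(Decimal(p.toInt, s.toInt))
    case PrecisionOnly(p) => Some(Decimal(p.toInt, 0)) // scale defaults to 0
    case _                => None
  }
}
```

With this sketch, `DecimalTypeSketch.parse("decimal(5)")` yields `Some(Decimal(5, 0))`, while a bare `decimal` is deliberately left unhandled.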
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209218940 **[Test build #55685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55685/consoleFull)** for PR 12342 at commit [`3c11e4c`](https://github.com/apache/spark/commit/3c11e4c2084475b2cb3c7542fd519f58b92dc178).
[GitHub] spark pull request: [SPARK-13992][Core][PySpark][FollowUp] Update ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12126#issuecomment-209217681 **[Test build #55684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55684/consoleFull)** for PR 12126 at commit [`273c0aa`](https://github.com/apache/spark/commit/273c0aa6c1e17ba6853e418f4d3e6035aee2fc2f).
[GitHub] spark pull request: [SPARK-14441] [SQL] Consolidate DDL tests
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12347#issuecomment-209217404 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209217502 Isn't this obvious from the JIRA title?
[GitHub] spark pull request: [SPARK-14441] [SQL] Consolidate DDL tests
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12347#issuecomment-209217405 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55680/ Test PASSed.
[GitHub] spark pull request: [SPARK-14441] [SQL] Consolidate DDL tests
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12347#issuecomment-209217278 **[Test build #55680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55680/consoleFull)** for PR 12347 at commit [`06c8948`](https://github.com/apache/spark/commit/06c8948b20cacccead019e42e3a07b89f727893a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209217192 @cloud-fan Most predicates are deterministic, so I'd rather not push down non-deterministic predicates aggressively in this PR.
[GitHub] spark pull request: [SPARK-13992][Core][PySpark][FollowUp] Update ...
Github user lw-lin commented on the pull request: https://github.com/apache/spark/pull/12126#issuecomment-209217089 Jenkins retest this please
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59490343 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -975,6 +939,73 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel } else { filter } + +case filter @ Filter(condition, u: Union) => --- End diff -- +1 we should probably avoid nondeterministic predicates.
[GitHub] spark pull request: [SPARK-13992][Core][PySpark][FollowUp] Update ...
Github user lw-lin commented on the pull request: https://github.com/apache/spark/pull/12126#issuecomment-209216092 @rxin would you mind taking a look, or should I close this PR? Thank you! :-)
[GitHub] spark pull request: [SPARK-14499] [SQL] [TEST] Drop Partition Does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12350#issuecomment-209215778 **[Test build #55683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55683/consoleFull)** for PR 12350 at commit [`f282a8d`](https://github.com/apache/spark/commit/f282a8d5a4cc86fa7c46287cbb9ebc48cc2845c8).
[GitHub] spark pull request: [SPARK-14125] [SQL] Native DDL Support: Alter ...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12324#issuecomment-209215575 cc @yhuai @andrewor14
[GitHub] spark pull request: [SPARK-14499] [SQL] [TEST] Drop Partition Does...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/12350 [SPARK-14499] [SQL] [TEST] Drop Partition Does Not Delete Data of External Tables What changes were proposed in this pull request? This PR adds a test to ensure that dropping partitions of an external table does not delete data. cc @yhuai @andrewor14 How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/gatorsmile/spark testDropPartition Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12350.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12350 commit f282a8d5a4cc86fa7c46287cbb9ebc48cc2845c8 Author: gatorsmile Date: 2016-04-13T03:40:31Z added test case
[GitHub] spark pull request: [SPARK-14554][SQL][follow-up] use checkDataset...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12346
[GitHub] spark pull request: [SPARK-14554][SQL][follow-up] use checkDataset...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12346#issuecomment-209213072 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14554][SQL][follow-up] use checkDataset...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12346#issuecomment-209213073 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55679/ Test PASSed.
[GitHub] spark pull request: [SPARK-14554][SQL][follow-up] use checkDataset...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12346#issuecomment-209212967 **[Test build #55679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55679/consoleFull)** for PR 12346 at commit [`00180bd`](https://github.com/apache/spark/commit/00180bd0c20bee0f85e134896e9ba9252eda59db). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12349#issuecomment-209212144 **[Test build #55682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55682/consoleFull)** for PR 12349 at commit [`d927d13`](https://github.com/apache/spark/commit/d927d13e6d99e36828d980f091e462c707f76986).
[GitHub] spark pull request: [SPARK-14590] Update pull request template wit...
GitHub user lresende opened a pull request: https://github.com/apache/spark/pull/12349 [SPARK-14590] Update pull request template with JIRA link ## What changes were proposed in this pull request? Update the pull request template to include a direct link to the JIRA issue. You can merge this pull request into a Git repository by running: $ git pull https://github.com/lresende/spark SPARK-14590 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12349.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12349 commit d927d13e6d99e36828d980f091e462c707f76986 Author: Luciano Resende Date: 2016-04-13T03:27:24Z [SPARK-14590] Update pull request template with JIRA link
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59488521 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -975,6 +939,73 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel } else { filter } + +case filter @ Filter(condition, u: Union) => --- End diff -- yea I agree, but this depends on the current implementation of `Union`, and it's weird to use the RDD partition stuff while working on SQL optimization rules...
[GitHub] spark pull request: [SPARK-14447][SQL] Speed up TungstenAggregate ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12345#issuecomment-209210452 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14447][SQL] Speed up TungstenAggregate ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12345#issuecomment-209210454 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55678/ Test PASSed.
[GitHub] spark pull request: [SPARK-14447][SQL] Speed up TungstenAggregate ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12345#issuecomment-209210278 **[Test build #55678 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55678/consoleFull)** for PR 12345 at commit [`c2fc385`](https://github.com/apache/spark/commit/c2fc38584dd073036a1f04f7cd7da9fcf50739e8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14589][SQL] Enhance DB2 JDBC Dialect do...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12348#issuecomment-209209961 **[Test build #55681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55681/consoleFull)** for PR 12348 at commit [`1f9d6fe`](https://github.com/apache/spark/commit/1f9d6fe089335fe33342d9b974cab1f8e9fb0bd5).
[GitHub] spark pull request: [SPARK-14589][SQL] Enhance DB2 JDBC Dialect do...
GitHub user lresende opened a pull request: https://github.com/apache/spark/pull/12348 [SPARK-14589][SQL] Enhance DB2 JDBC Dialect docker tests ## What changes were proposed in this pull request? Enhance the DB2 JDBC Dialect docker tests, as they seem to have had some issues in a previous merge that caused some tests to fail. ## How was this patch tested? By running the integration tests locally. You can merge this pull request into a Git repository by running: $ git pull https://github.com/lresende/spark SPARK-14589 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12348.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12348 commit 1f9d6fe089335fe33342d9b974cab1f8e9fb0bd5 Author: Luciano Resende Date: 2016-04-13T03:15:19Z [SPARK-14589][SQL] Enhance DB2 JDBC Dialect docker tests
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12342#issuecomment-209209559 My general thoughts about filter push down: If the filter's condition is non-deterministic, we shouldn't push it down through operators that change the number or order of the input rows, e.g. set operations. Other operators should be OK; e.g. the following optimization is allowed:
```
// From:
df.select('a, 'b).filter('c > rand(42))
// To:
df.filter('c > rand(42)).select('a, 'b)
```
If the underlying operator contains a non-deterministic expression, filter push down is not allowed. Currently only 4 operators may contain non-deterministic expressions: Project, Filter, Aggregate, Window
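The rule of thumb in the comment above can be sketched on a toy plan tree. Everything here (`PushDownSketch`, `Plan`, `Relation`, `Project`, `Union`, `Filter`, `Cond`) is a hypothetical miniature written for illustration and is not Catalyst's API. The toy `Project` only selects columns by name, so it is assumed to contain no non-deterministic expressions.

```scala
object PushDownSketch {
  // Toy logical plan; not Catalyst's classes.
  sealed trait Plan
  final case class Relation(name: String) extends Plan
  final case class Project(columns: Seq[String], child: Plan) extends Plan
  final case class Union(children: Seq[Plan]) extends Plan
  final case class Filter(condition: Cond, child: Plan) extends Plan

  // A predicate, flagged as deterministic or not.
  final case class Cond(sql: String, deterministic: Boolean)

  def pushDown(plan: Plan): Plan = plan match {
    // A column-only Project does not change the number or order of rows,
    // so the filter may move below it even if it is non-deterministic.
    case Filter(cond, Project(cols, child)) =>
      Project(cols, pushDown(Filter(cond, child)))
    // A set operation is only crossed by deterministic predicates.
    case Filter(cond, Union(children)) if cond.deterministic =>
      Union(children.map(c => pushDown(Filter(cond, c))))
    // Anything else is left where it is.
    case other => other
  }
}
```

In this sketch a filter on `c > rand(42)` moves below a `Project` but stays above a `Union`, while a deterministic filter is copied into each branch of the `Union`.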
[GitHub] spark pull request: [SPARK-14447][SQL] Speed up TungstenAggregate ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12345#issuecomment-209209164 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55677/ Test PASSed.
[GitHub] spark pull request: [SPARK-14447][SQL] Speed up TungstenAggregate ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12345#issuecomment-209209159 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14447][SQL] Speed up TungstenAggregate ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12345#issuecomment-209208765 **[Test build #55677 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55677/consoleFull)** for PR 12345 at commit [`4ee5687`](https://github.com/apache/spark/commit/4ee56873764d62efdaf8c47cb74aa399f2194fde). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59487855

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -975,6 +939,73 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel
         } else {
           filter
         }
+
+    case filter @ Filter(condition, u: Union) =>
--- End diff --

The state should be separate for each partition, and Union does not change the partitions. Coalesce() could change that.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59487786

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -975,6 +939,73 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel
         } else {
           filter
         }
+
+    case filter @ Filter(condition, u: Union) =>
+      val output = u.output
+      val newChildren = u.children.map { child =>
+        val attrMap: Map[Expression, Expression] = output.zip(child.output).toMap
+        val newCond = condition transform {
+          case e if attrMap.contains(e) => attrMap(e)
+        }
+        Filter(newCond, child)
+      }
+      Union(newChildren)
+
+    case filter @ Filter(condition, i: Intersect) =>
+      // Intersect could change the rows, so non-deterministic predicate can't be pushed down
+      val (pushDown, stayUp) = splitConjunctivePredicates(condition).partition { cond =>
+        cond.deterministic
+      }
+      if (pushDown.nonEmpty) {
+        val pushDownCond = pushDown.reduceLeft(And)
+        val output = i.output
+        val newChildren = i.children.map { child =>
+          val attrMap: Map[Expression, Expression] = output.zip(child.output).toMap
+          val newCond = pushDownCond transform {
+            case e if attrMap.contains(e) => attrMap(e)
+          }
+          Filter(newCond, child)
+        }
+        val newIntersect = i.withNewChildren(newChildren)
+        if (stayUp.nonEmpty) {
+          Filter(stayUp.reduceLeft(And), newIntersect)
+        } else {
+          newIntersect
+        }
+      } else {
+        filter
+      }
+
+    case filter @ Filter(condition, e @ Except(left, _)) =>
+      pushDownPredicate(filter, e, e.left) { pushDown =>
+        e.copy(left = Filter(pushDown, left))
+      }
+
+    case filter @ Filter(condition, u: UnaryNode) if u.expressions.forall(_.deterministic) =>
--- End diff --

Sample and Sort could change the rows, just to be safe.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59487680

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -975,6 +939,73 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel
         } else {
           filter
         }
+
+    case filter @ Filter(condition, u: Union) =>
+      val output = u.output
+      val newChildren = u.children.map { child =>
+        val attrMap: Map[Expression, Expression] = output.zip(child.output).toMap
+        val newCond = condition transform {
+          case e if attrMap.contains(e) => attrMap(e)
+        }
+        Filter(newCond, child)
+      }
+      Union(newChildren)
+
+    case filter @ Filter(condition, i: Intersect) =>
+      // Intersect could change the rows, so non-deterministic predicate can't be pushed down
+      val (pushDown, stayUp) = splitConjunctivePredicates(condition).partition { cond =>
+        cond.deterministic
+      }
+      if (pushDown.nonEmpty) {
+        val pushDownCond = pushDown.reduceLeft(And)
+        val output = i.output
+        val newChildren = i.children.map { child =>
+          val attrMap: Map[Expression, Expression] = output.zip(child.output).toMap
+          val newCond = pushDownCond transform {
+            case e if attrMap.contains(e) => attrMap(e)
+          }
+          Filter(newCond, child)
+        }
+        val newIntersect = i.withNewChildren(newChildren)
+        if (stayUp.nonEmpty) {
+          Filter(stayUp.reduceLeft(And), newIntersect)
+        } else {
+          newIntersect
+        }
+      } else {
+        filter
+      }
+
+    case filter @ Filter(condition, e @ Except(left, _)) =>
+      pushDownPredicate(filter, e, e.left) { pushDown =>
+        e.copy(left = Filter(pushDown, left))
+      }
+
+    case filter @ Filter(condition, u: UnaryNode) if u.expressions.forall(_.deterministic) =>
--- End diff --

For `UnaryNode`, I think we can also push down a non-deterministic filter, as long as the operator doesn't re-order the input rows (do we have such an operator?)
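The `Intersect` case in the diff above splits the condition into conjuncts and only pushes down the deterministic ones. The following is a rough, self-contained sketch of that idea on a toy expression tree. The types `Expr`, `Pred`, and `And`, and the helper `splitConjuncts`, are hypothetical simplifications written for this illustration; they are not Catalyst's classes or its `splitConjunctivePredicates` helper.

```scala
object SplitSketch {
  // Toy expression tree (hypothetical, much simpler than Catalyst's Expression).
  sealed trait Expr { def deterministic: Boolean }
  case class Pred(name: String, deterministic: Boolean) extends Expr
  case class And(left: Expr, right: Expr) extends Expr {
    // A conjunction is deterministic only if both sides are.
    def deterministic: Boolean = left.deterministic && right.deterministic
  }

  // Flatten nested And nodes into a list of individual conjuncts.
  def splitConjuncts(e: Expr): Seq[Expr] = e match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => Seq(other)
  }

  val condition: Expr =
    And(Pred("a > 1", true), And(Pred("rand() > 0.5", false), Pred("b < 3", true)))

  // Deterministic conjuncts may be pushed below the operator;
  // non-deterministic ones stay in a Filter above it.
  val (pushDown, stayUp) = splitConjuncts(condition).partition(_.deterministic)
}
```

Here `pushDown` holds the two deterministic predicates and `stayUp` holds the `rand()` one, mirroring the `pushDown`/`stayUp` names used in the diff.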
[GitHub] spark pull request: [SPARK-14109][SQL] Fix HDFSMetadataLog to fall...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/11925#discussion_r59487598

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala ---
@@ -196,4 +195,148 @@ class HDFSMetadataLog[T: ClassTag](sqlContext: SQLContext, path: String)
     }
     None
   }
+
+  private def createFileManager(): FileManager = {
+    val hadoopConf = sqlContext.sparkContext.hadoopConfiguration
+    try {
+      new FileContextManager(metadataPath, hadoopConf)
+    } catch {
+      case e: UnsupportedFileSystemException =>
+        logWarning("Could not use FileContext API for managing metadata log file. The log may be" +
+          "inconsistent under failures.", e)
--- End diff --

Can we remove this stack trace? It's not helpful (the error is always thrown from `createFileSystem`).
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59487580

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -975,6 +939,73 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel
         } else {
           filter
         }
+
+    case filter @ Filter(condition, u: Union) =>
--- End diff --

"non-deterministic" means the condition has internal state and may change every time it processes an input row. Before push down, we have only one stateful condition that processes all output rows of `Union`. But if we push it down, then each child of `Union` will have its own stateful condition, which I think is different from before.
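The concern in the comment above can be illustrated with a small sketch in plain Scala collections, not Spark. The counter-based `everyOther` predicate is a hypothetical stand-in for a stateful condition, and list concatenation stands in for `Union`. It only models the single-stream view described in this comment; as noted elsewhere in the thread, Spark keeps such state per partition, which this sketch does not model.

```scala
object UnionStateSketch {
  // Hypothetical stateful condition: keeps the 1st, 3rd, 5th, ... row it sees.
  def everyOther(): Int => Boolean = {
    var calls = 0
    (_: Int) => { calls += 1; calls % 2 == 1 }
  }

  val left: List[Int] = List(1, 2, 3)
  val right: List[Int] = List(4, 5, 6)

  // Before push down: one stateful condition sees every row of the union.
  val filterAboveUnion: List[Int] = (left ++ right).filter(everyOther())

  // After push down: each child gets its own condition with fresh state.
  val filterBelowUnion: List[Int] =
    left.filter(everyOther()) ++ right.filter(everyOther())
}
```

Under these assumptions the two plans keep different rows, because the second child's condition restarts from its initial state.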
[GitHub] spark pull request: [SPARK-14400] [SQL] ScriptTransformation does ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12194#issuecomment-209205949 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55676/ Test FAILed.
[GitHub] spark pull request: [SPARK-14400] [SQL] ScriptTransformation does ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12194#issuecomment-209205945 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14400] [SQL] ScriptTransformation does ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12194#issuecomment-209205565 **[Test build #55676 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55676/consoleFull)** for PR 12194 at commit [`1054b71`](https://github.com/apache/spark/commit/1054b717859d37b8c0dd7b41a087cd2924b97b0a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59486877

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -975,6 +939,73 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel
         } else {
           filter
         }
+
+    case filter @ Filter(condition, u: Union) =>
--- End diff --

The rows in the children of Union should be exactly the same as the rows of the Union, so I think we could push it down.
[GitHub] spark pull request: [SPARK-14581] [SQL] push predicatese through m...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12342#discussion_r59486756

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -975,6 +939,73 @@ object PushPredicateThroughAggregate extends Rule[LogicalPlan] with PredicateHel
         } else {
           filter
         }
+
+    case filter @ Filter(condition, u: Union) =>
--- End diff --

I think we can't push the filter down if the condition is non-deterministic.
[GitHub] spark pull request: [TEST] Test cherry-pick of a commit into branc...
Github user ericl closed the pull request at: https://github.com/apache/spark/pull/12343
[GitHub] spark pull request: [WIP][SPARK-14447] Experiments: AggregateHashM...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12224#issuecomment-209203350 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55675/ Test PASSed.
[GitHub] spark pull request: [WIP][SPARK-14447] Experiments: AggregateHashM...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12224#issuecomment-209203348 Merged build finished. Test PASSed.
[GitHub] spark pull request: [WIP][SPARK-14447] Experiments: AggregateHashM...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12224#issuecomment-209203224 **[Test build #55675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55675/consoleFull)** for PR 12224 at commit [`b4f6ce2`](https://github.com/apache/spark/commit/b4f6ce291cf8ce07bf148da39d410c413c866a02). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [TEST] Test cherry-pick of a commit into branc...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12343#issuecomment-209202161 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55673/ Test PASSed.
[GitHub] spark pull request: [TEST] Test cherry-pick of a commit into branc...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12343#issuecomment-209202157 Merged build finished. Test PASSed.
[GitHub] spark pull request: [TEST] Test cherry-pick of a commit into branc...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12343#issuecomment-209201893 **[Test build #55673 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55673/consoleFull)** for PR 12343 at commit [`280be8a`](https://github.com/apache/spark/commit/280be8ae487df04a223418883ddebf84a40bc025). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14556][SQL] Code clean-ups for package ...
Github user lw-lin commented on the pull request: https://github.com/apache/spark/pull/12323#issuecomment-209201130 @zsxwing thank you for the review & merging ! :-)
[GitHub] spark pull request: [SPARK-14441] [SQL] Consolidate DDL tests
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12347#issuecomment-209200845 To be honest, I do not know why we need to merge these test case files. Their purposes are different. One is to verify the functionalities of parsers; another is to verify the execution of DDL/commands. In the future, we might add more test cases to both files. The files will grow bigger and bigger. cc @yhuai @andrewor14
[GitHub] spark pull request: [SPARK-14574][BUILD][test-maven] Stop cross-bu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12334#issuecomment-209200571 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14574][BUILD][test-maven] Stop cross-bu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12334#issuecomment-209200573 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55668/ Test PASSed.
[GitHub] spark pull request: [SPARK-14574][BUILD][test-maven] Stop cross-bu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12334#issuecomment-209200262 **[Test build #55668 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55668/consoleFull)** for PR 12334 at commit [`944e8ab`](https://github.com/apache/spark/commit/944e8ab3da77a406fec9a8c1da118299f228a3e0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14441] [SQL] Consolidate DDL tests
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12347#issuecomment-209200120 **[Test build #55680 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55680/consoleFull)** for PR 12347 at commit [`06c8948`](https://github.com/apache/spark/commit/06c8948b20cacccead019e42e3a07b89f727893a).
[GitHub] spark pull request: [SPARK-14441] [SQL] Consolidate DDL tests
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12347#issuecomment-209200210 ```Use .contains(...) instead == Some(...) for Options, this method is introduced in Scala 2.11 and it is recommended method to use for this purpose;``` This should not be used in Spark code. It will break the 2.10 build. See my PR: https://github.com/apache/spark/pull/12201
[GitHub] spark pull request: [SPARK-14441] [SQL] Consolidate DDL tests
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/12347#discussion_r59485621

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDDLCommandSuite.scala ---
@@ -113,10 +244,10 @@ class HiveDDLCommandSuite extends PlanTest {
     val (desc, exists) = extractTableDesc(s2)
     assert(exists)
-    assert(desc.identifier.database == Some("mydb"))
+    assert(desc.identifier.database.contains("mydb"))
--- End diff --

All these similar changes will break the Scala 2.10 build
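For context on the comment above: `Option.contains` was added to the standard library in Scala 2.11, so it does not compile against 2.10. A minimal sketch of two spellings that work on both versions is shown below; the object and value names are made up for this illustration.

```scala
object OptionCompatSketch {
  // Hypothetical value standing in for `desc.identifier.database`.
  val database: Option[String] = Some("mydb")

  // Works on Scala 2.10 and later: compare against a wrapped value.
  val viaEquals: Boolean = database == Some("mydb")

  // Also works on Scala 2.10 and later: test the element with a predicate.
  val viaExists: Boolean = database.exists(_ == "mydb")

  // Scala 2.11+ only, so it is left commented out here:
  // val viaContains: Boolean = database.contains("mydb")
}
```

Both portable forms return `true` for `Some("mydb")` and `false` for `None` or any other value.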
[GitHub] spark pull request: [SPARK-13681][SPARK-14458][SPARK-14566][SQL] A...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12179#issuecomment-209199922 LGTM (assume the tests are just copy-pasted from original code)
[GitHub] spark pull request: [SPARK-14441] [SQL] Consolidate DDL tests
GitHub user bomeng opened a pull request: https://github.com/apache/spark/pull/12347

[SPARK-14441] [SQL] Consolidate DDL tests

## What changes were proposed in this pull request?

Today we have `DDLSuite`, `DDLCommandSuite`, `HiveDDLCommandSuite` and `HiveDDLSuite`. In this PR, I am trying to consolidate the files as much as possible. Two files are left for now, since it is good to keep the Hive-related test suite in the Hive package.

Along with the consolidation, I also made some modifications to the current code, mainly to improve it. Here is the summary:

1. Use `.contains()` instead of `==` for Options; this method was introduced in Scala 2.11 and is the recommended method for this purpose;
2. Make maps consistently use `->` instead of 2-element tuples;
3. Use `isEmpty()` instead of `== None`;
4. Add `private` to the parser and narrow the scope of the implicit (put it inside one of the tests);
5. Modify the names of some tests to be unique, since they are in one file now.

## How was this patch tested?

I did not change the logic of any tests; I just moved them around and improved the code, so the existing test cases should remain the same.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bomeng/spark SPARK-14441

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12347.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #12347

commit af298b63693586e7c0598ccd98ba3f9e6a7634f7
Author: bomeng
Date: 2016-04-13T02:19:27Z

Consolidate DDL tests

commit 06c8948b20cacccead019e42e3a07b89f727893a
Author: bomeng
Date: 2016-04-13T02:21:52Z

remove extra line
[GitHub] spark pull request: [MINOR][SQL] Remove some unused imports in dat...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12326#issuecomment-209197528 @HyukjinKwon could you open another PR to remove the `SqlNewHadoopRDD`? I think it's not needed anymore.
[GitHub] spark pull request: [MINOR][SQL] Remove some unused imports in dat...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12326
[GitHub] spark pull request: [MINOR][SQL] Remove some unused imports in dat...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12326#issuecomment-209197308 Thanks! Merging to master!
[GitHub] spark pull request: [SPARK-14554][SQL][follow-up] use checkDataset...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/12346#issuecomment-209196952 LGTM
[GitHub] spark pull request: [MINOR][SQL] Remove some unused imports in dat...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12326#issuecomment-209196372 LGTM, cc @liancheng @yhuai should we remove `SqlNewHadoopRDD`?