[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50081670

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala ---
@@ -44,6 +46,13 @@ private[sql] case class CSVParameters(@transient parameters: Map[String, String]
     }
   }
 
+  // Available compression codec list
+  val shortCompressionCodecNames = Map(
--- End diff --

this should go into the object rather than in the case class

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172768286 **[Test build #49671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49671/consoleFull)** for PR 10805 at commit [`adb9eb2`](https://github.com/apache/spark/commit/adb9eb22a256895ad4bad11893222c485c7afa37).
[GitHub] spark pull request: [WIP] [SPARK-12854][SQL] Implement complex typ...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10820#discussion_r50080842

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java ---
@@ -17,22 +17,37 @@
 package org.apache.spark.sql.execution.vectorized;
 
 import org.apache.spark.memory.MemoryMode;
-import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.*;
 
 /**
  * This class represents a column of values and provides the main APIs to access the data
  * values. It supports all the types and contains get/put APIs as well as their batched versions.
  * The batched versions are preferable whenever possible.
  *
- * Most of the APIs take the rowId as a parameter. This is the local 0-based row id for values
+ * To handle nested schemas, ColumnVector has two types: Arrays and Structs. In both cases these
+ * columns have child columns. All of the data is stored in the child columns and the parent column
+ * contains nullability, and in the case of Arrays, the lengths and offsets into the child column.
--- End diff --

can you explain how lengths and offsets are stored? also is there a single "parent" column that encodes nullability, length, and offset?
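One way to picture the layout the new doc comment describes is the sketch below. It is only an illustration under assumptions, not the `ColumnVector` implementation: the names `ArrayColumnSketch`, `IntArrayColumn`, `isNull`, `offsets`, `lengths`, and `child` are hypothetical, and keeping all three per-row arrays on a single parent is exactly the point the review comment asks the author to confirm.

```scala
// Hypothetical sketch; none of these names come from the pull request.
object ArrayColumnSketch {
  // child: every element value, flattened across all rows.
  // isNull / offsets / lengths: one entry per row, kept on the "parent".
  final case class IntArrayColumn(
      isNull: Array[Boolean],
      offsets: Array[Int],
      lengths: Array[Int],
      child: Array[Int]) {

    // Reads the array stored at rowId, or None when the row is null.
    def getArray(rowId: Int): Option[Seq[Int]] =
      if (isNull(rowId)) None
      else Some(child.slice(offsets(rowId), offsets(rowId) + lengths(rowId)).toSeq)
  }

  // Four rows: [1, 2], null, [], [3, 4, 5]
  val column: IntArrayColumn = IntArrayColumn(
    isNull = Array(false, true, false, false),
    offsets = Array(0, 2, 2, 2),
    lengths = Array(2, 0, 0, 3),
    child = Array(1, 2, 3, 4, 5))
}
```

In this sketch a null row and an empty array both have length 0, so only the null flag tells them apart.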
[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172766510

The supported short names for compression codecs are listed below (case-insensitive):

`bzip2` -> `org.apache.hadoop.io.compress.BZip2Codec`
`gzip` -> `org.apache.hadoop.io.compress.GzipCodec`
`lz4` -> `org.apache.hadoop.io.compress.Lz4Codec`
`snappy` -> `org.apache.hadoop.io.compress.SnappyCodec`
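The mapping above could be resolved case-insensitively along the lines of the sketch below. The object name `CompressionCodecNames` and the `resolve` helper are hypothetical, and passing unknown names through unchanged (on the assumption that they are fully qualified class names) is a guess, not something the pull request states.

```scala
import java.util.Locale

// Hypothetical holder object; the name is an assumption for illustration.
object CompressionCodecNames {
  // Short names from the comment above, mapped to Hadoop codec class names.
  val shortNames: Map[String, String] = Map(
    "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec",
    "gzip" -> "org.apache.hadoop.io.compress.GzipCodec",
    "lz4" -> "org.apache.hadoop.io.compress.Lz4Codec",
    "snappy" -> "org.apache.hadoop.io.compress.SnappyCodec")

  // Looks the name up in lower case; anything that is not a known short name
  // is returned unchanged (assumed to be a fully qualified class name).
  def resolve(name: String): String =
    shortNames.getOrElse(name.toLowerCase(Locale.ROOT), name)
}
```

Keeping the map in a standalone object means it is built once rather than once per instance of a case class.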
[GitHub] spark pull request: [SPARK-12902] [SQL] visualization and metrics ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10828#issuecomment-172766340 **[Test build #49670 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49670/consoleFull)** for PR 10828 at commit [`f0f3da6`](https://github.com/apache/spark/commit/f0f3da6821ce03059ac472c3ab58bd6b183b4b7e).
[GitHub] spark pull request: [WIP] [SPARK-12854][SQL] Implement complex typ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10820#issuecomment-172765612 if the schema is array>,? array of int?
[GitHub] spark pull request: [SPARK-12902] [SQL] visualization and metrics ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10828#issuecomment-172764655 This is pretty cool.
[GitHub] spark pull request: [SPARK-12902] [SQL] visualization and metrics ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10828#issuecomment-172764304 cc @zsxwing
[GitHub] spark pull request: [SPARK-11944][PYSPARK][MLLIB] python mllib.clu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10150#issuecomment-172764328

**[Test build #2406 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2406/consoleFull)** for PR 10150 at commit [`5eec54b`](https://github.com/apache/spark/commit/5eec54b9072e737d32a38efb8f1c101ae05b3044).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12902] [SQL] visualization and metrics ...
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/10828

[SPARK-12902] [SQL] visualization and metrics for generated operators

This PR brings back SQL metrics and visualization for generated operators. They look like this:

![vis_codegen](https://cloud.githubusercontent.com/assets/40902/12412503/d9ad17bc-be3b-11e5-99a0-80759dc59dd0.png)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark viz_codegen

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10828.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10828

commit f0f3da6821ce03059ac472c3ab58bd6b183b4b7e
Author: Davies Liu
Date: 2016-01-19T07:27:44Z

    visualize the plans inside whole stage codegen
[GitHub] spark pull request: [SQL][MINOR] Fix one little mismatched comment...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10824#issuecomment-172762812 LGTM.
[GitHub] spark pull request: [SPARK-12689][SQL] Migrate DDL parsing to the ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10723#issuecomment-172762190 **[Test build #49669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49669/consoleFull)** for PR 10723 at commit [`d800c58`](https://github.com/apache/spark/commit/d800c589748ba9f2e07f7959ed704bd2963a27ff).
[GitHub] spark pull request: [SPARK-12770][SQL] Implement rules for branch ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10827#issuecomment-172762012 **[Test build #49668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49668/consoleFull)** for PR 10827 at commit [`8e980b5`](https://github.com/apache/spark/commit/8e980b5da667f5953d354942cd2c7cb03047b033).
[GitHub] spark pull request: [SPARK-12770][SQL] Implement rules for branch ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10827#issuecomment-172760208 cc @cloud-fan
[GitHub] spark pull request: [SPARK-12770][SQL] Implement rules for branch ...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/10827

[SPARK-12770][SQL] Implement rules for branch elimination for CaseWhen

The three optimization cases are:

1. If the first branch's condition is a true literal, remove the CaseWhen and use the value from that branch.
2. If a branch's condition is a false or null literal, remove that branch.
3. If only the else branch is left, remove the CaseWhen and use the value from the else branch.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-12770

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10827.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10827

commit 6c8201c4f884a16660088f3aa695a9b4f773c0d6
Author: Reynold Xin
Date: 2016-01-19T07:05:03Z

    [SPARK-12770][SQL] Implement rules for branch elimination for CaseWhen
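The three cases in the description can be sketched on a toy expression tree as follows. This is not the Catalyst rule from the pull request: `Expr`, `TrueLit`, `FalseLit`, `NullLit`, `Col`, and `simplify` are hypothetical stand-ins, applying case 2 before case 1 is a choice made here, and returning a null literal when no else branch exists is an assumption based on ordinary SQL `CASE` semantics.

```scala
// Toy model for illustration only; all names here are hypothetical.
object CaseWhenSketch {
  sealed trait Expr
  case object TrueLit extends Expr
  case object FalseLit extends Expr
  case object NullLit extends Expr
  final case class Col(name: String) extends Expr
  final case class CaseWhen(branches: List[(Expr, Expr)], elseValue: Option[Expr]) extends Expr

  def simplify(e: Expr): Expr = e match {
    case CaseWhen(branches, elseValue) =>
      // Case 2: drop branches whose condition is a false or null literal.
      val kept = branches.filter { case (cond, _) => cond != FalseLit && cond != NullLit }
      kept match {
        // Case 1: the first remaining condition is a true literal, so its value always wins.
        case (TrueLit, value) :: _ => value
        // Case 3: only the else branch is left (assumed null when it is absent).
        case Nil => elseValue.getOrElse(NullLit)
        // Otherwise keep a smaller CaseWhen.
        case _ => CaseWhen(kept, elseValue)
      }
    case other => other
  }
}
```

For example, `simplify(CaseWhen(List((FalseLit, Col("a")), (TrueLit, Col("b"))), Some(Col("c"))))` first drops the false branch and then collapses to `Col("b")`.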
[GitHub] spark pull request: [SPARK-12120][PYSPARK] Improve exception messa...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10126#issuecomment-172759666 Ping. No rush but just wanted to bump this back up in the PR review list.
[GitHub] spark pull request: [SPARK-12689][SQL] Migrate DDL parsing to the ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10723#discussion_r50078090

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
@@ -316,7 +316,7 @@ class HiveContext private[hive](
   }
 
   protected[sql] override def parseSql(sql: String): LogicalPlan = {
-    super.parseSql(substitutor.substitute(hiveconf, sql))
+    sqlParser.parsePlan(substitutor.substitute(hiveconf, sql))
--- End diff --

DDLParser is still used in SQLContext. Do we want to remove it completely? Since I have already migrated three commands, I think we can test them all together.
[GitHub] spark pull request: [SPARK-12848][SQL] Change parsed decimal liter...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10796#issuecomment-172759502 **[Test build #49667 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49667/consoleFull)** for PR 10796 at commit [`df8e237`](https://github.com/apache/spark/commit/df8e237813476edc954d32e344feedae27889976).
[GitHub] spark pull request: [SPARK-12682][SQL] Add support for (optionally...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10826#issuecomment-172758227 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12682][SQL] Add support for (optionally...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10826#issuecomment-172758230 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49664/ Test PASSed.
[GitHub] spark pull request: [SPARK-12682][SQL] Add support for (optionally...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10826#issuecomment-172757695

**[Test build #49664 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49664/consoleFull)** for PR 10826 at commit [`cae4413`](https://github.com/apache/spark/commit/cae4413d9f87d9c8c332280a547ae4f8ba63267b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11944][PYSPARK][MLLIB] python mllib.clu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10150#issuecomment-172756623 **[Test build #2406 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2406/consoleFull)** for PR 10150 at commit [`5eec54b`](https://github.com/apache/spark/commit/5eec54b9072e737d32a38efb8f1c101ae05b3044).
[GitHub] spark pull request: [SQL][MINOR] Fix one little mismatched comment...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10824#issuecomment-172755667 **[Test build #2405 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2405/consoleFull)** for PR 10824 at commit [`276ff4f`](https://github.com/apache/spark/commit/276ff4f07813646f319ea83cacc8b853e10defd8).
[GitHub] spark pull request: [SQL][MINOR] Fix one little mismatched comment...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10824#issuecomment-172755341 Jenkins, test this please.
[GitHub] spark pull request: SPARK-12898. Consider having dummyCallSite for...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10825#issuecomment-172755241 Can you explain more about "HiveTableScan runs with getCallSite which is really expensive and shows up when scanning through large table with partitions"? What's expensive?
[GitHub] spark pull request: [SPARK-12841][SQL][branch-1.6] fix cast in fil...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10819#issuecomment-172754648 @cloud-fan you'd need to close this pull request yourself since it was not merged into master.
[GitHub] spark pull request: [SPARK-12705] [SQL] Analyzer Rule ResolveSortR...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10678#issuecomment-172754753 **[Test build #49666 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49666/consoleFull)** for PR 10678 at commit [`598a673`](https://github.com/apache/spark/commit/598a673d1b01ebef6a89582f8e63ab487e465e71).
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172754429 **[Test build #49665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49665/consoleFull)** for PR 10815 at commit [`bfb3c05`](https://github.com/apache/spark/commit/bfb3c050387b091bfccdbb548aa107f3c135d293).
[GitHub] spark pull request: RoutingTablePartition code comment errors(Prev...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10804#issuecomment-172754187 I also tried and couldn't merge it.
[GitHub] spark pull request: [SPARK-4131][SQL] support writing data into th...
Github user litao-buptsse commented on the pull request: https://github.com/apache/spark/pull/4380#issuecomment-172753680 I think it's a useful feature and it is widely used in Hive. Why not finish this feature and merge it into branch-1.6?
[GitHub] spark pull request: [SPARK-12705] [SQL] Analyzer Rule ResolveSortR...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10678#issuecomment-172753568 retest this please.
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172752658 retest this please
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-172751017 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49663/ Test FAILed.
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-172751016 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-172750914

**[Test build #49663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49663/consoleFull)** for PR 10731 at commit [`0daa766`](https://github.com/apache/spark/commit/0daa766d03538c806175c3389106a83b0fe977e3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12689][SQL] Migrate DDL parsing to the ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10723#discussion_r50074715

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkQl.scala ---
@@ -42,6 +45,84 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
           getClauses(Seq("TOK_QUERY", "FORMATTED", "EXTENDED"), explainArgs)
         ExplainCommand(nodeToPlan(query), extended = extended.isDefined)
+      case Token("TOK_REFRESHTABLE", nameParts :: Nil) =>
+        val tableIdent = extractTableIdent(nameParts)
+        RefreshTable(tableIdent)
+
+      case Token("TOK_CREATETABLEUSING", createTableArgs) =>
+        val clauses = getClauses(
+          Seq("TEMPORARY", "TOK_IFNOTEXISTS", "TOK_TABNAME", "TOK_TABCOLLIST",
+            "TOK_TABLEPROVIDER", "TOK_TABLEOPTIONS", "TOK_QUERY"), createTableArgs)
+
+        val temp = clauses(0)
+        val allowExisting = clauses(1)
+        val Some(tabName) = clauses(2)
+        val tableCols = clauses(3)
+        val Some(tableProvider) = clauses(4)
+        val tableOpts = clauses(5)
+        val tableAs = clauses(6)
+
+        val tableIdent: TableIdentifier = tabName match {
+          case Token("TOK_TABNAME", Token(dbName, _) :: Token(tableName, _) :: Nil) =>
+            new TableIdentifier(tableName, Some(dbName))
+          case Token("TOK_TABNAME", Token(tableName, _) :: Nil) =>
+            TableIdentifier(tableName)
+        }
+
+        val columns = tableCols.map {
+          case Token("TOK_TABCOLLIST", fields) => StructType(fields.map(nodeToStructField))
+        }
+
+        val provider = tableProvider match {
+          case Token("TOK_TABLEPROVIDER", Token(provider, _) :: Nil) => provider
+        }
+
+        val options = tableOpts.map { opts =>
+          opts match {
+            case Token("TOK_TABLEOPTIONS", options) =>
+              options.map {
+                case Token("TOK_TABLEOPTION", Token(key, _) :: Token(value, _) :: Nil) =>
+                  (key, value.replaceAll("^\'|^\"|\"$|\'$", ""))
--- End diff --

I didn't know there was an `unquoteString`. Thanks.
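To make the behaviour of the `replaceAll` call on the last line of the diff concrete, here is a small standalone sketch using the same regular expression. The names `UnquoteSketch` and `stripEdgeQuotes` are hypothetical; this does not reproduce the `unquoteString` helper mentioned in the reply, whose exact behaviour is not shown in this thread.

```scala
// Hypothetical helper for illustration; not the unquoteString mentioned above.
object UnquoteSketch {
  // Same pattern as in the diff: a single or double quote at the very start
  // or the very end of the string.
  private val edgeQuote = "^'|^\"|\"$|'$"

  def stripEdgeQuotes(value: String): String = value.replaceAll(edgeQuote, "")
}
```

With this pattern `'path/to/file'` becomes `path/to/file` and an inner apostrophe as in `it's` is left alone, but a mismatched pair such as `'a"` is also stripped to `a`, which may or may not be what a parser wants.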
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
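For readers following the quoting discussion above, here is a minimal sketch of what the regular expression in the diff does to an option value. The object and method names (`UnquoteSketch`, `stripQuotes`) are hypothetical and used only for illustration; this is not the `unquoteString` helper mentioned in the comment, whose implementation is not shown in this thread.

```scala
object UnquoteSketch {
  // Same pattern as in the diff: remove one leading and one trailing quote
  // character, single or double. Hypothetical helper for illustration only.
  def stripQuotes(value: String): String =
    value.replaceAll("^\'|^\"|\"$|\'$", "")
}
```

Note that the pattern does not check that the opening and closing quotes match, so a value such as `'a"` is also stripped on both sides; quotes in the middle of the value are left alone.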
[GitHub] spark pull request: [SPARK-12867] [SQL] Nullability of Intersect c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10812#issuecomment-172747648 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49660/ Test PASSed.
[GitHub] spark pull request: [SPARK-12867] [SQL] Nullability of Intersect c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10812#issuecomment-172747647 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172747397 I will resolve conflicts and update this soon.
[GitHub] spark pull request: [SPARK-12668][SQL] Providing aliases for CSV o...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10800
[GitHub] spark pull request: [SPARK-12867] [SQL] Nullability of Intersect c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10812#issuecomment-172746974 **[Test build #49660 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49660/consoleFull)** for PR 10812 at commit [`84507c8`](https://github.com/apache/spark/commit/84507c83f7f5ec9379b00cb9c6ad8bc8ca9950ea). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10805#issuecomment-172745372 Yup we are dropping Hadoop 1.x support, so it is OK to have it only for Hadoop 2.x.
[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50074461 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -71,6 +71,8 @@ private[sql] case class CSVParameters(parameters: Map[String, String]) extends L val nullValue = parameters.getOrElse("nullValue", "") + val codec = parameters.getOrElse("compression", parameters.getOrElse("codec", null)) --- End diff -- The other thing is that I'd create short-form names for the common options, e.g. "gzip" should become GzipCodec. You'd need to look into what the commonly supported formats are and come up with their short names. We should also make sure this is case-insensitive.
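The suggestion above (short names for common codecs, resolved case-insensitively) could be sketched roughly as follows. This is only an illustration under stated assumptions: the object name `CodecShortNames`, the method `resolve`, the choice of short names, and the fall-through behaviour for unknown names are all hypothetical and are not taken from the PR. The class names on the right-hand side are the standard Hadoop codec classes, referenced here only as strings so the sketch runs without Hadoop on the classpath.

```scala
import java.util.Locale

object CodecShortNames {
  // Hypothetical short names mapped to Hadoop compression codec class names.
  private val shortNames: Map[String, String] = Map(
    "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec",
    "gzip" -> "org.apache.hadoop.io.compress.GzipCodec",
    "lz4" -> "org.apache.hadoop.io.compress.Lz4Codec",
    "snappy" -> "org.apache.hadoop.io.compress.SnappyCodec")

  // Look the name up in lower case so "GZIP", "Gzip" and "gzip" all resolve.
  // Assumption: anything that is not a known short name is treated as a fully
  // qualified class name and passed through unchanged.
  def resolve(name: String): String =
    shortNames.getOrElse(name.toLowerCase(Locale.ROOT), name)
}
```

Lower-casing with `Locale.ROOT` avoids locale-dependent surprises (for example the Turkish dotless i) when normalising user-supplied option values.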
[GitHub] spark pull request: [SPARK-12871][SQL] Support to specify the opti...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10805#discussion_r50074385 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParameters.scala --- @@ -71,6 +71,8 @@ private[sql] case class CSVParameters(parameters: Map[String, String]) extends L val nullValue = parameters.getOrElse("nullValue", "") + val codec = parameters.getOrElse("compression", parameters.getOrElse("codec", null)) --- End diff -- For this one, I'd name it internally `compression` or `compressionCodec`, since `codec` can mean a lot of different things.
[GitHub] spark pull request: [SPARK-12668][SQL] Providing aliases for CSV o...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10800#issuecomment-172744060 Thanks - I'm going to merge this.
[GitHub] spark pull request: [SPARK-11518] [Deploy, Windows] Handle spaces ...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/10789#issuecomment-172741122 @JoshRosen Thank you for your involvement. It looks like a good fix, but it doesn't work in my environment because more files need to be fixed to handle spaces properly. For example, in `pyspark2.cmd` we should also fix these `call` lines, because %SPARK_HOME% contains a space.
```
...(snip)...
call %SPARK_HOME%\bin\load-spark-env.cmd
...(snip)...
call %SPARK_HOME%\bin\spark-submit2.cmd pyspark-shell-main --name "PySparkShell" %*
```
This is just one example; many other places also need to be double-quoted.
[GitHub] spark pull request: [SPARK-12841][SQL][branch-1.6] fix cast in fil...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10819#issuecomment-172741093 Thanks! Merging to branch 1.6.
[GitHub] spark pull request: [SPARK-11137][Streaming] Make StreamingContext...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/10807#discussion_r50073547 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala --- @@ -714,12 +714,20 @@ class StreamingContext private[streaming] ( // interrupted. See SPARK-12001 for more details. Because the body of this case can be // executed twice in the case of a partial stop, all methods called here need to be // idempotent. - scheduler.stop(stopGracefully) --- End diff -- I think one of the primary goals of this JIRA is to allow partial clean-up and retry on `stop()` calls. This code path is already written to allow a retry: it sets the state to `STOPPED` only near the end, on [line 728](https://github.com/apache/spark/pull/10807/files#diff-8a7f0e3f26c15ba484e6312c3caf033dL728) in the original code. `tryLogNonFatalError` swallows and logs non-fatal exceptions, so with it added, execution reaches the line `state = STOPPED` even if a non-critical error is thrown. For instance, if `env.metricsSystem.removeSource()` throws, execution continues and sets `state` to `STOPPED`, at which point the caller cannot get back to the same code to retry the cleanup because of the `state` match case [above](https://github.com/apache/spark/pull/10807/files#diff-8a7f0e3f26c15ba484e6312c3caf033dR707). Is that what we want?
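The trade-off raised in the comment above can be illustrated with a small, self-contained sketch. It is not the `StreamingContext` code: the names `StopSketch`, `Context`, `stopSwallowing` and `stopRetryable` are hypothetical, and the cleanup steps are plain functions standing in for calls such as `scheduler.stop(...)`. The only library piece used is `scala.util.control.NonFatal`.

```scala
import scala.util.control.NonFatal

object StopSketch {
  sealed trait State
  case object Active extends State
  case object Stopped extends State

  // Simplified stand-in for a context whose cleanup can fail part-way through.
  // Cleanup steps are assumed to be idempotent, since they may run twice.
  class Context(cleanupSteps: Seq[() => Unit]) {
    private var state: State = Active
    def currentState: State = state

    // Variant A: swallow and log non-fatal errors. The state always reaches
    // Stopped, so a later call cannot re-run a step that failed.
    def stopSwallowing(): Unit = synchronized {
      if (state == Active) {
        cleanupSteps.foreach { step =>
          try step() catch { case NonFatal(e) => println(s"ignored: ${e.getMessage}") }
        }
        state = Stopped
      }
    }

    // Variant B: let errors propagate. The state stays Active when a step
    // throws, so the caller can call stop again and retry the cleanup.
    def stopRetryable(): Unit = synchronized {
      if (state == Active) {
        cleanupSteps.foreach(step => step())
        state = Stopped
      }
    }
  }
}
```

With a step that throws on its first invocation, variant A ends in `Stopped` after one call, while variant B throws and stays `Active` until a second call succeeds. Which behaviour is wanted is exactly the question the reviewer asks.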
[GitHub] spark pull request: [SPARK-12682][SQL] Add support for (optionally...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10826#issuecomment-172739541 **[Test build #49664 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49664/consoleFull)** for PR 10826 at commit [`cae4413`](https://github.com/apache/spark/commit/cae4413d9f87d9c8c332280a547ae4f8ba63267b).
[GitHub] spark pull request: [SPARK-12847][Core][Streaming]Remove Streaming...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/10779#issuecomment-172737916 Sounds great. Let me update it.
[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10491#issuecomment-172737699 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49662/ Test PASSed.
[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10491#issuecomment-172737638 **[Test build #49662 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49662/consoleFull)** for PR 10491 at commit [`6983d15`](https://github.com/apache/spark/commit/6983d15524d260ae49574b27e47221b902165f30). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10491#issuecomment-172737697 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12682] Add support for (optionally) not...
GitHub user sameeragarwal opened a pull request: https://github.com/apache/spark/pull/10826 [SPARK-12682] Add support for (optionally) not storing tables in hive metadata format This PR adds a new table option (`skip_hive_metadata`) that'd allow the user to skip storing the table metadata in hive metadata format. While this could be useful in general, the specific use-case for this change is that Hive doesn't handle wide schemas well (see https://issues.apache.org/jira/browse/SPARK-12682 and https://issues.apache.org/jira/browse/SPARK-6024) which in turn prevents such tables from being queried in SparkSQL. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sameeragarwal/spark skip-hive-metadata Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10826.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10826 commit cae4413d9f87d9c8c332280a547ae4f8ba63267b Author: Sameer Agarwal Date: 2016-01-19T04:46:12Z Add support for (optionally) not storing tables in hive metadata format
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user zhichao-li commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-172736625 @hvanhovell I wasn't aware of #10052; I would be happy if @dereksabryfb can pick that up.
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-172736637 **[Test build #49663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49663/consoleFull)** for PR 10731 at commit [`0daa766`](https://github.com/apache/spark/commit/0daa766d03538c806175c3389106a83b0fe977e3).
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r50072670 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -445,6 +445,26 @@ class Analyzer( val newOrdering = resolveSortOrders(ordering, child, throws = false) Sort(newOrdering, global, child) + // Resolve the order index to be a specific column + case s @ Sort(ordering, global, child) if child.resolved && s.resolved => +def indexToColumn(index: Int, direction: SortDirection) = { + val orderNodes = child.output + if (index > 0 && index <= orderNodes.size) { +SortOrder(orderNodes(index - 1), direction) + } else { +throw new UnresolvedException(s, + s"""Order by position: $index does not exist \n + |The Select List is indexed from 1 to ${orderNodes.size}""".stripMargin) + } +} +val newOrdering = ordering map { --- End diff -- Seems like it might be more reasonable from the semantic point of view to override the `resolved` method and move the logic to resolveSortOrders.
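The position check in the diff above (a 1-based ORDER BY index mapped onto the select list) can be tried out in isolation with the sketch below. It is a simplified stand-in, not the analyzer rule: `OrderByPosition` and `resolve` are hypothetical names, the select list is a plain `Seq`, and an `Either` replaces the `UnresolvedException` thrown in the PR so that the sketch has no Catalyst dependency.

```scala
object OrderByPosition {
  // Map a 1-based ORDER BY position onto the select list.
  // Returns Left with an error message when the position is out of range.
  def resolve[A](position: Int, selectList: Seq[A]): Either[String, A] =
    if (position > 0 && position <= selectList.size) {
      Right(selectList(position - 1))
    } else {
      Left(s"Order by position: $position does not exist. " +
        s"The Select List is indexed from 1 to ${selectList.size}")
    }
}
```

For a select list `Seq("a", "b", "c")`, position 2 resolves to `"b"`, while positions 0 and 4 are rejected, mirroring the `index > 0 && index <= orderNodes.size` guard in the diff.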
[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10491#issuecomment-172736310 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49661/ Test FAILed.
[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10491#issuecomment-172736309 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user zhichao-li commented on a diff in the pull request: https://github.com/apache/spark/pull/10731#discussion_r50072543 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala --- @@ -17,6 +17,7 @@ package org.apache.spark.sql.execution.joins +import org.apache.spark.sql.{DataFrame, Row, SQLConf} --- End diff -- This is for passing the style check
[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10491#issuecomment-172735969 **[Test build #49662 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49662/consoleFull)** for PR 10491 at commit [`6983d15`](https://github.com/apache/spark/commit/6983d15524d260ae49574b27e47221b902165f30).
[GitHub] spark pull request: [SPARK-12534][DOC] update documentation to lis...
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/10491#issuecomment-172735041 Thank you for the detailed explanation, @srowen. Agreed, it would be valuable to have a dedicated doc page for CLI flags, perhaps as a bigger project later. For now, I have moved the references to the config property to job-scheduling.md. I hope that looks alright.
[GitHub] spark pull request: SPARK-12898. Consider having dummyCallSite for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10825#issuecomment-172734435 Can one of the admins verify this patch?
[GitHub] spark pull request: SPARK-12898. Consider having dummyCallSite for...
GitHub user rajeshbalamohan opened a pull request: https://github.com/apache/spark/pull/10825 SPARK-12898. Consider having dummyCallSite for HiveTableScan Currently, HiveTableScan runs with getCallSite, which is expensive: it shows up when scanning large partitioned tables (e.g. TPC-DS) and slows down the overall runtime of the job. It would be good to consider having dummyCallSite in HiveTableScan. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rajeshbalamohan/spark SPARK-12898 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10825.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10825 commit 3a32561eb905b236014cad74472c3a8c359b1aa0 Author: Rajesh Balamohan Date: 2016-01-19T04:27:52Z SPARK-12898. Consider having dummyCallSite for HiveTableScan
[GitHub] spark pull request: [SPARK-12841][SQL][branch-1.6] fix cast in fil...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10819#issuecomment-172733640 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49655/ Test PASSed.
[GitHub] spark pull request: [SPARK-12841][SQL][branch-1.6] fix cast in fil...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10819#issuecomment-172733639 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12841][SQL][branch-1.6] fix cast in fil...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10819#issuecomment-172733571 **[Test build #49655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49655/consoleFull)** for PR 10819 at commit [`4e3269b`](https://github.com/apache/spark/commit/4e3269bd5e0aa6b262500f92da502f441b1671b3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [HOT] [Build] Changed the import order
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10823#issuecomment-172732848 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49654/ Test PASSed.
[GitHub] spark pull request: [HOT] [Build] Changed the import order
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10823#issuecomment-172732847 Merged build finished. Test PASSed.
[GitHub] spark pull request: [HOT] [Build] Changed the import order
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10823#issuecomment-172732724 **[Test build #49654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49654/consoleFull)** for PR 10823 at commit [`ed88430`](https://github.com/apache/spark/commit/ed884307a1812530546bcf499ee6a5aed1449059). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12330] [MESOS] Fix mesos coarse mode cl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10319#issuecomment-172732607 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49659/ Test FAILed.
[GitHub] spark pull request: [SPARK-12330] [MESOS] Fix mesos coarse mode cl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10319#issuecomment-172732603 **[Test build #49659 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49659/consoleFull)** for PR 10319 at commit [`b781297`](https://github.com/apache/spark/commit/b78129727afa11d3b75dd73c8f8384021b0a8239). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12330] [MESOS] Fix mesos coarse mode cl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10319#issuecomment-172732605 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12330] [MESOS] Fix mesos coarse mode cl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10319#issuecomment-172732497 **[Test build #49659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49659/consoleFull)** for PR 10319 at commit [`b781297`](https://github.com/apache/spark/commit/b78129727afa11d3b75dd73c8f8384021b0a8239).
[GitHub] spark pull request: [SPARK-12867] [SQL] Nullability of Intersect c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10812#issuecomment-172732208 **[Test build #49660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49660/consoleFull)** for PR 10812 at commit [`84507c8`](https://github.com/apache/spark/commit/84507c83f7f5ec9379b00cb9c6ad8bc8ca9950ea).
[GitHub] spark pull request: [SQL][MINOR] Fix one little mismatched comment...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10824#issuecomment-172731825 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-12867] [SQL] Nullability of Intersect c...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10812#issuecomment-172731686 retest this please.
[GitHub] spark pull request: [SPARK-12330] [MESOS] Fix mesos coarse mode cl...
Github user drcrallen commented on the pull request: https://github.com/apache/spark/pull/10319#issuecomment-172731505 @dragos Updated
[GitHub] spark pull request: [SQL][MINOR] Fix one little mismatched comment...
GitHub user proflin opened a pull request: https://github.com/apache/spark/pull/10824 [SQL][MINOR] Fix one little mismatched comment according to the codes in interface.scala You can merge this pull request into a Git repository by running: $ git pull https://github.com/proflin/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10824.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10824 commit 276ff4f07813646f319ea83cacc8b853e10defd8 Author: proflin Date: 2016-01-19T04:00:22Z Fix one little mismatched comments according to the codes in interface.scala
[GitHub] spark pull request: [SPARK-12668][SQL] Providing aliases for CSV o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10800#issuecomment-172731135 **[Test build #49658 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49658/consoleFull)** for PR 10800 at commit [`976e3af`](https://github.com/apache/spark/commit/976e3afec587ce12dd2b5b7005151274792ee827).
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172730654 Fixed the conflicts at https://github.com/andrewor14/spark/pull/3
[GitHub] spark pull request: [SPARK-12668][SQL] Providing aliases for CSV o...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10800#issuecomment-172729369 I am not too sure why I am hitting this issue, but I just put some imports into alphabetical order in `SparkStrategies` and `InnerJoinSuite`.
[GitHub] spark pull request: [HOT] [Build] Changed the import order
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10823
[GitHub] spark pull request: [HOT] [Build] Changed the import order
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10823#issuecomment-172728774 LGTM. Thanks for fixing this. I'm going to merge this now.
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172728630 **[Test build #49657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49657/consoleFull)** for PR 10815 at commit [`12bd943`](https://github.com/apache/spark/commit/12bd943340eb887711f15665e7c3805e3a76558c). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172728632 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49657/ Test FAILed.
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172728631 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172727201 Uh oh, this merge-conflicted. I'm going to try to fix the conflicts and will submit a PR against your PR. You can on-the-go merge-button it, then I'll retest to get this in.
[GitHub] spark pull request: [SPARK-12847][Core][Streaming]Remove Streaming...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/10779#issuecomment-172727012

On the streaming side, @andrewor14 and I talked offline and there can be a cleaner design, with better abstractions. The current design basically stores the list of StreamingListeners in the SparkListenerBus (using the adaptor), and makes each StreamingListenerEvent extend SparkListenerEvent. Since there is no StreamingListenerBus anymore, it is a little hard to understand what gets posted where and who is calling the callbacks. Also the public API is being changed, which is also awkward: StreamingListener does not extend SparkListener but StreamingListenerEvent extends SparkListenerEvent.

I think a better design is the following. The goal is simply for the existing StreamingListenerBus to not maintain its own thread and use the SparkListenerBus's thread to post everything. To do that, all that needs to be done is for the StreamingListenerBus to forward the events into the SparkListenerBus. This can be done by the following.

```
class StreamingListenerForwardingBus(sparkListenerBus: SparkListenerBus) extends SparkListener {

  case class WrappedStreamingListenerEvent(streamingListenerEvent: StreamingListenerEvent)
    extends SparkListenerEvent {
    protected[spark] override def logEvent: Boolean = false
  }

  private val listeners = new ArrayBuffer[StreamingListener]()

  sparkListenerBus.add(this) // for getting callbacks on spark events

  def addListener(listener: StreamingListener) {
    listeners += listener
  }

  def post(event: StreamingListenerEvent) {
    sparkListenerBus.post(new WrappedStreamingListenerEvent(event))
  }

  override def onOtherEvents(event: SparkListenerEvent) {
    event match {
      case WrappedStreamingListenerEvent(sle) =>
        sle match {
          // call listeners
        }
      case _ =>
    }
  }
}

class JobScheduler {
  ...
  val listenerBus = new StreamingListenerForwardingBus(sparkContext.listenerBus)
}
```

This maintains the clean abstraction that streaming events get posted to the streaming bus (internally forwarded to the spark bus), AND does not require public API changes (streaming events do not have to extend spark events). What do you think?
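The forwarding idea in the comment above can be tried out in isolation with a minimal, self-contained sketch. Everything here is an assumption for illustration: `SparkListenerEvent`, `SparkListener` and `SparkListenerBus` are simplified stand-ins rather than the real Spark classes, the toy bus delivers synchronously on the caller's thread, and `BatchStarted` / `BatchCompleted` are hypothetical event types.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-ins for Spark's listener types (hypothetical, not the real API).
trait SparkListenerEvent
trait SparkListener {
  def onOtherEvents(event: SparkListenerEvent): Unit = ()
}

// Toy bus: delivers each event synchronously to every registered listener.
class SparkListenerBus {
  private val listeners = ArrayBuffer.empty[SparkListener]
  def add(listener: SparkListener): Unit = listeners += listener
  def post(event: SparkListenerEvent): Unit = listeners.foreach(_.onOtherEvents(event))
}

// Hypothetical streaming events; note they do NOT extend SparkListenerEvent.
sealed trait StreamingListenerEvent
case class BatchStarted(batchTime: Long) extends StreamingListenerEvent
case class BatchCompleted(batchTime: Long) extends StreamingListenerEvent

trait StreamingListener {
  def onBatchStarted(event: BatchStarted): Unit = ()
  def onBatchCompleted(event: BatchCompleted): Unit = ()
}

// Wrapper that lets a streaming event travel on the Spark bus.
case class WrappedStreamingListenerEvent(event: StreamingListenerEvent) extends SparkListenerEvent

class StreamingListenerForwardingBus(sparkListenerBus: SparkListenerBus) extends SparkListener {
  private val listeners = ArrayBuffer.empty[StreamingListener]

  // Register on the Spark bus so wrapped events come back through onOtherEvents.
  sparkListenerBus.add(this)

  def addListener(listener: StreamingListener): Unit = listeners += listener

  // Streaming events are wrapped and forwarded instead of using a dedicated thread.
  def post(event: StreamingListenerEvent): Unit =
    sparkListenerBus.post(WrappedStreamingListenerEvent(event))

  override def onOtherEvents(event: SparkListenerEvent): Unit = event match {
    case WrappedStreamingListenerEvent(sle) =>
      sle match {
        case e: BatchStarted   => listeners.foreach(_.onBatchStarted(e))
        case e: BatchCompleted => listeners.foreach(_.onBatchCompleted(e))
      }
    case _ => // ordinary Spark events are ignored by the streaming bus
  }
}
```

In this sketch a `StreamingListener` registered on the forwarding bus only ever sees streaming events, while events posted directly to the stand-in Spark bus fall through the `case _` branch.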
[GitHub] spark pull request: [SPARK-12885] [MINOR] Rename 3 fields in Shuff...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10811
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172726969 **[Test build #49657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49657/consoleFull)** for PR 10815 at commit [`12bd943`](https://github.com/apache/spark/commit/12bd943340eb887711f15665e7c3805e3a76558c).
[GitHub] spark pull request: [SPARK-12885] [MINOR] Rename 3 fields in Shuff...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10811#issuecomment-172726657 LGTM, so I'm going to merge this now. Thanks!
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172725115 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49632/ Test FAILed.
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172725112 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172725095 retest this please
[GitHub] spark pull request: [SPARK-12145][SQL] Command 'Set Role [ADMIN|NO...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10144#issuecomment-172725111 **[Test build #49656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49656/consoleFull)** for PR 10144 at commit [`7351f83`](https://github.com/apache/spark/commit/7351f835b564b0d8c6340678f938568d8822b752). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12145][SQL] Command 'Set Role [ADMIN|NO...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10144#issuecomment-172725114 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12887] Do not expose var's in TaskMetri...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10815#issuecomment-172725068 **[Test build #49632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49632/consoleFull)** for PR 10815 at commit [`d2e4e23`](https://github.com/apache/spark/commit/d2e4e23be82a0afb2f39d629ee7413591bc08c8d). * This patch **fails from timeout after a configured wait of `250m`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12145][SQL] Command 'Set Role [ADMIN|NO...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10144#issuecomment-172725116 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49656/ Test FAILed.
[GitHub] spark pull request: [SPARK-12145][SQL] Command 'Set Role [ADMIN|NO...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10144#issuecomment-172724897 **[Test build #49656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49656/consoleFull)** for PR 10144 at commit [`7351f83`](https://github.com/apache/spark/commit/7351f835b564b0d8c6340678f938568d8822b752).