[GitHub] spark pull request: [SPARK-15087][MINOR][DOC] Follow Up: Fix the C...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12953#issuecomment-217495328 **[Test build #58000 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58000/consoleFull)** for PR 12953 at commit [`dbb6632`](https://github.com/apache/spark/commit/dbb663222ba379fbf0b846e2342173f0f0a0ecef). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13370][SQL] Require whitespace between ...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/12897#issuecomment-217494270

@yhuai I checked the most recent 1.6 branch. They both interpret `1.0L` as `1.0 as L`.

## SQL:
```scala
scala> sql("select 1.0L").explain(true)
== Parsed Logical Plan ==
'Project [unresolvedalias(1.0 AS L#2)]
+- OneRowRelation$
```

## Hive:
```scala
scala> sql("select 1.0L").explain(true)
== Parsed Logical Plan ==
'Project [unresolvedalias(1.0 AS L#0)]
+- OneRowRelation$
```

I am leaning towards not changing the behavior. What is your opinion?
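The behavior described above follows from how the literal is tokenized: without a whitespace requirement between a numeric literal and an identifier, `1.0L` splits into the literal `1.0` followed by the bare identifier `L`, which the parser then treats as a column alias. A minimal standalone sketch of that split (hypothetical code for illustration, not Spark's actual lexer):

```scala
// Hypothetical sketch of the tokenization ambiguity; not Spark's actual lexer.
object SuffixVsAlias {
  // Greedily take a decimal literal, then whatever identifier trails it.
  private val NumThenIdent = """(\d+\.\d+)([A-Za-z_]\w*)""".r

  def tokenize(s: String): Seq[String] = s match {
    case NumThenIdent(num, ident) => Seq(num, ident) // "1.0L" -> ("1.0", "L")
    case other                    => Seq(other)
  }

  def main(args: Array[String]): Unit = {
    // The parser then reads the trailing identifier as an alias: 1.0 AS L
    println(tokenize("1.0L").mkString(", "))
  }
}
```

Under this reading, treating `L` as a bigint suffix instead would require the lexer to consume the trailing identifier as part of the literal token, which is the behavior change being debated.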
[GitHub] spark pull request: [SPARK-15173][SQL] DataFrameWriter.insertInto ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12949#issuecomment-217493037 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15173][SQL] DataFrameWriter.insertInto ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12949#issuecomment-217493039 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58001/
[GitHub] spark pull request: [SPARK-15173][SQL] DataFrameWriter.insertInto ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12949#issuecomment-217492842 **[Test build #58001 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58001/consoleFull)** for PR 12949 at commit [`167a7d6`](https://github.com/apache/spark/commit/167a7d6d8ad13a9d754e22dbd75cd9a16e9d1a56). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15173][SQL] DataFrameWriter.insertInto ...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/12949#discussion_r62354994

Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala

```diff
@@ -239,8 +239,13 @@ case class DataSource(
     }
   }

-  /** Create a resolved [[BaseRelation]] that can be used to read data from this [[DataSource]] */
-  def resolveRelation(): BaseRelation = {
+  /**
+   * Create a resolved [[BaseRelation]] that can be used to read data from or write data into this
+   * [[DataSource]]
+   *
+   * @param checkPathExist A flag to indicate whether to check the existence of path or not.
+   */
+  def resolveRelation(checkPathExist: Boolean = true): BaseRelation = {
```

End diff:

I am not sure I understand this change. For a `FileFormat`, when do we not need to check if the path exists?
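The question above is about the purpose of the new flag. The usual motivation for such a flag is the write path: when a relation is resolved as the target of an insert, the output directory may not exist yet, so an unconditional existence check would fail. A standalone sketch of that pattern (hypothetical names and logic, not Spark's actual `DataSource`):

```scala
import java.nio.file.{Files, Paths}

// Hypothetical sketch: optionally skip the existence check when resolving
// a relation that will be written to rather than read from.
object ResolveSketch {
  def resolveRelation(path: String, checkPathExist: Boolean = true): String = {
    if (checkPathExist && !Files.exists(Paths.get(path)))
      throw new IllegalArgumentException(s"Path does not exist: $path")
    s"relation($path)"
  }

  def main(args: Array[String]): Unit = {
    // A write target that does not exist yet resolves only with the check off.
    println(resolveRelation("/tmp/not-yet-created-output", checkPathExist = false))
  }
}
```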
[GitHub] spark pull request: [SPARK-15112][SQL] Allows query plan schema an...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12952#issuecomment-217492073 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15112][SQL] Allows query plan schema an...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12952#issuecomment-217492075 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58002/
[GitHub] spark pull request: [SPARK-15112][SQL] Allows query plan schema an...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12952#issuecomment-217491949 **[Test build #58002 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58002/consoleFull)** for PR 12952 at commit [`9412b9b`](https://github.com/apache/spark/commit/9412b9b7743cda7df22e7834059e7c7be2e1eb85). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13370][SQL] Require whitespace between ...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/12897#issuecomment-217491293 @hvanhovell What is the behavior of 1.6? Does 1.6 treat `L` as a suffix for a bigint literal?
[GitHub] spark pull request: [SPARK-14542][CORE] PipeRDD should allow confi...
Github user sitalkedia commented on the pull request: https://github.com/apache/spark/pull/12309#issuecomment-217489895 I don't understand this: `./dev/mima` passes on my laptop. I also verified that `./dev/mima` fails without my changes in `MimaExcludes.scala`. Something weird with the Jenkins build?
[GitHub] spark pull request: [Spark-15051] [SQL] Create a TypedColumn alias...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12893#issuecomment-217488982 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57998/
[GitHub] spark pull request: [Spark-15051] [SQL] Create a TypedColumn alias...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12893#issuecomment-217488980 Merged build finished. Test PASSed.
[GitHub] spark pull request: [Spark-15051] [SQL] Create a TypedColumn alias...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12893#issuecomment-217488726 **[Test build #57998 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57998/consoleFull)** for PR 12893 at commit [`e408fdf`](https://github.com/apache/spark/commit/e408fdf43c207a189f6316a80599e7f54eb832b6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15112][SQL] Allows query plan schema an...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12952#discussion_r62352447

Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala

```diff
@@ -394,7 +405,7 @@ class Dataset[T] private[sql](
    * @group basic
    * @since 1.6.0
    */
-  def schema: StructType = queryExecution.analyzed.schema
+  def schema: StructType = resolvedTEncoder.schema
```

End diff:

I'm kind of worried about it. We can't guarantee the encoder's schema is always the same as the plan's schema (in this PR we add a project to try to make them consistent, but it can't handle inner fields). If they are different, users may select a column that exists in the encoder schema but not in the plan. cc @marmbrus
[GitHub] spark pull request: [SPARK-15093][SQL] create/delete/rename direct...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12871#issuecomment-217485501 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-15093][SQL] create/delete/rename direct...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12871#issuecomment-217485502 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57999/
[GitHub] spark pull request: [SPARK-15093][SQL] create/delete/rename direct...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12871#issuecomment-217485371 **[Test build #57999 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57999/consoleFull)** for PR 12871 at commit [`0261d25`](https://github.com/apache/spark/commit/0261d252f8baa1e823a97261e111a3e93019a0dc). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...
Github user RussellSpitzer commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-217483240 Sorry I forgot about this, I'll clean this up tomorrow and get it ready
[GitHub] spark pull request: [SPARK-15183][Streaming] Adding outputMode to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12958#issuecomment-217482035 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-15182] [ML] Copy MLlib doc to ML: ml.fe...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12957#issuecomment-217481568 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15182] [ML] Copy MLlib doc to ML: ml.fe...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12957#issuecomment-217481571 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58005/
[GitHub] spark pull request: [SPARK-15182] [ML] Copy MLlib doc to ML: ml.fe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12957#issuecomment-217481463 **[Test build #58005 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58005/consoleFull)** for PR 12957 at commit [`2cc977e`](https://github.com/apache/spark/commit/2cc977e15bfe86c0944fab9fb3f0609339d580a0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-15183
GitHub user agsachin opened a pull request: https://github.com/apache/spark/pull/12958 SPARK-15183

## What changes were proposed in this pull request?

While experimenting with Structured Streaming, I found that `mode()` is used for non-continuous queries while `outputMode()` is used for continuous queries. `outputMode` is not defined, so I have written a rough implementation and test cases just to make sure the streaming app works.

Note:

```scala
/** Start a query */
private[sql] def startQuery(
    name: String,
    checkpointLocation: String,
    df: DataFrame,
    sink: Sink,
    trigger: Trigger = ProcessingTime(0),
    triggerClock: Clock = new SystemClock(),
    outputMode: OutputMode = Append): ContinuousQuery = {
```

In my opinion, `outputMode` should be defined before `triggerClock`, since the variant with `outputMode` will be used more often than the one with `triggerClock`. I have also added a `triggerClock()` method.

## How was this patch tested?

Using unit tests locally.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/agsachin/spark streaming

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12958.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12958

commit b418b4526e57b1ef437b9dab7779c3be1a5fd497
Author: sachin aggarwal
Date: 2016-05-06T15:47:16Z
SPARK-15183
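The ordering point in the PR description is about Scala default arguments: a caller passing arguments positionally must supply every earlier default just to reach a later one, so the optional parameter that is overridden most often should come earliest. A toy illustration (hypothetical names, not the actual `startQuery` signature):

```scala
// Toy illustration of why default-parameter order matters; names are made up.
object ParamOrder {
  def startJob(
      name: String,
      clock: String = "system",   // rarely overridden
      mode: String = "append"     // often overridden
  ): String = s"$name/$clock/$mode"

  def main(args: Array[String]): Unit = {
    // A positional caller must spell out `clock` just to change `mode`...
    println(startJob("q1", "system", "complete"))
    // ...unless it uses a named argument, which sidesteps the ordering issue.
    println(startJob("q1", mode = "complete"))
  }
}
```

Named arguments work around the ordering either way, which is one reason reviewers sometimes prefer leaving an established signature alone.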
[GitHub] spark pull request: [SPARK-15182] [ML] Copy MLlib doc to ML: ml.fe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12957#issuecomment-217478950 **[Test build #58005 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58005/consoleFull)** for PR 12957 at commit [`2cc977e`](https://github.com/apache/spark/commit/2cc977e15bfe86c0944fab9fb3f0609339d580a0).
[GitHub] spark pull request: [SPARK-15182] [ML] Copy MLlib doc to ML: ml.fe...
GitHub user hhbyyh opened a pull request: https://github.com/apache/spark/pull/12957 [SPARK-15182] [ML] Copy MLlib doc to ML: ml.feature

## What changes were proposed in this pull request?

We should now begin copying algorithm details from the spark.mllib guide to spark.ml as needed, rather than just linking back to the corresponding algorithms in the spark.mllib user guide.

## How was this patch tested?

Manual review of the doc.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hhbyyh/spark tfidfdoc

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12957.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12957

commit 2cc977e15bfe86c0944fab9fb3f0609339d580a0
Author: Yuhao Yang
Date: 2016-05-06T15:38:07Z
copy doc
[GitHub] spark pull request: [SPARK-11714][Mesos] Make Spark on Mesos honor...
Github user skonto commented on the pull request: https://github.com/apache/spark/pull/11157#issuecomment-217477341 @mgummelt ready.
[GitHub] spark pull request: [SPARK-12177] [STREAMING] Update KafkaDStreams...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10953#issuecomment-217477017 @markgrover Mind adding `Closes #10681` to the PR description so that the merging script can close that one together with this?
[GitHub] spark pull request: [SPARK-15180][SQL] Support subexpression elimi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12956#issuecomment-217474807 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11714][Mesos] Make Spark on Mesos honor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11157#issuecomment-217475441 **[Test build #58004 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58004/consoleFull)** for PR 11157 at commit [`9c7cf33`](https://github.com/apache/spark/commit/9c7cf332ccf350e721d25b0070b7c2637261ccaf).
[GitHub] spark pull request: [SPARK-15180][SQL] Support subexpression elimi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12956#issuecomment-217474810 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57997/
[GitHub] spark pull request: [SPARK-15180][SQL] Support subexpression elimi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12956#issuecomment-217474554 **[Test build #57997 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57997/consoleFull)** for PR 12956 at commit [`55e43ef`](https://github.com/apache/spark/commit/55e43ef9f25c68d0c3773f36156b34d42a9baedc). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class SubExprEliminationState(isNull: String, value: String, exprCode: Option[ExprCode])`
[GitHub] spark pull request: [SPARK-15112][SQL] Allows query plan schema an...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12952#issuecomment-217473744 **[Test build #58002 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58002/consoleFull)** for PR 12952 at commit [`9412b9b`](https://github.com/apache/spark/commit/9412b9b7743cda7df22e7834059e7c7be2e1eb85).
[GitHub] spark pull request: [SPARK-15173][SQL] DataFrameWriter.insertInto ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12949#issuecomment-217473754 **[Test build #58003 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58003/consoleFull)** for PR 12949 at commit [`8f5b688`](https://github.com/apache/spark/commit/8f5b688754bc493548ecab714cb1a56136c3b02e).
[GitHub] spark pull request: [SPARK-15122][SQL] Fix TPC-DS 41 - Normalize p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12954#issuecomment-217472088 **[Test build #57994 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57994/consoleFull)** for PR 12954 at commit [`f0871c9`](https://github.com/apache/spark/commit/f0871c921285a05602cf566c9f2c23901224d73e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15112][SQL] Allows query plan schema an...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/12952#discussion_r62345191

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
```scala
@@ -463,7 +463,12 @@ class SparkSession private(
    */
   @Experimental
   def range(start: Long, end: Long, step: Long, numPartitions: Int): Dataset[java.lang.Long] = {
-    new Dataset(self, Range(start, end, step, numPartitions), Encoders.LONG)
+    val encoder = {
+      val schema = StructType(Seq(StructField("id", LongType, nullable = false)))
+      ExpressionEncoder[java.lang.Long]().copy[java.lang.Long](schema = schema)
+    }
+
+    new Dataset(self, Range(start, end, step, numPartitions), encoder)
```
--- End diff --

We are now using the encoder schema as the Dataset schema, so we need to rename the default primitive encoder column name "value" to the desired name.
[GitHub] spark pull request: [SPARK-15112][SQL] Allows query plan schema an...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/12952#discussion_r62345241 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala --- @@ -502,7 +507,7 @@ class SparkSession private( /* * | Catalog-related methods | - * - -- */ + * */ --- End diff -- Mysterious missing space...
[GitHub] spark pull request: [Spark-15155][Mesos] Optionally ignore default...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12933#issuecomment-217472869 Merged build finished. Test FAILed.
[GitHub] spark pull request: [Spark-15155][Mesos] Optionally ignore default...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12933#issuecomment-217472873 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57993/ Test FAILed.
[GitHub] spark pull request: [Spark-15155][Mesos] Optionally ignore default...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12933#issuecomment-217472670 **[Test build #57993 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57993/consoleFull)** for PR 12933 at commit [`d2b7ad4`](https://github.com/apache/spark/commit/d2b7ad444e02b947f4a7264018b4e48610731408). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14939][SQL] Add FoldablePropagation opt...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-217472671 Hi, @cloud-fan. Now it's ready for review. Could you review this when you have some time?
[GitHub] spark pull request: [SPARK-15122][SQL] Fix TPC-DS 41 - Normalize p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12954#issuecomment-217472331 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57994/ Test PASSed.
[GitHub] spark pull request: [SPARK-15122][SQL] Fix TPC-DS 41 - Normalize p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12954#issuecomment-217472325 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14476][SQL][WIP] Improve the physical p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12947#issuecomment-217471682 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14476][SQL][WIP] Improve the physical p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12947#issuecomment-217471690 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57995/ Test PASSed.
[GitHub] spark pull request: [SPARK-14476][SQL][WIP] Improve the physical p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12947#issuecomment-217471315 **[Test build #57995 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57995/consoleFull)** for PR 12947 at commit [`438d70e`](https://github.com/apache/spark/commit/438d70e02cfaf9e3b6beccc8d3a8d0c65f7499da). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15087][MINOR][DOC] Follow Up: Fix the C...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12953#issuecomment-217468171 **[Test build #58000 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58000/consoleFull)** for PR 12953 at commit [`dbb6632`](https://github.com/apache/spark/commit/dbb663222ba379fbf0b846e2342173f0f0a0ecef).
[GitHub] spark pull request: [SPARK-15173][SQL] DataFrameWriter.insertInto ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12949#issuecomment-217468181 **[Test build #58001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58001/consoleFull)** for PR 12949 at commit [`167a7d6`](https://github.com/apache/spark/commit/167a7d6d8ad13a9d754e22dbd75cd9a16e9d1a56).
[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...
Github user cenyuhai commented on the pull request: https://github.com/apache/spark/pull/11546#issuecomment-217467789 ok to test
[GitHub] spark pull request: [SPARK-15087][MINOR][DOC] Follow Up: Fix the C...
Github user techaddict commented on a diff in the pull request: https://github.com/apache/spark/pull/12953#discussion_r62342197

--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
```scala
@@ -389,11 +389,10 @@ private[spark] class TaskSchedulerImpl(
     // (taskId, stageId, stageAttemptId, accumUpdates)
     val accumUpdatesWithTaskIds: Array[(Long, Int, Int, Seq[AccumulableInfo])] = synchronized {
       accumUpdates.flatMap { case (id, updates) =>
-        // We should call `acc.value` here as we are at driver side now. However, the RPC framework
+        // We call `acc.value` here as we are at driver side now. However, the RPC framework
```
--- End diff --

@srowen @cloud-fan Done
[GitHub] spark pull request: [Docs] Added Scaladoc for countApprox and coun...
Github user ntietz commented on the pull request: https://github.com/apache/spark/pull/12955#issuecomment-217467133 Good call, I will add it to the Java version as well.
[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...
Github user cenyuhai commented on the pull request: https://github.com/apache/spark/pull/11546#issuecomment-217466944 @andrewor14 I altered the code as you suggested, but the test failed because of a timeout. It seems unrelated to my change...
[GitHub] spark pull request: [Spark-15051] [SQL] Create a TypedColumn alias...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12893#issuecomment-217463479 ok to test
[GitHub] spark pull request: [SPARK-11714][Mesos] Make Spark on Mesos honor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11157#issuecomment-217465989 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57992/ Test PASSed.
[GitHub] spark pull request: [SPARK-11714][Mesos] Make Spark on Mesos honor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11157#issuecomment-217465986 Merged build finished. Test PASSed.
[GitHub] spark pull request: [Docs] Added Scaladoc for countApprox and coun...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/12955#issuecomment-217465951 How about doc'ing the Java version as well in JavaRDDLike.scala? You're welcome to expand on the java/scaladoc of lots of these methods. It'd be nicer to have more complete doc of method args and return type for such a central API.
[GitHub] spark pull request: [SPARK-11714][Mesos] Make Spark on Mesos honor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11157#issuecomment-217465665 **[Test build #57992 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57992/consoleFull)** for PR 11157 at commit [`dba3e34`](https://github.com/apache/spark/commit/dba3e34c826ddfcaa096254e1f0d230c49b4349d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15093][SQL] create/delete/rename direct...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12871#issuecomment-217465398 **[Test build #57999 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57999/consoleFull)** for PR 12871 at commit [`0261d25`](https://github.com/apache/spark/commit/0261d252f8baa1e823a97261e111a3e93019a0dc).
[GitHub] spark pull request: [Spark-15051] [SQL] Create a TypedColumn alias...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12893#issuecomment-217465397 **[Test build #57998 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57998/consoleFull)** for PR 12893 at commit [`e408fdf`](https://github.com/apache/spark/commit/e408fdf43c207a189f6316a80599e7f54eb832b6).
[GitHub] spark pull request: SPARK-12639 SQL Improve Explain for Datasource...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10655#issuecomment-217465276 ping @RussellSpitzer
[GitHub] spark pull request: [SPARK-15087][MINOR][DOC] Follow Up: Fix the C...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12953#discussion_r62340446

--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
```scala
@@ -389,11 +389,10 @@ private[spark] class TaskSchedulerImpl(
     // (taskId, stageId, stageAttemptId, accumUpdates)
     val accumUpdatesWithTaskIds: Array[(Long, Int, Int, Seq[AccumulableInfo])] = synchronized {
       accumUpdates.flatMap { case (id, updates) =>
-        // We should call `acc.value` here as we are at driver side now. However, the RPC framework
+        // We call `acc.value` here as we are at driver side now. However, the RPC framework
```
--- End diff --

Since we have no `localValue` anymore, this comment can be removed entirely.
[GitHub] spark pull request: [SPARK-12988][SQL] Can't drop columns that con...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10943#issuecomment-217461632 ping @cloud-fan
[GitHub] spark pull request: [SPARK-15180][SQL] Support subexpression elimi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12956#issuecomment-217457647 **[Test build #57997 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57997/consoleFull)** for PR 12956 at commit [`55e43ef`](https://github.com/apache/spark/commit/55e43ef9f25c68d0c3773f36156b34d42a9baedc).
[GitHub] spark pull request: [SPARK-15180][SQL] Support subexpression elimi...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12956#issuecomment-217456943

Codes:

```scala
val ds = Seq(("a", 10), ("a", 20), ("b", 1), ("b", 2), ("c", 1)).toDS().filter("_2 + 1 > 5").filter("_2 + 1 > 20")
ds.collect()
```

Generated codes:

```java
/* 030 */   protected void processNext() throws java.io.IOException {
/* 031 */     /*** PRODUCE: Filter (((_2#3 + 1) > 20) && ((_2#3 + 1) > 5)) */
/* 032 */
/* 033 */     /*** PRODUCE: INPUT */
/* 034 */
/* 035 */     while (inputadapter_input.hasNext()) {
/* 036 */       InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
/* 037 */       /*** CONSUME: Filter (((_2#3 + 1) > 20) && ((_2#3 + 1) > 5)) */
/* 038 */
/* 039 */       /* ((input[1, int] + 1) > 20) */
/* 040 */       // Common expression
/* 041 */       /* (input[1, int] + 1) */
/* 042 */       /* input[1, int] */
/* 043 */       /* input[1, int] */
/* 044 */       int inputadapter_value1 = inputadapter_row.getInt(1);
/* 045 */
/* 046 */       int filter_value = -1;
/* 047 */       filter_value = inputadapter_value1 + 1;
/* 048 */
/* 049 */       /* (input[1, int] + 1) */
/* 050 */
/* 051 */       boolean filter_value3 = false;
/* 052 */       filter_value3 = filter_value > 20;
/* 053 */       if (!filter_value3) continue;
/* 054 */       /* ((input[1, int] + 1) > 5) */
/* 055 */       /* (input[1, int] + 1) */
/* 056 */
/* 057 */       boolean filter_value5 = false;
/* 058 */       filter_value5 = filter_value > 5;
/* 059 */       if (!filter_value5) continue;
/* 060 */
/* 061 */       filter_numOutputRows.add(1);
/* 062 */
/* 063 */       /*** CONSUME: WholeStageCodegen */
/* 064 */
/* 065 */       /* input[0, string] */
/* 066 */       boolean inputadapter_isNull = inputadapter_row.isNullAt(0);
/* 067 */       UTF8String inputadapter_value = inputadapter_isNull ? null : (inputadapter_row.getUTF8String(0));
/* 068 */       filter_holder.reset();
/* 069 */
/* 070 */       filter_rowWriter.zeroOutNullBytes();
/* 071 */
/* 072 */       if (inputadapter_isNull) {
/* 073 */         filter_rowWriter.setNullAt(0);
/* 074 */       } else {
/* 075 */         filter_rowWriter.write(0, inputadapter_value);
/* 076 */       }
/* 077 */
/* 078 */       filter_rowWriter.write(1, inputadapter_value1);
/* 079 */       filter_result.setTotalSize(filter_holder.totalSize());
/* 080 */       append(filter_result);
/* 081 */       if (shouldStop()) return;
/* 082 */     }
/* 083 */   }
/* 084 */ }
```
[GitHub] spark pull request: [SPARK-12177][Streaming][Kafka] Update KafkaDS...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11863#issuecomment-217456649 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12177][Streaming][Kafka] Update KafkaDS...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11863#issuecomment-217456641 **[Test build #57996 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57996/consoleFull)** for PR 11863 at commit [`544bf88`](https://github.com/apache/spark/commit/544bf888984e20dadae852faa8ca1dd26fc416e7). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12177][Streaming][Kafka] Update KafkaDS...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11863#issuecomment-217456651 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57996/ Test FAILed.
[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11546#issuecomment-217456515 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11546#issuecomment-217456518 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57982/ Test FAILed.
[GitHub] spark pull request: [SPARK-12177][Streaming][Kafka] Update KafkaDS...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11863#issuecomment-217456405 **[Test build #57996 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57996/consoleFull)** for PR 11863 at commit [`544bf88`](https://github.com/apache/spark/commit/544bf888984e20dadae852faa8ca1dd26fc416e7).
[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11546#issuecomment-217456396 **[Test build #57982 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57982/consoleFull)** for PR 11546 at commit [`27fd070`](https://github.com/apache/spark/commit/27fd07058112dad0760c7fa5480fb43d4f046d96). * This patch **fails from timeout after a configured wait of `250m`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15180][SQL] Support subexpression elimi...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/12956 [SPARK-15180][SQL] Support subexpression elimination in Filter ## What changes were proposed in this pull request? This patch adds support for subexpression elimination in the wholestage-codegen `Filter`. Because the predicate expressions are evaluated in `Filter` with an optimized ordering that reduces unnecessary evaluation as much as possible, we follow this ordering when doing subexpression elimination too. Due to that, we can't just extract all common subexpressions and evaluate them first. Instead, we extract common subexpressions but don't evaluate them eagerly; a common subexpression is evaluated only when the first predicate expression containing it is evaluated. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 subexpr-filter Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12956.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12956 commit 55e43ef9f25c68d0c3773f36156b34d42a9baedc Author: Liang-Chi Hsieh Date: 2016-05-06T14:22:49Z Support subexpression elimination in Filter.
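The deferred-evaluation idea described above can be sketched in plain Scala (an illustrative sketch only, not the actual generated code): a `lazy val` mimics "extract the common subexpression but don't compute it until the first predicate referencing it runs".

```scala
// Illustrative sketch of deferred common-subexpression evaluation; the
// names and predicates are hypothetical, not from the actual codegen.
object SubexprSketch {
  // Returns the combined predicate result plus how many times the shared
  // subexpression was actually computed (at most once per row).
  def evalRow(x: Int): (Boolean, Int) = {
    var evals = 0
    lazy val common = { evals += 1; x * x } // shared by both predicates
    val p1 = x > 0 && common > 4            // `common` computed only if x > 0
    val p2 = p1 || common < 100             // reuses the memoized value
    (p2, evals)
  }
}
```

Note how the short-circuit ordering is preserved: for `x = -1` the first predicate never touches `common`, and it is only computed when the second predicate needs it.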
[GitHub] spark pull request: [SPARK-12177][Streaming][Kafka] Update KafkaDS...
Github user koeninger commented on a diff in the pull request: https://github.com/apache/spark/pull/11863#discussion_r62336356 --- Diff: external/kafka-beta/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala --- @@ -0,0 +1,259 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.streaming.kafka + +import java.{ util => ju } + +import scala.collection.mutable.ArrayBuffer +import scala.reflect.{classTag, ClassTag} + +import org.apache.kafka.clients.consumer.{ ConsumerConfig, ConsumerRecord } +import org.apache.kafka.common.TopicPartition + +import org.apache.spark.{Partition, SparkContext, SparkException, TaskContext} +import org.apache.spark.internal.Logging +import org.apache.spark.partial.{BoundedDouble, PartialResult} +import org.apache.spark.rdd.RDD +import org.apache.spark.scheduler.ExecutorCacheTaskLocation +import org.apache.spark.storage.StorageLevel + +/** + * A batch-oriented interface for consuming from Kafka. + * Starting and ending offsets are specified in advance, + * so that you can control exactly-once semantics. + * @param kafkaParams Kafka + * http://kafka.apache.org/documentation.htmll#newconsumerconfigs";> + * configuration parameters. 
Requires "bootstrap.servers" to be set + * with Kafka broker(s) specified in host1:port1,host2:port2 form. + * @param offsetRanges offset ranges that define the Kafka data belonging to this RDD + */ + +class KafkaRDD[ + K: ClassTag, + V: ClassTag] private[spark] ( +sc: SparkContext, +val kafkaParams: ju.Map[String, Object], +val offsetRanges: Array[OffsetRange], +val preferredHosts: ju.Map[TopicPartition, String] +) extends RDD[ConsumerRecord[K, V]](sc, Nil) with Logging with HasOffsetRanges { + + assert("none" == + kafkaParams.get(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG).asInstanceOf[String], +ConsumerConfig.AUTO_OFFSET_RESET_CONFIG + + " must be set to none for executor kafka params, else messages may not match offsetRange") + + assert(false == --- End diff -- The override is done in the companion object not in this constructor. And it's still possible for subclasses to construct this. The real question is whether you'd ever want to allow executors to mess with offsets, and I'm pretty sure the answer is no.
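The assertion quoted in the diff enforces a particular shape of executor Kafka params. A hypothetical params map that would satisfy it (broker addresses are placeholders) might look like:

```scala
import java.{util => ju}

// Hypothetical executor Kafka params matching the assertion above: offsets
// are fixed in advance by the OffsetRange, so the consumer must not
// auto-reset them, and brokers (not ZooKeeper) go in bootstrap.servers.
val kafkaParams = new ju.HashMap[String, Object]()
kafkaParams.put("bootstrap.servers", "host1:9092,host2:9092")
kafkaParams.put("auto.offset.reset", "none")      // fail fast rather than silently re-seek
kafkaParams.put("enable.auto.commit", java.lang.Boolean.FALSE) // offsets managed by the RDD
```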
[GitHub] spark pull request: [SPARK-12177][Streaming][Kafka] Update KafkaDS...
Github user koeninger commented on a diff in the pull request: https://github.com/apache/spark/pull/11863#discussion_r62336181 --- Diff: external/kafka-beta/src/main/scala/org/apache/spark/streaming/kafka/DirectKafkaInputDStream.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.streaming.kafka + +import java.{ util => ju } +import java.util.concurrent.ConcurrentLinkedQueue +import java.util.concurrent.atomic.AtomicReference + +import scala.annotation.tailrec +import scala.collection.JavaConverters._ +import scala.collection.mutable +import scala.reflect.ClassTag + +import org.apache.kafka.clients.consumer._ +import org.apache.kafka.common.{ PartitionInfo, TopicPartition } + +import org.apache.spark.SparkException +import org.apache.spark.internal.Logging +import org.apache.spark.storage.StorageLevel +import org.apache.spark.streaming.{StreamingContext, Time} +import org.apache.spark.streaming.dstream._ +import org.apache.spark.streaming.scheduler.{RateController, StreamInputInfo} +import org.apache.spark.streaming.scheduler.rate.RateEstimator + +/** + * A stream of {@link org.apache.spark.streaming.kafka.KafkaRDD} where + * each given Kafka topic/partition corresponds to an RDD partition. + * The spark configuration spark.streaming.kafka.maxRatePerPartition gives the maximum number + * of messages + * per second that each '''partition''' will accept. + * Starting offsets are specified in advance, + * and this DStream is not responsible for committing offsets, + * so that you can control exactly-once semantics. + * @param kafkaParams Kafka http://kafka.apache.org/documentation.html#newconsumerconfigs";> + * configuration parameters. + * Requires "bootstrap.servers" to be set with Kafka broker(s), + * NOT zookeeper servers, specified in host1:port1,host2:port2 form. 
+ */ + +class DirectKafkaInputDStream[K: ClassTag, V: ClassTag] private[spark] ( +_ssc: StreamingContext, +preferredHosts: ju.Map[TopicPartition, String], +executorKafkaParams: ju.Map[String, Object], +driverConsumer: () => Consumer[K, V] + ) extends InputDStream[ConsumerRecord[K, V]](_ssc) with Logging { + + @transient private var kc: Consumer[K, V] = null + def consumer(): Consumer[K, V] = this.synchronized { +if (null == kc) { + kc = driverConsumer() +} +kc + } + consumer() + + override def persist(newLevel: StorageLevel): DStream[ConsumerRecord[K, V]] = { +log.error("Kafka ConsumerRecord is not serializable. " + + "Use .map to extract fields before calling .persist or .window") +super.persist(newLevel) + } + + protected def getBrokers = { +val c = consumer +val result = new ju.HashMap[TopicPartition, String]() +val hosts = new ju.HashMap[TopicPartition, String]() +val assignments = c.assignment().iterator() +while (assignments.hasNext()) { + val tp: TopicPartition = assignments.next() + if (null == hosts.get(tp)) { +val infos = c.partitionsFor(tp.topic).iterator() +while (infos.hasNext()) { + val i = infos.next() + hosts.put(new TopicPartition(i.topic(), i.partition()), i.leader.host()) +} + } + result.put(tp, hosts.get(tp)) +} +result + } + + protected def getPreferredHosts: ju.Map[TopicPartition, String] = { +if (preferredHosts == DirectKafkaInputDStream.preferBrokers) { + getBrokers +} else { + preferredHosts +} + } + + // Keep this consistent with how other streams are named (e.g. "Flume polling stream [2]") + private[streaming] override def name: String = s"Kafka beta direct stream [$id]" + + protected[streaming] override val checkpointData = +new DirectKafkaInputDStreamCheckpointData + + +
[GitHub] spark pull request: [Docs] Added Scaladoc for countApprox and coun...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12955#issuecomment-217452079 Can one of the admins verify this patch?
[GitHub] spark pull request: [Docs] Added Scaladoc for countApprox and coun...
GitHub user ntietz opened a pull request: https://github.com/apache/spark/pull/12955 [Docs] Added Scaladoc for countApprox and countByValueApprox parameters This pull request simply adds Scaladoc documentation of the parameters for countApprox and countByValueApprox. This is an important documentation change, as it clarifies what should be passed in for the timeout. Without units, this was previously unclear. I did not open a JIRA ticket per my understanding of the project contribution guidelines; as they state, the description in the ticket would be essentially just what is in the PR. If I should open one, let me know and I will do so. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ntietz/spark rdd-countapprox-docs Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12955.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12955 commit a28014dc45981b79df6e6c18f473565eb740638c Author: Nicholas Tietz Date: 2016-05-06T14:07:21Z Added Scaladoc for countApprox and countByValueApprox
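For context, the ambiguity this PR documents is the unit of `countApprox`'s timeout, which is milliseconds. A usage sketch in the spark-shell (requires a running SparkContext, so shown as a transcript only):

```scala
scala> // Sketch: timeout is in milliseconds -- the unit the new Scaladoc spells out.
scala> val approx = sc.parallelize(1 to 1000000).countApprox(timeout = 2000L, confidence = 0.95)
scala> approx.initialValue  // a BoundedDouble: [low, high] at the given confidence
```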
[GitHub] spark pull request: [SPARK-13370][SQL] Require whitespace between ...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/12897#issuecomment-217451180 @rxin @yhuai what is the next step? Are we changing the behavior? Or keeping it as it is?
[GitHub] spark pull request: [SPARK-15074][Shuffle] Cache shuffle index fil...
Github user sitalkedia commented on a diff in the pull request: https://github.com/apache/spark/pull/12944#discussion_r62334156 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexRecord.java --- @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.network.shuffle; + +/** + * Contains offset and length of the shuffle block data. + */ +public class ShuffleIndexRecord { + private final long offset; + private final long length; + + public ShuffleIndexRecord(long offset, long length) { +this.offset = offset; +this.length = length; + } + + public long getOffset() { +return offset; + } + + public long getLength() { +return length; + } +} --- End diff -- will fix, thanks
[GitHub] spark pull request: [SPARK-15074][Shuffle] Cache shuffle index fil...
Github user sitalkedia commented on the pull request: https://github.com/apache/spark/pull/12944#issuecomment-217450001 @holdenk - `TransportConf` is not specific to the shuffle service; it is used to create the transport client in other modules as well. Since the number of index cache entries is very specific to the `ShuffleService`, I did not want to expose that as an API in `TransportConf`. Let me know what you think about it.
[GitHub] spark pull request: [SPARK-15119] [ML] Add a validator to Decision...
Github user dominik-jastrzebski commented on the pull request: https://github.com/apache/spark/pull/12895#issuecomment-217449436 Ok, I can check the other validators in `treeParams.scala`.
[GitHub] spark pull request: [SPARK-15087][MINOR][DOC] Follow Up: Fix the C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12953#issuecomment-217449336 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57990/ Test PASSed.
[GitHub] spark pull request: [SPARK-15087][MINOR][DOC] Follow Up: Fix the C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12953#issuecomment-217449332 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15087][MINOR][DOC] Follow Up: Fix the C...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12953#issuecomment-217449102 **[Test build #57990 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57990/consoleFull)** for PR 12953 at commit [`032e042`](https://github.com/apache/spark/commit/032e042b55fbd8a5fcc932e29c6654d68b499c5b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14476][SQL][WIP] Improve the physical p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12947#issuecomment-217448700 **[Test build #57995 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57995/consoleFull)** for PR 12947 at commit [`438d70e`](https://github.com/apache/spark/commit/438d70e02cfaf9e3b6beccc8d3a8d0c65f7499da).
[GitHub] spark pull request: [SPARK-14613][ML] Add @Since into the matrix a...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/12416#issuecomment-217447705 @pravingadakh go ahead - would be good to get this in for 2.0
[GitHub] spark pull request: [SPARK-15074][Shuffle] Cache shuffle index fil...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12944#discussion_r62331757 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexRecord.java --- @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.network.shuffle; + +/** + * Contains offset and length of the shuffle block data. + */ +public class ShuffleIndexRecord { + private final long offset; + private final long length; + + public ShuffleIndexRecord(long offset, long length) { +this.offset = offset; +this.length = length; + } + + public long getOffset() { +return offset; + } + + public long getLength() { +return length; + } +} --- End diff -- And a newline here maybe :)
[GitHub] spark pull request: [SPARK-15122][SQL] Fix TPC-DS 41 - Normalize p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12954#issuecomment-217447455 **[Test build #57994 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57994/consoleFull)** for PR 12954 at commit [`f0871c9`](https://github.com/apache/spark/commit/f0871c921285a05602cf566c9f2c23901224d73e).
[GitHub] spark pull request: [SPARK-15122][SQL] Fix TPC-DS 41 - Normalize p...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/12954#issuecomment-217447302 cc @rxin @davies
[GitHub] spark pull request: [SPARK-15176][Core] Add maxShares setting to P...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/12951#discussion_r62332468 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala --- @@ -21,6 +21,7 @@ import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue} import scala.collection.JavaConverters._ import scala.collection.mutable.ArrayBuffer +import scala.math.{max,min} --- End diff -- It's a tiny nit, but while here, I usually see `math.max` just written out in Scala code; no need to import a standard class's methods.
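The reviewer's suggestion in code form: `scala.math` is in scope by default via the `scala` package, so qualified calls need no import line (the `clampShares` helper below is purely illustrative):

```scala
// No `import scala.math.{max, min}` needed: `math.max` resolves through
// the always-visible `scala` package.
object Bounds {
  def clampShares(requested: Int, maxShares: Int): Int =
    math.min(math.max(requested, 0), maxShares)
}
```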
[GitHub] spark pull request: [SPARK-15122][SQL] Fix TPC-DS 41 - Normalize p...
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/12954 [SPARK-15122][SQL] Fix TPC-DS 41 - Normalize predicates before pulling them out ## What changes were proposed in this pull request? The official TPC-DS 41 query currently fails because it contains a scalar subquery with a disjunctive correlated predicate (the correlated predicates were nested in ORs). This makes the `Analyzer` pull out the entire predicate, which is wrong and causes the following (correct) analysis exception: `The correlated scalar subquery can only contain equality predicates` This PR fixes this by first simplifying (or normalizing) the correlated predicates before pulling them out of the subquery. I have also added a small optimizer rule that rewrites correlated scalar subqueries into predicate subqueries if they are used in a `Filter` and are wrapped by a predicate. This allows us to use semi joins instead of left outer joins. ## How was this patch tested? Manual testing on TPC-DS 41, and added a test to SubquerySuite. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hvanhovell/spark SPARK-15122 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12954.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12954 commit f0871c921285a05602cf566c9f2c23901224d73e Author: Herman van Hovell Date: 2016-05-06T13:39:43Z Fix TPC-DS 41 - normalize predicates before pulling them out.
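A hypothetical, simplified query illustrating the problem shape (not the actual TPC-DS 41 text): both OR branches repeat the same correlated equality, so normalization can factor it out as `i2.i_manufact = i1.i_manufact AND (i2.i_size = 'small' OR i2.i_size = 'large')`, leaving only an equality predicate for the `Analyzer` to pull out of the subquery:

```scala
scala> sql("""
     | SELECT i_product_name FROM item i1
     | WHERE 0 < (SELECT count(*) FROM item i2
     |            WHERE (i2.i_manufact = i1.i_manufact AND i2.i_size = 'small')
     |               OR (i2.i_manufact = i1.i_manufact AND i2.i_size = 'large'))
     | """).explain(true)
```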
[GitHub] spark pull request: [SPARK-1239] Improve fetching of map output st...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/12113#discussion_r62332230 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -428,40 +503,89 @@ private[spark] class MapOutputTrackerMaster(conf: SparkConf) } } + private def removeBroadcast(bcast: Broadcast[_]): Unit = { +if (null != bcast) { + broadcastManager.unbroadcast(bcast.id, +removeFromDriver = true, blocking = false) +} + } + + private def clearCachedBroadcast(): Unit = { +for (cached <- cachedSerializedBroadcast) removeBroadcast(cached._2) +cachedSerializedBroadcast.clear() + } + def getSerializedMapOutputStatuses(shuffleId: Int): Array[Byte] = { var statuses: Array[MapStatus] = null +var retBytes: Array[Byte] = null var epochGotten: Long = -1 -epochLock.synchronized { - if (epoch > cacheEpoch) { -cachedSerializedStatuses.clear() -cacheEpoch = epoch - } - cachedSerializedStatuses.get(shuffleId) match { -case Some(bytes) => - return bytes -case None => - statuses = mapStatuses.getOrElse(shuffleId, Array[MapStatus]()) - epochGotten = epoch + +// Check to see if we have a cached version, returns true if it does +// and has side effect of setting retBytes. 
If not returns false +// with side effect of setting statuses +def checkCachedStatuses(): Boolean = { + epochLock.synchronized { +if (epoch > cacheEpoch) { + cachedSerializedStatuses.clear() + clearCachedBroadcast() + cacheEpoch = epoch +} +cachedSerializedStatuses.get(shuffleId) match { + case Some(bytes) => +retBytes = bytes +true + case None => +logDebug("cached status not found for : " + shuffleId) +statuses = mapStatuses.getOrElse(shuffleId, Array[MapStatus]()) +epochGotten = epoch +false +} } } -// If we got here, we failed to find the serialized locations in the cache, so we pulled -// out a snapshot of the locations as "statuses"; let's serialize and return that -val bytes = MapOutputTracker.serializeMapStatuses(statuses) -logInfo("Size of output statuses for shuffle %d is %d bytes".format(shuffleId, bytes.length)) -// Add them into the table only if the epoch hasn't changed while we were working -epochLock.synchronized { - if (epoch == epochGotten) { -cachedSerializedStatuses(shuffleId) = bytes + +if (checkCachedStatuses()) return retBytes +var shuffleIdLock = shuffleIdLocks.get(shuffleId) +if (null == shuffleIdLock) { + val newLock = new Object() + // in general, this condition should be false - but good to be paranoid + val prevLock = shuffleIdLocks.putIfAbsent(shuffleId, newLock) --- End diff -- Its purely defensive programming to allow things to work when the unexpected happen. Would you rather have your production job that was running for 5 hours throw a null pointer exception or try to fix itself and continue to run? In distributed systems weird things happen and this is processing a message from another host/task which you don't have direct control of. You can get network breaks, weird host failures or pauses, etc and a message comes in late asking for a shuffle id that isn't there anymore. The unregister shuffle which removes the lock for the shuffle id is being called from the context cleaner. 
So if an RDD goes out of scope and is cleaned up, the shuffle lock gets removed. As I mentioned above, if some host was slightly out of sync and sent a message to fetch that id late, we would throw a null pointer exception. Everything else in GetMapOutputStatuses handles this case, and there is actually a test for this (fetching after unregister), so if this line is removed that test fails.
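The `putIfAbsent` pattern under discussion can be sketched in isolation (names are illustrative, not the actual `MapOutputTrackerMaster` code): the point is that lock creation is race-safe, so two threads asking for the same shuffle id always end up synchronizing on the same object.

```scala
import java.util.concurrent.ConcurrentHashMap

// Illustrative sketch of a per-shuffle-id lock registry.
object ShuffleIdLocks {
  private val locks = new ConcurrentHashMap[Int, Object]()

  def lockFor(shuffleId: Int): Object = {
    val existing = locks.get(shuffleId)
    if (existing != null) {
      existing
    } else {
      val newLock = new Object()
      // Defensive: another thread (or a late re-registration) may have won
      // the race between get() and here; if so, use the winner's lock.
      val prev = locks.putIfAbsent(shuffleId, newLock)
      if (prev == null) newLock else prev
    }
  }
}
```

Using the returned lock (`ShuffleIdLocks.lockFor(id).synchronized { ... }`) then serializes cache fills per shuffle id without a global lock.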
[GitHub] spark pull request: [Spark-15155][Mesos] Optionally ignore default...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12933#issuecomment-217446062 **[Test build #57993 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57993/consoleFull)** for PR 12933 at commit [`d2b7ad4`](https://github.com/apache/spark/commit/d2b7ad444e02b947f4a7264018b4e48610731408).
[GitHub] spark pull request: [Spark-15155][Mesos] Optionally ignore default...
Github user hellertime commented on the pull request: https://github.com/apache/spark/pull/12933#issuecomment-217445754 Rebasing against master.
[GitHub] spark pull request: [SPARK-14476][SQL][WIP] Improves the output of...
Github user clockfly commented on the pull request: https://github.com/apache/spark/pull/12947#issuecomment-217445675 @davies I made some changes in the UI; please check whether it is better now.
```
scala> spark.sql("select * from tt").explain()
== Physical Plan ==
WholeStageCodegen
:  +- BatchedScan HadoopFiles default.tt[id#0L] Format: ParquetFormat, InputPaths: file:/home/xzhong10/spark-linux/assembly/spark-warehouse/tt, PushedFilters: [], ReadSchema: struct
```
![change_v2](https://cloud.githubusercontent.com/assets/2595532/15074961/8fa7f828-13d4-11e6-95b3-a3df261809f7.jpg)
[GitHub] spark pull request: [SPARK-15080][CORE] Break copyAndReset into co...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12936#issuecomment-217444691 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-15080][CORE] Break copyAndReset into co...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12936#issuecomment-217444695 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57987/ Test PASSed.
[GitHub] spark pull request: [SPARK-15176][Core] Add maxShares setting to P...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12951#discussion_r62331312
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala ---
@@ -21,6 +21,7 @@ import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue}

 import scala.collection.JavaConverters._
 import scala.collection.mutable.ArrayBuffer
+import scala.math.{max,min}
--- End diff --
(FYI, running `./dev/run-tests` will trigger the style check first. Running the first part of this script before submitting more commits will surface the comments I made here.)
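As an editorial aside, the style nit in that import is presumably the missing space after the comma in the multi-import; a hypothetical style-compliant sketch (the `Clamp` helper is made up purely to give the import a usage):

```scala
// Style-compliant multi-import: note the space after the comma.
import scala.math.{max, min}

object Clamp {
  // Tiny usage example: restrict x to the closed interval [lo, hi].
  def clamp(x: Int, lo: Int, hi: Int): Int = max(lo, min(x, hi))
}
```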
[GitHub] spark pull request: [SPARK-15080][CORE] Break copyAndReset into co...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12936#issuecomment-21725 **[Test build #57987 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57987/consoleFull)** for PR 12936 at commit [`ca9c80c`](https://github.com/apache/spark/commit/ca9c80c4fcc78d571606c9abb0312f24bdc12340). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-15176][Core] Add maxShares setting to P...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12951#discussion_r62330957
--- Diff: core/src/main/scala/org/apache/spark/scheduler/Pool.scala ---
@@ -47,6 +49,15 @@ private[spark] class Pool(
   var name = poolName
   var parent: Pool = null
+  override def maxShare = {
--- End diff --
Maybe specifying return types? (See Return types in https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide)
[GitHub] spark pull request: [SPARK-15176][Core] Add maxShares setting to P...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12951#discussion_r62330929
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -98,6 +98,14 @@ private[spark] class TaskSetManager(
   var totalResultSize = 0L
   var calculatedTasks = 0
+  override def maxShare = {
--- End diff --
Maybe specifying return types? (See Return types in https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide)
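The reviewer's point about explicit return types can be sketched with a hypothetical, stripped-down version of the hierarchy (not the PR's actual `Schedulable` code, and the `Int` type is an assumption for illustration):

```scala
// Hypothetical sketch: annotate the return type of an overriding def
// explicitly rather than relying on type inference.
trait Schedulable {
  def maxShare: Int
}

class Pool(poolMaxShare: Int) extends Schedulable {
  // Explicit ": Int" per the "Return types" section of the Spark style guide.
  override def maxShare: Int = poolMaxShare
}
```

Explicit return types on public or overriding members keep inference changes in the body from silently changing the API.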