[GitHub] spark pull request: [SPARK-6980] [CORE] [WIP] Akka timeout excepti...
Github user squito commented on the pull request: https://github.com/apache/spark/pull/6205#issuecomment-107218808 @BryanCutler I haven't looked at this closely yet, but quick question -- are you talking about using this just for `actorSelection.resolveOne`? Or are you proposing doing this in place of what you already have? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-7426] [MLlib] [ML] Updated Attribute.fr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6540#issuecomment-107231676 Merged build started.
[GitHub] spark pull request: [SPARK-7426] [MLlib] [ML] Updated Attribute.fr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6540#issuecomment-107231654 Merged build triggered.
[GitHub] spark pull request: [MINOR] Add license for dagre-d3 and graphlib-...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/6539#issuecomment-107231729 Oops, good catch. I'm merging this into master and 1.4.
[GitHub] spark pull request: [Spark-7949] [MLlib] [Doc] update document wit...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/6498#issuecomment-107233326 No problem (and thanks for catching that other typo I missed). Merging into master and branch-1.4.
[GitHub] spark pull request: [Spark-7949] [MLlib] [Doc] update document wit...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6498
[GitHub] spark pull request: [MINOR] Enable PySpark SQL readerwriter and wi...
GitHub user JoshRosen opened a pull request: https://github.com/apache/spark/pull/6542

[MINOR] Enable PySpark SQL readerwriter and window tests.

PySpark SQL's `readerwriter` and `window` doctests weren't being run by our test runner script; this patch re-enables them.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark enable-more-pyspark-sql-tests

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/6542.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #6542

commit 9f46ce41ac79080cf18a6d8496cd18a2dc878e37
Author: Josh Rosen joshro...@databricks.com
Date: 2015-05-31T20:17:00Z

    Enable PySpark SQL readerwriter and window tests.
[GitHub] spark pull request: [ML] [MLlib] [Docs] Remove fittingParamMap r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6514#issuecomment-107206148 [Test build #33856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33856/consoleFull) for PR 6514 at commit [`d850e0e`](https://github.com/apache/spark/commit/d850e0ee550b28b7238e7c2d70ac720e7390e839).
[GitHub] spark pull request: [ML] [MLlib] [Docs] Updating ML Doc Estimator...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6514#issuecomment-107206061 Merged build triggered.
[GitHub] spark pull request: [ML] [MLlib] [Docs] Updating ML Doc Estimator...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6514#issuecomment-107206066 Merged build started.
[GitHub] spark pull request: [MINOR] Add license for dagre-d3 and graphlib-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6539#issuecomment-107206122 [Test build #33853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33853/consoleFull) for PR 6539 at commit [`82b0475`](https://github.com/apache/spark/commit/82b047526f8a248b4070686e31272f8e6f421962).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class SortOrder(child: Expression, direction: SortDirection) extends Expression`
   * `abstract class BinaryMathExpression(f: (Double, Double) => Double, name: String)`
[GitHub] spark pull request: [MINOR] Add license for dagre-d3 and graphlib-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6539#issuecomment-107206125 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-7555][docs] Add doc for elastic net in ...
Github user coderxiang commented on the pull request: https://github.com/apache/spark/pull/6504#issuecomment-107226690 @jkbradley thanks for the comments. I highlighted the Pipelines API in the revision and included the wiki page.
[GitHub] spark pull request: [SPARK-7555][docs] Add doc for elastic net in ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6504#issuecomment-107226749 Merged build started.
[GitHub] spark pull request: [SPARK-7555][docs] Add doc for elastic net in ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6504#issuecomment-107226739 Merged build triggered.
[GitHub] spark pull request: [MINOR] Add license for dagre-d3 and graphlib-...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6539
[GitHub] spark pull request: [MINOR] Enable PySpark SQL readerwriter and wi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6542#issuecomment-107241516 Merged build started.
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107223911 [Test build #33857 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33857/consoleFull) for PR 6505 at commit [`b6401ba`](https://github.com/apache/spark/commit/b6401ba59cf98cd9218a6f69da9bf089f4ee5240).
[GitHub] spark pull request: [SPARK-3873] [build] Add style checker to enfo...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6502#issuecomment-107242251 By the way, if you make this configurable then I'd model the configuration language / format after IntelliJ's import ordering configuration dialog.
[GitHub] spark pull request: [ML] [MLlib] [Docs] Remove fittingParamMap r...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6514#issuecomment-107225575 Merged build finished. Test PASSed.
[GitHub] spark pull request: [ML] [MLlib] [Docs] Remove fittingParamMap r...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6514#issuecomment-107225572 [Test build #33856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33856/consoleFull) for PR 6514 at commit [`d850e0e`](https://github.com/apache/spark/commit/d850e0ee550b28b7238e7c2d70ac720e7390e839).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7426] [MLlib] [ML] Updated Attribute.fr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6540#issuecomment-107232500 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7985] [ML] [MLlib] [Docs] Remove fitti...
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/6514#discussion_r31393311

--- Diff: docs/ml-guide.md ---
@@ -207,7 +207,7 @@ val model1 = lr.fit(training.toDF)
 // we can view the parameters it used during fit().
 // This prints the parameter (name: value) pairs, where names are unique IDs for this
 // LogisticRegression instance.
-println("Model 1 was fit using parameters: " + model1.fittingParamMap)
+println("Model 1 was fit using parameters: " + model1.extractParamMap)
--- End diff --

I should have noticed before: the right way to access the parameters is really `model1.parent.extractParamMap`, since that gets the Params for the parent Estimator, which could potentially differ from the Model Params. Could you please update these 4 examples? Other than this, the changes look fine. Thanks!
[GitHub] spark pull request: [SPARK-7555][docs] Add doc for elastic net in ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6504#issuecomment-107238979 [Test build #33859 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33859/consoleFull) for PR 6504 at commit [`df5bd14`](https://github.com/apache/spark/commit/df5bd147797409cf0f2c28ac5afa18fa4f62a268).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7555][docs] Add doc for elastic net in ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6504#issuecomment-107238985 Merged build finished. Test PASSed.
[GitHub] spark pull request: [MINOR] Enable PySpark SQL readerwriter and wi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6542#issuecomment-107241508 Merged build triggered.
[GitHub] spark pull request: [MINOR] Enable PySpark SQL readerwriter and wi...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6542#issuecomment-107242078 In principle, we should probably be using a proper test runner like `nose` to handle test discovery and execution. The reason we didn't do this initially is that we need some custom code in `__main__` to emulate a shared SparkContext fixture for the doctests, since putting SparkContext setup and teardown code into each doctest would be very messy and slow. In the medium term, we're going to want to refactor `run-tests` anyway in order to make it easier to run subsets of the tests and Python versions (see #4269, for example).
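The shared-fixture idea in the comment above can be sketched roughly as follows. This is only an illustration, not PySpark's actual `__main__` code: `SharedContext` is a hypothetical stand-in for a SparkContext, and `double_all`, `total`, and `run_doctests` are made-up names. The point it shows is that one fixture is created once, injected into every doctest as `sc`, and torn down at the end.

```python
import doctest


class SharedContext:
    """Hypothetical stand-in for an expensive fixture such as a SparkContext."""

    instances = 0  # counts how many fixtures were ever created

    def __init__(self):
        SharedContext.instances += 1
        self.stopped = False

    def parallelize(self, items):
        # A real context would distribute the data; here we just copy it.
        return list(items)

    def stop(self):
        self.stopped = True


def double_all(sc, items):
    """Double every element.

    >>> double_all(sc, [1, 2, 3])
    [2, 4, 6]
    """
    return [2 * x for x in sc.parallelize(items)]


def total(sc, items):
    """Sum the elements.

    >>> total(sc, [1, 2, 3])
    6
    """
    return sum(sc.parallelize(items))


def run_doctests(functions):
    """Run the doctests of `functions` against a single shared fixture."""
    sc = SharedContext()  # setup happens once, not once per doctest
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    try:
        for func in functions:
            # Each doctest sees the shared fixture under the name `sc`.
            globs = {"sc": sc, func.__name__: func}
            for test in finder.find(func, name=func.__name__, globs=globs):
                runner.run(test)
    finally:
        sc.stop()  # teardown also happens once
    return runner.summarize(verbose=False), sc
```

Calling `run_doctests([double_all, total])` returns the doctest results together with the (stopped) fixture; none of the doctests contains any setup or teardown code of its own.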
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-107246805 Jenkins, this is ok to test.
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-107247403 [Test build #33865 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33865/consoleFull) for PR 6262 at commit [`f552d49`](https://github.com/apache/spark/commit/f552d49127d9e43799d5728f52682a1609fdedb8).
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6541
[GitHub] spark pull request: [ML] [MLlib] [Docs] Remove fittingParamMap r...
Github user dusenberrymw commented on the pull request: https://github.com/apache/spark/pull/6514#issuecomment-107225885 @jkbradley I removed all of the references to `fittingParamMap` throughout Spark, including fixing the equivalent ML example in Java. Thanks!
[GitHub] spark pull request: [SPARK-7555][docs] Add doc for elastic net in ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6504#issuecomment-107226467 Merged build triggered.
[GitHub] spark pull request: [SPARK-7555][docs] Add doc for elastic net in ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6504#issuecomment-107226473 Merged build started.
[GitHub] spark pull request: [SPARK-5482][PySpark] Allow individual test su...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4269#issuecomment-107239038 Another useful refactoring would be to let us run the tests for only one of the supported Python versions instead of all of them. This is useful when augmenting the test suite to collect coverage metrics. To do this, maybe we should split the script into two: one that tests with a particular Python version, and another that loops over the versions and invokes that script.
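A minimal sketch of the two-part split suggested above, written as two Python functions rather than two scripts. The names `run_suite` and `run_all`, and the `runner` parameter, are assumptions made for illustration; they are not taken from Spark's `run-tests`.

```python
import subprocess


def run_suite(python_exec, test_args):
    """Run the tests under a single interpreter and return its exit code."""
    return subprocess.call([python_exec] + list(test_args))


def run_all(python_execs, test_args, runner=run_suite):
    """Loop over the interpreters and invoke the single-version runner for each.

    Returns a dict mapping each interpreter to the exit code it produced.
    """
    return {
        python_exec: runner(python_exec, test_args)
        for python_exec in python_execs
    }
```

Someone collecting coverage metrics for one Python version would call `run_suite` directly, while a CI job would call `run_all` with the full list of interpreters. The `runner` argument exists only so that the loop can be exercised without spawning processes.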
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107239837 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6541#issuecomment-107247683 **[Test build #33861 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33861/consoleFull)** for PR 6541 at commit [`f72ebe4`](https://github.com/apache/spark/commit/f72ebe402a5c1a4d755b9941863688bb205c2010) after a configured wait of `150m`.
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6541#issuecomment-107247689 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-6028][Core][WIP] A new RPC implemetatio...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6457#issuecomment-107212614 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-6028][Core][WIP] A new RPC implemetatio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6457#issuecomment-107212610 [Test build #33854 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33854/consoleFull) for PR 6457 at commit [`879ecd5`](https://github.com/apache/spark/commit/879ecd57c10090723f2253df5ff94c490f95dd12).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class MessageLoop extends Runnable`
   * `class NettyRpcEnvFactory extends RpcEnvFactory with Logging`
   * `class NettyRpcEndpointRef(@transient conf: SparkConf)`
   * `class NettyRpcHandler(`
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6541#issuecomment-107233246 [Test build #33861 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33861/consoleFull) for PR 6541 at commit [`f72ebe4`](https://github.com/apache/spark/commit/f72ebe402a5c1a4d755b9941863688bb205c2010).
[GitHub] spark pull request: [SQL] Simplifies binary node pattern matching
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/6537#discussion_r31393869
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -127,20 +127,20 @@ trait HiveTypeCoercion {
       case e if !e.childrenResolved => e
       /* Double Conversions */
-      case b: BinaryExpression if b.left == stringNaN && b.right.dataType == DoubleType =>
-        b.makeCopy(Array(b.right, Literal(Double.NaN)))
-      case b: BinaryExpression if b.left.dataType == DoubleType && b.right == stringNaN =>
-        b.makeCopy(Array(Literal(Double.NaN), b.left))
-      case b: BinaryExpression if b.left == stringNaN && b.right == stringNaN =>
-        b.makeCopy(Array(Literal(Double.NaN), b.left))
+      case b @ BinaryExpression(StringNaN, r @ DoubleType()) =>
+        b.makeCopy(Array(r, Literal(Double.NaN)))
+      case b @ BinaryExpression(l @ DoubleType(), StringNaN) =>
+        b.makeCopy(Array(Literal(Double.NaN), l))
       /* Float Conversions */
-      case b: BinaryExpression if b.left == stringNaN && b.right.dataType == FloatType =>
+      case b @ BinaryExpression(StringNaN, r @ FloatType()) =>
         b.makeCopy(Array(b.right, Literal(Float.NaN)))
-      case b: BinaryExpression if b.left.dataType == FloatType && b.right == stringNaN =>
-        b.makeCopy(Array(Literal(Float.NaN), b.left))
-      case b: BinaryExpression if b.left == stringNaN && b.right == stringNaN =>
-        b.makeCopy(Array(Literal(Float.NaN), b.left))
+      case b @ BinaryExpression(l @ FloatType(), StringNaN) =>
--- End diff --
unrelated to this pr, but i find `l` somewhat hard to differentiate from `1`. might be better in the future to use left vs right
[GitHub] spark pull request: [SPARK-7978] [SQL] [PYSPARK] DecimalType shoul...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6532#issuecomment-107241920 As an experiment, I put together some code to run the PySpark test suite through `coverage.py` (https://gist.github.com/JoshRosen/60d590b1cdc271d332e5) and it turns out that line + branch coverage for the `DecimalType` class itself doesn't uncover this bug since it won't catch the fact that the constructor is only ever called with defaults for both of its arguments. The closest thing to a red flag in the coverage report was the fact that the only call to `DecimalType` with arguments, inside of `_parse_datatype_json_value`, wasn't hit by the tests.
[GitHub] spark pull request: [MINOR] Enable PySpark SQL readerwriter and wi...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6542#issuecomment-107241930 I wonder if we should create a simple linter rule in lint-python to check for stuff like this ...
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-107247020 Merged build triggered.
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-107247022 Merged build started.
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6541#issuecomment-107247868 Actually I'm going to merge this since it's already passed the style checker stage.
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6541#issuecomment-107247843 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-7978] [SQL] [PYSPARK] DecimalType shoul...
Github user airhorns commented on the pull request: https://github.com/apache/spark/pull/6532#issuecomment-107224842 Maybe another systematic fix is to not allow the singleton-ing of any Type that takes constructor args? At subclass time can you check that?
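The subclass-time check suggested in this comment could be sketched roughly as below. This is only an illustration under stated assumptions: `StrictSingletonMeta` and `ToyStringType` are hypothetical names invented here, the code uses Python 3 metaclass syntax, and it is not PySpark's actual implementation. The idea is that the metaclass inspects `__init__` when a class is defined and raises `TypeError` if the constructor accepts anything beyond `self`.

```python
import inspect


class StrictSingletonMeta(type):
    """Hypothetical singleton metaclass that refuses parameterized classes."""

    # One cached instance per class, shared through the metaclass.
    _instances = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Classes that inherit object.__init__ take no constructor arguments.
        if cls.__init__ is object.__init__:
            return
        # Drop the leading `self`; anything left is a constructor argument.
        extra = list(inspect.signature(cls.__init__).parameters)[1:]
        if extra:
            raise TypeError(
                "%s takes constructor args %s and cannot be a singleton"
                % (name, extra)
            )

    def __call__(cls):
        # Create the single instance lazily, then always return it.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__()
        return cls._instances[cls]


class ToyStringType(metaclass=StrictSingletonMeta):
    """Toy stand-in for a type without constructor arguments."""
```

With this sketch, `ToyStringType()` always returns the same object, while defining a class whose `__init__` takes, say, `precision` and `scale` fails at class-definition time instead of silently sharing one instance.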
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107239416 [Test build #33863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33863/consoleFull) for PR 6505 at commit [`77f0f39`](https://github.com/apache/spark/commit/77f0f39fdbe8a5858e38a79f72e131bed23e385e).
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107248961 [Test build #33863 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33863/consoleFull) for PR 6505 at commit [`77f0f39`](https://github.com/apache/spark/commit/77f0f39fdbe8a5858e38a79f72e131bed23e385e). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107248964 Merged build finished. Test PASSed.
[GitHub] spark pull request: [Spark-7983] [MLlib] Add require for one-based...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6538#issuecomment-107225342 [Test build #33855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33855/consoleFull) for PR 6538 at commit [`9956365`](https://github.com/apache/spark/commit/9956365ae578ac0e1bff8c84104e26e71bf5116c). * This patch **passes all tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [Spark-7983] [MLlib] Add require for one-based...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6538#issuecomment-107225358 Build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-7555][docs] Add doc for elastic net in ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6504#issuecomment-107226789 [Test build #33859 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33859/consoleFull) for PR 6504 at commit [`df5bd14`](https://github.com/apache/spark/commit/df5bd147797409cf0f2c28ac5afa18fa4f62a268).
[GitHub] spark pull request: [SPARK-7978] [SQL] [PYSPARK] DecimalType shoul...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6532#issuecomment-107231326 @davies, I agree that the test you added here acts as a proper regression test. My comment was more to suggest that we could have prevented this regression in the first place with a relatively simple test that just tries to instantiate each data type with all of its constructor arguments. The fact that this bug evaded unit tests implies that our existing unit tests didn't create DecimalTypes with any constructor arguments, implying that our test coverage of decimal-related code might be insufficient. I think that this patch is fine, but for 1.5 we should make a dedicated effort to improve Python's test coverage.
@airhorns, do you mean that the singleton metaclass would act as a no-op when applied to Types that take constructor arguments or that it would throw an exception if applied to those types? This is purely academic at this point, but I can imagine some contrived scenarios where the no-op behavior might be confusing: what if I had a class which accepted constructor parameters, then created a subclass which called its superclass constructor with constant values for those parameters? In this case, the subclass can be a singleton but the superclass can't. To avoid having to reason about these corner-cases, maybe it's better to just accept a bit of verbosity and use decorators instead. We shouldn't do that for this patch, though; we can leave it as a followup for 1.5.
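The "relatively simple test" described in this comment might look something like the sketch below. It is a toy model, not PySpark code: `NaiveSingleton`, `BuggyDecimal`, `PlainDecimal`, and `constructor_args_are_kept` are hypothetical names made up for illustration, and the helper assumes each constructor argument is stored under an attribute of the same name. The helper first builds the type with defaults (as existing tests did) and then with explicit arguments, so a singleton that ignores arguments is exposed.

```python
class NaiveSingleton(type):
    """Toy metaclass: caches one instance per class and ignores arguments."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Only the first call constructs; later arguments are silently dropped.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class BuggyDecimal(metaclass=NaiveSingleton):
    """Toy parameterized type that is wrongly made a singleton."""

    def __init__(self, precision=None, scale=None):
        self.precision = precision
        self.scale = scale


class PlainDecimal:
    """Same toy type without the singleton metaclass."""

    def __init__(self, precision=None, scale=None):
        self.precision = precision
        self.scale = scale


def constructor_args_are_kept(cls, **kwargs):
    """Return True if cls(**kwargs) keeps every argument as an attribute."""
    cls()  # mimic earlier code that already built the type with defaults
    instance = cls(**kwargs)
    return all(getattr(instance, key) == value for key, value in kwargs.items())
```

Under these assumptions, `constructor_args_are_kept(PlainDecimal, precision=10, scale=2)` holds, while the same call on `BuggyDecimal` does not, because the cached default instance is returned. Line and branch coverage of either class would look identical, which is why a check on the arguments themselves is needed.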
[GitHub] spark pull request: [SPARK-7426] [MLlib] [ML] Updated Attribute.fr...
GitHub user dusenberrymw opened a pull request: https://github.com/apache/spark/pull/6540 [SPARK-7426] [MLlib] [ML] Updated Attribute.fromStructField to allow any NumericType. Updated `Attribute.fromStructField` to allow any `NumericType`, rather than just `DoubleType`, and added unit tests for a few of the other NumericTypes. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dusenberrymw/spark SPARK-7426_AttributeFactory.fromStructField_Should_Allow_NumericTypes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6540.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6540 commit 87fecb3e6c5476554517873c79b65a7efde606a3 Author: Mike Dusenberry dusenberr...@gmail.com Date: 2015-05-31T18:13:31Z Updated Attribute.fromStructField to allow any NumericType, rather than just DoubleType, and added unit tests for a few of the other NumericTypes.
[GitHub] spark pull request: [SPARK-7978] [SQL] [PYSPARK] DecimalType shoul...
Github user airhorns commented on the pull request: https://github.com/apache/spark/pull/6532#issuecomment-107232319 @JoshRosen sounds good to me!
[GitHub] spark pull request: [SPARK-7985] [ML] [MLlib] [Docs] Remove fitti...
Github user dusenberrymw commented on the pull request: https://github.com/apache/spark/pull/6514#issuecomment-107232304 I also just created a JIRA to attach this to since it ended up being more than a single commit.
[GitHub] spark pull request: [SPARK-3850] Trim trailing spaces for MLlib.
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6534
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6541#issuecomment-107233130 Merged build started.
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6541#issuecomment-107233124 Merged build triggered.
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/6541 [SPARK-3850] Turn style checker on for trailing whitespaces. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark trailing-whitespace-on Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6541.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6541 commit f72ebe402a5c1a4d755b9941863688bb205c2010 Author: Reynold Xin r...@databricks.com Date: 2015-05-31T18:45:44Z [SPARK-3850] Turn style checker on for trailing whitespaces.
[GitHub] spark pull request: [SPARK-7979] Enforce structural type checker.
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/6536#discussion_r31393163
--- Diff: examples/src/main/scala/org/apache/spark/examples/mllib/DecisionTreeRunner.scala ---
@@ -354,7 +353,11 @@ object DecisionTreeRunner {
   /**
    * Calculates the mean squared error for regression.
+   *
+   * This is just for demo purpose. In general, don't copy this code because it is NOT efficient
--- End diff --
Good to know. I'll make a note to remove it later.
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107233819 [Test build #33857 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33857/consoleFull) for PR 6505 at commit [`b6401ba`](https://github.com/apache/spark/commit/b6401ba59cf98cd9218a6f69da9bf089f4ee5240). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107239309 Merged build triggered.
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107239315 Merged build started.
[GitHub] spark pull request: [SQL] Simplifies binary node pattern matching
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/6537#discussion_r31393883
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -127,20 +127,20 @@ trait HiveTypeCoercion {
       case e if !e.childrenResolved => e
       /* Double Conversions */
-      case b: BinaryExpression if b.left == stringNaN && b.right.dataType == DoubleType =>
-        b.makeCopy(Array(b.right, Literal(Double.NaN)))
-      case b: BinaryExpression if b.left.dataType == DoubleType && b.right == stringNaN =>
-        b.makeCopy(Array(Literal(Double.NaN), b.left))
-      case b: BinaryExpression if b.left == stringNaN && b.right == stringNaN =>
-        b.makeCopy(Array(Literal(Double.NaN), b.left))
+      case b @ BinaryExpression(StringNaN, r @ DoubleType()) =>
+        b.makeCopy(Array(r, Literal(Double.NaN)))
--- End diff --
let's do the minimum thing needed and not switch the order here. No need to confuse the future user why orders are switched.
[GitHub] spark pull request: [MINOR] Enable PySpark SQL readerwriter and wi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6542#issuecomment-107241632 [Test build #33864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33864/consoleFull) for PR 6542 at commit [`9f46ce4`](https://github.com/apache/spark/commit/9f46ce41ac79080cf18a6d8496cd18a2dc878e37).
[GitHub] spark pull request: [MINOR] Enable PySpark SQL readerwriter and wi...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6542#issuecomment-107242142 Integrating my coverage reporting harness into the Jenkins builds would also help to catch this problem, since that makes it really obvious when test code isn't run.
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-107246838 Can you add a regression test for this in `python/pyspark/tests.py`?
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-107247548 [Test build #33865 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33865/consoleFull) for PR 6262 at commit [`f552d49`](https://github.com/apache/spark/commit/f552d49127d9e43799d5728f52682a1609fdedb8). * This patch **fails Python style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/6541#discussion_r31394463
--- Diff: scalastyle-config.xml ---
@@ -50,6 +50,9 @@
    */]]></parameter>
   </parameters>
  </check>
+
+ <check level="error" class="org.scalastyle.file.WhitespaceEndOfLineChecker" enabled="true"></check>
--- End diff --
I have a separate thing to rewrite this whole file later.
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/6541#discussion_r31394456
--- Diff: scalastyle-config.xml ---
@@ -50,6 +50,9 @@
    */]]></parameter>
   </parameters>
  </check>
+
+ <check level="error" class="org.scalastyle.file.WhitespaceEndOfLineChecker" enabled="true"></check>
--- End diff --
indent is weird
[GitHub] spark pull request: [SPARK-7227][SPARKR] Support fillna / dropna i...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/6183#issuecomment-107247487 I thought I was waiting for an update from @sun-rui, but actually my comments have been addressed. There is a minor R documentation change, but I can do that with the merge. LGTM.
[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-107247553 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7937][SQL] Support comparison on Struct...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/6519#issuecomment-107254789 Also, is there any way to do this without so much new pattern matching in the critical path and duplicated code?
[GitHub] spark pull request: [SPARK-7862][SQL]Fix the deadlock in script tr...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/6404#issuecomment-107257613 /cc @chenghao-intel
[GitHub] spark pull request: [SPARK-7824][SQL]Collapsing operator reorderin...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6351#issuecomment-107257589 [Test build #33867 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33867/consoleFull) for PR 6351 at commit [`a04ffae`](https://github.com/apache/spark/commit/a04ffae9a10dc943d16e0f2bf3d4371cf37226ca). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7824][SQL]Collapsing operator reorderin...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6351#issuecomment-107257590 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7691] [SQL] Refactor CatalystTypeConver...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6222#issuecomment-107261643 Merged build triggered.
[GitHub] spark pull request: [SPARK-7691] [SQL] Refactor CatalystTypeConver...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6222#issuecomment-107261630 Alright, pushed a commit to document the Option-handling semantics as well as to address the method dispatch issues for primitive types. This should be ready for review now. /cc @davies, you might want to look at this also.
[GitHub] spark pull request: [SPARK-7691] [SQL] Refactor CatalystTypeConver...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6222#issuecomment-107261652 Merged build started.
[GitHub] spark pull request: [SPARK-7986] Split scalastyle config into 3 se...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/6543
[GitHub] spark pull request: [SPARK-7824][SQL]Collapsing operator reorderin...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6351#issuecomment-107277829 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7824][SQL]Collapsing operator reorderin...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6351#issuecomment-107277827 [Test build #33869 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33869/consoleFull) for PR 6351 at commit [`ae3af6d`](https://github.com/apache/spark/commit/ae3af6d7614c3c7ad3d0407393cffc4796d04f2f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7938][BUILD]Use Google ErrorProne durin...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/6515#issuecomment-107279699 Thanks @yijieshen. Just tried this locally and it worked. Can you change the report level from warning to error, and fix the few instances of warnings? The reason is I don't think it'd be very useful to run this at warning level, since there are so many warning messages that people will just ignore them. The only way to make this useful is to fail the build if there are violations.
[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/6409#discussion_r31397611 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -849,6 +852,27 @@ private[spark] class Client( } } } + + private def cleanupStagingDir(): Unit = { --- End diff -- Yes, we need to refactor, thank you!
[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6409#issuecomment-107280850 Merged build triggered.
[GitHub] spark pull request: [SPARK-7938][BUILD]Use Google ErrorProne durin...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6515#issuecomment-107280681 Since this is now guarded behind a special Maven profile, we should also make sure to update the Jenkins build to run with this enabled. AFAIK the Jenkins Maven builds don't actually run through run-tests-jenkins so we'll have to log in and make the changes via the Jenkins admin panel.
[GitHub] spark pull request: [SPARK-7889] make sure click the App ID on H...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6545#issuecomment-107285299 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/6505#issuecomment-107288766 Is it ready to go? cc @rxin @liancheng @yhuai
[GitHub] spark pull request: [SPARK-7699][Core] Lazy start the scheduler fo...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/6430#discussion_r31398962 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -262,15 +267,22 @@ private[spark] class ExecutorAllocationManager( val maxNeeded = maxNumExecutorsNeeded if (maxNeeded < numExecutorsTarget) { - // The target number exceeds the number we actually need, so stop adding new - // executors and inform the cluster manager to cancel the extra pending requests - val oldNumExecutorsTarget = numExecutorsTarget - numExecutorsTarget = math.max(maxNeeded, minNumExecutors) - client.requestTotalExecutors(numExecutorsTarget) - numExecutorsToAdd = 1 - logInfo(s"Lowering target number of executors to $numExecutorsTarget because " + - s"not all requests are actually needed (previously $oldNumExecutorsTarget)") - numExecutorsTarget - oldNumExecutorsTarget + if (!numTargetExecutorAdjustable.get) { + // Keep the initial number of target executor to not ramp down until the first job is + // submitted or the first idle executor is released. + client.requestTotalExecutors(numExecutorsTarget) + 0 + } else { + // The target number exceeds the number we actually need, so stop adding new + // executors and inform the cluster manager to cancel the extra pending requests + val oldNumExecutorsTarget = numExecutorsTarget + numExecutorsTarget = math.max(maxNeeded, minNumExecutors) + client.requestTotalExecutors(numExecutorsTarget) --- End diff -- @sryza and @andrewor14, do we also need to avoid requesting executors here and in `addExecutors` when `oldNumExecutorsTarget` == `numExecutorsTarget` and we are not in the initializing state?
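The question above can be illustrated with a minimal sketch that skips the cluster-manager call when the target does not actually change. `AllocationClient`, `RecordingClient`, and `TargetTracker` are hypothetical names made up for this illustration; they are not Spark's real classes, and the sketch assumes the initial target is at least `minNumExecutors`.

```scala
// Hypothetical stand-in for the cluster-manager client; not Spark's real interface.
trait AllocationClient {
  def requestTotalExecutors(total: Int): Boolean
}

// Records every request so the effect of the guard is observable.
final class RecordingClient extends AllocationClient {
  val requests = scala.collection.mutable.ArrayBuffer.empty[Int]
  def requestTotalExecutors(total: Int): Boolean = { requests += total; true }
}

// Toy tracker for the executor target; assumes initialTarget >= minNumExecutors.
final class TargetTracker(client: AllocationClient, minNumExecutors: Int, initialTarget: Int) {
  private var numExecutorsTarget = initialTarget

  def target: Int = numExecutorsTarget

  /** Lowers the target towards maxNeeded and returns the (non-positive) change. */
  def lowerTarget(maxNeeded: Int): Int = {
    val oldTarget = numExecutorsTarget
    val newTarget = math.max(maxNeeded, minNumExecutors)
    if (maxNeeded >= oldTarget || newTarget == oldTarget) {
      0 // the target is unchanged, so no request is sent
    } else {
      numExecutorsTarget = newTarget
      client.requestTotalExecutors(newTarget)
      newTarget - oldTarget
    }
  }
}
```

Whether skipping the request is safe in the real `ExecutorAllocationManager` depends on whether the cluster manager relies on repeated requests, which this sketch does not model.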
[GitHub] spark pull request: [SPARK-7989][Core][Tests] Fix flaky tests in E...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/6546#issuecomment-107293380 [Test build #33872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33872/consoleFull) for PR 6546 at commit [`3b69840`](https://github.com/apache/spark/commit/3b69840cf7dadf43667f59d2687c1fd257b5c0e4).
[GitHub] spark pull request: [SPARK-6126][SQL] enforce UDTs as well in Json...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/6193#discussion_r31399451 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/UserDefinedType.scala --- @@ -53,7 +53,12 @@ abstract class UserDefinedType[UserType] extends DataType with Serializable { */ def serialize(obj: Any): Any - /** Convert a SQL datum to the user type */ + /** + * Convert a SQL datum to the user type + * + * This method may be called with an already deserialized datum, so it should be able to + * handle values of UserType too. + */ --- End diff -- Where do we call it on a deserialized datum?
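To make the doc comment in the diff concrete, here is a rough sketch of a `deserialize` that tolerates both a serialized datum and an already deserialized value. `Point` and `PointConverter` are hypothetical and only mirror the `serialize`/`deserialize` pair; they do not extend Spark's `UserDefinedType`, and the two-element `Seq[Double]` encoding is an assumption made for the example.

```scala
// Hypothetical user type; the serialized form is assumed to be a two-element Seq[Double].
final case class Point(x: Double, y: Double)

// Toy converter mirroring a serialize/deserialize pair; not Spark's UserDefinedType API.
object PointConverter {
  def serialize(obj: Any): Any = obj match {
    case Point(x, y) => Seq(x, y)
    case other => throw new IllegalArgumentException(s"Cannot serialize: $other")
  }

  def deserialize(datum: Any): Point = datum match {
    case p: Point => p // already deserialized, so pass it through unchanged
    case Seq(x: Double, y: Double) => Point(x, y) // serialized form
    case other => throw new IllegalArgumentException(s"Cannot deserialize: $other")
  }
}
```

The pass-through case makes `deserialize` idempotent, which is what the proposed doc comment asks of implementers; the review question of where such a call actually happens is left open here.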
[GitHub] spark pull request: [HOT FIX] [BUILD] Fix maven build failures
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6511#issuecomment-107300035 I'm running into Maven build breaks due to missing test-jar dependencies in the `sql`, `mllib` and `flume-sink` projects. Why did you remove the test dependency? I can understand that it wouldn't transitively inherit the test deps. of a project that it depends on, but what's the harm in explicitly adding a test dep? Also, why do you think that the Jenkins build didn't catch the local Maven test compilation issues that I saw when running make-distribution.sh? Do you think that this is something that's Maven version specific?
[GitHub] spark pull request: [HOT FIX] [BUILD] Fix maven build failures
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/6511#issuecomment-107300943 False alarm: I was building off a slightly-out-of-date branch. Just saw the comment on your other commit; I see now that the issue is the fact that Spark Core's own dependencies aren't pulled in via the test JAR, so the test class itself is missing the deps. it needs. Makes sense.
[GitHub] spark pull request: [SPARK-7952][SPARK-7984][SQL] equality check b...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/6505#discussion_r31399878 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala --- @@ -482,30 +482,66 @@ trait HiveTypeCoercion { } /** - * Changes Boolean values to Bytes so that expressions like true < false can be Evaluated. + * Changes numeric values to booleans so that expressions like true = 1 can be evaluated. */ - object BooleanComparisons extends Rule[LogicalPlan] { -val trueValues = Seq(1, 1L, 1.toByte, 1.toShort, new java.math.BigDecimal(1)).map(Literal(_)) -val falseValues = Seq(0, 0L, 0.toByte, 0.toShort, new java.math.BigDecimal(0)).map(Literal(_)) + object BooleanEqualization extends Rule[LogicalPlan] { +private val trueValues = Seq(1.toByte, 1.toShort, 1, 1L, new java.math.BigDecimal(1)) +private val falseValues = Seq(0.toByte, 0.toShort, 0, 0L, new java.math.BigDecimal(0)) + +private def buildCaseKeyWhen(booleanExpr: Expression, numericExpr: Expression) = { + CaseKeyWhen(numericExpr, Seq( +Literal(trueValues.head), booleanExpr, +Literal(falseValues.head), Not(booleanExpr), +Literal(false))) +} + +private def transform(booleanExpr: Expression, numericExpr: Expression) = { + If(Or(IsNull(booleanExpr), IsNull(numericExpr)), +Literal.create(null, BooleanType), +buildCaseKeyWhen(booleanExpr, numericExpr)) +} + +private def transformNullSafe(booleanExpr: Expression, numericExpr: Expression) = { + CaseWhen(Seq( +And(IsNull(booleanExpr), IsNull(numericExpr)), Literal(true), +Or(IsNull(booleanExpr), IsNull(numericExpr)), Literal(false), +buildCaseKeyWhen(booleanExpr, numericExpr) + )) +} def apply(plan: LogicalPlan): LogicalPlan = plan transformAllExpressions { // Skip nodes who's children have not been resolved yet. case e if !e.childrenResolved => e - // Hive treats (true = 1) as true and (false = 0) as true. 
- case EqualTo(l @ BooleanType(), r) if trueValues.contains(r) => l - case EqualTo(l, r @ BooleanType()) if trueValues.contains(l) => r - case EqualTo(l @ BooleanType(), r) if falseValues.contains(r) => Not(l) - case EqualTo(l, r @ BooleanType()) if falseValues.contains(l) => Not(r) - - // No need to change other EqualTo operators as that actually makes sense for boolean types. - case e: EqualTo => e - // No need to change the EqualNullSafe operators, too - case e: EqualNullSafe => e - // Otherwise turn them to Byte types so that there exists and ordering. - case p: BinaryComparison if p.left.dataType == BooleanType && - p.right.dataType == BooleanType => -p.makeCopy(Array(Cast(p.left, ByteType), Cast(p.right, ByteType))) + // Hive treats (true = 1) as true and (false = 0) as true, + // all other cases are considered as false. + + // We may simplify the expression if one side is literal numeric values + case EqualTo(l @ BooleanType(), Literal(value, _: NumericType)) +if trueValues.contains(value) => l + case EqualTo(l @ BooleanType(), Literal(value, _: NumericType)) +if falseValues.contains(value) => Not(l) + case EqualTo(Literal(value, _: NumericType), r @ BooleanType()) +if trueValues.contains(value) => r + case EqualTo(Literal(value, _: NumericType), r @ BooleanType()) +if falseValues.contains(value) => Not(r) + case EqualNullSafe(l @ BooleanType(), Literal(value, _: NumericType)) +if trueValues.contains(value) => And(IsNotNull(l), l) + case EqualNullSafe(l @ BooleanType(), Literal(value, _: NumericType)) +if falseValues.contains(value) => And(IsNotNull(l), Not(l)) + case EqualNullSafe(Literal(value, _: NumericType), r @ BooleanType()) +if trueValues.contains(value) => And(IsNotNull(r), r) + case EqualNullSafe(Literal(value, _: NumericType), r @ BooleanType()) --- End diff -- I wonder if it's not worthwhile to have a custom extractor for equality checking. It seems there might be more cases where either `==` or `<=>` should match.
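One way to picture the custom extractor suggested above is the sketch below, which uses a toy expression tree rather than Catalyst's classes. `Equality`, `BoolVsNum`, `Simplify`, and the small `Expr` hierarchy are all hypothetical names introduced for illustration; only the idea of matching both equality operators with one pattern comes from the review comment.

```scala
// Toy expression tree; these are NOT Catalyst's Expression classes.
sealed trait Expr
final case class BoolRef(name: String) extends Expr
final case class NumLit(value: Int) extends Expr
final case class Not(child: Expr) extends Expr
final case class And(left: Expr, right: Expr) extends Expr
final case class IsNotNull(child: Expr) extends Expr
final case class EqualTo(left: Expr, right: Expr) extends Expr
final case class EqualNullSafe(left: Expr, right: Expr) extends Expr

// Matches either equality operator and reports whether it is the null-safe one.
object Equality {
  def unapply(e: Expr): Option[(Expr, Expr, Boolean)] = e match {
    case EqualTo(l, r) => Some((l, r, false))
    case EqualNullSafe(l, r) => Some((l, r, true))
    case _ => None
  }
}

// Matches a boolean reference compared with a numeric literal, in either order.
object BoolVsNum {
  def unapply(p: (Expr, Expr)): Option[(BoolRef, Int)] = p match {
    case (b: BoolRef, NumLit(v)) => Some((b, v))
    case (NumLit(v), b: BoolRef) => Some((b, v))
    case _ => None
  }
}

object Simplify {
  def apply(e: Expr): Expr = e match {
    case Equality(l, r, nullSafe) =>
      (l, r) match {
        case BoolVsNum(b, 1) => guard(b, b, nullSafe)
        case BoolVsNum(b, 0) => guard(b, Not(b), nullSafe)
        case _ => e
      }
    case other => other
  }

  // The null-safe form additionally requires the boolean side to be non-null.
  private def guard(b: Expr, result: Expr, nullSafe: Boolean): Expr =
    if (nullSafe) And(IsNotNull(b), result) else result
}
```

With the two extractors, the eight near-identical cases in the diff collapse to two in this toy setting; whether the same shape fits Catalyst's `Literal` and `BooleanType` extractors is an assumption that would need checking against the real code.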
[GitHub] spark pull request: [SPARK-3850] Turn style checker on for trailin...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/6541#discussion_r31394780 --- Diff: scalastyle-config.xml --- @@ -50,6 +50,9 @@ */]]></parameter> </parameters> </check> + + <check level="error" class="org.scalastyle.file.WhitespaceEndOfLineChecker" enabled="true"></check> --- End diff -- Here you go https://github.com/apache/spark/pull/6543
[GitHub] spark pull request: [SPARK-7986] Split scalastyle config into 3 se...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/6543 [SPARK-7986] Split scalastyle config into 3 sections. (1) rules that we enforce. (2) rules that we would like to enforce, but haven't cleaned up the codebase to turn on yet (or we need to make the scalastyle rule more configurable). (3) rules that we don't want to enforce. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark scalastyle Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/6543.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6543 commit beefaabd14c2c6938d318179afe034403a6f71aa Author: Reynold Xin r...@databricks.com Date: 2015-05-31T21:58:23Z [SPARK-7986] Split scalastyle config into 3 sections. (1) rules that we enforce. (2) rules that we would like to enforce, but haven't cleaned up the codebase to turn on yet (or we need to make the scalastyle rule more configurable). (3) rules that we don't want to enforce.
[GitHub] spark pull request: [SPARK-7562][SPARK-6444][SQL] Improve error re...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/6405#discussion_r31394776 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -125,7 +133,13 @@ case class GroupExpression(children: Seq[Expression]) extends Expression { * so that the proper type conversions can be performed in the analyzer. */ trait ExpectsInputTypes { + self: Expression => def expectedChildTypes: Seq[DataType] + override def checkInputDataTypes: TypeCheckResult = { +// We will always do type casting for `ExpectsInputTypes` in `HiveTypeCoercion`, +// so type mismatch error won't be reported here, but for underling `Cast`s. --- End diff -- Seems this will result in a confusing error, since it will complain about casts that the user does not see in their query.