[GitHub] spark pull request: [SPARK-8968] [SQL] external sort by the partit...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/7336 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11171][SPARK-11237][SPARK-11241][ML] Tr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9207#issuecomment-173423355 **[Test build #49837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49837/consoleFull)** for PR 9207 at commit [`b514421`](https://github.com/apache/spark/commit/b514421683170d8c29ee3d39cb50abb59ff74816).
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173424934 **[Test build #49838 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49838/consoleFull)** for PR 10854 at commit [`39f21de`](https://github.com/apache/spark/commit/39f21de507271314c1b08f9d6a9c0fc0a12396a4).
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-173428606 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-173428607 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49841/ Test FAILed.
[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...
Github user sun-rui commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10480#discussion_r50353509

    --- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala ---
    @@ -355,6 +355,13 @@ private[spark] object SerDe {
     writeInt(dos, v.length)
     v.foreach(elem => writeObject(dos, elem))
    +// Handle Properties
    --- End diff --

    I think 2 is acceptable to me.
[GitHub] spark pull request: [SPARK-6166] Limit number of concurrent outbou...
Github user wzhfy commented on the pull request: https://github.com/apache/spark/pull/10838#issuecomment-173437752 Hi @redsanket, in what situation will the number of requests become very large?
[GitHub] spark pull request: [SPARK-12204][SPARKR] Implement drop method fo...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/10201#issuecomment-173440987 LGTM
[GitHub] spark pull request: SPARK-12948. [SQL]. Consider reducing size of ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10861#issuecomment-173456430 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-12914] [SQL] generate aggregation with ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10855#issuecomment-173468280 **[Test build #49862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49862/consoleFull)** for PR 10855 at commit [`7880786`](https://github.com/apache/spark/commit/788078668795458aa29a55d18e2b23686992df8d).
[GitHub] spark pull request: [SPARK-12828][SQL]add natural join support
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10762#issuecomment-173468573 **[Test build #49859 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49859/consoleFull)** for PR 10762 at commit [`afb60a5`](https://github.com/apache/spark/commit/afb60a59fba17b07f5c744b4e961a2531eceef27).
[GitHub] spark pull request: [SPARK-12875] [ML] Add Weight of Evidence and ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10803#issuecomment-173468729 **[Test build #49858 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49858/consoleFull)** for PR 10803 at commit [`762e091`](https://github.com/apache/spark/commit/762e091014b9d5866d5e0345f4220dfbab119f5a).
[GitHub] spark pull request: [SPARK-12469][CORE][RFC/WIP] Add Consistent Ac...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10841#issuecomment-173472830 **[Test build #49849 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49849/consoleFull)** for PR 10841 at commit [`5ff6b6a`](https://github.com/apache/spark/commit/5ff6b6a6fce974735fd0ad4450fdc382ce8590f9).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12896] [WIP] Send only accumulator upda...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10857#issuecomment-173422710 **[Test build #49835 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49835/consoleFull)** for PR 10857 at commit [`abde0ed`](https://github.com/apache/spark/commit/abde0ed5be9945e8f089ae775daf4d9cc852e8a4).
[GitHub] spark pull request: [SPARK-12469][CORE][RFC/WIP] Add Consistent Ac...
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/10841#issuecomment-173425544 That's a good point, @codingcat. Although the old behavior for regular accumulators is the same, I'll add a note about the new API option.
[GitHub] spark pull request: [SPARK-12872][SQL] Support to specify the opti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10858#issuecomment-173425580 **[Test build #49839 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49839/consoleFull)** for PR 10858 at commit [`fbeee56`](https://github.com/apache/spark/commit/fbeee5673a37795a9b2387b069c0663485d0a252).
[GitHub] spark pull request: [SPARK-8968] [SQL] external sort by the partit...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/7336#issuecomment-173425609 Fixed by https://github.com/apache/spark/commit/d60f8d74ace5670b1b451a0ea0b93d3b9775bb52.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173433326 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49827/ Test PASSed.
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-173433597 **[Test build #49844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49844/consoleFull)** for PR 10731 at commit [`e61429f`](https://github.com/apache/spark/commit/e61429fec35c0f0983ff5e1bfeea11a1cef42690).
[GitHub] spark pull request: [SPARK-12926][SQL] SQLContext to disallow user...
Github user tejasapatil commented on the pull request: https://github.com/apache/spark/pull/10849#issuecomment-173436587 @marmbrus : Makes sense. I have updated the diff with your suggestion.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173438176 Build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173438177 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49848/ Test FAILed.
[GitHub] spark pull request: Branch 1.6
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10859#issuecomment-173438549 @chetansomani can you close this pull request?
[GitHub] spark pull request: [SPARK-12902] [SQL] visualization for generate...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10828#issuecomment-173440035 **[Test build #2427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2427/consoleFull)** for PR 10828 at commit [`f5c9087`](https://github.com/apache/spark/commit/f5c90878514d6346cf9229daacac0963ea794713).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11622][MLLIB] Make LibSVMRelation exten...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9595#issuecomment-173449229 **[Test build #49851 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49851/consoleFull)** for PR 9595 at commit [`0d6d06d`](https://github.com/apache/spark/commit/0d6d06dc7aa98f2f2e6a1fb20d0af59f31ae4531).
[GitHub] spark pull request: [SPARK-11622][MLLIB] Make LibSVMRelation exten...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9595#issuecomment-173450415 **[Test build #49853 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49853/consoleFull)** for PR 9595 at commit [`8a2c96f`](https://github.com/apache/spark/commit/8a2c96fc28021e28c8009d4c58ce3d94e9227683).
[GitHub] spark pull request: [SPARK-11965] [ML] [Doc] Update user guide for...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10222#issuecomment-173450319 **[Test build #49850 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49850/consoleFull)** for PR 10222 at commit [`7def89a`](https://github.com/apache/spark/commit/7def89a93c5a533f03368923e150f1973d5ac19d).
[GitHub] spark pull request: [SPARK-12689][SQL] Migrate DDL parsing to the ...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10723#discussion_r50361063

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
    @@ -140,6 +140,7 @@ private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) extends
     case Token("TOK_BOOLEAN", Nil) => BooleanType
     case Token("TOK_STRING", Nil) => StringType
     case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    +case Token("TOK_CHAR", Token(_, Nil) :: Nil) => StringType
    --- End diff --

    We need to support Char type.
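The mapping discussed in this diff can be illustrated with a small, self-contained sketch. It is not the code from `CatalystQl.scala`: the `Token` case class and the two type objects below are hypothetical stand-ins defined only so that the example compiles without Spark. It shows how a `TOK_CHAR` node carrying a single length child can be matched and mapped to a string type, in the same way as `TOK_VARCHAR`.

```scala
// Illustrative sketch only. Token, BooleanType and StringType are local
// stand-ins (assumptions), not the Spark classes touched by the pull request.
object TypeTokenSketch {
  sealed trait DataType
  case object BooleanType extends DataType
  case object StringType extends DataType

  // A parse-tree node: its text plus its child nodes.
  final case class Token(text: String, children: List[Token])

  // Map a type node to a data type. CHAR(n) and VARCHAR(n) each carry
  // exactly one child (the length), which is ignored here.
  def toDataType(node: Token): DataType = node match {
    case Token("TOK_BOOLEAN", Nil) => BooleanType
    case Token("TOK_STRING", Nil) => StringType
    case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    case Token("TOK_CHAR", Token(_, Nil) :: Nil) => StringType
    case Token(other, _) =>
      throw new IllegalArgumentException(s"Unsupported type token: $other")
  }

  def main(args: Array[String]): Unit = {
    val charType = Token("TOK_CHAR", List(Token("10", Nil)))
    println(toDataType(charType))
  }
}
```

Without the `TOK_CHAR` case, a `CHAR(10)` column would fall through to the error branch of this sketch, which is the kind of gap the review comment points at.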
[GitHub] spark pull request: [SPARK-12689][SQL] Migrate DDL parsing to the ...
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10723#discussion_r50361041

    --- Diff: sql/catalyst/src/main/antlr3/org/apache/spark/sql/catalyst/parser/SparkSqlParser.g ---
    @@ -2115,7 +2185,7 @@ structType
     mapType
     @init { pushMsg("map type", state); }
     @after { popMsg(state); }
    -: KW_MAP LESSTHAN left=primitiveType COMMA right=type GREATERTHAN
    +: KW_MAP LESSTHAN left=type COMMA right=type GREATERTHAN
    --- End diff --

    Key in Map can be any type.
[GitHub] spark pull request: [SPARK-11965] [ML] [Doc] Update user guide for...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10222#issuecomment-173459397 **[Test build #49850 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49850/consoleFull)** for PR 10222 at commit [`7def89a`](https://github.com/apache/spark/commit/7def89a93c5a533f03368923e150f1973d5ac19d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11965] [ML] [Doc] Update user guide for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10222#issuecomment-173459490 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49850/ Test PASSed.
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173464944 **[Test build #49838 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49838/consoleFull)** for PR 10854 at commit [`39f21de`](https://github.com/apache/spark/commit/39f21de507271314c1b08f9d6a9c0fc0a12396a4).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173464972 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49838/ Test FAILed.
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173464969 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173469249 **[Test build #49860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49860/consoleFull)** for PR 10854 at commit [`39f21de`](https://github.com/apache/spark/commit/39f21de507271314c1b08f9d6a9c0fc0a12396a4).
[GitHub] spark pull request: [SPARK-12469][CORE][RFC/WIP] Add Consistent Ac...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10841#issuecomment-173423035 **[Test build #49821 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49821/consoleFull)** for PR 10841 at commit [`ce60621`](https://github.com/apache/spark/commit/ce60621dd5048114a1793b5adfad3d72cc423fed).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12469][CORE][RFC/WIP] Add Consistent Ac...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10841#issuecomment-173423180 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49821/ Test FAILed.
[GitHub] spark pull request: [SPARK-12872][SQL] Support to specify the opti...
GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/10858

    [SPARK-12872][SQL] Support to specify the option for compression codec for JSON datasource

    https://issues.apache.org/jira/browse/SPARK-12872

    This PR lets the JSON datasource compress its output through an option instead of requiring Hadoop configurations to be set manually. For resolving codecs by name, it is similar to https://github.com/apache/spark/pull/10805.

    As `CSVCompressionCodecs` can be shared with other datasources, it became a separate class, `CompressionCodecs`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-12872

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10858.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10858

commit 5bf407f34ccc634c25d675fca2174cae38a6cd38
Author: hyukjinkwon
Date: 2016-01-21T01:25:22Z

    Support to specify the option for compression codec for JSON datasource

commit 5022ccd2d6ceb3ba18ad397df5debf1ae4865a0d
Author: hyukjinkwon
Date: 2016-01-21T01:26:51Z

    Remove newlines

commit fbeee5673a37795a9b2387b069c0663485d0a252
Author: hyukjinkwon
Date: 2016-01-21T01:33:27Z

    Update imports
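As a rough illustration of the idea in this pull request (selecting a codec by a short option value rather than by Hadoop configuration), here is a minimal sketch in plain Scala. It is not the `CompressionCodecs` class from the PR: the object name, the method name, and the exact set of short names are assumptions made for the example. Only the Hadoop codec class names themselves are real.

```scala
import java.util.Locale

// Illustrative sketch only. The object, the method and the table of short
// names are assumptions; they are not taken from the pull request.
object CompressionCodecNames {
  // Short, user-facing names mapped to Hadoop codec class names.
  private val shortNames: Map[String, String] = Map(
    "bzip2" -> "org.apache.hadoop.io.compress.BZip2Codec",
    "gzip" -> "org.apache.hadoop.io.compress.GzipCodec",
    "lz4" -> "org.apache.hadoop.io.compress.Lz4Codec",
    "snappy" -> "org.apache.hadoop.io.compress.SnappyCodec")

  // Resolve an option value to a codec class name. Known short names are
  // matched case-insensitively; anything else is assumed to already be a
  // fully qualified class name and is returned unchanged.
  def resolve(name: String): String =
    shortNames.getOrElse(name.toLowerCase(Locale.ROOT), name)

  def main(args: Array[String]): Unit = {
    println(resolve("GZIP"))
    println(resolve("com.example.MyCodec"))
  }
}
```

Keeping the table in one shared object is what would let several datasources (CSV, JSON) accept the same short names, which matches the motivation the description gives for splitting out a common class.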
[GitHub] spark pull request: [SPARK-12469][CORE][RFC/WIP] Add Consistent Ac...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10841#issuecomment-173423177 Build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173428256 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49826/ Test FAILed.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173428158 **[Test build #49826 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49826/consoleFull)** for PR 10851 at commit [`2bf907a`](https://github.com/apache/spark/commit/2bf907a66aedd917b8ef3fcc9c6b1c2e20c1e028). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10861] Add range support
Github user JihongMA closed the pull request at: https://github.com/apache/spark/pull/9172
[GitHub] spark pull request: [SPARK-10861] Add range support
Github user JihongMA commented on the pull request: https://github.com/apache/spark/pull/9172#issuecomment-173428238 @yhuai sure, I am closing it now.
[GitHub] spark pull request: [SPARK-12747][SQL] Use correct type name for P...
Github user blbradley commented on the pull request: https://github.com/apache/spark/pull/10695#issuecomment-173436739 This is blocking me. Can we get it merged soon? I'm waiting to submit another PR to fix DecimalType also.
[GitHub] spark pull request: [SPARK-12204][SPARKR] Implement drop method fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10201#issuecomment-173436793 **[Test build #49846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49846/consoleFull)** for PR 10201 at commit [`5eb3004`](https://github.com/apache/spark/commit/5eb30044e3655d884de2aebaa39b7245d099fbdb).
[GitHub] spark pull request: [SPARK-12872][SQL] Support to specify the opti...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10858#issuecomment-173441168 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49839/ Test PASSed.
[GitHub] spark pull request: [SPARK-11622][MLLIB] Make LibSVMRelation exten...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9595#issuecomment-173449373 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11622][MLLIB] Make LibSVMRelation exten...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9595#issuecomment-173449374 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49851/ Test FAILed.
[GitHub] spark pull request: [SPARK-11622][MLLIB] Make LibSVMRelation exten...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9595#issuecomment-173449371 **[Test build #49851 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49851/consoleFull)** for PR 9595 at commit [`0d6d06d`](https://github.com/apache/spark/commit/0d6d06dc7aa98f2f2e6a1fb20d0af59f31ae4531). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50361275 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -740,7 +740,7 @@ private[ml] object RandomForest extends Logging { val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue) val centroid = if (categoryStats.count != 0) { -categoryStats.predict +categoryStats.prob(categoryStats.predict) --- End diff -- Finding the _proportion_ falling in outcome class 1 simply requires division of the counts by a constant. Since we're just using that number for an ordering, constant division won't matter. They are the same. My initial comment has a typo. It should say for a "binary **outcome**", not "binary category".
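The ordering argument in the comment above can be checked with a small sketch. It assumes, as the comment does, that every value is divided by the same positive constant; the class `OrderingDemo` and the sample counts are made up for illustration and are not taken from `RandomForest.scala`.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class; names and data are illustrative only.
final class OrderingDemo {
    private OrderingDemo() {}

    // Returns the indices of `values` sorted by ascending value (ties keep index order).
    static Integer[] argsort(double[] values) {
        Integer[] idx = new Integer[values.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> values[i]));
        return idx;
    }

    // Divides every entry by the same positive constant, turning counts into proportions.
    static double[] scale(double[] counts, double constant) {
        if (constant <= 0) {
            throw new IllegalArgumentException("constant must be positive");
        }
        double[] out = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            out[i] = counts[i] / constant;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] counts = {12, 3, 7, 9};
        System.out.println(Arrays.toString(argsort(counts)));
        System.out.println(Arrays.toString(argsort(scale(counts, 20.0))));
    }
}
```

If the divisor differed from one category to the next, the two orderings could disagree, so the equivalence holds only under the constant-divisor assumption stated in the comment.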
[GitHub] spark pull request: [SPARK-12914] [SQL] generate aggregation with ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10855#issuecomment-173458567 **[Test build #49855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49855/consoleFull)** for PR 10855 at commit [`7d1bd43`](https://github.com/apache/spark/commit/7d1bd43aafd7c38120b9508830e7a22db11371b4).
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173421921 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173421924 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49833/ Test FAILed.
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173421891 **[Test build #49833 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49833/consoleFull)** for PR 10854 at commit [`9188ad3`](https://github.com/apache/spark/commit/9188ad33c809e7c075d8ef827d4e42688d3337f6). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7799][Streaming][Document]Add the linki...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10856#issuecomment-173421720 **[Test build #49836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49836/consoleFull)** for PR 10856 at commit [`448007b`](https://github.com/apache/spark/commit/448007b2e928a6046a64a599a6831a0452fbd69b).
[GitHub] spark pull request: [SPARK-11171][SPARK-11237][SPARK-11241][ML] Tr...
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/9207#issuecomment-173421710 Jenkins retest this please
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173425185 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12896] [WIP] Send only accumulator upda...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10857#issuecomment-173425234 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12896] [WIP] Send only accumulator upda...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10857#issuecomment-173425197 **[Test build #49835 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49835/consoleFull)** for PR 10857 at commit [`abde0ed`](https://github.com/apache/spark/commit/abde0ed5be9945e8f089ae775daf4d9cc852e8a4). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12896] [WIP] Send only accumulator upda...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10857#issuecomment-173425237 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49835/ Test FAILed.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173425186 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49824/ Test PASSed.
[GitHub] spark pull request: Branch 1.6
GitHub user chetansomani opened a pull request: https://github.com/apache/spark/pull/10859 Branch 1.6 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/spark branch-1.6 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10859.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10859 commit 40769b48cd001b7ff8c301628dc1442e3dd946cd Author: gatorsmile Date: 2015-12-01T18:38:59Z [SPARK-11905][SQL] Support Persist/Cache and Unpersist in Dataset APIs Persist and Unpersist exist in both RDD and Dataframe APIs. I think they are still very critical in Dataset APIs. Not sure if my understanding is correct? If so, could you help me check if the implementation is acceptable? Please provide your opinions. marmbrus rxin cloud-fan Thank you very much! Author: gatorsmile Author: xiaoli Author: Xiao Li Closes #9889 from gatorsmile/persistDS. (cherry picked from commit 0a7bca2da04aefff16f2513ec27a92e69ceb77f6) Signed-off-by: Michael Armbrust commit 843a31afbdeea66449750f0ba8f676ef31d00726 Author: Cheng Lian Date: 2015-12-01T18:21:31Z [SPARK-12046][DOC] Fixes various ScalaDoc/JavaDoc issues This PR backports PR #10039 to master Author: Cheng Lian Closes #10063 from liancheng/spark-12046.doc-fix.master. (cherry picked from commit 69dbe6b40df35d488d4ee343098ac70d00bbdafb) Signed-off-by: Yin Huai commit 99dc1335e2f635a067f9fa1e83a35bf9593bfc24 Author: woj-i Date: 2015-12-01T19:05:45Z [SPARK-11821] Propagate Kerberos keytab for all environments andrewor14 the same PR as in branch 1.5 harishreedharan Author: woj-i Closes #9859 from woj-i/master.
(cherry picked from commit 6a8cf80cc8ef435ec46138fa57325bda5d68f3ce) Signed-off-by: Marcelo Vanzin commit ab2a124c8eca6823ee016c9ecfbdbf4918fbcdd6 Author: Josh Rosen Date: 2015-12-01T19:49:20Z [SPARK-12065] Upgrade Tachyon from 0.8.1 to 0.8.2 This commit upgrades the Tachyon dependency from 0.8.1 to 0.8.2. Author: Josh Rosen Closes #10054 from JoshRosen/upgrade-to-tachyon-0.8.2. (cherry picked from commit 34e7093c1131162b3aa05b65a19a633a0b5b633e) Signed-off-by: Josh Rosen commit 1cf9d3858c8a3a5796b64a9fbea22509f02d778a Author: Nong Li Date: 2015-12-01T20:59:53Z [SPARK-12030] Fix Platform.copyMemory to handle overlapping regions. This bug was exposed as memory corruption in Timsort which uses copyMemory to copy large regions that can overlap. The prior implementation did not handle this case half the time and always copied forward, resulting in the data being corrupt. Author: Nong Li Closes #10068 from nongli/spark-12030. (cherry picked from commit 2cef1cdfbb5393270ae83179b6a4e50c3cbf9e93) Signed-off-by: Yin Huai commit 81db8d086bbfe72caa0c45a395ebcaed80b5c237 Author: Tathagata Das Date: 2015-12-01T22:08:36Z [SPARK-12004] Preserve the RDD partitioner through RDD checkpointing The solution is to save the RDD partitioner in a separate file in the RDD checkpoint directory. That is, `/_partitioner`. In most cases, whether the RDD partitioner was recovered or not does not affect correctness; it only reduces performance. So this solution makes a best-effort attempt to save and recover the partitioner. If either fails, the checkpointing is not affected. This makes this patch safe and backward compatible. Author: Tathagata Das Closes #9983 from tdas/SPARK-12004.
(cherry picked from commit 60b541ee1b97c9e5e84aa2af2ce856f316ad22b3) Signed-off-by: Andrew Or commit 21909b8ac0068658cc833f324c0f1f418c200d61 Author: Shixiong Zhu Date: 2015-12-01T23:16:07Z Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize" This reverts commit 9b99b2b46c452ba396e922db5fc7eec02c45b158. commit 5647774b07593514f4ed4c29a038cfb1b69c9ba1 Author: Xusen Yin Date: 2015-12-01T23:21:53Z [SPARK-11961][DOC] Add docs of ChiSqSelector https://issues.apache.org/jira/browse/SPARK-11961 Author: Xusen Yin Closes #9965 from
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10851#discussion_r50351631 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.util.sketch; + +import java.io.InputStream; +import java.io.OutputStream; + +/** + * An implementation of Count-Min sketch data structure for the following data types: --- End diff -- I'd just start with "A Count-Min sketch is a probabilistic data structure ..." i.e. your second paragraph. And then explain the type of data types supported.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/10851#discussion_r50351683 --- Diff: common/sketch/src/main/java/org/apache/spark/util/sketch/CountMinSketch.java --- @@ -0,0 +1,132 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.util.sketch; + +import java.io.InputStream; +import java.io.OutputStream; + +/** + * An implementation of Count-Min sketch data structure for the following data types: + * + * {@link Byte} + * {@link Short} + * {@link Integer} + * {@link Long} + * {@link String} + * + * A Count-Min sketch is a probabilistic data structure used for summarizing streams of data in + * sub-linear space. Each {@link CountMinSketch} is initialized with a random seed, and a pair + * of parameters: + * + * relative error (or {@code eps}), and + * confidence (or {@code delta}) + * + * Suppose you want to estimate the number of times an element {@code x} has appeared in a data + * stream so far. With probability {@code delta}, the estimate of this frequency is within the + * range {@code true frequency <= estimate <= true frequency + eps * N}, where {@code N} is the + * total count of items have appeared the the data stream so far. 
+ * + * Under the cover, a {@link CountMinSketch} is essentially a two-dimensional {@code long} array + * with depth {@code d} and width {@code w}, where + * + * {@code d = ceil(2 / eps)} + * {@code w = ceil(-log(1 - confidence) / log(2))} + * + * + * See http://www.eecs.harvard.edu/~michaelm/CS222/countmin.pdf for technical details, + * including proofs of the estimates and error bounds used in this implementation. + * + * This implementation is largely based on the {@code CountMinSketch} class from stream-lib. + */ +abstract public class CountMinSketch { + /** + * Returns the relative error (or {@code eps}) of this {@link CountMinSketch}. + */ + public abstract double relativeError(); + + /** + * Returns the confidence (or {@code delta}) of this {@link CountMinSketch}. + */ + public abstract double confidence(); + + /** + * Depth of this {@link CountMinSketch}. + */ + public abstract int depth(); + + /** + * Width of this {@link CountMinSketch}. + */ + public abstract int width(); + + /** + * Total count of items added to this {@link CountMinSketch} so far. + */ + public abstract long totalCount(); + + /** + * Adds 1 to {@code item}. + */ + public abstract void add(Object item); + + /** + * Adds {@code count} to {@code item}. + */ + public abstract void add(Object item, long count); + + /** + * Returns the estimated frequency of {@code item}. + */ + public abstract long estimateCount(Object item); + + /** + * Merges another {@link CountMinSketch} with this one in place. + * + * Note that only Count-Min sketches with the same {@code depth}, {@code width}, and random seed + * can be merged. + */ + public abstract CountMinSketch mergeInPlace(CountMinSketch other); --- End diff -- declare that this could throw some exception?
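To make the Javadoc under review more concrete, here is a minimal, self-contained Count-Min sketch. It is a sketch under stated assumptions, not the PR's implementation: the class `TinyCountMinSketch` and its hashing scheme are invented for this example, and the sizing follows the stream-lib convention as I understand it (width derived from `eps`, depth derived from `confidence`). It also shows one way `mergeInPlace` could signal incompatible sketches, which is the point raised in the review comment.

```java
import java.util.Random;

// Hypothetical, simplified sketch; not the CountMinSketch class from the PR.
final class TinyCountMinSketch {
    private static final long PRIME = 2147483647L; // 2^31 - 1, modulus for the row hashes

    private final int depth;
    private final int width;
    private final int seed;
    private final long[][] table;
    private final long[] hashA; // one multiplier per row, derived from the seed
    private long totalCount;

    TinyCountMinSketch(double eps, double confidence, int seed) {
        if (eps <= 0 || confidence <= 0 || confidence >= 1) {
            throw new IllegalArgumentException("need eps > 0 and 0 < confidence < 1");
        }
        // Sizing convention is an assumption: width from eps, depth from confidence.
        this.width = (int) Math.ceil(2 / eps);
        this.depth = (int) Math.ceil(-Math.log(1 - confidence) / Math.log(2));
        this.seed = seed;
        this.table = new long[depth][width];
        this.hashA = new long[depth];
        Random random = new Random(seed);
        for (int i = 0; i < depth; i++) {
            hashA[i] = 1 + random.nextInt(Integer.MAX_VALUE - 1);
        }
    }

    // Maps an item to a column of the given row (illustrative hash, not Spark's).
    private int bucket(int row, Object item) {
        long h = Math.floorMod(hashA[row] * (long) item.hashCode(), PRIME);
        return (int) (h % width);
    }

    void add(Object item, long count) {
        if (count < 0) {
            throw new IllegalArgumentException("negative counts are not supported");
        }
        for (int i = 0; i < depth; i++) {
            table[i][bucket(i, item)] += count;
        }
        totalCount += count;
    }

    // The minimum over all rows never underestimates the true count.
    long estimateCount(Object item) {
        long estimate = Long.MAX_VALUE;
        for (int i = 0; i < depth; i++) {
            estimate = Math.min(estimate, table[i][bucket(i, item)]);
        }
        return estimate;
    }

    long totalCount() {
        return totalCount;
    }

    // Merging is only meaningful when depth, width, and seed all match.
    TinyCountMinSketch mergeInPlace(TinyCountMinSketch other) {
        if (other.depth != depth || other.width != width || other.seed != seed) {
            throw new IllegalArgumentException("incompatible sketches cannot be merged");
        }
        for (int i = 0; i < depth; i++) {
            for (int j = 0; j < width; j++) {
                table[i][j] += other.table[i][j];
            }
        }
        totalCount += other.totalCount;
        return this;
    }

    public static void main(String[] args) {
        TinyCountMinSketch sketch = new TinyCountMinSketch(0.01, 0.99, 42);
        sketch.add("spark", 3);
        sketch.add("sketch", 1);
        System.out.println(sketch.estimateCount("spark"));
    }
}
```

An unchecked `IllegalArgumentException` is used here only to keep the example short; whether the real API should declare a checked exception instead is exactly the open question in the review.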
[GitHub] spark pull request: [SPARK-12939][SQL] migrate encoder resolution ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10852#issuecomment-173433127 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12939][SQL] migrate encoder resolution ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10852#issuecomment-173433129 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49830/ Test PASSed.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173433177 **[Test build #49827 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49827/consoleFull)** for PR 10851 at commit [`7ea22a9`](https://github.com/apache/spark/commit/7ea22a98b0d791d0d4fa28585ec447e529554d61). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173433323 Merged build finished. Test PASSed.
[GitHub] spark pull request: Branch 1.6
Github user chetansomani closed the pull request at: https://github.com/apache/spark/pull/10859
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173438675 LGTM
[GitHub] spark pull request: [SPARK-11622][MLLIB] Make LibSVMRelation exten...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9595#issuecomment-173451728 **[Test build #49853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49853/consoleFull)** for PR 9595 at commit [`8a2c96f`](https://github.com/apache/spark/commit/8a2c96fc28021e28c8009d4c58ce3d94e9227683).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11622][MLLIB] Make LibSVMRelation exten...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9595#issuecomment-173451792 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49853/ Test FAILed.
[GitHub] spark pull request: [SPARK-7997][Core]Remove Akka from Spark Core ...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/10854#issuecomment-173465691 retest this please
[GitHub] spark pull request: [SPARK-12789]Support order by index
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10731#issuecomment-173475388 **[Test build #49864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49864/consoleFull)** for PR 10731 at commit [`e61429f`](https://github.com/apache/spark/commit/e61429fec35c0f0983ff5e1bfeea11a1cef42690).
[GitHub] spark pull request: Fixes SPARK-12910: R version for installing sp...
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/10836#issuecomment-173421200 looks good.
[GitHub] spark pull request: [SPARK-12224][SPARKR] R support for JDBC sourc...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/10480#discussion_r50348037
--- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala ---
@@ -355,6 +355,13 @@ private[spark] object SerDe {
           writeInt(dos, v.length)
           v.foreach(elem => writeObject(dos, elem))
+        // Handle Properties
--- End diff --
@shivaram as you see we are calling 3 different overloads of `read().jdbc()` in Scala, 4 if counting `write().jdbc()`. I think there would be 4 approaches to handle `read().jdbc()`:
1. Have 3 JVM helper functions
2. Have 1 helper function and on JVM side figure out which overload to route to
3. Have 1 helper function and include parameter processing (eg. check numPartitions/defaultParallelism etc), and overload checks all within JVM - and leave R to be a thin shim
4. serialize Properties as jobj and work on it on R side
I feel #4 gives us the least overhead (less code) and more flexibility (since logic like default values for numPartition exists only on R/Python and not on Scala side).
[GitHub] spark pull request: [SPARK-12895] Implement TaskMetrics with accum...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/10835#issuecomment-173422198 @davies the bigger patch is ready: #10857. Let me know which one you prefer to review more. :)
[GitHub] spark pull request: [SPARK-11171][SPARK-11237][SPARK-11241][ML] Tr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9207#issuecomment-173429786 **[Test build #49837 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49837/consoleFull)** for PR 9207 at commit [`b514421`](https://github.com/apache/spark/commit/b514421683170d8c29ee3d39cb50abb59ff74816).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11171][SPARK-11237][SPARK-11241][ML] Tr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9207#issuecomment-173429886 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11171][SPARK-11237][SPARK-11241][ML] Tr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9207#issuecomment-173429888 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49837/ Test FAILed.
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173437921 **[Test build #49848 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49848/consoleFull)** for PR 10851 at commit [`a6e7479`](https://github.com/apache/spark/commit/a6e74791e2087149e20afa7456029eb0a66a82d2).
[GitHub] spark pull request: [SPARK-12933][SQL] Initial implementation of C...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10851#issuecomment-173438174 **[Test build #49848 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49848/consoleFull)** for PR 10851 at commit [`a6e7479`](https://github.com/apache/spark/commit/a6e74791e2087149e20afa7456029eb0a66a82d2).
 * This patch **fails Java style tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10524][ML] Use the soft prediction to o...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8734#discussion_r50355886
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -740,7 +740,7 @@ private[ml] object RandomForest extends Logging {
           val categoryStats = binAggregates.getImpurityCalculator(nodeFeatureOffset, featureValue)
           val centroid = if (categoryStats.count != 0) {
-            categoryStats.predict
+            categoryStats.prob(categoryStats.predict)
--- End diff --
As I saw from the implementation, `categoryStats.stats(1)` is just the count of class 1. Are we going to order bins by that?
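The question above turns on the difference between a raw class count and a normalized score. The following is a minimal, standalone sketch of that difference only; it is not the code in the pull request and does not use MLlib. The object name, the sample counts, and the helpers `byCount`, `byProb`, and `order` are all hypothetical, and a binary label is assumed.

```scala
object CentroidOrderingSketch {
  // Hypothetical per-category class counts for a binary label:
  // Array(count of class 0, count of class 1).
  val stats: Map[String, Array[Double]] = Map(
    "a" -> Array(90.0, 10.0), // 10 of 100 rows are class 1
    "b" -> Array(1.0, 4.0),   // 4 of 5 rows are class 1
    "c" -> Array(3.0, 3.0)    // 3 of 6 rows are class 1
  )

  // Centroid taken as the raw class-1 count.
  def byCount(s: Array[Double]): Double = s(1)

  // Centroid taken as the class-1 fraction; 0.0 for an empty category.
  def byProb(s: Array[Double]): Double = {
    val total = s.sum
    if (total == 0) 0.0 else s(1) / total
  }

  // Category names sorted in ascending order of the chosen centroid.
  def order(centroid: Array[Double] => Double): Seq[String] =
    stats.toSeq.sortBy { case (_, s) => centroid(s) }.map(_._1)
}
```

With these made-up counts, ordering by raw count places the large category "a" last even though it has the smallest share of class 1, while ordering by fraction places it first. Which quantity the pull request should use is the open question in the thread, not something this sketch settles.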
[GitHub] spark pull request: [SPARK-12914] [SQL] generate aggregation with ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10855#issuecomment-173437954 **[Test build #49847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49847/consoleFull)** for PR 10855 at commit [`a98bc05`](https://github.com/apache/spark/commit/a98bc05bf0542f5fdb8b54cb36651448832cfa8c).
[GitHub] spark pull request: [SPARK-12902] [SQL] visualization for generate...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10828#issuecomment-173439428 **[Test build #2426 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2426/consoleFull)** for PR 10828 at commit [`f5c9087`](https://github.com/apache/spark/commit/f5c90878514d6346cf9229daacac0963ea794713).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12204][SPARKR] Implement drop method fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10201#issuecomment-173439417 **[Test build #49846 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49846/consoleFull)** for PR 10201 at commit [`5eb3004`](https://github.com/apache/spark/commit/5eb30044e3655d884de2aebaa39b7245d099fbdb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12204][SPARKR] Implement drop method fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10201#issuecomment-173439516 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49846/ Test PASSed.
[GitHub] spark pull request: [SPARK-12204][SPARKR] Implement drop method fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10201#issuecomment-173439514 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12790][CORE] Remove HistoryServer old m...
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/10860
[SPARK-12790][CORE] Remove HistoryServer old multiple files format
Removed isLegacyLogDirectory code path and updated tests
@andrewor14
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/felixcheung/spark historyserverformat
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/10860.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #10860
commit 4c302ec0a9212158c5b1cd444670ab4bad81d218
Author: felixcheung
Date: 2016-01-19T04:06:39Z
    remove isLegacyLogDirectory code path and update tests
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10845#discussion_r50361137
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -696,6 +697,43 @@ object HiveTypeCoercion {
   }
   /**
+   * Strength reduction for comparisons between an integral column and a decimal literal:
+   *
+   * 1. int_col > decimal_literal => int_col > floor(decimal_literal)
+   * 2. int_col >= decimal_literal => int_col >= ceil(decimal_literal)
+   * 3. int_col < decimal_literal => int_col < ceil(decimal_literal)
+   * 4. int_col <= decimal_literal => int_col <= floor(decimal_literal)
+   * 5. decimal_literal > int_col => ceil(decimal_literal) > int_col
+   * 6. decimal_literal >= int_col => floor(decimal_literal) >= int_col
+   * 7. decimal_literal < int_col => floor(decimal_literal) < int_col
+   * 8. decimal_literal <= int_col => ceil(decimal_literal) <= int_col
+   *
+   */
+  object SimplifyIntegerDecimalComparing extends Rule[LogicalPlan] {
--- End diff --
Agreed. I also think that.
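The eight rewrite rules quoted in the diff can be checked with plain arithmetic. Below is a minimal, standalone sketch of rules 1 to 4; it is not the Catalyst rule from the pull request and does not depend on Spark. The names `StrengthReductionSketch`, `Op`, `reducedBound`, `compare`, and `agrees` are hypothetical. Rules 5 to 8 are the same four comparisons with the operands swapped, so they need no separate case here.

```scala
import scala.math.BigDecimal.RoundingMode

object StrengthReductionSketch {
  // The four comparison operators with the integral column on the left.
  sealed trait Op
  case object Gt extends Op
  case object Ge extends Op
  case object Lt extends Op
  case object Le extends Op

  // Integral bound that replaces the decimal literal (rules 1-4):
  // `>` and `<=` round the literal down, `>=` and `<` round it up.
  def reducedBound(op: Op, literal: BigDecimal): BigInt = op match {
    case Gt | Le => literal.setScale(0, RoundingMode.FLOOR).toBigInt
    case Ge | Lt => literal.setScale(0, RoundingMode.CEILING).toBigInt
  }

  // Evaluate `left op right` on exact decimals.
  def compare(op: Op, left: BigDecimal, right: BigDecimal): Boolean = op match {
    case Gt => left > right
    case Ge => left >= right
    case Lt => left < right
    case Le => left <= right
  }

  // True when the original and the rewritten predicate agree for this column value.
  def agrees(op: Op, col: Int, literal: BigDecimal): Boolean =
    compare(op, BigDecimal(col), literal) ==
      compare(op, BigDecimal(col), BigDecimal(reducedBound(op, literal)))
}
```

For example, `int_col > 2.5` becomes `int_col > 2`, and `int_col >= 2.5` becomes `int_col >= 3`. Calling `agrees` over a range of column values and literals, including negative literals, is a quick way to convince oneself that the floor/ceil choice per operator is the right one.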
[GitHub] spark pull request: [SPARK-12689][SQL] Migrate DDL parsing to the ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10723#discussion_r50361083
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
@@ -156,9 +157,10 @@ private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) extends
   protected def nodeToStructField(node: ASTNode): StructField = node match {
     case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: Nil) =>
-      StructField(fieldName, nodeToDataType(dataType), nullable = true)
-    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: _ /* comment */:: Nil) =>
-      StructField(fieldName, nodeToDataType(dataType), nullable = true)
+      StructField(cleanIdentifier(fieldName), nodeToDataType(dataType), nullable = true)
+    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: comment :: Nil) =>
+      val meta = new MetadataBuilder().putString("comment", unquoteString(comment.text)).build()
+      StructField(cleanIdentifier(fieldName), nodeToDataType(dataType), nullable = true, meta)
--- End diff --
Add comment to `StructField`'s metadata.
[GitHub] spark pull request: [SPARK-12204][SPARKR] Implement drop method fo...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10201
[GitHub] spark pull request: [SPARK-12902] [SQL] visualization for generate...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10828#issuecomment-173460276 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49856/ Test FAILed.
[GitHub] spark pull request: [SPARK-12902] [SQL] visualization for generate...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10828#issuecomment-173460272 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-173465206 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-12828][SQL]add natural join support
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/10762#issuecomment-173465114 retest this please.
[GitHub] spark pull request: [SPARK-12757][WIP] Use reference counting to p...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10705#issuecomment-173469381 **[Test build #49861 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49861/consoleFull)** for PR 10705 at commit [`12ed084`](https://github.com/apache/spark/commit/12ed0841b5d5cf171e9db9325bf9f61f3dd8046b).
[GitHub] spark pull request: [SPARK-12904][SQL] Strength reduction for inte...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10845#issuecomment-173473549 **[Test build #49863 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49863/consoleFull)** for PR 10845 at commit [`7202c54`](https://github.com/apache/spark/commit/7202c546d025fc2c5cf71856c7e64fce8e85444f).
[GitHub] spark pull request: [SPARK-12222] [Core] Deserialize RoaringBitmap...
Github user highmoutain commented on the pull request: https://github.com/apache/spark/pull/10213#issuecomment-173475873 Has this problem been solved in 1.6.0? I encountered it again in 1.6.0 when using a UDF. My version is spark-1.6.0-bin-hadoop2.6.tgz.
(p.mobile_id, 'model', 'MOBILE_MODEL') as mobile_model from analytics_device_profile_zh10 p limit 1;
16/01/21 14:47:29 WARN TaskSetManager: Lost task 0.0 in stage 44.0 (TID 44, 172.30.4.216): com.esotericsoftware.kryo.KryoException: Buffer underflow.
Serialization trace:
mobileAttributeCahceMap (enterprise.hive.udfs.ConvertDeviceModel)
    at com.esotericsoftware.kryo.io.Input.require(Input.java:156)
    at com.esotericsoftware.kryo.io.Input.readAscii_slow(Input.java:580)
    at com.esotericsoftware.kryo.io.Input.readAscii(Input.java:558)
    at com.esotericsoftware.kryo.io.Input.readString(Input.java:436)
[GitHub] spark pull request: [SPARK-7799][Streaming][Document]Add the linki...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10856#issuecomment-173418991 Merged build finished. Test PASSed.