[GitHub] spark pull request: [SPARK-3371][SQL] Renaming a function expressi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2511 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3371][SQL] Renaming a function expressi...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2511#issuecomment-57590958 Thanks! Merged to master.
[GitHub] spark pull request: [SPARK-3437][BUILD] Support crossbuilding in m...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2357#issuecomment-57590659

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21183/consoleFull) for PR 2357 at commit [`609dd98`](https://github.com/apache/spark/commit/609dd98b7de77397dfc490c1c0a12bb9349830e5).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-1813. Add a utility to SparkConf that ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/789#issuecomment-57590497 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21182/
[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-57590495 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21180/
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-57590494 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21181/
[GitHub] spark pull request: [SPARK-3437][BUILD] Support crossbuilding in m...
GitHub user ScrapCodes reopened a pull request: https://github.com/apache/spark/pull/2357

[SPARK-3437][BUILD] Support crossbuilding in maven, with the new scala-install-plugin. Since this plugin is not deployed anywhere, anyone trying this patch has to publish it locally by cloning https://github.com/ScrapCodes/scala-install-plugin and then running `mvn install`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ScrapCodes/spark-1 maven-improvements

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2357.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2357

commit 4d2d30cc9dcbe2b929ed8ed124d6719430ec6fae
Author: Prashant Sharma
Date: 2014-09-11T08:09:46Z
    Supported new scala install plugin, which can let us cross-build for Scala.

commit 609dd98b7de77397dfc490c1c0a12bb9349830e5
Author: Prashant Sharma
Date: 2014-09-11T10:00:10Z
    Changed to newly updated branch with cross-build support.
[GitHub] spark pull request: [ SPARK-1812] Adjust build system and tests to...
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/2615#issuecomment-57590353 And this plugin, https://github.com/ScrapCodes/scala-install-plugin, takes care of publishing correct poms too.
[GitHub] spark pull request: [ SPARK-1812] Adjust build system and tests to...
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/2615#issuecomment-57590286 Hey Patrick, thanks for looking at this. I did not say it is not possible. I just said the best (easiest) way I could come up with was to modify the maven install plugin.
[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-57589906 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21179/
[GitHub] spark pull request: SPARK-3223 runAsSparkUser cannot change HDFS w...
Github user jongyoul commented on the pull request: https://github.com/apache/spark/pull/2126#issuecomment-57589522

@tgravescs This code only applies in Mesos mode, so the other modes (YARN and standalone) are not affected. +1 @timothysc,

    val fwInfo = FrameworkInfo.newBuilder().setUser(sc.sparkUser).setName(sc.appName).build()
[GitHub] spark pull request: [WIP][SPARK-3212][SQL] Use logical plan matchi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2501#issuecomment-57589474

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21178/consoleFull) for PR 2501 at commit [`65ed04a`](https://github.com/apache/spark/commit/65ed04afdc49f96d5f66257cb003f1e8e345095c).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-1813. Add a utility to SparkConf that ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/789#issuecomment-57589292 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21175/
[GitHub] spark pull request: SPARK-1813. Add a utility to SparkConf that ma...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/789#issuecomment-57589288

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21175/consoleFull) for PR 789 at commit [`1be3fa5`](https://github.com/apache/spark/commit/1be3fa53c4daf29d5b0153f2ac39e6d221f9bc56).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * ` throw new SparkException("Failed to load class to register with Kryo", e)`
[GitHub] spark pull request: SPARK-1297 Upgrade HBase dependency to 0.98
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1893#issuecomment-57588918 @pwendell can you take a look at this when you have a chance
[GitHub] spark pull request: Merge pull request #1 from apache/master
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2502
[GitHub] spark pull request: JIRA issue: [SPARK-1405] Gibbs sampling based ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/476
[GitHub] spark pull request: [WIP] SPARK-2450: Add YARN executor log links ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1375
[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2391
[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1486#issuecomment-57588340

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21177/consoleFull) for PR 1486 at commit [`338d4f8`](https://github.com/apache/spark/commit/338d4f8fedd68b64a7fdfaf078afcc2623072501).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2693][SQL] Supported for UDAF Hive Aggr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2620#issuecomment-57588338

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21176/consoleFull) for PR 2620 at commit [`caf25c6`](https://github.com/apache/spark/commit/caf25c6633751f5418864a484304b17cf7a18b1a).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1486#issuecomment-57588159 Jenkins, retest this please.
[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-57588021

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21174/consoleFull) for PR 2388 at commit [`99945ce`](https://github.com/apache/spark/commit/99945ce52e7559728191226fbc21a2a592591ceb).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class TopicModelingKryoRegistrator extends KryoRegistrator `
[GitHub] spark pull request: [SPARK-2693][SQL] Supported for UDAF Hive Aggr...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2620#issuecomment-57588061 Jenkins, retest this please.
[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-57588026 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21174/
[GitHub] spark pull request: SPARK-1813. Add a utility to SparkConf that ma...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/789#issuecomment-57585682

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21175/consoleFull) for PR 789 at commit [`1be3fa5`](https://github.com/apache/spark/commit/1be3fa53c4daf29d5b0153f2ac39e6d221f9bc56).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-1813. Add a utility to SparkConf that ma...
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/789#issuecomment-57585666 Updated patch allows using both at the same time.
[GitHub] spark pull request: [SPARK-3371][SQL] Renaming a function expressi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2511#issuecomment-57584098

[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21173/consoleFull) for PR 2511 at commit [`9fb973f`](https://github.com/apache/spark/commit/9fb973f39582e03ad06bf99c78f01099d493170a).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3371][SQL] Renaming a function expressi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2511#issuecomment-57584100 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21173/
[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-57584038

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21174/consoleFull) for PR 2388 at commit [`99945ce`](https://github.com/apache/spark/commit/99945ce52e7559728191226fbc21a2a592591ceb).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/2609#discussion_r18322230

--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
@@ -202,9 +205,20 @@ private[spark] class Worker(
     // Spin up a separate thread (in a future) to do the dir cleanup; don't tie up worker actor
     val cleanupFuture = concurrent.future {
       logInfo("Cleaning up oldest application directories in " + workDir + " ...")
-      Utils.findOldFiles(workDir, APP_DATA_RETENTION_SECS)
-        .foreach(Utils.deleteRecursively)
+      val appDirs = workDir.listFiles()
+      if (appDirs == null) {
+        throw new IOException("ERROR: Failed to list files in " + appDirs)
+      }
+      appDirs.filter { dir => {
--- End diff --

You do not need the extra bracket after the "dir =>". We use the enclosing bracket's scope.
[GitHub] spark pull request: [SPARK-3719][CORE]:"complete/failed stages" is...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2574#discussion_r18321787

--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressPage.scala ---
@@ -70,11 +72,11 @@ private[ui] class JobProgressPage(parent: JobProgressTab) extends WebUIPage("")
         Completed Stages:
-        {completedStages.size}
+        {totalCompletedStages}
--- End diff --

I agree; this will help to avoid user confusion if there's a big difference between the count and the number of displayed stages.
[GitHub] spark pull request: [SPARK-3654][SQL] Implement all extended HiveQ...
Github user ravipesala commented on the pull request: https://github.com/apache/spark/pull/2590#issuecomment-57581369 Fixed code as per comments, please review.
[GitHub] spark pull request: [SPARK-1853] Show Streaming application code c...
Github user mubarak closed the pull request at: https://github.com/apache/spark/pull/1723
[GitHub] spark pull request: [SPARK-1853] Show Streaming application code c...
Github user mubarak commented on the pull request: https://github.com/apache/spark/pull/1723#issuecomment-57581072 Fixed using https://github.com/apache/spark/pull/2464
[GitHub] spark pull request: [SPARK-3371][SQL] Renaming a function expressi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2511#issuecomment-57580433

[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21173/consoleFull) for PR 2511 at commit [`9fb973f`](https://github.com/apache/spark/commit/9fb973f39582e03ad06bf99c78f01099d493170a).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/2241#issuecomment-57580447

@yhuai I removed all unnecessary implicits to make it consistent, but had to keep wrapperToFileSinkDesc because HiveFileFormatUtils.getHiveRecordWriter needs the FileSinkDesc type, and it also helps to track the internal state changes of FileSinkDesc.
[GitHub] spark pull request: [SPARK-3371][SQL] Renaming a function expressi...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/spark/pull/2511#discussion_r18321536

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala ---
@@ -166,7 +186,7 @@ class SqlParser extends StandardTokenParsers with PackratParsers {
     val withFilter = f.map(f => Filter(f, base)).getOrElse(base)
     val withProjection = g.map { g =>
-      Aggregate(assignAliases(g), assignAliases(p), withFilter)
+      Aggregate(assignAliasesForGroups(g,p), assignAliases(p), withFilter)
--- End diff --

Yes @marmbrus, it is better if we remove assignAliases from the grouping expressions. I updated the code accordingly; please review.
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2563#discussion_r18321520

--- Diff: python/pyspark/sql.py ---
@@ -385,50 +429,32 @@ def _parse_datatype_string(datatype_string):
     >>> check_datatype(complex_maptype)
     True
     """
-    index = datatype_string.find("(")
-    if index == -1:
-        # It is a primitive type.
-        index = len(datatype_string)
-    type_or_field = datatype_string[:index]
-    rest_part = datatype_string[index + 1:len(datatype_string) - 1].strip()
-
-    if type_or_field in _all_primitive_types:
-        return _all_primitive_types[type_or_field]()
-
-    elif type_or_field == "ArrayType":
-        last_comma_index = rest_part.rfind(",")
-        containsNull = True
-        if rest_part[last_comma_index + 1:].strip().lower() == "false":
-            containsNull = False
-        elementType = _parse_datatype_string(
-            rest_part[:last_comma_index].strip())
-        return ArrayType(elementType, containsNull)
-
-    elif type_or_field == "MapType":
-        last_comma_index = rest_part.rfind(",")
-        valueContainsNull = True
-        if rest_part[last_comma_index + 1:].strip().lower() == "false":
-            valueContainsNull = False
-        keyType, valueType = _parse_datatype_list(
-            rest_part[:last_comma_index].strip())
-        return MapType(keyType, valueType, valueContainsNull)
-
-    elif type_or_field == "StructField":
-        first_comma_index = rest_part.find(",")
-        name = rest_part[:first_comma_index].strip()
-        last_comma_index = rest_part.rfind(",")
-        nullable = True
-        if rest_part[last_comma_index + 1:].strip().lower() == "false":
-            nullable = False
-        dataType = _parse_datatype_string(
-            rest_part[first_comma_index + 1:last_comma_index].strip())
-        return StructField(name, dataType, nullable)
-
-    elif type_or_field == "StructType":
-        # rest_part should be in the format like
-        # List(StructField(field1,IntegerType,false)).
-        field_list_string = rest_part[rest_part.find("(") + 1:-1]
-        fields = _parse_datatype_list(field_list_string)
+    return _parse_datatype_json_value(json.loads(json_string))
+
+
+def _parse_datatype_json_value(json_value):
+    if json_value in _all_primitive_types.keys():

--- End diff --

If json_value is {}, it's not hashable, so you cannot use 'in' with it. I would like to use the same kind of json_value for all types, such as a dict with a key called 'type':

```
{'type': 'int'}
```

For other types, it could have additional keys based on the type, such as:

```
{'type': 'array', 'element': {'type': 'int'}, 'null': True}
```

In this way, it will be easier to do the type switch.
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57579497 @buenrostro-oo @tdas We have seen several test failures from `NetworkReceiverSuite`. Do you have time to take a look? Thanks!
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2563#discussion_r18321352

--- Diff: python/pyspark/sql.py ---
@@ -205,6 +234,16 @@ def __str__(self):
         return "ArrayType(%s,%s)" % (self.elementType,
                                      str(self.containsNull).lower())
 
+    simpleString = 'array'
+
+    def jsonValue(self):
+        return {
+            self.simpleString: {
+                'type': self.elementType.jsonValue(),
+                'containsNull': self.containsNull
+            }
+        }

--- End diff --

This looks like JS style; it could fit in fewer lines.
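As a sketch of the "fewer lines" suggestion, the nested dict can be built in a single expression. The classes here are simplified stand-ins for the pyspark.sql types, not the real implementation:

```python
class IntegerType:
    def jsonValue(self):
        return "integer"

class ArrayType:
    simpleString = "array"

    def __init__(self, elementType, containsNull=True):
        self.elementType = elementType
        self.containsNull = containsNull

    def jsonValue(self):
        # Same structure as the quoted diff, collapsed into one expression.
        return {self.simpleString: {"type": self.elementType.jsonValue(),
                                    "containsNull": self.containsNull}}

print(ArrayType(IntegerType()).jsonValue())
# {'array': {'type': 'integer', 'containsNull': True}}
```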
[GitHub] spark pull request: [SPARK-3713][SQL] Uses JSON to serialize DataT...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2563#discussion_r18321283

--- Diff: python/pyspark/sql.py ---
@@ -62,6 +63,12 @@ def __eq__(self, other):
 
     def __ne__(self, other):
         return not self.__eq__(other)
 
+    def jsonValue(self):
+        return self.simpleString

--- End diff --

You can have a default implementation such as: self.__class__.__name__[:-4].lower()
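The default implementation davies describes derives the JSON name from the class name by stripping the trailing "Type" and lowercasing. A minimal sketch, with a simplified base class (the subclass names mirror pyspark's):

```python
class DataType:
    def jsonValue(self):
        # "IntegerType" -> "Integer" -> "integer"; works for any subclass
        # whose name ends in "Type", so subclasses need no override.
        return self.__class__.__name__[:-4].lower()

class IntegerType(DataType):
    pass

class BooleanType(DataType):
    pass

print(IntegerType().jsonValue())  # integer
print(BooleanType().jsonValue())  # boolean
```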
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/2576#discussion_r18321063 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/orc/OrcTableOperations.scala --- @@ -0,0 +1,418 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + + +package org.apache.spark.sql.orc + +import org.apache.spark.sql.execution.{ExistingRdd, LeafNode, UnaryNode, SparkPlan} +import org.apache.spark.sql.catalyst.expressions._ +import org.apache.spark.{TaskContext, SerializableWritable} +import org.apache.spark.rdd.RDD + +import _root_.parquet.hadoop.util.ContextUtil +import org.apache.hadoop.fs.Path +import org.apache.hadoop.conf.Configuration +import org.apache.hadoop.mapreduce.lib.output.{FileOutputFormat, FileOutputCommitter} +import org.apache.hadoop.mapreduce.lib.input.FileInputFormat +import org.apache.hadoop.io.{Writable, NullWritable} +import org.apache.hadoop.mapreduce.{TaskID, TaskAttemptContext, Job} + +import org.apache.hadoop.hive.ql.io.orc.{OrcFile, OrcSerde, OrcInputFormat, OrcOutputFormat} +import org.apache.hadoop.hive.serde2.objectinspector._ +import org.apache.hadoop.hive.serde2.ColumnProjectionUtils +import org.apache.hadoop.hive.common.`type`.{HiveDecimal, HiveVarchar} + +import java.io.IOException +import java.text.SimpleDateFormat +import java.util.{Locale, Date} +import scala.collection.JavaConversions._ +import org.apache.hadoop.mapred.{SparkHadoopMapRedUtil, Reporter, JobConf} + +/** + * orc table scan operator. Imports the file that backs the given + * [[org.apache.spark.sql.orc.OrcRelation]] as a ``RDD[Row]``. 
+ */ +case class OrcTableScan( + output: Seq[Attribute], + relation: OrcRelation, + columnPruningPred: Option[Expression]) + extends LeafNode { + + @transient + lazy val serde: OrcSerde = initSerde + + @transient + lazy val getFieldValue: Seq[Product => Any] = { +val inspector = serde.getObjectInspector.asInstanceOf[StructObjectInspector] +output.map(attr => { + val ref = inspector.getStructFieldRef(attr.name.toLowerCase(Locale.ENGLISH)) + row: Product => { +val fieldData = row.productElement(1) +val data = inspector.getStructFieldData(fieldData, ref) +unwrapData(data, ref.getFieldObjectInspector) + } +}) + } + + private def initSerde(): OrcSerde = { +val serde = new OrcSerde +serde.initialize(null, relation.prop) +serde + } + + def unwrapData(data: Any, oi: ObjectInspector): Any = oi match { +case pi: PrimitiveObjectInspector => pi.getPrimitiveJavaObject(data) +case li: ListObjectInspector => + Option(li.getList(data)) +.map(_.map(unwrapData(_, li.getListElementObjectInspector)).toSeq) +.orNull +case mi: MapObjectInspector => + Option(mi.getMap(data)).map( +_.map { + case (k, v) => +(unwrapData(k, mi.getMapKeyObjectInspector), + unwrapData(v, mi.getMapValueObjectInspector)) +}.toMap).orNull +case si: StructObjectInspector => + val allRefs = si.getAllStructFieldRefs + new GenericRow( +allRefs.map(r => + unwrapData(si.getStructFieldData(data, r), r.getFieldObjectInspector)).toArray) + } + + override def execute(): RDD[Row] = { +val sc = sqlContext.sparkContext +val job = new Job(sc.hadoopConfiguration) + +val conf: Configuration = ContextUtil.getConfiguration(job) +val fileList = FileSystemHelper.listFiles(relation.path, conf) + +// add all paths in the directory but skip "hidden" ones such +// as "_SUCCESS" +for (path <- fileList if !path.getName.startsWith("_")) { + FileInputFormat.addInputPath(job, path) +} +val serialConf = sc.broadcast(new SerializableWritable(conf)) + +setColumnIds(output, relat
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/2576#discussion_r18321025 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/orc/OrcTableOperations.scala ---
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57577580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21171/
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57577573 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21171/consoleFull) for PR 2595 at commit [`a0d9de3`](https://github.com/apache/spark/commit/a0d9de33d6b8ea7dec2e6421a5debd5310d3aa03). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3720][SQL]initial support ORC in spark ...
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/2576#discussion_r18320990

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/orc/ORCQuerySuite.scala ---
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.orc
+
+import org.apache.hadoop.fs.Path
+import org.apache.hadoop.hive.ql.io.orc.CompressionKind
+import org.apache.spark.sql.{SQLConf, SchemaRDD, TestData, QueryTest}
+import org.apache.spark.sql.test.TestSQLContext
+import org.scalatest.{BeforeAndAfterAll, FunSuiteLike}
+import org.apache.spark.util.Utils
+import org.apache.spark.sql.catalyst.util.getTempFilePath
+import org.apache.spark.sql.test.TestSQLContext._
+
+import java.io.File
+
+case class TestRDDEntry(key: Int, value: String)
+
+case class NullReflectData(
+    intField: java.lang.Integer,
+    longField: java.lang.Long,
+    floatField: java.lang.Float,
+    doubleField: java.lang.Double,
+    booleanField: java.lang.Boolean)
+
+case class OptionalReflectData(
+    intField: Option[Int],
+    longField: Option[Long],
+    floatField: Option[Float],
+    doubleField: Option[Double],
+    booleanField: Option[Boolean])
+
+case class Nested(i: Int, s: String)
+
+case class Data(array: Seq[Int], nested: Nested)
+
+case class AllDataTypes(
+    stringField: String,
+    intField: Int,
+    longField: Long,
+    floatField: Float,
+    doubleField: Double,
+    shortField: Short,
+    byteField: Byte,
+    booleanField: Boolean)
+
+case class AllDataTypesWithNonPrimitiveType(
+    stringField: String,
+    intField: Int,
+    longField: Long,
+    floatField: Float,
+    doubleField: Double,
+    shortField: Short,
+    byteField: Byte,
+    booleanField: Boolean,
+    array: Seq[Int],
+    arrayContainsNull: Seq[Option[Int]],
+    map: Map[Int, Long],
+    mapValueContainsNull: Map[Int, Option[Long]],
+    data: Data)
+
+case class BinaryData(binaryData: Array[Byte])
+
+class OrcQuerySuite extends QueryTest with FunSuiteLike with BeforeAndAfterAll {
+  TestData // Load test data tables.
+
+  var testRDD: SchemaRDD = null
+
+  test("Read/Write All Types") {
+    val tempDir = getTempFilePath("orcTest").getCanonicalPath
+    val range = (0 to 255)
+    val data = sparkContext.parallelize(range)
+      .map(x => AllDataTypes(s"$x", x, x.toLong, x.toFloat, x.toDouble, x.toShort, x.toByte, x % 2 == 0))
+
+    data.saveAsOrcFile(tempDir)
+
+    checkAnswer(
+      orcFile(tempDir),
+      data.toSchemaRDD.collect().toSeq)
+
+    Utils.deleteRecursively(new File(tempDir))
+  }
+
+  test("Compression options for writing to a Orcfile") {
+    val defaultOrcCompressionCodec = TestSQLContext.orcCompressionCodec
+    // TODO: support other compress codec
+    val file = getTempFilePath("orcTest")
+    val path = file.toString
+    val rdd = TestSQLContext.sparkContext.parallelize((1 to 100))
+      .map(i => TestRDDEntry(i, s"val_$i"))
+
+    // test default compression codec, now only support zlib
+    rdd.saveAsOrcFile(path)
+    var actualCodec = OrcFileOperator.readMetaData(new Path(path)).getCompression.name
+    assert(actualCodec == TestSQLContext.orcCompressionCodec.toUpperCase)
+
+    /**

--- End diff --

Only zlib is supported for now; I will remove this.
[GitHub] spark pull request: [SPARK-2693][SQL] Supported for UDAF Hive Aggr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2620#issuecomment-57576395 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21172/
[GitHub] spark pull request: [SPARK-3696]Do not override the user-difined c...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2541#discussion_r18320642

--- Diff: sbin/spark-config.sh ---
@@ -33,7 +33,7 @@ this="$config_bin/$script"
 
 export SPARK_PREFIX="`dirname "$this"`"/..
 export SPARK_HOME="${SPARK_PREFIX}"
-export SPARK_CONF_DIR="$SPARK_HOME/conf"
+export SPARK_CONF_DIR="${SPARK_CONF_DIR:-"$SPARK_HOME/conf"}"

--- End diff --

I'm not a bash expert, so I'm curious: does the nesting of double quotes work properly here? Here, we have double quotes inside of the `${}` and surrounding it, too.
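To the quoting question: yes, in bash the inner double quotes inside `${...}` open their own quoting context, so the nesting is legal and keeps paths containing spaces intact. A small sketch; the paths `/opt/spark dir` and `/etc/spark` are made up for illustration:

```shell
# Nested quotes inside ${VAR:-default}: the inner "" pair is parsed as its
# own quoting context within the expansion, so this is valid bash.
SPARK_HOME="/opt/spark dir"   # hypothetical install path containing a space
unset SPARK_CONF_DIR          # simulate the user not overriding it
export SPARK_CONF_DIR="${SPARK_CONF_DIR:-"$SPARK_HOME/conf"}"
echo "$SPARK_CONF_DIR"        # prints: /opt/spark dir/conf

SPARK_CONF_DIR="/etc/spark"   # simulate a user-defined value
export SPARK_CONF_DIR="${SPARK_CONF_DIR:-"$SPARK_HOME/conf"}"
echo "$SPARK_CONF_DIR"        # prints: /etc/spark
```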
[GitHub] spark pull request: [SQL] Prevents per row dynamic dispatching and...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2592#issuecomment-57575171 Yes, it should go after the DP PR.
[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1717#issuecomment-57575122 Do you mind adding "closes #2098" to the description of your PR so that this automatically closes the other PR when merged? Thanks!!
[GitHub] spark pull request: [SPARK-1853] Show Streaming application code c...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1723#issuecomment-57575023 Hi @mubarak, This issue has been fixed by #2464, so do you mind closing this? Thanks! (Due to the way that this GitHub mirror is set up, we don't have permission to close your PR).
[GitHub] spark pull request: [SPARK-3755][Core] Do not bind port 1 - 1024 t...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/2623#issuecomment-57574915 https://github.com/apache/spark/pull/2610
[GitHub] spark pull request: SPARK-2201 Improve FlumeInputDStream's stabili...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1310#issuecomment-57574730 Hi @joyyoj, Since this pull request doesn't show any code / changes, do you mind closing it? Feel free to update / re-open if you have code that you'd like us to review. Thanks!
[GitHub] spark pull request: [SPARK-2706][SQL] Enable Spark to support Hive...
Github user zhzhan commented on the pull request: https://github.com/apache/spark/pull/2241#issuecomment-57574745 @pwendell I think the packaging has a problem, probably in protobuf. I ran some test suites, but they cannot pass; with the original package, the tests are OK. Following are some example failure cases.

sbt/sbt -Dhive.version=0.13.1 -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 "test-only org.apache.spark.sql.hive.CachedTableSuite"
Caused by: sbt.ForkMain$ForkError: com.google.protobuf_spark.GeneratedMessage
  at java.net.URLClassLoader$1.run(URLClassLoader.java:366)

sbt/sbt -Dhive.version=0.13.1 -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 "test-only org.apache.spark.sql.hive.execution.HiveQuerySuite"
[info] ...
[info] Cause: java.lang.ClassNotFoundException: com.google.protobuf_spark.GeneratedMessage
[info]   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)

sbt/sbt -Dhive.version=0.13.1 -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 "test-only org.apache.spark.sql.parquet.ParquetMetastoreSuite"
[info] ...
[info] Cause: java.lang.ClassNotFoundException: com.google.protobuf_spark.GeneratedMessage
[info]   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57573144 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21169/
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57573135 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21169/consoleFull) for PR 2609 at commit [`77a9de0`](https://github.com/apache/spark/commit/77a9de0adb733c406440e0f498f888939461f831). * This patch **fails** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2337
[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2337#issuecomment-57572402 I've given it some thought and I don't think that we should merge the more general async. mechanism that I described in #2482. It had some confusing semantics surrounding cancellation (see the discussion of Thread.interrupt) and was probably more general than what most users need. Given that we should probably keep the current async APIs, this PR's change looks good. I'm going to merge this into `master`. Thanks for this commit and sorry for the long review delay!
[GitHub] spark pull request: [SPARK-1720][SPARK-1719] Add the value of LD_L...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1031#issuecomment-57572266 Ok, I'll try to use LD_LIBRARY_PATH.
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57572114 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21171/consoleFull) for PR 2595 at commit [`a0d9de3`](https://github.com/apache/spark/commit/a0d9de33d6b8ea7dec2e6421a5debd5310d3aa03).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-1813. Add a utility to SparkConf that ma...
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/789#issuecomment-57572019 I think it's better to do both and explain that there might be problems. Otherwise users will see this new API and perhaps be surprised that their old registrator is no longer called. Not everyone reads the docs on the new API, so they might never notice, and just get poor performance. BTW, looking at Kryo's docs, it does support multiple register calls on the same class, and it just uses the value from the last one. So it will probably do the right thing here if we call their custom registrator last.
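The "last registration wins" behavior mateiz describes can be sketched with a plain Python analogue (this is illustrative only, not the Kryo API): a registry keyed by class where a second `register` call for the same class overwrites the first, so running the user's custom registrator last makes its choices take effect.

```python
# Illustrative analogue of last-wins serializer registration.
# Names (SerializerRegistry, serializer_for) are hypothetical.
class SerializerRegistry:
    def __init__(self):
        self._serializers = {}

    def register(self, cls, serializer):
        # A second call for the same class silently replaces the first,
        # mirroring the Kryo behavior described in the comment above.
        self._serializers[cls] = serializer

    def serializer_for(self, cls):
        return self._serializers[cls]


registry = SerializerRegistry()
registry.register(dict, "framework-default-serializer")  # registered first
registry.register(dict, "custom-user-serializer")        # user registrator, run last
assert registry.serializer_for(dict) == "custom-user-serializer"
```

Because the later call overwrites the earlier one, calling the default registrations first and the user's registrator last preserves the user's overrides, which is the ordering argued for above.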
[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...
Github user JoshRosen closed the pull request at: https://github.com/apache/spark/pull/2482
[GitHub] spark pull request: [SPARK-3626] [WIP] Replace AsyncRDDActions wit...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2482#issuecomment-57572018 I'm going to close this for now. My approach has some confusing semantics and may be more general than what most users need.
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user chouqin commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57571761 `NetworkReceiverSuite` in spark-streaming has failed; it is not related to this PR.
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user chouqin commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57571778 Jenkins, retest this please
[GitHub] spark pull request: [SPARK-2750] support https in spark web ui
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-57570323 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21165/consoleFull) for PR 1980 at commit [`a29ec86`](https://github.com/apache/spark/commit/a29ec8632cce8cb29c38d8e9e3aee2334400130b).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * ` println(s"Failed to load main class $childMainClass.")`
[GitHub] spark pull request: [SPARK-2750] support https in spark web ui
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-57570330 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21165/
[GitHub] spark pull request: [SPARK-3677] [BUILD] [YARN] pom.xml and SparkB...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2520#issuecomment-57570180 @ScrapCodes Do you have any better idea for this issue?
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57570139 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21166/
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57570133 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21166/consoleFull) for PR 2595 at commit [`a0d9de3`](https://github.com/apache/spark/commit/a0d9de33d6b8ea7dec2e6421a5debd5310d3aa03).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * ` println(s"Failed to load main class $childMainClass.")`
[GitHub] spark pull request: [WIP][SPARK-3212][SQL] Use logical plan matchi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2501#issuecomment-57569132 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21170/
[GitHub] spark pull request: [WIP][SPARK-3212][SQL] Use logical plan matchi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2501#issuecomment-57569130 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21170/consoleFull) for PR 2501 at commit [`bdf9a3f`](https://github.com/apache/spark/commit/bdf9a3f9dab4e4fe7cc89cc9a32a31e7511bf8de).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class LogicalRDD(output: Seq[Attribute], rdd: RDD[Row])(sqlContext: SQLContext)`
  * `case class PhysicalRDD(output: Seq[Attribute], rdd: RDD[Row]) extends LeafNode `
  * `case class ExistingRdd(output: Seq[Attribute], rdd: RDD[Row]) extends LeafNode `
  * `case class SparkLogicalPlan(alreadyPlanned: SparkPlan)(@transient sqlContext: SQLContext)`
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57568876 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21169/consoleFull) for PR 2609 at commit [`77a9de0`](https://github.com/apache/spark/commit/77a9de0adb733c406440e0f498f888939461f831).
* This patch merges cleanly.
[GitHub] spark pull request: [WIP][SPARK-3212][SQL] Use logical plan matchi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2501#issuecomment-57568818 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21170/consoleFull) for PR 2501 at commit [`bdf9a3f`](https://github.com/apache/spark/commit/bdf9a3f9dab4e4fe7cc89cc9a32a31e7511bf8de).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-3638 | Forced a compatible version of ht...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2535
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57568072 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21168/
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57568070 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21168/consoleFull) for PR 2609 at commit [`7b7cae4`](https://github.com/apache/spark/commit/7b7cae481661b7b2a58ad96e08291573a0e043e7).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: SPARK-3638 | Forced a compatible version of ht...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2535#issuecomment-57567934 This looks good to me. Thanks!
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57567979 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21168/consoleFull) for PR 2609 at commit [`7b7cae4`](https://github.com/apache/spark/commit/7b7cae481661b7b2a58ad96e08291573a0e043e7).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57567664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21167/
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57567662 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21167/consoleFull) for PR 2609 at commit [`a045620`](https://github.com/apache/spark/commit/a04562069843603725cd4320afce9e5b19abe53b).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-1860] More conservative app directory c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2609#issuecomment-57567579 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21167/consoleFull) for PR 2609 at commit [`a045620`](https://github.com/apache/spark/commit/a04562069843603725cd4320afce9e5b19abe53b).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3755][Core] Do not bind port 1 - 1024 t...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2610#issuecomment-57567022 Oops, you're right.
[GitHub] spark pull request: [SPARK-2096][SQL] support dot notation on arbi...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2405#issuecomment-57566803 Okay, here are some thoughts and questions:
- I don't think it really matters that we can't handle `f1.f11 > f2.f22`, because we already don't know what to do if a user writes `[1,2] > [0,3]` even without this new syntax.
- Am I correct in saying that Hive doesn't support this syntax at all and that we are inventing new functionality? I'm not strictly opposed to this, but we should be careful, as once we support something we can't get rid of it later.
- I'm not convinced that we need to handle arbitrary array nesting here. Getting all of one field from an array (which I guess makes this SQL shorthand for `array.map(_.fieldName)`) seems reasonable, but is there a use case for the arbitrary-nesting version?
- This ends up complicating `GetField` quite a bit. What about creating a new expression type, `ArrayGetField`, and adding something to the analyzer that switches expression types when an array is detected? The idea here is to keep each expression simple so we can code-gen on a case-by-case basis.
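The `array.map(_.fieldName)` equivalence mentioned above can be sketched in plain Python (the data and the `get_field` helper are hypothetical, used only to illustrate the proposed semantics, not the Spark SQL implementation): dot notation on an array of structs would project one field out of every element.

```python
# Hypothetical illustration of the proposed shorthand: for an array of
# structs (here, a list of dicts), `employees.name` would behave like
# Scala's `employees.map(_.name)`, yielding a new array of that field.
employees = [
    {"name": "ada", "age": 36},
    {"name": "grace", "age": 45},
]

def get_field(array, field_name):
    # Project `field_name` out of every struct in the array.
    return [element[field_name] for element in array]

assert get_field(employees, "name") == ["ada", "grace"]
```

The open question in the review is whether this projection should also recurse through arbitrarily nested arrays, or stay limited to the single-level case shown here.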
[GitHub] spark pull request: [SPARK-3398] [EC2] Have spark-ec2 intelligentl...
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2339#issuecomment-57566795 > This patch adds the following public classes (experimental): println(s"Failed to load main class $childMainClass.") FYI: I believe I have these phantom new class notes finally sorted out in #2606.
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2538#discussion_r18318446
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/python/PythonDStream.scala ---
@@ -0,0 +1,304 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.streaming.api.python
+
+import java.io.{ObjectInputStream, ObjectOutputStream}
+import java.lang.reflect.Proxy
+import java.util.{ArrayList => JArrayList, List => JList}
+import scala.collection.JavaConversions._
+import scala.collection.JavaConverters._
+
+import org.apache.spark.api.java._
+import org.apache.spark.api.python._
+import org.apache.spark.rdd.RDD
+import org.apache.spark.storage.StorageLevel
+import org.apache.spark.streaming.{Interval, Duration, Time}
+import org.apache.spark.streaming.dstream._
+import org.apache.spark.streaming.api.java._
+
+
+/**
+ * Interface for Python callback function with three arguments
+ */
+private[python] trait PythonTransformFunction {
+  def call(time: Long, rdds: JList[_]): JavaRDD[Array[Byte]]
+}
+
+/**
+ * Wrapper for PythonTransformFunction
+ * TODO: support checkpoint
+ */
+private[python] class TransformFunction(@transient var pfunc: PythonTransformFunction)
+  extends function.Function2[JList[JavaRDD[_]], Time, JavaRDD[Array[Byte]]] with Serializable {
+
+  def apply(rdd: Option[RDD[_]], time: Time): Option[RDD[Array[Byte]]] = {
+    Option(pfunc.call(time.milliseconds, List(rdd.map(JavaRDD.fromRDD(_)).orNull).asJava))
+      .map(_.rdd)
+  }
+
+  def apply(rdd: Option[RDD[_]], rdd2: Option[RDD[_]], time: Time): Option[RDD[Array[Byte]]] = {
+    val rdds = List(rdd.map(JavaRDD.fromRDD(_)).orNull, rdd2.map(JavaRDD.fromRDD(_)).orNull).asJava
+    Option(pfunc.call(time.milliseconds, rdds)).map(_.rdd)
+  }
+
+  // for function.Function2
+  def call(rdds: JList[JavaRDD[_]], time: Time): JavaRDD[Array[Byte]] = {
+    pfunc.call(time.milliseconds, rdds)
+  }
+
+  private def writeObject(out: ObjectOutputStream): Unit = {
+    assert(PythonDStream.serializer != null, "Serializer has not been registered!")
+    val bytes = PythonDStream.serializer.serialize(pfunc)
+    out.writeInt(bytes.length)
+    out.write(bytes)
+  }
+
+  private def readObject(in: ObjectInputStream): Unit = {
+    assert(PythonDStream.serializer != null, "Serializer has not been registered!")
+    val length = in.readInt()
+    val bytes = new Array[Byte](length)
+    in.readFully(bytes)
+    pfunc = PythonDStream.serializer.deserialize(bytes)
+  }
+}
+
+/**
+ * Interface for Python Serializer to serialize PythonTransformFunction
+ */
+private[python] trait PythonTransformFunctionSerializer {
+  def dumps(id: String): Array[Byte] //
--- End diff --
Extra `//`

nit: move this trait to be near `PythonTransformFunction`
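The `writeObject`/`readObject` pair in the diff above uses length-prefixed framing: write a 4-byte integer length, then the payload, and read it back symmetrically. A minimal Python sketch of that framing (illustrative only; `write_frame`/`read_frame` are hypothetical names, not part of the PR):

```python
import io
import struct

def write_frame(out, payload):
    """Write a 4-byte big-endian length, then the payload
    (like out.writeInt(bytes.length); out.write(bytes) in the diff)."""
    out.write(struct.pack(">i", len(payload)))
    out.write(payload)

def read_frame(inp):
    """Read the 4-byte length, then exactly that many payload bytes
    (like in.readInt(); in.readFully(bytes) in the diff)."""
    (length,) = struct.unpack(">i", inp.read(4))
    return inp.read(length)

buf = io.BytesIO()
write_frame(buf, b"serialized pfunc")
buf.seek(0)
assert read_frame(buf) == b"serialized pfunc"
```

The length prefix is what lets `readObject` know exactly how many bytes of the serialized Python function to pull from the object stream before normal Java deserialization resumes.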
[GitHub] spark pull request: [SPARK-3398] [EC2] Have spark-ec2 intelligentl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2339#issuecomment-57566562 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21163/
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2538#discussion_r18318422
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/api/python/PythonDStream.scala ---
@@ -0,0 +1,304 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.streaming.api.python
+
+import java.io.{ObjectInputStream, ObjectOutputStream}
+import java.lang.reflect.Proxy
+import java.util.{ArrayList => JArrayList, List => JList}
+import scala.collection.JavaConversions._
+import scala.collection.JavaConverters._
+
+import org.apache.spark.api.java._
+import org.apache.spark.api.python._
+import org.apache.spark.rdd.RDD
+import org.apache.spark.storage.StorageLevel
+import org.apache.spark.streaming.{Interval, Duration, Time}
+import org.apache.spark.streaming.dstream._
+import org.apache.spark.streaming.api.java._
+
+
+/**
+ * Interface for Python callback function with three arguments
+ */
+private[python] trait PythonTransformFunction {
+  def call(time: Long, rdds: JList[_]): JavaRDD[Array[Byte]]
+}
+
+/**
+ * Wrapper for PythonTransformFunction
+ * TODO: support checkpoint
+ */
+private[python] class TransformFunction(@transient var pfunc: PythonTransformFunction)
+  extends function.Function2[JList[JavaRDD[_]], Time, JavaRDD[Array[Byte]]] with Serializable {
--- End diff --
Function2 is already Serializable.
[GitHub] spark pull request: [SPARK-3398] [EC2] Have spark-ec2 intelligentl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2339#issuecomment-57566559 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21163/consoleFull) for PR 2339 at commit [`43a69f0`](https://github.com/apache/spark/commit/43a69f00a2fd004a57b860b3ee6bda8fc1e9f840).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * ` println(s"Failed to load main class $childMainClass.")`
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/2538#discussion_r18318391

--- Diff: python/pyspark/streaming/tests.py ---
@@ -0,0 +1,532 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import os
+from itertools import chain
+import time
+import operator
+import unittest
+import tempfile
+
+from pyspark.context import SparkConf, SparkContext, RDD
+from pyspark.streaming.context import StreamingContext
+
+
+class PySparkStreamingTestCase(unittest.TestCase):
+
+    timeout = 10  # seconds
+    duration = 1
+
+    def setUp(self):
+        class_name = self.__class__.__name__
+        conf = SparkConf().set("spark.default.parallelism", 1)
+        self.sc = SparkContext(appName=class_name, conf=conf)
+        self.sc.setCheckpointDir("/tmp")
+        # TODO: decrease duration to speed up tests
+        self.ssc = StreamingContext(self.sc, self.duration)
+
+    def tearDown(self):
+        self.ssc.stop()
+
+    def _take(self, dstream, n):
+        """
+        Return the first `n` elements in the stream (will start and stop).
+        """
+        results = []
+
+        def take(_, rdd):
+            if rdd and len(results) < n:
+                results.extend(rdd.take(n - len(results)))
+
+        dstream.foreachRDD(take)
+
+        self.ssc.start()
+        while len(results) < n:
+            time.sleep(0.01)
+        self.ssc.stop(False, True)
+        return results
+
+    def _collect(self, dstream):
+        """
+        Collect each RDD into the returned list.
+
+        :return: list, which will have the collected items.
+        """
+        result = []
+
+        def get_output(_, rdd):
+            r = rdd.collect()
+            if r:
+                result.append(r)
+        dstream.foreachRDD(get_output)
+        return result
+
+    def _test_func(self, input, func, expected, sort=False, input2=None):
+        """
+        @param input: dataset for the test. This should be a list of lists.
+        @param func: wrapped function. This function should return a PythonDStream object.
+        @param expected: expected output for this testcase.
+        """
+        if not isinstance(input[0], RDD):
+            input = [self.sc.parallelize(d, 1) for d in input]
+        input_stream = self.ssc.queueStream(input)
+        if input2 and not isinstance(input2[0], RDD):
+            input2 = [self.sc.parallelize(d, 1) for d in input2]
+        input_stream2 = self.ssc.queueStream(input2) if input2 is not None else None
+
+        # Apply test function to stream.
+        if input2:
+            stream = func(input_stream, input_stream2)
+        else:
+            stream = func(input_stream)
+
+        result = self._collect(stream)
+        self.ssc.start()
+
+        start_time = time.time()
+        # Loop until we get the expected number of results from the stream.
+        while True:
+            current_time = time.time()
+            # Check time out.
+            if (current_time - start_time) > self.timeout:
+                print "timeout after", self.timeout
+                break
+            # StreamingContext.awaitTermination is not used to wait because
+            # if the py4j server is called every 50 milliseconds, it gets an error.
+            time.sleep(0.05)
+            # Check if the output is the same length as the expected output.
+            if len(expected) == len(result):
+                break
+        if sort:
+            self._sort_result_based_on_key(result)
+            self._sort_result_based_on_key(expected)
+        self.assertEqual(expected, result)
+
+    def _sort_result_based_on_key(self, outputs):
+        """Sort the list based on first value."""
+        for output in outputs:
[GitHub] spark pull request: [SPARK-2489] [SQL] Parquet support for fixed_l...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1737#issuecomment-57566129 You are right that we would have to change BinaryType to a case class that holds this information, and then update the rest of the code to deal with that. It is possible that we could play some tricks with the `unapply` method in the BinaryType companion object to minimize the changes to pattern-matching code; I'd have to play around with it more to see whether that is actually feasible, though.
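The extractor trick mentioned above could look roughly like this (a hypothetical sketch, not Spark's actual code; the names, the `fixedLength` field, and the overall shape are assumptions for illustration only):

```scala
sealed trait DataType

// BinaryType becomes a case class so it can carry an optional fixed length,
// as parquet's fixed_len_byte_array would require.
case class BinaryType(fixedLength: Option[Int]) extends DataType

object BinaryType {
  // Default instance for call sites that used the old BinaryType singleton.
  val default: BinaryType = BinaryType(None)

  // Extra boolean extractor: `case BinaryType() =>` matches any BinaryType
  // regardless of length, keeping changes to existing pattern matches small.
  def unapply(dt: DataType): Boolean = dt.isInstanceOf[BinaryType]
}

// An existing match only needs `BinaryType` changed to `BinaryType()`:
def describe(dt: DataType): String = dt match {
  case BinaryType() => "binary"
  case _            => "other"
}
```

The boolean `unapply` overload is selected when the scrutinee's static type is `DataType`, so most matching code would not need to care about the new field.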
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57565848 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21166/consoleFull) for PR 2595 at commit [`a0d9de3`](https://github.com/apache/spark/commit/a0d9de33d6b8ea7dec2e6421a5debd5310d3aa03).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3479] [Build] Report failed test catego...
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/2606#discussion_r18318253

--- Diff: dev/run-tests-jenkins ---
@@ -84,42 +98,46 @@ function post_message () {
   fi
 }
 
+
+# We diff master...$ghprbActualCommit because that gets us changes introduced in the PR
+#+ and not anything else added to master since the PR was branched.
+
 # check PR merge-ability and check for new public classes
 {
   if [ "$sha1" == "$ghprbActualCommit" ]; then
-    merge_note=" * This patch **does not** merge cleanly!"
+    merge_note=" * This patch **does not merge cleanly**."
   else
     merge_note=" * This patch merges cleanly."
+  fi
+
+  source_files=$(
+    git diff master...$ghprbActualCommit --name-only `# diff patch against master from branch point` \
--- End diff --

After lots of trial and error, I'm pretty sure this is the correct way to do this diff, and I understand why. A brief explanation is included as a comment earlier in the file.
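The reasoning behind the comment above can be sketched on a toy commit graph (hypothetical illustration, not part of the build scripts): `git diff master...$ghprbActualCommit` diffs the PR head against the *merge base* of the two refs, so commits added to master after the PR branched off are excluded.

```scala
object MergeBaseDemo {
  // Toy history: base <- m1 <- m2 (master) and base <- p1 <- p2 (the PR).
  val parents = Map(
    "m2" -> List("m1"), "m1" -> List("base"),
    "p2" -> List("p1"), "p1" -> List("base"),
    "base" -> List.empty[String])

  // A commit's ancestor set, including itself.
  def ancestors(c: String): Set[String] =
    parents(c).flatMap(ancestors).toSet + c

  // Common ancestors of two heads; the merge base is the nearest one.
  // `master...pr` diffs from here to pr, so m1/m2 never show up.
  def mergeBase(a: String, b: String): Set[String] =
    ancestors(a) intersect ancestors(b)

  def main(args: Array[String]): Unit =
    println(mergeBase("m2", "p2")) // Set(base)
}
```

A two-dot `git diff master..pr` (or plain `git diff master pr`) would instead compare the two tips directly, pulling in unrelated master changes.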
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57565744 @chouqin Sorry for the delay in my review. I will finish mine within the next 24 hours.
[GitHub] spark pull request: [SPARK-3479] [Build] Report failed test catego...
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/2606#discussion_r18318239

--- Diff: dev/run-tests-jenkins ---
@@ -84,42 +98,46 @@ function post_message () {
   fi
 }
 
+
+# We diff master...$ghprbActualCommit because that gets us changes introduced in the PR
+#+ and not anything else added to master since the PR was branched.
+
 # check PR merge-ability and check for new public classes
 {
   if [ "$sha1" == "$ghprbActualCommit" ]; then
-    merge_note=" * This patch **does not** merge cleanly!"
+    merge_note=" * This patch **does not merge cleanly**."
   else
     merge_note=" * This patch merges cleanly."
+  fi
+
+  source_files=$(
--- End diff --

We can do a valid diff regardless of the merge-ability of the patch, so I moved this out of the if block.
[GitHub] spark pull request: [SPARK-3479] [Build] [WIP] Report failed test ...
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2606#issuecomment-57565606 cc @pwendell This PR is ready for review. Here are examples of the messages posted when:

* [all tests pass](https://github.com/apache/spark/pull/2606#issuecomment-57548426)
* [PySpark unit tests fail and there is a new class](https://github.com/apache/spark/pull/2606#issuecomment-57535539)
* [Spark unit tests fail](https://github.com/apache/spark/pull/2606#issuecomment-57522419)
* [RAT tests fail](https://github.com/apache/spark/pull/2606#issuecomment-57510289)
[GitHub] spark pull request: [SPARK-3366][MLLIB]Compute best splits distrib...
Github user chouqin commented on the pull request: https://github.com/apache/spark/pull/2595#issuecomment-57565298 @mengxr @jkbradley Thanks for your comments. It passes the unit tests now; do you have any more suggestions?
[GitHub] spark pull request: [SPARK-2750] support https in spark web ui
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1980#issuecomment-57564475 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21165/consoleFull) for PR 1980 at commit [`a29ec86`](https://github.com/apache/spark/commit/a29ec8632cce8cb29c38d8e9e3aee2334400130b).
* This patch merges cleanly.
[GitHub] spark pull request: SPARK-3711: Optimize where in clause filter qu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2561#issuecomment-57564269 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/250/consoleFull) for PR 2561 at commit [`430f5d1`](https://github.com/apache/spark/commit/430f5d15a95ddda314d5750e5b42fdc5e2fac4ba).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.