[GitHub] spark pull request: [SPARK-12399] Display correct error message wh...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10352#issuecomment-165378339 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47910/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-9578] [ML] Stemmer feature transformer
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10272#issuecomment-165378340 **[Test build #47903 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47903/consoleFull)** for PR 10272 at commit [`ff03152`](https://github.com/apache/spark/commit/ff03152daa3d710dbb54b244488f9eb4b4a80378). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class Stemmer (override val uid: String)`
[GitHub] spark pull request: [SPARK-12399] Display correct error message wh...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10352#issuecomment-165378338 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12218] [SQL] [Backport-1.5] Fixed the P...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10344#issuecomment-165378986 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user BenFradet commented on a diff in the pull request: https://github.com/apache/spark/pull/9593#discussion_r47878665 --- Diff: docs/configuration.md --- @@ -1523,6 +1523,15 @@ Apart from these, the following properties are also available, and may be useful + spark.streaming.backpressure.initialRate + not set + +Initial rate for backpressure mechanism (since 1.5). This provides maximum receiving rate of +receivers in the first batch when enables the backpressure mechanism, then the maximum receiving +rate will compute dynamically based on the current batch scheduling delays and processing times. --- End diff -- "will **be** compute**d** dynamically"
[GitHub] spark pull request: [SPARK-12218] [SQL] [Backport-1.5] Fixed the P...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10344#issuecomment-165378989 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47899/ Test PASSed.
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user BenFradet commented on a diff in the pull request: https://github.com/apache/spark/pull/9593#discussion_r47878960 --- Diff: docs/configuration.md --- @@ -1523,6 +1523,15 @@ Apart from these, the following properties are also available, and may be useful + spark.streaming.backpressure.initialRate + not set + +Initial rate for backpressure mechanism (since 1.5). This provides maximum receiving rate of --- End diff -- "for **the** backpressure mechanism"
[GitHub] spark pull request: [SPARK-12395] [SQL] fix resulting columns of o...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10353#issuecomment-165380331 cc @rxin @liancheng
[GitHub] spark pull request: [SPARK-12395] [SQL] fix resulting columns of o...
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/10353 [SPARK-12395] [SQL] fix resulting columns of outer join For API DataFrame.join(right, usingColumns, joinType), if the joinType is right_outer or full_outer, the resulting join columns could be wrong (will be null). The order of columns has been changed to match that of MySQL and PostgreSQL [1]. This PR also fixes the nullability of output for outer join. [1] http://www.postgresql.org/docs/9.2/static/queries-table-expressions.html You can merge this pull request into a Git repository by running: $ git pull https://github.com/davies/spark fix_join Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10353.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10353 commit f5ab9cbd6af264a03c00ca93d4d58c94a46bf468 Author: Davies Liu Date: 2015-12-17T08:11:38Z fix resulting columns of outer join
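The column behaviour described in this pull request can be illustrated without Spark. The following is a minimal sketch, assuming the SQL `USING` convention documented by PostgreSQL: the join column comes first in the output and, for a full outer join, takes whichever side is non-null. The names `UsingJoinSketch`, `L`, `R`, `Out` and `fullOuterUsing` are hypothetical and are not part of the Spark API or of the patch itself.

```scala
object UsingJoinSketch {
  // Hypothetical row types: both sides share the join column `id`.
  final case class L(id: Int, name: String)
  final case class R(id: Int, city: String)
  // Output layout: join column first, then left-only columns, then right-only columns.
  final case class Out(id: Int, name: Option[String], city: Option[String])

  type Pair = (Option[L], Option[R])

  // Full outer join on `id`, written as a naive nested loop for clarity.
  def fullOuterUsing(left: Seq[L], right: Seq[R]): Seq[Out] = {
    // Matched pairs plus left rows that have no partner.
    val fromLeft: Seq[Pair] = left.flatMap { l =>
      val matches = right.filter(_.id == l.id)
      if (matches.isEmpty) Seq[Pair]((Some(l), None))
      else matches.map(r => (Some(l), Some(r)): Pair)
    }
    // Right rows that have no partner on the left.
    val rightOnly: Seq[Pair] =
      right.filterNot(r => left.exists(_.id == r.id)).map(r => (None, Some(r)): Pair)

    (fromLeft ++ rightOnly).map { case (l, r) =>
      // coalesce(left.id, right.id): at least one side is always defined,
      // so the join column is never null in the output.
      val id = l.map(_.id).orElse(r.map(_.id)).get
      Out(id, l.map(_.name), r.map(_.city))
    }
  }
}
```

Taking the join column from the left side only would leave it empty for the right-only rows, which is the symptom the description mentions for `right_outer` and `full_outer`.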
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user BenFradet commented on a diff in the pull request: https://github.com/apache/spark/pull/9593#discussion_r47879107 --- Diff: docs/configuration.md --- @@ -1523,6 +1523,15 @@ Apart from these, the following properties are also available, and may be useful + spark.streaming.backpressure.initialRate + not set + +Initial rate for backpressure mechanism (since 1.5). This provides maximum receiving rate of +receivers in the first batch when enables the backpressure mechanism, then the maximum receiving --- End diff -- I'd say: "This is the initial maximum receiving rate at which each receiver will receive data for the first batch when the backpressure mechanism is enabled."
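For readers unfamiliar with the property under review, here is a minimal configuration sketch. It assumes `spark-core` is on the classpath and that the property keeps the name shown in the diff, `spark.streaming.backpressure.initialRate`; the value `1000` and the object name `BackpressureConfSketch` are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.SparkConf

object BackpressureConfSketch {
  // Enable backpressure and cap the receiving rate for the first batch.
  // 1000 is a hypothetical value, not a recommendation.
  val conf: SparkConf = new SparkConf()
    .setAppName("backpressure-sketch")
    .set("spark.streaming.backpressure.enabled", "true")
    .set("spark.streaming.backpressure.initialRate", "1000")
}
```

According to the text in the diff, the initial rate only bounds the first batch; after that the rate is derived from batch scheduling delays and processing times.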
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-165380452 A few remarks regarding the doc.
[GitHub] spark pull request: [SPARK-12388] change default compression to lz...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10342#issuecomment-165380956 I have patched LZ4BlockInputStream to support concatenated streams.
[GitHub] spark pull request: [SPARK-12399] Display correct error message wh...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10352#issuecomment-165380912 **[Test build #47915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47915/consoleFull)** for PR 10352 at commit [`e926120`](https://github.com/apache/spark/commit/e926120a12fd5182b09a1095097ee9e5ccf2e935).
[GitHub] spark pull request: [SPARK-12388] change default compression to lz...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10342#issuecomment-165389625 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47898/ Test FAILed.
[GitHub] spark pull request: [SPARK-8519] [ML] [MLlib] Blockify distance co...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/10306#discussion_r47878122 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -250,114 +240,142 @@ class KMeans private ( } } } + val initTimeInSeconds = (System.nanoTime() - initStartTime) / 1e9 logInfo(s"Initialization with $initializationMode took " + "%.3f".format(initTimeInSeconds) + " seconds.") -val active = Array.fill(numRuns)(true) -val costs = Array.fill(numRuns)(0.0) - -var activeRuns = new ArrayBuffer[Int] ++ (0 until numRuns) +var costs = 0.0 var iteration = 0 - val iterationStartTime = System.nanoTime() +val isSparse = data.take(1)(0).vector.isInstanceOf[SparseVector] -// Execute iterations of Lloyd's algorithm until all runs have converged -while (iteration < maxIterations && !activeRuns.isEmpty) { +// Execute Lloyd's algorithm until converged or reached the max number of iterations +while (iteration < maxIterations) { type WeightedPoint = (Vector, Long) def mergeContribs(x: WeightedPoint, y: WeightedPoint): WeightedPoint = { axpy(1.0, x._1, y._1) (y._1, x._2 + y._2) } - val activeCenters = activeRuns.map(r => centers(r)).toArray - val costAccums = activeRuns.map(_ => sc.accumulator(0.0)) - - val bcActiveCenters = sc.broadcast(activeCenters) + val costAccums = sc.accumulator(0.0) + val bcCenters = sc.broadcast(centers) // Find the sum and count of points mapping to each center val totalContribs = data.mapPartitions { points => -val thisActiveCenters = bcActiveCenters.value -val runs = thisActiveCenters.length -val k = thisActiveCenters(0).length -val dims = thisActiveCenters(0)(0).vector.size +val thisCenters = bcCenters.value +val k = thisCenters.length +val dims = thisCenters(0).vector.size + +val sums = Array.fill(k)(Vectors.zeros(dims)) +val counts = Array.fill(k)(0L) -val sums = Array.fill(runs, k)(Vectors.zeros(dims)) -val counts = Array.fill(runs, k)(0L) +val vectorOfPoints = new ArrayBuffer[Vector]() +val 
normOfPoints = new ArrayBuffer[Double]() +var numRows = 0 +// Construct points matrix points.foreach { point => - (0 until runs).foreach { i => -val (bestCenter, cost) = KMeans.findClosest(thisActiveCenters(i), point) -costAccums(i) += cost -val sum = sums(i)(bestCenter) -axpy(1.0, point.vector, sum) -counts(i)(bestCenter) += 1 + vectorOfPoints.append(point.vector) + normOfPoints.append(point.norm) + numRows += 1 +} + +val pointMatrix = if (isSparse) { + val coo = new ArrayBuffer[(Int, Int, Double)]() + vectorOfPoints.zipWithIndex.foreach { v => +val sv = v._1.asInstanceOf[SparseVector] +sv.indices.indices.foreach { i => + coo.append((v._2, sv.indices(i), sv.values(i))) +} } + SparseMatrix.fromCOO(numRows, dims, coo.toSeq) +} else { + new DenseMatrix(numRows, dims, vectorOfPoints.flatMap(_.toArray).toArray, true) } -val contribs = for (i <- 0 until runs; j <- 0 until k) yield { - ((i, j), (sums(i)(j), counts(i)(j))) +// Construct centers matrix +val vectorOfCenters = new ArrayBuffer[Double]() +val normOfCenters = new ArrayBuffer[Double]() +thisCenters.foreach { center => + vectorOfCenters.appendAll(center.vector.toArray) + normOfCenters.append(center.norm) +} +val centerMatrix = new DenseMatrix(dims, k, vectorOfCenters.toArray) + +val a2b2 = new ArrayBuffer[Double]() +val normOfPointsArray = normOfPoints.toArray +val normOfCentersArray = normOfCenters.toArray +for (i <- 0 until k; j <- 0 until numRows) { + a2b2.append(normOfPointsArray(j) * normOfPointsArray(j) + +normOfCentersArray(i) * normOfCentersArray(i)) +} + +val distanceMatrix = new DenseMatrix(numRows, k, a2b2.toArray) +gemm(-2.0, pointMatrix, centerMatrix, 1.0, distanceMatrix) + +val vectorOfPointsArray = vectorOfPoints.toArray + distanceMatrix.transpose.toArray.grouped(k).toArray.map(_.zipWithIndex.min).zipWithIndex --- End diff -- Good point! --- If
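The diff above appears to rely on the expansion ||p - c||^2 = ||p||^2 + ||c||^2 - 2 (p . c): the squared norms are precomputed into `a2b2` and the cross term is added with a single `gemm` call. The following is a minimal plain-Scala sketch of that identity, assuming dense points and using ordinary loops in place of BLAS; the names `BlockDistanceSketch`, `dot` and `closest` are hypothetical and do not appear in the patch.

```scala
object BlockDistanceSketch {
  // Plain dot product of two equally sized arrays.
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // For each point, return (index of the closest center, squared distance),
  // using ||p - c||^2 = ||p||^2 + ||c||^2 - 2 * (p . c).
  def closest(points: Array[Array[Double]],
              centers: Array[Array[Double]]): Array[(Int, Double)] = {
    // Squared norms are computed once per point and once per center.
    val pointNorms2 = points.map(p => dot(p, p))
    val centerNorms2 = centers.map(c => dot(c, c))
    points.indices.map { j =>
      // One row of the distance matrix: distances from point j to every center.
      val dists = centers.indices.map { i =>
        pointNorms2(j) + centerNorms2(i) - 2.0 * dot(points(j), centers(i))
      }
      val best = dists.indices.minBy(i => dists(i))
      (best, dists(best))
    }.toArray
  }
}
```

In the blocked version the inner dot products become one matrix multiplication per partition, which is where the speed-up would come from; the expansion can lose precision when points and centers are nearly identical, so results may differ slightly from a direct subtraction.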
[GitHub] spark pull request: [SPARK-12218] [SQL] [Backport-1.5] Fixed the P...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10344#issuecomment-165378831 [Test build #47899 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47899/console) for PR 10344 at commit [`7d298fe`](https://github.com/apache/spark/commit/7d298fe8501563f10394d04ef43d3364a3dc43f3). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-9578] [ML] Stemmer feature transformer
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/10272#issuecomment-165378793 LGTM
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-165378836 Jenkins test this please
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-165378870 Jenkins add to whitelist
[GitHub] spark pull request: [SPARK-12218] [SQL] Fixed the Parquet's filter...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/10278#issuecomment-165379694

@gatorsmile Sorry for the late reply and thanks for the nice catch!

The `In` predicate push-down issue had been tracked by SPARK-11164, and done as part of PR #8956. Unfortunately, we didn't merge that PR due to other problems in it. Could you please add SPARK-11164 to your PR title?

For the `Not` push-down rule:

1. I'm for adding it to branch-1.5 since it's a pretty safe one.
2. I think we might also want to add a more general [CNF][1] conversion rule to master, which should be done in a separate PR, of course.

Since we don't have existential / universal quantifiers in our predicates, I think CNF conversion in Spark SQL can be as simple as repeatedly pushing `Not` and `Or` inward (or downward) using De Morgan's laws and the distributive law:

```scala
object CNFConversion extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    case filter: Filter =>
      import org.apache.spark.sql.catalyst.dsl.expressions._
      filter.copy(condition = filter.condition.transform {
        case Not(x Or y) => !x && !y
        case Not(x And y) => !x || !y
        case (x And y) Or z => (x || z) && (y || z)
        case x Or (y And z) => (x || y) && (x || z)
      })
  }
}
```

(Notice that this version doesn't handle common expression elimination.)

That said, the `Not` push-down rule is actually a subset of CNF conversion. There had once been a PR aimed at adding CNF conversion for data source filter push-down only, but it wasn't merged (see SPARK-6624 and PR #6713). As @marmbrus commented there, CNF conversion might be worth adding to the optimizer.

@rxin @marmbrus Not super confident about the CNF conversion conclusion above, please correct me if I'm wrong.

[1]: https://en.wikipedia.org/wiki/Conjunctive_normal_form
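The rewriting idea in the comment above can be tried outside Catalyst. Below is a minimal standalone sketch on a toy expression type, assuming quantifier-free predicates as stated in the comment. The names `CnfSketch`, `Expr`, `Atom`, `nnf`, `distribute` and `toCnf` are hypothetical; this is not the Catalyst rule and it adds a double-negation case that the quoted snippet does not show.

```scala
object CnfSketch {
  // Toy predicate language; stands in for Catalyst expressions.
  sealed trait Expr
  final case class Atom(name: String) extends Expr
  final case class Not(e: Expr) extends Expr
  final case class And(l: Expr, r: Expr) extends Expr
  final case class Or(l: Expr, r: Expr) extends Expr

  // Step 1: push negations down to the atoms using De Morgan's laws.
  def nnf(e: Expr): Expr = e match {
    case Not(Not(x))    => nnf(x)
    case Not(And(x, y)) => Or(nnf(Not(x)), nnf(Not(y)))
    case Not(Or(x, y))  => And(nnf(Not(x)), nnf(Not(y)))
    case And(x, y)      => And(nnf(x), nnf(y))
    case Or(x, y)       => Or(nnf(x), nnf(y))
    case other          => other // Atom or Not(Atom)
  }

  // Step 2: distribute Or over And; both arguments must already be in CNF.
  def distribute(a: Expr, b: Expr): Expr = (a, b) match {
    case (And(x, y), z) => And(distribute(x, z), distribute(y, z))
    case (x, And(y, z)) => And(distribute(x, y), distribute(x, z))
    case (x, y)         => Or(x, y)
  }

  // Convert an expression that is already in negation normal form.
  private def cnfOfNnf(e: Expr): Expr = e match {
    case And(x, y) => And(cnfOfNnf(x), cnfOfNnf(y))
    case Or(x, y)  => distribute(cnfOfNnf(x), cnfOfNnf(y))
    case other     => other
  }

  def toCnf(e: Expr): Expr = cnfOfNnf(nnf(e))
}
```

Distribution can grow the expression exponentially in the worst case, which is one reason an optimizer rule of this kind would likely need a size limit.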
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-165381302 **[Test build #47907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47907/consoleFull)** for PR 9593 at commit [`5a1dd98`](https://github.com/apache/spark/commit/5a1dd982cc59b360a8bb14b73e778540b1d8f8ea). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: Once driver register successfully, stop it to ...
GitHub user echoTomei opened a pull request: https://github.com/apache/spark/pull/10354 Once driver register successfully, stop it to connect to master. This commit is to resolve SPARK-12396. You can merge this pull request into a Git repository by running: $ git pull https://github.com/echoTomei/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10354.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10354 commit 5bee290e357d8f855bcf22393fd076a8301f1001 Author: echo2mei <534384...@qq.com> Date: 2015-12-17T08:28:31Z Once driver register successfully, stop it to connect master again.
[GitHub] spark pull request: [SPARK-8519] [ML] [MLlib] Blockify distance co...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10306#issuecomment-165385198 **[Test build #47912 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47912/consoleFull)** for PR 10306 at commit [`8f76116`](https://github.com/apache/spark/commit/8f76116a471a54fd0cb6331ebf8eeccc5764a23e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-8519] [ML] [MLlib] Blockify distance co...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10306#issuecomment-165385374 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47912/ Test PASSed.
[GitHub] spark pull request: [SPARK-12397][SQL] Improve error messages for ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10351#issuecomment-165388142 **[Test build #47909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47909/consoleFull)** for PR 10351 at commit [`0bb2565`](https://github.com/apache/spark/commit/0bb256595a7311568d2a14086f480606eb4fb3d5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12397][SQL] Improve error messages for ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10351#issuecomment-165388269 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12388] change default compression to lz...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10342#issuecomment-165389539 **[Test build #47898 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47898/consoleFull)** for PR 10342 at commit [`f49f1ef`](https://github.com/apache/spark/commit/f49f1ef92eaeb7e55a96d3a2a2930cd5d4567f3f). * This patch **fails from timeout after a configured wait of `250m`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12388] change default compression to lz...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10342#issuecomment-165389624 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
GitHub user BenFradet opened a pull request: https://github.com/apache/spark/pull/10355 [SPARK-7425] [ML] [WIP] spark.ml Predictor should support other numeric types for label Currently, the Predictor abstraction expects the input labelCol type to be DoubleType, but we should support other numeric types. This will involve updating the PredictorParams.validateAndTransformSchema method. You can merge this pull request into a Git repository by running: $ git pull https://github.com/BenFradet/spark SPARK-7425 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10355.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10355 commit 2b276a29cc6183e6b42f828631c23e74bcd4144f Author: BenFradet Date: 2015-12-13T15:41:57Z check label data type for numeric type instead of double commit 7ef4ad4fce33f48e744fd0c9dbf30549eee76097 Author: BenFradet Date: 2015-12-13T15:47:42Z added some cases to extractLabeledPoints, looking for a better way to handle this commit e791ff6390284d6b0baae869de52f37bc0e38862 Author: BenFradet Date: 2015-12-13T16:40:30Z Added a method to set the metadata on a dataframe commit 83ffecba566f6a085fffa7dcf3194fa9f64edfc3 Author: BenFradet Date: 2015-12-13T16:41:43Z unit tests for the decision tree classifier commit 97dde27306984e3d78ca05d1cc99e47b18e48a8f Author: BenFradet Date: 2015-12-17T08:33:56Z used the sqlcontext provided with MLlibTestSparkContext commit c68ace1d7afe815ed252bacd0cebc576ef6e06b0 Author: BenFradet Date: 2015-12-17T09:06:46Z simpler version of extractLabeledPoints
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165389989 This is still a wip but remarks are welcome.
[GitHub] spark pull request: [SPARK-12289][SQL] Support UnsafeRow in TakeOr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10330#issuecomment-165383938 **[Test build #47916 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47916/consoleFull)** for PR 10330 at commit [`c343447`](https://github.com/apache/spark/commit/c343447db3e9db2571d746d7ac2218e58ba2f992).
[GitHub] spark pull request: [SPARK-11215] [ML] Add multiple columns suppor...
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/9183#issuecomment-165387557 +1, I've been meaning to request this transformer for a while.
[GitHub] spark pull request: [SPARK-9578] [ML] Stemmer feature transformer
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10272#issuecomment-165378488 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-8519] [ML] [MLlib] Blockify distance co...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10306#issuecomment-165378537 **[Test build #47912 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47912/consoleFull)** for PR 10306 at commit [`8f76116`](https://github.com/apache/spark/commit/8f76116a471a54fd0cb6331ebf8eeccc5764a23e).
[GitHub] spark pull request: [SPARK-9578] [ML] Stemmer feature transformer
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10272#issuecomment-165378491 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47903/ Test PASSed.
[GitHub] spark pull request: [SPARK-12368] [ML] [DOC] Better doc for the bi...
Github user BenFradet commented on the pull request: https://github.com/apache/spark/pull/10328#issuecomment-165380597 cc @jkbradley @mengxr
[GitHub] spark pull request: [SPARK-12054] [SQL] Consider nullability of ex...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/10333#discussion_r47879128

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameJoinSuite.scala ---
    @@ -48,11 +48,13 @@ class DataFrameJoinSuite extends QueryTest with SharedSQLContext {
         checkAnswer(
           df.join(df2, Seq("int", "str"), "left"),
    -      Row(1, 2, "1", null) :: Row(2, 3, "2", null) :: Row(3, 4, "3", null) :: Nil)
    +      Row(1, 2, "1", null, null, null) :: Row(2, 3, "2", null, null, null) ::
    +        Row(3, 4, "3", null, null, null) :: Nil)
         checkAnswer(
           df.join(df2, Seq("int", "str"), "right"),
    -      Row(null, null, null, 2) :: Row(null, null, null, 3) :: Row(null, null, null, 4) :: Nil)
    +      Row(null, null, null, 1, 2, "2") :: Row(null, null, null, 2, 3, "3") ::
    +        Row(null, null, null, 3, 4, "4") :: Nil)
--- End diff --

This bug will be fixed by https://github.com/apache/spark/pull/10353
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-165381420 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-165381448 **[Test build #47913 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47913/consoleFull)** for PR 10154 at commit [`3208d85`](https://github.com/apache/spark/commit/3208d85653b0d2df0c68bb67856fac1f50740ecf).
[GitHub] spark pull request: [SPARK-12395] [SQL] fix resulting columns of o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10353#issuecomment-165381524 **[Test build #47914 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47914/consoleFull)** for PR 10353 at commit [`f5ab9cb`](https://github.com/apache/spark/commit/f5ab9cbd6af264a03c00ca93d4d58c94a46bf468).
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-165381421 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47907/ Test PASSed.
[GitHub] spark pull request: Once driver register successfully, stop it to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10354#issuecomment-165385604 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-8519] [ML] [MLlib] Blockify distance co...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10306#issuecomment-165385372 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12397][SQL] Improve error messages for ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10351#issuecomment-165388272 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47909/ Test PASSed.
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165390564 **[Test build #47917 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47917/consoleFull)** for PR 10355 at commit [`c68ace1`](https://github.com/apache/spark/commit/c68ace1d7afe815ed252bacd0cebc576ef6e06b0).
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165390848 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165390845 **[Test build #47917 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47917/consoleFull)** for PR 10355 at commit [`c68ace1`](https://github.com/apache/spark/commit/c68ace1d7afe815ed252bacd0cebc576ef6e06b0).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12293][SQL] Support UnsafeRow in LocalT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10283#issuecomment-165393360 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12293][SQL] Support UnsafeRow in LocalT...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10283#issuecomment-165393172 **[Test build #47911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47911/consoleFull)** for PR 10283 at commit [`97d390f`](https://github.com/apache/spark/commit/97d390f9d4a929f6e115529d8bf8ec096074543c).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class LocalRelation(output: Seq[Attribute], data: Seq[UnsafeRow] = Nil)`
[GitHub] spark pull request: [SPARK-12293][SQL] Support UnsafeRow in LocalT...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/10283#issuecomment-165396392 retest this please.
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-165399619 **[Test build #47900 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47900/consoleFull)** for PR 9593 at commit [`4f3392f`](https://github.com/apache/spark/commit/4f3392ffc0fadfc31761e8b84a53ec47f82c1245).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-165399693 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-165399695 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47900/ Test FAILed.
[GitHub] spark pull request: [SPARK-12377][Python][Wrong implementation for...
GitHub user somideshmukh opened a pull request: https://github.com/apache/spark/pull/10356

[SPARK-12377][Python][Wrong implementation for Row.__call__ in pyspark]

Made changes in the types.py file to change "return _create_row(self, args)" to "return create_row(self.fields_, args)".

You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/somideshmukh/spark Branch12172015
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/10356.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #10356

commit bbc38ba690bed3ba777817b0dda15b51cbd031f2
Author: somideshmukh
Date: 2015-12-17T09:48:54Z
    [SPARK-12377][Python][Wrong implementation for Row.__call__ in pyspark]
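For readers unfamiliar with the pattern under discussion, the following is a simplified, hypothetical stand-in for a row type whose instances can be called to build new rows. It is not pyspark's `Row` implementation and not the code from this patch; the attribute `field_names` and the method `as_dict` are names invented for this sketch. It only illustrates the general shape: a row made of field names acts as a template, and calling it with values pairs those names with the values.

```python
class Row(tuple):
    """Simplified stand-in for a callable row type (not pyspark.sql.Row)."""

    def __new__(cls, *args):
        # Store positional arguments as the tuple's contents.
        return tuple.__new__(cls, args)

    def __call__(self, *values):
        # Treat this row's own contents as field names for the new row.
        return _create_row(self, values)

    def as_dict(self):
        # Pair field names with values; only meaningful for rows built
        # through a template, which carry the hypothetical `field_names`.
        return dict(zip(self.field_names, self))


def _create_row(fields, values):
    """Build a Row holding `values` and remember `fields` as its names."""
    row = Row(*values)
    row.field_names = fields
    return row


# Usage: a template row of names, then a concrete row of values.
Person = Row("name", "age")
alice = Person("Alice", 11)
```

In this sketch the template passes itself as the field list, which mirrors the "return _create_row(self, args)" form quoted in the description; whether that or the proposed replacement is correct for pyspark is exactly what the pull request is meant to settle.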
[GitHub] spark pull request: [SPARK-12377][Python][Wrong implementation for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10356#issuecomment-165404400 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-12289][SQL] Support UnsafeRow in TakeOr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10330#issuecomment-165405876 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12404] [SQL] Ensure objects passed to S...
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/10357

[SPARK-12404] [SQL] Ensure objects passed to StaticInvoke is Serializable

Currently `StaticInvoke` receives `Any` as its object. `StaticInvoke` itself can be serialized, but sometimes the object passed to it is not serializable. For example, the following code raises an exception because `RowEncoder#extractorsFor`, which is invoked indirectly, creates a `StaticInvoke`.

```
case class TimestampContainer(timestamp: java.sql.Timestamp)
// Wrap the current time in a java.sql.Timestamp to match the field type.
val rdd = sc.parallelize(1 to 2).map(_ => TimestampContainer(new java.sql.Timestamp(System.currentTimeMillis)))
val df = rdd.toDF
val ds = df.as[TimestampContainer]
val rdd2 = ds.rdd  // invokes extractorsFor indirectly
```

I'll add test cases.

You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/sarutak/spark SPARK-12404
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/10357.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #10357

commit e92ae87e9a5e9258277ff8a0243a84992c12414c
Author: Kousuke Saruta
Date: 2015-12-17T10:01:48Z
    Fixed StaticInvoke to ensure receiving Class object which is serializable
[GitHub] spark pull request: [SPARK-12289][SQL] Support UnsafeRow in TakeOr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10330#issuecomment-165405882 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47916/ Test PASSed.
[GitHub] spark pull request: [SPARK-12404] [SQL] Ensure objects passed to S...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10357#issuecomment-165407251 **[Test build #47922 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47922/consoleFull)** for PR 10357 at commit [`ae7fdc1`](https://github.com/apache/spark/commit/ae7fdc1b3ca7d988e45ffc741676ea847e397d78).
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-165408654 **[Test build #47913 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47913/consoleFull)** for PR 10154 at commit [`3208d85`](https://github.com/apache/spark/commit/3208d85653b0d2df0c68bb67856fac1f50740ecf).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12388] change default compression to lz...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10342#issuecomment-165391692 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12388] change default compression to lz...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10342#issuecomment-165391694 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47908/ Test FAILed.
[GitHub] spark pull request: [SPARK-12388] change default compression to lz...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10342#issuecomment-165391615 **[Test build #47908 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47908/consoleFull)** for PR 10342 at commit [`13390e8`](https://github.com/apache/spark/commit/13390e8ebabc8ab9ed5ad426aca361d25976cd31).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `public final class LZ4BlockInputStream extends FilterInputStream`
[GitHub] spark pull request: [SPARK-12293][SQL] Support UnsafeRow in LocalT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10283#issuecomment-165393366 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47911/ Test FAILed.
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165395169 **[Test build #47918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47918/consoleFull)** for PR 10355 at commit [`1027e83`](https://github.com/apache/spark/commit/1027e83d9add7a91adda7bba892b3fe7926405f1).
[GitHub] spark pull request: [SPARK-12289][SQL] Support UnsafeRow in TakeOr...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10330#discussion_r47891294

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
    @@ -207,7 +211,7 @@ case class TakeOrderedAndProject(
       private val ord: InterpretedOrdering = new InterpretedOrdering(sortOrder, child.output)
       // TODO: remove @transient after figure out how to clean closure at InsertIntoHiveTable.
    -  @transient private val projection = projectList.map(new InterpretedProjection(_, child.output))
    +  @transient private val projection = projectList.map(UnsafeProjection.create(_, child.output))
--- End diff --

I think it is ok.
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165390849 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47917/ Test FAILed.
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165396126 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165396122 **[Test build #47918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47918/consoleFull)** for PR 10355 at commit [`1027e83`](https://github.com/apache/spark/commit/1027e83d9add7a91adda7bba892b3fe7926405f1). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165396129 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47918/ Test FAILed.
[GitHub] spark pull request: [SPARK-12293][SQL] Support UnsafeRow in LocalT...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10283#issuecomment-165401597 **[Test build #47920 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47920/consoleFull)** for PR 10283 at commit [`97d390f`](https://github.com/apache/spark/commit/97d390f9d4a929f6e115529d8bf8ec096074543c).
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165401216 **[Test build #47919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47919/consoleFull)** for PR 10355 at commit [`b014357`](https://github.com/apache/spark/commit/b0143575b50567225a63dabfef560a5d35554077).
[GitHub] spark pull request: [SPARK-7425] [ML] [WIP] spark.ml Predictor sho...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10355#issuecomment-165405712 **[Test build #47921 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47921/consoleFull)** for PR 10355 at commit [`47cc81b`](https://github.com/apache/spark/commit/47cc81bdd84146587ba169c80c8fc1ef382dafe3).
[GitHub] spark pull request: [SPARK-12289][SQL] Support UnsafeRow in TakeOr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10330#issuecomment-165405764 **[Test build #47916 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47916/consoleFull)** for PR 10330 at commit [`c343447`](https://github.com/apache/spark/commit/c343447db3e9db2571d746d7ac2218e58ba2f992). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-165408794 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12149] [Web UI] Executor UI improvement...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10154#issuecomment-165408796 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47913/ Test PASSed.
[GitHub] spark pull request: [SPARK-12404] [SQL] Ensure objects passed to S...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10357#issuecomment-165451414 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47923/ Test PASSed.
[GitHub] spark pull request: [SPARK-12345] [CORE] Do not send SPARK_HOME th...
Github user skyluc commented on the pull request: https://github.com/apache/spark/pull/10329#issuecomment-165459198 Fixed the 'style', added a comment, and switched to `filterKeys`.
[GitHub] spark pull request: [SPARK-12366][SQL][REPL] ExecutorClassLoader s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10358#issuecomment-165454564 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47924/ Test FAILed.
[GitHub] spark pull request: [SPARK-12366][SQL][REPL] ExecutorClassLoader s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10358#issuecomment-165454498 **[Test build #47924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47924/consoleFull)** for PR 10358 at commit [`9269b0b`](https://github.com/apache/spark/commit/9269b0ba9d4788e8aee3154bd838014551f0aaa0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...
Github user dragos commented on a diff in the pull request: https://github.com/apache/spark/pull/10332#discussion_r47902198 --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala --- @@ -94,7 +94,12 @@ private[mesos] class MesosSubmitRequestServlet( val driverMemory = sparkProperties.get("spark.driver.memory") val driverCores = sparkProperties.get("spark.driver.cores") val appArgs = request.appArgs -val environmentVariables = request.environmentVariables +// We don't want to pass down SPARK_HOME when launching Spark apps +// with Mesos cluster mode since it's populated by default on the client and it will +// cause spark-submit script to look for files in SPARK_HOME instead. +// We only need the ability to specify where to find spark-submit script +// which user can user spark.executor.home or spark.home configurations. +val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME")) --- End diff -- Unfortunately there is a subtle error here, and this is a no-op. And nobody ran this code, it seems. Here's what happens: `environmentVariables` is a map, not a sequence, so `filter` works on pairs, and a pair will never be equal to a string. The correct call would have been `filterKeys`. Unfortunately this went into RC3 without the bug being fixed. It is harmless otherwise, but it highlights the fact that there are no easy fixes or safe changes. :-/
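The difference between `filter` and `filterKeys` on a Scala `Map` can be reproduced outside Spark. The following is a minimal sketch, not the Spark code itself: the object name `FilterKeysSketch` and the sample `env` map are made up for illustration and stand in for `request.environmentVariables`.

```scala
object FilterKeysSketch {
  // Hypothetical stand-in for request.environmentVariables.
  val env: Map[String, String] = Map(
    "SPARK_HOME" -> "/opt/spark",
    "JAVA_HOME" -> "/usr/lib/jvm/java"
  )

  // Form from the original patch: each element handed to the predicate is a
  // (key, value) pair, and a pair never equals a String, so the predicate is
  // always true and nothing is removed.
  val unfiltered: Map[String, String] = env.filter(!_.equals("SPARK_HOME"))

  // Form proposed in the review: the predicate sees keys only.
  // .toMap materialises the result, since filterKeys may return a lazy view.
  val byKeys: Map[String, String] = env.filterKeys(!_.equals("SPARK_HOME")).toMap

  // Alternative that keeps `filter` but destructures the pair explicitly.
  val byPattern: Map[String, String] = env.filter { case (key, _) => key != "SPARK_HOME" }

  def main(args: Array[String]): Unit = {
    println(s"filter on pairs keeps SPARK_HOME: ${unfiltered.contains("SPARK_HOME")}")
    println(s"filterKeys keeps SPARK_HOME: ${byKeys.contains("SPARK_HOME")}")
    println(s"pattern-matching filter keeps SPARK_HOME: ${byPattern.contains("SPARK_HOME")}")
  }
}
```

Note that `filterKeys` is deprecated as of Scala 2.13, where it returns a view rather than a strict `Map`; the pattern-matching `filter` form behaves the same way across versions.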
[GitHub] spark pull request: [SPARK-9384] [core] Easier setting of executor...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/7739#issuecomment-165466946 I'm a bit on the fence about this. It's kind of bringing us full circle to what Spark used to have with env variables, most of which applied across everything. I think we chose to make them separate so users had the option of setting different ones. I see the case that this makes things easier for users in some situations, but I also see it as possibly confusing and as more maintenance and complexity in the Spark code. My other concern is that this now has different semantics than extraJavaOptions and extraLibraryPath, as they won't have a common option. cc @pwendell, as I think he originally added the configs, to see if he remembers any discussion around common ones.
[GitHub] spark pull request: [SPARK-12404] [SQL] Ensure objects passed to S...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10357#issuecomment-165451271 **[Test build #47923 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47923/consoleFull)** for PR 10357 at commit [`ae7fdc1`](https://github.com/apache/spark/commit/ae7fdc1b3ca7d988e45ffc741676ea847e397d78). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12404] [SQL] Ensure objects passed to S...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10357#issuecomment-165451412 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12366][SQL][REPL] ExecutorClassLoader s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10358#issuecomment-165458017 **[Test build #47925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47925/consoleFull)** for PR 10358 at commit [`9269b0b`](https://github.com/apache/spark/commit/9269b0ba9d4788e8aee3154bd838014551f0aaa0).
[GitHub] spark pull request: [SPARK-12345] [CORE] Do not send SPARK_HOME th...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10329#issuecomment-165462105 **[Test build #47926 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47926/consoleFull)** for PR 10329 at commit [`62f4d2f`](https://github.com/apache/spark/commit/62f4d2ffc1af8aa64a945195faa9b2ef74b5ee9b).
[GitHub] spark pull request: [SPARK-12345][MESOS] Filter SPARK_HOME when su...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/10332#discussion_r47909261 --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala --- @@ -94,7 +94,12 @@ private[mesos] class MesosSubmitRequestServlet( val driverMemory = sparkProperties.get("spark.driver.memory") val driverCores = sparkProperties.get("spark.driver.cores") val appArgs = request.appArgs -val environmentVariables = request.environmentVariables +// We don't want to pass down SPARK_HOME when launching Spark apps +// with Mesos cluster mode since it's populated by default on the client and it will +// cause spark-submit script to look for files in SPARK_HOME instead. +// We only need the ability to specify where to find spark-submit script +// which user can user spark.executor.home or spark.home configurations. +val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME")) --- End diff -- That's really the problem, I think we should fix this.
[GitHub] spark pull request: [SPARK-12366][SQL][REPL] ExecutorClassLoader s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10358#issuecomment-165454563 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12404] [SQL] Ensure objects passed to S...
Github user yu-iskw commented on the pull request: https://github.com/apache/spark/pull/10357#issuecomment-165461189 @sarutak thank you for sending this PR. @marmbrus @rxin could you review it? I think this is a fairly big issue. We should fix it before releasing Spark 1.6.
[GitHub] spark pull request: [SPARK-11749][Streaming] Duplicate creating th...
Github user jhu-chang commented on the pull request: https://github.com/apache/spark/pull/9765#issuecomment-165470678 @zsxwing Thanks for your comments, could you help to check this again?
[GitHub] spark pull request: [SPARK-12366][SQL][REPL] ExecutorClassLoader s...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/10358#issuecomment-165455309 retest this please.
[GitHub] spark pull request: [SPARK-12376][TESTS] Spark Streaming Java8APIS...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/10336#issuecomment-165480439 LGTM
[GitHub] spark pull request: [SPARK-12321][SQL] JSON format for TreeNode (u...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10311#issuecomment-165492038 Merged build finished. Test FAILed.
[GitHub] spark pull request: Once driver register successfully, stop it to ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/10354
[GitHub] spark pull request: Once driver register successfully, stop it to ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/10354#issuecomment-165494701 This PR was merged by mistake and has been reverted, sorry.
[GitHub] spark pull request: [SPARK-12054] [SQL] Consider nullability of ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10333#issuecomment-165502829 **[Test build #47932 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47932/consoleFull)** for PR 10333 at commit [`765f735`](https://github.com/apache/spark/commit/765f73579b3a64a906ce246ffa3408313ce58b95). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12345][Mesos] Properly filter out SPARK...
Github user dragos commented on a diff in the pull request: https://github.com/apache/spark/pull/10359#discussion_r47929981 --- Diff: core/src/main/scala/org/apache/spark/deploy/rest/mesos/MesosRestServer.scala --- @@ -99,7 +99,7 @@ private[mesos] class MesosSubmitRequestServlet( // cause spark-submit script to look for files in SPARK_HOME instead. // We only need the ability to specify where to find spark-submit script // which user can user spark.executor.home or spark.home configurations. -val environmentVariables = request.environmentVariables.filter(!_.equals("SPARK_HOME")) +val environmentVariables = request.environmentVariables.filterKeys(!_.equals("SPARK_HOME")) --- End diff -- Yeah, that'd be ok too. I chose to leave the code as it was and make the minimal change (given that we are in the RC cycle).
[GitHub] spark pull request: [SPARK-12366][SQL][REPL] ExecutorClassLoader s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10358#issuecomment-165484882 **[Test build #47925 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47925/consoleFull)** for PR 10358 at commit [`9269b0b`](https://github.com/apache/spark/commit/9269b0ba9d4788e8aee3154bd838014551f0aaa0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12366][SQL][REPL] ExecutorClassLoader s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10358#issuecomment-165485048 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47925/ Test PASSed.
[GitHub] spark pull request: [SPARK-12311][CORE] Restore previous value of ...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/10289#discussion_r47920442 --- Diff: core/src/test/scala/org/apache/spark/util/collection/ExternalSorterSuite.scala --- @@ -235,7 +235,7 @@ class ExternalSorterSuite extends SparkFunSuite with LocalSparkContext { it.next() } } - +*/ --- End diff -- Sorry, this is my mistake. This file should not have been modified.