[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9078#issuecomment-147878046 [Test build #43674 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43674/console) for PR 9078 at commit [`0846260`](https://github.com/apache/spark/commit/084626087faa5749091b99c56cad5f706d1f75b2). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait QueryExecutionListener ` * `class ExecutionListenerManager extends Logging ` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10959] [PYSPARK] StreamingLogisticRegre...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9087#issuecomment-147877817 Merged into branch-1.5. Could you close this JIRA manually?
[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9083
[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9083#issuecomment-147877496 Merged into master. Thanks!
[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9078#issuecomment-147877246 [Test build #43685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43685/consoleFull) for PR 9078 at commit [`9943fea`](https://github.com/apache/spark/commit/9943feaf4ce288614e40a8502626900fb3cf3a4b).
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147876792 [Test build #43684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43684/consoleFull) for PR 9099 at commit [`36e191d`](https://github.com/apache/spark/commit/36e191deb1ca994ede2dfc143bc6c2a4c572c9d0).
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147876185 Merged build started.
[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9078#issuecomment-147876157 Merged build triggered.
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147876155 Merged build triggered.
[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9078#issuecomment-147876184 Merged build started.
[GitHub] spark pull request: [SPARK-10389][SQL][1.5] support order by non-a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9102#issuecomment-147876071 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43672/ Test PASSed.
[GitHub] spark pull request: [SPARK-10622] [core] [yarn] Differentiate dead...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/8887#discussion_r41936884

--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -781,6 +781,23 @@ private[spark] class TaskSetManager(
     sortedTaskSetQueue
   }
+  /**
+   * Called by TaskScheduler when an executor is lost, but the reason is not yet known. This method
+   * does not fail any tasks related to the executor. Instead, tasks are left as is, but the
+   * executor is removed from the list of live executors, so no new tasks are scheduled. Pending
+   * tasks for the executor are re-queued.
+   */
+  override def disableExecutor(execId: String, host: String): Unit = {
+    for (index <- getPendingTasksForExecutor(execId)) {
+      addPendingTask(index, readding = true)
+    }
+    for (index <- getPendingTasksForHost(host)) {
+      addPendingTask(index, readding = true)
+    }
+    // recalculate valid locality levels and waits when executor is disabled.
+    recomputeLocality()
--- End diff --

It seems to not be expensive; looks `O(1)` except in rare cases where most executors with pending tasks are dead, in which case it would be `O(number executors)`.
[GitHub] spark pull request: [SPARK-10389][SQL][1.5] support order by non-a...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9102#issuecomment-147876065 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10389][SQL][1.5] support order by non-a...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9102#issuecomment-147875930 [Test build #43672 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43672/console) for PR 9102 at commit [`ab2341a`](https://github.com/apache/spark/commit/ab2341a39ca90116b1f86b2f33c119cac57d51dc). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11042] [SQL] [BRANCH-1.5 TEST-ONLY] Add...
Github user yhuai closed the pull request at: https://github.com/apache/spark/pull/9077
[GitHub] spark pull request: [SPARK-11017] [SQL] Support ImperativeAggregat...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9038#issuecomment-147875246 [Test build #43683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43683/consoleFull) for PR 9038 at commit [`2547b29`](https://github.com/apache/spark/commit/2547b29e61cc27580f5dbb68a8ed8f65d8c04848).
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9103#issuecomment-147875233 [Test build #43682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43682/consoleFull) for PR 9103 at commit [`6e7190e`](https://github.com/apache/spark/commit/6e7190efdac078ed6ca0e355a320c728d1423ab2).
[GitHub] spark pull request: [SPARK-10619] Can't sort columns on Executor P...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9101#issuecomment-147874653 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-10619] Can't sort columns on Executor P...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9101#issuecomment-147874656 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43669/ Test PASSed.
[GitHub] spark pull request: [SPARK-11017] [SQL] Support ImperativeAggregat...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9038#issuecomment-147874402 Merged build triggered.
[GitHub] spark pull request: [SPARK-10619] Can't sort columns on Executor P...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9101#issuecomment-147874507 [Test build #43669 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43669/console) for PR 9101 at commit [`492d915`](https://github.com/apache/spark/commit/492d915f09bc27743d7343554c87839a9d495b51). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11017] [SQL] Support ImperativeAggregat...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9038#issuecomment-147874435 Merged build started.
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9103#issuecomment-147874502 LGTM, pending test.
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9103#issuecomment-147873489 Merged build started.
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9103#issuecomment-147873466 Merged build triggered.
[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9104#issuecomment-147872722 Can we also remove an extra `ConvertToUnsafe` here? Specifically, is the parquet table scan still claiming to produce safe rows when it's really producing unsafe ones now?
[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/9104#discussion_r41934985

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
@@ -178,52 +179,26 @@ private[sql] object DataSourceStrategy extends Strategy with Logging {
     sparkPlan
   }
-  // TODO: refactor this thing. It is very complicated because it does projection internally.
-  // We should just put a project on top of this.
   private def mergeWithPartitionValues(
-      schema: StructType,
-      requiredColumns: Array[String],
-      partitionColumns: Array[String],
+      requiredColumns: Seq[Attribute],
+      dataColumns: Seq[Attribute],
+      partitionColumnSchema: StructType,
       partitionValues: InternalRow,
       dataRows: RDD[InternalRow]): RDD[InternalRow] = {
-    val nonPartitionColumns = requiredColumns.filterNot(partitionColumns.contains)
-
     // If output columns contain any partition column(s), we need to merge scanned data
     // columns and requested partition columns to form the final result.
-    if (!requiredColumns.sameElements(nonPartitionColumns)) {
-      val mergers = requiredColumns.zipWithIndex.map { case (name, index) =>
-        // To see whether the `index`-th column is a partition column...
-        val i = partitionColumns.indexOf(name)
-        if (i != -1) {
-          val dt = schema(partitionColumns(i)).dataType
-          // If yes, gets column value from partition values.
-          (mutableRow: MutableRow, dataRow: InternalRow, ordinal: Int) => {
-            mutableRow(ordinal) = partitionValues.get(i, dt)
-          }
-        } else {
-          // Otherwise, inherits the value from scanned data.
-          val i = nonPartitionColumns.indexOf(name)
-          val dt = schema(nonPartitionColumns(i)).dataType
-          (mutableRow: MutableRow, dataRow: InternalRow, ordinal: Int) => {
-            mutableRow(ordinal) = dataRow.get(i, dt)
-          }
-        }
+    if (requiredColumns != dataColumns) {
+      // Builds `AttributeReference`s for all partition columns so that we can use them to project
+      // required partition columns. Note that if a partition column appears in `requiredColumns`,
+      // we should use the `AttributeReference` in `requiredColumns`.
+      val requiredColumnMap = requiredColumns.map(a => a.name -> a).toMap
+      val partitionColumns = partitionColumnSchema.toAttributes.map { a =>
+        requiredColumnMap.getOrElse(a.name, a)
       }
-      // Since we know for sure that this closure is serializable, we can avoid the overhead
-      // of cleaning a closure for each RDD by creating our own MapPartitionsRDD. Functionally
-      // this is equivalent to calling `dataRows.mapPartitions(mapPartitionsFunc)` (SPARK-7718).
       val mapPartitionsFunc = (_: TaskContext, _: Int, iterator: Iterator[InternalRow]) => {
-        val dataTypes = requiredColumns.map(schema(_).dataType)
-        val mutableRow = new SpecificMutableRow(dataTypes)
-        iterator.map { dataRow =>
-          var i = 0
-          while (i < mutableRow.numFields) {
-            mergers(i)(mutableRow, dataRow, i)
-            i += 1
-          }
-          mutableRow.asInstanceOf[InternalRow]
-        }
+        val projection = UnsafeProjection.create(requiredColumns, dataColumns ++ partitionColumns)
+        iterator.map(dataRow => projection(new JoinedRow(dataRow, partitionValues)))
--- End diff --

That's a good point, didn't realize `JoinedRow` is mutable. Thanks.
[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8453#issuecomment-147872531 [Test build #43681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43681/consoleFull) for PR 8453 at commit [`b60ee53`](https://github.com/apache/spark/commit/b60ee53ef9e1172d9072e00d829d8216904dc791).
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9103#issuecomment-147872318 also need to change this comment https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala#L540
[GitHub] spark pull request: [SPARK-10949] Update Snappy version to 1.1.2
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/8995#issuecomment-147872165

Hey @a-roberts,

How about this:

- Add a `private[spark]` method to the `private[spark]` `CompressionCodec` companion object and have that method maintain the hardcoded list of compression codecs which support concatenation of serialized streams. This method should accept a `CompressionCodec` instance and perform the `instanceof` check. I'd consider naming this something like "supportsConcatenationOfSerializedStreams" to be very explicit and clear.
- Update `fastMergeIsSupported` to use this new static method.

I like this approach since it makes it very clear why we're only supporting those two codecs. I wouldn't worry about third-party / external compression codecs being able to take advantage of this feature.
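The suggestion above could look roughly like the following. This is only a minimal, self-contained sketch and not the code from the pull request: the codec classes are empty stand-ins for Spark's, the choice of Snappy and LZF as the two supported codecs is an assumption (the comment does not name them), `fastMergeIsSupported` is reduced to a plain function with a hypothetical signature, and the `private[spark]` modifiers are dropped so that the snippet compiles outside the `org.apache.spark` package.

```scala
object CodecSketch {
  // Empty stand-ins for Spark's codec hierarchy (hypothetical, for illustration only).
  trait CompressionCodec
  class SnappyCompressionCodec extends CompressionCodec
  class LZFCompressionCodec extends CompressionCodec
  class LZ4CompressionCodec extends CompressionCodec

  object CompressionCodec {
    // Hardcoded list of codecs whose serialized streams can be concatenated.
    // Which codecs belong here is an assumption made for this sketch.
    def supportsConcatenationOfSerializedStreams(codec: CompressionCodec): Boolean =
      codec match {
        case _: SnappyCompressionCodec | _: LZFCompressionCodec => true
        case _ => false
      }
  }

  // Simplified stand-in for `fastMergeIsSupported`: a fast merge is possible when
  // compression is off, or when the codec tolerates concatenated streams.
  def fastMergeIsSupported(compressionEnabled: Boolean, codec: CompressionCodec): Boolean =
    !compressionEnabled || CompressionCodec.supportsConcatenationOfSerializedStreams(codec)
}
```

Keeping the check in one place means a codec that is not on the list simply falls through to `false`, which matches the point about not extending the feature to third-party codecs.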
[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9104#issuecomment-147872104

Micro-benchmark result with TPC-DS (scale-factor 15) `store_sales` table shows a ~12% performance gain.

Before:

- Round 0: 8133 ms
- Round 1: 7799 ms
- Round 2: 8010 ms
- Round 3: 8009 ms
- Round 4: 8223 ms
- Average: 8034.8 ms

After:

- Round 0: 7401 ms
- Round 1: 6897 ms
- Round 2: 6873 ms
- Round 3: 6935 ms
- Round 4: 7056 ms
- Average: 7032.4 ms

Benchmark code (where `ss_sold_date_sk` is an `INT` partitioning column and `ss_sold_time_sk` is an `INT` data column):

```scala
import com.google.common.base.Stopwatch

def benchmark(runs: Int, warmupRuns: Int = 0)(f: => Unit) {
  val stopwatch = new Stopwatch()

  (0 until warmupRuns).foreach { i => f }

  def run(i: Int) = {
    stopwatch.reset()
    stopwatch.start()
    f
    stopwatch.stop()

    val elapsed = stopwatch.elapsedMillis()
    println(s"Round $i: $elapsed ms")
    elapsed
  }

  val total = (0 until runs).map(i => run(i)).sum.toDouble
  println(s"Average: ${total / runs} ms")
}

val path = "file:///Users/lian/tpcds/sf15/store_sales"

benchmark(5, 5) {
  val df = sqlContext.read.parquet(path).selectExpr("ss_sold_time_sk", "ss_sold_date_sk")
  df.queryExecution.toRdd.foreach(row => ())
}
```
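The ~12% figure can be checked directly from the reported round times. The snippet below is a small sketch that recomputes both averages and the relative reduction in run time; the object and method names are illustrative and not part of the pull request.

```scala
object BenchmarkGain {
  // Round times in milliseconds, copied from the comment above.
  val before: Seq[Double] = Seq(8133.0, 7799.0, 8010.0, 8009.0, 8223.0)
  val after: Seq[Double] = Seq(7401.0, 6897.0, 6873.0, 6935.0, 7056.0)

  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Relative reduction in average run time: (before - after) / before.
  def relativeGain(b: Seq[Double], a: Seq[Double]): Double =
    (mean(b) - mean(a)) / mean(b)

  def main(args: Array[String]): Unit = {
    println(f"Before: ${mean(before)}%.1f ms, after: ${mean(after)}%.1f ms")
    println(f"Relative gain: ${relativeGain(before, after) * 100}%.1f%%")
  }
}
```

The averages come out to 8034.8 ms and 7032.4 ms, a difference of 1002.4 ms, which is about 12.5% of the "before" average.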
[GitHub] spark pull request: [SPARK-10185] [SQL] Feat sql comma separated p...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/8416#issuecomment-147872102

Thanks for working on this, I spent some time debating the API with @rxin and here is what we came up with:

- calling the function `load(paths: Array[String])` would be more consistent with the rest of the reader API. This precludes using varargs, but that is probably not the most common use of this function.
- we need to add support in python too.
- it would be good to also add a test to make sure that we aren't breaking comma handling for single / multiple paths.
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147871946 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147871947 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43680/ Test FAILed.
[GitHub] spark pull request: Branch 1.5
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/9071#issuecomment-147871053 Hey @xif10416s, do you mind closing this issue?
[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8453#issuecomment-147870905 Merged build triggered.
[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8453#issuecomment-147870927 Merged build started.
[GitHub] spark pull request: [SPARK-10932] [PROJECT INFRA] Port two minor c...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/8986
[GitHub] spark pull request: [SPARK-10932] [PROJECT INFRA] Port two minor c...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/8986#issuecomment-147870089 Going to merge this now so that it doesn't become stale or get forgotten. I'll address return code checks if the problem re-occurs once we're closer to the release.
[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9078#issuecomment-147869856 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43667/ Test PASSed.
[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9078#issuecomment-147869855 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9078#issuecomment-147869665 [Test build #43667 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43667/console) for PR 9078 at commit [`2d81674`](https://github.com/apache/spark/commit/2d816740084c447bf08e36035467a907f70df667). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait QueryExecutionListener ` * `class ExecutionListenerManager extends Logging `
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147868900 Merged build triggered.
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147868918 Merged build started.
[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9105#issuecomment-147868773 [Test build #43679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43679/consoleFull) for PR 9105 at commit [`67f3f32`](https://github.com/apache/spark/commit/67f3f325f127fe2b5c5ca7619ec47cec01dc6389).
[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9093
[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9093#issuecomment-147868516 Merging to master.
[GitHub] spark pull request: Comment Syntax checkup
Github user aertoria closed the pull request at: https://github.com/apache/spark/pull/9080
[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9105#issuecomment-147867971 Merged build started.
[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9105#issuecomment-147867919 Merged build triggered.
[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9105#issuecomment-147867587 LGTM pending tests
[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9105#issuecomment-147866979 cc @marmbrus
[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/9105 [SPARK-11032][SQL] correctly handle having We should not stop resolving the having clause as soon as the having condition is resolved; otherwise something like `count(1)` will crash. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloud-fan/spark having Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9105.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9105 commit 67f3f325f127fe2b5c5ca7619ec47cec01dc6389 Author: Wenchen Fan Date: 2015-10-13T22:02:42Z correctly handle having
[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9104#discussion_r41932293
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
@@ -178,52 +179,26 @@ private[sql] object DataSourceStrategy extends Strategy with Logging {
     sparkPlan
   }
-  // TODO: refactor this thing. It is very complicated because it does projection internally.
-  // We should just put a project on top of this.
   private def mergeWithPartitionValues(
-      schema: StructType,
-      requiredColumns: Array[String],
-      partitionColumns: Array[String],
+      requiredColumns: Seq[Attribute],
+      dataColumns: Seq[Attribute],
+      partitionColumnSchema: StructType,
       partitionValues: InternalRow,
       dataRows: RDD[InternalRow]): RDD[InternalRow] = {
-    val nonPartitionColumns = requiredColumns.filterNot(partitionColumns.contains)
-
     // If output columns contain any partition column(s), we need to merge scanned data
     // columns and requested partition columns to form the final result.
-    if (!requiredColumns.sameElements(nonPartitionColumns)) {
-      val mergers = requiredColumns.zipWithIndex.map { case (name, index) =>
-        // To see whether the `index`-th column is a partition column...
-        val i = partitionColumns.indexOf(name)
-        if (i != -1) {
-          val dt = schema(partitionColumns(i)).dataType
-          // If yes, gets column value from partition values.
-          (mutableRow: MutableRow, dataRow: InternalRow, ordinal: Int) => {
-            mutableRow(ordinal) = partitionValues.get(i, dt)
-          }
-        } else {
-          // Otherwise, inherits the value from scanned data.
-          val i = nonPartitionColumns.indexOf(name)
-          val dt = schema(nonPartitionColumns(i)).dataType
-          (mutableRow: MutableRow, dataRow: InternalRow, ordinal: Int) => {
-            mutableRow(ordinal) = dataRow.get(i, dt)
-          }
-        }
+    if (requiredColumns != dataColumns) {
+      // Builds `AttributeReference`s for all partition columns so that we can use them to project
+      // required partition columns. Note that if a partition column appears in `requiredColumns`,
+      // we should use the `AttributeReference` in `requiredColumns`.
+      val requiredColumnMap = requiredColumns.map(a => a.name -> a).toMap
+      val partitionColumns = partitionColumnSchema.toAttributes.map { a =>
+        requiredColumnMap.getOrElse(a.name, a)
       }
-      // Since we know for sure that this closure is serializable, we can avoid the overhead
-      // of cleaning a closure for each RDD by creating our own MapPartitionsRDD. Functionally
-      // this is equivalent to calling `dataRows.mapPartitions(mapPartitionsFunc)` (SPARK-7718).
       val mapPartitionsFunc = (_: TaskContext, _: Int, iterator: Iterator[InternalRow]) => {
-        val dataTypes = requiredColumns.map(schema(_).dataType)
-        val mutableRow = new SpecificMutableRow(dataTypes)
-        iterator.map { dataRow =>
-          var i = 0
-          while (i < mutableRow.numFields) {
-            mergers(i)(mutableRow, dataRow, i)
-            i += 1
-          }
-          mutableRow.asInstanceOf[InternalRow]
-        }
+        val projection = UnsafeProjection.create(requiredColumns, dataColumns ++ partitionColumns)
+        iterator.map(dataRow => projection(new JoinedRow(dataRow, partitionValues)))
--- End diff --
Do we have to allocate a new JoinedRow each time?
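The reviewer's question about per-row allocation can be illustrated with a small, self-contained sketch. It does not use Spark's classes: `Row`, `JoinedView`, `project`, and `mergePartition` are hypothetical stand-ins invented for this example. The idea shown is to allocate one mutable wrapper per partition and re-point it at each incoming row, which is only safe when every output row is fully materialized before the next input row is read.

```scala
object JoinedRowReuse {
  // Hypothetical stand-in for a row: an indexed sequence of values.
  type Row = IndexedSeq[Any]

  // Hypothetical mutable view over two rows laid side by side.
  final class JoinedView {
    private var left: Row = IndexedSeq.empty
    private var right: Row = IndexedSeq.empty

    // Re-points the view at a new pair of rows and returns the same instance.
    def apply(l: Row, r: Row): JoinedView = { left = l; right = r; this }

    // Reads field i, counting the left row's fields first.
    def get(i: Int): Any =
      if (i < left.length) left(i) else right(i - left.length)
  }

  // Copies the requested ordinals out of the view into a fresh output row,
  // so the view can be safely reused for the next input row.
  def project(ordinals: Seq[Int], view: JoinedView): Row =
    ordinals.map(view.get).toIndexedSeq

  def mergePartition(
      rows: Iterator[Row],
      partitionValues: Row,
      ordinals: Seq[Int]): Iterator[Row] = {
    val joined = new JoinedView // allocated once per partition, not once per row
    rows.map(dataRow => project(ordinals, joined(dataRow, partitionValues)))
  }
}
```

Whether the same reuse is safe in the actual patch depends on how the projection consumes its input, which this sketch does not model.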
[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9104#issuecomment-147865822 [Test build #43678 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43678/consoleFull) for PR 9104 at commit [`23a0fc2`](https://github.com/apache/spark/commit/23a0fc2ef86daa8faa785ef2ea3f1d7b5d1b692c).
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147864775 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43663/ Test PASSed.
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147864773 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147864600 [Test build #43663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43663/console) for PR 9100 at commit [`9fdcaae`](https://github.com/apache/spark/commit/9fdcaae7230ee1c1d9dbffbf9e931dadf4517a82). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `def getPath = path.getOrElse(sys.error("Constructors must start at a class type"))` * `case class WrapOption(optionType: DataType, child: Expression)` * `class GenericArrayData(val array: Array[Any]) extends ArrayData `
[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9104#issuecomment-147864429 Merged build started.
[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9104#issuecomment-147864406 Merged build triggered.
[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/9093#issuecomment-147864365 It already passed tests as of the latest commit.
[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/8453#discussion_r41930886
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala ---
@@ -165,68 +133,51 @@ class SimpleCatalog(val conf: CatalystConf) extends Catalog {
  * lost when the JVM exits.
  */
 trait OverrideCatalog extends Catalog {
-  // TODO: This doesn't work when the database changes...
--- End diff --
This comment is outdated now.
[GitHub] spark pull request: [SPARK-10412][SQL] report memory usage for tun...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8931#issuecomment-147864159 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43665/ Test PASSed.
[GitHub] spark pull request: [SPARK-10412][SQL] report memory usage for tun...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8931#issuecomment-147864158 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9093#issuecomment-147864120 LGTM pending tests
[GitHub] spark pull request: [PYTHON] [MINOR] List modules in PySpark tests...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9088#issuecomment-147864131 [Test build #1892 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1892/console) for PR 9088 at commit [`85cf1ca`](https://github.com/apache/spark/commit/85cf1ca08c8e079eea9150808e9d3d766671ccc0). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10412][SQL] report memory usage for tun...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8931#issuecomment-147863989 [Test build #43665 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43665/console) for PR 8931 at commit [`fdae182`](https://github.com/apache/spark/commit/fdae1827564a0535f22a19b432442d66e56f12a6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/9104 [SPARK-11088] [SQL] Merges partition values using UnsafeProjection `DataSourceStrategy.mergeWithPartitionValues` is essentially a projection implemented in a quite inefficient way. This PR optimizes this method with `UnsafeProjection` to avoid unnecessary boxing costs. You can merge this pull request into a Git repository by running: $ git pull https://github.com/liancheng/spark spark-11088.faster-partition-values-merging Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9104.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9104 commit 23a0fc2ef86daa8faa785ef2ea3f1d7b5d1b692c Author: Cheng Lian Date: 2015-10-13T18:04:36Z Merges partition values using UnsafeProjection
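To make the "essentially a projection" remark concrete, here is a rough sketch of how required columns could be located in an input made of data columns followed by partition columns. It is not Spark code: `Attr`, `resolvePartitionColumns`, and `ordinals` are hypothetical names for this illustration, and the `id` field only loosely mimics the role of an expression id.

```scala
object PartitionMergeSketch {
  // Hypothetical stand-in for an attribute: a column name plus a unique id.
  final case class Attr(name: String, id: Long)

  // For each partition column, prefer the instance that already appears in
  // the required columns (matched by name), so that ids line up later.
  def resolvePartitionColumns(required: Seq[Attr], partitionSchema: Seq[Attr]): Seq[Attr] = {
    val byName = required.map(a => a.name -> a).toMap
    partitionSchema.map(a => byName.getOrElse(a.name, a))
  }

  // Position of every required column within `data ++ partition`.
  // A projection then only has to copy these positions, in order.
  def ordinals(required: Seq[Attr], data: Seq[Attr], partition: Seq[Attr]): Seq[Int] = {
    val input = data ++ partition
    required.map { a =>
      val i = input.indexWhere(_.id == a.id)
      require(i >= 0, s"column ${a.name} not found in input")
      i
    }
  }
}
```

For example, with required columns `date` and `value`, one data column `value`, and one partition column `date`, the computed ordinals are `1, 0`: the output takes the partition value first and the scanned value second.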
[GitHub] spark pull request: [SPARK-11072][SQL] simplify self join handling
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9081#discussion_r41930462
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/MultiInstanceRelation.scala ---
@@ -1,33 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may obtain a copy of the License at
- *
- *    http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
-
-/**
- * A trait that should be mixed into query operators where an single instance might appear multiple
- * times in a logical query plan. It is invalid to have multiple copies of the same attribute
- * produced by distinct operators in a query tree as this breaks the guarantee that expression
- * ids, which are used to differentiate attributes, are unique.
- *
- * During analysis, operators that include this trait may be asked to produce a new version
- * of itself with globally unique expression ids.
--- End diff --
In fact, much of this could probably be copied to NewOutput.
[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9012#issuecomment-147863440 [Test build #43677 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43677/console) for PR 9012 at commit [`ba4c91b`](https://github.com/apache/spark/commit/ba4c91bfbc732737db958e0e0905b8ce25b00647). * This patch **fails R style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9012#issuecomment-147863446 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43677/ Test FAILed.
[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9012#issuecomment-147863445 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11072][SQL] simplify self join handling
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9081#discussion_r41930249
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
@@ -28,6 +28,17 @@ import org.apache.spark.util.MutablePair
 import org.apache.spark.util.random.PoissonSampler
 import org.apache.spark.{HashPartitioner, SparkEnv}
 
+@DeveloperApi
+case class NewOutput(output: Seq[Attribute], child: SparkPlan) extends UnaryNode {
--- End diff --
scaladoc please
[GitHub] spark pull request: [SPARK-11072][SQL] simplify self join handling
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9081#discussion_r41930272
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala ---
@@ -23,6 +23,25 @@ import org.apache.spark.sql.catalyst.plans._
 import org.apache.spark.sql.types._
 import org.apache.spark.util.collection.OpenHashSet
 
+case class NewOutput(output: Seq[Attribute], child: LogicalPlan) extends UnaryNode {
--- End diff --
Can you add scaladoc to both of these that explains what they are for?
[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9012#issuecomment-147863143 [Test build #43677 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43677/consoleFull) for PR 9012 at commit [`ba4c91b`](https://github.com/apache/spark/commit/ba4c91bfbc732737db958e0e0905b8ce25b00647).
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9103#issuecomment-147863010 [Test build #43676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43676/consoleFull) for PR 9103 at commit [`5a9e388`](https://github.com/apache/spark/commit/5a9e388c097285fa47367455e362ce774e610923).
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147862557 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147862505 [Test build #43675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43675/console) for PR 9100 at commit [`d1b6d01`](https://github.com/apache/spark/commit/d1b6d018527c2ef7163c3f599413fb1047d2cc0f).
* This patch **fails MiMa tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `def getPath = path.getOrElse(sys.error("Constructors must start at a class type"))`
  * `case class WrapOption(optionType: DataType, child: Expression)`
  * `class GenericArrayData(val array: Array[Any]) extends ArrayData `
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147862559 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43675/ Test FAILed.
[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9012#issuecomment-147861380 Merged build started.
[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9012#issuecomment-147861343 Merged build triggered.
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9103#issuecomment-147861364 Merged build started.
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9103#issuecomment-147861336 Merged build triggered.
[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/9093#issuecomment-147860936 Updated; PTAL @marmbrus.
[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/9103
[SPARK-11091] [SQL] Change spark.sql.canonicalizeView to spark.sql.nativeView.
https://issues.apache.org/jira/browse/SPARK-11091
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/yhuai/spark SPARK-11091
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/9103.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #9103
commit 5a9e388c097285fa47367455e362ce774e610923
Author: Yin Huai
Date: 2015-10-13T21:32:54Z
    Change spark.sql.canonicalizeView to spark.sql.nativeView.
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147860350 You could probably add to that. This is just an extra test to be safe (and it should check for values too; the current tests don't seem to do that, only column names, data types and counts).
[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9093#issuecomment-147860184 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9093#issuecomment-147860186 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43662/ Test PASSed.
[GitHub] spark pull request: [SPARK-11080] [SQL] Throw exception when Named...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/9093#issuecomment-147859959 Alright, updating now
[GitHub] spark pull request: [SPARK-11080] [SQL] Throw exception when Named...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9093#issuecomment-147859909 [Test build #43662 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43662/console) for PR 9093 at commit [`955a1a8`](https://github.com/apache/spark/commit/955a1a879cb964e0bca64e716371f7fec1fe32cf).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class ChildProcAppHandle implements SparkAppHandle `
  * `abstract class LauncherConnection implements Closeable, Runnable `
  * `final class LauncherProtocol `
  * ` static class Message implements Serializable `
  * ` static class Hello extends Message `
  * ` static class SetAppId extends Message `
  * ` static class SetState extends Message `
  * ` static class Stop extends Message `
  * `class LauncherServer implements Closeable `
  * `class NamedThreadFactory implements ThreadFactory `
  * `class OutputRedirector `
  * `case class ExprId(id: Long, jvmId: UUID)`
[GitHub] spark pull request: [SPARKR] [SPARK-10981] SparkR Join improvement...
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/9029#issuecomment-147859792 looks good
[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9083#issuecomment-147859196 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9083#issuecomment-147859198 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43670/ Test PASSed.
[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9083#issuecomment-147858753 [Test build #43670 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43670/console) for PR 9083 at commit [`2e76b0c`](https://github.com/apache/spark/commit/2e76b0c95f78515a2ae93419e0455f87a86a017f).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147858705 [Test build #43675 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43675/consoleFull) for PR 9100 at commit [`d1b6d01`](https://github.com/apache/spark/commit/d1b6d018527c2ef7163c3f599413fb1047d2cc0f).
[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/8453#discussion_r41927635
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala ---
@@ -20,14 +20,22 @@ package org.apache.spark.sql.catalyst
 /**
  * Identifies a `table` in `database`. If `database` is not defined, the current database is used.
  */
-private[sql] case class TableIdentifier(table: String, database: Option[String] = None) {
-  def withDatabase(database: String): TableIdentifier = this.copy(database = Some(database))
+private[sql] case class TableIdentifier(table: String, database: Option[String]) {
+  def this(table: String) = this(table, None)
 
-  def toSeq: Seq[String] = database.toSeq :+ table
+  override def toString: String = {
+    if (table.contains('.') || database.exists(_.contains('.'))) {
--- End diff --
There are some other characters that need quoting.
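The review comment above notes that a dot is not the only character that forces quoting. One way to cover the general case is to quote every name part that is not a plain identifier, rather than testing for individual characters. The sketch below is an illustration only and is not the code from the pull request: the object name `IdentifierQuoting`, the regular expression for a "plain" name, and the backtick-doubling escape rule are all assumptions.

```scala
// Hypothetical helper; what counts as a "plain" name is an assumption.
object IdentifierQuoting {
  private val Plain = "[A-Za-z_][A-Za-z0-9_]*"

  // Wrap a name part in backticks unless it is a plain identifier,
  // doubling any backtick that appears inside the name.
  def quote(part: String): String =
    if (part.matches(Plain)) part
    else "`" + part.replace("`", "``") + "`"

  // Render an optional database plus a table as a dotted name.
  def render(table: String, database: Option[String]): String =
    (database.toSeq :+ table).map(quote).mkString(".")

  def main(args: Array[String]): Unit = {
    println(render("events", Some("prod")))     // prod.events
    println(render("my.table", None))           // `my.table`
    println(render("weird name", Some("db-1"))) // `db-1`.`weird name`
  }
}
```

Checking against an allow-list of plain names means spaces, hyphens and embedded backticks are handled by the same branch as dots, so no new character has to be added to the condition later.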
[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...
Github user zero323 commented on the pull request: https://github.com/apache/spark/pull/9099#issuecomment-147858070 Sure. Should I make a separate test for that or simply add to [`create DataFrame from list or data.frame`](https://github.com/apache/spark/blob/master/R/pkg/inst/tests/test_sparkSQL.R#L227)?
[GitHub] spark pull request: [SPARK-11072][SQL] simplify self join handling
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9081#issuecomment-147857302 cc @marmbrus @yhuai
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147856525 Merged build started.
[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9100#issuecomment-147856494 Merged build triggered.