date:20151013

[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9078#issuecomment-147878046
  
  [Test build #43674 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43674/console) for PR 9078 at commit [`0846260`](https://github.com/apache/spark/commit/084626087faa5749091b99c56cad5f706d1f75b2).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait QueryExecutionListener`
  * `class ExecutionListenerManager extends Logging`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-10959] [PYSPARK] StreamingLogisticRegre...

2015-10-13 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/9087#issuecomment-147877817
  
Merged into branch-1.5. Could you close this JIRA manually?


[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...

2015-10-13 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9083


[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...

2015-10-13 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/9083#issuecomment-147877496
  
Merged into master. Thanks!


[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9078#issuecomment-147877246
  
  [Test build #43685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43685/consoleFull) for PR 9078 at commit [`9943fea`](https://github.com/apache/spark/commit/9943feaf4ce288614e40a8502626900fb3cf3a4b).


[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147876792
  
  [Test build #43684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43684/consoleFull) for PR 9099 at commit [`36e191d`](https://github.com/apache/spark/commit/36e191deb1ca994ede2dfc143bc6c2a4c572c9d0).


[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147876185
  
Merged build started.


[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9078#issuecomment-147876157
  
 Merged build triggered.


[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147876155
  
 Merged build triggered.


[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9078#issuecomment-147876184
  
Merged build started.


[GitHub] spark pull request: [SPARK-10389][SQL][1.5] support order by non-a...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9102#issuecomment-147876071
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43672/
Test PASSed.


[GitHub] spark pull request: [SPARK-10622] [core] [yarn] Differentiate dead...

2015-10-13 Thread vanzin

Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8887#discussion_r41936884
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -781,6 +781,23 @@ private[spark] class TaskSetManager(
     sortedTaskSetQueue
   }
 
+  /**
+   * Called by TaskScheduler when an executor is lost, but the reason is not yet known. This method
+   * does not fail any tasks related to the executor. Instead, tasks are left as is, but the
+   * executor is removed from the list of live executors, so no new tasks are scheduled. Pending
+   * tasks for the executor are re-queued.
+   */
+  override def disableExecutor(execId: String, host: String): Unit = {
+    for (index <- getPendingTasksForExecutor(execId)) {
+      addPendingTask(index, readding = true)
+    }
+    for (index <- getPendingTasksForHost(host)) {
+      addPendingTask(index, readding = true)
+    }
+    // recalculate valid locality levels and waits when executor is disabled.
+    recomputeLocality()
--- End diff --

It does not seem to be expensive; it looks `O(1)` except in rare cases where most 
executors with pending tasks are dead, in which case it would be `O(number of 
executors)`.


[GitHub] spark pull request: [SPARK-10389][SQL][1.5] support order by non-a...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9102#issuecomment-147876065
  
Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-10389][SQL][1.5] support order by non-a...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9102#issuecomment-147875930
  
  [Test build #43672 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43672/console) for PR 9102 at commit [`ab2341a`](https://github.com/apache/spark/commit/ab2341a39ca90116b1f86b2f33c119cac57d51dc).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-11042] [SQL] [BRANCH-1.5 TEST-ONLY] Add...

2015-10-13 Thread yhuai

Github user yhuai closed the pull request at:

https://github.com/apache/spark/pull/9077


[GitHub] spark pull request: [SPARK-11017] [SQL] Support ImperativeAggregat...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9038#issuecomment-147875246
  
  [Test build #43683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43683/consoleFull) for PR 9038 at commit [`2547b29`](https://github.com/apache/spark/commit/2547b29e61cc27580f5dbb68a8ed8f65d8c04848).


[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9103#issuecomment-147875233
  
  [Test build #43682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43682/consoleFull) for PR 9103 at commit [`6e7190e`](https://github.com/apache/spark/commit/6e7190efdac078ed6ca0e355a320c728d1423ab2).


[GitHub] spark pull request: [SPARK-10619] Can't sort columns on Executor P...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9101#issuecomment-147874653
  
Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-10619] Can't sort columns on Executor P...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9101#issuecomment-147874656
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43669/
Test PASSed.


[GitHub] spark pull request: [SPARK-11017] [SQL] Support ImperativeAggregat...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9038#issuecomment-147874402
  
 Merged build triggered.


[GitHub] spark pull request: [SPARK-10619] Can't sort columns on Executor P...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9101#issuecomment-147874507
  
  [Test build #43669 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43669/console) for PR 9101 at commit [`492d915`](https://github.com/apache/spark/commit/492d915f09bc27743d7343554c87839a9d495b51).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-11017] [SQL] Support ImperativeAggregat...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9038#issuecomment-147874435
  
Merged build started.


[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9103#issuecomment-147874502
  
LGTM, pending test.


[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9103#issuecomment-147873489
  
Merged build started.


[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9103#issuecomment-147873466
  
 Merged build triggered.


[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...

2015-10-13 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9104#issuecomment-147872722
  
Can we also remove an extra `ConvertToUnsafe` here?  Specifically, is the 
Parquet table scan still claiming to produce safe rows when it's really 
producing unsafe ones now?


[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...

2015-10-13 Thread liancheng

Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/9104#discussion_r41934985
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
@@ -178,52 +179,26 @@ private[sql] object DataSourceStrategy extends Strategy with Logging {
     sparkPlan
   }
 
-  // TODO: refactor this thing. It is very complicated because it does projection internally.
-  // We should just put a project on top of this.
   private def mergeWithPartitionValues(
-      schema: StructType,
-      requiredColumns: Array[String],
-      partitionColumns: Array[String],
+      requiredColumns: Seq[Attribute],
+      dataColumns: Seq[Attribute],
+      partitionColumnSchema: StructType,
       partitionValues: InternalRow,
       dataRows: RDD[InternalRow]): RDD[InternalRow] = {
-    val nonPartitionColumns = requiredColumns.filterNot(partitionColumns.contains)
-
     // If output columns contain any partition column(s), we need to merge scanned data
     // columns and requested partition columns to form the final result.
-    if (!requiredColumns.sameElements(nonPartitionColumns)) {
-      val mergers = requiredColumns.zipWithIndex.map { case (name, index) =>
-        // To see whether the `index`-th column is a partition column...
-        val i = partitionColumns.indexOf(name)
-        if (i != -1) {
-          val dt = schema(partitionColumns(i)).dataType
-          // If yes, gets column value from partition values.
-          (mutableRow: MutableRow, dataRow: InternalRow, ordinal: Int) => {
-            mutableRow(ordinal) = partitionValues.get(i, dt)
-          }
-        } else {
-          // Otherwise, inherits the value from scanned data.
-          val i = nonPartitionColumns.indexOf(name)
-          val dt = schema(nonPartitionColumns(i)).dataType
-          (mutableRow: MutableRow, dataRow: InternalRow, ordinal: Int) => {
-            mutableRow(ordinal) = dataRow.get(i, dt)
-          }
-        }
+    if (requiredColumns != dataColumns) {
+      // Builds `AttributeReference`s for all partition columns so that we can use them to project
+      // required partition columns.  Note that if a partition column appears in `requiredColumns`,
+      // we should use the `AttributeReference` in `requiredColumns`.
+      val requiredColumnMap = requiredColumns.map(a => a.name -> a).toMap
+      val partitionColumns = partitionColumnSchema.toAttributes.map { a =>
+        requiredColumnMap.getOrElse(a.name, a)
       }
 
-      // Since we know for sure that this closure is serializable, we can avoid the overhead
-      // of cleaning a closure for each RDD by creating our own MapPartitionsRDD. Functionally
-      // this is equivalent to calling `dataRows.mapPartitions(mapPartitionsFunc)` (SPARK-7718).
       val mapPartitionsFunc = (_: TaskContext, _: Int, iterator: Iterator[InternalRow]) => {
-        val dataTypes = requiredColumns.map(schema(_).dataType)
-        val mutableRow = new SpecificMutableRow(dataTypes)
-        iterator.map { dataRow =>
-          var i = 0
-          while (i < mutableRow.numFields) {
-            mergers(i)(mutableRow, dataRow, i)
-            i += 1
-          }
-          mutableRow.asInstanceOf[InternalRow]
-        }
+        val projection = UnsafeProjection.create(requiredColumns, dataColumns ++ partitionColumns)
+        iterator.map(dataRow => projection(new JoinedRow(dataRow, partitionValues)))
--- End diff --

That's a good point, didn't realize `JoinedRow` is mutable. Thanks.
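
Editor's illustration: the review comment being answered is not part of this archive, so the following is only a guess at what a mutable joined row makes possible, namely allocating one instance and repointing it for each data row instead of calling `new` per row as the quoted diff does. `MutableJoined` and `withLeft` are hypothetical stand-ins written for this sketch, not Spark's `JoinedRow` API.

```scala
// Hypothetical stand-in for a mutable two-sided row; this is NOT Spark's JoinedRow.
final class MutableJoined[A, B](var left: A, var right: B) {
  // Repoints the left side in place and returns this same instance.
  def withLeft(newLeft: A): MutableJoined[A, B] = { left = newLeft; this }
}

object JoinedRowReuseSketch {
  val partitionValues: Seq[Int] = Seq(2015)
  val dataRows: Iterator[Seq[Int]] = Iterator(Seq(1), Seq(2), Seq(3))

  // One instance allocated up front, then reused for every data row.
  private val joined = new MutableJoined[Seq[Int], Seq[Int]](Seq.empty, partitionValues)

  val merged: List[Seq[Int]] = dataRows.map { row =>
    val j = joined.withLeft(row)
    // Consume the joined view immediately: the next row overwrites `joined`.
    j.left ++ j.right
  }.toList
}
```

The trade-off in such a pattern is that the shared instance must be consumed (here, copied into a new sequence) before the iterator advances, because the next element overwrites it.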


[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8453#issuecomment-147872531
  
  [Test build #43681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43681/consoleFull) for PR 8453 at commit [`b60ee53`](https://github.com/apache/spark/commit/b60ee53ef9e1172d9072e00d829d8216904dc791).


[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9103#issuecomment-147872318
  
We also need to change this comment: 
https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala#L540


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-10949] Update Snappy version to 1.1.2

2015-10-13 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/8995#issuecomment-147872165
  
Hey @a-roberts,

How about this: 

- Add a `private[spark]` method to the `private[spark]` `CompressionCodec` 
companion object and have that method maintain the hardcoded list of 
compression codecs which support concatenation of serialized streams. This 
method should accept a `CompressionCodec` instance and perform the `instanceof` 
check. I'd consider naming this something like 
"supportsConcatenationOfSerializedStreams" to be very explicit and clear.
- Update `fastMergeIsSupported` to use this new static method.

I like this approach since it makes it very clear why we're only supporting 
those two codecs.

I wouldn't worry about third-party / external compression codecs being able 
to take advantage of this feature.
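
Editor's illustration: a minimal sketch of the shape proposed above, under several assumptions. The comment does not name "those two codecs", so `CodecA` and `CodecB` are placeholders; the stand-in `CompressionCodec` trait exists only so the snippet compiles without Spark; the `private[spark]` modifiers are omitted for the same reason; and the body of `fastMergeIsSupported` is a guess at how a caller might use the new method.

```scala
// Hypothetical stand-ins so the sketch compiles without Spark on the classpath.
trait CompressionCodec
class CodecA extends CompressionCodec // placeholder for a supported codec
class CodecB extends CompressionCodec // placeholder for a supported codec
class CodecC extends CompressionCodec // placeholder for any codec not on the list

object CompressionCodec {
  // The hardcoded list lives in exactly one place: this pattern match.
  // (In Spark, the proposal is for this method to be private[spark].)
  def supportsConcatenationOfSerializedStreams(codec: CompressionCodec): Boolean =
    codec match {
      case _: CodecA | _: CodecB => true
      case _ => false
    }
}

object FastMergeSketch {
  // Assumed caller logic: fast merge is fine when nothing is compressed,
  // or when the codec tolerates concatenated streams.
  def fastMergeIsSupported(compressionEnabled: Boolean, codec: CompressionCodec): Boolean =
    !compressionEnabled || CompressionCodec.supportsConcatenationOfSerializedStreams(codec)
}
```

Keeping the type check in the companion object means a reader of the call site sees a descriptive method name rather than a bare list of codec classes.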


[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...

2015-10-13 Thread liancheng

Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/9104#issuecomment-147872104
  
Micro-benchmark result with TPC-DS (scale-factor 15) `store_sales` table 
shows a ~12% performance gain.

Before:

- Round 0: 8133 ms
- Round 1: 7799 ms
- Round 2: 8010 ms
- Round 3: 8009 ms
- Round 4: 8223 ms
- Average: 8034.8 ms

After:

- Round 0: 7401 ms
- Round 1: 6897 ms
- Round 2: 6873 ms
- Round 3: 6935 ms
- Round 4: 7056 ms
- Average: 7032.4 ms

Benchmark code (where `ss_sold_date_sk` is an `INT` partitioning column and 
`ss_sold_time_sk` is an `INT` data column):

```scala
import com.google.common.base.Stopwatch

// Runs `f` `warmupRuns` times untimed, then `runs` times timed,
// printing each round and the average.
def benchmark(runs: Int, warmupRuns: Int = 0)(f: => Unit): Unit = {
  val stopwatch = new Stopwatch()

  // Warm-up rounds are executed but not measured.
  (0 until warmupRuns).foreach { i =>
    f
  }

  // Times a single round and returns the elapsed milliseconds.
  def run(i: Int) = {
    stopwatch.reset()
    stopwatch.start()
    f
    stopwatch.stop()
    val elapsed = stopwatch.elapsedMillis()
    println(s"Round $i: $elapsed ms")
    elapsed
  }

  val total = (0 until runs).map(i => run(i)).sum.toDouble
  println(s"Average: ${total / runs} ms")
}

val path = "file:///Users/lian/tpcds/sf15/store_sales"

benchmark(5, 5) {
  val df = sqlContext.read.parquet(path).selectExpr("ss_sold_time_sk", "ss_sold_date_sk")
  // Force a full scan without collecting rows to the driver.
  df.queryExecution.toRdd.foreach(row => ())
}
```
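
Editor's illustration: the averages and the ~12% figure can be re-derived from the per-round timings listed above. The object and method names below are made up for this check; only the ten timings come from the message.

```scala
object BenchmarkGain {
  // Round timings in milliseconds, copied from the "Before" and "After" lists.
  val before: Seq[Int] = Seq(8133, 7799, 8010, 8009, 8223)
  val after: Seq[Int] = Seq(7401, 6897, 6873, 6935, 7056)

  def mean(xs: Seq[Int]): Double = xs.sum.toDouble / xs.size

  val beforeAvg: Double = mean(before) // 40174 / 5
  val afterAvg: Double = mean(after)   // 35162 / 5

  // Relative reduction in average run time.
  val gain: Double = (beforeAvg - afterAvg) / beforeAvg
}
```

The sums are 40174 ms and 35162 ms, giving averages of 8034.8 ms and 7032.4 ms and a relative reduction of roughly 12.5%, consistent with the reported figure.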


[GitHub] spark pull request: [SPARK-10185] [SQL] Feat sql comma separated p...

2015-10-13 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/8416#issuecomment-147872102
  
Thanks for working on this. I spent some time debating the API with @rxin, 
and here is what we came up with:

 - Calling the function `load(paths: Array[String])` would be more 
consistent with the rest of the reader API.  This precludes using varargs, but 
that is probably not the most common use of this function.
 - We need to add support in Python too.
 - It would be good to also add a test to make sure that we aren't breaking 
comma handling for single / multiple paths.
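
Editor's illustration: a rough sketch of the first point, showing only the call-site shape of an array-based `load`. `SketchReader` is a hypothetical class, not Spark's reader; its methods just return the paths they were given so the behaviour can be inspected.

```scala
// Hypothetical reader; only the `load(paths: Array[String])` shape comes from the comment.
class SketchReader {
  // Single-path overload delegates to the array form.
  def load(path: String): Seq[String] = load(Array(path))

  // Proposed shape: an explicit array of paths rather than varargs.
  def load(paths: Array[String]): Seq[String] = paths.toSeq
}

object SketchReaderDemo {
  val reader = new SketchReader
  val one: Seq[String] = reader.load("/data/a")
  val many: Seq[String] = reader.load(Array("/data/a", "/data/b"))
}
```

With this shape the caller wraps multiple paths in `Array(...)` explicitly, which is the cost of giving up varargs noted in the comment.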



[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147871946
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147871947
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43680/
Test FAILed.



[GitHub] spark pull request: Branch 1.5

2015-10-13 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/9071#issuecomment-147871053
  
Hey @xif10416s, do you mind closing this issue?



[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8453#issuecomment-147870905
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8453#issuecomment-147870927
  
Merged build started.



[GitHub] spark pull request: [SPARK-10932] [PROJECT INFRA] Port two minor c...

2015-10-13 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8986



[GitHub] spark pull request: [SPARK-10932] [PROJECT INFRA] Port two minor c...

2015-10-13 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/8986#issuecomment-147870089
  
Going to merge this now so that it doesn't become stale or get forgotten. 
I'll address return code checks if the problem re-occurs once we're closer to 
the release.



[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9078#issuecomment-147869856
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43667/
Test PASSed.



[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9078#issuecomment-147869855
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-11068][SQL] add callback to query execu...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9078#issuecomment-147869665
  
  [Test build #43667 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43667/console) for PR 9078 at commit [`2d81674`](https://github.com/apache/spark/commit/2d816740084c447bf08e36035467a907f70df667).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `trait QueryExecutionListener `
  * `class ExecutionListenerManager extends Logging `




[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147868900
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147868918
  
Merged build started.



[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9105#issuecomment-147868773
  
  [Test build #43679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43679/consoleFull) for PR 9105 at commit [`67f3f32`](https://github.com/apache/spark/commit/67f3f325f127fe2b5c5ca7619ec47cec01dc6389).



[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...

2015-10-13 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9093



[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...

2015-10-13 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9093#issuecomment-147868516
  
Merging to master.



[GitHub] spark pull request: Comment Syntax checkup

2015-10-13 Thread aertoria

Github user aertoria closed the pull request at:

https://github.com/apache/spark/pull/9080



[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9105#issuecomment-147867971
  
Merged build started.



[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9105#issuecomment-147867919
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having

2015-10-13 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9105#issuecomment-147867587
  
LGTM pending tests



[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having

2015-10-13 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9105#issuecomment-147866979
  
cc @marmbrus 



[GitHub] spark pull request: [SPARK-11032][SQL] correctly handle having

2015-10-13 Thread cloud-fan

GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/9105

[SPARK-11032][SQL] correctly handle having

We should not stop resolving the having clause just because the having condition is already resolved; otherwise something like `count(1)` will crash.
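
One plausible reading of the problem: a condition such as `count(1) > 0` references no columns, so it can be fully resolved while still containing an aggregate call that needs further handling. The toy model below illustrates that situation under stated assumptions. Every name in it is hypothetical; these are not Catalyst's classes and this is not the actual patch.

```scala
object HavingSketch {
  // Toy expression tree; these are NOT Catalyst classes.
  sealed trait Expr {
    def children: Seq[Expr]
    // An expression counts as resolved when nothing beneath it is unresolved.
    def resolved: Boolean = children.forall(_.resolved)
  }
  final case class Literal(value: Int) extends Expr {
    val children: Seq[Expr] = Nil
  }
  final case class UnresolvedAttribute(name: String) extends Expr {
    val children: Seq[Expr] = Nil
    override def resolved: Boolean = false
  }
  final case class Count(child: Expr) extends Expr {
    val children: Seq[Expr] = Seq(child)
  }
  final case class GreaterThan(left: Expr, right: Expr) extends Expr {
    val children: Seq[Expr] = Seq(left, right)
  }

  // True if an aggregate call appears anywhere in the condition.
  def containsAggregate(e: Expr): Boolean = e match {
    case _: Count => true
    case other    => other.children.exists(containsAggregate)
  }

  // Guard keyed on resolution status alone: skips already-resolved conditions.
  def needsRewriteByResolution(cond: Expr): Boolean = !cond.resolved

  // Guard keyed on what the condition contains, regardless of resolution status.
  def needsRewriteByContent(cond: Expr): Boolean =
    containsAggregate(cond) || !cond.resolved

  def main(args: Array[String]): Unit = {
    // Models: HAVING count(1) > 0
    val having = GreaterThan(Count(Literal(1)), Literal(0))
    println(s"resolved=${having.resolved}")
    println(s"byResolution=${needsRewriteByResolution(having)}")
    println(s"byContent=${needsRewriteByContent(having)}")
  }
}
```

In this model the resolution-only guard skips the `count(1)` condition even though it still holds an aggregate, which is the kind of case the description says must keep being processed.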

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark having

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9105


commit 67f3f325f127fe2b5c5ca7619ec47cec01dc6389
Author: Wenchen Fan 
Date:   2015-10-13T22:02:42Z

correctly handle having





[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...

2015-10-13 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9104#discussion_r41932293
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
@@ -178,52 +179,26 @@ private[sql] object DataSourceStrategy extends Strategy with Logging {
 sparkPlan
   }
 
-  // TODO: refactor this thing. It is very complicated because it does projection internally.
-  // We should just put a project on top of this.
   private def mergeWithPartitionValues(
-  schema: StructType,
-  requiredColumns: Array[String],
-  partitionColumns: Array[String],
+  requiredColumns: Seq[Attribute],
+  dataColumns: Seq[Attribute],
+  partitionColumnSchema: StructType,
   partitionValues: InternalRow,
   dataRows: RDD[InternalRow]): RDD[InternalRow] = {
-val nonPartitionColumns = requiredColumns.filterNot(partitionColumns.contains)
-
 // If output columns contain any partition column(s), we need to merge scanned data
 // columns and requested partition columns to form the final result.
-if (!requiredColumns.sameElements(nonPartitionColumns)) {
-  val mergers = requiredColumns.zipWithIndex.map { case (name, index) =>
-// To see whether the `index`-th column is a partition column...
-val i = partitionColumns.indexOf(name)
-if (i != -1) {
-  val dt = schema(partitionColumns(i)).dataType
-  // If yes, gets column value from partition values.
-  (mutableRow: MutableRow, dataRow: InternalRow, ordinal: Int) => {
-mutableRow(ordinal) = partitionValues.get(i, dt)
-  }
-} else {
-  // Otherwise, inherits the value from scanned data.
-  val i = nonPartitionColumns.indexOf(name)
-  val dt = schema(nonPartitionColumns(i)).dataType
-  (mutableRow: MutableRow, dataRow: InternalRow, ordinal: Int) => {
-mutableRow(ordinal) = dataRow.get(i, dt)
-  }
-}
+if (requiredColumns != dataColumns) {
+  // Builds `AttributeReference`s for all partition columns so that we can use them to project
+  // required partition columns.  Note that if a partition column appears in `requiredColumns`,
+  // we should use the `AttributeReference` in `requiredColumns`.
+  val requiredColumnMap = requiredColumns.map(a => a.name -> a).toMap
+  val partitionColumns = partitionColumnSchema.toAttributes.map { a =>
+requiredColumnMap.getOrElse(a.name, a)
   }
 
-  // Since we know for sure that this closure is serializable, we can avoid the overhead
-  // of cleaning a closure for each RDD by creating our own MapPartitionsRDD. Functionally
-  // this is equivalent to calling `dataRows.mapPartitions(mapPartitionsFunc)` (SPARK-7718).
   val mapPartitionsFunc = (_: TaskContext, _: Int, iterator: Iterator[InternalRow]) => {
-val dataTypes = requiredColumns.map(schema(_).dataType)
-val mutableRow = new SpecificMutableRow(dataTypes)
-iterator.map { dataRow =>
-  var i = 0
-  while (i < mutableRow.numFields) {
-mergers(i)(mutableRow, dataRow, i)
-i += 1
-  }
-  mutableRow.asInstanceOf[InternalRow]
-}
+val projection = UnsafeProjection.create(requiredColumns, dataColumns ++ partitionColumns)
+iterator.map(dataRow => projection(new JoinedRow(dataRow, partitionValues)))
--- End diff --

Do we have to allocate a new JoinedRow each time?
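
One option this question points toward is allocating a single mutable joined view per partition iterator and re-pointing it at each incoming row. The sketch below uses toy types to show the pattern; `ReusableJoinedRow`, `project`, and `mergeAll` are hypothetical names, not Spark's classes, and the sketch assumes the projection copies values out of the view before the next row arrives.

```scala
object JoinedRowReuseSketch {
  // Toy mutable view over two rows; a hypothetical stand-in, not Spark's JoinedRow.
  final class ReusableJoinedRow {
    private var left: Array[Any] = Array.empty[Any]
    private var right: Array[Any] = Array.empty[Any]

    // Re-points the view at a new pair of rows and returns the same instance.
    def apply(l: Array[Any], r: Array[Any]): ReusableJoinedRow = {
      left = l
      right = r
      this
    }

    def numFields: Int = left.length + right.length

    // Ordinals below left.length read the left row; the rest read the right row.
    def get(i: Int): Any =
      if (i < left.length) left(i) else right(i - left.length)
  }

  // Stand-in projection: copies the selected ordinals out of the view.
  def project(ordinals: Seq[Int])(row: ReusableJoinedRow): Vector[Any] =
    ordinals.map(row.get).toVector

  def mergeAll(
      dataRows: Iterator[Array[Any]],
      partitionValues: Array[Any],
      ordinals: Seq[Int]): Iterator[Vector[Any]] = {
    val joined = new ReusableJoinedRow // allocated once per iterator, not per row
    dataRows.map(dataRow => project(ordinals)(joined(dataRow, partitionValues)))
  }

  def main(args: Array[String]): Unit = {
    val rows = Iterator(Array[Any](1, "a"), Array[Any](2, "b"))
    val merged = mergeAll(rows, Array[Any]("2015-10-13"), Seq(2, 0)).toList
    println(merged)
  }
}
```

Reuse of this kind is only safe when nothing downstream keeps a reference to the shared view, since its contents change on every element.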



[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9104#issuecomment-147865822
  
  [Test build #43678 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43678/consoleFull) for PR 9104 at commit [`23a0fc2`](https://github.com/apache/spark/commit/23a0fc2ef86daa8faa785ef2ea3f1d7b5d1b692c).



[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147864775
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43663/
Test PASSed.



[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147864773
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147864600
  
  [Test build #43663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43663/console) for PR 9100 at commit [`9fdcaae`](https://github.com/apache/spark/commit/9fdcaae7230ee1c1d9dbffbf9e931dadf4517a82).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `def getPath = path.getOrElse(sys.error("Constructors must start at a class type"))`
  * `case class WrapOption(optionType: DataType, child: Expression)`
  * `class GenericArrayData(val array: Array[Any]) extends ArrayData `




[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9104#issuecomment-147864429
  
Merged build started.



[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9104#issuecomment-147864406
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...

2015-10-13 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/9093#issuecomment-147864365
  
It already passed tests as of the latest commit.



[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...

2015-10-13 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/8453#discussion_r41930886
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Catalog.scala ---
@@ -165,68 +133,51 @@ class SimpleCatalog(val conf: CatalystConf) extends Catalog {
  * lost when the JVM exits.
  */
 trait OverrideCatalog extends Catalog {
-
   // TODO: This doesn't work when the database changes...
--- End diff --

This comment is outdated now.



[GitHub] spark pull request: [SPARK-10412][SQL] report memory usage for tun...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8931#issuecomment-147864159
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43665/
Test PASSed.



[GitHub] spark pull request: [SPARK-10412][SQL] report memory usage for tun...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8931#issuecomment-147864158
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...

2015-10-13 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9093#issuecomment-147864120
  
LGTM pending tests



[GitHub] spark pull request: [PYTHON] [MINOR] List modules in PySpark tests...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9088#issuecomment-147864131
  
  [Test build #1892 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1892/console) for PR 9088 at commit [`85cf1ca`](https://github.com/apache/spark/commit/85cf1ca08c8e079eea9150808e9d3d766671ccc0).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10412][SQL] report memory usage for tun...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8931#issuecomment-147863989
  
  [Test build #43665 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43665/console) for PR 8931 at commit [`fdae182`](https://github.com/apache/spark/commit/fdae1827564a0535f22a19b432442d66e56f12a6).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-11088] [SQL] Merges partition values us...

2015-10-13 Thread liancheng

GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/9104

[SPARK-11088] [SQL] Merges partition values using UnsafeProjection

`DataSourceStrategy.mergeWithPartitionValues` is essentially a projection 
implemented in a quite inefficient way. This PR optimizes this method with 
`UnsafeProjection` to avoid unnecessary boxing costs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark spark-11088.faster-partition-values-merging

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9104


commit 23a0fc2ef86daa8faa785ef2ea3f1d7b5d1b692c
Author: Cheng Lian 
Date:   2015-10-13T18:04:36Z

Merges partition values using UnsafeProjection





[GitHub] spark pull request: [SPARK-11072][SQL] simplify self join handling

2015-10-13 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9081#discussion_r41930462
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/MultiInstanceRelation.scala ---
@@ -1,33 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.catalyst.analysis
-
-import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
-
-/**
- * A trait that should be mixed into query operators where an single instance might appear multiple
- * times in a logical query plan.  It is invalid to have multiple copies of the same attribute
- * produced by distinct operators in a query tree as this breaks the guarantee that expression
- * ids, which are used to differentiate attributes, are unique.
- *
- * During analysis, operators that include this trait may be asked to produce a new version
- * of itself with globally unique expression ids.
--- End diff --

In fact, much of this could probably be copied to NewOutput.



[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9012#issuecomment-147863440
  
  [Test build #43677 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43677/console) for PR 9012 at commit [`ba4c91b`](https://github.com/apache/spark/commit/ba4c91bfbc732737db958e0e0905b8ce25b00647).
 * This patch **fails R style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9012#issuecomment-147863446
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43677/
Test FAILed.



[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9012#issuecomment-147863445
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-11072][SQL] simplify self join handling

2015-10-13 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9081#discussion_r41930249
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala ---
@@ -28,6 +28,17 @@ import org.apache.spark.util.MutablePair
 import org.apache.spark.util.random.PoissonSampler
 import org.apache.spark.{HashPartitioner, SparkEnv}
 
+@DeveloperApi
+case class NewOutput(output: Seq[Attribute], child: SparkPlan) extends UnaryNode {
--- End diff --

scaladoc please



[GitHub] spark pull request: [SPARK-11072][SQL] simplify self join handling

2015-10-13 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/9081#discussion_r41930272
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala ---
@@ -23,6 +23,25 @@ import org.apache.spark.sql.catalyst.plans._
 import org.apache.spark.sql.types._
 import org.apache.spark.util.collection.OpenHashSet
 
+case class NewOutput(output: Seq[Attribute], child: LogicalPlan) extends UnaryNode {
--- End diff --

Can you add scaladoc to both of these that explains what they are for?



[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9012#issuecomment-147863143
  
  [Test build #43677 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43677/consoleFull) for PR 9012 at commit [`ba4c91b`](https://github.com/apache/spark/commit/ba4c91bfbc732737db958e0e0905b8ce25b00647).



[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9103#issuecomment-147863010
  
  [Test build #43676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43676/consoleFull) for PR 9103 at commit [`5a9e388`](https://github.com/apache/spark/commit/5a9e388c097285fa47367455e362ce774e610923).



[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147862557
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147862505
  
  [Test build #43675 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43675/console) for PR 9100 at commit [`d1b6d01`](https://github.com/apache/spark/commit/d1b6d018527c2ef7163c3f599413fb1047d2cc0f).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `def getPath = path.getOrElse(sys.error("Constructors must start at a class type"))`
  * `case class WrapOption(optionType: DataType, child: Expression)`
  * `class GenericArrayData(val array: Array[Any]) extends ArrayData `




[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147862559
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43675/
Test FAILed.



[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9012#issuecomment-147861380
  
Merged build started.



[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9012#issuecomment-147861343
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9103#issuecomment-147861364
  
Merged build started.



[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9103#issuecomment-147861336
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...

2015-10-13 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/9093#issuecomment-147860936
  
Updated; PTAL @marmbrus.



[GitHub] spark pull request: [SPARK-11091] [SQL] Change spark.sql.canonical...

2015-10-13 Thread yhuai

GitHub user yhuai opened a pull request:

https://github.com/apache/spark/pull/9103

[SPARK-11091] [SQL] Change spark.sql.canonicalizeView to spark.sql.nativeView.

https://issues.apache.org/jira/browse/SPARK-11091

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yhuai/spark SPARK-11091

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9103.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9103


commit 5a9e388c097285fa47367455e362ce774e610923
Author: Yin Huai 
Date:   2015-10-13T21:32:54Z

Change spark.sql.canonicalizeView to spark.sql.nativeView.





[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread felixcheung

Github user felixcheung commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147860350
  
You could probably add to that. This is just an extra test to be safe (and it 
should check for values too; the current tests don't seem to do that, only column 
names, data types, and counts).



[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9093#issuecomment-147860184
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-11080] [SQL] Incorporate per-JVM id int...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9093#issuecomment-147860186
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43662/
Test PASSed.



[GitHub] spark pull request: [SPARK-11080] [SQL] Throw exception when Named...

2015-10-13 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/9093#issuecomment-147859959
  
Alright, updating now



[GitHub] spark pull request: [SPARK-11080] [SQL] Throw exception when Named...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9093#issuecomment-147859909
  
  [Test build #43662 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43662/console) for PR 9093 at commit [`955a1a8`](https://github.com/apache/spark/commit/955a1a879cb964e0bca64e716371f7fec1fe32cf).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class ChildProcAppHandle implements SparkAppHandle `
  * `abstract class LauncherConnection implements Closeable, Runnable `
  * `final class LauncherProtocol `
  * `  static class Message implements Serializable `
  * `  static class Hello extends Message `
  * `  static class SetAppId extends Message `
  * `  static class SetState extends Message `
  * `  static class Stop extends Message `
  * `class LauncherServer implements Closeable `
  * `class NamedThreadFactory implements ThreadFactory `
  * `class OutputRedirector `
  * `case class ExprId(id: Long, jvmId: UUID)`
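
As a rough illustration of how an id of the shape `ExprId(id: Long, jvmId: UUID)` can stay unique across JVMs, the sketch below pairs a per-process counter with a UUID generated once per JVM. Only the case class signature comes from the list above; the object `ExprIdSketch` and its method `newExprId` are assumptions made for this example and are not taken from the PR.

```scala
import java.util.UUID
import java.util.concurrent.atomic.AtomicLong

// Mirrors the signature listed in the build report above.
final case class ExprId(id: Long, jvmId: UUID)

// Hypothetical generator, not the PR's code.
object ExprIdSketch {
  // Generated once when the object is first loaded, so it is shared by every id from this JVM.
  private val jvmId: UUID = UUID.randomUUID()
  // Thread-safe counter that makes ids distinct within this JVM.
  private val curId = new AtomicLong(0L)

  def newExprId(): ExprId = ExprId(curId.getAndIncrement(), jvmId)
}
```

Two ids produced in the same process share `jvmId` and differ in `id`; ids produced in different processes differ in `jvmId` even when their counters collide.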




[GitHub] spark pull request: [SPARKR] [SPARK-10981] SparkR Join improvement...

2015-10-13 Thread felixcheung

Github user felixcheung commented on the pull request:

https://github.com/apache/spark/pull/9029#issuecomment-147859792
  
looks good



[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9083#issuecomment-147859196
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9083#issuecomment-147859198
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43670/
Test PASSed.



[GitHub] spark pull request: [SPARK-11059] [ML] Change range of quantile pr...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9083#issuecomment-147858753
  
  [Test build #43670 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43670/console) for PR 9083 at commit [`2e76b0c`](https://github.com/apache/spark/commit/2e76b0c95f78515a2ae93419e0455f87a86a017f).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147858705
  
  [Test build #43675 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43675/consoleFull) for PR 9100 at commit [`d1b6d01`](https://github.com/apache/spark/commit/d1b6d018527c2ef7163c3f599413fb1047d2cc0f).



[GitHub] spark pull request: [SPARK-10104][SQL] Consolidate different forms...

2015-10-13 Thread davies

Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/8453#discussion_r41927635
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/TableIdentifier.scala ---
@@ -20,14 +20,22 @@ package org.apache.spark.sql.catalyst
 /**
  * Identifies a `table` in `database`.  If `database` is not defined, the current database is used.
  */
-private[sql] case class TableIdentifier(table: String, database: Option[String] = None) {
-  def withDatabase(database: String): TableIdentifier = this.copy(database = Some(database))
+private[sql] case class TableIdentifier(table: String, database: Option[String]) {
+  def this(table: String) = this(table, None)
 
-  def toSeq: Seq[String] = database.toSeq :+ table
+  override def toString: String = {
+if (table.contains('.') || database.exists(_.contains('.'))) {
--- End diff --

There are some other characters that need quoting.
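
One way to cover more than the `.` case is to quote any name part that is not a plain identifier. The sketch below is illustrative only and is not the PR's code: the names `QuoteSketch`, `quoteIfNeeded`, and `render` are hypothetical, and both the choice of "letters, digits, and underscores" as the unquoted set and the doubling of embedded backticks are assumptions.

```scala
object QuoteSketch {
  // Quote a name part with backticks unless it is a plain identifier
  // (a letter or underscore followed by letters, digits, or underscores).
  def quoteIfNeeded(part: String): String =
    if (part.matches("[A-Za-z_][A-Za-z0-9_]*")) part
    else "`" + part.replace("`", "``") + "`"

  // Render an optional database and a table name as a dotted, quoted string.
  def render(table: String, database: Option[String]): String =
    (database.toSeq :+ table).map(quoteIfNeeded).mkString(".")
}
```

With this rule a table named `a.b` in database `db` is rendered with only the table part quoted, so the dot inside the name is not mistaken for the separator.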



[GitHub] spark pull request: [SPARK-11086][SPARKR] Use dropFactors column-w...

2015-10-13 Thread zero323

Github user zero323 commented on the pull request:

https://github.com/apache/spark/pull/9099#issuecomment-147858070
  
Sure. Should I make a separate test for that or simply add to [`create DataFrame from list or data.frame`](https://github.com/apache/spark/blob/master/R/pkg/inst/tests/test_sparkSQL.R#L227)?



[GitHub] spark pull request: [SPARK-11072][SQL] simplify self join handling

2015-10-13 Thread cloud-fan

Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9081#issuecomment-147857302
  
cc @marmbrus @yhuai 



[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147856525
  
Merged build started.



[GitHub] spark pull request: [SPARK-11090] [SQL] Constructor for Product ty...

2015-10-13 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9100#issuecomment-147856494
  
 Merged build triggered.


