[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19099

**[Test build #81299 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81299/testReport)** for PR 19099 at commit [`7a364a1`](https://github.com/apache/spark/commit/7a364a192f15bc99e362a2615c775730cb11fc24).
[GitHub] spark issue #19099: [SPARK-21652][SQL] Fix rule confliction between InferFil...
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19099

cc @gatorsmile
[GitHub] spark issue #19060: [WIP][SQL] Add DataSourceSuite validating data sources l...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19060

The previous Parquet link is broken. The official one is https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/test/java/org/apache/parquet/hadoop/example/TestInputOutputFormat.java
[GitHub] spark pull request #19099: [SPARK-21652][SQL] Fix rule confliction between I...
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/19099

[SPARK-21652][SQL] Fix rule confliction between InferFiltersFromConstraints and ConstantPropagation

## What changes were proposed in this pull request?

For the example below, the predicate added by `InferFiltersFromConstraints` is later folded away by `ConstantPropagation`; this leads to a non-converging optimizer iteration:

```
Seq((1, 1)).toDF("col1", "col2").createOrReplaceTempView("t1")
Seq(1, 2).toDF("col").createOrReplaceTempView("t2")
sql("SELECT * FROM t1, t2 WHERE t1.col1 = 1 AND 1 = t1.col2 AND t1.col1 = t2.col AND t1.col2 = t2.col")
```

We can fix this by adjusting the order of the optimizer rules.

## How was this patch tested?

Added a test case in `SQLQuerySuite` that would have failed before this fix.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiangxb1987/spark unconverge-optimization

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/19099.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19099

commit 7a364a192f15bc99e362a2615c775730cb11fc24
Author: Xingbo Jiang
Date:   2017-08-31T21:49:02Z

    fix rule confliction between InferFiltersFromConstraints and ConstantPropagation.
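The non-convergence is easy to observe from the snippet above. Here is a self-contained sketch, assuming a Spark build without this fix, on which the optimizer log warns that the maximum number of iterations was reached:

```scala
import org.apache.spark.sql.SparkSession

// Minimal reproduction sketch for SPARK-21652 (assumes a pre-fix build; the
// interesting output is the optimizer's max-iterations warning in the log).
object Spark21652Repro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SPARK-21652 repro")
      .getOrCreate()
    import spark.implicits._

    Seq((1, 1)).toDF("col1", "col2").createOrReplaceTempView("t1")
    Seq(1, 2).toDF("col").createOrReplaceTempView("t2")

    val df = spark.sql(
      "SELECT * FROM t1, t2 WHERE t1.col1 = 1 AND 1 = t1.col2 " +
        "AND t1.col1 = t2.col AND t1.col2 = t2.col")

    // Forces optimization; the result itself is not the point.
    df.explain(true)

    spark.stop()
  }
}
```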
[GitHub] spark issue #19060: [WIP][SQL] Add DataSourceSuite validating data sources l...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19060

**[Test build #81298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81298/testReport)** for PR 19060 at commit [`104f24c`](https://github.com/apache/spark/commit/104f24c9ad0743dc7c6329b4c0dde902e8e87de6).
[GitHub] spark pull request #16774: [SPARK-19357][ML] Adding parallel model evaluatio...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/16774#discussion_r136457345

--- Diff: mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -120,6 +120,33 @@ class CrossValidatorSuite
     }
   }
 
+  test("cross validation with parallel evaluation") {
+    val lr = new LogisticRegression
+    val lrParamMaps = new ParamGridBuilder()
+      .addGrid(lr.regParam, Array(0.001, 1000.0))
+      .addGrid(lr.maxIter, Array(0, 3))
+      .build()
+    val eval = new BinaryClassificationEvaluator
+    val cv = new CrossValidator()
+      .setEstimator(lr)
+      .setEstimatorParamMaps(lrParamMaps)
+      .setEvaluator(eval)
+      .setNumFolds(2)
+      .setParallelism(1)
+    val cvSerialModel = cv.fit(dataset)
+    cv.setParallelism(2)
--- End diff --

It's a little difficult to do this in a unit test without making it flaky. I have run tests manually and verified it is working, both by the expected speedup in timing and by seeing the expected number of tasks run concurrently. I can post some results if that would help.
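A rough version of that manual timing check might look like the sketch below; it reuses `cv` and `dataset` from the quoted test and assumes the `setParallelism` method this PR adds, so it only runs on the patched build.

```scala
// Hedged sketch of a manual speedup check (deliberately not a unit test).
def timeMs[T](body: => T): (T, Long) = {
  val start = System.nanoTime()
  val result = body
  (result, (System.nanoTime() - start) / 1000000L)
}

val (serialModel, serialMs) = timeMs { cv.setParallelism(1).fit(dataset) }
val (parallelModel, parallelMs) = timeMs { cv.setParallelism(2).fit(dataset) }

println(s"serial: ${serialMs} ms, parallel: ${parallelMs} ms")
// On a machine with spare cores one would expect parallelMs < serialMs, but the
// exact numbers are load-dependent -- which is exactly why asserting on them
// in a unit test would be flaky.
```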
[GitHub] spark pull request #16774: [SPARK-19357][ML] Adding parallel model evaluatio...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/16774#discussion_r136456379

--- Diff: mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -120,6 +120,33 @@ class CrossValidatorSuite
     }
   }
 
+  test("cross validation with parallel evaluation") {
+    val lr = new LogisticRegression
+    val lrParamMaps = new ParamGridBuilder()
+      .addGrid(lr.regParam, Array(0.001, 1000.0))
+      .addGrid(lr.maxIter, Array(0, 3))
+      .build()
+    val eval = new BinaryClassificationEvaluator
+    val cv = new CrossValidator()
+      .setEstimator(lr)
+      .setEstimatorParamMaps(lrParamMaps)
+      .setEvaluator(eval)
+      .setNumFolds(2)
+      .setParallelism(1)
--- End diff --

So the seed param here is fixed by default and doesn't need to be set to ensure consistent results. I think that's why it's not set in the other tests in this suite. I'm not a fan of this behavior and I think it's better to explicitly set it in tests, but then we should probably be consistent and set it elsewhere too. What are your thoughts on this, @MLnick?
[GitHub] spark pull request #19082: [SPARK-21870][SQL] Split aggregation code into sm...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19082#discussion_r136455832

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala ---
@@ -244,6 +246,92 @@ case class HashAggregateExec(
 
   protected override val shouldStopRequired = false
 
+  // We assume a prefix has lower cases and a name has camel cases
+  private val variableName = "^[a-z]+_[a-zA-Z]+[0-9]*".r
+
+  // Returns true if a given name id belongs to this `CodegenContext`
+  private def isVariable(nameId: String): Boolean = nameId match {
+    case variableName() => true
+    case _ => false
+  }
+
+  // Extracts all the outer references for a given `aggExpr`. This result will be used to split
+  // aggregation into small functions.
+  private def getOuterReferences(
+      ctx: CodegenContext,
+      aggExpr: Expression,
+      subExprs: Map[Expression, SubExprEliminationState]): Set[(String, String)] = {
+    val stack = mutable.Stack[Expression](aggExpr)
+    val argSet = mutable.Set[(String, String)]()
+    val addIfNotLiteral = (value: String, tpe: String) => {
+      if (isVariable(value)) {
+        argSet += ((tpe, value))
+      }
+    }
+    while (stack.nonEmpty) {
+      stack.pop() match {
+        case e if subExprs.contains(e) =>
+          val exprCode = subExprs(e)
+          addIfNotLiteral(exprCode.value, ctx.javaType(e.dataType))
+          addIfNotLiteral(exprCode.isNull, "boolean")
+          // Since the children possibly has common expressions, we push them here
+          stack.pushAll(e.children)
+        case ref: BoundReference
+            if ctx.currentVars != null && ctx.currentVars(ref.ordinal) != null =>
+          val argVal = ctx.currentVars(ref.ordinal).value
+          addIfNotLiteral(argVal, ctx.javaType(ref.dataType))
+          addIfNotLiteral(ctx.currentVars(ref.ordinal).isNull, "boolean")
+        case _: BoundReference =>
+          argSet += (("InternalRow", ctx.INPUT_ROW))
+        case e =>
+          stack.pushAll(e.children)
+      }
+    }
+
+    argSet.toSet
+  }
+
+  // Splits the aggregation into small functions because the HotSpot does not compile
+  // too long functions.
+  private def splitAggregateExpressions(
+      ctx: CodegenContext,
+      aggExprs: Seq[Expression],
+      evalAndUpdateCodes: Seq[String],
+      subExprs: Map[Expression, SubExprEliminationState],
+      otherArgs: Seq[(String, String)] = Seq.empty): Seq[String] = {
+    aggExprs.zipWithIndex.map { case (aggExpr, i) =>
+      // The maximum number of parameters in Java methods is 255, so this method gives up
+      // splitting the code if the number goes over the limit.
+      // You can find more information about the limit in the JVM specification:
+      //   - The number of method parameters is limited to 255 by the definition of a method
+      //     descriptor, where the limit includes one unit for this in the case of instance
+      //     or interface method invocations.
+      val args = (getOuterReferences(ctx, aggExpr, subExprs) ++ otherArgs).toSeq
+
+      // This is for testing/benchmarking only
+      val maxParamNumInJavaMethod =
+        sqlContext.getConf("spark.sql.codegen.aggregate.maxParamNumInJavaMethod", null) match {
--- End diff --

Can we add a check for the case where a user specifies a value greater than 255?
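A guard of the kind suggested here could be a one-line `require`. The following is only a sketch: the config key comes from the diff above, while the constant name and the error message are illustrative.

```scala
// Sketch of the suggested bound check; only the config key is taken from the
// diff above, everything else is illustrative.
val MaxParamsPerJavaMethod = 255

val maxParamNum = Option(
    sqlContext.getConf("spark.sql.codegen.aggregate.maxParamNumInJavaMethod", null))
  .map(_.toInt)
  .getOrElse(MaxParamsPerJavaMethod)

// require always throws IllegalArgumentException on bad input, unlike assert.
require(maxParamNum > 0 && maxParamNum <= MaxParamsPerJavaMethod,
  s"spark.sql.codegen.aggregate.maxParamNumInJavaMethod must be in " +
    s"(0, $MaxParamsPerJavaMethod] but was $maxParamNum")
```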
[GitHub] spark pull request #18883: [SPARK-21276][CORE] Update lz4-java to the latest...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/18883#discussion_r136451595

--- Diff: project/MimaExcludes.scala ---
@@ -41,7 +41,10 @@ object MimaExcludes {
     // [SPARK-19937] Add remote bytes read to disk.
     ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetrics.this"),
-    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetricDistributions.this")
+    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetricDistributions.this"),
+
+    // [SPARK-21276] Update lz4-java to the latest (v1.4.0)
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.io.LZ4BlockInputStream")
--- End diff --

By the way, I'm not sure we want to pursue strict compatibility here; just pointing out the issue.
[GitHub] spark pull request #19097: [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pu...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19097
[GitHub] spark issue #19097: [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19097

Thanks! Merging to master.
[GitHub] spark pull request #18883: [SPARK-21276][CORE] Update lz4-java to the latest...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/18883#discussion_r136450775

--- Diff: project/MimaExcludes.scala ---
@@ -41,7 +41,10 @@ object MimaExcludes {
     // [SPARK-19937] Add remote bytes read to disk.
     ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetrics.this"),
-    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetricDistributions.this")
+    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetricDistributions.this"),
+
+    // [SPARK-21276] Update lz4-java to the latest (v1.4.0)
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.io.LZ4BlockInputStream")
--- End diff --

But users may write code that runs different logic depending on the InputStream type.
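For what it's worth, the kind of user code being described might look like the contrived sketch below; it compiles against Spark 2.2 and stops compiling (or fails at runtime under reflection) once `org.apache.spark.io.LZ4BlockInputStream` is removed.

```scala
import java.io.InputStream

// Contrived user code of the kind described above: it dispatches on the
// concrete InputStream type, so removing the Spark class breaks it.
def describeStream(in: InputStream): String = in match {
  case _: org.apache.spark.io.LZ4BlockInputStream => "Spark-internal LZ4 block stream"
  case other => s"other stream: ${other.getClass.getName}"
}
```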
[GitHub] spark pull request #18883: [SPARK-21276][CORE] Update lz4-java to the latest...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18883#discussion_r136448647

--- Diff: project/MimaExcludes.scala ---
@@ -41,7 +41,10 @@ object MimaExcludes {
     // [SPARK-19937] Add remote bytes read to disk.
     ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetrics.this"),
-    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetricDistributions.this")
+    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetricDistributions.this"),
+
+    // [SPARK-21276] Update lz4-java to the latest (v1.4.0)
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.io.LZ4BlockInputStream")
--- End diff --

It's "public" only insofar as it has to be public in Java to be used this way. There's no case where a user should or would use this class directly.
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18975

**[Test build #81297 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81297/testReport)** for PR 18975 at commit [`e2db5e1`](https://github.com/apache/spark/commit/e2db5e1e0cc491480828328e07b7bb619dc05bbd).
[GitHub] spark issue #19098: [SPARK-21583][HOTFIX] Removed intercept in test causing ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19098

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81293/
[GitHub] spark issue #19098: [SPARK-21583][HOTFIX] Removed intercept in test causing ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19098

Merged build finished. Test PASSed.
[GitHub] spark issue #19098: [SPARK-21583][HOTFIX] Removed intercept in test causing ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19098

**[Test build #81293 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81293/testReport)** for PR 19098 at commit [`567487d`](https://github.com/apache/spark/commit/567487d6089400527a1b30aca054ea517174a08d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136443263

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/InsertIntoDataSourceDirCommand.scala ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.sql._
+import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+
+/**
+ * A command used to write the result of a query to a directory.
+ *
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   INSERT OVERWRITE DIRECTORY (path=STRING)?
+ *   USING format OPTIONS ([option1_name "option1_value", option2_name "option2_value", ...])
+ *   SELECT ...
+ * }}}
+ */
+case class InsertIntoDataSourceDirCommand(
+    storage: CatalogStorageFormat,
+    provider: Option[String],
+    query: LogicalPlan,
+    overwrite: Boolean) extends RunnableCommand {
+
+  override def innerChildren: Seq[LogicalPlan] = Seq(query)
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    assert(innerChildren.length == 1)
+    assert(storage.locationUri.nonEmpty, "Directory path is required")
+    assert(provider.isDefined, "Data source is required")
+
+    // Create the relation based on the input logical plan: `data`.
+    val pathOption = storage.locationUri.map("path" -> CatalogUtils.URIToString(_))
+    val dataSource = DataSource(
--- End diff --

@gatorsmile I am not familiar with data sources. Could you give me some hints on how to limit this to "FileFormat" only?
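One possible answer, sketched below: look up the provider's implementation class and reject anything that is not a `FileFormat`. This assumes the 2.2-era `DataSource.lookupDataSource(provider: String)` helper; treat the exact signature and the error message as illustrative, not as the actual fix.

```scala
import org.apache.spark.sql.AnalysisException
import org.apache.spark.sql.execution.datasources.{DataSource, FileFormat}

// Hedged sketch, intended to live inside the command's run() above, where
// `provider: Option[String]` is in scope and the sql package-private
// AnalysisException constructor is accessible.
val providerClass = DataSource.lookupDataSource(provider.get)
if (!classOf[FileFormat].isAssignableFrom(providerClass)) {
  throw new AnalysisException(
    s"Data source ${provider.get} does not support INSERT OVERWRITE DIRECTORY; " +
      "only FileFormat-based data sources can write to a directory.")
}
```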
[GitHub] spark pull request #18883: [SPARK-21276][CORE] Update lz4-java to the latest...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/18883#discussion_r136443138

--- Diff: project/MimaExcludes.scala ---
@@ -41,7 +41,10 @@ object MimaExcludes {
     // [SPARK-19937] Add remote bytes read to disk.
     ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetrics.this"),
-    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetricDistributions.this")
+    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.ShuffleReadMetricDistributions.this"),
+
+    // [SPARK-21276] Update lz4-java to the latest (v1.4.0)
+    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.io.LZ4BlockInputStream")
--- End diff --

@srowen This is a breaking change. We should not remove a public class that is in the API docs: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.io.LZ4BlockInputStream
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136439466

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala ---
@@ -534,4 +534,83 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
     }
   }
 }
+
+  test("insert overwrite to dir from hive metastore table") {
+    withTempDir { dir =>
+      val path = dir.toURI.getPath
+
+      checkAnswer(
+        sql(s"INSERT OVERWRITE LOCAL DIRECTORY '${path}' SELECT * FROM src where key < 10"),
+        Seq.empty[Row])
+
+      checkAnswer(
+        sql(
+          s"""
+             |INSERT OVERWRITE LOCAL DIRECTORY '${path}'
+             |STORED AS orc
+             |SELECT * FROM src where key < 10
+           """.stripMargin),
+        Seq.empty[Row])
+
+      // use orc data source to check the data of path is right.
+      withTempView("orc_source") {
+        sql(
+          s"""
+             |CREATE TEMPORARY VIEW orc_source
+             |USING org.apache.spark.sql.hive.orc
+             |OPTIONS (
+             |  PATH '${dir.getCanonicalPath}'
+             |)
+           """.stripMargin)
+
+        checkAnswer(
+          sql("select * from orc_source"),
+          sql("select * from src where key < 10").collect())
+      }
+    }
+  }
+
+  test("insert overwrite to dir from temp table") {
+    withTempView("test_insert_table") {
+      spark.range(10).selectExpr("id", "id AS str").createOrReplaceTempView("test_insert_table")
+
+      withTempDir { dir =>
+        val path = dir.toURI.getPath
+
+        checkAnswer(
+          sql(
+            s"""
+               |INSERT OVERWRITE LOCAL DIRECTORY '${path}'
--- End diff --

added
[GitHub] spark issue #19097: [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19097

Merged build finished. Test PASSed.
[GitHub] spark issue #19097: [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19097

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81292/
[GitHub] spark issue #19097: [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19097

**[Test build #81292 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81292/testReport)** for PR 19097 at commit [`c568282`](https://github.com/apache/spark/commit/c5682826710e784e283762e76d2ce0760af142d5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #18306: [SPARK-21029][SS] All StreamingQuery should be st...
Github user aray commented on a diff in the pull request: https://github.com/apache/spark/pull/18306#discussion_r136436631

--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -562,6 +563,8 @@ class SparkContext(config: SparkConf) extends Logging {
     }
     _cleaner.foreach(_.start())
 
+    _stopHooks = new SparkShutdownHookManager()
--- End diff --

The queries also need to be gracefully stopped if someone calls `sc.stop()` without shutting down the JVM.
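For comparison, the graceful variant from the user's side today is roughly the following sketch, using the public `StreamingQueryManager` API rather than the hook machinery this PR adds:

```scala
// Graceful shutdown from user code: stop every active streaming query before
// tearing down the SparkContext. StreamingQuery.stop() blocks until the
// query's execution thread has terminated.
spark.streams.active.foreach(_.stop())
spark.stop()  // now safe: no streaming queries are still running
```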
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136436585

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala ---
@@ -534,4 +534,83 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
     }
   }
 }
+
+  test("insert overwrite to dir from hive metastore table") {
+    withTempDir { dir =>
+      val path = dir.toURI.getPath
+
+      checkAnswer(
+        sql(s"INSERT OVERWRITE LOCAL DIRECTORY '${path}' SELECT * FROM src where key < 10"),
+        Seq.empty[Row])
--- End diff --

ok. updated.
[GitHub] spark issue #19089: [SPARK-21728][core] Follow up: fix user config, auth in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19089

Merged build finished. Test PASSed.
[GitHub] spark issue #19089: [SPARK-21728][core] Follow up: fix user config, auth in ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19089

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81290/
[GitHub] spark issue #19098: [SPARK-21583][HOTFIX] Removed intercept in test causing ...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19098

The asserts are wrong if this can be called from user code. It should be `require`. The reason is, basically, exactly this: if you run without assertions enabled, this argument is accepted.
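The distinction in miniature, as a sketch: Java's `assert` (which is what `ColumnarBatch` uses) is skipped unless the JVM runs with `-ea`, and Scala's `assert` can likewise be elided at compile time with `-Xelide-below`, while `require` always validates and throws `IllegalArgumentException`.

```scala
// assert: an internal invariant check that may not run at all.
def getRowUnchecked(rowId: Int, numRows: Int): Int = {
  assert(rowId >= 0 && rowId < numRows)  // silently skipped when assertions are off
  rowId
}

// require: argument validation that always runs, regardless of JVM flags.
def getRowChecked(rowId: Int, numRows: Int): Int = {
  require(rowId >= 0 && rowId < numRows,
    s"rowId $rowId is out of range [0, $numRows)")
  rowId
}
```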
[GitHub] spark issue #19089: [SPARK-21728][core] Follow up: fix user config, auth in ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19089

**[Test build #81290 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81290/testReport)** for PR 19089 at commit [`31d6c77`](https://github.com/apache/spark/commit/31d6c776cfad48c1835effc417ec2116fada757f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19093: [SPARK-21880][web UI]In the SQL table page, modify jobs ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19093

**[Test build #81296 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81296/testReport)** for PR 19093 at commit [`6ae5f2b`](https://github.com/apache/spark/commit/6ae5f2b27cc08ec5bf0d6f9986516887e9a4b36a).
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18975

Merged build finished. Test PASSed.
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18975

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81291/
[GitHub] spark issue #19093: [SPARK-21880][web UI]In the SQL table page, modify jobs ...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19093

LGTM pending tests
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18975

**[Test build #81291 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81291/testReport)** for PR 18975 at commit [`b2068ce`](https://github.com/apache/spark/commit/b2068ce27eec36e5970206d48282e36e09ebbec0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #19093: [SPARK-21880][web UI]In the SQL table page, modify jobs ...
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19093

ok to test
[GitHub] spark issue #19098: [SPARK-21583][HOTFIX] Removed intercept in test causing ...
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19098

@srowen, the assertion is from Spark's `ColumnarBatch` [here](https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnarBatch.java#L491).

```java
public ColumnarBatch.Row getRow(int rowId) {
  assert(rowId >= 0);
  assert(rowId < numRows);
  row.rowId = rowId;
  return row;
}
```

I'm also not quite sure why this wasn't working; I checked, and `-ea` is an added argument in the pom. Still, I wonder if we should change the asserts in this class to something better. Maybe they are used instead of exceptions for performance?
[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18704

**[Test build #81295 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81295/testReport)** for PR 18704 at commit [`097fc05`](https://github.com/apache/spark/commit/097fc0502b059222f4cbc77c4aa0019bf013b6a3).
[GitHub] spark pull request #19098: [SPARK-21583][HOTFIX] Removed intercept in test c...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19098#discussion_r136427951

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1308,10 +1308,6 @@ class ColumnarBatchSuite extends SparkFunSuite {
     }
   }
 
-    intercept[java.lang.AssertionError] {
-      batch.getRow(100)
--- End diff --

Thanks @gatorsmile, I'll put this in another test once I figure out why it wasn't being hit.
[GitHub] spark issue #19093: [SPARK-21880][web UI]In the SQL table page, modify jobs ...
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/19093

I'm ok with this change
[GitHub] spark issue #18818: [SPARK-21110][SQL] Structs, arrays, and other orderable ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18818

**[Test build #81294 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81294/testReport)** for PR 18818 at commit [`6e01186`](https://github.com/apache/spark/commit/6e011860ed800c9f869b66674cb241d3bb2d94fc).
[GitHub] spark issue #18692: [SPARK-21417][SQL] Infer join conditions using propagate...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18692

Sorry for the delay. @jiangxb1987 will submit a simple fix for the issue you mentioned. It will not be a perfect fix, but it partially resolves the issue. In the future, we need to move the filter removal to a separate batch for cost-based optimization, instead of doing it alongside filter inference in the same RBO batch.
[GitHub] spark issue #18692: [SPARK-21417][SQL] Infer join conditions using propagate...
Github user aokolnychyi commented on the issue: https://github.com/apache/spark/pull/18692

@gatorsmile what is our decision here? Shall we wait until SPARK-21652 is resolved? In the meantime, I can add some tests and see how the proposed rule works together with all others.
[GitHub] spark issue #18818: [SPARK-21110][SQL] Structs, arrays, and other orderable ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18818

LGTM except one comment. Thanks for working on it!
[GitHub] spark pull request #18818: [SPARK-21110][SQL] Structs, arrays, and other ord...
Github user aray commented on a diff in the pull request: https://github.com/apache/spark/pull/18818#discussion_r136421644

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -582,6 +582,7 @@ class CodegenContext {
       case array: ArrayType => genComp(array, c1, c2) + " == 0"
       case struct: StructType => genComp(struct, c1, c2) + " == 0"
       case udt: UserDefinedType[_] => genEqual(udt.sqlType, c1, c2)
+      case NullType => "true"
--- End diff --

Yea, codegen fails without this. I had originally made the value `false`, but when I noticed that the codegen for comparison (https://github.com/aray/spark/blob/cc2f3eca28ee6b9faa87853568205307567827cc/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala#L606) returned `0`, I changed it to be consistent. Happy to change it back though.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136421034

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/InsertIntoDataSourceDirCommand.scala ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command
+
+import org.apache.spark.sql._
+import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources._
+
+/**
+ * A command used to write the result of a query to a directory.
+ *
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   INSERT OVERWRITE DIRECTORY (path=STRING)?
+ *   USING format OPTIONS ([option1_name "option1_value", option2_name "option2_value", ...])
+ *   SELECT ...
+ * }}}
+ */
+case class InsertIntoDataSourceDirCommand(
+    storage: CatalogStorageFormat,
+    provider: Option[String],
--- End diff --

updated.
[GitHub] spark pull request #19080: [SPARK-21865][SQL] simplify the distribution sema...
Github user aray commented on a diff in the pull request: https://github.com/apache/spark/pull/19080#discussion_r136419947

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala ---
@@ -284,24 +241,17 @@ case class RangePartitioning(ordering: Seq[SortOrder], numPartitions: Int)
   override def nullable: Boolean = false
 
   override def dataType: DataType = IntegerType
 
-  override def satisfies(required: Distribution): Boolean = required match {
-    case UnspecifiedDistribution => true
-    case OrderedDistribution(requiredOrdering) =>
-      val minSize = Seq(requiredOrdering.size, ordering.size).min
-      requiredOrdering.take(minSize) == ordering.take(minSize)
-    case ClusteredDistribution(requiredClustering) =>
-      ordering.map(_.child).forall(x => requiredClustering.exists(_.semanticEquals(x)))
-    case _ => false
-  }
-
-  override def compatibleWith(other: Partitioning): Boolean = other match {
-    case o: RangePartitioning => this.semanticEquals(o)
-    case _ => false
-  }
-
-  override def guarantees(other: Partitioning): Boolean = other match {
-    case o: RangePartitioning => this.semanticEquals(o)
-    case _ => false
+  override def satisfies(required: Distribution): Boolean = {
+    super.satisfies(required) || {
+      required match {
+        case OrderedDistribution(requiredOrdering) =>
+          val minSize = Seq(requiredOrdering.size, ordering.size).min
+          requiredOrdering.take(minSize) == ordering.take(minSize)
--- End diff --

While we are cleaning things up, this needs to be fixed: `RangePartitioning(a+, b+)` does not satisfy `OrderedDistribution(a+)`. It violates the requirement that all rows with the same value of `a` be in the same partition.
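A concrete way to see this is sketched below. The sketch uses `Dataset.repartitionByRange`, which postdates this PR, as a stand-in for `RangePartitioning`; whether the split actually lands mid-`a` depends on the sampled range boundaries, so treat the output as illustrative.

```scala
// Rows range-sorted by (a, b) can cross a partition boundary mid-`a`, e.g.
// [(1,1), (1,2)] | [(1,3), (2,1)] puts a = 1 into two partitions, so
// clustering on `a` alone is not guaranteed. Runnable in a spark-shell:
import spark.implicits._

val df = Seq((1, 1), (1, 2), (1, 3), (2, 1)).toDF("a", "b")
  .repartitionByRange(2, $"a", $"b")

df.rdd
  .mapPartitionsWithIndex { (idx, rows) => rows.map(r => (idx, r.getInt(0))) }
  .collect()
  .foreach(println)  // a = 1 may appear under more than one partition index
```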
[GitHub] spark pull request #18818: [SPARK-21110][SQL] Structs, arrays, and other ord...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18818#discussion_r136419981

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -582,6 +582,7 @@ class CodegenContext {
       case array: ArrayType => genComp(array, c1, c2) + " == 0"
       case struct: StructType => genComp(struct, c1, c2) + " == 0"
       case udt: UserDefinedType[_] => genEqual(udt.sqlType, c1, c2)
+      case NullType => "true"
--- End diff --

I found the test case, but the test case is not affected by the value we generate here, since it is under `nullSafeCodeGen`. However, we should still return `false` when doing `null = null`.
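As background for this exchange, the SQL-level semantics the generated code must preserve: `null = null` evaluates to NULL (not true), while the null-safe operator `<=>` yields true; the literal emitted for `NullType` only matters on paths where the usual null checks are bypassed. A quick check in a spark-shell:

```scala
// SQL semantics the codegen has to respect:
spark.sql("SELECT null = null AS eq, null <=> null AS null_safe_eq").show()
// eq is NULL (not true), while null_safe_eq is true.
```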
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136419843

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveDirCommand.scala ---
@@ -0,0 +1,118 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive.execution
+
+import java.util.Properties
+
+import scala.language.existentials
+
+import org.apache.hadoop.fs.{FileSystem, Path}
+import org.apache.hadoop.hive.common.FileUtils
+import org.apache.hadoop.hive.ql.plan.TableDesc
+import org.apache.hadoop.hive.serde.serdeConstants
+import org.apache.hadoop.hive.serde2.`lazy`.LazySimpleSerDe
+import org.apache.hadoop.mapred._
+
+import org.apache.spark.sql.{Row, SparkSession}
+import org.apache.spark.sql.catalyst.catalog.CatalogStorageFormat
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.SparkPlan
+import org.apache.spark.util.Utils
+
+
+case class InsertIntoHiveDirCommand(
--- End diff --

updated
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136418390

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -155,6 +156,9 @@ object HiveAnalysis extends Rule[LogicalPlan] {
     case CreateTable(tableDesc, mode, Some(query)) if DDLUtils.isHiveTable(tableDesc) =>
       CreateHiveTableAsSelectCommand(tableDesc, query, mode)
+
+    case InsertIntoDir(isLocal, storage, _, child, overwrite) =>
--- End diff --

updated.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136418037
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
@@ -140,6 +141,9 @@ case class DataSourceAnalysis(conf: SQLConf) extends Rule[LogicalPlan] with Cast
       parts, query, overwrite, false) if parts.isEmpty =>
       InsertIntoDataSourceCommand(l, query, overwrite)
+
+    case InsertIntoDir(_, storage, provider, query, overwrite) if provider.nonEmpty =>
--- End diff --
updated.
[GitHub] spark pull request #18975: [SPARK-4131] Support "Writing data into the files...
Github user janewangfb commented on a diff in the pull request: https://github.com/apache/spark/pull/18975#discussion_r136417593
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -1509,4 +1509,84 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
       query: LogicalPlan): LogicalPlan = {
     RepartitionByExpression(expressions, query, conf.numShufflePartitions)
   }
+
+  /**
+   * Return the parameters for [[InsertIntoDir]] logical plan.
+   *
+   * Expected format:
+   * {{{
+   *   INSERT OVERWRITE DIRECTORY
+   *   [path]
+   *   [OPTIONS table_property_list]
+   *   select_statement;
+   * }}}
+   */
+  override def visitInsertOverwriteDir(
+      ctx: InsertOverwriteDirContext): InsertDirParams = withOrigin(ctx) {
+    val options = Option(ctx.options).map(visitPropertyKeyValues).getOrElse(Map.empty)
+    var storage = DataSource.buildStorageFormatFromOptions(options)
+
+    val path = Option(ctx.path) match {
+      case Some(s) => string(s)
+      case None => ""
+    }
+
+    if (!path.isEmpty && storage.locationUri.isDefined) {
+      throw new ParseException(
+        "Directory path and 'path' in OPTIONS are both used to indicate the directory path, " +
+          "you can only specify one of them.", ctx)
+    }
+    if (path.isEmpty && !storage.locationUri.isDefined) {
+      throw new ParseException(
+        "You need to specify directory path or 'path' in OPTIONS, but not both", ctx)
+    }
+
+    if (!path.isEmpty) {
+      val customLocation = Some(CatalogUtils.stringToURI(path))
+      storage = storage.copy(locationUri = customLocation)
+    }
+
+    val provider = ctx.tableProvider.qualifiedName.getText
+
+    (false, storage, Some(provider))
+  }
+
+  /**
+   * Return the parameters for [[InsertIntoDir]] logical plan.
+   *
+   * Expected format:
+   * {{{
+   *   INSERT OVERWRITE [LOCAL] DIRECTORY
+   *   path
+   *   [ROW FORMAT row_format]
+   *   [STORED AS file_format]
+   *   select_statement;
+   * }}}
+   */
+  override def visitInsertOverwriteHiveDir(
+      ctx: InsertOverwriteHiveDirContext): InsertDirParams = withOrigin(ctx) {
+    validateRowFormatFileFormat(ctx.rowFormat, ctx.createFileFormat, ctx)
+    val rowStorage = Option(ctx.rowFormat).map(visitRowFormat)
+      .getOrElse(CatalogStorageFormat.empty)
+    val fileStorage = Option(ctx.createFileFormat).map(visitCreateFileFormat)
+      .getOrElse(CatalogStorageFormat.empty)
+
+    val path = string(ctx.path)
+    // The path field is required
+    if (path.isEmpty) {
+      operationNotAllowed("INSERT OVERWRITE DIRECTORY must be accompanied by path", ctx)
+    }
+
+    val defaultStorage = HiveSerDe.getDefaultStorage(conf)
+
+    val storage = CatalogStorageFormat(
+      locationUri = Some(CatalogUtils.stringToURI(path)),
+      inputFormat = fileStorage.inputFormat.orElse(defaultStorage.inputFormat),
+      outputFormat = fileStorage.outputFormat.orElse(defaultStorage.outputFormat),
+      serde = rowStorage.serde.orElse(fileStorage.serde).orElse(defaultStorage.serde),
+      compressed = false,
+      properties = rowStorage.properties ++ fileStorage.properties)
+
+    (ctx.LOCAL != null, storage, None)
--- End diff --
got it. updated.
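For orientation, here are hedged examples of the two statement shapes these visitors accept. The paths and the table name `src` are made up, and the `USING` keyword is an assumption inferred from the `tableProvider` rule referenced in the code:

```scala
// Data source form handled by visitInsertOverwriteDir (the path may
// instead be given via OPTIONS (path ...), but not both).
spark.sql(
  """INSERT OVERWRITE DIRECTORY '/tmp/out'
    |USING parquet
    |SELECT * FROM src""".stripMargin)

// Hive-compatible form handled by visitInsertOverwriteHiveDir.
spark.sql(
  """INSERT OVERWRITE LOCAL DIRECTORY '/tmp/out'
    |STORED AS orc
    |SELECT * FROM src""".stripMargin)
```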
[GitHub] spark pull request #18818: [SPARK-21110][SQL] Structs, arrays, and other ord...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18818#discussion_r136416655
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ---
@@ -582,6 +582,7 @@ class CodegenContext {
       case array: ArrayType => genComp(array, c1, c2) + " == 0"
       case struct: StructType => genComp(struct, c1, c2) + " == 0"
       case udt: UserDefinedType[_] => genEqual(udt.sqlType, c1, c2)
+      case NullType => "true"
--- End diff --
Is this required? Will it be covered by any test? BTW, the value should be `false`.
[GitHub] spark issue #19098: [SPARK-21583][HOTFIX] Removed intercept in test causing ...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19098
Hm, I wonder how that results in an assertion? That's a normal error case and shouldn't cause an assert. Is it from a third-party library in this case, like Arrow? Really, we should fix that somehow so that the user-visible contract for this behavior never involves `AssertionError`. Still, assertions ought to be _enabled_ during tests anyway, so I don't see how this doesn't actually fire. If it only affects the Maven build, I'd suspect that maybe the scalatest-maven-plugin somehow doesn't turn on assertions? But it has `-ea` in its command line.
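One way to make that contract independent of JVM flags, sketched with made-up names (`SafeBatch` is not Spark's class): replace the bare Java `assert` with an explicit check that always throws, so tests can intercept it deterministically whether or not `-ea` is set.

```scala
// Hedged sketch: an explicit bounds check that does not depend on -ea,
// unlike a Java `assert`, so the user-visible contract never involves
// AssertionError. Names are illustrative, not Spark's API.
class SafeBatch(numRows: Int) {
  def getRow(rowId: Int): Int = {
    if (rowId < 0 || rowId >= numRows) {
      throw new IndexOutOfBoundsException(s"rowId $rowId not in [0, $numRows)")
    }
    rowId
  }
}
```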
[GitHub] spark pull request #19098: [SPARK-21583][HOTFIX] Removed intercept in test c...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19098
[GitHub] spark issue #19098: [SPARK-21583][HOTFIX] Removed intercept in test causing ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19098
LGTM. Merging to master.
[GitHub] spark pull request #19098: [SPARK-21583][HOTFIX] Removed intercept in test c...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19098#discussion_r136413800
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1308,10 +1308,6 @@ class ColumnarBatchSuite extends SparkFunSuite {
     }
   }
-    intercept[java.lang.AssertionError] {
-      batch.getRow(100)
--- End diff --
Please add another test in a follow-up PR.
[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r136413085
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1261,4 +1264,55 @@ class ColumnarBatchSuite extends SparkFunSuite {
         s"vectorized reader"))
     }
   }
+
+  test("create columnar batch from Arrow column vectors") {
+    val allocator = ArrowUtils.rootAllocator.newChildAllocator("int", 0, Long.MaxValue)
+    val vector1 = ArrowUtils.toArrowField("int1", IntegerType, nullable = true)
+      .createVector(allocator).asInstanceOf[NullableIntVector]
+    vector1.allocateNew()
+    val mutator1 = vector1.getMutator()
+    val vector2 = ArrowUtils.toArrowField("int2", IntegerType, nullable = true)
+      .createVector(allocator).asInstanceOf[NullableIntVector]
+    vector2.allocateNew()
+    val mutator2 = vector2.getMutator()
+
+    (0 until 10).foreach { i =>
+      mutator1.setSafe(i, i)
+      mutator2.setSafe(i + 1, i)
+    }
+    mutator1.setNull(10)
+    mutator1.setValueCount(11)
+    mutator2.setNull(0)
+    mutator2.setValueCount(11)
+
+    val columnVectors = Seq(new ArrowColumnVector(vector1), new ArrowColumnVector(vector2))
+
+    val schema = StructType(Seq(StructField("int1", IntegerType), StructField("int2", IntegerType)))
+    val batch = new ColumnarBatch(schema, columnVectors.toArray[ColumnVector], 11)
+    batch.setNumRows(11)
+
+    assert(batch.numCols() == 2)
+    assert(batch.numRows() == 11)
+
+    val rowIter = batch.rowIterator().asScala
+    rowIter.zipWithIndex.foreach { case (row, i) =>
+      if (i == 10) {
+        assert(row.isNullAt(0))
+      } else {
+        assert(row.getInt(0) == i)
+      }
+      if (i == 0) {
+        assert(row.isNullAt(1))
+      } else {
+        assert(row.getInt(1) == i - 1)
+      }
+    }
+
+    intercept[java.lang.AssertionError] {
+      batch.getRow(100)
--- End diff --
I just made #19098 to remove this check - it's not really testing the functionality added here anyway, but maybe another test should be added for checking index-out-of-bounds errors.
[GitHub] spark issue #19098: [SPARK-21583][HOTFIX] Removed intercept in test causing ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19098
**[Test build #81293 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81293/testReport)** for PR 19098 at commit [`567487d`](https://github.com/apache/spark/commit/567487d6089400527a1b30aca054ea517174a08d).
[GitHub] spark pull request #19098: [SPARK-21583][HOTFIX] Removed intercept in test c...
GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/19098
[SPARK-21583][HOTFIX] Removed intercept in test causing failures
Removing a check in the ColumnarBatchSuite that depended on a Java assertion. This assertion is being compiled out in the Maven builds, causing the test to fail. This part of the test is not specific to the functionality being tested here.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/BryanCutler/spark hotfix-ColumnarBatchSuite-assertion
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/19098.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #19098
commit 567487d6089400527a1b30aca054ea517174a08d
Author: Bryan Cutler
Date: 2017-08-31T18:21:06Z
    this intercept relies on a Java assertion that could be compiled out, failing the test
[GitHub] spark pull request #18306: [SPARK-21029][SS] All StreamingQuery should be st...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/18306#discussion_r136410312
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -562,6 +563,8 @@ class SparkContext(config: SparkConf) extends Logging {
   }
   _cleaner.foreach(_.start())
+
+    _stopHooks = new SparkShutdownHookManager()
--- End diff --
there's already a shutdown hook that calls sc.stop() - perhaps just add the cleanup in stop() https://github.com/aray/spark/blob/005472ed10fad3d1bc8feff12fc55c5682724a0e/core/src/main/scala/org/apache/spark/SparkContext.scala#L584
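A shape sketch of this suggestion, with made-up names: rather than creating a second `SparkShutdownHookManager`, register the new cleanup so it runs inside `stop()`, which the existing shutdown hook already invokes.

```scala
// Hedged sketch, not SparkContext code: cleanup callbacks registered
// here run wherever stop() runs, including from the existing shutdown
// hook, so no second hook manager is needed.
class Server {
  private val cleanups = scala.collection.mutable.Buffer[() => Unit]()
  def registerCleanup(f: () => Unit): Unit = cleanups += f
  def stop(): Unit = {
    cleanups.foreach(f => f())
    cleanups.clear()
  }
}
```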
[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r136409601
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1261,4 +1264,55 @@ class ColumnarBatchSuite extends SparkFunSuite {
+    intercept[java.lang.AssertionError] {
+      batch.getRow(100)
--- End diff --
I think the problem is that if the Java assertion is compiled out, then no error is produced and the test fails.
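A runnable Scala-side sketch of this pitfall: Java's `assert` is disabled at run time without `-ea`, while Scala's `Predef.assert` can be removed at compile time with `-Xelide-below`; either way, a test that intercepts `AssertionError` silently stops testing anything. The flag value below assumes scalac's elision levels, where assertions sit at 2000.

```scala
// Compile normally and the assertion fires; compile with
// `scalac -Xelide-below 2001 ElideDemo.scala` and the assert call is
// removed entirely, so no AssertionError can ever be intercepted --
// the analogue of a Java assert running without -ea.
object ElideDemo {
  def main(args: Array[String]): Unit = {
    try {
      assert(false, "rowId out of bounds")
      println("assertion elided: nothing was checked")
    } catch {
      case e: AssertionError => println(s"assertion fired: ${e.getMessage}")
    }
  }
}
```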
[GitHub] spark issue #18697: [SPARK-16683][SQL] Repeated joins to same table can leak...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18697
Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81286/
[GitHub] spark issue #18697: [SPARK-16683][SQL] Repeated joins to same table can leak...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18697
Merged build finished. Test PASSed.
[GitHub] spark issue #18697: [SPARK-16683][SQL] Repeated joins to same table can leak...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18697
**[Test build #81286 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81286/testReport)** for PR 18697 at commit [`0f21237`](https://github.com/apache/spark/commit/0f21237b61a59bfcbf384866e06323a667154924).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r136408878
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1261,4 +1264,55 @@ class ColumnarBatchSuite extends SparkFunSuite {
+    intercept[java.lang.AssertionError] {
+      batch.getRow(100)
--- End diff --
Maybe?
```scala
val m = intercept[java.lang.AssertionError] {
  ...
}.getMessage
assert(m.contains(...))
```
[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r136408531
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1261,4 +1264,55 @@ class ColumnarBatchSuite extends SparkFunSuite {
+    intercept[java.lang.AssertionError] {
+      batch.getRow(100)
--- End diff --
Then, please check the error message here.
[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r136408063
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1261,4 +1264,55 @@ class ColumnarBatchSuite extends SparkFunSuite {
+    intercept[java.lang.AssertionError] {
+      batch.getRow(100)
--- End diff --
It's probably because the assert is being compiled out. This should probably not be in the test then.
[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r136407451
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1261,4 +1264,55 @@ class ColumnarBatchSuite extends SparkFunSuite {
+    intercept[java.lang.AssertionError] {
+      batch.getRow(100)
--- End diff --
Thanks! It seems to happen on Maven only; sbt-hadoop-2.6 passed.
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.6/3480/
[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r136406559
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1261,4 +1264,55 @@ class ColumnarBatchSuite extends SparkFunSuite {
+    intercept[java.lang.AssertionError] {
+      batch.getRow(100)
--- End diff --
Hmm, that is strange. I'll take a look, thanks.
[GitHub] spark pull request #18837: [Spark-20812][Mesos] Add secrets support to the d...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18837
[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18270
Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81289/
[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18270
Merged build finished. Test FAILed.
[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18270
**[Test build #81289 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81289/testReport)** for PR 18270 at commit [`2c6ed67`](https://github.com/apache/spark/commit/2c6ed672aeb075243e453cadccaf24c9611735c6).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18837: [Spark-20812][Mesos] Add secrets support to the dispatch...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18837
There are still some small issues (minor style nits, duplicating conf keys instead of using `CONSTANT.key`), but so be it. I can't really comment on the functionality itself, so I'll trust your judgment since you're way more familiar with Mesos. Merging to master.
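For readers outside the review, the `CONSTANT.key` nit looks roughly like this, a sketch with made-up names (the config key and `ConfEntry` type are illustrative, not the Mesos code under review):

```scala
// Illustrative only: define the config key once, then reference its
// .key field instead of re-typing the string literal elsewhere.
object DispatcherConf {
  final case class ConfEntry(key: String, default: Option[String])
  val SECRET_NAMES = ConfEntry("spark.mesos.driver.secret.names", None)
}

// Preferred: the literal lives in exactly one place.
//   conf.getOption(DispatcherConf.SECRET_NAMES.key)
// Discouraged: a duplicated key string that can drift out of sync.
//   conf.getOption("spark.mesos.driver.secret.names")
```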
[GitHub] spark issue #17014: [SPARK-18608][ML] Fix double-caching in ML algorithms
Github user smurching commented on the issue: https://github.com/apache/spark/pull/17014
@WeichenXu123 That approach sounds reasonable to me. My main thought (& this might be obvious) is on the implementation level -- as long as we implement this by adding an `org.apache.spark.ml.Param` named `handlePersistence`, I think we can maintain binary compatibility. I'd be concerned about making `handlePersistence` an argument to `fit()`, which seems like it might [break binary compatibility](https://wiki.eclipse.org/Evolving_Java-based_APIs_2#Evolving_API_classes_-_API_methods_and_constructors).
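A minimal sketch of the `Param`-based approach, assuming Spark ML's `param` package; the trait name and doc string are illustrative, not from the PR:

```scala
import org.apache.spark.ml.param.{BooleanParam, Params}

// Hedged sketch: because the new knob travels through the ParamMap,
// fit(dataset) keeps its existing signature and compiled callers keep
// linking, which is the binary-compatibility point made above.
trait HasHandlePersistence extends Params {
  final val handlePersistence = new BooleanParam(this, "handlePersistence",
    "whether the algorithm should cache its input before fitting")
  def getHandlePersistence: Boolean = $(handlePersistence)
  setDefault(handlePersistence -> true)
}
```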
[GitHub] spark pull request #18787: [SPARK-21583][SQL] Create a ColumnarBatch from Ar...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/18787#discussion_r136404583
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala ---
@@ -1261,4 +1264,55 @@ class ColumnarBatchSuite extends SparkFunSuite {
+    intercept[java.lang.AssertionError] {
+      batch.getRow(100)
--- End diff --
Hi, @BryanCutler and @ueshin. This seems to make the master branch fail. Could you take a look once more? Thank you in advance!
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/3696/testReport/
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.6/3730/testReport/
[GitHub] spark issue #19097: [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19097
**[Test build #81292 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81292/testReport)** for PR 19097 at commit [`c568282`](https://github.com/apache/spark/commit/c5682826710e784e283762e76d2ce0760af142d5).
[GitHub] spark pull request #19097: [SPARK-17107][SQL][FOLLOW-UP] Remove redundant pu...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/19097
[SPARK-17107][SQL][FOLLOW-UP] Remove redundant pushdown rule for Union
## What changes were proposed in this pull request?
Also remove the useless function `partitionByDeterministic` after the changes of https://github.com/apache/spark/pull/14687
## How was this patch tested?
N/A
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/gatorsmile/spark followupSPARK-17107
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/19097.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #19097
commit c5682826710e784e283762e76d2ce0760af142d5
Author: gatorsmile
Date: 2017-08-31T17:39:06Z
    fix.
[GitHub] spark issue #19072: [SPARK-17139][ML][FOLLOW-UP] Add convenient method `asBi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19072
Merged build finished. Test PASSed.
[GitHub] spark issue #19072: [SPARK-17139][ML][FOLLOW-UP] Add convenient method `asBi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19072
Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81283/
[GitHub] spark issue #19072: [SPARK-17139][ML][FOLLOW-UP] Add convenient method `asBi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19072
**[Test build #81283 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81283/testReport)** for PR 19072 at commit [`e185d37`](https://github.com/apache/spark/commit/e185d37b9814c67d4e6d7f6404dc0900740bfded).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18953
Hi, @marmbrus, @liancheng, @yhuai. Could you give me some advice about this ORC upgrade PR? I tried to minimize the diff of this PR, so I didn't remove the now-unused old code. Thank you in advance.
[GitHub] spark issue #19095: [SPARK-21886][SQL] Use SparkSession.internalCreateDataFr...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19095
@jaceklaskowski Maybe you can fix the PR title next time. Thanks for your work!
[GitHub] spark issue #19065: [SPARK-21729][ML][TEST] Generic test for ProbabilisticCl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19065
Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81288/
[GitHub] spark issue #19065: [SPARK-21729][ML][TEST] Generic test for ProbabilisticCl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19065
Merged build finished. Test PASSed.
[GitHub] spark issue #19065: [SPARK-21729][ML][TEST] Generic test for ProbabilisticCl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19065
**[Test build #81288 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81288/testReport)** for PR 19065 at commit [`f13cd73`](https://github.com/apache/spark/commit/f13cd73926e80173228637da2015c7d6e7a0e848).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19095: [SPARK-21886][SQL] Use SparkSession.internalCreateDataFr...
Github user jaceklaskowski commented on the issue: https://github.com/apache/spark/pull/19095
That was really quick! Thanks a lot @gatorsmile
[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18975
**[Test build #81291 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81291/testReport)** for PR 18975 at commit [`b2068ce`](https://github.com/apache/spark/commit/b2068ce27eec36e5970206d48282e36e09ebbec0).
[GitHub] spark issue #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with the im...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18538
Merged build finished. Test PASSed.
[GitHub] spark issue #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with the im...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18538
Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81287/
[GitHub] spark issue #19089: [SPARK-21728][core] Follow up: fix user config, auth in ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19089
**[Test build #81290 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81290/testReport)** for PR 19089 at commit [`31d6c77`](https://github.com/apache/spark/commit/31d6c776cfad48c1835effc417ec2116fada757f).
[GitHub] spark issue #18538: [SPARK-14516][ML] Adding ClusteringEvaluator with the im...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18538
**[Test build #81287 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81287/testReport)** for PR 18538 at commit [`45d1380`](https://github.com/apache/spark/commit/45d1380574ece58ff63c34ff31af6243aff16c3c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #19095: [SPARK-21886][SQL] Use SparkSession.internalCreat...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19095
[GitHub] spark issue #19095: [SPARK-21886][SQL] Use SparkSession.internalCreateDataFr...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19095
Thanks! Merging to master.
[GitHub] spark issue #19095: [SPARK-21886][SQL] Use SparkSession.internalCreateDataFr...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19095
LGTM
[GitHub] spark issue #19078: [SPARK-21862][ML] Add overflow check in PCA
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19078
Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81285/
[GitHub] spark issue #19078: [SPARK-21862][ML] Add overflow check in PCA
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19078
Merged build finished. Test PASSed.
[GitHub] spark issue #19078: [SPARK-21862][ML] Add overflow check in PCA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19078
**[Test build #81285 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81285/testReport)** for PR 19078 at commit [`3304092`](https://github.com/apache/spark/commit/33040929a0332853f5999b750714ce4be2c2b19d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18270
**[Test build #81289 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81289/testReport)** for PR 18270 at commit [`2c6ed67`](https://github.com/apache/spark/commit/2c6ed672aeb075243e453cadccaf24c9611735c6).
[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18270
That commit contains the code changes I suggested.