[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23169
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23169
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5738/
Test PASSed.


---




[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22612
  
**[Test build #99684 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99684/testReport)**
 for PR 22612 at commit 
[`0a7402e`](https://github.com/apache/spark/commit/0a7402e92e87aef13d2f91043083b92abfa80233).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22612
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99684/
Test FAILed.


---




[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22612
  
Merged build finished. Test FAILed.


---




[GitHub] spark pull request #23203: [SPARK-26252][PYTHON] Add support to run specific...

2018-12-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23203#discussion_r238887812
  
--- Diff: python/run-tests.py ---
@@ -93,17 +93,18 @@ def run_individual_python_test(target_dir, test_name, 
pyspark_python):
 "pyspark-shell"
 ]
 env["PYSPARK_SUBMIT_ARGS"] = " ".join(spark_args)
-
-LOGGER.info("Starting test(%s): %s", pyspark_python, test_name)
+str_test_name = " ".join(test_name)
+LOGGER.info("Starting test(%s): %s", pyspark_python, str_test_name)
 start_time = time.time()
 try:
 per_test_output = tempfile.TemporaryFile()
 retcode = subprocess.Popen(
-[os.path.join(SPARK_HOME, "bin/pyspark"), test_name],
--- End diff --

Oh, yeah. Looks like that's going to reduce the diff. Let me try.


---




[GitHub] spark pull request #23220: [SPARK-25877][k8s] Move all feature logic to feat...

2018-12-04 Thread vanzin
GitHub user vanzin opened a pull request:

https://github.com/apache/spark/pull/23220

[SPARK-25877][k8s] Move all feature logic to feature classes.

This change makes the driver and executor builders a lot simpler
by encapsulating almost all feature logic into the respective
feature classes. The only logic that remains is the creation of
the initial pod, which needs to happen before anything else and so is
better left in the builder class.

Most feature classes already behave fine when the config has nothing
they should handle, but a few minor tweaks had to be added. Unit
tests were also updated or added to account for these.

The builder suites were simplified a lot and just test the remaining
pod-related code in the builders themselves.
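
A minimal Scala sketch of the structure described above (the types and method
names are simplified stand-ins rather than the exact Spark classes): the builder
only creates the initial pod, then folds every feature step over it, so each
feature class decides for itself whether it applies.

```scala
// Sketch only: simplified stand-ins for the k8s pod and feature-step types.
case class SparkPod(labels: Map[String, String], containers: Seq[String])

trait KubernetesFeatureConfigStep {
  // Each feature transforms the pod; a feature that does not apply for the
  // current config simply returns the pod unchanged.
  def configurePod(pod: SparkPod): SparkPod
}

class KubernetesDriverBuilder(features: Seq[KubernetesFeatureConfigStep]) {
  def buildPod(): SparkPod = {
    // The only logic left in the builder: create the initial pod...
    val initialPod = SparkPod(labels = Map("spark-role" -> "driver"), containers = Nil)
    // ...then let every feature class apply (or skip) its own logic.
    features.foldLeft(initialPod)((pod, feature) => feature.configurePod(pod))
  }
}
```

With this shape, adding a new feature means adding a class to the sequence
rather than growing the builder's conditional logic.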


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vanzin/spark SPARK-25877

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23220.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23220


commit a13bafd8e48d8a03fa35c6ff6817f03908f17e2d
Author: Marcelo Vanzin 
Date:   2018-12-04T19:42:31Z

[SPARK-25877][k8s] Move all feature logic to feature classes.

This change makes the driver and executor builders a lot simpler
by encapsulating almost all feature logic into the respective
feature classes. The only logic that remains is the creation of
the initial pod, which needs to happen before anything else and so is
better left in the builder class.

Most feature classes already behave fine when the config has nothing
they should handle, but a few minor tweaks had to be added. Unit
tests were also updated or added to account for these.

The builder suites were simplified a lot and just test the remaining
pod-related code in the builders themselves.




---




[GitHub] spark issue #23220: [SPARK-25877][k8s] Move all feature logic to feature cla...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23220
  
**[Test build #99687 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99687/testReport)**
 for PR 23220 at commit 
[`a13bafd`](https://github.com/apache/spark/commit/a13bafd8e48d8a03fa35c6ff6817f03908f17e2d).


---




[GitHub] spark pull request #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table...

2018-12-04 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/22721#discussion_r238891454
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala
 ---
@@ -45,6 +45,8 @@ object CommandUtils extends Logging {
   } else {
 catalog.alterTableStats(table.identifier, None)
   }
+} else {
+  catalog.refreshTable(table.identifier)
--- End diff --

Sure. Moving it to DDLs is better.


---




[GitHub] spark issue #23220: [SPARK-25877][k8s] Move all feature logic to feature cla...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23220
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5739/



---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5741/
Test PASSed.


---




[GitHub] spark issue #23220: [SPARK-25877][k8s] Move all feature logic to feature cla...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23220
  
**[Test build #99687 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99687/testReport)**
 for PR 23220 at commit 
[`a13bafd`](https://github.com/apache/spark/commit/a13bafd8e48d8a03fa35c6ff6817f03908f17e2d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23220: [SPARK-25877][k8s] Move all feature logic to feature cla...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23220
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22721
  
**[Test build #99689 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99689/testReport)**
 for PR 22721 at commit 
[`c601b67`](https://github.com/apache/spark/commit/c601b674ec1c0e288c0b3852dcdb511c64bfa6a5).


---




[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23196
  
**[Test build #99685 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99685/testReport)**
 for PR 23196 at commit 
[`57600e2`](https://github.com/apache/spark/commit/57600e2e41d8caa99afab161b16afe02ef640375).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `logError(s\"Failed to load class $childMainClass.\")`
  * `class CSVInferSchema(val options: CSVOptions) extends Serializable `
  * `class InterpretedSafeProjection(expressions: Seq[Expression]) extends 
Projection `


---




[GitHub] spark issue #23220: [SPARK-25877][k8s] Move all feature logic to feature cla...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23220
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99687/
Test PASSed.


---




[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for CREATE EXTERN...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23108
  
**[Test build #99688 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99688/testReport)**
 for PR 23108 at commit 
[`fe472c8`](https://github.com/apache/spark/commit/fe472c81a21700ff52c84808437b85d02d6871ed).


---




[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for CREATE EXTERN...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23108
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5740/
Test PASSed.


---




[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Merged build finished. Test FAILed.


---




[GitHub] spark pull request #23216: [SPARK-26264][CORE]It is better to add @transient...

2018-12-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23216#discussion_r238892679
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala 
---
@@ -56,7 +56,7 @@ private[spark] class ResultTask[T, U](
 stageAttemptId: Int,
 taskBinary: Broadcast[Array[Byte]],
 partition: Partition,
-locs: Seq[TaskLocation],
+@transient private var locs: Seq[TaskLocation],
--- End diff --

Why is it `var`, BTW?


---




[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for CREATE EXTERN...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23108
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99685/
Test FAILed.


---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22721
  
**[Test build #99690 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99690/testReport)**
 for PR 22721 at commit 
[`c91c154`](https://github.com/apache/spark/commit/c91c15493b30e49e81fbf9097b37bf0b4bdafc79).


---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5742/
Test PASSed.


---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23220: [SPARK-25877][k8s] Move all feature logic to feature cla...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23220
  
Kubernetes integration test status success
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5739/



---




[GitHub] spark issue #23220: [SPARK-25877][k8s] Move all feature logic to feature cla...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23220
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5739/
Test PASSed.


---




[GitHub] spark issue #23220: [SPARK-25877][k8s] Move all feature logic to feature cla...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23220
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #23221: [SPARK-24243][CORE] Expose exceptions from InProc...

2018-12-04 Thread vanzin
GitHub user vanzin opened a pull request:

https://github.com/apache/spark/pull/23221

[SPARK-24243][CORE] Expose exceptions from InProcessAppHandle

## What changes were proposed in this pull request?

Adds a new method to SparkAppHandle called getError which returns
the exception (if present) that caused the underlying Spark app to
fail.

## How was this patch tested?

New tests added to SparkLauncherSuite for the new method.

Closes #21849
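
As a rough usage sketch of the API this adds (only `getError` comes from the
description above; the launcher setup, the application class name, and the
assumption that `getError` returns a `java.util.Optional[Throwable]` are
illustrative):

```scala
import org.apache.spark.launcher.{InProcessLauncher, SparkAppHandle}

object GetErrorExample {
  def main(args: Array[String]): Unit = {
    val handle: SparkAppHandle = new InProcessLauncher()
      .setMainClass("org.example.MyApp") // hypothetical application class
      .startApplication()

    // Crude wait for the app to reach a final state; real code would register a
    // SparkAppHandle.Listener instead of polling.
    while (!handle.getState.isFinal) Thread.sleep(100)

    // Ask the handle why the underlying app failed, if it did.
    handle.getError.ifPresent(e => println(s"Application failed: ${e.getMessage}"))
  }
}
```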

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vanzin/spark SPARK-24243

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23221.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23221


commit 9240b77078936dceaaa4a68f6a54c5c0c16aab73
Author: Sahil Takiar 
Date:   2018-07-23T17:31:24Z

[SPARK-24243][CORE] Expose exceptions from InProcessAppHandle

Adds a new method to `SparkAppHandle` called `getError` which returns
the exception (if present) that caused the underlying Spark app to
fail.

New tests added to `SparkLauncherSuite` for the new method.

commit 29f1436e14b453b41b055be6b4e124c5eae7d8ff
Author: Marcelo Vanzin 
Date:   2018-12-05T00:37:58Z

Merge branch 'master' into SPARK-24243

commit e58fc919355c48d2d3b1cacb4d0ee18036cacbc6
Author: Marcelo Vanzin 
Date:   2018-12-05T00:41:44Z

Feedback.




---




[GitHub] spark issue #23221: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-12-04 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/23221
  
I applied my own feedback to the original PR and will merge pending tests 
(since it was already reviewed), unless someone comments first.


---




[GitHub] spark pull request #22468: [SPARK-25374][SQL] SafeProjection supports fallba...

2018-12-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/22468#discussion_r238894837
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
 ---
@@ -535,4 +535,98 @@ class UnsafeRowConverterSuite extends SparkFunSuite 
with Matchers with PlanTestB
 assert(unsafeRow.getSizeInBytes ==
   8 + 8 * 2 + roundedSize(field1.getSizeInBytes) + 
roundedSize(field2.getSizeInBytes))
   }
+
+  testBothCodegenAndInterpreted("SPARK-25374 converts back into safe 
representation") {
+def convertBackToInternalRow(inputRow: InternalRow, fields: 
Array[DataType]): InternalRow = {
+  val unsafeProj = UnsafeProjection.create(fields)
+  val unsafeRow = unsafeProj(inputRow)
+  val safeProj = SafeProjection.create(fields)
+  safeProj(unsafeRow)
+}
+
+// Simple tests
+val inputRow = InternalRow.fromSeq(Seq(
+  false, 3.toByte, 15.toShort, -83, 129L, 1.0f, 8.0, 
UTF8String.fromString("test"),
+  Decimal(255), CalendarInterval.fromString("interval 1 day"), 
Array[Byte](1, 2)
+))
+val fields1 = Array(
+  BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType,
+  DoubleType, StringType, DecimalType.defaultConcreteType, 
CalendarIntervalType,
+  BinaryType)
+
+assert(convertBackToInternalRow(inputRow, fields1) === inputRow)
+
+// Array tests
+val arrayRow = InternalRow.fromSeq(Seq(
+  createArray(1, 2, 3),
+  createArray(
+createArray(Seq("a", "b", "c").map(UTF8String.fromString): _*),
+createArray(Seq("d").map(UTF8String.fromString): _*))
+))
+val fields2 = Array[DataType](
+  ArrayType(IntegerType),
+  ArrayType(ArrayType(StringType)))
+
+assert(convertBackToInternalRow(arrayRow, fields2) === arrayRow)
+
+// Struct tests
+val structRow = InternalRow.fromSeq(Seq(
+  InternalRow.fromSeq(Seq[Any](1, 4.0)),
+  InternalRow.fromSeq(Seq(
+UTF8String.fromString("test"),
+InternalRow.fromSeq(Seq(
+  1,
+  createArray(Seq("2", "3").map(UTF8String.fromString): _*)
+))
+  ))
+))
+val fields3 = Array[DataType](
+  StructType(
+StructField("c0", IntegerType) ::
+StructField("c1", DoubleType) ::
+Nil),
+  StructType(
+StructField("c2", StringType) ::
+StructField("c3", StructType(
+  StructField("c4", IntegerType) ::
+  StructField("c5", ArrayType(StringType)) ::
+  Nil)) ::
+Nil))
+
+assert(convertBackToInternalRow(structRow, fields3) === structRow)
+
+// Map tests
+val mapRow = InternalRow.fromSeq(Seq(
+  createMap(Seq("k1", "k2").map(UTF8String.fromString): _*)(1, 2),
+  createMap(
+createMap(3, 5)(Seq("v1", "v2").map(UTF8String.fromString): _*),
+createMap(7, 9)(Seq("v3", "v4").map(UTF8String.fromString): _*)
+  )(
+createMap(Seq("k3", "k4").map(UTF8String.fromString): 
_*)(3.toShort, 4.toShort),
+createMap(Seq("k5", "k6").map(UTF8String.fromString): 
_*)(5.toShort, 6.toShort)
+  )))
+val fields4 = Array[DataType](
+  MapType(StringType, IntegerType),
+  MapType(MapType(IntegerType, StringType), MapType(StringType, 
ShortType)))
+
+val mapResultRow = convertBackToInternalRow(mapRow, 
fields4).toSeq(fields4)
+val mapExpectedRow = mapRow.toSeq(fields4)
+// Since `ArrayBasedMapData` does not override `equals` and `hashCode`,
--- End diff --

Aha, thanks. I remember that it's related to SPARK-18134.


---




[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...

2018-12-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/23213#discussion_r238895286
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -144,9 +144,10 @@ class SQLQueryTestSuite extends QueryTest with 
SharedSQLContext {
 val (comments, code) = input.split("\n").partition(_.startsWith("--"))
 
 // Runs all the tests on both codegen-only and interpreter modes
-val codegenConfigSets = Array(CODEGEN_ONLY, NO_CODEGEN).map {
-  case codegenFactoryMode =>
-Array(SQLConf.CODEGEN_FACTORY_MODE.key -> 
codegenFactoryMode.toString)
+val codegenConfigSets = Array(("false", "NO_CODEGEN"), ("true", 
"CODEGEN_ONLY")).map {
--- End diff --

I will check the time later, too.


---




[GitHub] spark issue #23221: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23221
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23221: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23221
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5743/
Test PASSed.


---




[GitHub] spark issue #23221: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23221
  
**[Test build #99691 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99691/testReport)**
 for PR 23221 at commit 
[`e58fc91`](https://github.com/apache/spark/commit/e58fc919355c48d2d3b1cacb4d0ee18036cacbc6).


---




[GitHub] spark issue #23218: [SPARK-26266][BUILD] Update to Scala 2.12.8

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23218
  
**[Test build #4452 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4452/testReport)**
 for PR 23218 at commit 
[`b667d37`](https://github.com/apache/spark/commit/b667d37e9ee2d8cdce459806925cdc0fe725b7bf).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23216: [SPARK-26264][CORE]It is better to add @transient to fie...

2018-12-04 Thread 10110346
Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/23216
  
> 
> 
> Are you sure it's even a field in the class? It looks like it's only used
> to define this:
> 
> ```
>   @transient private[this] val preferredLocs: Seq[TaskLocation] = {
> if (locs == null) Nil else locs.toSet.toSeq
>   }
> ```
> 
> I'd expect Scala would not generate a field. Indeed the thing it is used
> to make is transient.

Yeah, it would not generate a field. Thanks, @srowen.
By the way, would it be better to remove `transient` from `ShuffleMapTask`?
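
A small standalone sketch of the point being made here (hypothetical classes,
not the Spark code): a constructor parameter that is only read while
initializing other members never becomes a field, so `@transient` on it buys
nothing; it only matters once the parameter is retained as a field.

```scala
class NoFieldGenerated(locs: Seq[String]) extends Serializable {
  // `locs` is consumed only during construction, so the compiler does not keep
  // it as a field of the serialized object; @transient on it would be superfluous.
  @transient private[this] val preferredLocs: Seq[String] =
    if (locs == null) Nil else locs.distinct
}

class FieldGenerated(locs: Seq[String]) extends Serializable {
  // Referencing the parameter from a method forces the compiler to retain it as
  // a private field, which then gets serialized unless the parameter is declared
  // as a @transient val/var in the constructor.
  def firstLocation: Option[String] = locs.headOption
}
```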


---




[GitHub] spark pull request #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixe...

2018-12-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/23213#discussion_r238899777
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2899,6 +2899,144 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   }
 }
   }
+
+  private def checkKeywordsExistsInExplain(df: DataFrame, keywords: 
String*): Unit = {
+val output = new java.io.ByteArrayOutputStream()
+Console.withOut(output) {
+  df.explain(extended = true)
+}
+val normalizedOutput = output.toString.replaceAll("#\\d+", "#x")
+for (key <- keywords) {
+  assert(normalizedOutput.contains(key))
+}
+  }
+
+  test("optimized plan should show the rewritten aggregate expression") {
--- End diff --

updated! Thanks, guys!


---




[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22514#discussion_r238899698
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -181,62 +180,39 @@ case class RelationConversions(
 conf: SQLConf,
 sessionCatalog: HiveSessionCatalog) extends Rule[LogicalPlan] {
   private def isConvertible(relation: HiveTableRelation): Boolean = {
-val serde = 
relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-serde.contains("parquet") && 
conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
-  serde.contains("orc") && 
conf.getConf(HiveUtils.CONVERT_METASTORE_ORC)
+isConvertible(relation.tableMeta)
   }
 
-  // Return true for Apache ORC and Hive ORC-related configuration names.
-  // Note that Spark doesn't support configurations like 
`hive.merge.orcfile.stripe.level`.
-  private def isOrcProperty(key: String) =
-key.startsWith("orc.") || key.contains(".orc.")
-
-  private def isParquetProperty(key: String) =
-key.startsWith("parquet.") || key.contains(".parquet.")
-
-  private def convert(relation: HiveTableRelation): LogicalRelation = {
-val serde = 
relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-
-// Consider table and storage properties. For properties existing in 
both sides, storage
-// properties will supersede table properties.
-if (serde.contains("parquet")) {
-  val options = 
relation.tableMeta.properties.filterKeys(isParquetProperty) ++
-relation.tableMeta.storage.properties + 
(ParquetOptions.MERGE_SCHEMA ->
-
conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
-  sessionCatalog.metastoreCatalog
-.convertToLogicalRelation(relation, options, 
classOf[ParquetFileFormat], "parquet")
-} else {
-  val options = 
relation.tableMeta.properties.filterKeys(isOrcProperty) ++
-relation.tableMeta.storage.properties
-  if (conf.getConf(SQLConf.ORC_IMPLEMENTATION) == "native") {
-sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-  relation,
-  options,
-  
classOf[org.apache.spark.sql.execution.datasources.orc.OrcFileFormat],
-  "orc")
-  } else {
-sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-  relation,
-  options,
-  classOf[org.apache.spark.sql.hive.orc.OrcFileFormat],
-  "orc")
-  }
-}
+  private def isConvertible(tableMeta: CatalogTable): Boolean = {
+val serde = 
tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
+serde.contains("parquet") && 
SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
+  serde.contains("orc") && 
SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_ORC)
   }
 
+  private val metastoreCatalog = sessionCatalog.metastoreCatalog
+
   override def apply(plan: LogicalPlan): LogicalPlan = {
 plan resolveOperators {
   // Write path
   case InsertIntoTable(r: HiveTableRelation, partition, query, 
overwrite, ifPartitionNotExists)
 // Inserting into partitioned table is not supported in 
Parquet/Orc data source (yet).
   if query.resolved && DDLUtils.isHiveTable(r.tableMeta) &&
 !r.isPartitioned && isConvertible(r) =>
-InsertIntoTable(convert(r), partition, query, overwrite, 
ifPartitionNotExists)
+InsertIntoTable(metastoreCatalog.convert(r), partition,
+  query, overwrite, ifPartitionNotExists)
 
   // Read path
   case relation: HiveTableRelation
   if DDLUtils.isHiveTable(relation.tableMeta) && 
isConvertible(relation) =>
-convert(relation)
+metastoreCatalog.convert(relation)
+
+  // CTAS
+  case CreateTable(tableDesc, mode, Some(query))
+  if DDLUtils.isHiveTable(tableDesc) && 
tableDesc.partitionColumnNames.isEmpty &&
+isConvertible(tableDesc) =>
--- End diff --

It's not a new optimization... It's an optimization we dropped in 2.3 by 
mistake.

I'm fine with adding a config with a default value of true.


---




[GitHub] spark issue #23217: [SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat in order...

2018-12-04 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23217
  
thanks, merging to master!


---




[GitHub] spark pull request #23217: [SPARK-25829][SQL][FOLLOWUP] Refactor MapConcat i...

2018-12-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23217


---




[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22514#discussion_r238902415
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -181,62 +180,39 @@ case class RelationConversions(
 conf: SQLConf,
 sessionCatalog: HiveSessionCatalog) extends Rule[LogicalPlan] {
   private def isConvertible(relation: HiveTableRelation): Boolean = {
-val serde = 
relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-serde.contains("parquet") && 
conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
-  serde.contains("orc") && 
conf.getConf(HiveUtils.CONVERT_METASTORE_ORC)
+isConvertible(relation.tableMeta)
   }
 
-  // Return true for Apache ORC and Hive ORC-related configuration names.
-  // Note that Spark doesn't support configurations like 
`hive.merge.orcfile.stripe.level`.
-  private def isOrcProperty(key: String) =
-key.startsWith("orc.") || key.contains(".orc.")
-
-  private def isParquetProperty(key: String) =
-key.startsWith("parquet.") || key.contains(".parquet.")
-
-  private def convert(relation: HiveTableRelation): LogicalRelation = {
-val serde = 
relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-
-// Consider table and storage properties. For properties existing in 
both sides, storage
-// properties will supersede table properties.
-if (serde.contains("parquet")) {
-  val options = 
relation.tableMeta.properties.filterKeys(isParquetProperty) ++
-relation.tableMeta.storage.properties + 
(ParquetOptions.MERGE_SCHEMA ->
-
conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
-  sessionCatalog.metastoreCatalog
-.convertToLogicalRelation(relation, options, 
classOf[ParquetFileFormat], "parquet")
-} else {
-  val options = 
relation.tableMeta.properties.filterKeys(isOrcProperty) ++
-relation.tableMeta.storage.properties
-  if (conf.getConf(SQLConf.ORC_IMPLEMENTATION) == "native") {
-sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-  relation,
-  options,
-  
classOf[org.apache.spark.sql.execution.datasources.orc.OrcFileFormat],
-  "orc")
-  } else {
-sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-  relation,
-  options,
-  classOf[org.apache.spark.sql.hive.orc.OrcFileFormat],
-  "orc")
-  }
-}
+  private def isConvertible(tableMeta: CatalogTable): Boolean = {
+val serde = 
tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
+serde.contains("parquet") && 
SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
+  serde.contains("orc") && 
SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_ORC)
   }
 
+  private val metastoreCatalog = sessionCatalog.metastoreCatalog
+
   override def apply(plan: LogicalPlan): LogicalPlan = {
 plan resolveOperators {
   // Write path
   case InsertIntoTable(r: HiveTableRelation, partition, query, 
overwrite, ifPartitionNotExists)
 // Inserting into partitioned table is not supported in 
Parquet/Orc data source (yet).
   if query.resolved && DDLUtils.isHiveTable(r.tableMeta) &&
 !r.isPartitioned && isConvertible(r) =>
-InsertIntoTable(convert(r), partition, query, overwrite, 
ifPartitionNotExists)
+InsertIntoTable(metastoreCatalog.convert(r), partition,
+  query, overwrite, ifPartitionNotExists)
 
   // Read path
   case relation: HiveTableRelation
   if DDLUtils.isHiveTable(relation.tableMeta) && 
isConvertible(relation) =>
-convert(relation)
+metastoreCatalog.convert(relation)
+
+  // CTAS
+  case CreateTable(tableDesc, mode, Some(query))
+  if DDLUtils.isHiveTable(tableDesc) && 
tableDesc.partitionColumnNames.isEmpty &&
+isConvertible(tableDesc) =>
--- End diff --

Hmm, the optimization is already controlled by configs like
`HiveUtils.CONVERT_METASTORE_ORC` and `HiveUtils.CONVERT_METASTORE_PARQUET`. Do 
we need another config for it?


---




[GitHub] spark issue #23216: [SPARK-26264][CORE]It is better to add @transient to fie...

2018-12-04 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/23216
  
I think we should just leave it. The `@transient` on `ShuffleMapTask`'s `locs` is
just superfluous here; not sure it's worth changing.


---




[GitHub] spark pull request #23216: [SPARK-26264][CORE]It is better to add @transient...

2018-12-04 Thread 10110346
Github user 10110346 closed the pull request at:

https://github.com/apache/spark/pull/23216


---




[GitHub] spark issue #23216: [SPARK-26264][CORE]It is better to add @transient to fie...

2018-12-04 Thread 10110346
Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/23216
  
OK, I will close this PR. Thank you very much.


---




[GitHub] spark pull request #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixe...

2018-12-04 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/23213#discussion_r238905795
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/ExplainSuite.scala 
---
@@ -53,6 +55,133 @@ class ExplainSuite extends QueryTest with 
SharedSQLContext {
 checkKeywordsExistsInExplain(df,
   keywords = "InMemoryRelation", "StorageLevel(disk, memory, 
deserialized, 1 replicas)")
   }
+
+  test("optimized plan should show the rewritten aggregate expression") {
+withTempView("test_agg") {
+  sql(
+"""
+  |CREATE TEMPORARY VIEW test_agg AS SELECT * FROM VALUES
+  |  (1, true), (1, false),
+  |  (2, true),
+  |  (3, false), (3, null),
+  |  (4, null), (4, null),
+  |  (5, null), (5, true), (5, false) AS test_agg(k, v)
+""".stripMargin)
+
+  // simple explain of queries having every/some/any aggregates. 
Optimized
+  // plan should show the rewritten aggregate expression.
+  val df = sql("SELECT k, every(v), some(v), any(v) FROM test_agg 
GROUP BY k")
+  checkKeywordsExistsInExplain(df,
+"Aggregate [k#x], [k#x, min(v#x) AS every(v)#x, max(v#x) AS 
some(v)#x, " +
--- End diff --

I forgot to set `extended = true` in explain...


---




[GitHub] spark issue #21486: [SPARK-24387][Core] Heartbeat-timeout executor is added ...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21486
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23213
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5744/
Test PASSed.


---




[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23213
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23213
  
**[Test build #99692 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99692/testReport)**
 for PR 23213 at commit 
[`808af50`](https://github.com/apache/spark/commit/808af50d756583bd69b7dd7ca1e1ae09d2457b41).


---




[GitHub] spark issue #23218: [SPARK-26266][BUILD] Update to Scala 2.12.8

2018-12-04 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/23218
  
Hm, one failure was due to a JVM crash, but it has failed twice consistently, with
sbt just exiting with status 134. No other failures are logged. Not sure what
to make of that!


---




[GitHub] spark issue #23210: [SPARK-26233][SQL] CheckOverflow when encoding a decimal...

2018-12-04 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23210
  
a late LGTM


---




[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22514#discussion_r238908877
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -181,62 +180,39 @@ case class RelationConversions(
 conf: SQLConf,
 sessionCatalog: HiveSessionCatalog) extends Rule[LogicalPlan] {
   private def isConvertible(relation: HiveTableRelation): Boolean = {
-val serde = 
relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-serde.contains("parquet") && 
conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
-  serde.contains("orc") && 
conf.getConf(HiveUtils.CONVERT_METASTORE_ORC)
+isConvertible(relation.tableMeta)
   }
 
-  // Return true for Apache ORC and Hive ORC-related configuration names.
-  // Note that Spark doesn't support configurations like 
`hive.merge.orcfile.stripe.level`.
-  private def isOrcProperty(key: String) =
-key.startsWith("orc.") || key.contains(".orc.")
-
-  private def isParquetProperty(key: String) =
-key.startsWith("parquet.") || key.contains(".parquet.")
-
-  private def convert(relation: HiveTableRelation): LogicalRelation = {
-val serde = 
relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-
-// Consider table and storage properties. For properties existing in 
both sides, storage
-// properties will supersede table properties.
-if (serde.contains("parquet")) {
-  val options = 
relation.tableMeta.properties.filterKeys(isParquetProperty) ++
-relation.tableMeta.storage.properties + 
(ParquetOptions.MERGE_SCHEMA ->
-
conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
-  sessionCatalog.metastoreCatalog
-.convertToLogicalRelation(relation, options, 
classOf[ParquetFileFormat], "parquet")
-} else {
-  val options = 
relation.tableMeta.properties.filterKeys(isOrcProperty) ++
-relation.tableMeta.storage.properties
-  if (conf.getConf(SQLConf.ORC_IMPLEMENTATION) == "native") {
-sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-  relation,
-  options,
-  
classOf[org.apache.spark.sql.execution.datasources.orc.OrcFileFormat],
-  "orc")
-  } else {
-sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-  relation,
-  options,
-  classOf[org.apache.spark.sql.hive.orc.OrcFileFormat],
-  "orc")
-  }
-}
+  private def isConvertible(tableMeta: CatalogTable): Boolean = {
+val serde = 
tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
+serde.contains("parquet") && 
SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
+  serde.contains("orc") && 
SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_ORC)
   }
 
+  private val metastoreCatalog = sessionCatalog.metastoreCatalog
+
   override def apply(plan: LogicalPlan): LogicalPlan = {
 plan resolveOperators {
   // Write path
   case InsertIntoTable(r: HiveTableRelation, partition, query, 
overwrite, ifPartitionNotExists)
 // Inserting into partitioned table is not supported in 
Parquet/Orc data source (yet).
   if query.resolved && DDLUtils.isHiveTable(r.tableMeta) &&
 !r.isPartitioned && isConvertible(r) =>
-InsertIntoTable(convert(r), partition, query, overwrite, 
ifPartitionNotExists)
+InsertIntoTable(metastoreCatalog.convert(r), partition,
+  query, overwrite, ifPartitionNotExists)
 
   // Read path
   case relation: HiveTableRelation
   if DDLUtils.isHiveTable(relation.tableMeta) && 
isConvertible(relation) =>
-convert(relation)
+metastoreCatalog.convert(relation)
+
+  // CTAS
+  case CreateTable(tableDesc, mode, Some(query))
+  if DDLUtils.isHiveTable(tableDesc) && 
tableDesc.partitionColumnNames.isEmpty &&
+isConvertible(tableDesc) =>
--- End diff --

I don't mind adding `HiveUtils.CONVERT_METASTORE_ORC_CTAS`; maybe we can do
it in a follow-up?
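
For illustration, such a flag would sit next to the existing
`CONVERT_METASTORE_PARQUET`/`CONVERT_METASTORE_ORC` entries in `HiveUtils` and
default to true. The sketch below follows that pattern; the key name, the doc
text, and the use of the local `buildConf` helper are assumptions, not the
final API:

```scala
// Sketch only, as it might appear inside HiveUtils.scala.
val CONVERT_METASTORE_ORC_CTAS = buildConf("spark.sql.hive.convertMetastoreOrcCtas")
  .doc("When true, use the built-in data source writer instead of the Hive serde " +
    "for CTAS on Hive serde tables, matching the conversion done on the read path.")
  .booleanConf
  .createWithDefault(true)
```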


---




[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-04 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22514#discussion_r238909363
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -181,62 +180,39 @@ case class RelationConversions(
 conf: SQLConf,
 sessionCatalog: HiveSessionCatalog) extends Rule[LogicalPlan] {
   private def isConvertible(relation: HiveTableRelation): Boolean = {
-val serde = 
relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-serde.contains("parquet") && 
conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
-  serde.contains("orc") && 
conf.getConf(HiveUtils.CONVERT_METASTORE_ORC)
+isConvertible(relation.tableMeta)
   }
 
-  // Return true for Apache ORC and Hive ORC-related configuration names.
-  // Note that Spark doesn't support configurations like 
`hive.merge.orcfile.stripe.level`.
-  private def isOrcProperty(key: String) =
-key.startsWith("orc.") || key.contains(".orc.")
-
-  private def isParquetProperty(key: String) =
-key.startsWith("parquet.") || key.contains(".parquet.")
-
-  private def convert(relation: HiveTableRelation): LogicalRelation = {
-val serde = 
relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-
-// Consider table and storage properties. For properties existing in 
both sides, storage
-// properties will supersede table properties.
-if (serde.contains("parquet")) {
-  val options = 
relation.tableMeta.properties.filterKeys(isParquetProperty) ++
-relation.tableMeta.storage.properties + 
(ParquetOptions.MERGE_SCHEMA ->
-
conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
-  sessionCatalog.metastoreCatalog
-.convertToLogicalRelation(relation, options, 
classOf[ParquetFileFormat], "parquet")
-} else {
-  val options = 
relation.tableMeta.properties.filterKeys(isOrcProperty) ++
-relation.tableMeta.storage.properties
-  if (conf.getConf(SQLConf.ORC_IMPLEMENTATION) == "native") {
-sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-  relation,
-  options,
-  
classOf[org.apache.spark.sql.execution.datasources.orc.OrcFileFormat],
-  "orc")
-  } else {
-sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-  relation,
-  options,
-  classOf[org.apache.spark.sql.hive.orc.OrcFileFormat],
-  "orc")
-  }
-}
+  private def isConvertible(tableMeta: CatalogTable): Boolean = {
+val serde = 
tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
+serde.contains("parquet") && 
SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
+  serde.contains("orc") && 
SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_ORC)
   }
 
+  private val metastoreCatalog = sessionCatalog.metastoreCatalog
+
   override def apply(plan: LogicalPlan): LogicalPlan = {
 plan resolveOperators {
   // Write path
   case InsertIntoTable(r: HiveTableRelation, partition, query, 
overwrite, ifPartitionNotExists)
 // Inserting into partitioned table is not supported in 
Parquet/Orc data source (yet).
   if query.resolved && DDLUtils.isHiveTable(r.tableMeta) &&
 !r.isPartitioned && isConvertible(r) =>
-InsertIntoTable(convert(r), partition, query, overwrite, 
ifPartitionNotExists)
+InsertIntoTable(metastoreCatalog.convert(r), partition,
+  query, overwrite, ifPartitionNotExists)
 
   // Read path
   case relation: HiveTableRelation
   if DDLUtils.isHiveTable(relation.tableMeta) && 
isConvertible(relation) =>
-convert(relation)
+metastoreCatalog.convert(relation)
+
+  // CTAS
+  case CreateTable(tableDesc, mode, Some(query))
+  if DDLUtils.isHiveTable(tableDesc) && 
tableDesc.partitionColumnNames.isEmpty &&
+isConvertible(tableDesc) =>
--- End diff --

ok.


---




[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23207#discussion_r238909822
  
--- Diff: core/src/main/scala/org/apache/spark/shuffle/metrics.scala ---
@@ -50,3 +50,57 @@ private[spark] trait ShuffleWriteMetricsReporter {
   private[spark] def decBytesWritten(v: Long): Unit
   private[spark] def decRecordsWritten(v: Long): Unit
 }
+
+
+/**
+ * A proxy class of ShuffleWriteMetricsReporter which proxy all metrics 
updating to the input
+ * reporters.
+ */
+private[spark] class GroupedShuffleWriteMetricsReporter(
--- End diff --

For the write metrics, it's different: it's the default one that calls the SQL
one, which means we would have to hack the default one to register external
reporters.

Maybe we should not change the read side, and instead just create a special
`PairShuffleWriteMetricsReporter` that updates both the SQL reporter and the
default reporter.

Another idea: `ShuffleDependency` carries a `reporter => reporter` function
instead of a reporter. Then we can create a SQL reporter that takes another
reporter (similar to the read side), and put the SQL reporter's constructor
in `ShuffleDependency`.
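
A rough sketch of that last idea (simplified stand-ins, not the actual Spark
interfaces): the dependency carries a `reporter => reporter` function, and the
SQL-side reporter decorates whatever reporter it is given, so both sets of
metrics are updated on every write.

```scala
// Simplified stand-in for the write-metrics reporter interface.
trait ShuffleWriteMetricsReporter {
  def incBytesWritten(v: Long): Unit
  def incRecordsWritten(v: Long): Unit
}

// A SQL-side reporter that records its own metrics and forwards every update
// to the reporter it wraps (the "default" one).
class SQLShuffleWriteMetricsReporter(delegate: ShuffleWriteMetricsReporter)
    extends ShuffleWriteMetricsReporter {
  private var bytesWritten = 0L
  private var recordsWritten = 0L

  override def incBytesWritten(v: Long): Unit = {
    bytesWritten += v
    delegate.incBytesWritten(v)
  }

  override def incRecordsWritten(v: Long): Unit = {
    recordsWritten += v
    delegate.incRecordsWritten(v)
  }
}

object ReporterFactoryExample {
  // What ShuffleDependency would carry: a function from reporter to reporter.
  type ReporterFactory = ShuffleWriteMetricsReporter => ShuffleWriteMetricsReporter

  val sqlReporterFactory: ReporterFactory =
    base => new SQLShuffleWriteMetricsReporter(base)
}
```

At write time the shuffle writer would call the factory with its default
reporter and use the returned instance for all metric updates.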


---




[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22683
  
**[Test build #99693 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99693/testReport)**
 for PR 22683 at commit 
[`8f11891`](https://github.com/apache/spark/commit/8f11891396d47ee9f404283e30922f9f16bc612a).


---




[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22683
  
**[Test build #99695 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99695/testReport)**
 for PR 22683 at commit 
[`235b2fb`](https://github.com/apache/spark/commit/235b2fbf20dae9c7a2177992b24765085fb2f221).


---




[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22514
  
**[Test build #99694 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99694/testReport)**
 for PR 22514 at commit 
[`57fc943`](https://github.com/apache/spark/commit/57fc94383ad3c66e5b93f40378d8c94aaa726e7a).


---




[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5745/
Test PASSed.


---




[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22721
  
**[Test build #99689 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99689/testReport)**
 for PR 22721 at commit 
[`c601b67`](https://github.com/apache/spark/commit/c601b674ec1c0e288c0b3852dcdb511c64bfa6a5).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99689/
Test FAILed.


---




[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22683
  
**[Test build #99696 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99696/testReport)**
 for PR 22683 at commit 
[`4c4674e`](https://github.com/apache/spark/commit/4c4674e1abfa28a01d733f4ae60039410e769fc8).


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-04 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/23088
  
Thanks @vanzin @srowen 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23203: [SPARK-26252][PYTHON] Add support to run specific unitte...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23203
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5746/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23203: [SPARK-26252][PYTHON] Add support to run specific unitte...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23203
  
**[Test build #99697 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99697/testReport)**
 for PR 23203 at commit 
[`bd23e01`](https://github.com/apache/spark/commit/bd23e01078deb90bcdba654ff82047603a462b2e).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23203: [SPARK-26252][PYTHON] Add support to run specific unitte...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23203
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23169
  
**[Test build #99686 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99686/testReport)**
 for PR 23169 at commit 
[`22fe117`](https://github.com/apache/spark/commit/22fe117656ea004757efaffd847f81dc01df8433).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23169
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23169
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99686/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23169: [SPARK-26103][SQL] Limit the length of debug strings for...

2018-12-04 Thread DaveDeCaprio
Github user DaveDeCaprio commented on the issue:

https://github.com/apache/spark/pull/23169
  
@HeartSaVioR  I added tests for the default case and for a truncated plan.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23203: [SPARK-26252][PYTHON] Add support to run specific unitte...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23203
  
**[Test build #99697 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99697/testReport)**
 for PR 23203 at commit 
[`bd23e01`](https://github.com/apache/spark/commit/bd23e01078deb90bcdba654ff82047603a462b2e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23203: [SPARK-26252][PYTHON] Add support to run specific unitte...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23203
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99697/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23203: [SPARK-26252][PYTHON] Add support to run specific unitte...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23203
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23221: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23221
  
**[Test build #99691 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99691/testReport)**
 for PR 23221 at commit 
[`e58fc91`](https://github.com/apache/spark/commit/e58fc919355c48d2d3b1cacb4d0ee18036cacbc6).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23221: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23221
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99691/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23221: [SPARK-24243][CORE] Expose exceptions from InProcessAppH...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23221
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for CREATE EXTERN...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23108
  
**[Test build #99688 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99688/testReport)**
 for PR 23108 at commit 
[`fe472c8`](https://github.com/apache/spark/commit/fe472c81a21700ff52c84808437b85d02d6871ed).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for CREATE EXTERN...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23108
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99688/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23108: [Spark-25993][SQL][TEST]Add test cases for CREATE EXTERN...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23108
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22721
  
**[Test build #99690 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99690/testReport)**
 for PR 22721 at commit 
[`c91c154`](https://github.com/apache/spark/commit/c91c15493b30e49e81fbf9097b37bf0b4bdafc79).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22721: [SPARK-19784][SPARK-25403][SQL] Refresh the table even t...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22721
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99690/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23120: [SPARK-26151][SQL] Return partial results for bad CSV re...

2018-12-04 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23120
  
Hi @MaxGekk, since this changes the result (although it makes it better), do you mind adding a migration guide? Thanks!
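
A minimal sketch of the behavior change such a migration note would describe, assuming PERMISSIVE mode and a user-supplied schema (the data and object names below are illustrative only, not taken from the PR):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{DoubleType, IntegerType, StructType}

object CsvPartialResultsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("csv-partial-results").getOrCreate()
    import spark.implicits._

    val schema = new StructType()
      .add("id", IntegerType)
      .add("amount", DoubleType)

    // The second record has a malformed amount, so only that column fails to parse.
    val rows = Seq("1,10.5", "2,oops").toDS()

    val parsed = spark.read
      .schema(schema)
      .option("mode", "PERMISSIVE")
      .csv(rows)

    // Before this change the bad row came back with every column null; with partial
    // results the columns that did parse are expected to be kept, e.g. (2, null).
    parsed.show()
    spark.stop()
  }
}
```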


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #23222: [SPARK-20636] Add the rule TransposeWindow to the...

2018-12-04 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/23222

[SPARK-20636] Add the rule TransposeWindow to the optimization batch

## What changes were proposed in this pull request?

This PR is a follow-up of https://github.com/apache/spark/pull/17899. It adds the rule TransposeWindow to the optimizer batch.
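
For context, a rough sketch (not taken from the PR itself) of the kind of query the rule targets: two window aggregates whose partition specs are compatible, where transposing the adjacent Window operators can let the planner avoid an extra exchange. Names and data are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.sum

object TransposeWindowSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("transpose-window").getOrCreate()
    import spark.implicits._

    val df = Seq(("a", "x", 1), ("a", "y", 2), ("b", "x", 3)).toDF("p1", "p2", "v")

    // The second window partitions on a subset of the first window's keys.
    val fine = Window.partitionBy("p1", "p2")
    val coarse = Window.partitionBy("p1")

    val result = df
      .withColumn("sum_fine", sum($"v").over(fine))
      .withColumn("sum_coarse", sum($"v").over(coarse))

    // Inspecting the optimized plan shows whether the two Window operators were transposed
    // by the rule once it is part of the optimization batch.
    result.explain(true)
    spark.stop()
  }
}
```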

## How was this patch tested?
The existing tests. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark followupSPARK-20636

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23222.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23222


commit 1270e89026d80c862137c03edbeee53e56f3ed6d
Author: gatorsmile 
Date:   2018-12-05T05:07:00Z

add the rule TransposeWindow to the batch




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-04 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/23222
  
cc @ptkool @jiangxb1987 @cloud-fan 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23222
  
**[Test build #99698 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99698/testReport)**
 for PR 23222 at commit 
[`1270e89`](https://github.com/apache/spark/commit/1270e89026d80c862137c03edbeee53e56f3ed6d).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23222
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22514#discussion_r238933039
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
@@ -181,62 +180,39 @@ case class RelationConversions(
     conf: SQLConf,
     sessionCatalog: HiveSessionCatalog) extends Rule[LogicalPlan] {
   private def isConvertible(relation: HiveTableRelation): Boolean = {
-    val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-    serde.contains("parquet") && conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
-      serde.contains("orc") && conf.getConf(HiveUtils.CONVERT_METASTORE_ORC)
+    isConvertible(relation.tableMeta)
   }
 
-  // Return true for Apache ORC and Hive ORC-related configuration names.
-  // Note that Spark doesn't support configurations like `hive.merge.orcfile.stripe.level`.
-  private def isOrcProperty(key: String) =
-    key.startsWith("orc.") || key.contains(".orc.")
-
-  private def isParquetProperty(key: String) =
-    key.startsWith("parquet.") || key.contains(".parquet.")
-
-  private def convert(relation: HiveTableRelation): LogicalRelation = {
-    val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
-
-    // Consider table and storage properties. For properties existing in both sides, storage
-    // properties will supersede table properties.
-    if (serde.contains("parquet")) {
-      val options = relation.tableMeta.properties.filterKeys(isParquetProperty) ++
-        relation.tableMeta.storage.properties + (ParquetOptions.MERGE_SCHEMA ->
-        conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
-      sessionCatalog.metastoreCatalog
-        .convertToLogicalRelation(relation, options, classOf[ParquetFileFormat], "parquet")
-    } else {
-      val options = relation.tableMeta.properties.filterKeys(isOrcProperty) ++
-        relation.tableMeta.storage.properties
-      if (conf.getConf(SQLConf.ORC_IMPLEMENTATION) == "native") {
-        sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-          relation,
-          options,
-          classOf[org.apache.spark.sql.execution.datasources.orc.OrcFileFormat],
-          "orc")
-      } else {
-        sessionCatalog.metastoreCatalog.convertToLogicalRelation(
-          relation,
-          options,
-          classOf[org.apache.spark.sql.hive.orc.OrcFileFormat],
-          "orc")
-      }
-    }
+  private def isConvertible(tableMeta: CatalogTable): Boolean = {
+    val serde = tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
+    serde.contains("parquet") && SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_PARQUET) ||
+      serde.contains("orc") && SQLConf.get.getConf(HiveUtils.CONVERT_METASTORE_ORC)
   }
 
+  private val metastoreCatalog = sessionCatalog.metastoreCatalog
+
   override def apply(plan: LogicalPlan): LogicalPlan = {
     plan resolveOperators {
       // Write path
       case InsertIntoTable(r: HiveTableRelation, partition, query, overwrite, ifPartitionNotExists)
         // Inserting into partitioned table is not supported in Parquet/Orc data source (yet).
         if query.resolved && DDLUtils.isHiveTable(r.tableMeta) &&
           !r.isPartitioned && isConvertible(r) =>
-        InsertIntoTable(convert(r), partition, query, overwrite, ifPartitionNotExists)
+        InsertIntoTable(metastoreCatalog.convert(r), partition,
+          query, overwrite, ifPartitionNotExists)
 
       // Read path
       case relation: HiveTableRelation
           if DDLUtils.isHiveTable(relation.tableMeta) && isConvertible(relation) =>
-        convert(relation)
+        metastoreCatalog.convert(relation)
+
+      // CTAS
+      case CreateTable(tableDesc, mode, Some(query))
+          if DDLUtils.isHiveTable(tableDesc) && tableDesc.partitionColumnNames.isEmpty &&
+            isConvertible(tableDesc) =>
--- End diff --

Since the regression was already introduced, we need to add a conf and 
migration guide. 
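
To make the suggestion concrete, a hedged sketch of how a user-facing escape hatch could look once such a conf exists; the flag name below is purely hypothetical and not something this PR defines:

```scala
import org.apache.spark.sql.SparkSession

object CtasConversionFlagSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ctas-conversion-flag")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical flag: fall back to the old Hive SerDe CTAS write path instead of the
    // data source conversion this PR introduces.
    spark.conf.set("spark.sql.hive.convertMetastoreCtas", "false")

    spark.sql("CREATE TABLE ctas_demo STORED AS PARQUET AS SELECT 1 AS id")
    spark.sql("DESCRIBE FORMATTED ctas_demo").show(truncate = false)
    spark.stop()
  }
}
```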


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23222: [SPARK-20636] Add the rule TransposeWindow to the optimi...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23222
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5747/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23213
  
**[Test build #99692 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99692/testReport)**
 for PR 23213 at commit 
[`808af50`](https://github.com/apache/spark/commit/808af50d756583bd69b7dd7ca1e1ae09d2457b41).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23213
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99692/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23213
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22514
  
**[Test build #99694 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99694/testReport)**
 for PR 22514 at commit 
[`57fc943`](https://github.com/apache/spark/commit/57fc94383ad3c66e5b93f40378d8c94aaa726e7a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99694/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-04 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22683
  
**[Test build #99693 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99693/testReport)**
 for PR 22683 at commit 
[`8f11891`](https://github.com/apache/spark/commit/8f11891396d47ee9f404283e30922f9f16bc612a).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22683
  
Build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


