[GitHub] spark pull request #22455: [SPARK-24572][SPARKR] "eager execution" for R she...

2018-09-21 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22455#discussion_r219404319
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -244,11 +245,15 @@ setMethod("showDF",
 #' @note show(SparkDataFrame) since 1.4.0
 setMethod("show", "SparkDataFrame",
   function(object) {
-cols <- lapply(dtypes(object), function(l) {
-  paste(l, collapse = ":")
-})
-s <- paste(cols, collapse = ", ")
-cat(paste(class(object), "[", s, "]\n", sep = ""))
+if (identical(sparkR.conf("spark.sql.repl.eagerEval.enabled", 
"false")[[1]], "true")) {
--- End diff --

I see. The document generated looks like 
https://spark.apache.org/docs/latest/api/R/index.html.

Then rewriting the description is good and clear.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use...

2018-09-21 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/22500#discussion_r219404392
  
--- Diff: sql/core/benchmarks/MiscBenchmark-results.txt ---
@@ -0,0 +1,132 @@

+
+filter & aggregate without group

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+range/filter/sum:Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+range/filter/sum wholestage off 36618 / 41080 57.3 
 17.5   1.0X
+range/filter/sum wholestage on2495 / 2609840.4 
  1.2  14.7X
+
+

+
+range/limit/sum

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+range/limit/sum: Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+range/limit/sum wholestage off 117 /  121   4477.9 
  0.2   1.0X
+range/limit/sum wholestage on  178 /  187   2938.1 
  0.3   0.7X
+
+

+
+sample

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+sample with replacement: Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+sample with replacement wholestage off9142 / 9182 14.3 
 69.8   1.0X
+sample with replacement wholestage on 5926 / 6107 22.1 
 45.2   1.5X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+sample without replacement:  Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+sample without replacement wholestage off  1834 / 1837 71.5
  14.0   1.0X
+sample without replacement wholestage on   784 /  803167.2 
  6.0   2.3X
+
+

+
+collect

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+collect: Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+collect 1 million  186 /  215  5.6 
177.5   1.0X
+collect 2 millions 361 /  393  2.9 
344.2   0.5X
+collect 4 millions 884 / 1053  1.2 
843.4   0.2X
+
+

+
+collect limit

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+collect limit:   Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+collect limit 1 million206 /  225  5.1 
196.6   1.0X
+collect limit 2 millions   407 /  419  2.6 
387.8   0.5X
+
+

+
+generate explode

+
   

[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22515
  
**[Test build #96412 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96412/testReport)**
 for PR 22515 at commit 
[`0f32b01`](https://github.com/apache/spark/commit/0f32b0170fe6295bfef604b5a679f9391b5ec78f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22515
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22515
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3337/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22515
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-09-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22275#discussion_r219404072
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -4434,6 +4434,12 @@ def test_timestamp_dst(self):
 self.assertPandasEqual(pdf, df_from_python.toPandas())
 self.assertPandasEqual(pdf, df_from_pandas.toPandas())
 
+def test_toPandas_batch_order(self):
+df = self.spark.range(64, numPartitions=8).toDF("a")
+with 
self.sql_conf({"spark.sql.execution.arrow.maxRecordsPerBatch": 4}):
+pdf, pdf_arrow = self._toPandas_arrow_toggle(df)
+self.assertPandasEqual(pdf, pdf_arrow)
--- End diff --

hm, is this test case "enough" to trigger any possible problem just by 
random? would increasing the number of batch or num record per batch increase 
the chance of streaming order or concurrency issue perhaps?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...

2018-09-21 Thread seancxmao
Github user seancxmao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22461#discussion_r219403696
  
--- Diff: 
external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
 ---
@@ -462,6 +464,9 @@ class OracleIntegrationSuite extends 
DockerJDBCIntegrationSuite with SharedSQLCo
   .option("lowerBound", "2018-07-04 03:30:00.0")
   .option("upperBound", "2018-07-27 14:11:05.0")
   .option("numPartitions", 2)
+  .option("oracle.jdbc.mapDateToTimestamp", "false")
--- End diff --

Yes, we need this. Otherwise, Spark will read column `d` values as Catalyst 
type timestamp, which will fail the test.

https://user-images.githubusercontent.com/12194089/45865915-9e730800-bdb1-11e8-9a42-a1394c601166.png;>



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22442: [SPARK-25447][SQL] Support JSON options by schema...

2018-09-21 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22442#discussion_r219403620
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -3611,6 +3611,20 @@ object functions {
*/
   def schema_of_json(e: Column): Column = withExpr(new 
SchemaOfJson(e.expr))
 
+  /**
+   * Parses a column containing a JSON string and infers its schema using 
options.
+   *
+   * @param e a string column containing JSON data.
+   * @param options JSON datasource options that control JSON parsing and 
type inference.
--- End diff --

As I see we don't fail . Simple example is if `multiLine` is enabled, 
`lineSep` is ignored. There are another examples.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send o...

2018-09-21 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/22275
  
got it. so the size of the each batch could grow. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22514
  
**[Test build #96410 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96410/testReport)**
 for PR 22514 at commit 
[`5debc60`](https://github.com/apache/spark/commit/5debc6096ae6e505d3386fd7eb569d154f158d55).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use...

2018-09-21 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/22500#discussion_r219403342
  
--- Diff: sql/core/benchmarks/MiscBenchmark-results.txt ---
@@ -0,0 +1,132 @@

+
+filter & aggregate without group

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+range/filter/sum:Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+range/filter/sum wholestage off 36618 / 41080 57.3 
 17.5   1.0X
+range/filter/sum wholestage on2495 / 2609840.4 
  1.2  14.7X
+
+

+
+range/limit/sum

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+range/limit/sum: Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+range/limit/sum wholestage off 117 /  121   4477.9 
  0.2   1.0X
+range/limit/sum wholestage on  178 /  187   2938.1 
  0.3   0.7X
+
+

+
+sample

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+sample with replacement: Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+sample with replacement wholestage off9142 / 9182 14.3 
 69.8   1.0X
+sample with replacement wholestage on 5926 / 6107 22.1 
 45.2   1.5X
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+sample without replacement:  Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+sample without replacement wholestage off  1834 / 1837 71.5
  14.0   1.0X
+sample without replacement wholestage on   784 /  803167.2 
  6.0   2.3X
+
+

+
+collect

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+collect: Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+collect 1 million  186 /  215  5.6 
177.5   1.0X
+collect 2 millions 361 /  393  2.9 
344.2   0.5X
+collect 4 millions 884 / 1053  1.2 
843.4   0.2X
+
+

+
+collect limit

+
+
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_151-b12 on Mac OS X 10.12.6
+Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
+
+collect limit:   Best/Avg Time(ms)Rate(M/s)   
Per Row(ns)   Relative

+
+collect limit 1 million206 /  225  5.1 
196.6   1.0X
+collect limit 2 millions   407 /  419  2.6 
387.8   0.5X
+
+

+
+generate explode

+
   

[GitHub] spark issue #22458: [SPARK-25459] Add viewOriginalText back to CatalogTable

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22458
  
**[Test build #96411 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96411/testReport)**
 for PR 22458 at commit 
[`f3d3100`](https://github.com/apache/spark/commit/f3d3100399be442da9fd5e417aeefb9662903c49).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3336/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22455: [SPARK-24572][SPARKR] "eager execution" for R she...

2018-09-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22455#discussion_r219402870
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -244,11 +245,15 @@ setMethod("showDF",
 #' @note show(SparkDataFrame) since 1.4.0
 setMethod("show", "SparkDataFrame",
   function(object) {
-cols <- lapply(dtypes(object), function(l) {
-  paste(l, collapse = ":")
-})
-s <- paste(cols, collapse = ", ")
-cat(paste(class(object), "[", s, "]\n", sep = ""))
+if (identical(sparkR.conf("spark.sql.repl.eagerEval.enabled", 
"false")[[1]], "true")) {
--- End diff --

respecting `spark.sql.repl.eagerEval.maxNumRows` somewhat makes sense. but 
instead of changing `showDF` which has other cases beyond eagerEval, we could 
change where `showDF` is called by `show`, and pass the max rows.

here 
https://github.com/apache/spark/pull/22455/files#diff-508641a8bd6c6b59f3e77c80cdcfa6a9R249


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/22514
  
retest this please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22458: [SPARK-25459] Add viewOriginalText back to CatalogTable

2018-09-21 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22458
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22511
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22316
  
**[Test build #96409 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96409/testReport)**
 for PR 22316 at commit 
[`382640b`](https://github.com/apache/spark/commit/382640be9bb9739929daea0bceb3093836d7f78d).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22511
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3335/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22511
  
**[Test build #96408 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96408/testReport)**
 for PR 22511 at commit 
[`aee82ab`](https://github.com/apache/spark/commit/aee82abe4cd9fbefa14fb280644276fe491bcf9a).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22513
  
**[Test build #96407 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96407/testReport)**
 for PR 22513 at commit 
[`1c3c0f6`](https://github.com/apache/spark/commit/1c3c0f692d38b361f35017df3e999f7838e28e48).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22455: [SPARK-24572][SPARKR] "eager execution" for R she...

2018-09-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/22455#discussion_r219402281
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -244,11 +245,15 @@ setMethod("showDF",
 #' @note show(SparkDataFrame) since 1.4.0
 setMethod("show", "SparkDataFrame",
   function(object) {
-cols <- lapply(dtypes(object), function(l) {
-  paste(l, collapse = ":")
-})
-s <- paste(cols, collapse = ", ")
-cat(paste(class(object), "[", s, "]\n", sep = ""))
+if (identical(sparkR.conf("spark.sql.repl.eagerEval.enabled", 
"false")[[1]], "true")) {
--- End diff --

^ that - it's two different "overloads" for two different "class"


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...

2018-09-21 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22316
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22513
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3334/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22513
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22511
  
cc @jiangxb1987 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22511
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22513
  
retest this please.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22512: [SPARK-25498][SQL][WIP] Fix SQLQueryTestSuite failures w...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22512
  
**[Test build #96405 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96405/testReport)**
 for PR 22512 at commit 
[`39c5e92`](https://github.com/apache/spark/commit/39c5e92713b86f342e756591235f9cbe25126f90).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] DECIMAL_OPERATIONS_ALLOW_PR...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22494
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified//
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] DECIMAL_OPERATIONS_ALLOW_PR...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22494
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22512: [SPARK-25498][SQL][WIP] Fix SQLQueryTestSuite failures w...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22512
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3332/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] DECIMAL_OPERATIONS_ALLOW_PR...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22494
  
**[Test build #96406 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96406/testReport)**
 for PR 22494 at commit 
[`1ee9f02`](https://github.com/apache/spark/commit/1ee9f0208a3cb6de373e05366c19bf69967eecd8).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22512: [SPARK-25498][SQL][WIP] Fix SQLQueryTestSuite failures w...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22512
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22512: [SPARK-25498][SQL][WIP] Fix SQLQueryTestSuite failures w...

2018-09-21 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22512
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96401/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22458: [SPARK-25459] Add viewOriginalText back to CatalogTable

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22458
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96399/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22511
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96394/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22513
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22467: [SPARK-25465][TEST] Refactor Parquet test suites in proj...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22467
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96400/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22515
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96402/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22513
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96397/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22467: [SPARK-25465][TEST] Refactor Parquet test suites in proj...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22467
  
**[Test build #96400 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96400/testReport)**
 for PR 22467 at commit 
[`813d19c`](https://github.com/apache/spark/commit/813d19c63477b82a76bdd0d1da73cf3cb1d38564).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22513
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96403/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22316
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96404/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22458: [SPARK-25459] Add viewOriginalText back to CatalogTable

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22458
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22467: [SPARK-25465][TEST] Refactor Parquet test suites in proj...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22467
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22511
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22511: [SPARK-25422][CORE] Don't memory map blocks streamed to ...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22511
  
**[Test build #96394 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96394/testReport)**
 for PR 22511 at commit 
[`aee82ab`](https://github.com/apache/spark/commit/aee82abe4cd9fbefa14fb280644276fe491bcf9a).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22513
  
**[Test build #96403 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96403/testReport)**
 for PR 22513 at commit 
[`1c3c0f6`](https://github.com/apache/spark/commit/1c3c0f692d38b361f35017df3e999f7838e28e48).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22513
  
**[Test build #96397 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96397/testReport)**
 for PR 22513 at commit 
[`9288933`](https://github.com/apache/spark/commit/9288933b4a71e646e67f551dcfd80f9ff9a470da).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22316
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22515
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22514
  
**[Test build #96401 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96401/testReport)**
 for PR 22514 at commit 
[`5debc60`](https://github.com/apache/spark/commit/5debc6096ae6e505d3386fd7eb569d154f158d55).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22513
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22515
  
**[Test build #96402 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96402/testReport)**
 for PR 22515 at commit 
[`0f32b01`](https://github.com/apache/spark/commit/0f32b0170fe6295bfef604b5a679f9391b5ec78f).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22458: [SPARK-25459] Add viewOriginalText back to CatalogTable

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22458
  
**[Test build #96399 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96399/testReport)**
 for PR 22458 at commit 
[`f3d3100`](https://github.com/apache/spark/commit/f3d3100399be442da9fd5e417aeefb9662903c49).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22316
  
**[Test build #96404 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96404/testReport)**
 for PR 22316 at commit 
[`382640b`](https://github.com/apache/spark/commit/382640be9bb9739929daea0bceb3093836d7f78d).
 * This patch **fails due to an unknown error code, -9**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18544: [SPARK-21318][SQL]Improve exception message thrown by `l...

2018-09-21 Thread stanzhai
Github user stanzhai commented on the issue:

https://github.com/apache/spark/pull/18544
  
@cloud-fan 

User's hive UDFs are registered in externalCatalog which not exists in 
functionRegistry.

It will throws a NoSuchFunctionException when an exception is encountered 
while loading a hive UDF.

But we should throw the original exception.

So, I just fix the issue by:

```
if (functionRegistry.functionExists(funcName)) {
  throw error
} else {
  ...
}
```

changed to:

```
if (super.functionExists(name)) {
  throw error
} else {
  ...
}
```

The following is implementation of `super.functionExists`

```
def functionExists(name: FunctionIdentifier): Boolean = {
  val db = formatDatabaseName(name.database.getOrElse(getCurrentDatabase))
  requireDbExists(db)
  functionRegistry.functionExists(name) ||
externalCatalog.functionExists(db, name.funcName)
}
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] DECIMAL_OPERATIONS_ALLOW_PR...

2018-09-21 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22494
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22471: [SPARK-25469][SQL][Performance] Eval methods of Concat, ...

2018-09-21 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/22471
  
away from keyboard now, so will do when I’m back. Thank!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22497: [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmar...

2018-09-21 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22497
  
I see. I will wait in other PRs.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22331: [SPARK-25331][SS] Make FileStreamSink ignore part...

2018-09-21 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request:

https://github.com/apache/spark/pull/22331#discussion_r219399313
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StagingFileCommitProtocol.scala
 ---
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming
+
+import org.apache.hadoop.fs.{FileAlreadyExistsException, FileContext, Path}
+import org.apache.hadoop.mapreduce.{JobContext, TaskAttemptContext}
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.io.FileCommitProtocol
+import org.apache.spark.internal.io.FileCommitProtocol.TaskCommitMessage
+
+class StagingFileCommitProtocol(jobId: String, path: String)
+  extends FileCommitProtocol with Serializable with Logging
+  with ManifestCommitProtocol {
+  private var stagingDir: Option[Path] = None
--- End diff --

Looks like you're using Option but always call `.get` without any checking. 
In `setupTask` it is fine since assignment is placed in there, but in 
`newTaskTempFile` we may be better to guard with `require` which achieves 
fail-fast and let `.get` always succeed later.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22316: [SPARK-25048][SQL] Pivoting by multiple columns in Scala...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22316
  
**[Test build #96404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96404/testReport)**
 for PR 22316 at commit 
[`382640b`](https://github.com/apache/spark/commit/382640be9bb9739929daea0bceb3093836d7f78d).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22497: [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmar...

2018-09-21 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22497
  
Congratulation, @kiszk

I am working on https://github.com/apache/spark/pull/22513 . If it is 
possible, could you wait until we reach an agreement in that PR before merging 
other benchmark?
In this PR,  there is also such problem:
```
import org.apache.spark.util.{Benchmark, BenchmarkBase => FileBenchmarkBase}
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22513
  
**[Test build #96403 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96403/testReport)**
 for PR 22513 at commit 
[`1c3c0f6`](https://github.com/apache/spark/commit/1c3c0f692d38b361f35017df3e999f7838e28e48).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22513
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3331/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22513
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Ben...

2018-09-21 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/22513#discussion_r219397646
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala
 ---
@@ -27,7 +27,7 @@ import 
org.apache.spark.sql.functions.monotonically_increasing_id
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.internal.SQLConf.ParquetOutputTimestampType
 import org.apache.spark.sql.types.{ByteType, Decimal, DecimalType, 
TimestampType}
-import org.apache.spark.util.{Benchmark, BenchmarkBase => 
FileBenchmarkBase, Utils}
+import org.apache.spark.util.Utils
 
 /**
  * Benchmark to measure read performance with Filter pushdown.
--- End diff --

Thanks, I have updated the doc 👍 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r219397495
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

The your first is correct since this patch addresses only codegen-on case. 
We can add another code to address codegen-off case.
Regarding the your second point, have we ever distingished a wrong output 
from a bad written UT when we defect the difference between `expression` and 
`expected`. I think that the distinguishment is nice to have, but not mandatory 
to have.


I have one question about your approach:
```
assert(containsNull(expected) && isNullable(expression.dataType))
```
Since the above two conditions evaluates `expected` and `expression` 
independently, how this works for the following case? I think that the 
assertion would be passed
```
expression:
dataType = StructType(ArrayType(IntegerType, false), ArrayType(IntegerType, 
true))
Struct(Array(0, null), Array(1, 0))

expected:
Struct(Array(0, 0), Array(1, null))
```




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22515
  
**[Test build #96402 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96402/testReport)**
 for PR 22515 at commit 
[`0f32b01`](https://github.com/apache/spark/commit/0f32b0170fe6295bfef604b5a679f9391b5ec78f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22515
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3330/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsingNonempt...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22515
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22508: [SPARK-23549][SQL] Rename config spark.sql.legacy...

2018-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22508


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22515: [SPARK-19724][SQL] allowCreatingManagedTableUsing...

2018-09-21 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/22515

[SPARK-19724][SQL] allowCreatingManagedTableUsingNonemptyLocation should 
have legacy prefix

One more legacy config to go ...

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark 
allowCreatingManagedTableUsingNonemptyLocation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22515.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22515


commit f7c372e6f803c86e189e984fa6c1dd81f84454e9
Author: Reynold Xin 
Date:   2018-09-21T02:10:10Z

[SPARK-19724][SQL] allowCreatingManagedTableUsingNonemptyLocation should 
have legacy prefix




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22508: [SPARK-23549][SQL] Rename config spark.sql.legacy.compar...

2018-09-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22508
  
LGTM, merging to master/2.4!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22514
  
**[Test build #96401 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96401/testReport)**
 for PR 22514 at commit 
[`5debc60`](https://github.com/apache/spark/commit/5debc6096ae6e505d3386fd7eb569d154f158d55).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22514
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3329/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22513: [SPARK-25499][TEST]Refactor BenchmarkBase and Benchmark

2018-09-21 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22513
  
> KryoBenchmark is in core, and UnsafeProjectionBenchmark, 
HashByteArrayBenchmark and HashBenchmark are in catalyst. If we move the 
benchmark base class to sql, benchmarks mentioned above would not be able to 
inherit from the benchmark base class. What do you think?

@wangyum The cases you mentioned are currently using the 
`org.apache.spark.sql.execution.benchmark.BenchmarkBase` (the old one), so it 
seems fine.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22163
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22227: [SPARK-25202] [SQL] Implements split with limit sql func...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/7
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22163
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96389/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22163
  
**[Test build #96389 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96389/testReport)**
 for PR 22163 at commit 
[`2dc94a2`](https://github.com/apache/spark/commit/2dc94a24ab06141768413dc2bf6f9c5e29ce7249).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22497: [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmar...

2018-09-21 Thread seancxmao
Github user seancxmao commented on the issue:

https://github.com/apache/spark/pull/22497
  
@kiszk @wangyum Thank you!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #18544: [SPARK-21318][SQL]Improve exception message thrown by `l...

2018-09-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/18544
  
Can you explain how do we fix the problem?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-09-21 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/22514

[SPARK-25271][SQL] Hive ctas commands should use data source if it is 
convertible

## What changes were proposed in this pull request?

We have a 
[regression](https://github.com/apache/spark/pull/20521/files#r217254430) since 
2.3.1 that Hive ctas command only uses Hive Serde to write data. Hive ctas 
command previously will use Parquet/Orc data source to write data if it is 
convertible.

Because of it, the related regression reported by this JIRA is when writing 
a empty map in to Hive using ctas, it hits Hive's known issue and is thrown 
exception.

## How was this patch tested?

Added test.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-25271-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22514.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22514


commit 5debc6096ae6e505d3386fd7eb569d154f158d55
Author: Liang-Chi Hsieh 
Date:   2018-09-12T10:33:53Z

Hive ctas commands should use data source format if it is convertible.




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22509: [SPARK-25384][SQL] Clarify fromJsonForceNullableS...

2018-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22509


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22497: [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmar...

2018-09-21 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/22497
  
Congratulation, @kiszk 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22509: [SPARK-25384][SQL] Clarify fromJsonForceNullableSchema w...

2018-09-21 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22509
  
lgtm, merging to master/2.4!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite I...

2018-09-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22461#discussion_r219392779
  
--- Diff: 
external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
 ---
@@ -462,6 +464,9 @@ class OracleIntegrationSuite extends 
DockerJDBCIntegrationSuite with SharedSQLCo
   .option("lowerBound", "2018-07-04 03:30:00.0")
   .option("upperBound", "2018-07-27 14:11:05.0")
   .option("numPartitions", 2)
+  .option("oracle.jdbc.mapDateToTimestamp", "false")
--- End diff --

Do we need this line?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] DECIMAL_OPERATIONS_ALLOW_PR...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22494
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] DECIMAL_OPERATIONS_ALLOW_PR...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22494
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96392/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22497: [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmar...

2018-09-21 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/22497
  
Thanks! merging to master.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22494: [SPARK-22036][SQL][followup] DECIMAL_OPERATIONS_ALLOW_PR...

2018-09-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22494
  
**[Test build #96392 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96392/testReport)**
 for PR 22494 at commit 
[`1ee9f02`](https://github.com/apache/spark/commit/1ee9f0208a3cb6de373e05366c19bf69967eecd8).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22497: [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayB...

2018-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22497


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22509: [SPARK-25384][SQL] Clarify fromJsonForceNullableSchema w...

2018-09-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22509
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



<    1   2   3   4   5   6   7   >