[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97477/
Test PASSed.


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
**[Test build #97477 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97477/testReport)** for PR 22608 at commit [`5d270f1`](https://github.com/apache/spark/commit/5d270f17dccbb2eac6d3c2ab8c12987e3d992086).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional in INT...

2018-10-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20433
  
@maropu Thanks! This is a great step toward making our Spark SQL parser fully compatible 
with ANSI SQL. Please continue the effort! 

cc @cloud-fan 


---




[GitHub] spark pull request #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional...

2018-10-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20433#discussion_r225784123
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -335,6 +335,12 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val ANSI_SQL_PARSER =
+    buildConf("spark.sql.parser.ansi.enabled")
+      .doc("When true, tries to conform to ANSI SQL syntax.")
+      .booleanConf
+      .createWithDefault(false)
--- End diff --

Since the next release is 3.0, we will turn this on by default. 
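
For context, a conf defined with `buildConf` like this is an ordinary session configuration, so it could be toggled as follows (a minimal sketch):

```scala
// Enable the ANSI parser mode for the current session.
spark.conf.set("spark.sql.parser.ansi.enabled", "true")
// Or equivalently via SQL:
spark.sql("SET spark.sql.parser.ansi.enabled=true")
```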


---




[GitHub] spark pull request #20433: [SPARK-23264][SQL] Make INTERVAL keyword optional...

2018-10-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20433#discussion_r225783980
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -335,6 +335,12 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val ANSI_SQL_PARSER =
--- End diff --

The legacy flag will be removed in the 3.0 release. 


---




[GitHub] spark pull request #22746: [SPARK-24499][SQL][DOC] Split the page of sql-pro...

2018-10-16 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/22746#discussion_r225783658
  
--- Diff: docs/sql-reference.md ---
@@ -0,0 +1,641 @@
+---
+layout: global
+title: Reference
+displayTitle: Reference
+---
+
+* Table of contents
+{:toc}
+
+## Data Types
+
+Spark SQL and DataFrames support the following data types:
+
+* Numeric types
+- `ByteType`: Represents 1-byte signed integer numbers.
--- End diff --

nit: use 2 space indent.


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22749
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22749
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4052/
Test PASSed.


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22749
  
**[Test build #97480 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97480/testReport)** for PR 22749 at commit [`25a6162`](https://github.com/apache/spark/commit/25a616286075ca4f0a7d528095b387172b05c6c3).


---




[GitHub] spark issue #22219: [SPARK-25224][SQL] Improvement of Spark SQL ThriftServer...

2018-10-16 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22219
  
cc @srinathshankar @yuchenhuo 


---




[GitHub] spark pull request #22746: [SPARK-24499][SQL][DOC] Split the page of sql-pro...

2018-10-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22746#discussion_r225780740
  
--- Diff: docs/sql-getting-started.md ---
@@ -0,0 +1,369 @@
+---
+layout: global
+title: Getting Started
+displayTitle: Getting Started
+---
+
+* Table of contents
+{:toc}
+
+## Starting Point: SparkSession
+
+The entry point into all functionality in Spark is the [`SparkSession`](api/scala/index.html#org.apache.spark.sql.SparkSession) class. To create a basic `SparkSession`, just use `SparkSession.builder()`:
+
+{% include_example init_session scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}
+
+The entry point into all functionality in Spark is the [`SparkSession`](api/java/index.html#org.apache.spark.sql.SparkSession) class. To create a basic `SparkSession`, just use `SparkSession.builder()`:
+
+{% include_example init_session java/org/apache/spark/examples/sql/JavaSparkSQLExample.java %}
+
+The entry point into all functionality in Spark is the [`SparkSession`](api/python/pyspark.sql.html#pyspark.sql.SparkSession) class. To create a basic `SparkSession`, just use `SparkSession.builder`:
+
+{% include_example init_session python/sql/basic.py %}
+
+The entry point into all functionality in Spark is the [`SparkSession`](api/R/sparkR.session.html) class. To initialize a basic `SparkSession`, just call `sparkR.session()`:
+
+{% include_example init_session r/RSparkSQLExample.R %}
+
+Note that when invoked for the first time, `sparkR.session()` initializes a global `SparkSession` singleton instance, and always returns a reference to this instance for successive invocations. In this way, users only need to initialize the `SparkSession` once, then SparkR functions like `read.df` will be able to access this global instance implicitly, and users don't need to pass the `SparkSession` instance around.
+
+`SparkSession` in Spark 2.0 provides builtin support for Hive features including the ability to
+write queries using HiveQL, access to Hive UDFs, and the ability to read data from Hive tables.
+To use these features, you do not need to have an existing Hive setup.
+
+## Creating DataFrames
+
+With a `SparkSession`, applications can create DataFrames from an [existing `RDD`](#interoperating-with-rdds),
+from a Hive table, or from [Spark data sources](#data-sources).

The link `[Spark data sources](#data-sources)` does not work after this 
change. Could you fix all the similar cases? Thanks!
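
For reference, the `init_session` example the page includes for Scala boils down to roughly the following (a sketch of the standard builder pattern, not the exact example file):

```scala
import org.apache.spark.sql.SparkSession

// Create (or reuse) the session that is the entry point to Spark SQL.
val spark = SparkSession
  .builder()
  .appName("Spark SQL basic example")
  .getOrCreate()
```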


---




[GitHub] spark pull request #22694: [SQL][CATALYST][MINOR] update some error comments

2018-10-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22694


---




[GitHub] spark issue #22694: [SQL][CATALYST][MINOR] update some error comments

2018-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22694
  
Merged to master and branch-2.4.


---




[GitHub] spark issue #22503: [SPARK-25493][SQL] Use auto-detection for CRLF in CSV da...

2018-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22503
  
@justinuang, okay. Mind rebasing this please?


---




[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22263
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22263
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97476/
Test PASSed.


---




[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22263
  
**[Test build #97476 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97476/testReport)** for PR 22263 at commit [`5e088b8`](https://github.com/apache/spark/commit/5e088b86822dd6b1bf4c3bb085fde3c96af03658).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22295: [SPARK-25255][PYTHON]Add getActiveSession to SparkSessio...

2018-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22295
  
@huaxingao, thanks for addressing comments. Would you mind rebasing it and 
resolving the conflicts?


---




[GitHub] spark issue #22752: [SPARK-24787][CORE] Revert hsync in EventLoggingListener...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22752
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97474/
Test PASSed.


---




[GitHub] spark issue #22752: [SPARK-24787][CORE] Revert hsync in EventLoggingListener...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22752
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22752: [SPARK-24787][CORE] Revert hsync in EventLoggingListener...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22752
  
**[Test build #97474 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97474/testReport)** for PR 22752 at commit [`a3f53c4`](https://github.com/apache/spark/commit/a3f53c41879e28d71d4dbd79d80a51e50d82ecee).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22482
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22482
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97475/
Test PASSed.


---




[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22482
  
**[Test build #97475 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97475/testReport)** for PR 22482 at commit [`5c74609`](https://github.com/apache/spark/commit/5c746090a8d5560f043754383656d54653a315dc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22729: [SPARK-25737][CORE] Remove JavaSparkContextVarargsWorkar...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22729
  
**[Test build #4380 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4380/testReport)** for PR 22729 at commit [`0860d27`](https://github.com/apache/spark/commit/0860d27a205d3dd3d94e6bbe2c9db49b7e432ef4).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22749
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22749
  
**[Test build #97479 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97479/testReport)** for PR 22749 at commit [`6a6fa45`](https://github.com/apache/spark/commit/6a6fa454e22728cc2ad8e5515cd587fe0be84b26).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22749
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97479/
Test FAILed.


---




[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225769471
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithArraySuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
+      Arrays.asList(11, 21, 31, 41)
+    ));
+    RECORDS.add(new Record(2,
+      Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
+      Arrays.asList(12, 22, 32, 42)
+    ));
+    RECORDS.add(new Record(3,
+      Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
+      Arrays.asList(13, 23, 33, 43)
+    ));
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithArrayFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
+      .load("src/test/resources/test-data/with-array-fields")
+      .as(encoder);
+
+    List<Record> records = dataset.collectAsList();
+
+    Assert.assertTrue(Util.equals(records, RECORDS));
+  }
+
+  private static StructType createSchema() {
+    StructField[] intervalFields = {
+      new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
+      new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
+    };
+    DataType intervalType = new StructType(intervalFields);
+
+    DataType intervalsType = new ArrayType(intervalType, true);
+
+    DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
+
+    StructField[] fields = {
+      new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
+      new StructField("intervals", intervalsType, true, Metadata.empty()),
+      new StructField("values", valuesType, true, Metadata.empty())
+    };
+    return new StructType(fields);
+  }
+
+  public static class Record {
+
+    private int id;
+    private List<Interval> intervals;
+    private List<Integer> values;
+
+    public Record() { }
+
+    Record(int id, List<Interval> intervals, List<Integer> values) {
+      this.id = id;
+      this.intervals = intervals;
+      this.values = values;
+    }
+
+    public int getId() {
+      return id;
+    }
+
+    public void setId(int id) {

[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225768857
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithArraySuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
+      Arrays.asList(11, 21, 31, 41)
+    ));
+    RECORDS.add(new Record(2,
+      Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
+      Arrays.asList(12, 22, 32, 42)
+    ));
+    RECORDS.add(new Record(3,
+      Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
+      Arrays.asList(13, 23, 33, 43)
+    ));
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithArrayFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
+      .load("src/test/resources/test-data/with-array-fields")
+      .as(encoder);
+
+    List<Record> records = dataset.collectAsList();
+
+    Assert.assertTrue(Util.equals(records, RECORDS));
+  }
+
+  private static StructType createSchema() {
+    StructField[] intervalFields = {
+      new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
+      new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
+    };
+    DataType intervalType = new StructType(intervalFields);
+
+    DataType intervalsType = new ArrayType(intervalType, true);
+
+    DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
+
+    StructField[] fields = {
+      new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
+      new StructField("intervals", intervalsType, true, Metadata.empty()),
+      new StructField("values", valuesType, true, Metadata.empty())
+    };
+    return new StructType(fields);
+  }
+
+  public static class Record {
+
+    private int id;
+    private List<Interval> intervals;
+    private List<Integer> values;
--- End diff --

Will this list of ints affect the test? If not, maybe we can get rid of it to 
simplify the test.


---




[GitHub] spark pull request #22745: [SPARK-21402][SQL][FOLLOW-UP] Fix java map of str...

2018-10-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22745#discussion_r225768707
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithMapSuite.java ---
@@ -0,0 +1,257 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.MapType;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithMapSuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      toMap(
+        Arrays.asList("a", "b"),
+        Arrays.asList(new Interval(111, 211), new Interval(121, 221))
+      ),
+      toMap(Arrays.asList("a", "b", "c"), Arrays.asList(11, 21, 31))
+    ));
+    RECORDS.add(new Record(2,
+      toMap(
+        Arrays.asList("a", "b"),
+        Arrays.asList(new Interval(112, 212), new Interval(122, 222))
+      ),
+      toMap(Arrays.asList("a", "b", "c"), Arrays.asList(12, 22, 32))
+    ));
+    RECORDS.add(new Record(3,
+      toMap(
+        Arrays.asList("a", "b"),
+        Arrays.asList(new Interval(113, 213), new Interval(123, 223))
+      ),
+      toMap(Arrays.asList("a", "b", "c"), Arrays.asList(13, 23, 33))
+    ));
+  }
+
+  private static <K, V> Map<K, V> toMap(Collection<K> keys, Collection<V> values) {
+    Map<K, V> map = new HashMap<>();
+    Iterator<K> keyI = keys.iterator();
+    Iterator<V> valueI = values.iterator();
+    while (keyI.hasNext() && valueI.hasNext()) {
+      map.put(keyI.next(), valueI.next());
+    }
+    return map;
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithMapFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
+      .load("src/test/resources/test-data/with-map-fields")
+      .as(encoder);
+
+    List<Record> records = dataset.collectAsList();
+
+    Assert.assertTrue(Util.equals(records, RECORDS));
+  }
+
+  private static StructType createSchema() {
+    StructField[] intervalFields = {
+      new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
+      new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
+    };
+    DataType intervalType = new StructType(intervalFields);
+
+    DataType intervalsType = new MapType(DataTypes.StringType, intervalType, true);
+
+    DataType valuesType = new MapType(DataTypes.StringType,

[GitHub] spark pull request #22724: [SPARK-25734][SQL] Literal should have a value co...

2018-10-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22724


---




[GitHub] spark issue #22724: [SPARK-25734][SQL] Literal should have a value correspon...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22724
  
thanks, merging to master!


---




[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225767103
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithArraySuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
+      Arrays.asList(11, 21, 31, 41)
+    ));
+    RECORDS.add(new Record(2,
+      Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
+      Arrays.asList(12, 22, 32, 42)
+    ));
+    RECORDS.add(new Record(3,
+      Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
+      Arrays.asList(13, 23, 33, 43)
+    ));
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithArrayFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
+      .load("src/test/resources/test-data/with-array-fields")
+      .as(encoder);
+
+    List<Record> records = dataset.collectAsList();
+
+    Assert.assertTrue(Util.equals(records, RECORDS));
+  }
+
+  private static StructType createSchema() {
+    StructField[] intervalFields = {
+      new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
+      new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
+    };
+    DataType intervalType = new StructType(intervalFields);
+
+    DataType intervalsType = new ArrayType(intervalType, true);
+
+    DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
+
+    StructField[] fields = {
+      new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
+      new StructField("intervals", intervalsType, true, Metadata.empty()),
+      new StructField("values", valuesType, true, Metadata.empty())
+    };
+    return new StructType(fields);
+  }
+
+  public static class Record {
+
+    private int id;
+    private List<Interval> intervals;
+    private List<Integer> values;
+
+    public Record() { }
+
+    Record(int id, List<Interval> intervals, List<Integer> values) {
+      this.id = id;
+      this.intervals = intervals;
+      this.values = values;
+    }
+
+    public int getId() {
+      return id;
+    }
+
+    public void setId(int id)

[GitHub] spark issue #22745: [SPARK-21402][SQL][FOLLOW-UP] Fix java map of structs de...

2018-10-16 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22745
  
It's a different issue; I think it's worth a new ticket.


---




[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/22732#discussion_r225764876
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala ---
@@ -81,11 +81,11 @@ case class UserDefinedFunction protected[sql] (
       f,
       dataType,
       exprs.map(_.expr),
+      nullableTypes.map(_.map(!_)).getOrElse(exprs.map(_ => false)),
--- End diff --

Hm, but we can't use `getParameterTypes` anymore; it won't work in Scala 2.12. Where 
the nullability info is definitely not available, should we be conservative and assume 
it all needs null handling?
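
A minimal sketch of that conservative fallback (hedged: this excerpt doesn't show the `ScalaUDF` parameter being built, so the flag's meaning, "this input needs null handling", is an assumption):

```scala
// Sketch only: when nullability is known, handle nulls for the
// non-nullable (primitive) inputs; when it is unknown, conservatively
// handle nulls for every input instead of none.
nullableTypes.map(_.map(!_)).getOrElse(exprs.map(_ => true))
```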


---




[GitHub] spark issue #22746: [SPARK-24499][SQL][DOC] Split the page of sql-programmin...

2018-10-16 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/22746
  
@gatorsmile Sorry for the delay on this; please have a look when you have time.


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22749
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22749
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4051/
Test PASSed.


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4050/



---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4050/
Test PASSed.


---




[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread maryannxue
Github user maryannxue commented on a diff in the pull request:

https://github.com/apache/spark/pull/22732#discussion_r225762708
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala ---
@@ -81,11 +81,11 @@ case class UserDefinedFunction protected[sql] (
       f,
       dataType,
       exprs.map(_.expr),
+      nullableTypes.map(_.map(!_)).getOrElse(exprs.map(_ => false)),
--- End diff --

In addition to what I just pointed out (the case where we try to get `inputSchemas` 
through `ScalaReflection.schemaFor` and get an exception for unrecognized types), there 
is another case where we could end up with an unspecified `nullableTypes`: when 
`UserDefinedFunction` is instantiated by calling the constructor rather than the 
`create` method.
In that case I assume it was created by an earlier version, and we should use the old 
logic, i.e., `ScalaReflection.getParameterTypes` 
(https://github.com/apache/spark/pull/22259/files#diff-57b3d87be744b7d79a9beacf8e5e5eb2L2153), 
to get the correct information for `nullableTypes`. Is that right, @cloud-fan @srowen?
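
A sketch of that old logic, assuming `f` is the UDF's function object and that non-primitive parameter types are the nullable ones:

```scala
// Sketch only: recover nullability via reflection, as the pre-#22259
// code did, when it was not recorded at creation time.
val inferredNullableTypes: Seq[Boolean] =
  ScalaReflection.getParameterTypes(f).map(!_.isPrimitive)
```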


---




[GitHub] spark issue #22749: [WIP][SPARK-25746][SQL] Refactoring ExpressionEncoder to...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22749
  
**[Test build #97479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97479/testReport)** for PR 22749 at commit [`6a6fa45`](https://github.com/apache/spark/commit/6a6fa454e22728cc2ad8e5515cd587fe0be84b26).


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4050/



---




[GitHub] spark pull request #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in P...

2018-10-16 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/21990#discussion_r225762148
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala ---
@@ -1136,4 +1121,27 @@ object SparkSession extends Logging {
       SparkSession.clearDefaultSession()
     }
   }
+
+  /**
+   * Initialize extensions if the user has defined a configurator class in their SparkConf.
+   * This class will be applied to the extensions passed into this function.
+   */
+  private[sql] def applyExtensionsFromConf(conf: SparkConf, extensions: SparkSessionExtensions) {
--- End diff --

Oh, I see, moving to the default constructor was not a good idea.
How about the first suggestion?


---




[GitHub] spark pull request #22263: [SPARK-25269][SQL] SQL interface support specify ...

2018-10-16 Thread wangyum
Github user wangyum commented on a diff in the pull request:

https://github.com/apache/spark/pull/22263#discussion_r225762035
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -288,6 +297,65 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
     }
   }
 
+  test("SQL interface support storageLevel(DISK_ONLY)") {
--- End diff --

How about this:
```scala
Seq("LAZY", "").foreach { isLazy =>
  Seq(true, false).foreach { withInvalidOptions =>
    Seq(true, false).foreach { withCacheTempView =>
      Map("DISK_ONLY" -> Disk, "MEMORY_ONLY" -> Memory).foreach {
        case (storageLevel, dataReadMethod) =>
          val testName = s"SQL interface support option: storageLevel: $storageLevel, " +
            s"isLazy: ${isLazy.equals("LAZY")}, " +
            s"withInvalidOptions: $withInvalidOptions, withCacheTempView: $withCacheTempView"
          val cacheOption = if (withInvalidOptions) {
            s"OPTIONS('storageLevel' '$storageLevel', 'a' '1', 'b' '2')"
          } else {
            s"OPTIONS('storageLevel' '$storageLevel')"
          }
          test(testName) {
            if (withCacheTempView) {
              withTempView("testSelect") {
                sql(s"CACHE $isLazy TABLE testSelect $cacheOption SELECT * FROM testData")
                assertCached(spark.table("testSelect"))
                val rddId = rddIdOf("testSelect")
                if (isLazy.equals("LAZY")) {
                  sql("SELECT COUNT(*) FROM testSelect").collect()
                }
                assert(isExpectStorageLevel(rddId, dataReadMethod))
              }
            } else {
              sql(s"CACHE $isLazy TABLE testData $cacheOption")
              assertCached(spark.table("testData"))
              val rddId = rddIdOf("testData")
              if (isLazy.equals("LAZY")) {
                sql("SELECT COUNT(*) FROM testData").collect()
              }
              assert(isExpectStorageLevel(rddId, dataReadMethod))
            }
          }
      }
    }
  }
}
```
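
For reference, the cache statements these generated tests exercise look like the following (extracted from the sketch above):

```scala
// Eager cache with an explicit storage level.
sql("CACHE TABLE testData OPTIONS('storageLevel' 'DISK_ONLY')")
// Lazy cache of a query result; materialized on the first access.
sql("CACHE LAZY TABLE testSelect OPTIONS('storageLevel' 'MEMORY_ONLY') SELECT * FROM testData")
```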


---




[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...

2018-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21588
  
@rxin and @gatorsmile, WDYT?

I already had to argue about Hadoop 3 support here and there (for instance, see 
[SPARK-18112](https://issues.apache.org/jira/browse/SPARK-18112) and 
[SPARK-18673](https://issues.apache.org/jira/browse/SPARK-18673)) and explain 
what's going on.

Ideally, it looks like we should go ahead with option 2 
(https://github.com/apache/spark/pull/21588#issuecomment-429272279), if I am not 
mistaken. If there are more concerns we should address before going ahead, I am 
definitely willing to help investigate as well.


---




[GitHub] spark issue #22612: [SPARK-24958] Add executors' process tree total memory i...

2018-10-16 Thread rezasafi
Github user rezasafi commented on the issue:

https://github.com/apache/spark/pull/22612
  
Jenkins retest this please.


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
**[Test build #97478 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97478/testReport)** for PR 22608 at commit [`4c9b886`](https://github.com/apache/spark/commit/4c9b886c1f23bbdd3d8e1ec7df25f03e45892d88).


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4049/



---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22608
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4049/
Test FAILed.


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4049/



---




[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-10-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22707#discussion_r225759293
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
@@ -227,18 +227,22 @@ case class InsertIntoHiveTable(
       // Newer Hive largely improves insert overwrite performance. As Spark uses older Hive
       // version and we may not want to catch up new Hive version every time. We delete the
       // Hive partition first and then load data file into the Hive partition.
-      if (oldPart.nonEmpty && overwrite) {
-        oldPart.get.storage.locationUri.foreach { uri =>
-          val partitionPath = new Path(uri)
-          val fs = partitionPath.getFileSystem(hadoopConf)
-          if (fs.exists(partitionPath)) {
-            if (!fs.delete(partitionPath, true)) {
-              throw new RuntimeException(
-                "Cannot remove partition directory '" + partitionPath.toString)
-            }
-            // Don't let Hive do overwrite operation since it is slower.
-            doHiveOverwrite = false
+      if (overwrite) {
+        val oldPartitionPath = oldPart.flatMap(_.storage.locationUri.map(new Path(_)))
+          .getOrElse {
+            ExternalCatalogUtils.generatePartitionPath(
+              partitionSpec,
+              partitionColumnNames,
+              HiveClientImpl.toHiveTable(table).getDataLocation)
--- End diff --

Looks correct, as I saw we assign `CatalogTable.storage.locationUri` to the 
HiveTable's data location.


---




[GitHub] spark issue #22608: [SPARK-25750][K8S][TESTS] Kerberos Support Integration T...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22608
  
**[Test build #97477 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97477/testReport)** for PR 22608 at commit [`5d270f1`](https://github.com/apache/spark/commit/5d270f17dccbb2eac6d3c2ab8c12987e3d992086).


---




[GitHub] spark pull request #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22379


---




[GitHub] spark issue #21588: [SPARK-24590][BUILD] Make Jenkins tests passed with hado...

2018-10-16 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/21588
  
Thanks @HyukjinKwon.
Upgrading Hive to 2.3.2 can fix 
[SPARK-12014](https://issues.apache.org/jira/browse/SPARK-12014), 
[SPARK-18673](https://issues.apache.org/jira/browse/SPARK-18673), 
[SPARK-24766](https://issues.apache.org/jira/browse/SPARK-24766) and 
[SPARK-25193](https://issues.apache.org/jira/browse/SPARK-25193).
It can also improve the performance of 
[SPARK-18107](https://issues.apache.org/jira/browse/SPARK-18107).
It doesn't seem to break backward compatibility; I have verified it in our 
production environment (Hive 1.2.1).


---




[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

2018-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22666
  
Woah .. let me resolve the conflicts tonight.


---




[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22379
  
Thanks all!


---




[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()

2018-10-16 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22379
  
Merged to master.


---




[GitHub] spark pull request #22707: [SPARK-25717][SQL] Insert overwrite a recreated e...

2018-10-16 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22707#discussion_r225756219
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
@@ -227,18 +227,22 @@ case class InsertIntoHiveTable(
       // Newer Hive largely improves insert overwrite performance. As Spark uses older Hive
       // version and we may not want to catch up new Hive version every time. We delete the
       // Hive partition first and then load data file into the Hive partition.
-      if (oldPart.nonEmpty && overwrite) {
-        oldPart.get.storage.locationUri.foreach { uri =>
-          val partitionPath = new Path(uri)
-          val fs = partitionPath.getFileSystem(hadoopConf)
-          if (fs.exists(partitionPath)) {
-            if (!fs.delete(partitionPath, true)) {
-              throw new RuntimeException(
-                "Cannot remove partition directory '" + partitionPath.toString)
-            }
-            // Don't let Hive do overwrite operation since it is slower.
-            doHiveOverwrite = false
+      if (overwrite) {
+        val oldPartitionPath = oldPart.flatMap(_.storage.locationUri.map(new Path(_)))
+          .getOrElse {
+            ExternalCatalogUtils.generatePartitionPath(
+              partitionSpec,
+              partitionColumnNames,
+              HiveClientImpl.toHiveTable(table).getDataLocation)
--- End diff --

> `HiveClientImpl.toHiveTable(table).getDataLocation` -> `new Path(table.location)`?

Yes, they get the same value. I'll change it, thank you very much.


---




[GitHub] spark issue #22748: [SPARK-25745][K8S] Improve docker-image-tool.sh script

2018-10-16 Thread ifilonenko
Github user ifilonenko commented on the issue:

https://github.com/apache/spark/pull/22748
  
There seems to be overlapping logic between this PR and 
https://github.com/apache/spark/pull/22681


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21990
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21990
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97472/
Test PASSed.


---




[GitHub] spark issue #21990: [SPARK-25003][PYSPARK] Use SessionExtensions in Pyspark

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21990
  
**[Test build #97472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97472/testReport)** for PR 21990 at commit [`d9b2a55`](https://github.com/apache/spark/commit/d9b2a55275b74c406d9f9c435bf1b53a6ef4b35a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22745: [SPARK-21402][SQL][FOLLOW-UP] Fix java map of structs de...

2018-10-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22745
  
Is this a separate PR because this part is pretty separable and you think it could be 
considered separately? If it's all part of one logical change that should go in 
together or not at all, it can stay in the original PR.


---




[GitHub] spark pull request #22598: [SPARK-25501][SS] Add kafka delegation token supp...

2018-10-16 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request:

https://github.com/apache/spark/pull/22598#discussion_r225752604
  
--- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/TokenUtil.scala ---
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.kafka010
+
+import java.text.SimpleDateFormat
+import java.util.Properties
+
+import org.apache.hadoop.io.Text
+import org.apache.hadoop.security.token.{Token, TokenIdentifier}
+import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier
+import org.apache.kafka.clients.CommonClientConfigs
+import org.apache.kafka.clients.admin.{AdminClient, CreateDelegationTokenOptions}
+import org.apache.kafka.common.config.SaslConfigs
+import org.apache.kafka.common.security.token.delegation.DelegationToken
+
+import org.apache.spark.SparkConf
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config._
+
+private[kafka010] object TokenUtil extends Logging {
+  private[kafka010] val TOKEN_KIND = new Text("KAFKA_DELEGATION_TOKEN")
+  private[kafka010] val TOKEN_SERVICE = new Text("kafka.server.delegation.token")
+
+  private[kafka010] class KafkaDelegationTokenIdentifier extends AbstractDelegationTokenIdentifier {
+    override def getKind: Text = TOKEN_KIND;
+  }
+
+  private def printToken(token: DelegationToken): Unit = {
+    if (log.isDebugEnabled) {
+      val dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm")
+      logDebug("%-15s %-30s %-15s %-25s %-15s %-15s %-15s".format(
+        "TOKENID", "HMAC", "OWNER", "RENEWERS", "ISSUEDATE", "EXPIRYDATE", "MAXDATE"))
+      val tokenInfo = token.tokenInfo
+      logDebug("%-15s [hidden] %-15s %-25s %-15s %-15s %-15s".format(
+        tokenInfo.tokenId,
+        tokenInfo.owner,
+        tokenInfo.renewersAsString,
+        dateFormat.format(tokenInfo.issueTimestamp),
+        dateFormat.format(tokenInfo.expiryTimestamp),
+        dateFormat.format(tokenInfo.maxTimestamp)))
+    }
+  }
+
+  private[kafka010] def createAdminClientProperties(sparkConf: SparkConf): Properties = {
+    val adminClientProperties = new Properties
+
+    val bootstrapServers = sparkConf.get(KAFKA_BOOTSTRAP_SERVERS)
+    require(bootstrapServers.nonEmpty, s"Tried to obtain kafka delegation token but bootstrap " +
+      "servers not configured.")
+    adminClientProperties.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers.get)
+
+    val protocol = sparkConf.get(KAFKA_SECURITY_PROTOCOL)
+    adminClientProperties.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, protocol)
+    if (protocol.endsWith("SSL")) {
+      logInfo("SSL protocol detected.")
+      sparkConf.get(KAFKA_TRUSTSTORE_LOCATION).foreach { truststoreLocation =>
+        adminClientProperties.put("ssl.truststore.location", truststoreLocation)
+      }
+      sparkConf.get(KAFKA_TRUSTSTORE_PASSWORD).foreach { truststorePassword =>
+        adminClientProperties.put("ssl.truststore.password", truststorePassword)
+      }
+    } else {
+      logWarning("Obtaining kafka delegation token through plain communication channel. Please " +
+        "consider the security impact.")
+    }
+
+    // There are multiple possibilities to log in:
+    // - Keytab is provided -> try to log in with kerberos module using kafka's dynamic JAAS
+    //   configuration.
+    // - Keytab not provided -> try to log in with JVM global security configuration
+    //   which can be configured for example with 'java.security.auth.login.config'.
+    //   For this no additional parameter needed.
+    KafkaSecurityHelper.getKeytabJaasParams(sparkConf).foreach { jaasParams =>
+      logInfo("Keytab detected, using it for login.")
+      adminClientProperties.put(SaslConfigs.SASL_MECHANISM, SaslConfigs.GSSAPI_MECHANISM)

[GitHub] spark issue #22725: [SPARK-24610][[CORE][FOLLOW-UP]fix reading small files v...

2018-10-16 Thread 10110346
Github user 10110346 commented on the issue:

https://github.com/apache/spark/pull/22725
  
@tgravescs OK, I will do it, thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225752208
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
--- End diff --

If we remove `createSchema`, we can remove Line 35 ~ 40, too.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225751969
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithArraySuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
+      Arrays.asList(11, 21, 31, 41)
+    ));
+    RECORDS.add(new Record(2,
+      Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
+      Arrays.asList(12, 22, 32, 42)
+    ));
+    RECORDS.add(new Record(3,
+      Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
+      Arrays.asList(13, 23, 33, 43)
+    ));
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithArrayFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
+      .load("src/test/resources/test-data/with-array-fields")
+      .as(encoder);
+
+    List<Record> records = dataset.collectAsList();
+
+    Assert.assertTrue(Util.equals(records, RECORDS));
+  }
+
+  private static StructType createSchema() {
+    StructField[] intervalFields = {
+      new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
+      new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
+    };
+    DataType intervalType = new StructType(intervalFields);
+
+    DataType intervalsType = new ArrayType(intervalType, true);
+
+    DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
+
+    StructField[] fields = {
+      new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
+      new StructField("intervals", intervalsType, true, Metadata.empty()),
+      new StructField("values", valuesType, true, Metadata.empty())
+    };
+    return new StructType(fields);
+  }
+
+  public static class Record {
+
+    private int id;
+    private List<Interval> intervals;
+    private List<Integer> values;
+
+    public Record() { }
+
+    Record(int id, List<Interval> intervals, List<Integer> values) {
+      this.id = id;
+      this.intervals = intervals;
+      this.values = values;
+    }
+
+    public int getId() {
+      return id;
+    }
+
+    public void setId(int 

[GitHub] spark issue #22655: [SPARK-25666][PYTHON] Internally document type conversio...

2018-10-16 Thread BryanCutler
Github user BryanCutler commented on the issue:

https://github.com/apache/spark/pull/22655
  
Thanks @viirya !


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225751513
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithArraySuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
+      Arrays.asList(11, 21, 31, 41)
+    ));
+    RECORDS.add(new Record(2,
+      Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
+      Arrays.asList(12, 22, 32, 42)
+    ));
+    RECORDS.add(new Record(3,
+      Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
+      Arrays.asList(13, 23, 33, 43)
+    ));
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithArrayFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
+      .load("src/test/resources/test-data/with-array-fields")
+      .as(encoder);
+
+    List<Record> records = dataset.collectAsList();
+
+    Assert.assertTrue(Util.equals(records, RECORDS));
+  }
+
+  private static StructType createSchema() {
+    StructField[] intervalFields = {
+      new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
+      new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
+    };
+    DataType intervalType = new StructType(intervalFields);
+
+    DataType intervalsType = new ArrayType(intervalType, true);
+
+    DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
+
+    StructField[] fields = {
+      new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
+      new StructField("intervals", intervalsType, true, Metadata.empty()),
+      new StructField("values", valuesType, true, Metadata.empty())
+    };
+    return new StructType(fields);
+  }
+
+  public static class Record {
+
+    private int id;
+    private List<Interval> intervals;
+    private List<Integer> values;
+
+    public Record() { }
+
+    Record(int id, List<Interval> intervals, List<Integer> values) {
+      this.id = id;
+      this.intervals = intervals;
+      this.values = values;
+    }
+
+    public int getId() {
+      return id;
+    }
+
+    public void setId(int 

[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225751459
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithArraySuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
+      Arrays.asList(11, 21, 31, 41)
+    ));
+    RECORDS.add(new Record(2,
+      Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
+      Arrays.asList(12, 22, 32, 42)
+    ));
+    RECORDS.add(new Record(3,
+      Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
+      Arrays.asList(13, 23, 33, 43)
+    ));
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithArrayFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
+      .load("src/test/resources/test-data/with-array-fields")
+      .as(encoder);
+
+    List<Record> records = dataset.collectAsList();
+
+    Assert.assertTrue(Util.equals(records, RECORDS));
+  }
+
+  private static StructType createSchema() {
+    StructField[] intervalFields = {
+      new StructField("startTime", DataTypes.LongType, true, Metadata.empty()),
+      new StructField("endTime", DataTypes.LongType, true, Metadata.empty())
+    };
+    DataType intervalType = new StructType(intervalFields);
+
+    DataType intervalsType = new ArrayType(intervalType, true);
+
+    DataType valuesType = new ArrayType(DataTypes.IntegerType, true);
+
+    StructField[] fields = {
+      new StructField("id", DataTypes.IntegerType, true, Metadata.empty()),
+      new StructField("intervals", intervalsType, true, Metadata.empty()),
+      new StructField("values", valuesType, true, Metadata.empty())
+    };
+    return new StructType(fields);
+  }
+
+  public static class Record {
+
+    private int id;
+    private List<Interval> intervals;
+    private List<Integer> values;
+
+    public Record() { }
+
+    Record(int id, List<Interval> intervals, List<Integer> values) {
+      this.id = id;
+      this.intervals = intervals;
+      this.values = values;
+    }
+
+    public int getId() {
+      return id;
+    }
+
+    public void setId(int 

[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22263
  
**[Test build #97476 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97476/testReport)** for PR 22263 at commit [`5e088b8`](https://github.com/apache/spark/commit/5e088b86822dd6b1bf4c3bb085fde3c96af03658).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22482: WIP - [SPARK-10816][SS] Support session window natively

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22482
  
**[Test build #97475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97475/testReport)** for PR 22482 at commit [`5c74609`](https://github.com/apache/spark/commit/5c746090a8d5560f043754383656d54653a315dc).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22263
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4048/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22263
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22729: [SPARK-25737][CORE] Remove JavaSparkContextVarargsWorkar...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22729
  
**[Test build #4380 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4380/testReport)** for PR 22729 at commit [`0860d27`](https://github.com/apache/spark/commit/0860d27a205d3dd3d94e6bbe2c9db49b7e432ef4).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225749733
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithArraySuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
+      Arrays.asList(11, 21, 31, 41)
+    ));
+    RECORDS.add(new Record(2,
+      Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
+      Arrays.asList(12, 22, 32, 42)
+    ));
+    RECORDS.add(new Record(3,
+      Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
+      Arrays.asList(13, 23, 33, 43)
+    ));
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithArrayFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
--- End diff --

@vofque Please note the `startTime` and `endTime` field names; they are 
case-sensitive.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225749174
  
--- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaBeanWithArraySuite.java ---
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package test.org.apache.spark.sql;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Encoder;
+import org.apache.spark.sql.Encoders;
+import org.apache.spark.sql.test.TestSparkSession;
+import org.apache.spark.sql.types.ArrayType;
+import org.apache.spark.sql.types.DataType;
+import org.apache.spark.sql.types.DataTypes;
+import org.apache.spark.sql.types.Metadata;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+public class JavaBeanWithArraySuite {
+
+  private static final List<Record> RECORDS = new ArrayList<>();
+
+  static {
+    RECORDS.add(new Record(1,
+      Arrays.asList(new Interval(111, 211), new Interval(121, 221)),
+      Arrays.asList(11, 21, 31, 41)
+    ));
+    RECORDS.add(new Record(2,
+      Arrays.asList(new Interval(112, 212), new Interval(122, 222)),
+      Arrays.asList(12, 22, 32, 42)
+    ));
+    RECORDS.add(new Record(3,
+      Arrays.asList(new Interval(113, 213), new Interval(123, 223)),
+      Arrays.asList(13, 23, 33, 43)
+    ));
+  }
+
+  private TestSparkSession spark;
+
+  @Before
+  public void setUp() {
+    spark = new TestSparkSession();
+  }
+
+  @After
+  public void tearDown() {
+    spark.stop();
+    spark = null;
+  }
+
+  @Test
+  public void testBeanWithArrayFieldsDeserialization() {
+
+    StructType schema = createSchema();
+    Encoder<Record> encoder = Encoders.bean(Record.class);
+
+    Dataset<Record> dataset = spark
+      .read()
+      .format("json")
+      .schema(schema)
--- End diff --

I'm wondering if we can use the newer, neater approach in this PR. Then we 
can remove `createSchema()` here.
```scala
- .schema(schema)
+ .schema("id int, intervals array<struct<startTime: long, endTime: long>>, values array<int>")
```
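
For what it's worth, a minimal sketch of how the suggested form would read in practice; the DDL spelling is an inference from `createSchema()` in the quoted diff, not something verified against this PR:
```scala
// Sketch under assumptions: DataFrameReader.schema accepts a DDL schema
// string (Spark 2.3+), so the hand-built createSchema() helper and its
// StructType/StructField imports can be dropped. Field names and types are
// inferred from the quoted diff; the exact DDL spelling is an assumption.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[1]").getOrCreate()
val df = spark.read
  .format("json")
  .schema("id int, intervals array<struct<startTime: long, endTime: long>>, values array<int>")
  .load("src/test/resources/test-data/with-array-fields")
```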


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22288
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22288
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97469/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22288
  
**[Test build #97469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97469/testReport)** for PR 22288 at commit [`2c5a753`](https://github.com/apache/spark/commit/2c5a75354d36d08199b9805a7513a4ec4a546a27).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22752: [SPARK-24787][CORE] Revert hsync in EventLoggingListener...

2018-10-16 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22752
  
**[Test build #97474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97474/testReport)** for PR 22752 at commit [`a3f53c4`](https://github.com/apache/spark/commit/a3f53c41879e28d71d4dbd79d80a51e50d82ecee).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22752: [SPARK-24787][CORE] Revert hsync in EventLoggingListener...

2018-10-16 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22752
  
ok to test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22752: [SPARK-24787][CORE] Revert hsync in EventLoggingListener...

2018-10-16 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22752
  
add to whitelist


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22752: [SPARK-24787][CORE] Revert hsync in EventLoggingListener...

2018-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22752
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22752: [SPARK-24787][CORE] Revert hsync in EventLoggingL...

2018-10-16 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/22752

[SPARK-24787][CORE] Revert hsync in EventLoggingListener and make FsHistoryProvider to read lastBlockBeingWritten data for logs

## What changes were proposed in this pull request?

`hsync` was added as part of SPARK-19531 to surface the latest data in the 
History Server UI, but it causes performance overhead and leads to dropping 
many history log events. `hsync` uses `FileChannel.force` to sync the data to 
disk along the whole write pipeline; it is a costly operation that makes the 
application face overhead and drop events.

I think getting the latest data into the History Server can be done in a 
different way, with no impact on the application while it writes events. 
There is an API, `DFSInputStream.getFileLength()`, which returns the file 
length including the `lastBlockBeingWrittenLength` (unlike 
`FileStatus.getLen()`). This API can be used, when the file status length and 
the previously cached length are equal, to verify whether any new data has 
been written; if the data length has grown, the history server can update the 
in-progress history log. I also made this change configurable, with a default 
value of false; it can be enabled on the history server if users want to see 
the updated data in the UI.
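
To make the idea concrete, here is a minimal, hedged sketch of the length check; `shouldReparse` and `cachedLength` are illustrative names, not this PR's actual code:
```scala
// Sketch only: decide whether an in-progress event log has new data by
// comparing its length *including* the HDFS block still being written
// against a previously cached length. FileStatus.getLen() misses the last
// block; DFSInputStream.getFileLength() includes it.
import org.apache.hadoop.fs.{FileStatus, FileSystem}
import org.apache.hadoop.hdfs.DFSInputStream

def shouldReparse(fs: FileSystem, status: FileStatus, cachedLength: Long): Boolean = {
  if (status.getLen != cachedLength) {
    true // FileStatus already shows growth: reparse.
  } else {
    // Lengths look equal; ask HDFS for the length including the last
    // block being written.
    val in = fs.open(status.getPath)
    try {
      in.getWrappedStream match {
        case dfs: DFSInputStream => dfs.getFileLength > cachedLength
        case _ => false // non-HDFS file systems expose nothing beyond getLen()
      }
    } finally {
      in.close()
    }
  }
}
```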

## How was this patch tested?

Added a new test and verified manually. With the added conf 
`spark.history.fs.inProgressAbsoluteLengthCheck.enabled=true`, the history 
server reads the logs including the last block being written and updates the 
Web UI with the latest data.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-24787

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22752.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22752


commit a3f53c41879e28d71d4dbd79d80a51e50d82ecee
Author: Devaraj K 
Date:   2018-10-16T23:50:20Z

[SPARK-24787][CORE] Revert hsync in EventLoggingListener and make
FsHistoryProvider to read lastBlockBeingWritten data for logs




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22708: [SPARK-21402][SQL] Fix java array of structs dese...

2018-10-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22708#discussion_r225740504
  
--- Diff: sql/core/src/test/resources/test-data/with-array-fields ---
@@ -0,0 +1,3 @@
+{ "id": 1, "intervals": [{ "startTime": 111, "endTime": 211 }, { 
"startTime": 121, "endTime": 221 }], "values": [11, 21, 31, 41]}
--- End diff --

Could you rename this to `with-array-fields.json`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22624: [SPARK-23781][CORE] Add base class for token renewal fun...

2018-10-16 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22624
  
There's stuff that I need to fix for the recent changes in the kubernetes 
code; also I'm going to do the work I meant to do for SPARK-25693 here, since 
it requires as much testing and isn't that much more code. So hang on a bit.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22732: [SPARK-25044][FOLLOW-UP] Change ScalaUDF construc...

2018-10-16 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/22732#discussion_r225735714
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala ---
@@ -81,11 +81,11 @@ case class UserDefinedFunction protected[sql] (
   f,
   dataType,
   exprs.map(_.expr),
+  nullableTypes.map(_.map(!_)).getOrElse(exprs.map(_ => false)),
--- End diff --

Yes, that's right. There are a number of UDFs in MLlib, etc. that have inputs 
of type `Any`, which isn't great, but I wanted to work around them rather 
than change them for now.
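
For illustration, a small self-contained sketch of what that expression computes; the surrounding names are assumptions, not Spark's internals:
```scala
// Sketch only: invert optional per-input nullability flags into "null-safe"
// flags, defaulting to false (treat every input as not null-safe) when no
// type information is available. Names here are illustrative.
object InputsNullSafeDemo {
  def inputsNullSafe(nullableTypes: Option[Seq[Boolean]], numInputs: Int): Seq[Boolean] =
    nullableTypes.map(_.map(!_)).getOrElse(Seq.fill(numInputs)(false))

  def main(args: Array[String]): Unit = {
    println(inputsNullSafe(Some(Seq(true, false)), 2)) // List(false, true)
    println(inputsNullSafe(None, 2))                   // List(false, false)
  }
}
```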


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22670: [SPARK-25631][SPARK-25632][SQL][TEST] Improve the test r...

2018-10-16 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22670
  
@srowen Thank you very much.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22670: [SPARK-25631][SPARK-25632][SQL][TEST] Improve the...

2018-10-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22670


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22670: [SPARK-25631][SPARK-25632][SQL][TEST] Improve the test r...

2018-10-16 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22670
  
Merged to master


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22381: [SPARK-25394][CORE] Add an application status metrics so...

2018-10-16 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/spark/pull/22381
  
Thanks @vanzin!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22381: [SPARK-25394][CORE] Add an application status met...

2018-10-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22381


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


