[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20685
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88033/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20685
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20464
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88039/
Test PASSed.


---




[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20685
  
**[Test build #88033 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88033/testReport)**
 for PR 20685 at commit 
[`4e4f075`](https://github.com/apache/spark/commit/4e4f07544d17ea0493b4c5887d8215550eedc424).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20464
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20464
  
**[Test build #88039 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88039/testReport)**
 for PR 20464 at commit 
[`8c1a8ec`](https://github.com/apache/spark/commit/8c1a8ec46ea28ce17fcaae42aa7b9955cb34bfc8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20753: [SPARK-23582][SQL] StaticInvoke should support in...

2018-03-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20753#discussion_r172755611
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -133,8 +134,21 @@ case class StaticInvoke(
   override def nullable: Boolean = needNullCheck || returnNullable
   override def children: Seq[Expression] = arguments
 
-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
+  override def eval(input: InternalRow): Any = {
+    if (staticObject == null) {
+      throw new RuntimeException("The static class cannot be null.")
+    }
+
+    val parmTypes = arguments.map(e =>
+      CallMethodViaReflection.typeMapping.getOrElse(e.dataType,
+        Seq(e.dataType.asInstanceOf[ObjectType].cls))(0))
+    val parms = arguments.map(e => e.eval(input).asInstanceOf[Object])
--- End diff --

Do we need null checks here for inputs? Also, can we add a common function in 
`InvokeLike` to handle input arguments for other `InvokeLike` exprs? (I mean the 
interpreted version of `InvokeLike.prepareArguments`).
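
A rough Python sketch of the idea raised here (not Spark's Scala code): evaluate the arguments in one shared helper, check them for nulls, and only then look up and call the function by name. The names `prepare_arguments`, `static_invoke` and the `propagate_null` flag are hypothetical and chosen only for illustration.

```python
import math
from typing import Any, Callable, List, Mapping, Sequence, Tuple

# An "expression" here is just a callable evaluated against an input row.
Expr = Callable[[Mapping[str, Any]], Any]


def prepare_arguments(
    arguments: Sequence[Expr], row: Mapping[str, Any], propagate_null: bool
) -> Tuple[List[Any], bool]:
    """Evaluate every argument once and report whether the call should
    short-circuit to None because an input evaluated to None."""
    values = [arg(row) for arg in arguments]
    has_null = propagate_null and any(v is None for v in values)
    return values, has_null


def static_invoke(
    target: Any,
    function_name: str,
    arguments: Sequence[Expr],
    row: Mapping[str, Any],
    propagate_null: bool = True,
) -> Any:
    """Look up `function_name` on `target` by name and call it, returning
    None without calling when a null input must be propagated."""
    values, has_null = prepare_arguments(arguments, row, propagate_null)
    if has_null:
        return None
    return getattr(target, function_name)(*values)


# Usage: invoke math.sqrt by name, once with a value and once with a null input.
print(static_invoke(math, "sqrt", [lambda r: r["x"]], {"x": 9.0}))
print(static_invoke(math, "sqrt", [lambda r: r.get("y")], {"x": 9.0}))
```

Keeping the argument handling in one helper means every invoke-style expression could share the same null policy instead of repeating it.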


---




[GitHub] spark pull request #20753: [SPARK-23582][SQL] StaticInvoke should support in...

2018-03-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20753#discussion_r172754548
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -133,8 +134,21 @@ case class StaticInvoke(
   override def nullable: Boolean = needNullCheck || returnNullable
   override def children: Seq[Expression] = arguments
 
-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
+  override def eval(input: InternalRow): Any = {
+    if (staticObject == null) {
--- End diff --

Do we need this check?


---




[GitHub] spark issue #20682: [SPARK-23522][Python] always use sys.exit over builtin e...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20682
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20682: [SPARK-23522][Python] always use sys.exit over builtin e...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20682
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88036/
Test FAILed.


---




[GitHub] spark issue #20682: [SPARK-23522][Python] always use sys.exit over builtin e...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20682
  
**[Test build #88036 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88036/testReport)**
 for PR 20682 at commit 
[`c1b7413`](https://github.com/apache/spark/commit/c1b7413d356dafdc607683292bfff7b1a57cdf27).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-check-en...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20735
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-check-en...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20735
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88038/
Test PASSed.


---




[GitHub] spark issue #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-check-en...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20735
  
**[Test build #88038 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88038/testReport)**
 for PR 20735 at commit 
[`a9d3fa5`](https://github.com/apache/spark/commit/a9d3fa5ead2ebec5f44615dc272056fe59f6130a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88032/
Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88034/
Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20754: [SPARK-23287][MESOS] Spark scheduler does not remove ini...

2018-03-06 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20754
  
@devaraj-kavali can you add a test for this?
cc @susanxhuynh 


---




[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20464
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1347/
Test PASSed.


---




[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20464
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20687: [SPARK-23500][SQL] Fix complex type simplification rules...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20687
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20687: [SPARK-23500][SQL] Fix complex type simplification rules...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20687
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88035/
Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88034 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88034/testReport)**
 for PR 20756 at commit 
[`b8f171e`](https://github.com/apache/spark/commit/b8f171e5492f3156767589ad4c6ed458cb24615c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88032 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88032/testReport)**
 for PR 20756 at commit 
[`0c48a9b`](https://github.com/apache/spark/commit/0c48a9ba2551435e3794b4e98002423b9a8d527b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20687: [SPARK-23500][SQL] Fix complex type simplification rules...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20687
  
**[Test build #88035 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88035/testReport)**
 for PR 20687 at commit 
[`63c7098`](https://github.com/apache/spark/commit/63c7098fc4b14af7859580682f17c73abcd7ff08).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20678: [SPARK-23380][PYTHON] Adds a conf for Arrow fallb...

2018-03-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20678#discussion_r172751054
  
--- Diff: docs/sql-programming-guide.md ---
@@ -1689,6 +1689,10 @@ using the call `toPandas()` and when creating a 
Spark DataFrame from a Pandas Da
 `createDataFrame(pandas_df)`. To use Arrow when executing these calls, 
users need to first set
 the Spark configuration 'spark.sql.execution.arrow.enabled' to 'true'. 
This is disabled by default.
 
+In addition, optimizations enabled by 'spark.sql.execution.arrow.enabled' 
could fallback automatically
+to non-optimized implementations if an error occurs before the actual 
computation within Spark.
--- End diff --

very minor nit: `non-optimized implementations` --> `non-Arrow optimization 
implementation`

this matches the description in the paragraph below


---




[GitHub] spark pull request #20678: [SPARK-23380][PYTHON] Adds a conf for Arrow fallb...

2018-03-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20678#discussion_r172751164
  
--- Diff: docs/sql-programming-guide.md ---
@@ -1800,6 +1800,7 @@ working with timestamps in `pandas_udf`s to get the 
best performance, see
 ## Upgrading From Spark SQL 2.3 to 2.4
 
   - Since Spark 2.4, Spark maximizes the usage of a vectorized ORC reader 
for ORC files by default. To do that, `spark.sql.orc.impl` and 
`spark.sql.orc.filterPushdown` change their default values to `native` and 
`true` respectively.
+  - In PySpark, when Arrow optimization is enabled, previously `toPandas` 
just failed when Arrow optimization is unabled to be used whereas 
`createDataFrame` from Pandas DataFrame allowed the fallback to 
non-optimization. Now, both `toPandas` and `createDataFrame` from Pandas 
DataFrame allow the fallback by default, which can be switched by 
`spark.sql.execution.arrow.fallback.enabled`.
--- End diff --

`which can be switched by` -> `which can be switched on by` or `which can 
be switched on with`
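
The fallback behaviour described in the quoted doc change can be pictured with a small, generic Python sketch. This is not PySpark's implementation: `convert_with_fallback`, `failing_optimized` and the `fallback_enabled` parameter are hypothetical stand-ins for the Arrow path, a conversion that fails up front, and the `spark.sql.execution.arrow.fallback.enabled` setting.

```python
import warnings
from typing import Callable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def convert_with_fallback(
    data: T,
    optimized: Callable[[T], R],
    plain: Callable[[T], R],
    fallback_enabled: bool = True,
) -> R:
    """Try the optimized path first; on failure either fall back to the
    plain path (with a warning) or re-raise, depending on the flag."""
    try:
        return optimized(data)
    except Exception as exc:
        if not fallback_enabled:
            raise
        warnings.warn(f"optimized path failed ({exc}); using the plain path")
        return plain(data)


def failing_optimized(rows):
    # Stand-in for an optimized conversion that rejects its input up front.
    raise TypeError("unsupported type")
```

With the flag on, `convert_with_fallback([1, 2], failing_optimized, list)` warns and returns the plain result; with `fallback_enabled=False` the original error reaches the caller.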


---




[GitHub] spark issue #20464: [SPARK-23291][SQL][R] R's substr should not reduce start...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20464
  
**[Test build #88039 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88039/testReport)**
 for PR 20464 at commit 
[`8c1a8ec`](https://github.com/apache/spark/commit/8c1a8ec46ea28ce17fcaae42aa7b9955cb34bfc8).


---




[GitHub] spark pull request #20464: [SPARK-23291][SQL][R] R's substr should not reduc...

2018-03-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20464#discussion_r172751184
  
--- Diff: docs/sparkr.md ---
@@ -663,3 +663,7 @@ You can inspect the search path in R with 
[`search()`](https://stat.ethz.ch/R-ma
  - The `stringsAsFactors` parameter was previously ignored with `collect`, 
for example, in `collect(createDataFrame(iris), stringsAsFactors = TRUE))`. It 
has been corrected.
  - For `summary`, option for statistics to compute has been added. Its 
output is changed from that from `describe`.
  - A warning can be raised if versions of SparkR package and the Spark JVM 
do not match.
+
+## Upgrading to Spark 2.4.0
+
+ - The `start` parameter of `substr` method was wrongly subtracted by one, 
previously. In other words, the index specified by `start` parameter was 
considered as 0-base. This can lead to inconsistent substring results and also 
does not match with the behaviour with `substr` in R. It has been fixed so the 
`start` parameter of `substr` method is now 1-base, e.g., `substr(df$a, 2, 5)` 
should be changed to `substr(df$a, 1, 4)`.
--- End diff --

Yes. Added.


---




[GitHub] spark pull request #20464: [SPARK-23291][SQL][R] R's substr should not reduc...

2018-03-06 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20464#discussion_r172750404
  
--- Diff: docs/sparkr.md ---
@@ -663,3 +663,7 @@ You can inspect the search path in R with 
[`search()`](https://stat.ethz.ch/R-ma
  - The `stringsAsFactors` parameter was previously ignored with `collect`, 
for example, in `collect(createDataFrame(iris), stringsAsFactors = TRUE))`. It 
has been corrected.
  - For `summary`, option for statistics to compute has been added. Its 
output is changed from that from `describe`.
  - A warning can be raised if versions of SparkR package and the Spark JVM 
do not match.
+
+## Upgrading to Spark 2.4.0
+
+ - The `start` parameter of `substr` method was wrongly subtracted by one, 
previously. In other words, the index specified by `start` parameter was 
considered as 0-base. This can lead to inconsistent substring results and also 
does not match with the behaviour with `substr` in R. It has been fixed so the 
`start` parameter of `substr` method is now 1-base, e.g., `substr(df$a, 2, 5)` 
should be changed to `substr(df$a, 1, 4)`.
--- End diff --

could you add
`method is now 1-base, e.g., therefore to get the same result as 
substr(df$a, 2, 5), it should be changed to substr(df$a, 1, 4)`
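
To make the index arithmetic concrete, here is a small Python model of the two behaviours on a plain string. It is only an illustration of the migration note, not SparkR code: `substr_fixed` and `substr_before_fix` are hypothetical names, and the old behaviour is modelled only for `start >= 2`.

```python
def substr_fixed(s: str, start: int, stop: int) -> str:
    """1-based and inclusive on both ends, matching base R's substr."""
    return s[start - 1:stop]


def substr_before_fix(s: str, start: int, stop: int) -> str:
    """Model of the old behaviour: `start` is reduced by one while the
    length (stop - start + 1) is kept. Only start >= 2 is modelled."""
    if start < 2:
        raise ValueError("this sketch only models start >= 2")
    length = stop - start + 1
    shifted_start = start - 1
    return substr_fixed(s, shifted_start, shifted_start + length - 1)


text = "abcdefgh"
print(substr_before_fix(text, 2, 5))  # what substr(x, 2, 5) used to return
print(substr_fixed(text, 2, 5))       # what substr(x, 2, 5) returns after the fix
print(substr_fixed(text, 1, 4))       # the call that reproduces the old result
```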


---




[GitHub] spark issue #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-check-en...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20735
  
**[Test build #88038 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88038/testReport)**
 for PR 20735 at commit 
[`a9d3fa5`](https://github.com/apache/spark/commit/a9d3fa5ead2ebec5f44615dc272056fe59f6130a).


---




[GitHub] spark issue #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-check-en...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20735
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-check-en...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20735
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1346/
Test PASSed.


---




[GitHub] spark issue #20696: [SPARK-23525] [SQL] Support ALTER TABLE CHANGE COLUMN CO...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20696
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88031/
Test PASSed.


---




[GitHub] spark issue #20696: [SPARK-23525] [SQL] Support ALTER TABLE CHANGE COLUMN CO...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20696
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20696: [SPARK-23525] [SQL] Support ALTER TABLE CHANGE COLUMN CO...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20696
  
**[Test build #88031 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88031/testReport)**
 for PR 20696 at commit 
[`48fc338`](https://github.com/apache/spark/commit/48fc338dc30720aa05e1871d69bad66ae2dfaa59).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20757
  
**[Test build #88037 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88037/testReport)**
 for PR 20757 at commit 
[`d53cfea`](https://github.com/apache/spark/commit/d53cfea1be24c1e0ae6fce6653a0f686719cd1c4).


---




[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20757
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1345/
Test PASSed.


---




[GitHub] spark issue #20757: [SPARK-23595][SQL] ValidateExternalType should support i...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20757
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #20757: [SPARK-23595][SQL] ValidateExternalType should su...

2018-03-06 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/20757

[SPARK-23595][SQL] ValidateExternalType should support interpreted execution

## What changes were proposed in this pull request?
This PR adds interpreted-mode support for `ValidateExternalType`.

## How was this patch tested?
Added tests in `ObjectExpressionsSuite`.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-23595

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20757.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20757


commit d53cfea1be24c1e0ae6fce6653a0f686719cd1c4
Author: Takeshi Yamamuro 
Date:   2018-03-06T17:06:28Z

ValidateExternalType should support interpreted execution




---




[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20685
  
LGTM


---




[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread jinxing64
Github user jinxing64 commented on the issue:

https://github.com/apache/spark/pull/20685
  
@cloud-fan @squito 
Thanks a lot!


---




[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2018-03-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19222#discussion_r172743278
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java
 ---
@@ -57,20 +59,20 @@
 
   // The data stored in these two allocations need to maintain binary 
compatible. We can
   // directly pass this buffer to external components.
-  private long nulls;
--- End diff --

yea, I think `UTF8String` is good enough as the first showcase.


---




[GitHub] spark issue #20755: [SPARK-23406][SS] Enable stream-stream self-joins for br...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20755
  
**[Test build #88030 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88030/testReport)**
 for PR 20755 at commit 
[`484babb`](https://github.com/apache/spark/commit/484babb58d9cf61d5dcc6521865cd2a5db64dd82).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20755: [SPARK-23406][SS] Enable stream-stream self-joins for br...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20755
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20696: [SPARK-23525] [SQL] Support ALTER TABLE CHANGE COLUMN CO...

2018-03-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20696
  
LGTM


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20755: [SPARK-23406][SS] Enable stream-stream self-joins for br...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20755
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88030/
Test PASSed.


---




[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20685
  
Sounds reasonable. The purpose of this corruption check is to fail fast and 
retry the stage (re-shuffle), so disk corruption should also be counted.
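
As a generic illustration of the fail-fast idea (not Spark's actual shuffle code), the Python sketch below verifies a block before any of its bytes reach the caller, so a corrupted block surfaces immediately as an error that a retry can act on. The CRC32 framing and the names `frame_block`, `read_block` and `BlockCorruptedError` are assumptions made for this example.

```python
import struct
import zlib


class BlockCorruptedError(Exception):
    """Raised as soon as a block fails its integrity check."""


def frame_block(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 (4 bytes, big-endian)."""
    return struct.pack(">I", zlib.crc32(payload)) + payload


def read_block(framed: bytes) -> bytes:
    """Verify the checksum before handing any data to the caller."""
    if len(framed) < 4:
        raise BlockCorruptedError("block too short to hold a checksum")
    (expected,) = struct.unpack(">I", framed[:4])
    payload = framed[4:]
    if zlib.crc32(payload) != expected:
        raise BlockCorruptedError("checksum mismatch; the block should be re-fetched")
    return payload
```

The check is the same whether the bytes were damaged on the network or on disk, which is the point made above about counting disk corruption as well.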


---




[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20753
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20753
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88029/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20753: [SPARK-23582][SQL] StaticInvoke should support interpret...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20753
  
**[Test build #88029 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88029/testReport)**
 for PR 20753 at commit 
[`f570692`](https://github.com/apache/spark/commit/f570692616cfd7921470029051705c44e4b9c5db).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20702: [SPARK-23547][SQL]Cleanup the .pipeout file when the Hiv...

2018-03-06 Thread zuotingbing
Github user zuotingbing commented on the issue:

https://github.com/apache/spark/pull/20702
  
@gatorsmile  @liufengdb please take a look at this, thanks!


---




[GitHub] spark issue #20682: [SPARK-23522][Python] always use sys.exit over builtin e...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20682
  
**[Test build #88036 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88036/testReport)**
 for PR 20682 at commit 
[`c1b7413`](https://github.com/apache/spark/commit/c1b7413d356dafdc607683292bfff7b1a57cdf27).


---




[GitHub] spark issue #20687: [SPARK-23500][SQL] Fix complex type simplification rules...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20687
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1344/
Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1343/
Test PASSed.


---




[GitHub] spark issue #20687: [SPARK-23500][SQL] Fix complex type simplification rules...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20687
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20682: [SPARK-23522][Python] always use sys.exit over builtin e...

2018-03-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20682
  
retest this please


---




[GitHub] spark issue #20748: [SPARK-23611][SQL] Add a helper function to check except...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20748
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88027/
Test PASSed.


---




[GitHub] spark issue #20748: [SPARK-23611][SQL] Add a helper function to check except...

2018-03-06 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20748
  
@hvanhovell ok, check again? Thanks!


---




[GitHub] spark issue #20748: [SPARK-23611][SQL] Add a helper function to check except...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20748
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88034 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88034/testReport)**
 for PR 20756 at commit 
[`b8f171e`](https://github.com/apache/spark/commit/b8f171e5492f3156767589ad4c6ed458cb24615c).


---




[GitHub] spark issue #20748: [SPARK-23611][SQL] Add a helper function to check except...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20748
  
**[Test build #88027 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88027/testReport)**
 for PR 20748 at commit 
[`aeca542`](https://github.com/apache/spark/commit/aeca5428a179e932fa5fdbfbe8de2f64b64a4b43).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-c...

2018-03-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20735#discussion_r172732614
  
--- Diff: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
 ---
@@ -736,7 +736,8 @@ private object YarnAllocator {
   def memLimitExceededLogMessage(diagnostics: String, pattern: Pattern): String = {
     val matcher = pattern.matcher(diagnostics)
     val diag = if (matcher.find()) " " + matcher.group() + "." else ""
-    ("Container killed by YARN for exceeding memory limits." + diag
-      + " Consider boosting spark.yarn.executor.memoryOverhead.")
+    s"Container killed by YARN for exceeding memory limits. $diag " +
+      "Consider boosting spark.yarn.executor.memoryOverhead or " +
+      "disable yarn.nodemanager.vmem-check-enabled because of YARN-4714."
--- End diff --

Thank you for confirmation!


---




[GitHub] spark issue #20687: [SPARK-23500][SQL] Fix complex type simplification rules...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20687
  
**[Test build #88035 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88035/testReport)**
 for PR 20687 at commit 
[`63c7098`](https://github.com/apache/spark/commit/63c7098fc4b14af7859580682f17c73abcd7ff08).


---




[GitHub] spark issue #20687: [SPARK-23500][SQL] Fix complex type simplification rules...

2018-03-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/20687
  
Retest this please.


---




[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20685
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1342/
Test PASSed.


---




[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20685
  
**[Test build #88033 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88033/testReport)**
 for PR 20685 at commit 
[`4e4f075`](https://github.com/apache/spark/commit/4e4f07544d17ea0493b4c5887d8215550eedc424).


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20685
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1341/
Test PASSed.


---




[GitHub] spark pull request #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-c...

2018-03-06 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/20735#discussion_r172732010
  
--- Diff: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
 ---
@@ -736,7 +736,8 @@ private object YarnAllocator {
   def memLimitExceededLogMessage(diagnostics: String, pattern: Pattern): String = {
     val matcher = pattern.matcher(diagnostics)
     val diag = if (matcher.find()) " " + matcher.group() + "." else ""
-    ("Container killed by YARN for exceeding memory limits." + diag
-      + " Consider boosting spark.yarn.executor.memoryOverhead.")
+    s"Container killed by YARN for exceeding memory limits. $diag " +
+      "Consider boosting spark.yarn.executor.memoryOverhead or " +
+      "disable yarn.nodemanager.vmem-check-enabled because of YARN-4714."
--- End diff --

nit: "disable" -> "disabling"?
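
For illustration, here is a minimal standalone sketch of how the message could read with the nit applied. The object name and `examplePattern` are hypothetical (the real pattern lives in `YarnAllocator` and is not shown in this diff), and dropping the extra space before `$diag` is my own assumption, since `diag` already starts with a space:

```scala
import java.util.regex.Pattern

object MemLimitMessageSketch {
  // Hypothetical pattern, for illustration only; not the one used by YarnAllocator.
  val examplePattern: Pattern =
    Pattern.compile("[0-9.]+ [KMG]B of [0-9.]+ [KMG]B physical memory used")

  def memLimitExceededLogMessage(diagnostics: String, pattern: Pattern): String = {
    val matcher = pattern.matcher(diagnostics)
    // diag is either empty or " <matched text>.", so it carries its own leading space.
    val diag = if (matcher.find()) " " + matcher.group() + "." else ""
    s"Container killed by YARN for exceeding memory limits.$diag " +
      "Consider boosting spark.yarn.executor.memoryOverhead or " +
      "disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714."
  }
}
```

With this wording both suggestions are gerunds ("boosting ... or disabling ..."), which is all the nit asks for.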


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1340/
Test PASSed.


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20756
  
**[Test build #88032 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88032/testReport)**
 for PR 20756 at commit 
[`0c48a9b`](https://github.com/apache/spark/commit/0c48a9ba2551435e3794b4e98002423b9a8d527b).


---




[GitHub] spark pull request #20756: [SPARK-23593][SQL] Add interpreted execution for ...

2018-03-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20756#discussion_r172731452
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala
 ---
@@ -1254,8 +1254,24 @@ case class InitializeJavaBean(beanInstance: Expression, setters: Map[String, Exp
   override def children: Seq[Expression] = beanInstance +: setters.values.toSeq
   override def dataType: DataType = beanInstance.dataType
 
-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
+  override def eval(input: InternalRow): Any = {
+    val instance = beanInstance.eval(input).asInstanceOf[Object]
+    if (instance != null) {
+      setters.foreach { case (setterMethod, fieldExpr) =>
+        val fieldValue = fieldExpr.eval(input).asInstanceOf[Object]
+
+        val foundMethods = instance.getClass.getMethods.filter { method =>
+          method.getName == setterMethod && Modifier.isPublic(method.getModifiers) &&
+            method.getParameterTypes.length == 1
+        }
+        assert(foundMethods.length == 1,
+          throw new RuntimeException("The Java Bean instance should have only one " +
--- End diff --

Codegen evaluation does not check that the method exists. But for the 
non-codegen evaluation here, it is a bit weird to directly invoke the first 
method found (we may not find one). cc @hvanhovell 
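
To make the concern concrete, here is a small self-contained sketch of the lookup outside Spark. `SetterLookupSketch`, `ExampleBean` and `findSetter` are hypothetical names used only for illustration; the filter mirrors the one in the diff, and it fails explicitly when zero or several candidates are found instead of invoking the first match:

```scala
import java.lang.reflect.{Method, Modifier}

object SetterLookupSketch {
  // Hypothetical bean used only for this illustration.
  class ExampleBean {
    private var name: String = null
    def setName(value: String): Unit = { name = value }
    def getName: String = name
  }

  /** Returns the unique public one-argument method with the given name, or throws. */
  def findSetter(instance: AnyRef, setterMethod: String): Method = {
    val found = instance.getClass.getMethods.filter { m =>
      m.getName == setterMethod &&
        Modifier.isPublic(m.getModifiers) &&
        m.getParameterTypes.length == 1
    }
    if (found.length != 1) {
      throw new RuntimeException(
        s"Expected exactly one public single-argument method '$setterMethod' " +
          s"on ${instance.getClass.getName}, found ${found.length}")
    }
    found.head
  }
}
```

Usage would look like `SetterLookupSketch.findSetter(bean, "setName").invoke(bean, "spark")`; a misspelled setter name then fails with a clear message rather than an index error.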


---




[GitHub] spark issue #20756: [SPARK-23593][SQL] Add interpreted execution for Initial...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20756
  
Build finished. Test PASSed.


---




[GitHub] spark pull request #20685: [SPARK-23524] Big local shuffle blocks should not...

2018-03-06 Thread Ngone51
Github user Ngone51 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20685#discussion_r172731294
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala 
---
@@ -583,8 +587,8 @@ object ShuffleBlockFetcherIterator {
* Result of a fetch from a remote block successfully.
* @param blockId block id
* @param address BlockManager that the block was fetched from.
-   * @param size estimated size of the block, used to calculate 
bytesInFlight.
-   * Note that this is NOT the exact bytes.
+   * @param size estimated size of the block. Note that this is NOT the 
exact bytes.
+*Size of remote block is used to calculate bytesInFlight.
--- End diff --

nit: documentation style


---




[GitHub] spark pull request #20756: [SPARK-23593][SQL] Add interpreted execution for ...

2018-03-06 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/20756

[SPARK-23593][SQL] Add interpreted execution for InitializeJavaBean 
expression

## What changes were proposed in this pull request?

Add interpreted execution for `InitializeJavaBean` expression.

## How was this patch tested?

Added unit test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-23593

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20756.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20756


commit 978080bd8f5a1b095fd0d58ff529e16dd9cbadba
Author: Liang-Chi Hsieh 
Date:   2018-03-07T02:56:53Z

Add interpreted execution for InitializeJavaBean expression.




---




[GitHub] spark pull request #20688: [SPARK-23096][SS] Migrate rate source to V2

2018-03-06 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/20688#discussion_r172730994
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala
 ---
@@ -24,8 +24,8 @@ import org.json4s.jackson.Serialization
 
 import org.apache.spark.sql.Row
 import org.apache.spark.sql.catalyst.util.DateTimeUtils
-import org.apache.spark.sql.execution.streaming.{RateSourceProvider, RateStreamOffset, ValueRunTimeMsPair}
-import org.apache.spark.sql.execution.streaming.sources.RateStreamSourceV2
+import org.apache.spark.sql.execution.streaming.{RateStreamOffset, ValueRunTimeMsPair}
+import org.apache.spark.sql.execution.streaming.sources.RateSourceProvider
 import org.apache.spark.sql.sources.v2.DataSourceOptions
 import org.apache.spark.sql.sources.v2.reader._
 import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousDataReader, ContinuousReader, Offset, PartitionOffset}
--- End diff --

Could you make the names of the different readers consistent with each 
other? Similar to Kafka?

RateStreamProvider
RateStreamMicroBatchReader, RateStreamMicroBatchDataReaderFactory 
RateStreamContinuousReader, 





---




[GitHub] spark pull request #20688: [SPARK-23096][SS] Migrate rate source to V2

2018-03-06 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/20688#discussion_r172730858
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/RateSourceSuite.scala
 ---
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.sources
+
+import java.nio.file.Files
+import java.util.Optional
+import java.util.concurrent.TimeUnit
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
+import org.apache.spark.sql.catalyst.errors.TreeNodeException
+import org.apache.spark.sql.execution.datasources.DataSource
+import org.apache.spark.sql.execution.streaming._
+import org.apache.spark.sql.execution.streaming.continuous._
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.sources.v2.{ContinuousReadSupport, 
DataSourceOptions, MicroBatchReadSupport}
+import org.apache.spark.sql.sources.v2.reader.streaming.Offset
+import org.apache.spark.sql.streaming.StreamTest
+import org.apache.spark.util.ManualClock
+
+class RateSourceSuite extends StreamTest {
--- End diff --

Hi @tdas, I think I used "git mv". The thing is that when the diff is 
larger than x%, Git treats the rename as "git rm" plus "git add" 
(https://makandracards.com/makandra/30957-git-how-to-get-a-useful-diff-when-renaming-files).
 


---




[GitHub] spark pull request #20688: [SPARK-23096][SS] Migrate rate source to V2

2018-03-06 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/20688#discussion_r172730333
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/RateSourceSuite.scala
 ---
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.sources
+
+import java.nio.file.Files
+import java.util.Optional
+import java.util.concurrent.TimeUnit
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
+import org.apache.spark.sql.catalyst.errors.TreeNodeException
+import org.apache.spark.sql.execution.datasources.DataSource
+import org.apache.spark.sql.execution.streaming._
+import org.apache.spark.sql.execution.streaming.continuous._
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.sources.v2.{ContinuousReadSupport, 
DataSourceOptions, MicroBatchReadSupport}
+import org.apache.spark.sql.sources.v2.reader.streaming.Offset
+import org.apache.spark.sql.streaming.StreamTest
+import org.apache.spark.util.ManualClock
+
+class RateSourceSuite extends StreamTest {
--- End diff --

Why did you not move this file using "git mv" and then change it? Then we 
would have been able to diff it properly. 
This was a pain in the text socket v2 PR as well :(


---




[GitHub] spark pull request #20688: [SPARK-23096][SS] Migrate rate source to V2

2018-03-06 Thread tdas
Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/20688#discussion_r172729894
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateSourceProvider.scala
 ---
@@ -0,0 +1,291 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.sources
+
+import java.io._
+import java.nio.charset.StandardCharsets
+import java.util.Optional
+import java.util.concurrent.TimeUnit
+
+import scala.collection.JavaConverters._
+
+import org.apache.commons.io.IOUtils
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.network.util.JavaUtils
+import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
+import org.apache.spark.sql.catalyst.util.DateTimeUtils
+import org.apache.spark.sql.execution.streaming._
+import 
org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader
+import org.apache.spark.sql.sources.DataSourceRegister
+import org.apache.spark.sql.sources.v2.{ContinuousReadSupport, 
DataSourceOptions, DataSourceV2, MicroBatchReadSupport}
+import org.apache.spark.sql.sources.v2.reader._
+import org.apache.spark.sql.sources.v2.reader.streaming.{ContinuousReader, 
MicroBatchReader, Offset}
+import org.apache.spark.sql.types.{LongType, StructField, StructType, 
TimestampType}
+import org.apache.spark.util.{ManualClock, SystemClock}
+
+object RateSourceProvider {
+  val SCHEMA =
+StructType(StructField("timestamp", TimestampType) :: 
StructField("value", LongType) :: Nil)
+
+  val VERSION = 1
+
+  val NUM_PARTITIONS = "numPartitions"
+  val ROWS_PER_SECOND = "rowsPerSecond"
+  val RAMP_UP_TIME = "rampUpTime"
+
+  /** Calculate the end value we will emit at the time `seconds`. */
+  def valueAtSecond(seconds: Long, rowsPerSecond: Long, rampUpTimeSeconds: Long): Long = {
+    // E.g., rampUpTimeSeconds = 4, rowsPerSecond = 10
+    // Then speedDeltaPerSecond = 2
+    //
+    // seconds   = 0 1 2  3  4  5  6
+    // speed     = 0 2 4  6  8 10 10 (speedDeltaPerSecond * seconds)
+    // end value = 0 2 6 12 20 30 40 (0 + speedDeltaPerSecond * seconds) * (seconds + 1) / 2
+    val speedDeltaPerSecond = rowsPerSecond / (rampUpTimeSeconds + 1)
+    if (seconds <= rampUpTimeSeconds) {
+      // Calculate "(0 + speedDeltaPerSecond * seconds) * (seconds + 1) / 2" in a special way to
+      // avoid overflow
+      if (seconds % 2 == 1) {
+        (seconds + 1) / 2 * speedDeltaPerSecond * seconds
+      } else {
+        seconds / 2 * speedDeltaPerSecond * (seconds + 1)
+      }
+    } else {
+      // rampUpPart is just a special case of the above formula: rampUpTimeSeconds == seconds
+      val rampUpPart = valueAtSecond(rampUpTimeSeconds, rowsPerSecond, rampUpTimeSeconds)
+      rampUpPart + (seconds - rampUpTimeSeconds) * rowsPerSecond
+    }
+  }
+}
+
+class RateSourceProvider extends DataSourceV2
+  with MicroBatchReadSupport with ContinuousReadSupport with 
DataSourceRegister {
+  import RateSourceProvider._
+
+  private def checkParameters(options: DataSourceOptions): Unit = {
+if (options.get(ROWS_PER_SECOND).isPresent) {
+  val rowsPerSecond = options.get(ROWS_PER_SECOND).get().toLong
+  if (rowsPerSecond <= 0) {
+throw new IllegalArgumentException(
+  s"Invalid value '$rowsPerSecond'. The option 'rowsPerSecond' 
must be positive")
+  }
+}
+
+if (options.get(RAMP_UP_TIME).isPresent) {
+  val rampUpTimeSeconds =
+JavaUtils.timeStringAsSec(options.get(RAMP_UP_TIME).get())
+  if (rampUpTimeSeconds < 0) {
+throw new IllegalArgumentException(
+  s"Invalid value '$rampUpTimeSeconds'. The option 'rampUpTime' 
must not be negative")
+  }
+}
+
+if (options.get(NUM_PARTITIONS).isPresent) {
+  val numPartitions = o
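
The ramp-up arithmetic in `valueAtSecond` quoted above can be checked in isolation. The sketch below copies that logic into a hypothetical standalone object (`RateRampUpSketch` is not part of the PR) and prints the end values for the example given in the code comment, `rampUpTimeSeconds = 4` and `rowsPerSecond = 10`:

```scala
object RateRampUpSketch {
  // Same logic as the valueAtSecond quoted in the diff, extracted so it runs without Spark.
  def valueAtSecond(seconds: Long, rowsPerSecond: Long, rampUpTimeSeconds: Long): Long = {
    val speedDeltaPerSecond = rowsPerSecond / (rampUpTimeSeconds + 1)
    if (seconds <= rampUpTimeSeconds) {
      // Halve whichever factor is even first, so the division is exact.
      if (seconds % 2 == 1) {
        (seconds + 1) / 2 * speedDeltaPerSecond * seconds
      } else {
        seconds / 2 * speedDeltaPerSecond * (seconds + 1)
      }
    } else {
      // After ramp-up, rows accumulate at the full rate on top of the ramp-up total.
      val rampUpPart = valueAtSecond(rampUpTimeSeconds, rowsPerSecond, rampUpTimeSeconds)
      rampUpPart + (seconds - rampUpTimeSeconds) * rowsPerSecond
    }
  }

  def main(args: Array[String]): Unit = {
    val endValues = (0L to 6L).map(s => valueAtSecond(s, 10L, 4L))
    println(endValues.mkString(" ")) // prints: 0 2 6 12 20 30 40
  }
}
```

The printed row matches the "end value" line of the table in the code comment.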

[GitHub] spark pull request #20735: [MINOR][YARN] Add disable yarn.nodemanager.vmem-c...

2018-03-06 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/20735#discussion_r172729670
  
--- Diff: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
 ---
@@ -736,7 +736,8 @@ private object YarnAllocator {
   def memLimitExceededLogMessage(diagnostics: String, pattern: Pattern): String = {
     val matcher = pattern.matcher(diagnostics)
     val diag = if (matcher.find()) " " + matcher.group() + "." else ""
-    ("Container killed by YARN for exceeding memory limits." + diag
-      + " Consider boosting spark.yarn.executor.memoryOverhead.")
+    s"Container killed by YARN for exceeding memory limits. $diag " +
+      "Consider boosting spark.yarn.executor.memoryOverhead or " +
+      "disable yarn.nodemanager.vmem-check-enabled because of YARN-4714."
--- End diff --

The changes look fine to me.


---




[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/20685
  
it'll also help with disk corruption ... from the stack traces in 
SPARK-4105 you can't really tell what the source of the problem is.  it'll be 
pretty hard to determine what the source of corruption is if we start seeing it 
again.  anyway, I don't feel that strongly about it either way.


---




[GitHub] spark issue #20696: [SPARK-23525] [SQL] Support ALTER TABLE CHANGE COLUMN CO...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20696
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20696: [SPARK-23525] [SQL] Support ALTER TABLE CHANGE COLUMN CO...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20696
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1339/
Test PASSed.


---




[GitHub] spark issue #20696: [SPARK-23525] [SQL] Support ALTER TABLE CHANGE COLUMN CO...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20696
  
**[Test build #88031 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88031/testReport)**
 for PR 20696 at commit 
[`48fc338`](https://github.com/apache/spark/commit/48fc338dc30720aa05e1871d69bad66ae2dfaa59).


---




[GitHub] spark pull request #20755: [SPARK-23406][SS] Enable stream-stream self-joins...

2018-03-06 Thread tdas
Github user tdas closed the pull request at:

https://github.com/apache/spark/pull/20755


---




[GitHub] spark issue #20755: [SPARK-23406][SS] Enable stream-stream self-joins for br...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20755
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1338/
Test PASSed.


---




[GitHub] spark issue #20755: [SPARK-23406][SS] Enable stream-stream self-joins for br...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20755
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20755: [SPARK-23406][SS] Enable stream-stream self-joins for br...

2018-03-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20755
  
**[Test build #88030 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88030/testReport)** for PR 20755 at commit [`484babb`](https://github.com/apache/spark/commit/484babb58d9cf61d5dcc6521865cd2a5db64dd82).


---




[GitHub] spark pull request #20755: [SPARK-23406][SS] Enable stream-stream self-joins...

2018-03-06 Thread tdas
GitHub user tdas opened a pull request:

https://github.com/apache/spark/pull/20755

[SPARK-23406][SS] Enable stream-stream self-joins for branch-2.3

## What changes were proposed in this pull request?
This is a limited but safe-to-backport version of the self-join fix made in
#20598. That PR fixed two bugs:
1. Add the MultiInstanceRelation trait to leaf logical nodes to allow
resolution. This is the major fix required to allow streaming self-joins, and
it is safe to backport.
2. Fix attribute rewriting in MicroBatchExecution when micro-batch plans
are spliced into the streaming logical plan. This is a minor fix that is not
safe to backport. Without this fix only a very small fraction of self-join
cases will have issues, but those issues may lead to incorrect results.
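
As a rough illustration of why the first fix matters, the sketch below models a self-join in which both sides expose the same attribute IDs, and shows how a `newInstance()` method lets the right side be re-created with fresh IDs. Everything here (`Attribute`, `Relation`, `freshAttribute`, `dedupRight`) is a hypothetical toy model, not Spark's actual classes; only the trait name `MultiInstanceRelation` and the idea of a `newInstance()` method are borrowed from the description above.

```scala
import java.util.concurrent.atomic.AtomicLong

object SelfJoinSketch {
  // Hypothetical stand-in for an attribute with a unique expression ID.
  final case class Attribute(name: String, exprId: Long)

  private val nextId = new AtomicLong(0)
  def freshAttribute(name: String): Attribute = Attribute(name, nextId.getAndIncrement())

  // Toy version of the trait: a leaf node that can produce a copy of itself
  // whose output attributes carry fresh IDs.
  trait MultiInstanceRelation {
    def newInstance(): Relation
  }

  final case class Relation(output: Seq[Attribute]) extends MultiInstanceRelation {
    def newInstance(): Relation = Relation(output.map(a => freshAttribute(a.name)))
  }

  // IDs that appear on both sides of a join and would make references ambiguous.
  def conflicting(left: Relation, right: Relation): Set[Long] =
    left.output.map(_.exprId).toSet.intersect(right.output.map(_.exprId).toSet)

  // Re-create the right side only when it clashes with the left side.
  def dedupRight(left: Relation, right: Relation): Relation =
    if (conflicting(left, right).nonEmpty) right.newInstance() else right
}
```

In this toy model, joining a relation with itself yields conflicting IDs until `dedupRight` swaps in a new instance; a leaf node without `newInstance()` would have no way to resolve the clash.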

## How was this patch tested?
New unit test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tdas/spark SPARK-23406-2.3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20755.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20755


commit 484babb58d9cf61d5dcc6521865cd2a5db64dd82
Author: Tathagata Das 
Date:   2018-03-07T00:53:34Z

Fixed




---




[GitHub] spark pull request #20753: [SPARK-23582][SQL] StaticInvoke should support in...

2018-03-06 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/20753#discussion_r172722081
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -133,8 +134,21 @@ case class StaticInvoke(
   override def nullable: Boolean = needNullCheck || returnNullable
   override def children: Seq[Expression] = arguments
 
-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
+  override def eval(input: InternalRow): Any = {
+    if (staticObject == null) {
+      throw new RuntimeException("The static class cannot be null.")
+    }
+
+    val parmTypes = arguments.map(e =>
+      CallMethodViaReflection.typeMapping.getOrElse(e.dataType,
+        Seq(e.dataType.asInstanceOf[ObjectType].cls))(0))
--- End diff --

You are right. I have to support other types before merging this change.
This is a prototype for discussing whether we use reflection or not.
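
For readers following the reflection question, here is a minimal sketch of invoking a static method through Java reflection from Scala, using only standard JDK calls (`Class.getDeclaredMethod` and `Method.invoke`). The helper name `invokeStatic` and its parameters are assumptions made for illustration; this is not the code in the PR.

```scala
object StaticInvokeSketch {
  // Look up a static method by name and parameter types, then call it.
  // The receiver passed to invoke is null because the method is static.
  def invokeStatic(
      cls: Class[_],
      name: String,
      paramTypes: Seq[Class[_]],
      args: Seq[AnyRef]): AnyRef = {
    val method = cls.getDeclaredMethod(name, paramTypes: _*)
    method.invoke(null, args: _*)
  }
}
```

A call such as `StaticInvokeSketch.invokeStatic(classOf[java.lang.Math], "max", Seq(classOf[Int], classOf[Int]), Seq(Int.box(3), Int.box(5)))` resolves the `int` overload because `classOf[Int]` is the primitive class. The lookup fails with `NoSuchMethodException` if the parameter classes do not match a declared signature exactly, which is why the mapping from data types to classes matters.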



---




[GitHub] spark issue #20754: [SPARK-23287][MESOS] Spark scheduler does not remove ini...

2018-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20754
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #20727: [SPARK-23577][SQL] Supports custom line separator...

2018-03-06 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20727#discussion_r172721381
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/HadoopFileLinesReader.scala ---
@@ -30,9 +31,19 @@ import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
 /**
  * An adaptor from a [[PartitionedFile]] to an [[Iterator]] of [[Text]], which are all of the lines
  * in that file.
+ *
+ * @param file A part (i.e. "block") of a single file that should be read line by line.
+ * @param lineSeparator A line separator that should be used for each line. If the value is `None`,
+ *                      it covers `\r`, `\r\n` and `\n`.
+ * @param conf Hadoop configuration
  */
 class HadoopFileLinesReader(
-    file: PartitionedFile, conf: Configuration) extends Iterator[Text] with Closeable {
+    file: PartitionedFile,
+    lineSeparator: Option[String],
+    conf: Configuration) extends Iterator[Text] with Closeable {
--- End diff --

Note that it's an internal API for datasources, and Hadoop's Text already
assumes UTF-8. I don't think we should call getBytes with UTF-8 at each
call site.
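
To make the point concrete, the sketch below keeps the separator as an `Option[String]` at the API boundary and converts it to UTF-8 bytes in one place. The names `separatorBytes` and `splitLines` are hypothetical and the splitting is done on an in-memory string rather than through Hadoop's reader, so treat it as an illustration of the idea only.

```scala
import java.nio.charset.StandardCharsets
import java.util.regex.Pattern

object LineSeparatorSketch {
  // Convert the optional separator to UTF-8 bytes once, inside the reader,
  // so callers can keep passing a plain String.
  def separatorBytes(lineSeparator: Option[String]): Option[Array[Byte]] =
    lineSeparator.map(_.getBytes(StandardCharsets.UTF_8))

  // Split text by an explicit separator, or by \r\n, \r and \n when none is given.
  def splitLines(content: String, lineSeparator: Option[String]): Seq[String] =
    lineSeparator match {
      case Some(sep) => content.split(Pattern.quote(sep), -1).toSeq
      case None => content.split("\r\n|\r|\n", -1).toSeq
    }
}
```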


---




[GitHub] spark pull request #20754: [SPARK-23287][MESOS] Spark scheduler does not rem...

2018-03-06 Thread devaraj-kavali
GitHub user devaraj-kavali opened a pull request:

https://github.com/apache/spark/pull/20754

[SPARK-23287][MESOS] Spark scheduler does not remove initial executor if 
not one job submitted

## What changes were proposed in this pull request?

In `ExecutorAllocationManager.schedule()`, `numExecutorsTarget` is updated as
part of `updateAndSyncNumExecutorsTarget(now)`, but the update is skipped until
`initializing` becomes false. As a result, `removeExecutors()` does not remove
the expired executors, because the condition `else if (newExecutorTotal - 1 <
numExecutorsTarget) {` is satisfied and they are skipped. They are never
removed and keep running until the application completes.

I moved the `updateAndSyncNumExecutorsTarget(now)` call to after the expiry
check and the assignment of the `initializing` var (if eligible), so that the
updated `numExecutorsTarget` can be used while removing executors.
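
The ordering issue can be sketched with a heavily simplified, hypothetical model of a single `schedule()` pass. The `State` class and the helper functions below are stand-ins invented for illustration and do not reproduce Spark's actual `ExecutorAllocationManager` logic; only the skip condition `running - 1 < numExecutorsTarget` mirrors the condition quoted above.

```scala
object ScheduleOrderSketch {
  // Hypothetical, simplified state of the allocation manager.
  final case class State(initializing: Boolean, numExecutorsTarget: Int, running: Int)

  // Stand-in for updateAndSyncNumExecutorsTarget: the target is left alone
  // while still initializing, otherwise it is lowered to what is needed.
  def updateTarget(s: State, needed: Int): State =
    if (s.initializing) s
    else s.copy(numExecutorsTarget = math.min(s.numExecutorsTarget, needed))

  // Stand-in for the expiry check: any expired executor ends initialization.
  def markExpiry(s: State, expired: Int): State =
    if (expired > 0) s.copy(initializing = false) else s

  // Stand-in for removeExecutors: a removal is skipped when it would drop
  // the executor count below the current target.
  def removeExpired(s: State, expired: Int): State = {
    val remaining = (1 to expired).foldLeft(s.running) { (running, _) =>
      if (running - 1 < s.numExecutorsTarget) running else running - 1
    }
    s.copy(running = remaining)
  }

  // Original order: update the target first, then check expiry and remove.
  def scheduleOriginalOrder(s: State, needed: Int, expired: Int): State =
    removeExpired(markExpiry(updateTarget(s, needed), expired), expired)

  // Proposed order: check expiry first, then update the target, then remove.
  def scheduleProposedOrder(s: State, needed: Int, expired: Int): State =
    removeExpired(updateTarget(markExpiry(s, expired), needed), expired)
}
```

Starting from two idle initial executors with a target of two and no pending work, the original order leaves the stale target in place and skips both removals in this model, while the proposed order lowers the target first and removes both.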

## How was this patch tested?

I verified it manually by enabling dynamic allocation in Mesos mode. It now
removes executors that are not assigned any task for the specified
executorIdleTimeout.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/devaraj-kavali/spark SPARK-23287

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20754.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20754


commit 5bef384acfe3d76949dceab669743e15373bad57
Author: Devaraj K 
Date:   2018-03-07T01:54:58Z

SPARK-23287 Spark scheduler does not remove initial executor if not one
job submitted




---




[GitHub] spark pull request #20753: [SPARK-23582][SQL] StaticInvoke should support in...

2018-03-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20753#discussion_r172718446
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -133,8 +134,21 @@ case class StaticInvoke(
   override def nullable: Boolean = needNullCheck || returnNullable
   override def children: Seq[Expression] = arguments
 
-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
+  override def eval(input: InternalRow): Any = {
+    if (staticObject == null) {
+      throw new RuntimeException("The static class cannot be null.")
+    }
+
+    val parmTypes = arguments.map(e =>
+      CallMethodViaReflection.typeMapping.getOrElse(e.dataType,
+        Seq(e.dataType.asInstanceOf[ObjectType].cls))(0))
--- End diff --

The external types of native types `CalendarIntervalType` and `BinaryType` 
are not `ObjectType`.
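
The hazard can be reproduced with a small toy model. The `DataType` hierarchy and `typeMapping` below are hypothetical stand-ins that borrow names from the diff; they are not Spark's real definitions. The sketch contrasts the cast-based fallback with a lookup that returns `None` for types that are neither mapped nor an `ObjectType`.

```scala
object TypeLookupSketch {
  // Hypothetical, minimal data type hierarchy for illustration only.
  sealed trait DataType
  case object IntegerType extends DataType
  case object BinaryType extends DataType
  case object CalendarIntervalType extends DataType
  final case class ObjectType(cls: Class[_]) extends DataType

  // Toy mapping in the spirit of the one referenced in the diff.
  val typeMapping: Map[DataType, Seq[Class[_]]] =
    Map(IntegerType -> Seq[Class[_]](classOf[Int]))

  // Mirrors the shape of the code under review: any unmapped type is
  // assumed to be an ObjectType, so the cast can throw ClassCastException.
  def unsafeClassOf(dt: DataType): Class[_] =
    typeMapping.getOrElse(dt, Seq(dt.asInstanceOf[ObjectType].cls))(0)

  // A defensive alternative: report unsupported types instead of casting.
  def safeClassOf(dt: DataType): Option[Class[_]] = dt match {
    case ObjectType(cls) => Some(cls)
    case other => typeMapping.get(other).flatMap(_.headOption)
  }
}
```

In this model `unsafeClassOf(BinaryType)` throws, whereas `safeClassOf(BinaryType)` returns `None` and leaves the decision about unsupported types to the caller.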


---




[GitHub] spark pull request #20753: [SPARK-23582][SQL] StaticInvoke should support in...

2018-03-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20753#discussion_r172717961
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
@@ -133,8 +134,21 @@ case class StaticInvoke(
   override def nullable: Boolean = needNullCheck || returnNullable
   override def children: Seq[Expression] = arguments
 
-  override def eval(input: InternalRow): Any =
-    throw new UnsupportedOperationException("Only code-generated evaluation is supported.")
+  override def eval(input: InternalRow): Any = {
+    if (staticObject == null) {
+      throw new RuntimeException("The static class cannot be null.")
+    }
+
+    val parmTypes = arguments.map(e =>
+      CallMethodViaReflection.typeMapping.getOrElse(e.dataType,
+        Seq(e.dataType.asInstanceOf[ObjectType].cls))(0))
+    val parms = arguments.map(e => e.eval(input).asInstanceOf[Object])
+    val method = staticObject.getDeclaredMethod(functionName, parmTypes : _*)
+    val ret = method.invoke(null, parms : _*)
+    val retClass = CallMethodViaReflection.typeMapping.getOrElse(dataType,
+      Seq(dataType.asInstanceOf[ObjectType].cls))(0)
--- End diff --

Will `dataType` always be an `ObjectType` here?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


