[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96885/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22339
  
**[Test build #96885 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96885/testReport)**
 for PR 22339 at commit 
[`542872c`](https://github.com/apache/spark/commit/542872cb5459fae1a66ee45aa193986e9a37fb96).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22527: [SPARK-17952][SQL] Nested Java beans support in c...

2018-10-02 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22527#discussion_r222183494
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -1098,16 +1098,24 @@ object SQLContext {
   data: Iterator[_],
   beanClass: Class[_],
   attrs: Seq[AttributeReference]): Iterator[InternalRow] = {
-val extractors =
-  
JavaTypeInference.getJavaBeanReadableProperties(beanClass).map(_.getReadMethod)
-val methodsToConverts = extractors.zip(attrs).map { case (e, attr) =>
-  (e, CatalystTypeConverters.createToCatalystConverter(attr.dataType))
+def createStructConverter(cls: Class[_], fieldTypes: 
Iterator[DataType]): Any => InternalRow = {
--- End diff --

nit: `Seq[DataType]` instead of `Iterator[DataType]`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22527: [SPARK-17952][SQL] Nested Java beans support in c...

2018-10-02 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/22527#discussion_r222185482
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala ---
@@ -1098,16 +1098,24 @@ object SQLContext {
   data: Iterator[_],
   beanClass: Class[_],
   attrs: Seq[AttributeReference]): Iterator[InternalRow] = {
-val extractors =
-  
JavaTypeInference.getJavaBeanReadableProperties(beanClass).map(_.getReadMethod)
-val methodsToConverts = extractors.zip(attrs).map { case (e, attr) =>
-  (e, CatalystTypeConverters.createToCatalystConverter(attr.dataType))
+def createStructConverter(cls: Class[_], fieldTypes: 
Iterator[DataType]): Any => InternalRow = {
+  val methodConverters =
+
JavaTypeInference.getJavaBeanReadableProperties(cls).iterator.zip(fieldTypes)
+  .map { case (property, fieldType) =>
+val method = property.getReadMethod
+method -> createConverter(method.getReturnType, fieldType)
+  }.toArray
+  value => new GenericInternalRow(
--- End diff --

We should check whether the `value` is `null` or not? Also could you add a 
test for the case?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22491
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96883/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22491
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22491
  
**[Test build #96883 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96883/testReport)**
 for PR 22491 at commit 
[`4b69423`](https://github.com/apache/spark/commit/4b6942337bab89fad4d40f120d53197544a799b8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22500
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3650/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22500
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22500
  
**[Test build #96888 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96888/testReport)**
 for PR 22500 at commit 
[`c791249`](https://github.com/apache/spark/commit/c791249fdd3f97736d977087434575ca3480ae9d).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/22500
  
retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22619: [SQL][MINOR] Make use of TypeCoercion.findTightestCommon...

2018-10-02 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22619
  
cc @HyukjinKwon @MaxGekk 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22619: [SQL][MINOR] Make use of TypeCoercion.findTightestCommon...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22619
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22619: [SQL][MINOR] Make use of TypeCoercion.findTightestCommon...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22619
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96881/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22619: [SQL][MINOR] Make use of TypeCoercion.findTightestCommon...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22619
  
**[Test build #96881 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96881/testReport)**
 for PR 22619 at commit 
[`d4e0bdb`](https://github.com/apache/spark/commit/d4e0bdb5a06f59ff1df3933cb43804e71cde8259).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22339
  
**[Test build #96887 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96887/testReport)**
 for PR 22339 at commit 
[`d91c815`](https://github.com/apache/spark/commit/d91c815774bff070bdb3cb149678ff080bc06b45).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3649/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96886/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22339
  
**[Test build #96886 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96886/testReport)**
 for PR 22339 at commit 
[`dab9bf3`](https://github.com/apache/spark/commit/dab9bf3771994989e5de2857f91d117dc8b74623).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22339
  
**[Test build #96886 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96886/testReport)**
 for PR 22339 at commit 
[`dab9bf3`](https://github.com/apache/spark/commit/dab9bf3771994989e5de2857f91d117dc8b74623).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3648/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22339
  
**[Test build #96885 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96885/testReport)**
 for PR 22339 at commit 
[`542872c`](https://github.com/apache/spark/commit/542872cb5459fae1a66ee45aa193986e9a37fb96).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22339: [SPARK-17159][STREAM] Significant speed up for running s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22339
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3647/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22500
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96882/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22500
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22500
  
**[Test build #96882 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96882/testReport)**
 for PR 22500 at commit 
[`c791249`](https://github.com/apache/spark/commit/c791249fdd3f97736d977087434575ca3480ae9d).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22488
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22488
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96880/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22488
  
**[Test build #96880 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96880/testReport)**
 for PR 22488 at commit 
[`27c6493`](https://github.com/apache/spark/commit/27c649337e326aa8081afcd9de1a7097ea2788e2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning wh...

2018-10-02 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22610#discussion_r222173904
  
--- Diff: python/pyspark/worker.py ---
@@ -84,13 +84,36 @@ def wrap_scalar_pandas_udf(f, return_type):
 arrow_return_type = to_arrow_type(return_type)
 
 def verify_result_length(*a):
+import pyarrow as pa
 result = f(*a)
 if not hasattr(result, "__len__"):
 raise TypeError("Return type of the user-defined function 
should be "
 "Pandas.Series, but is 
{}".format(type(result)))
 if len(result) != len(a[0]):
 raise RuntimeError("Result vector from pandas_udf was not the 
required length: "
"expected %d, got %d" % (len(a[0]), 
len(result)))
+
+# Ensure return type of Pandas.Series matches the arrow return 
type of the user-defined
+# function. Otherwise, we may produce incorrect serialized data.
+# Note: for timestamp type, we only need to ensure both types are 
timestamp because the
+# serializer will do conversion.
+try:
+arrow_type_of_result = pa.from_numpy_dtype(result.dtype)
+both_are_timestamp = 
pa.types.is_timestamp(arrow_type_of_result) and \
+pa.types.is_timestamp(arrow_return_type)
+if not both_are_timestamp and arrow_return_type != 
arrow_type_of_result:
+print("WARN: Arrow type %s of return Pandas.Series of the 
user-defined function's "
+  "dtype %s doesn't match the arrow type %s "
+  "of defined return type %s" % (arrow_type_of_result, 
result.dtype,
+ arrow_return_type, 
return_type),
+  file=sys.stderr)
+except:
+print("WARN: Can't infer arrow type of Pandas.Series's dtype: 
%s, which might not "
+  "match the arrow type %s of defined return type %s" % 
(result.dtype,
+ 
arrow_return_type,
+ 
return_type),
--- End diff --

ok. thanks. :-)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning wh...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22610#discussion_r222173421
  
--- Diff: python/pyspark/worker.py ---
@@ -84,13 +84,36 @@ def wrap_scalar_pandas_udf(f, return_type):
 arrow_return_type = to_arrow_type(return_type)
 
 def verify_result_length(*a):
+import pyarrow as pa
 result = f(*a)
 if not hasattr(result, "__len__"):
 raise TypeError("Return type of the user-defined function 
should be "
 "Pandas.Series, but is 
{}".format(type(result)))
 if len(result) != len(a[0]):
 raise RuntimeError("Result vector from pandas_udf was not the 
required length: "
"expected %d, got %d" % (len(a[0]), 
len(result)))
+
+# Ensure return type of Pandas.Series matches the arrow return 
type of the user-defined
+# function. Otherwise, we may produce incorrect serialized data.
+# Note: for timestamp type, we only need to ensure both types are 
timestamp because the
+# serializer will do conversion.
+try:
+arrow_type_of_result = pa.from_numpy_dtype(result.dtype)
+both_are_timestamp = 
pa.types.is_timestamp(arrow_type_of_result) and \
+pa.types.is_timestamp(arrow_return_type)
+if not both_are_timestamp and arrow_return_type != 
arrow_type_of_result:
+print("WARN: Arrow type %s of return Pandas.Series of the 
user-defined function's "
+  "dtype %s doesn't match the arrow type %s "
+  "of defined return type %s" % (arrow_type_of_result, 
result.dtype,
+ arrow_return_type, 
return_type),
+  file=sys.stderr)
+except:
+print("WARN: Can't infer arrow type of Pandas.Series's dtype: 
%s, which might not "
+  "match the arrow type %s of defined return type %s" % 
(result.dtype,
+ 
arrow_return_type,
+ 
return_type),
--- End diff --

I would fix the indentation here tho :-)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22610
  
> he only other idea I have is to provide an option to raise an error if 
the type needs to be cast.

Actually sounds good to me.

I think the problem is we are not quite clear about when the type is 
mismatched in UDFs (see also https://github.com/apache/spark/pull/20163 for a 
reminder). IIRC, we rather roughly agreed upon documenting it (and allowing 
exact type match (?)).

@viirya and @BryanCutler, how about we document that return types should be 
matched (we can leave a chart or map referring (`types.to_arrow_type`)?

One additional improvement might be .. we describe that type casting 
behaviour is .. say .. not guaranteed but I am not sure how we can nicely 
document this. Probably only mentioning the type mapping is fine ..? 



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22611: [SPARK-25595] Ignore corrupt Avro files if flag I...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22611#discussion_r222169616
  
--- Diff: 
external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ---
@@ -342,6 +342,53 @@ class AvroSuite extends QueryTest with 
SharedSQLContext with SQLTestUtils {
 }
   }
 
+  private def createDummyCorruptFile(dir: File): Unit = {
+FileUtils.forceMkdir(dir)
+val corruptFile = new File(dir, "corrupt.avro")
+val writer = new BufferedWriter(new FileWriter(corruptFile))
+writer.write("corrupt")
+writer.close()
+  }
+
+  test("Ignore corrupt Avro file if flag IGNORE_CORRUPT_FILES enabled") {
+withSQLConf(SQLConf.IGNORE_CORRUPT_FILES.key -> "true") {
+  withTempPath { dir =>
+createDummyCorruptFile(dir)
+val message = intercept[FileNotFoundException] {
+  spark.read.format("avro").load(dir.getAbsolutePath).schema
+}.getMessage
+assert(message.contains("No Avro files found."))
+
+val srcFile = new File("src/test/resources/episodes.avro")
+val destFile = new File(dir, "episodes.avro")
+FileUtils.copyFile(srcFile, destFile)
+
+val df = spark.read.format("avro").load(srcFile.getAbsolutePath)
+val schema = df.schema
+val result = df.collect()
+// Schema inference picks random readable sample file.
+// Here we use a loop to eliminate randomness.
+(1 to 5).foreach { _ =>
+  
assert(spark.read.format("avro").load(dir.getAbsolutePath).schema == schema)
+  checkAnswer(spark.read.format("avro").load(dir.getAbsolutePath), 
result)
+}
+  }
+}
+  }
+
+  test("Throws IOException on reading corrupt Avro file if flag 
IGNORE_CORRUPT_FILES disabled") {
+withSQLConf(SQLConf.IGNORE_CORRUPT_FILES.key -> "false") {
+  withTempPath { dir =>
+createDummyCorruptFile(dir)
+val message = intercept[org.apache.spark.SparkException] {
+  spark.read.format("avro").load(dir.getAbsolutePath).schema
--- End diff --

`.schema` wouldn't probably be needed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22611: [SPARK-25595] Ignore corrupt Avro files if flag I...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22611#discussion_r222169458
  
--- Diff: 
external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ---
@@ -342,6 +342,53 @@ class AvroSuite extends QueryTest with 
SharedSQLContext with SQLTestUtils {
 }
   }
 
+  private def createDummyCorruptFile(dir: File): Unit = {
+FileUtils.forceMkdir(dir)
+val corruptFile = new File(dir, "corrupt.avro")
+val writer = new BufferedWriter(new FileWriter(corruptFile))
+writer.write("corrupt")
+writer.close()
+  }
+
+  test("Ignore corrupt Avro file if flag IGNORE_CORRUPT_FILES enabled") {
+withSQLConf(SQLConf.IGNORE_CORRUPT_FILES.key -> "true") {
+  withTempPath { dir =>
+createDummyCorruptFile(dir)
+val message = intercept[FileNotFoundException] {
+  spark.read.format("avro").load(dir.getAbsolutePath).schema
+}.getMessage
+assert(message.contains("No Avro files found."))
+
+val srcFile = new File("src/test/resources/episodes.avro")
+val destFile = new File(dir, "episodes.avro")
+FileUtils.copyFile(srcFile, destFile)
+
+val df = spark.read.format("avro").load(srcFile.getAbsolutePath)
+val schema = df.schema
+val result = df.collect()
+// Schema inference picks random readable sample file.
+// Here we use a loop to eliminate randomness.
--- End diff --

Actually I don't think it's randomness in this test. In this test, HDFS 
lists files in an alphabetical order under to the hood although it's not 
guaranteed. I think the picking order here at least is deterministic.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22616: [SPARK-25586][Core] Remove outer objects from logdebug s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22616
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96876/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22616: [SPARK-25586][Core] Remove outer objects from logdebug s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22616
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22616: [SPARK-25586][Core] Remove outer objects from logdebug s...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22616
  
**[Test build #96876 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96876/testReport)**
 for PR 22616 at commit 
[`2723b47`](https://github.com/apache/spark/commit/2723b47ded05c2eb2ca6cdb3f893965120c04920).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22491
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96878/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22491
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22488
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22488
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96879/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22491
  
**[Test build #96878 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96878/testReport)**
 for PR 22491 at commit 
[`5c819fb`](https://github.com/apache/spark/commit/5c819fb972c6c5a87375d9f8d6418d1de186449c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22488
  
**[Test build #96879 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96879/testReport)**
 for PR 22488 at commit 
[`d2d0a3e`](https://github.com/apache/spark/commit/d2d0a3e9fbb27b6a531241bbe6641d5950a7a703).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22500
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22500
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96877/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22500
  
**[Test build #96877 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96877/testReport)**
 for PR 22500 at commit 
[`f7a14e3`](https://github.com/apache/spark/commit/f7a14e3e63bb3f78b750bb17fc66938308647b63).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22613: [SPARK-25583][DOC][BRANCH-2.3]Add history-server related...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22613
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96884/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22613: [SPARK-25583][DOC][BRANCH-2.3]Add history-server related...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22613
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22613: [SPARK-25583][DOC][BRANCH-2.3]Add history-server related...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22613
  
**[Test build #96884 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96884/testReport)**
 for PR 22613 at commit 
[`f0947a3`](https://github.com/apache/spark/commit/f0947a31f13035e490385a73b027cd5dbe789072).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22611: [SPARK-25595] Ignore corrupt Avro files if flag I...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22611#discussion_r222167078
  
--- Diff: 
external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala ---
@@ -342,6 +342,53 @@ class AvroSuite extends QueryTest with 
SharedSQLContext with SQLTestUtils {
 }
   }
 
+  private def createDummyCorruptFile(dir: File): Unit = {
+FileUtils.forceMkdir(dir)
+val corruptFile = new File(dir, "corrupt.avro")
+val writer = new BufferedWriter(new FileWriter(corruptFile))
+writer.write("corrupt")
+writer.close()
--- End diff --

ditto for `tryWithResource`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22611: [SPARK-25595] Ignore corrupt Avro files if flag I...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22611#discussion_r222167036
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala ---
@@ -100,6 +77,50 @@ private[avro] class AvroFileFormat extends FileFormat
 }
   }
 
+  private def inferAvroSchemaFromFiles(
+  files: Seq[FileStatus],
+  conf: Configuration,
+  ignoreExtension: Boolean): Schema = {
+val ignoreCorruptFiles = SQLConf.get.ignoreCorruptFiles
+// Schema evolution is not supported yet. Here we only pick first 
random readable sample file to
+// figure out the schema of the whole dataset.
+val avroReader = files.iterator.map { f =>
+  val path = f.getPath
+  if (!ignoreExtension && !path.getName.endsWith(".avro")) {
+None
+  } else {
+val in = new FsInput(path, conf)
+try {
+  Some(DataFileReader.openReader(in, new 
GenericDatumReader[GenericRecord]()))
+} catch {
+  case e: IOException =>
+if (ignoreCorruptFiles) {
+  logWarning(s"Skipped the footer in the corrupted file: 
$path", e)
+  None
+} else {
+  throw new SparkException(s"Could not read file: $path", e)
+}
+} finally {
+  in.close()
+}
+  }
+}.collectFirst {
+  case Some(reader) => reader
+}
+
+avroReader match {
+  case Some(reader) =>
+try {
--- End diff --

ditto for `tryWithResource`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22611: [SPARK-25595] Ignore corrupt Avro files if flag I...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22611#discussion_r222166950
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala ---
@@ -100,6 +77,50 @@ private[avro] class AvroFileFormat extends FileFormat
 }
   }
 
+  private def inferAvroSchemaFromFiles(
+  files: Seq[FileStatus],
+  conf: Configuration,
+  ignoreExtension: Boolean): Schema = {
+val ignoreCorruptFiles = SQLConf.get.ignoreCorruptFiles
+// Schema evolution is not supported yet. Here we only pick first 
random readable sample file to
+// figure out the schema of the whole dataset.
+val avroReader = files.iterator.map { f =>
+  val path = f.getPath
+  if (!ignoreExtension && !path.getName.endsWith(".avro")) {
+None
+  } else {
+val in = new FsInput(path, conf)
--- End diff --

Not a big deal but we can use `Utils.tryiWithResource`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22611: [SPARK-25595] Ignore corrupt Avro files if flag I...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22611#discussion_r222166498
  
--- Diff: 
external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala ---
@@ -100,6 +77,50 @@ private[avro] class AvroFileFormat extends FileFormat
 }
   }
 
+  private def inferAvroSchemaFromFiles(
+  files: Seq[FileStatus],
+  conf: Configuration,
+  ignoreExtension: Boolean): Schema = {
+val ignoreCorruptFiles = SQLConf.get.ignoreCorruptFiles
--- End diff --

Bout about matching it to 
`sparkSession.sessionState.conf.ignoreCorruptFiles` like other occurrences?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22613: [SPARK-25583][DOC][BRANCH-2.3]Add history-server related...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22613
  
**[Test build #96884 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96884/testReport)**
 for PR 22613 at commit 
[`f0947a3`](https://github.com/apache/spark/commit/f0947a31f13035e490385a73b027cd5dbe789072).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22613: [SPARK-25583][DOC][BRANCH-2.3]Add history-server related...

2018-10-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22613
  
ok to test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22616: [SPARK-25586][Core] Remove outer objects from logdebug s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22616
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96875/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22616: [SPARK-25586][Core] Remove outer objects from logdebug s...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22616
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22616: [SPARK-25586][Core] Remove outer objects from logdebug s...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22616
  
**[Test build #96875 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96875/testReport)**
 for PR 22616 at commit 
[`cc8926d`](https://github.com/apache/spark/commit/cc8926d6a69786412e7630b03d738da8fda7effe).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `logDebug(s\" + cloning the object of class $`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22615: [SPARK-25016][BUILD][CORE] Remove support for Hadoop 2.6

2018-10-02 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/22615
  
We can simplery this:

https://github.com/apache/spark/blob/7ad18ee9f26e75dbe038c6034700f9cd4c0e2baa/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L273-L295
to
```scala
sparkConf.get(ROLLED_LOG_INCLUDE_PATTERN).foreach { includePattern =>
  val logAggregationContext = 
Records.newRecord(classOf[LogAggregationContext])
  logAggregationContext.setRolledLogsIncludePattern(includePattern)
  sparkConf.get(ROLLED_LOG_EXCLUDE_PATTERN).foreach { excludePattern =>
logAggregationContext.setRolledLogsExcludePattern(excludePattern)
  }
  appContext.setLogAggregationContext(logAggregationContext)
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22618: [SPARK-25321][ML] Revert SPARK-14681 to avoid API breaki...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22618
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22618: [SPARK-25321][ML] Revert SPARK-14681 to avoid API breaki...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22618
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96874/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22618: [SPARK-25321][ML] Revert SPARK-14681 to avoid API breaki...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22618
  
**[Test build #96874 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96874/testReport)**
 for PR 22618 at commit 
[`90eb1d7`](https://github.com/apache/spark/commit/90eb1d7f5895e442a86506e3e7dae382e138b3b0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `sealed abstract class Node extends Serializable `


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22491
  
**[Test build #96883 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96883/testReport)**
 for PR 22491 at commit 
[`4b69423`](https://github.com/apache/spark/commit/4b6942337bab89fad4d40f120d53197544a799b8).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/22488
  
Congratulation, @jiangxb1987


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22491
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22491
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3646/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22491: [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22491
  
Could you review and merge https://github.com/wangyum/spark/pull/14 ?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22347: [SPARK-25353][SQL] executeTake in SparkPlan is mo...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22347#discussion_r222158101
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ---
@@ -348,30 +349,30 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] 
with Logging with Serializ
 // Otherwise, interpolate the number of partitions we need to try, 
but overestimate
 // it by 50%. We also cap the estimation in the end.
 val limitScaleUpFactor = 
Math.max(sqlContext.conf.limitScaleUpFactor, 2)
-if (buf.isEmpty) {
+if (scannedRowCount == 0) {
   numPartsToTry = partsScanned * limitScaleUpFactor
 } else {
-  val left = n - buf.size
+  val left = n - scannedRowCount
   // As left > 0, numPartsToTry is always >= 1
-  numPartsToTry = Math.ceil(1.5 * left * partsScanned / 
buf.size).toInt
+  numPartsToTry = Math.ceil(1.5 * left * partsScanned / 
scannedRowCount).toInt
   numPartsToTry = Math.min(numPartsToTry, partsScanned * 
limitScaleUpFactor)
 }
   }
 
   val p = partsScanned.until(math.min(partsScanned + numPartsToTry, 
totalParts).toInt)
   val sc = sqlContext.sparkContext
-  val res = sc.runJob(childRDD,
-(it: Iterator[Array[Byte]]) => if (it.hasNext) it.next() else 
Array.empty[Byte], p)
-
-  buf ++= res.flatMap(decodeUnsafeRows)
+  val res = sc.runJob(childRDD, (it: Iterator[(Long, Array[Byte])]) =>
+if (it.hasNext) it.next() else (0L, Array.empty[Byte]), p)
 
+  buf ++= res.map(_._2)
+  scannedRowCount += res.map(_._1).sum
   partsScanned += p.size
 }
 
-if (buf.size > n) {
-  buf.take(n).toArray
+if (scannedRowCount > n) {
+  buf.toArray.view.flatMap(decodeUnsafeRows).take(n).force
 } else {
-  buf.toArray
+  buf.toArray.view.flatMap(decodeUnsafeRows).force
--- End diff --

`buf.flatMap(decodeUnsafeRows).toArray` ?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22347: [SPARK-25353][SQL] executeTake in SparkPlan is mo...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22347#discussion_r222158041
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ---
@@ -348,30 +349,30 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] 
with Logging with Serializ
 // Otherwise, interpolate the number of partitions we need to try, 
but overestimate
 // it by 50%. We also cap the estimation in the end.
 val limitScaleUpFactor = 
Math.max(sqlContext.conf.limitScaleUpFactor, 2)
-if (buf.isEmpty) {
+if (scannedRowCount == 0) {
   numPartsToTry = partsScanned * limitScaleUpFactor
 } else {
-  val left = n - buf.size
+  val left = n - scannedRowCount
   // As left > 0, numPartsToTry is always >= 1
-  numPartsToTry = Math.ceil(1.5 * left * partsScanned / 
buf.size).toInt
+  numPartsToTry = Math.ceil(1.5 * left * partsScanned / 
scannedRowCount).toInt
   numPartsToTry = Math.min(numPartsToTry, partsScanned * 
limitScaleUpFactor)
 }
   }
 
   val p = partsScanned.until(math.min(partsScanned + numPartsToTry, 
totalParts).toInt)
   val sc = sqlContext.sparkContext
-  val res = sc.runJob(childRDD,
-(it: Iterator[Array[Byte]]) => if (it.hasNext) it.next() else 
Array.empty[Byte], p)
-
-  buf ++= res.flatMap(decodeUnsafeRows)
+  val res = sc.runJob(childRDD, (it: Iterator[(Long, Array[Byte])]) =>
+if (it.hasNext) it.next() else (0L, Array.empty[Byte]), p)
 
+  buf ++= res.map(_._2)
+  scannedRowCount += res.map(_._1).sum
   partsScanned += p.size
 }
 
-if (buf.size > n) {
-  buf.take(n).toArray
+if (scannedRowCount > n) {
+  buf.toArray.view.flatMap(decodeUnsafeRows).take(n).force
--- End diff --

Can we simplify like the following?
```scala
  buf.flatMap(decodeUnsafeRows).take(n).toArray
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22500
  
**[Test build #96882 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96882/testReport)**
 for PR 22500 at commit 
[`c791249`](https://github.com/apache/spark/commit/c791249fdd3f97736d977087434575ca3480ae9d).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22619: [SQL][MINOR] Make use of TypeCoercion.findTightestCommon...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22619
  
**[Test build #96881 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96881/testReport)**
 for PR 22619 at commit 
[`d4e0bdb`](https://github.com/apache/spark/commit/d4e0bdb5a06f59ff1df3933cb43804e71cde8259).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22500
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22500: [SPARK-25488][TEST] Refactor MiscBenchmark to use main m...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22500
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3645/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22619: [SQL][MINOR] Make use of TypeCoercion.findTightestCommon...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22619
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3644/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22619: [SQL][MINOR] Make use of TypeCoercion.findTightestCommon...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22619
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22619: [SQL][MINOR] Make use of TypeCoercion.findTightes...

2018-10-02 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/22619

[SQL][MINOR] Make use of TypeCoercion.findTightestCommonType while 
inferring CSV schema.

## What changes were proposed in this pull request?
Current the CSV's infer schema code inlines 
`TypeCoercion.findTightestCommonType`. This is a minor refactor to make use of 
the common type coercion code when applicable.  This way we can take advantage 
of any improvement to the base method.

Thanks to @MaxGekk for finding this while reviewing another PR.

## How was this patch tested?
This is a minor refactor.  Existing tests are used to verify the change.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark csv_minor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22619.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22619


commit d4e0bdb5a06f59ff1df3933cb43804e71cde8259
Author: Dilip Biswal 
Date:   2018-10-03T00:47:42Z

Make use of TypeCoercion.findTightestCommonType while inferring CSV schema




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22047: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...

2018-10-02 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/22047
  
@gatorsmile Thanks.. I will check.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22488
  
Hi, @jiangxb1987 .
Could you review (and merge) this PR?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19601: [SPARK-22383][SQL] Generate code to directly get value o...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19601
  
Hi, @kiszk . Is this still valid?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-02 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/22610
  
Thanks @BryanCutler! Yes, this should not be a bug but is used as a warning 
to users that there might be some type conversion they are not noticed at first 
glance on the Pandas UDFs. For now the conversion is silently done behind the 
scene and as the case in the JIRA shows it might not be easily noticed that 
Pandas.Series from UDFs isn't matched with defined UDFs' return types.





---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22598: [SPARK-25501][SS] Add kafka delegation token support.

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22598
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96871/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22598: [SPARK-25501][SS] Add kafka delegation token support.

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22598
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22598: [SPARK-25501][SS] Add kafka delegation token support.

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22598
  
**[Test build #96871 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96871/testReport)**
 for PR 22598 at commit 
[`2a79d59`](https://github.com/apache/spark/commit/2a79d598a3789def989454560d9efce5664318f0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_json

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22237
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_json

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22237
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96873/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_json

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22237
  
**[Test build #96873 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96873/testReport)**
 for PR 22237 at commit 
[`7e7af88`](https://github.com/apache/spark/commit/7e7af88cba38d8f5c1dbb9c1fc8204cada9d0f69).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class YarnShuffleServiceMetrics implements MetricsSource `
  * `case class SchemaOfJson(`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22488
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3643/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22488
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22047: [SPARK-19851] Add support for EVERY and ANY (SOME) aggre...

2018-10-02 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22047
  
Let me post something I wrote recently. Could you add test cases to ensure 
that we do not break the  "Ignore NULLs" policy

> All the set/aggregate functions ignore NULLs. The typical built-in 
Set/Aggregate functions are AVG, COUNT, MAX, MIN, SUM, GROUPING. 

> Note, COUNT(*) is actually equivalent to COUNT(1). Thus, it still 
includes rows containing null.

> Tip, because of the "Ignore NULLs" policy, Sum(a) + Sum(b) is not the 
same as Sum(a+b). 

> Note, although the set functions follow the "Ignore NULLs" policy, MIN, 
MAX, SUM AVG, EVERY, ANY and SOME returns NULL if 1) every value is NULL or 2) 
SELECT returns no row at all. COUNT never returns NULL. 

> TODO: When a set function eliminates NULLs, Spark SQL does not follow 
others to issue a warning message SQLSTATE 01003 "null value eliminated in set 
function".

> TODO: Check whether all the expressions that extend AggregateFunction 
follow the "Ignore NULLs" policy. If not, we need more investigation to see 
whether we should correct them.

> TODO: When Spark SQL supports EVERY, ANY, and SOME, they follow the same 
"Ignore NULLs" policy.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22613: [SPARK-25583][DOC][BRANCH-2.3]Add history-server related...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22613
  
ok to test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22488
  
**[Test build #96880 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96880/testReport)**
 for PR 22488 at commit 
[`27c6493`](https://github.com/apache/spark/commit/27c649337e326aa8081afcd9de1a7097ea2788e2).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22615: [SPARK-25016][BUILD][CORE] Remove support for Hadoop 2.6

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22615
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22615: [SPARK-25016][BUILD][CORE] Remove support for Hadoop 2.6

2018-10-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22615
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96867/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22615: [SPARK-25016][BUILD][CORE] Remove support for Hadoop 2.6

2018-10-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22615
  
**[Test build #96867 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96867/testReport)**
 for PR 22615 at commit 
[`3b313bb`](https://github.com/apache/spark/commit/3b313bb83c84429b4d5840055523b1ca48489d19).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...

2018-10-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22488
  
@wangyum . Could you review and merge 
https://github.com/wangyum/spark/pull/13 ?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   3   4   5   >