[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r223725196 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,51 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources

[GitHub] spark pull request #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_...

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22237#discussion_r223696096 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala --- @@ -547,4 +548,27 @@ class JsonFunctionsSuite extends QueryTest

[GitHub] spark issue #22656: [SPARK-25669][SQL] Check CSV header only when it exists

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22656 Merged to master and branch-2.4. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #22676: [SPARK-25684][SQL] Organize header related codes ...

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22676#discussion_r223594430 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVHeaderChecker.scala --- @@ -0,0 +1,131 @@ +/* + * Licensed

[GitHub] spark pull request #22676: [SPARK-25684][SQL] Organize header related codes ...

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22676#discussion_r223594011 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVHeaderChecker.scala --- @@ -0,0 +1,131 @@ +/* + * Licensed

[GitHub] spark pull request #22676: [SPARK-25684][SQL] Organize header related codes ...

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22676#discussion_r223593838 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -273,44 +274,47 @@ private[csv] object

[GitHub] spark pull request #22676: [SPARK-25684][SQL] Organize header related codes ...

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22676#discussion_r223593894 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVUtils.scala --- @@ -90,6 +89,49 @@ object CSVUtils

[GitHub] spark pull request #22676: [SPARK-25684][SQL] Organize header related codes ...

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22676#discussion_r223593701 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -273,44 +274,47 @@ private[csv] object

[GitHub] spark issue #22676: [SPARK-25684][SQL] Organize header related codes in CSV ...

2018-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22676 cc @cloud-fan and @MaxGekk --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #22676: [SPARK-25684][SQL] Organize header related codes ...

2018-10-09 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/22676 [SPARK-25684][SQL] Organize header related codes in CSV datasource ## What changes were proposed in this pull request? 1. Move `CSVDataSource.makeSafeHeader

[GitHub] spark pull request #22656: [SPARK-25669][SQL] Check CSV header only when it ...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22656#discussion_r223566522 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -505,7 +505,8 @@ class DataFrameReader private[sql](sparkSession

[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r223552437 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,51 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources

[GitHub] spark pull request #22675: [SPARK-25347][ML][DOC] Spark datasource for image...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22675#discussion_r223552386 --- Diff: docs/ml-datasource.md --- @@ -0,0 +1,51 @@ +--- +layout: global +title: Data sources +displayTitle: Data sources

[GitHub] spark pull request #22653: [SPARK-25659][PYTHON][TEST] Test type inference s...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22653#discussion_r223552056 --- Diff: python/pyspark/sql/tests.py --- @@ -1149,6 +1149,75 @@ def test_infer_schema(self): result = self.spark.sql("SELECT l[0].a

[GitHub] spark issue #22615: [SPARK-25016][BUILD][CORE] Remove support for Hadoop 2.6

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22615 I want to see the configurations .. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22237#discussion_r223550790 --- Diff: docs/sql-programming-guide.md --- @@ -1890,6 +1890,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark issue #22653: [SPARK-25659][PYTHON][TEST] Test type inference specific...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22653 Merged to master. Thank you @BryanCutler. Let me try to make sure I take further actions for related items

[GitHub] spark issue #22635: [SPARK-25591][PySpark][SQL] Avoid overwriting deserializ...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22635 Yea, same issue exists in Pandas UDFs too (quickly double checked). This PR fixes it. That code path is rather one same place FYI

[GitHub] spark issue #22653: [SPARK-25659][PYTHON][TEST] Test type inference specific...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22653 Let me open a JIRA after taking some more further looks. I checked some code places and looks it's not easy to support `None` for instance

[GitHub] spark pull request #22653: [SPARK-25659][PYTHON][TEST] Test type inference s...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22653#discussion_r223520528 --- Diff: python/pyspark/sql/tests.py --- @@ -1149,6 +1149,75 @@ def test_infer_schema(self): result = self.spark.sql("SELECT l[0].a

[GitHub] spark pull request #22545: [SPARK-25525][SQL][PYSPARK] Do not update conf fo...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22545#discussion_r223408456 --- Diff: python/pyspark/sql/session.py --- @@ -156,7 +156,7 @@ def getOrCreate(self): default. >&g

[GitHub] spark issue #22657: [SPARK-25670][TEST] Reduce number of tested timezones in...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22657 I agree. We don't test `commons-lang3` library here. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22631: [SPARK-25605][TESTS] Run cast string to timestamp...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22631#discussion_r223286771 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala --- @@ -110,7 +112,7 @@ class CastSuite extends

[GitHub] spark pull request #22631: [SPARK-25605][TESTS] Run cast string to timestamp...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22631#discussion_r223285813 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala --- @@ -110,7 +112,7 @@ class CastSuite extends

[GitHub] spark issue #22655: [SPARK-25666][PYTHON] Internally document type conversio...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22655 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22655: [SPARK-25666][PYTHON] Internally document type conversio...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22655 I am getting this in. This is an ongoing effort and it just documents them internally for now. --- - To unsubscribe, e-mail

[GitHub] spark issue #22669: [SPARK-25677] [Doc] spark.io.compression.codec = org.apa...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22669 Merged to master and branch-2.4. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #22635: [SPARK-25591][PySpark][SQL] Avoid overwriting deserializ...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22635 Merged to master and branch-2.4. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #22667: [SPARK-25673][BUILD] Remove Travis CI which enables Java...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22667 Merged to master and branch-2.4. Thanks @srowen and @felixcheung. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22669: [SPARK-25677] [Doc] spark.io.compression.codec = org.apa...

2018-10-08 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22669 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22655: [WIP][SPARK-25666][PYTHON] Internally document type conv...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22655 Let me make this table for Pandas UDF too and then open another JIRA (or mailing list) to discuss about this further. I need more investigations to propose the desired behaviour targeting 3.0

[GitHub] spark issue #22665: add [openjdk11] to Travis build matrix

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22665 @sullis, this file will be obsolete per https://github.com/apache/spark/pull/22667. This shall be closed

[GitHub] spark pull request #22667: [SPARK-25673][BUILD] Remove Travis CI which enabl...

2018-10-07 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/22667 [SPARK-25673][BUILD] Remove Travis CI which enables Java lint check ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/12980 added Travis CI file

[GitHub] spark issue #22667: [SPARK-25673][BUILD] Remove Travis CI which enables Java...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22667 cc @srowen and @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22659 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22653: [SPARK-25659][PYTHON][TEST] Test type inference specific...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22653 @BryanCutler and @viirya would you mind if I ask to take a look please? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #22655: [SPARK-25666][PYTHON] Internally document type co...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22655#discussion_r223218608 --- Diff: python/pyspark/sql/functions.py --- @@ -2733,6 +2733,33 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE

[GitHub] spark pull request #22655: [SPARK-25666][PYTHON] Internally document type co...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22655#discussion_r223218561 --- Diff: python/pyspark/sql/functions.py --- @@ -2733,6 +2733,33 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE

[GitHub] spark issue #22635: [SPARK-25591][PySpark][SQL] Avoid overwriting deserializ...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22635 Nice catch @viirya LGTM. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22635: [SPARK-25591][PySpark][SQL] Avoid overwriting des...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22635#discussion_r223218449 --- Diff: python/pyspark/accumulators.py --- @@ -109,10 +109,14 @@ def _deserialize_accumulator(aid, zero_value, accum_param): from

[GitHub] spark pull request #22635: [SPARK-25591][PySpark][SQL] Avoid overwriting des...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22635#discussion_r223218361 --- Diff: python/pyspark/sql/tests.py --- @@ -3603,6 +3603,31 @@ def test_repr_behaviors(self): self.assertEquals(None, df

[GitHub] spark issue #22610: [SPARK-25461][PySpark][SQL] Add document for mismatch be...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22610 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22610: [SPARK-25461][PySpark][SQL] Add document for mism...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22610#discussion_r223217249 --- Diff: python/pyspark/sql/functions.py --- @@ -2909,6 +2909,12 @@ def pandas_udf(f=None, returnType=None, functionType=None): can fail

[GitHub] spark pull request #22647: [SPARK-25655] [BUILD] Add -Pspark-ganglia-lgpl to...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22647#discussion_r223186506 --- Diff: external/spark-ganglia-lgpl/src/main/scala/org/apache/spark/metrics/sink/GangliaSink.scala --- @@ -64,11 +64,12 @@ class GangliaSink(val

[GitHub] spark issue #22655: [SPARK-25666][PYTHON] Internally document type conversio...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22655 cc @cloud-fan, @viirya and @BryanCutler, WDYT? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22655: [SPARK-25666][PYTHON] Internally document type co...

2018-10-06 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/22655 [SPARK-25666][PYTHON] Internally document type conversion between Python data and SQL types in normal UDFs ### What changes were proposed in this pull request? We are facing some

[GitHub] spark pull request #22653: [SPARK-25659][PYTHON] Test type inference specifi...

2018-10-06 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/22653 [SPARK-25659][PYTHON] Test type inference specification for createDataFrame in PySpark ## What changes were proposed in this pull request? This PR proposes to specify type inference

[GitHub] spark issue #22640: [SPARK-25621][SPARK-25622][TEST] Reduce test time of Buc...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22640 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r223173806 --- Diff: python/pyspark/sql/session.py --- @@ -252,6 +255,20 @@ def newSession(self): """ return self.__c

[GitHub] spark issue #22619: [SPARK-25600][SQL][MINOR] Make use of TypeCoercion.findT...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22619 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22448: [SPARK-25417][SQL] Improve findTightestCommonType to coe...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22448 @dilipbiswal, can you file another JIRA instead of SPARK-25417 specifically for type coercion? --- - To unsubscribe, e-mail

[GitHub] spark issue #22646: [SPARK-25654][SQL] Support for nested JavaBean arrays, l...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22646 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22649: [SPARK-25644][SS][FOLLOWUP][BUILD] Fix Scala 2.12 build ...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22649 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning wh...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22610#discussion_r223173637 --- Diff: python/pyspark/sql/functions.py --- @@ -2909,6 +2909,11 @@ def pandas_udf(f=None, returnType=None, functionType=None): can fail

[GitHub] spark issue #22227: [SPARK-25202] [SQL] Implements split with limit sql func...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/7 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22647: [SPARK-25655] [BUILD] Add -Pspark-ganglia-lgpl to the sc...

2018-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22647 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22636: [SPARK-25629][TEST] Reduce ParquetFilterSuite: filter pu...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22636 Yea, it is not obvious and only few seconds - might not be so worth. But looks improvement because it fixes the test cases to test what the previous PR targeted. Wouldn't it be better just

[GitHub] spark issue #22647: [SPARK-25655] [BUILD] Add -Pspark-ganglia-lgpl to the sc...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22647 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22640: [SPARK-25621][SPARK-25622][TEST] Reduce test time of Buc...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22640 cc @cloud-fan too for sure. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #22640: [SPARK-25621][SPARK-25622][TEST] Reduce test time...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22640#discussion_r222975187 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala --- @@ -66,7 +66,7 @@ abstract class BucketedReadSuite extends

[GitHub] spark pull request #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22237#discussion_r222971397 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/FailureSafeParser.scala --- @@ -15,50 +15,51 @@ * limitations under

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r222959106 --- Diff: python/pyspark/sql/session.py --- @@ -252,6 +255,20 @@ def newSession(self): """ return self.__c

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r222958780 --- Diff: python/pyspark/sql/session.py --- @@ -252,6 +255,20 @@ def newSession(self): """ return self.__c

[GitHub] spark issue #21596: [SPARK-24601] Update Jackson to 2.9.6

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21596 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22633: [SPARK-25644][SS]Fix java foreachBatch in DataStreamWrit...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22633 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22227: [SPARK-25202] [SQL] Implements split with limit sql func...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/7 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_...

2018-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22237#discussion_r222926650 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ParseMode.scala --- @@ -51,6 +56,8 @@ object ParseMode extends Logging

[GitHub] spark issue #22635: [SPARK-25591][PySpark][SQL] Avoid overwriting deserializ...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22635 Thanks for cc'ing me. Will take a look this week. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #22633: [SPARK-25644][SS]Fix java foreachBatch in DataStreamWrit...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22633 Looks fine to me --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #22627: [SPARK-25639] [DOCS] Added docs for foreachBatch,...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22627#discussion_r222894056 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1989,22 +2026,211 @@ head(sql("select * from aggre

[GitHub] spark pull request #22627: [SPARK-25639] [DOCS] Added docs for foreachBatch,...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22627#discussion_r222894110 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1989,22 +2026,211 @@ head(sql("select * from aggre

[GitHub] spark pull request #22627: [SPARK-25639] [DOCS] Added docs for foreachBatch,...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22627#discussion_r222893790 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1989,22 +2026,211 @@ head(sql("select * from aggre

[GitHub] spark pull request #22627: [SPARK-25639] [DOCS] Added docs for foreachBatch,...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22627#discussion_r222894028 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1989,22 +2026,211 @@ head(sql("select * from aggre

[GitHub] spark pull request #22627: [SPARK-25639] [DOCS] Added docs for foreachBatch,...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22627#discussion_r222893841 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1989,22 +2026,211 @@ head(sql("select * from aggre

[GitHub] spark pull request #22627: [SPARK-25639] [DOCS] Added docs for foreachBatch,...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22627#discussion_r222893720 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1989,22 +2026,211 @@ head(sql("select * from aggre

[GitHub] spark pull request #22627: [SPARK-25639] [DOCS] Added docs for foreachBatch,...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22627#discussion_r222893572 --- Diff: docs/structured-streaming-programming-guide.md --- @@ -1989,22 +2026,211 @@ head(sql("select * from aggre

[GitHub] spark pull request #22633: [SPARK-25644][SS]Fix java foreachBatch in DataStr...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22633#discussion_r222893318 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/streaming/JavaDataStreamReaderWriterSuite.java --- @@ -0,0 +1,89 @@ +/* +* Licensed

[GitHub] spark pull request #22633: [SPARK-25644][SS]Fix java foreachBatch in DataStr...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22633#discussion_r222893114 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/streaming/JavaDataStreamReaderWriterSuite.java --- @@ -0,0 +1,89 @@ +/* +* Licensed

[GitHub] spark issue #22619: [SPARK-25600][SQL][MINOR] Make use of TypeCoercion.findT...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22619 Yup. Let me leave this open few more days in case. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #22637: [SPARK-25408] Move to mode ideomatic Java8

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22637 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22637: [SPARK-25408] Move to mode ideomatic Java8

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22637 mind filling PR description please? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22399: [SPARK-25408] Move to mode ideomatic Java8

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22399 Let me trigger it in the next PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #22399: [SPARK-25408] Move to mode ideomatic Java8

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22399 @Fokko, Let's follow the PR format next time BTW, for instance, "How was this patch tested?" --- - To unsubscri

[GitHub] spark pull request #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning wh...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22610#discussion_r222885910 --- Diff: python/pyspark/sql/functions.py --- @@ -2909,6 +2909,11 @@ def pandas_udf(f=None, returnType=None, functionType=None): can fail

[GitHub] spark pull request #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning wh...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22610#discussion_r222885267 --- Diff: python/pyspark/worker.py --- @@ -84,13 +84,36 @@ def wrap_scalar_pandas_udf(f, return_type): arrow_return_type = to_arrow_type

[GitHub] spark issue #22227: [SPARK-25202] [SQL] Implements split with limit sql func...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/7 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22622: [SPARK-25635][SQL][BUILD] Support selective direct encod...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22622 Which looks now fixed in https://github.com/apache/spark/commit/5ae20cf1a96a33f5de4435fcfb55914d64466525

[GitHub] spark issue #22622: [SPARK-25635][SQL][BUILD] Support selective direct encod...

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22622 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22626 add to whitelist --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22626 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #21596: [SPARK-24601] Update Jackson to 2.9.6

2018-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21596 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22622: [SPARK-25635][SQL][BUILD] Support selective direc...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22622#discussion_r222535553 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala --- @@ -115,6 +116,71 @@ abstract class OrcSuite

[GitHub] spark pull request #22622: [SPARK-25635][SQL][BUILD] Support selective direc...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22622#discussion_r222535182 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala --- @@ -115,6 +116,71 @@ abstract class OrcSuite

[GitHub] spark pull request #22622: [SPARK-25635][SQL][BUILD] Support selective direc...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22622#discussion_r222535254 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcSourceSuite.scala --- @@ -284,4 +350,8 @@ class OrcSourceSuite

[GitHub] spark issue #22593: [Streaming][DOC] Fix typo & format in DataStreamWriter.s...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22593 ping @niofire --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22610 One clear thing looks adding some documentation .. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #22620: [SPARK-25601][PYTHON] Register Grouped aggregate UDF Vec...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22620 Thank you @icexelloss, @gatorsmile, @BryanCutler and @viirya. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22620: [SPARK-25601][PYTHON] Register Grouped aggregate UDF Vec...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22620 Merged to master and branch-2.4. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #22610: [WIP][SPARK-25461][PySpark][SQL] Print warning when retu...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22610 Thanks, @BryanCutler. WDYT about documenting the type map thing? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #22620: [SPARK-25601][PYTHON] Register Grouped aggregate ...

2018-10-03 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22620#discussion_r222417513 --- Diff: python/pyspark/sql/udf.py --- @@ -298,6 +298,15 @@ def register(self, name, f, returnType=None): >>> spark.sq

<    7   8   9   10   11   12   13   14   15   16   >