[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22429 I took a super quick pass - the change actually looks quite okay in general to me. --- - To unsubscribe, e-mail: reviews

[GitHub] spark pull request #22429: [SPARK-25440][SQL] Dumping query execution info t...

2018-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22429#discussion_r232604420 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -176,9 +176,9 @@ case class TakeOrderedAndProjectExec

[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22429 @MaxGekk, a couple of questions about its implementation from a cursory look. The implementation is complicated here: 1. it tries to use writer and avoid to construct

[GitHub] spark pull request #22779: [SPARK-25786][CORE]If the ByteBuffer.hasArray is ...

2018-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22779#discussion_r232587038 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -338,7 +338,7 @@ private[spark] class KryoSerializerInstance(ks

[GitHub] spark issue #23011: [SPARK-26013][R][BUILD] Upgrade R tools version from 3.4...

2018-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23011 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22809: [SPARK-19851][SQL] Add support for EVERY and ANY ...

2018-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22809#discussion_r232564281 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Max.scala --- @@ -57,3 +57,34 @@ case class Max(child

[GitHub] spark pull request #22693: [SPARK-25701][SQL] Supports calculation of table ...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22693#discussion_r232556859 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -115,26 +116,45 @@ class ResolveHiveSerdeTable(session

[GitHub] spark pull request #23012: [SPARK-26014][R] Deprecate R prior to version 3.4...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23012#discussion_r232549234 --- Diff: docs/index.md --- @@ -31,7 +31,8 @@ Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS). It's easy locally on one

[GitHub] spark pull request #23012: [SPARK-26014][R] Deprecate R prior to version 3.4...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23012#discussion_r232549053 --- Diff: docs/index.md --- @@ -31,7 +31,8 @@ Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS). It's easy locally on one

[GitHub] spark pull request #22939: [SPARK-25446][R] Add schema_of_json() and schema_...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22939#discussion_r232540931 --- Diff: R/pkg/R/functions.R --- @@ -2230,6 +2237,32 @@ setMethod("from_json", signature(x = "Column", schema = "

[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23012 adding @srowen too.

[GitHub] spark issue #23012: [SPARK-26014][R] Deprecate R prior to version 3.4 in Spa...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23012 Tests probably will fail since it produces warnings. cc @felixcheung. @shaneknapp, @viirya, @shivaram, @falaki, @mengxr, @yanboliang FYI. This PR is made per http://apache

[GitHub] spark pull request #23012: [SPARK-26014][R] Deprecate R prior to version 3.4...

2018-11-11 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/23012 [SPARK-26014][R] Deprecate R prior to version 3.4 in SparkR ## What changes were proposed in this pull request? This PR proposes to bump up the minimum versions of R from 3.1 to 3.4

[GitHub] spark issue #23011: [SPARK-26013][R][BUILD] Upgrade R tools version from 3.4...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23011 cc @felixcheung

[GitHub] spark pull request #23011: [SPARK-26013][R][BUILD] Upgrade R tools version f...

2018-11-11 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/23011 [SPARK-26013][R][BUILD] Upgrade R tools version from 3.4.0 to 3.5.1 in AppVeyor build ## What changes were proposed in this pull request? R tools 3.5.1 was released a few months ago

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix barrier task run without Barr...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22962 Makes sense to me in general.

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix barrier task run witho...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22962#discussion_r232528655 --- Diff: python/pyspark/tests.py --- @@ -618,10 +618,13 @@ def test_barrier_with_python_worker_reuse(self): """

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22954 Yea .. I will do the follow-up work right away after this one gets merged. Thanks @felixcheung. Let me address the rest of the comments, and wait for the Arrow release. @BryanCutler BTW, do

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232525184 --- Diff: R/pkg/tests/fulltests/test_sparkSQL.R --- @@ -307,6 +307,64 @@ test_that("create DataFrame from RDD", { unsetH

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232525068 --- Diff: R/pkg/tests/fulltests/test_sparkSQL.R --- @@ -307,6 +307,64 @@ test_that("create DataFrame from RDD", { unsetH

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r232520110 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -149,8 +156,8 @@ class UnivocityParser

[GitHub] spark issue #23008: [SPARK-22674][PYTHON] Removed the namedtuple pickling pa...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23008 ok to test

[GitHub] spark pull request #23006: [SPARK-26007][SQL] DataFrameReader.csv() respects...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23006#discussion_r232494906 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -491,7 +491,8 @@ class DataFrameReader private[sql](sparkSession

[GitHub] spark pull request #22880: [SPARK-25407][SQL] Ensure we pass a compatible pr...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22880#discussion_r232489729 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala --- @@ -202,8 +204,12 @@ private

[GitHub] spark pull request #22880: [SPARK-25407][SQL] Ensure we pass a compatible pr...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22880#discussion_r232489624 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala --- @@ -130,8 +130,8 @@ private[parquet

[GitHub] spark pull request #22880: [SPARK-25407][SQL] Ensure we pass a compatible pr...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22880#discussion_r232489418 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala --- @@ -182,18 +182,20 @@ private

[GitHub] spark pull request #22880: [SPARK-25407][SQL] Ensure we pass a compatible pr...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22880#discussion_r232489371 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala --- @@ -93,13 +141,14 @@ private[parquet

[GitHub] spark pull request #22880: [SPARK-25407][SQL] Ensure we pass a compatible pr...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22880#discussion_r232489340 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala --- @@ -93,13 +141,14 @@ private[parquet

[GitHub] spark issue #22880: [SPARK-25407][SQL] Ensure we pass a compatible pruned sc...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22880 Looks good. I or someone else should take a closer look before getting this in.

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r232487851 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -149,8 +156,8 @@ class UnivocityParser

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r232487778 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -149,8 +156,8 @@ class UnivocityParser

[GitHub] spark issue #23006: [SPARK-26007][SQL] DataFrameReader.csv() respects to spa...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23006 Looks good otherwise. I or someone else should take a closer look.

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22954 Hm .. CRAN passed locally for me. Let me work around it for now.

[GitHub] spark issue #22973: [SPARK-25972][PYTHON] Missed JSON options in streaming.p...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22973 Merged to master.

[GitHub] spark pull request #23006: [SPARK-26007][SQL] DataFrameReader.csv() respects...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23006#discussion_r232487030 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -491,7 +491,8 @@ class DataFrameReader private[sql](sparkSession

[GitHub] spark issue #22305: [SPARK-24561][SQL][Python] User-defined window aggregati...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22305 @icexelloss, while we are here, mind fixing the example in the PR description as a self-contained workable example

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r232486778 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide ## Upgrading From Spark SQL 2.4

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r232486670 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -149,8 +156,8 @@ class UnivocityParser

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r232486599 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala --- @@ -104,6 +105,12 @@ class UnivocityParser

[GitHub] spark pull request #23006: [SPARK-26007][SQL] DataFrameReader.csv() respects...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/23006#discussion_r232486474 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -491,7 +491,8 @@ class DataFrameReader private[sql](sparkSession

[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22905#discussion_r232486218 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -306,7 +306,15 @@ case class FileSourceScanExec

[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22905#discussion_r232486141 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -306,7 +306,15 @@ case class FileSourceScanExec

[GitHub] spark issue #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send u...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22275 Thanks for asking me. Will take a look within a few days. For clarification, don't block because of me.

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232485612 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -27,17 +27,62 @@ import

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232485777 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/WindowInPandasExec.scala --- @@ -73,68 +118,151 @@ case class

[GitHub] spark issue #22305: [SPARK-24561][SQL][Python] User-defined window aggregati...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22305 adding @hvanhovell

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232485476 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -63,7 +65,7 @@ private[spark] object PythonEvalType

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232485435 --- Diff: python/pyspark/worker.py --- @@ -154,6 +154,47 @@ def wrapped(*series): return lambda *a: (wrapped(*a), arrow_return_type

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232485374 --- Diff: python/pyspark/sql/tests.py --- @@ -89,6 +89,7 @@ from pyspark.sql.types import _merge_type from pyspark.tests import QuietTest

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232485348 --- Diff: python/pyspark/sql/tests.py --- @@ -7064,12 +7098,104 @@ def test_invalid_args(self): foo_udf = pandas_udf(lambda x: x

[GitHub] spark pull request #22305: [SPARK-24561][SQL][Python] User-defined window ag...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22305#discussion_r232485316 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala --- @@ -63,7 +65,7 @@ private[spark] object PythonEvalType

[GitHub] spark issue #22305: [SPARK-24561][SQL][Python] User-defined window aggregati...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22305 @icexelloss, let's take the NumPy discussion out of this PR. It's a much bigger scope than this.

[GitHub] spark pull request #22938: [SPARK-25935][SQL] Prevent null rows from JSON pa...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22938#discussion_r232484880 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -550,15 +550,33 @@ case class

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r232484798 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CsvExpressionsSuite.scala --- @@ -226,4 +227,17 @@ class

[GitHub] spark pull request #22979: [SPARK-25977][SQL] Parsing decimals from CSV usin...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22979#discussion_r232484751 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -9,6 +9,8 @@ displayTitle: Spark SQL Upgrading Guide ## Upgrading From Spark SQL 2.4

[GitHub] spark pull request #22973: [SPARK-25972][PYTHON] Missed JSON options in stre...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22973#discussion_r232484720 --- Diff: python/pyspark/sql/streaming.py --- @@ -467,11 +468,18 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None

[GitHub] spark issue #20788: [SPARK-23647][PYTHON][SQL] Adds more types for hint in p...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20788 ok to test

[GitHub] spark issue #20788: [SPARK-23647][PYTHON][SQL] Adds more types for hint in p...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20788 Simply calling it should be enough. See https://github.com/apache/spark/pull/21649/files

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22954 Let me hide some comments that are addressed (it looks messy). Please unhide them if I mistakenly hide some comments that are not addressed yet

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232478997 --- Diff: R/pkg/R/SQLContext.R --- @@ -147,6 +147,55 @@ getDefaultSqlSource <- function() { l[["spark.sql.sources

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232476906 --- Diff: R/pkg/R/SQLContext.R --- @@ -147,6 +147,55 @@ getDefaultSqlSource <- function() { l[["spark.sql.sources

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232475881 --- Diff: R/pkg/R/SQLContext.R --- @@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232475777 --- Diff: R/pkg/R/SQLContext.R --- @@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232475752 --- Diff: R/pkg/R/SQLContext.R --- @@ -147,6 +147,55 @@ getDefaultSqlSource <- function() { l[["spark.sql.sources

[GitHub] spark issue #23001: [INFRA] Close stale PRs

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23001 Merged to master.

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232473761 --- Diff: R/pkg/R/SQLContext.R --- @@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232473723 --- Diff: R/pkg/R/SQLContext.R --- @@ -189,19 +238,67 @@ createDataFrame <- function(data, schema = NULL, samplingRatio = 1.0,

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232473716 --- Diff: R/pkg/R/SQLContext.R --- @@ -172,10 +221,10 @@ getDefaultSqlSource <- function() { createDataFrame <- function(data, schema

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232473705 --- Diff: R/pkg/R/SQLContext.R --- @@ -147,6 +147,55 @@ getDefaultSqlSource <- function() { l[["spark.sql.sources

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232473697 --- Diff: R/pkg/R/SQLContext.R --- @@ -147,6 +147,55 @@ getDefaultSqlSource <- function() { l[["spark.sql.sources

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232473669 --- Diff: R/pkg/R/SQLContext.R --- @@ -147,6 +147,55 @@ getDefaultSqlSource <- function() { l[["spark.sql.sources

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232473643 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -225,4 +226,25 @@ private[sql] object SQLUtils extends Logging

[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22954#discussion_r232473611 --- Diff: R/pkg/R/SQLContext.R --- @@ -147,6 +147,55 @@ getDefaultSqlSource <- function() { l[["spark.sql.sources

[GitHub] spark issue #23001: [INFRA] Close stale PRs

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23001 I took a quick pass. Mind adding those please?: #22539 #22539 #21868 #21514 #21402 #21322 #21257 #20163 #19691 #18697 #18636 #17176

[GitHub] spark issue #18697: [SPARK-16683][SQL] Repeated joins to same table can leak...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18697 Let's close this then.

[GitHub] spark issue #20163: [SPARK-22966][PYTHON][SQL] Python UDFs with returnType=S...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20163 Let's leave this closed.

[GitHub] spark issue #20168: [SPARK-22730][ML] Add ImageSchema support for all OpenCv...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20168 @tomasatdatabricks, mind updating this? Lately I happened to take a look at this a few times. I will try to review

[GitHub] spark issue #20503: [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for R...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20503 ping @ashashwat to update

[GitHub] spark issue #20788: [SPARK-23647][PYTHON][SQL] Adds more types for hint in p...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20788 @DylanGuedes let's add tests.

[GitHub] spark issue #21257: [SPARK-24194] [SQL]HadoopFsRelation cannot overwrite a p...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21257 ping @zheh12 to address comments. I am going to suggest closing this for now while I am identifying PRs to close now

[GitHub] spark issue #21322: [SPARK-24225][CORE] Support closing AutoClosable objects...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21322 Shall we leave this PR closed and start it from a design doc? Let me suggest closing this for now while I am looking through old PRs. @JeetKunDoug, please feel free to create a clone

[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21402 @redsanket, you should close it yourself.

[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21514 I'm going to suggest closing this. The review comments have not been addressed for more than a few months, and there's not much point in keeping inactive PRs. Feel free to take over

[GitHub] spark issue #21514: [SPARK-22860] [SPARK-24621] [Core] [WebUI] - hide key pa...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21514 ping @tooptoop4

[GitHub] spark issue #21868: [SPARK-24906][SQL] Adaptively enlarge split / partition ...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21868 I think we should fix this. Basically the dynamic estimation logic is too flaky, and I think we need this for the current status. Let's not add it for now. While I am revisiting old

[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22060 hey @maryannxue, where are we here? Let's close this if it's going to be inactive for a couple of weeks.

[GitHub] spark issue #22144: [SPARK-24935][SQL] : Problem with Executing Hive UDF's f...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22144 So, 2.4 is out. Where are we? Rereading the comments above, it looks like we should: 1. Find the root cause 2. Officially drop it if the workaround is not easy 3. Fix

[GitHub] spark issue #22184: [SPARK-25132][SQL][DOC] Add migration doc for case-insen...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22184 @seancxmao, so this description of the behaviour change is only valid when we upgrade from Spark 2.3 to 2.4? Then we can add it in `Upgrading From Spark SQL 2.3 to 2.4

[GitHub] spark issue #21363: [SPARK-19228][SQL] Migrate on Java 8 time from FastDateF...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21363 I am going to suggest closing this since it has been inactive for more than a few weeks. It should be good to fix. Let me leave some cc's who might be interested in this just FYI. Feel free to take

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix BarrierTaskContext while pyth...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22962 Please fix the PR title to describe what it fixes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix BarrierTaskContext whi...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22962#discussion_r232448083 --- Diff: python/pyspark/taskcontext.py --- @@ -144,10 +144,19 @@ def __init__(self): """Construct a BarrierTaskContext,

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix BarrierTaskContext whi...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22962#discussion_r232447967 --- Diff: python/pyspark/taskcontext.py --- @@ -144,10 +144,19 @@ def __init__(self): """Construct a BarrierTaskContext,

[GitHub] spark issue #22962: [SPARK-25921][PySpark] Fix BarrierTaskContext while pyth...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22962 @xuanyuanking, mind explaining how and why it happens, rather than what happens, in the PR description? --- - To unsubscribe, e

[GitHub] spark pull request #22962: [SPARK-25921][PySpark] Fix BarrierTaskContext whi...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22962#discussion_r232447862 --- Diff: python/pyspark/tests.py --- @@ -614,6 +614,18 @@ def context_barrier(x): times = rdd.barrier().mapPartitions(f).map

[GitHub] spark issue #22932: [SPARK-25102][SQL] Write Spark version to ORC/Parquet fi...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22932 Double-checked. A late LGTM too --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #22994: [BUILD] refactor dev/lint-python in to something readabl...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22994 I agree with this change. The current script is a total mess - I will try to help take a look when the tests pass. BTW, it would be awesome if the PR description described what this PR tries to fix

[GitHub] spark issue #22963: [SPARK-25962][BUILD][PYTHON] Specify minimum versions fo...

2018-11-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22963 I also agree with @srowen's comment (https://github.com/apache/spark/pull/22963#issuecomment-437133365) --- - To unsubscribe, e

[GitHub] spark issue #22954: [SPARK-25981][R] Enables Arrow optimization from R DataF...

2018-11-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22954 Hey guys, thanks for reviewing! Will address them soon. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #22963: [SPARK-25962][BUILD][PYTHON] Specify minimum versions fo...

2018-11-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22963 OMG, I don't know why I missed these comments. I will read them tomorrow (now it's 6 am and I could get sleep

[GitHub] spark pull request #22989: [SPARK-25986][Build] Banning throw new OutOfMemor...

2018-11-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/22989#discussion_r232289280 --- Diff: scalastyle-config.xml --- @@ -240,6 +240,18 @@ This file is divided into 3 sections: ]]> + --- End d
