[GitHub] spark pull request #22246: [WIP] [SPARK-25235] [SHELL] Merge the REPL code i...

2018-08-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22246#discussion_r213164929 --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala --- @@ -124,6 +141,26 @@ class SparkILoop(in0: Option[BufferedReader], out

[GitHub] spark issue #22162: [spark-24442][SQL] Added parameters to control the defau...

2018-08-27 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22162 Should we wait a few days for @AndrewKL? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_...

2018-08-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22237#discussion_r212928747 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -554,18 +554,22 @@ case class JsonToStructs

[GitHub] spark issue #22239: [SPARK-19355][SQL][Followup] Remove the child.outputOrde...

2018-08-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22239 @hvanhovell ah, updated. also updated the PR description too.

[GitHub] spark issue #22239: [SPARK-19355][SQL][Followup] Remove the child.outputPart...

2018-08-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22239 @maropu Thanks. I just added it.

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212856674 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3237,20 @@ class Dataset[T] private[sql]( files.toSet.toArray

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212854255 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3237,6 +3237,20 @@ class Dataset[T] private[sql]( files.toSet.toArray

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212856186 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -289,6 +289,14 @@ private

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212854351 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -329,49 +329,52 @@ abstract class SparkPlan extends QueryPlan

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r212854113 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -329,49 +329,52 @@ abstract class SparkPlan extends QueryPlan

[GitHub] spark issue #22239: [SPARK-19355][SQL][Followup] Remove the child.outputPart...

2018-08-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22239 Sure, thank you @hvanhovell

[GitHub] spark pull request #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22237#discussion_r212847031 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala --- @@ -554,18 +554,22 @@ case class JsonToStructs

[GitHub] spark issue #22239: [SPARK-19355][SQL][Followup] Remove the child.outputPart...

2018-08-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22239 @hvanhovell I can set `spark.sql.limit.flatGlobalLimit` to false to match `TakeOrderedAndProjectExec` semantics at the beginning of `TakeOrderedAndProjectSuite`. Or do you prefer to add an explicit

[GitHub] spark issue #22239: [SPARK-19355][SQL][Followup] Remove the child.outputPart...

2018-08-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22239 > @viirya did you try to run TakeOrderedAndProjectSuite? I am pretty sure that will fail now ;)... Not yet. Let me

[GitHub] spark issue #22239: [SPARK-19355][SQL][Followup] Remove the child.outputPart...

2018-08-26 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22239 cc @hvanhovell

[GitHub] spark pull request #22239: [SPARK-19355][SQL][Followup] Remove the child.out...

2018-08-26 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/22239 [SPARK-19355][SQL][Followup] Remove the child.outputPartitioning check ## What changes were proposed in this pull request? This is based on the discussion https://github.com/apache/spark

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212844439 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode

[GitHub] spark pull request #22234: [SPARK-25241][SQL] Configurable empty values when...

2018-08-26 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22234#discussion_r212825429 --- Diff: python/pyspark/sql/readwriter.py --- @@ -345,11 +345,11 @@ def text(self, paths, wholetext=False, lineSep=None): @since(2.0) def

[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22197#discussion_r212812744 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -1021,6 +1022,116 @@ class

[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22197#discussion_r212812771 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala --- @@ -44,7 +45,12 @@ private[parquet] class

[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22197#discussion_r212812718 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -1021,6 +1022,116 @@ class

[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22197#discussion_r212812673 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala --- @@ -350,25 +356,38 @@ private[parquet] class

[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22197#discussion_r212812755 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -1021,6 +1022,116 @@ class

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212811618 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -93,25 +96,93 @@ trait BaseLimitExec extends UnaryExecNode

[GitHub] spark pull request #16677: [SPARK-19355][SQL] Use map output statistics to i...

2018-08-25 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16677#discussion_r212792753 --- Diff: sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-limit.sql --- @@ -1,6 +1,9 @@ -- A test suite for IN LIMIT in parent

[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22227#discussion_r212783356 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -232,30 +232,41 @@ case class RLike(left

[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22227#discussion_r212783481 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -232,30 +232,41 @@ case class RLike(left

[GitHub] spark pull request #22227: [SPARK-25202] [Core] Implements split with limit ...

2018-08-24 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22227#discussion_r212783375 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala --- @@ -232,30 +232,41 @@ case class RLike(left

[GitHub] spark issue #22229: [MINOR] Fix Scala 2.12 build

2018-08-24 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22229 retest this please.

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-24 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21732 retest this please.

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-24 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21732 retest this please.

[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-08-24 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22171 > How can this happen? When Spark writes decimal out, the external systems will get decimal values, not string values. I have the same quest

[GitHub] spark pull request #21732: [SPARK-24762][SQL] Enable Option of Product encod...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21732#discussion_r212520685 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/TypedAggregateExpression.scala --- @@ -19,25 +19,85 @@ package

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-23 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21732 @cloud-fan I made an attempt to remove the `topLevel` parameter. The approach is to flatten the serializers and deserializer at `TypedAggregateExpression`, so users are not aware of the difference when using

[GitHub] spark pull request #22206: SPARK-25213: Add project to v2 scans before pytho...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22206#discussion_r212497182 --- Diff: python/pyspark/sql/tests.py --- @@ -6394,6 +6394,17 @@ def test_invalid_args(self): df.withColumn('mean_v', mean_udf(df['v

[GitHub] spark pull request #22206: SPARK-25213: Add project to v2 scans before pytho...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22206#discussion_r212496291 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala --- @@ -130,10 +133,22 @@ object

[GitHub] spark pull request #22206: SPARK-25213: Add project to v2 scans before pytho...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22206#discussion_r212488758 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala --- @@ -130,10 +133,22 @@ object

[GitHub] spark pull request #22206: SPARK-25213: Add project to v2 scans before pytho...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22206#discussion_r212488439 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala --- @@ -130,10 +133,22 @@ object

[GitHub] spark pull request #22206: SPARK-25213: Add project to v2 scans before pytho...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22206#discussion_r212484229 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala --- @@ -130,10 +133,22 @@ object

[GitHub] spark pull request #22157: [SPARK-25126][SQL] Avoid creating Reader for all ...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22157#discussion_r212475271 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala --- @@ -79,9 +79,10 @@ object OrcUtils extends Logging

[GitHub] spark pull request #22157: [SPARK-25126][SQL] Avoid creating Reader for all ...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22157#discussion_r212304433 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala --- @@ -562,20 +562,57 @@ abstract class OrcQueryTest

[GitHub] spark pull request #22171: [SPARK-25177][SQL] When dataframe decimal type co...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22171#discussion_r212289525 --- Diff: sql/core/src/test/resources/sql-tests/results/literals.sql.out --- @@ -197,7 +197,7 @@ select .e3 -- !query 20 select 1E309, -1E309

[GitHub] spark pull request #22171: [SPARK-25177][SQL] When dataframe decimal type co...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22171#discussion_r212244913 --- Diff: sql/core/src/test/resources/sql-tests/results/literals.sql.out --- @@ -197,7 +197,7 @@ select .e3 -- !query 20 select 1E309, -1E309

[GitHub] spark pull request #22157: [SPARK-25126][SQL] Avoid creating Reader for all ...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22157#discussion_r212229343 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala --- @@ -79,9 +79,10 @@ object OrcUtils extends Logging

[GitHub] spark pull request #22157: [SPARK-25126][SQL] Avoid creating Reader for all ...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22157#discussion_r212225224 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala --- @@ -562,20 +562,57 @@ abstract class OrcQueryTest

[GitHub] spark pull request #22157: [SPARK-25126][SQL] Avoid creating Reader for all ...

2018-08-23 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22157#discussion_r212223394 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala --- @@ -79,9 +79,10 @@ object OrcUtils extends Logging

[GitHub] spark issue #22187: [SPARK-25178][SQL] Directly ship the StructType objects ...

2018-08-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22187 +1 LGTM

[GitHub] spark pull request #22171: [SPARK-25177][SQL] When dataframe decimal type co...

2018-08-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22171#discussion_r212165658 --- Diff: sql/core/src/test/resources/sql-tests/results/literals.sql.out --- @@ -197,7 +197,7 @@ select .e3 -- !query 20 select 1E309, -1E309

[GitHub] spark pull request #22171: [SPARK-25177][SQL] When dataframe decimal type co...

2018-08-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22171#discussion_r212165521 --- Diff: sql/core/src/test/resources/sql-tests/results/higher-order-functions.sql.out --- @@ -201,6 +201,7 @@ struct<> -- !query 20

[GitHub] spark issue #22187: [SPARK-25178][SQL] change the generated code of the keyS...

2018-08-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22187 Yeah, I agree with @rednaxelafx that directly shipping the StructType objects looks like a better solution. +1

[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...

2018-08-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22181 cc @zsxwing @cloud-fan

[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...

2018-08-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22181 retest this please.

[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-08-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22171 So this is an issue only related to `Dataset.show`?

[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...

2018-08-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22182 retest this please.

[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16478 retest this please.

[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...

2018-08-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22181 retest this please.

[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16478 retest this please.

[GitHub] spark pull request #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.col...

2018-08-21 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/22181 [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection.ExternalAppendOnlyMapSuiteCheck ## What changes were proposed in this pull request? `ExternalAppendOnlyMapSuiteCheck` test is flaky

[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21859 retest this please.

[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22152 retest this please.

[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16478 retest this please.

[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20146 This is not urgent, but if you have time, can you help review this? @dbtsai

[GitHub] spark pull request #19756: [SPARK-22527][SQL] Reuse coordinated exchanges if...

2018-08-21 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/19756

[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16478 retest this please.

[GitHub] spark pull request #22168: [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Joi...

2018-08-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22168#discussion_r211619140 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -1058,31 +1064,37 @@ private class

[GitHub] spark pull request #22168: [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Joi...

2018-08-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22168#discussion_r211607297 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -975,8 +979,10 @@ private class

[GitHub] spark pull request #22168: [SPARK-24985][SQL][WIP] Fix OOM in Full Outer Joi...

2018-08-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22168#discussion_r211609440 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -1028,23 +1034,23 @@ private class

[GitHub] spark pull request #22152: [SPARK-25159][SQL] json schema inference should o...

2018-08-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22152#discussion_r211599957 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala --- @@ -69,10 +70,17 @@ private[sql] object JsonInferSchema

[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22024 @vincent-grosbois I haven't looked into this change yet. Do you have a reply to @dbtsai's comment https://github.com/apache/spark/pull/22024#issuecomment-412659532

[GitHub] spark issue #22024: [SPARK-25034][CORE] Remove allocations in onBlockFetchSu...

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22024 retest this please.

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211506399 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallbackSuite.scala --- @@ -40,4 +55,13

[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22154 retest this please.

[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22154 retest this please.

[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21859 If this optimization is done more generally, will the implicitly cached data cause memory pressure on the driver, as it seems we don't have a way to release them

[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22163 What do you mean by `only one record is written to a buffer each time`? Isn't it controlled by `diskWriteBufferSize` to write such size of data each time

[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22154 LGTM

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211446570 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala --- @@ -180,7 +180,10 @@ object UnsafeProjection

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211443168 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala --- @@ -180,7 +180,10 @@ object UnsafeProjection

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211443069 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala --- @@ -63,7 +49,10 @@ abstract

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211432763 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala --- @@ -63,7 +49,10 @@ abstract

[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22151 LGTM

[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

2018-08-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22151#discussion_r211246362 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -201,13 +201,11 @@ class AvroSerializer(rootCatalystType

[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

2018-08-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22151#discussion_r211246064 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroLogicalTypeSuite.scala --- @@ -179,7 +192,7 @@ class AvroLogicalTypeSuite extends

[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

2018-08-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22151#discussion_r211243540 --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala --- @@ -201,13 +201,11 @@ class AvroSerializer(rootCatalystType

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21732 > if we can totally hide the topLevel parameter from users, it should be fine. Generally I think the behavior is consistent, now Option[Product] is always a struct type column. But we n

[GitHub] spark pull request #22152: [SPARK-25159][SQL] json schema inference should o...

2018-08-20 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22152#discussion_r211189624 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala --- @@ -69,10 +70,17 @@ private[sql] object JsonInferSchema

[GitHub] spark issue #22150: [SPARK-25144][SQL][TEST][BRANCH-2.3] Free aggregate map ...

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22150 LGTM

[GitHub] spark issue #22150: [SPARK-25144][SQL][TEST][BRANCH-2.3] Free aggregate map ...

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22150 retest this please.

[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...

2018-08-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21859 retest this please.

[GitHub] spark pull request #21859: [SPARK-24900][SQL]Speed up sort when the dataset ...

2018-08-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21859#discussion_r211122674 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1207,6 +1207,13 @@ object SQLConf { .intConf

[GitHub] spark pull request #21859: [SPARK-24900][SQL]Speed up sort when the dataset ...

2018-08-19 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21859#discussion_r211122076 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -155,6 +156,8 @@ class RangePartitioner[K : Ordering : ClassTag, V

[GitHub] spark pull request #21931: [SPARK-24978][SQL]Add spark.sql.fast.hash.aggrega...

2018-08-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21931#discussion_r211089742 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/AggregateBenchmark.scala --- @@ -366,6 +366,43 @@ class AggregateBenchmark

[GitHub] spark issue #21931: [SPARK-24978][SQL]Add spark.sql.fast.hash.aggregate.row....

2018-08-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21931 Minor comments. LGTM.

[GitHub] spark pull request #21931: [SPARK-24978][SQL]Add spark.sql.fast.hash.aggrega...

2018-08-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21931#discussion_r211089705 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/AggregateBenchmark.scala --- @@ -366,6 +366,43 @@ class AggregateBenchmark

[GitHub] spark pull request #21931: [SPARK-24978][SQL]Add spark.sql.fast.hash.aggrega...

2018-08-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21931#discussion_r211089695 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1437,6 +1437,16 @@ object SQLConf { .intConf

[GitHub] spark pull request #21931: [SPARK-24978][SQL]Add spark.sql.fast.hash.aggrega...

2018-08-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21931#discussion_r211089716 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/AggregateBenchmark.scala --- @@ -366,6 +366,43 @@ class AggregateBenchmark

[GitHub] spark issue #20232: [SPARK-23042][ML] Use OneHotEncoderModel to encode label...

2018-08-17 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20232 Thanks @dbtsai

[GitHub] spark issue #20232: [SPARK-23042][ML] Use OneHotEncoderModel to encode label...

2018-08-17 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20232 retest this please.

[GitHub] spark issue #20232: [SPARK-23042][ML] Use OneHotEncoderModel to encode label...

2018-08-17 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20232 ping @dbtsai do you have time to review this too?

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-17 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21732 From the above, I think the aggregator encoder for `Option[Product]` might be a bit tricky to use for users, since they might need to know the difference between `topLevel = true` and `topLevel
