[GitHub] spark pull request #20521: [SPARK-22977][SQL] fix web UI SQL tab for CTAS

2018-09-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20521#discussion_r217254812 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -56,34 +57,36 @@ case class

[GitHub] spark pull request #20521: [SPARK-22977][SQL] fix web UI SQL tab for CTAS

2018-09-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20521#discussion_r217254430 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -56,34 +57,36 @@ case class

[GitHub] spark pull request #22344: [SPARK-25352][SQL] Perform ordered global limit w...

2018-09-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22344#discussion_r217088628 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -68,22 +68,42 @@ abstract class SparkStrategies extends

[GitHub] spark pull request #22395: [SPARK-16323][SQL] Add IntegralDivide expression

2018-09-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22395#discussion_r217088230 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala --- @@ -314,6 +314,27 @@ case class Divide(left

[GitHub] spark pull request #22395: [SPARK-16323][SQL] Add IntegralDivide expression

2018-09-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22395#discussion_r217087471 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala --- @@ -314,6 +314,27 @@ case class Divide(left

[GitHub] spark issue #22344: [SPARK-25352][SQL] Perform ordered global limit when lim...

2018-09-12 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22344 ping @cloud-fan seems to me we should consider to include this in 2.4. --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #22344: [SPARK-25352][SQL] Perform ordered global limit when lim...

2018-09-12 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22344 retest this please... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark pull request #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nu...

2018-09-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22375#discussion_r216917624 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -223,8 +223,8 @@ trait

[GitHub] spark issue #22344: [SPARK-25352][SQL] Perform ordered global limit when lim...

2018-09-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22344 retest this please.

[GitHub] spark issue #22344: [SPARK-25352][SQL] Perform ordered global limit when lim...

2018-09-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22344 retest this please.

[GitHub] spark pull request #22394: [SPARK-25406][SQL] For ParquetSchemaPruningSuite....

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22394#discussion_r216875341 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -245,28 +249,32 @@ class

[GitHub] spark pull request #22394: [SPARK-25406][SQL] For ParquetSchemaPruningSuite....

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22394#discussion_r216875288 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -245,28 +249,32 @@ class

[GitHub] spark pull request #22394: [SPARK-25406][SQL] For ParquetSchemaPruningSuite....

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22394#discussion_r216875122 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -245,28 +249,32 @@ class

[GitHub] spark pull request #22394: [SPARK-25406][SQL] For ParquetSchemaPruningSuite....

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22394#discussion_r216874181 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -156,20 +156,24 @@ class

[GitHub] spark issue #22394: [SPARK-25406][SQL] For ParquetSchemaPruningSuite.scala, ...

2018-09-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22394 retest this please.

[GitHub] spark issue #22341: [SPARK-24889][Core] Update block info when unpersist rdd...

2018-09-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22341 Thanks @vanzin

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216708046 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -110,7 +110,17 @@ private[sql

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 retest this please.

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216702003 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -110,7 +110,17 @@ private[sql

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216694201 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -155,6 +163,60 @@ class

[GitHub] spark pull request #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nu...

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22375#discussion_r216679499 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -223,8 +223,8 @@ trait

[GitHub] spark pull request #22375: [WIP][SPARK-25388][Test][SQL] Detect incorrect nu...

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22375#discussion_r216676386 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala --- @@ -223,8 +223,8 @@ trait

[GitHub] spark issue #22341: [SPARK-24889][Core] Update block info when unpersist rdd...

2018-09-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22341 retest this please.

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216601125 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -110,7 +110,17 @@ private[sql

[GitHub] spark issue #22341: [SPARK-24889][Core] Update block info when unpersist rdd...

2018-09-11 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22341 @vanzin Thanks for the good review! I've updated this to address them all.

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22341#discussion_r216574255 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -646,7 +647,47 @@ private[spark] class AppStatusListener

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-11 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216565635 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -110,7 +110,17 @@ private[sql

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 FYI, per further checking code and discussion with @dbtsai regarding with predicate pushdown, we know that predicate push down only works for primitive types on Parquet datasource. So both

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22341#discussion_r216514947 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -646,7 +647,47 @@ private[spark] class AppStatusListener

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22341#discussion_r216513548 --- Diff: core/src/main/scala/org/apache/spark/status/LiveEntity.scala --- @@ -538,6 +538,14 @@ private class LiveRDD(val info: RDDInfo) extends LiveEntity

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22341#discussion_r216512185 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -646,7 +647,47 @@ private[spark] class AppStatusListener

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22341#discussion_r216511718 --- Diff: core/src/main/scala/org/apache/spark/status/LiveEntity.scala --- @@ -538,6 +538,14 @@ private class LiveRDD(val info: RDDInfo) extends LiveEntity

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 Btw, this PR isn't intended to address filter push down for schema pruning. I do think it should be another one topic

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 I just read @mallman's comment. Thanks for that. Roughly, my two cents: > IMO, we can get closer to settling the question of relative performance/behavior by pushing down Parquet rea

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 retest this please.

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22341#discussion_r216257678 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -646,7 +647,47 @@ private[spark] class AppStatusListener

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216256434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -199,6 +209,15 @@ private[sql

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216255549 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -155,6 +161,47 @@ class

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r216244620 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -155,6 +161,47 @@ class

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 Thanks @dbtsai and @HyukjinKwon. Your comments are addressed.

[GitHub] spark pull request #19045: [WIP][SPARK-20628][CORE] Keep track of nodes (/ s...

2018-09-09 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19045#discussion_r216153775 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ExecutorLossReason.scala --- @@ -58,3 +58,11 @@ private [spark] object LossReasonPending extends

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 Thanks! @mallman For the first query, I think the query plan produced by your WIP patch is not correct. We don't need to read the `company:struct` from `employer:struct

[GitHub] spark issue #22335: [SPARK-25091][Core] reduce the storage memory in Executo...

2018-09-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22335 Yea, looks like this is duplicate to #22341.

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 retest this please.

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r215899595 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -155,6 +155,30 @@ class

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r215899485 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -196,6 +196,7 @@ private[sql] object

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22357#discussion_r215864442 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -196,6 +196,7 @@ private[sql] object

[GitHub] spark issue #22357: [SPARK-25363][SQL] Fix schema pruning in where clause by...

2018-09-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22357 cc @dbtsai

[GitHub] spark pull request #22357: [SPARK-25363][SQL] Fix schema pruning in where cl...

2018-09-06 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/22357 [SPARK-25363][SQL] Fix schema pruning in where clause by ignoring unnecessary root fields ## What changes were proposed in this pull request? Schema pruning doesn't work if nested

[GitHub] spark issue #22344: [SPARK-25352][SQL] Perform ordered global limit when lim...

2018-09-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22344 cc @hvanhovell @cloud-fan

[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-09-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22171 Scientific notation is more efficient on saving the values in CSV. If there are many zero values of high scale decimal type, this non scientific notation can cost storage space and loading time
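A rough illustration of the storage point in the comment above, sketched with Python's standard `decimal` module as a stand-in; this is not Spark's CSV writer, and the scale of 18 and the helper name `render` are assumptions made only for the example. A zero value of a high-scale decimal is far shorter in scientific notation than in plain notation:

```python
from decimal import Decimal


def render(value):
    """Return (scientific-style, plain) text forms of the same decimal.

    `render` is a hypothetical helper for this sketch, not a Spark API.
    """
    # str() switches to scientific notation for very small exponents;
    # format(..., "f") always writes out every digit of the scale.
    return str(value), format(value, "f")


# A zero with scale 18, similar in spirit to a zero in a decimal(38, 18) column.
zero = Decimal("0E-18")
sci, plain = render(zero)
print(len(sci), len(plain))
```

With many such values in a CSV file, the per-cell difference in length adds up, which appears to be the cost the comment refers to.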

[GitHub] spark issue #22344: [SPARK-25352][SQL] Perform ordered global limit when lim...

2018-09-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22344 retest this please.

[GitHub] spark pull request #22344: [SPARK-25352][SQL] Perform ordered global limit w...

2018-09-05 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/22344 [SPARK-25352][SQL] Perform ordered global limit when limit number is bigger than topKSortFallbackThreshold ## What changes were proposed in this pull request? We have optimization

[GitHub] spark pull request #21525: [SPARK-24513][ML] Attribute support in UnaryTrans...

2018-09-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21525#discussion_r215461652 --- Diff: mllib/src/main/scala/org/apache/spark/ml/Transformer.scala --- @@ -116,10 +116,17 @@ abstract class UnaryTransformer[IN, OUT, T

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22341#discussion_r215300948 --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala --- @@ -55,7 +55,7 @@ class RDDInfo( } private[spark] object RDDInfo

[GitHub] spark issue #22338: [SPARK-25317][CORE] Avoid perf regression in Murmur3 Has...

2018-09-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22338 Yeah, it's interesting... Seems both `checkedCast` and `Platform.getByte` are changed and performance gets gain. Any single change doesn't work

[GitHub] spark pull request #22338: [SPARK-25317][CORE] Avoid perf regression in Murm...

2018-09-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22338#discussion_r215226248 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/hash/Murmur3_x86_32.java --- @@ -69,22 +70,27 @@ public static int hashUnsafeWords(Object base

[GitHub] spark issue #22341: [SPARK-24889][Core] Update block info when unpersist rdd...

2018-09-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22341 retest this please.

[GitHub] spark pull request #22341: [SPARK-24889][Core] Update block info when unpers...

2018-09-05 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/22341 [SPARK-24889][Core] Update block info when unpersist rdds ## What changes were proposed in this pull request? We will update block info coming from executors, at the timing like caching

[GitHub] spark pull request #22338: [SPARK-25317][CORE] Avoid perf regression in Murm...

2018-09-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22338#discussion_r215192739 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/hash/Murmur3_x86_32.java --- @@ -69,22 +70,27 @@ public static int hashUnsafeWords(Object base

[GitHub] spark issue #22330: [SPARK-19355][SQL][FOLLOWUP][TEST] Properly recycle Spar...

2018-09-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22330 LGTM

[GitHub] spark pull request #22330: [SPARK-19355][SQL][FOLLOWUP][TEST] Properly recyc...

2018-09-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22330#discussion_r214903740 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/TakeOrderedAndProjectSuite.scala --- @@ -45,6 +45,7 @@ class TakeOrderedAndProjectSuite

[GitHub] spark pull request #22313: [SPARK-25306][SQL] Avoid skewed filter trees to s...

2018-09-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22313#discussion_r214787227 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala --- @@ -398,6 +398,24 @@ class

[GitHub] spark issue #22317: [SPARK-25310][SQL] ArraysOverlap may throw a Compilation...

2018-09-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22317 LGTM

[GitHub] spark pull request #22317: [SPARK-25310][SQL] ArraysOverlap may throw a Comp...

2018-09-02 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22317#discussion_r214556466 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -1623,12 +1623,13 @@ case class

[GitHub] spark pull request #22313: [SPARK-25306][SQL] Use cache to speed up `createF...

2018-09-02 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22313#discussion_r214556166 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFilters.scala --- @@ -55,19 +59,52 @@ import org.apache.spark.sql.types._ * known

[GitHub] spark pull request #22313: [SPARK-25306][SQL] Use cache to speed up `createF...

2018-09-02 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22313#discussion_r214556040 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFilters.scala --- @@ -55,19 +59,52 @@ import org.apache.spark.sql.types._ * known

[GitHub] spark pull request #22266: [SPARK-25270] lint-python: Add flake8 to find syn...

2018-08-31 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22266#discussion_r214494032 --- Diff: dev/lint-python --- @@ -82,6 +82,25 @@ else rm "$PYCODESTYLE_REPORT_PATH" fi +# stop the build if there are Pyt

[GitHub] spark issue #22297: [SPARK-25290][Core][Test] Reduce the size of acquired ar...

2018-08-31 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22297 @cloud-fan Ok. Changed as suggestion.

[GitHub] spark issue #22197: [SPARK-25207][SQL] Case-insensitve field resolution for ...

2018-08-31 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22197 One minor comment that can be addressed in a follow-up PR. LGTM.

[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-31 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22197#discussion_r214263031 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -1021,6 +1022,113 @@ class

[GitHub] spark issue #22297: [SPARK-25290][Core][Test] Reduce the size of acquired ar...

2018-08-31 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22297 retest this please.

[GitHub] spark issue #22297: [SPARK-25290][Core][Test] Reduce the size of acquired ar...

2018-08-31 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22297 The allocated arrays are put in a HashMap and are iterated after the loop, to retrieve and compare with elements from the `BytesToBytesMap

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-30 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21732 > The only tricky thing is, Product is handled specially in the top level, being flattened into multiple columns. @cloud-fan Compared with Option of Product which is not supported bef

[GitHub] spark issue #22297: [SPARK-25290][Core][Test] Reduce the size of acquired ar...

2018-08-30 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22297 cc @cloud-fan @HyukjinKwon

[GitHub] spark pull request #21968: [SPARK-24999][SQL]Reduce unnecessary 'new' memory...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21968#discussion_r214235758 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/RowBasedHashMapGenerator.scala --- @@ -141,11 +151,8 @@ class

[GitHub] spark pull request #21968: [SPARK-24999][SQL]Reduce unnecessary 'new' memory...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21968#discussion_r214235660 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/RowBasedHashMapGenerator.scala --- @@ -141,11 +151,8 @@ class

[GitHub] spark pull request #22297: [SPARK-25290][Core][Test] Reduce the size of acqu...

2018-08-30 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/22297 [SPARK-25290][Core][Test] Reduce the size of acquired arrays to avoid OOM error ## What changes were proposed in this pull request? `BytesToBytesMapOnHeapSuite`.`randomizedStressTest

[GitHub] spark pull request #22284: [SPARK-25278][SQL] Avoid duplicated Exec nodes wh...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22284#discussion_r214227196 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -497,6 +497,17 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #22284: [SPARK-25278][SQL] Avoid duplicated Exec nodes wh...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22284#discussion_r214227012 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/QueryPlanner.scala --- @@ -81,7 +81,7 @@ abstract class QueryPlanner

[GitHub] spark pull request #22266: [SPARK-25270] lint-python: Add flake8 to find syn...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22266#discussion_r214203205 --- Diff: dev/lint-python --- @@ -82,6 +82,25 @@ else rm "$PYCODESTYLE_REPORT_PATH" fi +# stop the build if there are Pyt

[GitHub] spark pull request #22266: [SPARK-25270] lint-python: Add flake8 to find syn...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22266#discussion_r214200375 --- Diff: dev/lint-python --- @@ -82,6 +82,25 @@ else rm "$PYCODESTYLE_REPORT_PATH" fi +# stop the build if there are Pyt

[GitHub] spark pull request #22270: [SPARK-25267][SQL][TEST] Disable ConvertToLocalRe...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22270#discussion_r213945889 --- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala --- @@ -652,65 +653,66 @@ class ALSSuite extends MLTest

[GitHub] spark pull request #22270: [SPARK-25267][SQL][TEST] Disable ConvertToLocalRe...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22270#discussion_r213939010 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala --- @@ -59,7 +60,8 @@ object TestHive .set

[GitHub] spark pull request #22270: [SPARK-25267][SQL][TEST] Disable ConvertToLocalRe...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22270#discussion_r213938811 --- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala --- @@ -652,65 +653,66 @@ class ALSSuite extends MLTest

[GitHub] spark pull request #22270: [SPARK-25267][SQL][TEST] Disable ConvertToLocalRe...

2018-08-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22270#discussion_r213923292 --- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala --- @@ -24,12 +24,10 @@ import scala.collection.JavaConverters

[GitHub] spark pull request #22266: [SPARK-25270] lint-python: Add flake8 to find syn...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22266#discussion_r213873372 --- Diff: dev/lint-python --- @@ -82,6 +82,26 @@ else rm "$PYCODESTYLE_REPORT_PATH" fi +python -m pip install flake8 --

[GitHub] spark pull request #22253: [SPARK-24411] [SQL] Adding native Java tests for ...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22253#discussion_r213864459 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaColumnExpressionSuite.java --- @@ -0,0 +1,80 @@ +package test.org.apache.spark.sql

[GitHub] spark pull request #22253: [SPARK-24411] [SQL] Adding native Java tests for ...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22253#discussion_r213863748 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/JavaColumnExpressionSuite.java --- @@ -0,0 +1,80 @@ +package test.org.apache.spark.sql

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r213862165 --- Diff: python/pyspark/serializers.py --- @@ -187,9 +187,15 @@ def loads(self, obj): class ArrowStreamSerializer(Serializer

[GitHub] spark pull request #22275: [SPARK-25274][PYTHON][SQL] In toPandas with Arrow...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22275#discussion_r213860328 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3279,34 +3280,33 @@ class Dataset[T] private[sql]( val timeZoneId

[GitHub] spark issue #22266: [SPARK-25270] lint-python: Add flake8 to find syntax err...

2018-08-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22266 Can you also format the PR description to follow the template? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #21732: [SPARK-24762][SQL] Enable Option of Product encoders

2018-08-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21732 retest this please.

[GitHub] spark pull request #22219: [SPARK-25224][SQL] Improvement of Spark SQL Thrif...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22219#discussion_r213608853 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -289,6 +289,14 @@ private

[GitHub] spark pull request #21669: [SPARK-23257][K8S][WIP] Kerberos Support for Spar...

2018-08-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21669#discussion_r213591711 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -164,7 +164,15 @@ private[spark] class SparkSubmit extends Logging

[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...

2018-08-28 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22246 LGTM

[GitHub] spark pull request #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scal...

2018-08-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22246#discussion_r213492833 --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala --- @@ -43,10 +44,26 @@ class SparkILoop(in0: Option[BufferedReader], out: JPrintWriter

[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...

2018-08-28 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22246 retest this please.

[GitHub] spark issue #22246: [SPARK-25235] [SHELL] Merge the REPL code in Scala 2.11 ...

2018-08-28 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22246 retest this please.

[GitHub] spark issue #22246: [WIP] [SPARK-25235] [SHELL] Merge the REPL code in Scala...

2018-08-27 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22246 > @viirya The reflection trick we use in scala.reflect.internal.util.ScalaClassLoader doesn't work when the REPL is called from test. Do you have any idea about it? Thanks. Yeah, it se
