[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-12-03 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23152 LGTM, thanks for the change! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-12-03 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23152 jenkins retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-30 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23152 This reproes it: ``` sql("create table all_null (attr1 int, attr2 int)") sql("insert into all_null values (null, null)") sql(&qu

[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-30 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23152 While at it, could we kill one more potential for a bug? In `FilterEstimation.evaluateBinaryForTwoColumns` there is a ``` attrLeft.dataType match { case StringType

[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-30 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23152 LGTM, thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-30 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/23152#discussion_r237789881 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2276,4 +2276,16 @@ class SQLQuerySuite extends

[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-27 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23127 jenkins retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-27 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23127 jenkins retest this --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-27 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23127 Talked with @hvanhovell offline and set `LocalTableScanExec` and `InputAdapter` to not create an unsafe projection, and `RDDScanExec` and `RowDataSourceScanExec` to always do so, which

[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-27 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23127 @cloud-fan Thanks. Actually, I had to revert earlier updates because the plan no longer changes for LocalTableScanExec that is alone in a wholestagecodegen

[GitHub] spark pull request #23127: [SPARK-26159] Codegen for LocalTableScanExec and ...

2018-11-26 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/23127#discussion_r236395764 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -406,14 +415,62 @@ trait

[GitHub] spark pull request #23127: [SPARK-26159] Codegen for LocalTableScanExec and ...

2018-11-26 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/23127#discussion_r236393985 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -406,14 +415,62 @@ trait

[GitHub] spark pull request #23127: [SPARK-26159] Codegen for LocalTableScanExec and ...

2018-11-26 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/23127#discussion_r236391673 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -406,14 +415,62 @@ trait

[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and RDDScan...

2018-11-26 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23127 @cloud-fan @rednaxelafx Actually, the input to a codegen stage can be an internal row so I can't make the inputRDD be `RDD[UnsafeRow], but the output needs to be UnsafeRow. Doing

[GitHub] spark issue #23127: [SPARK-26159] Codegen for LocalTableScanExec and Existin...

2018-11-23 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23127 cc @hvanhovell @rednaxelafx --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #23127: [SPARK-26159] Codegen for LocalTableScanExec and ...

2018-11-23 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/23127 [SPARK-26159] Codegen for LocalTableScanExec and ExistingRDDExec ## What changes were proposed in this pull request? Implement codegen for LocalTableScanExec and ExistingRDDExec

[GitHub] spark issue #23022: [SPARK-26038] Decimal toScalaBigInt/toJavaBigInteger for...

2018-11-23 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23022 ping @hvanhovell --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #23022: [SPARK-26038] Decimal toScalaBigInt/toJavaBigInteger for...

2018-11-13 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/23022 cc @hvanhovell --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #23022: [SPARK-26038] Decimal toScalaBigInt/toJavaBigInte...

2018-11-13 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/23022 [SPARK-26038] Decimal toScalaBigInt/toJavaBigInteger for decimals not fitting in long ## What changes were proposed in this pull request? Fix Decimal `toScalaBigInt

[GitHub] spark issue #22029: [SPARK-24395][SQL] IN operator should return NULL when c...

2018-10-22 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/22029 IMHO if a new wrapper was justifiable for the IN-subquery in #21403, then it is also justifiable to add one here for the IN-literal-list case

[GitHub] spark issue #22566: [SPARK-25458][SQL] Support FOR ALL COLUMNS in ANALYZE TA...

2018-09-28 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/22566 LGTM. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #22566: [SPARK-25458][SQL] Support FOR ALL COLUMNS in ANALYZE TA...

2018-09-28 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/22566 jenkins retest this --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22209: [SPARK-24415][Core] Fixed the aggregated stage me...

2018-08-31 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/22209#discussion_r214381992 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -350,11 +350,20 @@ private[spark] class AppStatusListener

[GitHub] spark pull request #22209: [SPARK-24415][Core] Fixed the aggregated stage me...

2018-08-31 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/22209#discussion_r214320305 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -350,11 +350,20 @@ private[spark] class AppStatusListener

[GitHub] spark issue #22286: [SPARK-25284] Spark UI: make sure skipped stages are upd...

2018-08-31 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/22286 Thanks pointing to #22209. Closing this one. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #22286: [SPARK-25284] Spark UI: make sure skipped stages ...

2018-08-31 Thread juliuszsompolski
Github user juliuszsompolski closed the pull request at: https://github.com/apache/spark/pull/22286 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark issue #22286: [SPARK-25284] Spark UI: make sure skipped stages are upd...

2018-08-30 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/22286 cc @gatorsmile fyi @dbkerkela --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #22286: [SPARK-25284] Spark UI: make sure skipped stages ...

2018-08-30 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/22286 [SPARK-25284] Spark UI: make sure skipped stages are updated onJobEnd ## What changes were proposed in this pull request? Tiny bug in onJobEnd, not forcing the refresh of skipped

[GitHub] spark issue #22268: [DOC] Fix comment on SparkPlanGraphEdge

2018-08-29 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/22268 cc @gatorsmile just a tiny nit... --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #22268: [DOC] Fix comment on SparkPlanGraphEdge

2018-08-29 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/22268 [DOC] Fix comment on SparkPlanGraphEdge ## What changes were proposed in this pull request? `fromId` is the child, and `toId` is the parent, see line 127

[GitHub] spark issue #21403: [SPARK-24341][SQL] Support only IN subqueries with the s...

2018-07-24 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/21403 Looks good to me, though I'm not very familiar with analyzer. @cloud-fan, @hvanhovell ? --- - To unsubscribe, e

[GitHub] spark issue #21403: [SPARK-24341][WIP][SQL] Support IN subqueries with struc...

2018-05-29 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/21403 @mgaido91 BTW: In SPARK-24395 I would consider the cases to still be valid, because I believe there is no other syntactic way to do a multi-column IN/NOT IN with list of literals

[GitHub] spark issue #21403: [SPARK-24341][WIP][SQL] Support IN subqueries with struc...

2018-05-29 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/21403 @mgaido91 This also works, +1. What about `a in (select (b, c) from ...)` when `a` is a struct? - I guess allow it, but a potential gotcha during implementation

[GitHub] spark issue #21403: [SPARK-24341][WIP][SQL] Support IN subqueries with struc...

2018-05-29 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/21403 I think that the way the columns are defined in the subquery should define the semantics. E.g.: `(a, b) IN (select c, d from ...)` - unpack (a, b) and treat it as a multi column

[GitHub] spark pull request #21228: [SPARK-24171] Adding a note for non-deterministic...

2018-05-03 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21228#discussion_r185792840 --- Diff: R/pkg/R/functions.R --- @@ -3184,6 +3191,7 @@ setMethod("create_map", #' collect(select(df2, collect_lis

[GitHub] spark pull request #21228: [SPARK-24171] Adding a note for non-deterministic...

2018-05-03 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21228#discussion_r185791719 --- Diff: R/pkg/R/functions.R --- @@ -963,6 +964,7 @@ setMethod("kurtosis", #' last(df$c, TRUE) #' } #' @note last s

[GitHub] spark issue #21133: [SPARK-24013][SQL] Remove unneeded compress in Approxima...

2018-04-30 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/21133 Maybe we could add the former test as a benchmark to `AggregateBenchmark`? --- - To unsubscribe, e-mail: reviews

[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...

2018-04-27 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184662359 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala --- @@ -279,4 +282,11 @@ class

[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...

2018-04-27 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184656896 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala --- @@ -279,4 +282,11 @@ class

[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...

2018-04-27 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184654132 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala --- @@ -238,12 +238,6

[GitHub] spark issue #21171: [SPARK-24104] SQLAppStatusListener overwrites metrics on...

2018-04-26 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/21171 cc @gengliangwang @vanzin --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #21171: [SPARK-24104] SQLAppStatusListener overwrites met...

2018-04-26 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/21171 [SPARK-24104] SQLAppStatusListener overwrites metrics onDriverAccumUpdates instead of updating them ## What changes were proposed in this pull request? Event

[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...

2018-04-26 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184343803 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala --- @@ -279,4 +282,10 @@ class

[GitHub] spark pull request #21133: [SPARK-24013][SQL] Remove unneeded compress in Ap...

2018-04-26 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/21133#discussion_r184347998 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala --- @@ -238,12 +238,6

[GitHub] spark issue #20718: [SPARK-23514][FOLLOW-UP] Remove more places using sparkC...

2018-03-02 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20718 jenkins retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #20718: [SPARK-23514][FOLLOW-UP] Remove more places using sparkC...

2018-03-02 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20718 jenkins retest this please ``` org.apache.spark.sql.FileBasedDataSourceSuite.(It is not a test it is a sbt.testing.SuiteSelector

[GitHub] spark issue #20718: [SPARK-23514][FOLLOW-UP] Remove more places using sparkC...

2018-03-02 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20718 jenkins retest this please `hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/spark.git

[GitHub] spark issue #20718: [SPARK-23514][FOLLOW-UP] Remove more places using sparkC...

2018-03-02 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20718 cc @gatorsmile --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #20718: [SPARK-23514][FOLLOW-UP] Remove more places using...

2018-03-02 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/20718 [SPARK-23514][FOLLOW-UP] Remove more places using sparkContext.hadoopConfiguration directly ## What changes were proposed in this pull request? In https://github.com/apache/spark

[GitHub] spark issue #20679: [SPARK-23514] Use SessionState.newHadoopConf() to propag...

2018-02-28 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20679 jenkins retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #20679: [SPARK-23514] Use SessionState.newHadoopConf() to propag...

2018-02-27 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20679 jenkins retest this please Flaky: `sbt.ForkMain$ForkError: java.io.IOException: Failed to delete: /home/jenkins/workspace/SparkPullRequestBuilder/target/tmp/spark-c0f7c3f9-f48a-4bb8

[GitHub] spark pull request #20679: [SPARK-23514] Use SessionState.newHadoopConf() to...

2018-02-27 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20679#discussion_r170981204 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -115,7 +115,9 @@ private[sql] class SessionState

[GitHub] spark issue #20679: [SPARK-23514] Use SessionState.newHadoopConf() to propag...

2018-02-26 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20679 cc @dongjoon-hyun --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #20679: [SPARK-23514] Use SessionState.newHadoopConf() to propag...

2018-02-26 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20679 cc @gatorsmile @rxin We had a chat whether to implement something that would catch direct misuses of sc.hadoopConfiguration in sql module, but it seems that it's not very common, so

[GitHub] spark pull request #20679: [SPARK-23514] Use SessionState.newHadoopConf() to...

2018-02-26 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/20679 [SPARK-23514] Use SessionState.newHadoopConf() to propage hadoop configs set in SQLConf. ## What changes were proposed in this pull request? A few places in `spark-sql` were

[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-26 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170567874 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala --- @@ -187,11 +187,11 @@ object

[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-25 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170465836 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -387,6 +390,101 @@ case class

[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-25 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170465726 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -305,15 +260,15 @@ object ColumnStat

[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-25 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170465448 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -387,6 +390,101 @@ case class

[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-25 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170463362 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -387,6 +390,101 @@ case class

[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-25 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170462960 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -387,6 +390,101 @@ case class

[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-22 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170086812 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -387,6 +390,101 @@ case class

[GitHub] spark issue #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20624 cc @gatorsmile @cloud-fan @marmbrus --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring

2018-02-15 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/20624 [SPARK-23445] ColumnStat refactoring ## What changes were proposed in this pull request? Refactor ColumnStat to be more flexible. * Split `ColumnStat

[GitHub] spark pull request #20555: [SPARK-23366] Improve hot reading path in ReadAhe...

2018-02-13 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20555#discussion_r168048937 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -78,9 +79,8 @@ // whether there is a read ahead task

[GitHub] spark pull request #20555: [SPARK-23366] Improve hot reading path in ReadAhe...

2018-02-13 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20555#discussion_r168048795 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -230,24 +227,32 @@ private void signalAsyncReadComplete

[GitHub] spark issue #20555: [SPARK-23366] Improve hot reading path in ReadAheadInput...

2018-02-13 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20555 @jiangxb1987 there is ReadAheadInputStreamSuite that extends GenericFileInputStreamSuite. I updated it and added more combination testing with different buffer sizes that should

[GitHub] spark pull request #20555: [SPARK-23366] Improve hot reading path in ReadAhe...

2018-02-13 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20555#discussion_r167971273 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -258,54 +263,43 @@ public int read(byte[] b, int offset, int len

[GitHub] spark pull request #20555: [SPARK-23366] Improve hot reading path in ReadAhe...

2018-02-12 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20555#discussion_r167651037 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -232,7 +229,9 @@ private void waitForAsyncReadComplete() throws

[GitHub] spark pull request #20555: [SPARK-23366] Improve hot reading path in ReadAhe...

2018-02-12 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20555#discussion_r167646954 --- Diff: core/src/main/java/org/apache/spark/io/ReadAheadInputStream.java --- @@ -258,54 +262,43 @@ public int read(byte[] b, int offset, int len

[GitHub] spark issue #20555: [SPARK-23366] Improve hot reading path in ReadAheadInput...

2018-02-08 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20555 cc @kiszk @sitalkedia @zsxwing --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #20555: [SPARK-23366] Improve hot reading path in ReadAhe...

2018-02-08 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/20555 [SPARK-23366] Improve hot reading path in ReadAheadInputStream ## What changes were proposed in this pull request? `ReadAheadInputStream` was introduced in https://github.com

[GitHub] spark issue #20152: [SPARK-22957] ApproxQuantile breaks if the number of row...

2018-01-04 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20152 If the serialized form change is a problem, that part can probably be reverted - it's far less likely that a single compressed stats chunk will overflow Int. The bug I hit

[GitHub] spark pull request #20152: [SPARK-22957] ApproxQuantile breaks if the number...

2018-01-04 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/20152 [SPARK-22957] ApproxQuantile breaks if the number of rows exceeds MaxInt ## What changes were proposed in this pull request? 32bit Int was used for row rank. That overflowed

[GitHub] spark pull request #20136: [SPARK-22938] Assert that SQLConf.get is accessed...

2018-01-03 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/20136#discussion_r159408791 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -70,7 +72,7 @@ object SQLConf { * Default config

[GitHub] spark issue #20136: [SPARK-22938] Assert that SQLConf.get is accessed only o...

2018-01-02 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/20136 cc @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #20136: [SPARK-22938] Assert that SQLConf.get is accessed...

2018-01-02 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/20136 [SPARK-22938] Assert that SQLConf.get is accessed only on the driver. ## What changes were proposed in this pull request? Assert if code tries to access SQLConf.get on executor

[GitHub] spark issue #19923: [SPARK-22721] BytesToBytesMap peak memory not updated.

2017-12-07 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19923 jenkins retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark issue #19923: [SPARK-22721] BytesToBytesMap peak memory not updated.

2017-12-07 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19923 Sorry @hvanhovell for not getting it fully right the first time... --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark pull request #19923: [SPARK-22721] BytesToBytesMap peak memory not upd...

2017-12-07 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/19923 [SPARK-22721] BytesToBytesMap peak memory not updated. ## What changes were proposed in this pull request? Follow-up to earlier commit. The peak memory of BytesToBytesMap

[GitHub] spark issue #19915: [SPARK-22721] BytesToBytesMap peak memory usage not accu...

2017-12-06 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19915 cc @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #19915: [SPARK-22721] BytesToBytesMap peak memory usage n...

2017-12-06 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/19915 [SPARK-22721] BytesToBytesMap peak memory usage not accurate after reset() ## What changes were proposed in this pull request? BytesToBytesMap doesn't update peak memory usage

[GitHub] spark issue #19689: [SPARK-22462][SQL] Make rdd-based actions in Dataset tra...

2017-11-09 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19689 Thanks for the fix @viirya! But I'm not a Spark committer to approve it :-). --- - To unsubscribe, e-mail: reviews

[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...

2017-11-06 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19656 `isShouldStopRequired` is currently not respected by most operators (they insert `shouldStop()` code regardless of this setting). If you're refactoring this, maybe make sure that all

[GitHub] spark issue #19181: [SPARK-21907][CORE] oom during spill

2017-10-10 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19181 Looks good to me. What do you think @hvanhovell ? --- - To unsubscribe, e-mail: reviews-unsubscr

[GitHub] spark issue #19386: [SPARK-22161] [SQL] Add Impala-modified TPC-DS queries

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19386 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141843276 --- Diff: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorterSuite.java --- @@ -139,4 +139,44 @@ public int

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141843447 --- Diff: core/src/test/scala/org/apache/spark/memory/TestMemoryManager.scala --- @@ -58,11 +58,15 @@ class TestMemoryManager(conf: SparkConf

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141843414 --- Diff: core/src/test/scala/org/apache/spark/memory/TestMemoryManager.scala --- @@ -58,11 +58,15 @@ class TestMemoryManager(conf: SparkConf

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141843193 --- Diff: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorterSuite.java --- @@ -139,4 +139,44 @@ public int

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141842424 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java --- @@ -85,7 +85,7 @@ private final

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141842918 --- Diff: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorterSuite.java --- @@ -503,6 +511,39 @@ public void

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141842522 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java --- @@ -162,14 +162,25 @@ private int

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141842730 --- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java --- @@ -162,14 +162,25 @@ private int

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141843914 --- Diff: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorterSuite.java --- @@ -19,10 +19,18

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141843227 --- Diff: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorterSuite.java --- @@ -139,4 +139,44 @@ public int

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141843046 --- Diff: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorterSuite.java --- @@ -503,6 +511,39 @@ public void

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141842948 --- Diff: core/src/test/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorterSuite.java --- @@ -503,6 +511,39 @@ public void

[GitHub] spark pull request #19181: [SPARK-21907][CORE] oom during spill

2017-09-29 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19181#discussion_r141843494 --- Diff: core/src/test/scala/org/apache/spark/memory/TestMemoryManager.scala --- @@ -58,11 +58,15 @@ class TestMemoryManager(conf: SparkConf

[GitHub] spark issue #19353: [SPARK-22103][FOLLOWUP] Rename addExtraCode to addInnerC...

2017-09-26 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19353 jenkins retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

  1   2   >