[GitHub] spark pull request #22388: Revert [SPARK-24882][SQL] improve data source v2 ...

2018-09-12 Thread cloud-fan
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/22388 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #22374: [SPARK-25387][SQL] Fix for NPE caused by bad CSV ...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22374#discussion_r217075954 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -216,7 +216,12 @@ class UnivocityParser

[GitHub] spark issue #22388: Revert [SPARK-24882][SQL] improve data source v2 API fro...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22388 I can't recall the exact conflicts. Only 2 commits touched these 2 files after my PR, and I carefully checked, and these changes are still

[GitHub] spark issue #22344: [SPARK-25352][SQL] Perform ordered global limit when lim...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22344 thanks, merging to master/2.4!

[GitHub] spark issue #22395: [SPARK-16323][SQL] Add IntegralDivide expression

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22395 LGTM, cc @viirya @gatorsmile

[GitHub] spark issue #22353: [SPARK-25357][SQL] Add metadata to SparkPlanInfo to dump...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22353 ok to test

[GitHub] spark pull request #22355: [SPARK-25358][SQL] MutableProjection supports fal...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22355#discussion_r217054661 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala --- @@ -37,19 +37,22

[GitHub] spark issue #22401: [SPARK-25413] Precision value is going for toss when Avg...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22401 It's not only about `avg`, it's also about `sum`. I don't think the decision is made randomly, IIRC we did check other databases and pick the best one we can do. `sum

[GitHub] spark issue #22390: [SPARK-25402][SQL] Null handling in BooleanSimplificatio...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22390 can you send a new PR for 2.2? thanks

[GitHub] spark issue #22390: [SPARK-25402][SQL] Null handling in BooleanSimplificatio...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22390 thanks, merging to master/2.4/2.3!

[GitHub] spark issue #22391: [SPARK-25371][SQL][BACKPORT-2.3] struct() should allow b...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22391 thanks, merging to 2.3!

[GitHub] spark pull request #22396: [SPARK-23425][SQL][FOLLOWUP] Support wildcards in...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22396#discussion_r217008531 --- Diff: docs/sql-programming-guide.md --- @@ -1898,6 +1898,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark issue #22402: [SPARK-25414][SS] The numInputRows metrics can be incorr...

2018-09-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22402 cc @jose-torres @tdas @zsxwing

[GitHub] spark pull request #22402: [SPARK-25414][SS] The numInputRows metrics can be...

2018-09-12 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22402 [SPARK-25414][SS] The numInputRows metrics can be incorrect for streaming self-join ## What changes were proposed in this pull request? For self-join/self-union, Spark will produce

[GitHub] spark issue #22390: [SPARK-25402][SQL] Null handling in BooleanSimplificatio...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22390 LGTM

[GitHub] spark issue #22380: [SPARK-25278][SQL][followup] remove the hack in Progress...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22380 thanks, merging to master!

[GitHub] spark issue #22326: [SPARK-25314][SQL] Fix Python UDF accessing attributes f...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22326 IIUC, you are pulling out the join condition with the Python UDF and creating a filter above the join. Then the join becomes a cross join, which usually runs very slowly. I think we should keep the cross
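
The performance concern in the comment above can be illustrated with a small, self-contained sketch. This is not Spark code: the function names `cross_join_then_filter` and `hash_join` are hypothetical, and the "work" counters are a rough stand-in for execution cost. It only shows why evaluating a predicate on every pair of rows grows much faster than joining on a key.

```python
from collections import defaultdict

def cross_join_then_filter(left, right, predicate):
    """Pair every left row with every right row, then filter.

    The predicate runs len(left) * len(right) times.
    """
    calls = 0
    out = []
    for l in left:
        for r in right:
            calls += 1
            if predicate(l, r):
                out.append((l, r))
    return out, calls

def hash_join(left, right, key):
    """Equi-join on a key: build a hash table on right, probe with left.

    Each input row is visited once, so the work is len(left) + len(right).
    """
    table = defaultdict(list)
    for r in right:
        table[key(r)].append(r)
    out = [(l, r) for l in left for r in table.get(key(l), [])]
    return out, len(left) + len(right)

# Hypothetical sample data: 100 rows on each side, joined on the first field.
left = [(i, f"l{i}") for i in range(100)]
right = [(i, f"r{i}") for i in range(100)]

pairs_a, calls_a = cross_join_then_filter(left, right, lambda l, r: l[0] == r[0])
pairs_b, work_b = hash_join(left, right, key=lambda row: row[0])
```

Both strategies return the same 100 matching pairs here, but the cross join evaluates the predicate for all 10,000 combinations, while the keyed join touches each of the 200 input rows once.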

[GitHub] spark issue #22353: [SPARK-25357][SQL] Add metadata to SparkPlanInfo to dump...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22353 So you need a way to reliably report some extra information like file path in the event logs, but don't want to show it in the UI as it may be too long. Basically we shouldn't put

[GitHub] spark issue #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STORED AS ...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22378 LGTM

[GitHub] spark pull request #22378: [SPARK-25389][SQL] INSERT OVERWRITE DIRECTORY STO...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22378#discussion_r216580577 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala --- @@ -750,4 +751,27 @@ class InsertSuite extends QueryTest

[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22371 My opinion is, it's not worth spending time on it. The lock is not likely to be a bottleneck and it's better to keep it simple even if it's sub-optimal

[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22371 @ConeyLiu an executor may be lost and then come back, and we may have 2 identical tasks running on the same executor

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22343 To clarify: this is just a workaround when we hit a problematic (having case-insensitive duplicated field names in the parquet file) hive parquet table and we want to read it with the native

[GitHub] spark pull request #22390: [SPARK-25402][SQL] Null handling in BooleanSimpli...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22390#discussion_r216575397 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -263,10 +263,12 @@ object BooleanSimplification

[GitHub] spark issue #22388: Revert [SPARK-24882][SQL] improve data source v2 API fro...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22388 yes

[GitHub] spark issue #22387: [SPARK-25313][SQL][FOLLOW-UP][BACKPORT-2.3] Fix InsertIn...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22387 retest this please

[GitHub] spark issue #22373: [SPARK-25371][SQL] struct() should allow being called wi...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22373 thanks, merging to master/2.4! @mgaido91 can you send a new PR to 2.3? it conflicts

[GitHub] spark issue #22353: [SPARK-25357][SQL] Add metadata to SparkPlanInfo to dump...

2018-09-11 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22353 Although the event log is in JSON format, it's mostly for internal usage, to be loaded by the history server and used to build the Spark UI. For compatibility, we only focus on making history to be able

[GitHub] spark issue #22382: [SPARK-23243] [SPARK-20715][CORE][2.2] Fix RDD.repartiti...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22382 thanks, merging to 2.2!

[GitHub] spark issue #22387: [SPARK-25313][SQL][FOLLOW-UP][BACKPORT-2.3] Fix InsertIn...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22387 LGTM

[GitHub] spark pull request #22388: Revert [SPARK-24882][SQL] improve data source v2 ...

2018-09-10 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22388 Revert [SPARK-24882][SQL] improve data source v2 API from branch 2.4 ## What changes were proposed in this pull request? As discussed in the dev list, we don't want to include

[GitHub] spark issue #22388: Revert [SPARK-24882][SQL] improve data source v2 API fro...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22388 cc @rxin @rdblue

[GitHub] spark issue #18544: [SPARK-21318][SQL]Improve exception message thrown by `l...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18544 what's the status here?

[GitHub] spark issue #20673: [SPARK-23515] Use input/output streams for large events ...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20673 What's the status of this PR?

[GitHub] spark pull request #21308: [SPARK-24253][SQL] Add DeleteSupport mix-in for D...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21308#discussion_r216329544 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/DeleteSupport.java --- @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #22373: [SPARK-25371][SQL] struct() should allow being called wi...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22373 Can you also update the PR description? thanks!

[GitHub] spark pull request #22373: [SPARK-25371][SQL] struct() should allow being ca...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22373#discussion_r216323311 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -256,4 +256,9 @@ class VectorAssemblerSuite assert

[GitHub] spark issue #22380: [SPARK-25278][SQL][followup] remove the hack in Progress...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22380 ok to test

[GitHub] spark issue #22373: [SPARK-25371][ML] VectorAssembler should not fail with e...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22373 @maropu says it's OK to revert that part, @mgaido91 can you do that? thanks!

[GitHub] spark issue #22380: [SPARK-25278][SQL][followup] remove the hack in Progress...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22380 cc @tdas @zsxwing @mgaido91

[GitHub] spark pull request #22380: [SPARK-25278][SQL][followup] remove the hack in P...

2018-09-10 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22380 [SPARK-25278][SQL][followup] remove the hack in ProgressReporter ## What changes were proposed in this pull request? It turns out it's a bug that a `DataSourceV2ScanExec` instance may

[GitHub] spark issue #22373: [SPARK-25371][ML] VectorAssembler should not fail with e...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22373 I think we should allow `struct` function to take empty arguments.

[GitHub] spark pull request #18516: [SPARK-21281][SQL] Use string types by default if...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18516#discussion_r216292382 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -292,14 +296,17 @@ trait

[GitHub] spark pull request #18516: [SPARK-21281][SQL] Use string types by default if...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18516#discussion_r216292172 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -292,14 +296,17 @@ trait

[GitHub] spark issue #22284: [SPARK-25278][SQL] Avoid duplicated Exec nodes when the ...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22284 thanks, merging to master/2.4!

[GitHub] spark issue #22377: [SPARK-24849][SPARK-24911][SQL][FOLLOW-UP] Converting a ...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22377 thanks, merging to master/2.4!

[GitHub] spark pull request #22343: [SPARK-25391][SQL] Make behaviors consistent when...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22343#discussion_r216218261 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -69,12 +69,25 @@ class ParquetOptions

[GitHub] spark issue #22343: [SPARK-25391][SQL] Make behaviors consistent when conver...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22343 retest this please

[GitHub] spark issue #21968: [SPARK-24999][SQL]Reduce unnecessary 'new' memory operat...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21968 thanks, merging to master!

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18142 > Spark SQL is designed to be compatible with the Hive Metastore, SerDes and UDFs. This is different from `Spark can run any Hive SQL`. Spark can load and use Hive UDFs, with the ri

[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22318 Can you define the scope of this PR? In which cases should we change the references in the join condition

[GitHub] spark issue #22371: [SPARK-25386][CORE] Don't need to synchronize the IndexS...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22371 How much perf can we save here? I don't think shuffle writing will be bottlenecked by this lock.

[GitHub] spark issue #22010: [SPARK-21436][CORE] Take advantage of known partitioner ...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22010 I think this works, can we post some Spark web UI screenshots to confirm the shuffle is indeed eliminated? BTW one idea to simplify the implementation: ``` def distinct

[GitHub] spark pull request #21433: [SPARK-23820][CORE] Enable use of long form of ca...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21433#discussion_r216209985 --- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala --- @@ -53,10 +55,16 @@ class RDDInfo( } private[spark] object

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18142 > BTW, I believe there's no particular standard for backticks themselves since different DBMS uses different backtick implementations. You are right, but SQL standard does define

[GitHub] spark pull request #22343: [SPARK-25391][SQL] Make behaviors consistent when...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22343#discussion_r216204114 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala --- @@ -69,12 +69,25 @@ class ParquetOptions

[GitHub] spark issue #22359: [SPARK-25313][SQL][FOLLOW-UP] Fix InsertIntoHiveDirComma...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22359 thanks, merging to master/2.4!

[GitHub] spark issue #22351: [MINOR][SQL] Add a debug log when a SQL text is used for...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22351 I'm surprised Hive changes the view text set by Spark. Is it a problem for views? cc @gatorsmile @jiangxb1987 @hvanhovell

[GitHub] spark issue #22343: [SPARK-25132][SQL][FOLLOW-UP] The behavior must be consi...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22343 @dongjoon-hyun does the orc conversion need the same fix?

[GitHub] spark pull request #22343: [SPARK-25132][SQL][FOLLOW-UP] The behavior must b...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22343#discussion_r216191236 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala --- @@ -1390,7 +1395,11 @@ class

[GitHub] spark issue #22343: [SPARK-25132][SQL][FOLLOW-UP] The behavior must be consi...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22343 ok to test

[GitHub] spark issue #22318: [SPARK-25150][SQL] Rewrite condition when deduplicate Jo...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22318 How does this work? When we have duplicated attributes in the join condition, how can we know which attribute comes from which side

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18142 On second thought, isn't it a bug? ``` hive> SELECT `d100.udf100`(`emp`.`name`) FROM `emp`; USER ``` This clearly violates SQL semantics: the string ins
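
The comment above is cut off, so the following is only a reading of it: a backtick-quoted string is normally a single identifier, which means `d100.udf100` inside one pair of backticks should not be split into a database name and a function name. A minimal sketch of that rule, using a hypothetical helper `split_qualified_name` (not Spark's or Hive's parser, and ignoring escaped backticks):

```python
def split_qualified_name(name):
    """Split a possibly qualified SQL name on dots that are outside backticks.

    Simplifying assumption: doubled backticks used as an escape are not handled.
    """
    parts, buf, quoted = [], [], False
    for ch in name:
        if ch == "`":
            quoted = not quoted          # enter or leave a quoted identifier
        elif ch == "." and not quoted:
            parts.append("".join(buf))   # an unquoted dot separates name parts
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return parts

one_part = split_qualified_name("`d100.udf100`")     # a single identifier
two_parts = split_qualified_name("`d100`.`udf100`")  # database, then function
unquoted = split_qualified_name("d100.udf100")       # database, then function
```

Under this rule only the second and third forms refer to function `udf100` in database `d100`; the first names a single identifier that happens to contain a dot.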

[GitHub] spark issue #22361: Revert [SPARK-10399] [SPARK-23879] [SPARK-23762] [SPARK-...

2018-09-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22361 LGTM, I'm merging it to unblock the 2.4 RC, thanks!

[GitHub] spark issue #22262: [SPARK-25175][SQL] Field resolution should fail if there...

2018-09-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22262 ok to test

[GitHub] spark pull request #22270: [SPARK-25267][SQL][TEST] Disable ConvertToLocalRe...

2018-09-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22270#discussion_r215869136 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -1729,10 +1730,8 @@ class DataFrameSuite extends QueryTest

[GitHub] spark pull request #22354: [SPARK-23243][CORE][2.3] Fix RDD.repartition() da...

2018-09-06 Thread cloud-fan
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/22354

[GitHub] spark issue #22354: [SPARK-23243][CORE][2.3] Fix RDD.repartition() data corr...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22354 thanks, merging to 2.3!

[GitHub] spark issue #22352: [SPARK-25208][SQL][FOLLOW-UP] Reduce code size.

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22352 thanks, merging to master/2.4 (since it's a followup)

[GitHub] spark issue #22346: [branch-2.3][SPARK-25313][SQL] Fix regression in FileFor...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22346 thanks, merging to 2.3!

[GitHub] spark issue #22354: [SPARK-23243][CORE][2.3] Fix RDD.repartition() data corr...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22354 cc @tgravescs @jiangxb1987 @gatorsmile

[GitHub] spark pull request #22354: [SPARK-23243][CORE][2.3] Fix RDD.repartition() da...

2018-09-06 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/22354 [SPARK-23243][CORE][2.3] Fix RDD.repartition() data correctness issue backport https://github.com/apache/spark/pull/22112 to 2.3 --- An alternative fix for https

[GitHub] spark issue #22352: [SPARK-25208][SQL][FOLLOW-UP] Reduce code size.

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22352 retest this please

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22112 I'm preparing a PR for 2.3, thanks for the reminder!

[GitHub] spark issue #22284: [SPARK-25278][SQL] Avoid duplicated Exec nodes when the ...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22284 This is a bug for sql metrics, let's include it in Spark 2.4.

[GitHub] spark issue #22284: [SPARK-25278][SQL] Avoid duplicated Exec nodes when the ...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22284 retest this please

[GitHub] spark issue #22352: [SPARK-25208][SQL][FOLLOW-UP] Reduce code size.

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22352 LGTM

[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22171 Is there a standard for how CSV should store decimal values?

[GitHub] spark issue #22338: [SPARK-25317][CORE] Avoid perf regression in Murmur3 Has...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22338 Since the change looks safer to me and it does fix the regression, I'm merging it to unblock 2.4 release. Please continue to investigate the root cause, thanks

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18142 hmm, then it's too late. Maybe we can add it in Spark 2.3.2, cc @jerryshao

[GitHub] spark issue #22346: [branch-2.3][SPARK-25313][SQL] Fix regression in FileFor...

2018-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22346 retest this please

[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18142 @HyukjinKwon Thanks for the note! I think this behavior is better, I'm adding a `release_note` tag to the JIRA ticket, so that we don't forget to mention it in release notes

[GitHub] spark issue #22320: [SPARK-25313][SQL]Fix regression in FileFormatWriter out...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22320 thanks, merging to master!

[GitHub] spark pull request #22320: [SPARK-25313][SQL]Fix regression in FileFormatWri...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22320#discussion_r215479502 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -82,7 +83,7 @@ case class

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22112 @tgravescs thanks for testing it out! I've created https://issues.apache.org/jira/browse/SPARK-25341 and https://issues.apache.org/jira/browse/SPARK-25342 to track the followup. I think

[GitHub] spark issue #22336: [SPARK-25306][SQL][FOLLOWUP] Change `test` to `ignore` i...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22336 retest this please

[GitHub] spark issue #22179: [SPARK-25258][SPARK-23131][SPARK-25176][BUILD] Upgrade K...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22179 Do we have any compatibility issues here? Seems fine to me as we already shaded kryo.

[GitHub] spark issue #22319: [SPARK-25044][SQL][followup] add back UserDefinedFunctio...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22319 thanks, merging to master!

[GitHub] spark issue #22338: [SPARK-25317][CORE] Avoid perf regression in Murmur3 Has...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22338 This basically reverts the memory block in the hash computing, now the memory block is just a holder of the base object and base offset. This does fix the regression, will we also lose the perf

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22112 retest this please

[GitHub] spark issue #22340: [SPARK-25337][SQL][TEST] `runSparkSubmit` should provide...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22340 How does the non-test mode resolve the class path issue?

[GitHub] spark issue #22338: [SPARK-25317][CORE] Avoid perf regression in Murmur3 Has...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22338 Thanks for working on it! Will it be helpful if we move these hash methods to `MemoryBlock`? e.g. the code can be `int halfWord =bytes[offset + i

[GitHub] spark pull request #22320: [SPARK-25313][SQL]Fix regression in FileFormatWri...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22320#discussion_r215248202 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -82,7 +83,7 @@ case class

[GitHub] spark pull request #22320: [SPARK-25313][SQL]Fix regression in FileFormatWri...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22320#discussion_r215247634 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -754,6 +754,54 @@ class HiveDDLSuite

[GitHub] spark pull request #22320: [SPARK-25313][SQL]Fix regression in FileFormatWri...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22320#discussion_r215246692 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala --- @@ -805,6 +805,80 @@ class DataFrameReaderWriterSuite

[GitHub] spark issue #22320: [SPARK-25313][SQL]Fix regression in FileFormatWriter out...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22320 retest this please

[GitHub] spark issue #22336: [SPARK-25306][SQL][FOLLOWUP] Change `test` to `ignore` i...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22336 retest this please

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22112 retest this please

[GitHub] spark issue #22319: [SPARK-25044][SQL][followup] add back UserDefinedFunctio...

2018-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22319 retest this please
