[GitHub] spark issue #22202: [SPARK-25211][Core] speculation and fetch failed result ...

2018-08-24 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/22202 @jinxing64 Do you have any idea? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands

[GitHub] spark pull request #22202: [SPARK-25211][Core] speculation and fetch failed ...

2018-08-24 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/22202#discussion_r212601673 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2246,58 +2247,6 @@ class DAGSchedulerSuite extends

[GitHub] spark issue #22202: [SPARK-25211][Core] speculation and fetch failed result ...

2018-08-24 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/22202 @Ngone51 Because some shuffleMapStage has mapStageJobs(JobWaiter) by `SparkContext.submitMapStage`

[GitHub] spark pull request #22202: [SPARK-25211][Core] speculation and fetch failed ...

2018-08-23 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/22202 [SPARK-25211][Core] speculation and fetch failed result in hang of job ## What changes were proposed in this pull request? In current `DAGScheduler.handleTaskCompletion` code, when

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongToUnsafeRowMap in ex...

2018-07-24 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 Jenkins, test this please

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-23 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 @viirya This case occurred in our cluster, and it took us a long time to find this bug. For human-caused reasons, the small table's max id became abnormally large. The LongHashedRelation

[GitHub] spark pull request #21772: [SPARK-24809] [SQL] Serializing LongHashedRelatio...

2018-07-23 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/21772#discussion_r204613880 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/HashedRelationSuite.scala --- @@ -278,6 +278,39 @@ class HashedRelationSuite

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-22 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 @viirya Hi, could you spare some more time to review this PR?

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 Jenkins test this please

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 @viirya Yes, absolutely right. :)

[GitHub] spark pull request #21772: [SPARK-24809] [SQL] Serializing LongHashedRelatio...

2018-07-18 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/21772#discussion_r203365167 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala --- @@ -726,8 +726,9 @@ private[execution] final class

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-17 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 @hvanhovell Thanks for reviewing. Data is lost because the variable **cursor** in the executor is 0 and serialization depends on it. I will add a UT later

[GitHub] spark pull request #21772: [SPARK-24809] [SQL] Serializing LongHashedRelatio...

2018-07-17 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/21772#discussion_r203241485 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala --- @@ -726,8 +726,9 @@ private[execution] final class

[GitHub] spark pull request #21772: [SPARK-24809] [SQL] Serializing LongHashedRelatio...

2018-07-15 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/21772 [SPARK-24809] [SQL] Serializing LongHashedRelation in executor may result in data error When join key is long or int in broadcast join, Spark will use LongHashedRelation as the broadcast value

[GitHub] spark issue #21164: [SPARK-24098][SQL] ScriptTransformationExec should wait ...

2018-05-07 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21164 @gatorsmile Could you please give some comments when you have time? Thanks so much. In addition, I think this is a critical bug

[GitHub] spark pull request #21164: [SPARK-24098][SQL] ScriptTransformationExec shoul...

2018-05-02 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/21164#discussion_r185693669 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformationExec.scala --- @@ -137,13 +137,12 @@ case class

[GitHub] spark issue #21164: [SPARK-24098][SQL] ScriptTransformationExec should wait ...

2018-05-01 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21164 @rxin Hi, do you have time to look at this PR?

[GitHub] spark issue #21164: [SPARK-24098][SQL] ScriptTransformationExec should wait ...

2018-04-27 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21164 @cloud-fan Hi fan, do you have time to look at this PR? I think this is a critical bug.

[GitHub] spark issue #21164: [SPARK-24098][SQL] ScriptTransformationExec should wait ...

2018-04-26 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21164 Bug reproduction: 1. Add Thread.sleep(1000 * 600) before the assignment to _exception. 2. Write a Python script which will throw an exception, as follows: test.py ```import sys

[GitHub] spark pull request #21164: [SPARK-24098][SQL] ScriptTransformationExec shoul...

2018-04-26 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/21164 [SPARK-24098][SQL] ScriptTransformationExec should wait process exiting before output iterator finish When the feed thread doesn't set its _exception variable and the process doesn't exit

[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-23 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/21100#discussion_r183608838 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -896,6 +896,25 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-22 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/21100#discussion_r183269702 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -171,6 +171,15 @@ object TypeCoercion

[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-20 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/21100#discussion_r183005177 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -896,6 +896,19 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-20 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/21100#discussion_r182979953 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -896,6 +896,19 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #21100: [SPARK-24012][SQL] Union of map and other compati...

2018-04-18 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/21100 [SPARK-24012][SQL] Union of map and other compatible column ## What changes were proposed in this pull request? Union of map and other compatible column results in unresolved operator 'Union

[GitHub] spark issue #20846: [SPARK-5498][SQL][FOLLOW] add schema to table partition

2018-03-17 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/20846 The exception is not thrown in `ALTER TABLE`. We should prevent users from changing a table's column type. But for historical data, should we take some compatibility measures

[GitHub] spark pull request #20846: [SPARK-5498][SQL][FOLLOW] add schema to table par...

2018-03-17 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/20846#discussion_r175274916 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -99,7 +99,8 @@ case class CatalogTablePartition

[GitHub] spark pull request #20846: [SPARK-5498][SQL][FOLLOW] add schema to table par...

2018-03-16 Thread liutang123
GitHub user liutang123 reopened a pull request: https://github.com/apache/spark/pull/20846 [SPARK-5498][SQL][FOLLOW] add schema to table partition ## What changes were proposed in this pull request? When querying an ORC table in which some partition schemas are different from

[GitHub] spark pull request #20846: [SPARK-5498][SQL][FOLLOW] add schema to table par...

2018-03-16 Thread liutang123
Github user liutang123 closed the pull request at: https://github.com/apache/spark/pull/20846

[GitHub] spark pull request #20846: [SPARK-5498][SQL][FOLLOW] add schema to table par...

2018-03-16 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/20846 [SPARK-5498][SQL][FOLLOW] add schema to table partition ## What changes were proposed in this pull request? When querying an ORC table in which some partition schemas are different from table

[GitHub] spark issue #20184: [SPARK-22987][Core] UnsafeExternalSorter cases OOM when ...

2018-01-17 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/20184 Hi @jerryshao, I tried lazily allocating all the InputStream and byte arrays in UnsafeSorterSpillReader. Would you please look at this when you have time

[GitHub] spark issue #20184: [SPARK-22987][Core] UnsafeExternalSorter cases OOM when ...

2018-01-15 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/20184 Jenkins, retest this please

[GitHub] spark issue #20184: [SPARK-22987][Core] UnsafeExternalSorter cases OOM when ...

2018-01-15 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/20184 I think that lazy buffer allocation cannot thoroughly solve this problem, because UnsafeSorterSpillReader has a BufferedFileInputStream which will allocate off-heap memory

[GitHub] spark issue #20184: [SPARK-22987][Core] UnsafeExternalSorter cases OOM when ...

2018-01-12 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/20184 Hi @jerryshao, we can reproduce this issue as follows: ``` $ bin/spark-shell --master local --conf spark.sql.windowExec.buffer.spill.threshold=1 --driver-memory 1G scala>sc.rang

[GitHub] spark pull request #20184: [SPARK-22987][Core] UnsafeExternalSorter cases OO...

2018-01-07 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/20184 [SPARK-22987][Core] UnsafeExternalSorter cases OOM when invoking `getIterator` function. ## What changes were proposed in this pull request? ChainedIterator.UnsafeExternalSorter

[GitHub] spark issue #19364: [SPARK-22144][SQL] ExchangeCoordinator combine the parti...

2017-12-06 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19364 @cloud-fan Would you please look at this when you have time?

[GitHub] spark pull request #19812: [SPARK-22598][CORE] ExecutorAllocationManager doe...

2017-11-29 Thread liutang123
Github user liutang123 closed the pull request at: https://github.com/apache/spark/pull/19812

[GitHub] spark issue #19812: [SPARK-22598][CORE] ExecutorAllocationManager does not r...

2017-11-29 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19812 Sorry, I cannot reproduce it now. But sometimes `ExecutorAllocationManager` did not request new executors and `YarnSchedulerBackend.requestedTotalExecutors` was 0. I will close this PR now

[GitHub] spark issue #19812: [SPARK-22598][CORE] ExecutorAllocationManager does not r...

2017-11-28 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19812 Hi @jerryshao, I updated the description of this PR.

[GitHub] spark issue #19812: [SPARK-22598][CORE] ExecutorAllocationManager does not r...

2017-11-26 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19812 @srowen Would you please look at this when you have time?

[GitHub] spark issue #19812: [SPARK-22598][CORE] ExecutorAllocationManager does not r...

2017-11-24 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19812 Jenkins, retest this please

[GitHub] spark pull request #19812: [SPARK-22598][CORE] ExecutorAllocationManager doe...

2017-11-24 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/19812 [SPARK-22598][CORE] ExecutorAllocationManager does not requests new executors when executor has failed and target has not changed ## What changes were proposed in this pull request

[GitHub] spark issue #19692: [SPARK-22469][SQL] Accuracy problem in comparison with s...

2017-11-16 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19692 Sorry, I just saw it. Thanks, fan, for doing this.

[GitHub] spark issue #19692: [SPARK-22469][SQL] Accuracy problem in comparison with s...

2017-11-14 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19692 Jenkins, retest this please

[GitHub] spark pull request #19692: [SPARK-22469][SQL] Accuracy problem in comparison...

2017-11-13 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19692#discussion_r150726583 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -137,6 +137,8 @@ object TypeCoercion

[GitHub] spark pull request #19692: [SPARK-22469][SQL] Accuracy problem in comparison...

2017-11-08 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19692#discussion_r149659840 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -137,6 +137,8 @@ object TypeCoercion

[GitHub] spark issue #19692: [SPARK-22469][SQL] Accuracy problem in comparison with s...

2017-11-08 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19692 @cloud-fan Would you please look at this when you have time?

[GitHub] spark pull request #19692: [SPARK-22469][SQL] Accuracy problem in comparison...

2017-11-08 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/19692 [SPARK-22469][SQL] Accuracy problem in comparison with string and numeric ## What changes were proposed in this pull request? When comparing string and numeric, cast them as double like

[GitHub] spark pull request #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to fi...

2017-10-16 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19504#discussion_r144823226 --- Diff: core/src/test/scala/org/apache/spark/FileSuite.scala --- @@ -549,9 +551,11 @@ class FileSuite extends SparkFunSuite with LocalSparkContext

[GitHub] spark issue #19504: [SPARK-22233] [CORE] [FOLLOW-UP] Allow user to filter ou...

2017-10-16 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19504 It looks better.

[GitHub] spark pull request #19464: [SPARK-22233] [core] Allow user to filter out emp...

2017-10-13 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19464#discussion_r144683771 --- Diff: core/src/test/scala/org/apache/spark/FileSuite.scala --- @@ -510,4 +510,87 @@ class FileSuite extends SparkFunSuite with LocalSparkContext

[GitHub] spark pull request #19464: [SPARK-22233] [core] Allow user to filter out emp...

2017-10-12 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19464#discussion_r144235787 --- Diff: core/src/test/scala/org/apache/spark/FileSuite.scala --- @@ -510,4 +510,16 @@ class FileSuite extends SparkFunSuite with LocalSparkContext

[GitHub] spark pull request #19464: [SPARK-22233] [core] Allow user to filter out emp...

2017-10-12 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19464#discussion_r144235417 --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala --- @@ -122,7 +122,10 @@ class NewHadoopRDD[K, V]( case

[GitHub] spark issue #19464: [SPARK-22233] [core] Allow user to filter out empty spli...

2017-10-11 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19464 @kiszk Any other suggestions, and can this PR be merged?

[GitHub] spark pull request #19464: Spark 22233

2017-10-10 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/19464 Spark 22233 ## What changes were proposed in this pull request? Add the spark.hadoop.filterOutEmptySplit configuration to allow users to filter out empty splits in HadoopRDD. You can merge

[GitHub] spark issue #19364: [SPARK-22144][SQL] ExchangeCoordinator combine the parti...

2017-10-08 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19364 @maropu Any other suggestions, and can this PR be merged?

[GitHub] spark pull request #19364: [SPARK-22144][SQL] ExchangeCoordinator combine th...

2017-09-28 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19364#discussion_r141785140 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala --- @@ -232,7 +232,7 @@ class ExchangeCoordinator

[GitHub] spark pull request #19364: [SPARK-22144][SQL] ExchangeCoordinator combine th...

2017-09-28 Thread liutang123
Github user liutang123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19364#discussion_r141780697 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala --- @@ -232,7 +232,7 @@ class ExchangeCoordinator

[GitHub] spark issue #19364: [SPARK-22144] ExchangeCoordinator combine the partitions...

2017-09-28 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/19364 @yhuai @hvanhovell Would you please look at this when you have time?

[GitHub] spark pull request #19364: [SPARK-22144] ExchangeCoordinator combine the par...

2017-09-27 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/19364 [SPARK-22144] ExchangeCoordinator combine the partitions of an 0 sized pre-shuffle to 0 ## What changes were proposed in this pull request? when the length of pre-shuffle's partitions is 0

[GitHub] spark pull request #18871: Merge pull request #1 from apache/master

2017-08-07 Thread liutang123
Github user liutang123 closed the pull request at: https://github.com/apache/spark/pull/18871 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] spark pull request #18871: Merge pull request #1 from apache/master

2017-08-07 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/18871 Merge pull request #1 from apache/master 20170521 pull request ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How