[GitHub] spark issue #21019: [SPARK-23948] Trigger mapstage's job listener in submitM...

2018-04-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21019 @squito @cloud-fan What do you think of this change? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #21019: [SPARK-23948] Trigger mapstage's job listener in submitM...

2018-04-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21019 Jenkins, retest this please

[GitHub] spark issue #21019: [SPARK-23948] Trigger mapstage's job listener in submitM...

2018-04-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21019 Jenkins, retest this please

[GitHub] spark issue #21019: [SPARK-23948] Trigger mapstage's job listener in submitM...

2018-04-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21019 Jenkins, retest this please

[GitHub] spark pull request #21019: [SPARK-23948] Trigger mapstage's job listener in ...

2018-04-09 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/21019 [SPARK-23948] Trigger mapstage's job listener in submitMissingTasks ## What changes were proposed in this pull request? SparkContext submitted a map stage from `submitMapStage

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 cc @cloud-fan @jerryshao @jiangxb1987 would you take a look at this?

[GitHub] spark issue #20781: [SPARK-23637][YARN]Yarn might allocate more resource if ...

2018-04-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20781 @vanzin Thanks for merging.

[GitHub] spark pull request #20812: [SPARK-23669] Executors fetch jars and name the j...

2018-04-08 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/20812

[GitHub] spark issue #20812: [SPARK-23669] Executors fetch jars and name the jars wit...

2018-03-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20812 @jerryshao Understood, `Ideally different udfs should be packaged in different jars with different name/version`. True. But we are faced with tons of UDFs/jars migrating from another engine. I

[GitHub] spark issue #20812: [SPARK-23669] Executors fetch jars and name the jars wit...

2018-03-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20812 @jerryshao Thanks for comment; Yes, this change is only for `sc.addJar` and the jars will be named with a prefix when executor `updateDependencies

[GitHub] spark pull request #20812: [SPARK-23669] Executors fetch jars and name the j...

2018-03-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/20812#discussion_r175010008 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -752,11 +752,10 @@ private[spark] class Executor

[GitHub] spark issue #20812: [SPARK-23669] Executors fetch jars and name the jars wit...

2018-03-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20812 @jiangxb1987 Thanks a lot for the review. I will refine it soon!

[GitHub] spark issue #20812: [SPARK-23669] Executors fetch jars and name the jars wit...

2018-03-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20812 @vanzin @zsxwing @jerryshao What do you think about this?

[GitHub] spark pull request #20812: [SPARK-23669] Executors fetch jars and name the j...

2018-03-13 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/20812 [SPARK-23669] Executors fetch jars and name the jars with md5 prefix ## What changes were proposed in this pull request? In our cluster, there are lots of UDF jars, some of them have
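
The PR description is cut off in this listing, so the following is only an illustrative sketch of the idea named in the title (giving a fetched jar a name that depends on its content), not the PR's actual implementation. The helper `md5_prefixed_name` and the `-` separator are hypothetical choices made for this example.

```python
import hashlib


def md5_prefixed_name(file_name: str, content: bytes) -> str:
    """Return file_name prefixed with the hex MD5 digest of its content.

    Hypothetical helper: two jars that share a file name but differ in
    content get different local names, so one cannot overwrite the other.
    """
    digest = hashlib.md5(content).hexdigest()
    return f"{digest}-{file_name}"


# Two jars with the same name but different bytes map to different names.
first = md5_prefixed_name("udf.jar", b"version 1 bytes")
second = md5_prefixed_name("udf.jar", b"version 2 bytes")
print(first != second)
```

Identical content always yields the same prefixed name, so a jar that was already fetched can be recognized by name alone under this assumed scheme.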

[GitHub] spark issue #20781: [SPARK-23637][YARN]Yarn might allocate more resource if ...

2018-03-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20781 @vanzin Thanks for the review. 1. I spent some time but could not find why the same executor is killed multiple times, and I cannot reproduce it either. 2. I found that the same completed

[GitHub] spark issue #20781: [SPARK-23637][YARN]Yarn might allocate more resource if ...

2018-03-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20781 @jerryshao Thanks again for the review. In my cluster the same container can indeed be processed multiple times, which makes `numExecutorsRunning` negative. I think I've ever
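
To illustrate why processing the same completed container more than once can drive a plain counter negative, and one way to guard against it, here is a minimal sketch. It is not the `YarnAllocator` code; the class `ExecutorTracker` and its methods are hypothetical names used only for this example.

```python
class ExecutorTracker:
    """Tracks running executors by container id instead of a bare counter.

    Hypothetical sketch: because membership is checked before removal, a
    completion event that arrives twice for the same container is ignored
    the second time, so the running count cannot drop below zero.
    """

    def __init__(self, running_ids):
        self._running = set(running_ids)

    @property
    def num_running(self) -> int:
        return len(self._running)

    def on_completed(self, container_id: str) -> bool:
        """Handle a completion event; return True only on the first one."""
        if container_id not in self._running:
            return False  # duplicate or unknown container: do nothing
        self._running.remove(container_id)
        return True


tracker = ExecutorTracker(["c1", "c2"])
tracker.on_completed("c1")
tracker.on_completed("c1")  # duplicate event, ignored
print(tracker.num_running)
```

A bare integer that is decremented on every event would reach a wrong value after the duplicate, whereas the set-based count stays consistent with the executors that are actually still running.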

[GitHub] spark issue #20781: [SPARK-23637][YARN]Yarn might allocate more resource if ...

2018-03-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20781 Since the change for `YarnAllocator: killExecutor` is easy, do you think it is worth having this defense? Thanks again for the review

[GitHub] spark issue #20781: [SPARK-23637][YARN]Yarn might allocate more resource if ...

2018-03-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20781 @jerryshao Thanks for the advice. I spent some time digging into why multiple `kill` requests are sent from the Driver to the AM, but could not find a way to reproduce it. I came to find that it's

[GitHub] spark issue #20781: [SPARK-23637][YARN]Yarn might allocate more resource if ...

2018-03-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20781 @jerryshao Thanks for taking a look. Yes, it does happen. We have jobs that have already finished all their tasks but are still holding 40~100 executors

[GitHub] spark issue #20781: [SPARK-23637][YARN]Yarn might allocate more resource if ...

2018-03-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20781 cc @vanzin @tgravescs @cloud-fan @djvulee Could you please help review this?

[GitHub] spark pull request #20781: [SPARK-23637][YARN]Yarn might allocate more resou...

2018-03-08 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/20781 [SPARK-23637][YARN]Yarn might allocate more resource if a same executor is killed multiple times. ## What changes were proposed in this pull request? `YarnAllocator` uses

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 Thanks for merging! @cloud-fan @squito @zsxwing @Ngone51

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 @cloud-fan @squito Thanks a lot!

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-03-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 @squito @cloud-fan Thank you so much for reviewing. I refined accordingly. Please take another look when you have time

[GitHub] spark pull request #20685: [SPARK-23524] Big local shuffle blocks should not...

2018-03-06 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/20685#discussion_r172492938 --- Diff: core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala --- @@ -352,6 +352,63 @@ class

[GitHub] spark pull request #20685: [SPARK-23524] Big local shuffle blocks should not...

2018-03-06 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/20685#discussion_r172492581 --- Diff: core/src/test/scala/org/apache/spark/storage/ShuffleBlockFetcherIteratorSuite.scala --- @@ -352,6 +352,63 @@ class

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 cc @cloud-fan @jiangxb1987 Could you please help take a look.

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 Jenkins, retest this please.

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 ![image](https://user-images.githubusercontent.com/4058918/36822880-5f4aa9e8-1d35-11e8-8956-4081a2953d22.png) The failed test is not related, I can pass in my local

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-02-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 Jenkins, retest this please.

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-02-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 Jenkins, retest this please

[GitHub] spark issue #20685: [SPARK-23524] Big local shuffle blocks should not be che...

2018-02-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20685 Jenkins, test this please

[GitHub] spark pull request #20685: [SPARK-23524] Big local shuffle blocks should not...

2018-02-27 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/20685 [SPARK-23524] Big local shuffle blocks should not be checked for corruption. ## What changes were proposed in this pull request? In current code, all local blocks will be checked

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2018-02-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 @xxzzycq Currently no

[GitHub] spark issue #20461: [SPARK-23289][CORE]OneForOneBlockFetcher.DownloadCallbac...

2018-02-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/20461 @cloud-fan thanks a lot for ping. LGTM

[GitHub] spark issue #18171: [SPARK-20945] Fix TID key not found in TaskSchedulerImpl

2018-01-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18171 Why is this not merged into 2.2?

[GitHub] spark pull request #20069: [SPARK-22895] [SQL] Push down the deterministic p...

2018-01-11 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/20069#discussion_r160910495 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -851,7 +851,7 @@ object PushDownPredicate extends

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2018-01-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 @maropu @kiszk In current change, `ordered` is excluded from `toString`, `buildFormattedString`, `jsonValue`; I prefer to keep `ordered` internal and used only when ordering. Actually

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Jenkins, retest this please.

[GitHub] spark pull request #19868: [SPARK-22676] Avoid iterating all partition paths...

2017-12-02 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19868 [SPARK-22676] Avoid iterating all partition paths when spark.sql.hive.verifyPartitionPath=true ## What changes were proposed in this pull request? In the current code, it will scan all

[GitHub] spark issue #10572: SPARK-12619 Combine small files in a hadoop directory in...

2017-11-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/10572 @HyukjinKwon To merge small files, should I tune `spark.sql.files.maxPartitionBytes`? But IIUC it only works for `FileSourceScanExec`. So when I select from hive table, it doesn't work

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-11-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 Jenkins, retest this please.

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-11-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 @wangyum Makes sense. You can also try the approach in this PR. If there are many (tens of thousands of) ETLs in the warehouse, we cannot afford to give that many hints or fix all

[GitHub] spark issue #19652: [SPARK-22435][SQL] Support processing array and map type...

2017-11-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19652 @gatorsmile (Very gentle ping) Could you please give some comments when you have time :) Thank you so much

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-11-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 @gatorsmile Thanks a lot for reviewing this PR :)

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-11-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 True, I restrict it to Cast (string_typed_attr as integral types) and `EqualTo`. `Not(EqualTo)` is not included, since the extra burden put on the metastore is minor

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19652#discussion_r148775355 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformationExec.scala --- @@ -267,6 +268,33 @@ private class

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19652#discussion_r148749862 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -1485,21 +1487,27 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19652#discussion_r148749276 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -1485,21 +1487,27 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19652#discussion_r148748292 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -1454,22 +1454,24 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19652 [SPARK-22435][SQL] Support processing array and map type using script ## What changes were proposed in this pull request? Currently, it is not supported to use a script (e.g. python

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-10-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 @gatorsmile Thanks again for reviewing this PR.

[GitHub] spark pull request #19602: [SPARK-22384][SQL] Refine partition pruning when ...

2017-10-29 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19602#discussion_r147583510 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/HiveClientSuite.scala --- @@ -53,7 +52,7 @@ class HiveClientSuite(version: String

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-10-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 @gatorsmile Thanks a lot for your help :) >Can we just evaluate the right side CAST(2017 as STRING), since it is foldable? Do you mean to add a new rule ? -- cast the t

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-10-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 Could we fix this? SQL like the above is common in my warehouse.

[GitHub] spark pull request #19602: [SPARK-22384][SQL] Refine partition pruning when ...

2017-10-29 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19602 [SPARK-22384][SQL] Refine partition pruning when attribute is wrapped in Cast ## What changes were proposed in this pull request? Sql below will get all partitions from metastore, which

[GitHub] spark pull request #19573: [SPARK-22350][SQL] select grouping__id from subqu...

2017-10-27 Thread jinxing64
GitHub user jinxing64 reopened a pull request: https://github.com/apache/spark/pull/19573 [SPARK-22350][SQL] select grouping__id from subquery ## What changes were proposed in this pull request? Currently, sql below will fail: ``` SELECT cnt, k2, k3, grouping__id

[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2017-10-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19573 Thanks a lot. I will leave it open (if that's OK). Actually, a friend of mine from another company also suffers from this issue. Maybe people can leave some ideas on this. Thanks again for the comment

[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2017-10-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19573 @gatorsmile thanks for the reply. It seems you prefer to give the alias explicitly. I will close this PR and go by your suggestion. But in my warehouse, there are lots of ETLs which

[GitHub] spark pull request #19573: [SPARK-22350][SQL] select grouping__id from subqu...

2017-10-27 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/19573

[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2017-10-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19573 @DonnyZone Thanks for taking a look. I think not quite the same. After https://github.com/apache/spark/pull/18270, all `grouping__id` are transformed to be `GroupingID` , which makes

[GitHub] spark pull request #19573: [SPARK-22350][SQL] select grouping__id from subqu...

2017-10-25 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19573 [SPARK-22350][SQL] select grouping__id from subquery ## What changes were proposed in this pull request? Currently, sql below will fail: ``` SELECT cnt, k2, k3, grouping__id

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-10-24 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 >My main concern is, we'd better not to put burden on Spark to deal with metastore failures I think this makes sense. I was also thinking about this when proposing this PR. I do ag

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-10-24 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 @wzhfy Thanks for the comment; I see your point. In my cluster, the namenode is under heavy pressure, and errors in stats are quite likely. Users usually do not know there's an error in the stats

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-10-24 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 @viirya Thanks a lot for comments. 1. In current change, I verify the stats from file system only when the relation is under join. 2. I added a warning when the size from file system

[GitHub] spark pull request #19560: [SPARK-22334][SQL] Check table size from filesyst...

2017-10-23 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19560#discussion_r146449741 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -120,22 +120,41 @@ class DetermineTableStats(session

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-10-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 @gatorsmile @dongjoon-hyun Thanks a lot for looking into this. This pr aims to avoid OOM if metastore fails to update table properties after the data is already produced

[GitHub] spark pull request #19560: [SPARK-22334][SQL] Check table size from HDFS in ...

2017-10-23 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19560 [SPARK-22334][SQL] Check table size from HDFS in case the size in metastore is wrong. ## What changes were proposed in this pull request? Currently we use table properties('totalSize

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144776222 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144772767 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144770017 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144768355 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark issue #19476: [SPARK-22062][CORE] Spill large block to disk in BlockMa...

2017-10-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19476 @jerryshao Thanks a lot for the ping. I left comments based on my understanding. Not sure if it's helpful :)

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144577910 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -355,11 +355,21 @@ package object config { .doc

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144586111 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -355,11 +355,21 @@ package object config { .doc

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144585860 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark pull request #19364: [SPARK-22144][SQL] ExchangeCoordinator combine th...

2017-09-28 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19364#discussion_r141781986 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala --- @@ -232,7 +232,7 @@ class ExchangeCoordinator

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 @kiszk Thanks a lot for comments. Tests passed now. In current change `ordered` is included in `jsonValue`. But I'm not sure it is appropriate. Thanks again for taking time looking

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 Conflicts resolved.

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 It seems the failed SparkR unit tests are not related. In the current change, I added `trait OrderSpecified`, expressions(`BinaryComparison`, `Max`, `Min`, `SortArray`, `SortOrder`) using ordering

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 Jenkins, retest this please.

[GitHub] spark issue #19330: Orderable MapType

2017-09-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 @hvanhovell Thanks a lot for the comment. I got your point. I will refine soon.

[GitHub] spark issue #19330: Orderable MapType

2017-09-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 It seems https://github.com/apache/spark/pull/15970 is not being worked on. I resolved conflicts and added some tests in this PR

[GitHub] spark pull request #19330: Orderable MapType

2017-09-23 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19330#discussion_r140627825 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -663,6 +663,18 @@ class

[GitHub] spark pull request #19330: Orderable MapType

2017-09-23 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19330 Orderable MapType ## What changes were proposed in this pull request? We can make MapType orderable, and thus usable in aggregates and joins. ## How was this patch tested

[GitHub] spark issue #19068: [SPARK-21428][SQL][FOLLOWUP]CliSessionState should point...

2017-09-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19068 @yaooqinn This change works well for me, thanks for the fix! After this change, the hive client for execution (which points to a dummy local metastore) will never be used when running sql in `spark-sql

[GitHub] spark issue #19219: [SPARK-21993][SQL] Close sessionState in shutdown hook.

2017-09-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19219 It seems there are still some other places where session state is not guaranteed to be closed. I will update this pr soon

[GitHub] spark issue #19219: [SPARK-21993][SQL] Close sessionState in shutdown hook.

2017-09-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19219 cc @cloud-fan @jiangxb1987 Could you please take a look at this?

[GitHub] spark pull request #19219: [SPARK-21993][SQL] Close sessionState in shutdown...

2017-09-13 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19219 [SPARK-21993][SQL] Close sessionState in shutdown hook. ## What changes were proposed in this pull request? In current code, `SessionState` in `SparkSQLCLIDriver` is not guaranteed

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile OK and thanks a lot for review :)

[GitHub] spark issue #15970: [SPARK-18134][SQL] Comparable MapTypes [POC]

2017-09-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/15970 @hvanhovell Are you still working on this? I think this feature is useful :)

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-09-10 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/19086

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile I am sorry if this PR breaks the rules of the current code. But I think the function is a good one (convenient for users). In our warehouse, hundreds of ETLs are using this function

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 I'm from Meituan, a Chinese company

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 Is it not OK to follow Spark's current behavior? (It will be different from Hive.) I made this PR because we are migrating from Hive to Spark and lots of our users are using this function

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile Any more comments on this? Regarding the behavior change, should we follow Spark's previous behavior or follow Hive? I'm OK with both

[GitHub] spark pull request #19127: [SPARK-21916][SQL] Set isolationOn=true when crea...

2017-09-05 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/19127

[GitHub] spark issue #19127: [SPARK-21916][SQL] Set isolationOn=true when create hive...

2017-09-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19127 Sure, I will close this then.

[GitHub] spark pull request #19127: [SPARK-21916][SQL] Set isolationOn=true when crea...

2017-09-04 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19127 [SPARK-21916][SQL] Set isolationOn=true when create hive client for metadata. ## What changes were proposed in this pull request? In current code, we set `isolationOn=!isCliSession

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 Sure, current behavior is hive behavior.
