[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128217046 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -53,9 +56,13 @@ // that the

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128217450 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -139,6 +153,32 @@ public void

[GitHub] spark pull request #18388: [SPARK-21175] Reject OpenBlocks when memory short...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18388#discussion_r128217700 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -25,6 +25,9 @@ import

[GitHub] spark pull request #18634: [SPARK-21414] Refine SlidingWindowFunctionFrame t...

2017-07-19 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/18634#discussion_r128227194 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLWindowFunctionSuite.scala --- @@ -356,6 +356,46 @@ class SQLWindowFunctionSuite

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21772 Jenkins, test this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #21712: [SPARK-22384][SQL][followup] Refine partition pruning wh...

2018-07-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21712 Thanks for pinging me. LGTM

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Sure, let me do it today.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-21 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 @cloud-fan Thanks for the ping. I updated the description. Let me know if I should refine it.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-22 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Sure, updated.

[GitHub] spark issue #19868: [SPARK-22676] Replace spark.sql.hive.verifyPartitionPath...

2018-09-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Sure, updated.

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2018-10-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 @maropu Thanks, and yes, I'm still here and I can keep going if there is interest in this PR. I will update this PR this we

[GitHub] spark issue #10572: SPARK-12619 Combine small files in a hadoop directory in...

2017-11-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/10572 @HyukjinKwon To merge small files, should I tune `spark.sql.files.maxPartitionBytes`? But IIUC it only works for `FileSourceScanExec`. So when I select from hive table, it doesn't

[GitHub] spark pull request #19868: [SPARK-22676] Avoid iterating all partition paths...

2017-12-02 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19868 [SPARK-22676] Avoid iterating all partition paths when spark.sql.hive.verifyPartitionPath=true ## What changes were proposed in this pull request? In the current code, it will scan all

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Jenkins, retest this please.

[GitHub] spark pull request #19560: [SPARK-22334][SQL] Check table size from HDFS in ...

2017-10-23 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19560 [SPARK-22334][SQL] Check table size from HDFS in case the size in metastore is wrong. ## What changes were proposed in this pull request? Currently we use table properties('tota

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-10-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 @gatorsmile @dongjoon-hyun Thanks a lot for looking into this. This pr aims to avoid OOM if metastore fails to update table properties after the data is already produced. With the

[GitHub] spark pull request #19560: [SPARK-22334][SQL] Check table size from filesyst...

2017-10-23 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19560#discussion_r146449741 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -120,22 +120,41 @@ class DetermineTableStats(session

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-10-24 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 @viirya Thanks a lot for comments. 1. In current change, I verify the stats from file system only when the relation is under join. 2. I added a warning when the size from file system

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-10-24 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 @wzhfy Thanks for the comment; I see your point. In my cluster, the namenode is under heavy pressure, so errors in stats are quite likely. Users often do not know there's an error in

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-10-24 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 >My main concern is, we'd better not to put burden on Spark to deal with metastore failures I think this makes sense. I was also thinking about this when proposing this PR. I

[GitHub] spark pull request #19573: [SPARK-22350][SQL] select grouping__id from subqu...

2017-10-25 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19573 [SPARK-22350][SQL] select grouping__id from subquery ## What changes were proposed in this pull request? Currently, sql below will fail: ``` SELECT cnt, k2, k3, grouping__id

[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2017-10-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19573 @DonnyZone Thanks for taking a look. I think not quite the same. After https://github.com/apache/spark/pull/18270, all `grouping__id` are transformed to be `GroupingID` , which makes

[GitHub] spark pull request #19573: [SPARK-22350][SQL] select grouping__id from subqu...

2017-10-27 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/19573

[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2017-10-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19573 @gatorsmile Thanks for the reply. It seems you prefer to give the alias explicitly. I will close this PR and go with your suggestion. But in my warehouse, there are lots of ETLs which are

[GitHub] spark issue #19573: [SPARK-22350][SQL] select grouping__id from subquery

2017-10-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19573 Thanks a lot. I will leave it open (if that's OK). Actually, my friend from another company also suffers from this issue. Maybe people can leave some ideas on this. Thanks again for comment on

[GitHub] spark pull request #19573: [SPARK-22350][SQL] select grouping__id from subqu...

2017-10-27 Thread jinxing64
GitHub user jinxing64 reopened a pull request: https://github.com/apache/spark/pull/19573 [SPARK-22350][SQL] select grouping__id from subquery ## What changes were proposed in this pull request? Currently, sql below will fail: ``` SELECT cnt, k2, k3, grouping__id

[GitHub] spark pull request #19602: [SPARK-22384][SQL] Refine partition pruning when ...

2017-10-28 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19602 [SPARK-22384][SQL] Refine partition pruning when attribute is wrapped in Cast ## What changes were proposed in this pull request? Sql below will get all partitions from metastore, which

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-10-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 Could we fix this? SQL like the above is common in my warehouse.

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-10-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 @gatorsmile Thanks a lot for your help :) >Can we just evaluate the right side CAST(2017 as STRING), since it is foldable? Do you mean to add a new rule ? -- cast the t

[GitHub] spark pull request #19602: [SPARK-22384][SQL] Refine partition pruning when ...

2017-10-29 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19602#discussion_r147583510 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/HiveClientSuite.scala --- @@ -53,7 +52,7 @@ class HiveClientSuite(version: String

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-10-29 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 @gatorsmile Thanks again for reviewing this PR.

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19652 [SPARK-22435][SQL] Support processing array and map type using script ## What changes were proposed in this pull request? Currently, it is not supported to use a script (e.g. Python) to

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19652#discussion_r148748292 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -1454,22 +1454,24 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19652#discussion_r148749276 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -1485,21 +1487,27 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19652#discussion_r148749862 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -1485,21 +1487,27 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark pull request #19652: [SPARK-22435][SQL] Support processing array and m...

2017-11-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19652#discussion_r148775355 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformationExec.scala --- @@ -267,6 +268,33 @@ private class

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-11-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 True, I restrict it to be Cast (string_typed_attr as integral types) and `EqualTo`. `Not(EqualTo)` is not included, since the extra burden put to metastore is minor

[GitHub] spark issue #19602: [SPARK-22384][SQL] Refine partition pruning when attribu...

2017-11-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19602 @gatorsmile Thanks a lot for reviewing this PR :)

[GitHub] spark issue #19652: [SPARK-22435][SQL] Support processing array and map type...

2017-11-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19652 @gatorsmile (Very gentle ping) Could you please give some comments when you have time :) Thank you so much

[GitHub] spark issue #19560: [SPARK-22334][SQL] Check table size from filesystem in c...

2017-11-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19560 @wangyum Makes sense. You can also try the approach in this PR. If there are many (tens of thousands of) ETLs in the warehouse, we cannot afford to give that many hints or fix all the

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-11-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 Jenkins, retest this please.

[GitHub] spark issue #19068: [SPARK-21428][SQL][FOLLOWUP]CliSessionState should point...

2017-09-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19068 @yaooqinn This change works well for me, thanks for the fix! After this change, the hive client for execution (which points to a dummy local metastore) will never be used when running sql in `spark-sql

[GitHub] spark pull request #19330: Orderable MapType

2017-09-22 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19330 Orderable MapType ## What changes were proposed in this pull request? We can make MapType orderable, and thus usable in aggregates and joins. ## How was this patch tested

[GitHub] spark pull request #19330: Orderable MapType

2017-09-22 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19330#discussion_r140627825 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -663,6 +663,18 @@ class

[GitHub] spark issue #19330: Orderable MapType

2017-09-22 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 It seems https://github.com/apache/spark/pull/15970 is no longer being worked on. I resolved conflicts and added some tests in this PR

[GitHub] spark issue #19330: Orderable MapType

2017-09-23 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 @hvanhovell Thanks a lot for the comment. I got your point. I will refine it soon.

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 Jenkins, retest this please.

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-26 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 It seems the failed SparkR unit tests are not related. In the current change, I added `trait OrderSpecified`, expressions(`BinaryComparison`, `Max`, `Min`, `SortArray`, `SortOrder`) using ordering

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-27 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 Conflicts resolved.

[GitHub] spark issue #19330: [SPARK-18134][SQL] Orderable MapType

2017-09-28 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19330 @kiszk Thanks a lot for comments. Tests passed now. In current change `ordered` is included in `jsonValue`. But I'm not sure it is appropriate. Thanks again for taking time lo

[GitHub] spark pull request #19364: [SPARK-22144][SQL] ExchangeCoordinator combine th...

2017-09-28 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19364#discussion_r141781986 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ExchangeCoordinator.scala --- @@ -232,7 +232,7 @@ class ExchangeCoordinator

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144585860 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144586111 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -355,11 +355,21 @@ package object config { .doc("

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-13 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144577910 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -355,11 +355,21 @@ package object config { .doc("

[GitHub] spark issue #19476: [SPARK-22062][CORE] Spill large block to disk in BlockMa...

2017-10-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19476 @jerryshao Thanks a lot for the ping. I left comments based on my understanding. Not sure if they're helpful :)

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144768355 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144770017 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144772767 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark pull request #19476: [SPARK-22062][CORE] Spill large block to disk in ...

2017-10-16 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19476#discussion_r144776222 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1552,4 +1582,65 @@ private[spark] object BlockManager

[GitHub] spark issue #18866: [WIP][SPARK-21649][SQL] Support writing data into hive b...

2017-08-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 In current change: 1. `ClusteredDistribution` becomes `ClusteredDistribution(clustering: Seq[Expression], clustersOpt: Option[Int] = None, useHiveHash: Boolean = false)` -- a) number and

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 Jenkins, retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 @cloud-fan Could you give some advice on this, so I know whether I'm heading in the right direction? I can keep working on it :)

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 @cenyuhai Are you still working on this? Could you please fix the test?

[GitHub] spark issue #18866: [SPARK-21649][SQL] Support writing data into hive bucket...

2017-08-22 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18866 @cloud-fan Thanks for the reply. It looks like #19001 continues this work and is more comprehensive. I will close this PR for now.

[GitHub] spark pull request #18866: [SPARK-21649][SQL] Support writing data into hive...

2017-08-22 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/18866

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 @gatorsmile Could you please explain why the value of `grouping_id()` generated in Spark is different from `grouping__id` in Hive? Is this by design? A lot of our users are

[GitHub] spark issue #18713: [SPARK-21509][SQL] Add a config to enable adaptive query...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18713 cc @cenyuhai As we discussed offline, you may be interested in this.

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-08-30 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19086 [SPARK-21874][SQL] Support changing database when rename table. ## What changes were proposed in this pull request? Support changing database of table by `alter table dbA.XXX rename to

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 Thank you so much!

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile Thanks for taking the time to look at this. I updated the description.

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-08-30 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 Thanks, I will refine soon.

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-09-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18270 Thanks for the notification. Actually we implement the same logic as Hive, though there's a bug ...

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19086#discussion_r136719337 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -502,17 +502,16 @@ private[spark] class HiveExternalCatalog

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile I updated it; let me know if any comments are still unresolved. Thanks again for the review.

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/19086#discussion_r136747432 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -569,46 +569,51 @@ class SessionCatalog

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 yes, correct

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 Sure, the current behavior is the Hive behavior.

[GitHub] spark pull request #19127: [SPARK-21916][SQL] Set isolationOn=true when crea...

2017-09-04 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19127 [SPARK-21916][SQL] Set isolationOn=true when create hive client for metadata. ## What changes were proposed in this pull request? In current code, we set `isolationOn=!isCliSession

[GitHub] spark issue #19127: [SPARK-21916][SQL] Set isolationOn=true when create hive...

2017-09-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19127 Sure, I will close this then.

[GitHub] spark pull request #19127: [SPARK-21916][SQL] Set isolationOn=true when crea...

2017-09-05 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/19127

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile Any more comments on this? Regarding the behavior change, should we follow Spark's previous behavior or follow Hive? I'm ok with

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-06 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 Is it not OK to follow Spark's current behavior? (It will be different from Hive.) I made this PR because we are migrating from Hive to Spark and lots of our users are using this fun

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 I'm from Meituan, a Chinese company

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile I'm sorry if this PR breaks the rules of the current code. But I think the function is a good one (convenient for users). In our warehouse, hundreds of ETLs are using this function

[GitHub] spark pull request #19086: [SPARK-21874][SQL] Support changing database when...

2017-09-10 Thread jinxing64
Github user jinxing64 closed the pull request at: https://github.com/apache/spark/pull/19086

[GitHub] spark issue #15970: [SPARK-18134][SQL] Comparable MapTypes [POC]

2017-09-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/15970 @hvanhovell Are you still working on this? I think this feature is useful :)

[GitHub] spark issue #19086: [SPARK-21874][SQL] Support changing database when rename...

2017-09-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19086 @gatorsmile OK, and thanks a lot for the review :)

[GitHub] spark pull request #19219: [SPARK-21993][SQL] Close sessionState in shutdown...

2017-09-13 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/19219 [SPARK-21993][SQL] Close sessionState in shutdown hook. ## What changes were proposed in this pull request? In current code, `SessionState` in `SparkSQLCLIDriver` is not guaranteed to

[GitHub] spark issue #19219: [SPARK-21993][SQL] Close sessionState in shutdown hook.

2017-09-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19219 cc @cloud-fan @jiangxb1987 Could you please take a look at this?

[GitHub] spark issue #19219: [SPARK-21993][SQL] Close sessionState in shutdown hook.

2017-09-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19219 It seems there are still some other places where session state is not guaranteed to be closed. I will update this pr soon

[GitHub] spark issue #17533: [SPARK-20219] Schedule tasks based on size of input from...

2017-04-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17533 @kayousterhout Thanks a lot for the comment and sorry for the late reply. I replied to your comment from JIRA. Please take a look when you have time :) --- If your project is set up for it, you can

[GitHub] spark issue #17533: [SPARK-20219] Schedule tasks based on size of input from...

2017-04-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17533 @squito Thank you so much for taking a look into this. > we don't want the TSM requesting info from the DAGScheduler Sorry I missed this point for the previous change. Now I

[GitHub] spark pull request #17603: [SPARK-20288] Avoid generating the MapStatus by s...

2017-04-10 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/17603 [SPARK-20288] Avoid generating the MapStatus by stageId in BasicSchedulerIntegrationSuite ## What changes were proposed in this pull request? ShuffleId is determined before job

[GitHub] spark issue #17603: [SPARK-20288] Avoid generating the MapStatus by stageId ...

2017-04-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17603 I found this when testing https://github.com/apache/spark/pull/17533. It failed now and then when trying to get the size of reduce from `MapStatus`. I'm not sure how to make it better: M

[GitHub] spark pull request #17634: [SPARK-20333] HashPartitioner should be compatibl...

2017-04-13 Thread jinxing64
GitHub user jinxing64 opened a pull request: https://github.com/apache/spark/pull/17634 [SPARK-20333] HashPartitioner should be compatible with num of child RDD's partitions. ## What changes were proposed in this pull request? Fix test "don't submit

[GitHub] spark issue #17634: [SPARK-20333] HashPartitioner should be compatible with ...

2017-04-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17634 I found this when doing https://github.com/apache/spark/pull/17533

[GitHub] spark pull request #17533: [WIP][SPARK-20219] Schedule tasks based on size o...

2017-04-14 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/17533#discussion_r111545019 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -472,6 +472,47 @@ class DAGScheduler

[GitHub] spark pull request #17533: [WIP][SPARK-20219] Schedule tasks based on size o...

2017-04-14 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/17533#discussion_r111545285 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -472,6 +472,47 @@ class DAGScheduler

[GitHub] spark pull request #17533: [WIP][SPARK-20219] Schedule tasks based on size o...

2017-04-14 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/17533#discussion_r111545327 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1080,6 +1122,25 @@ class DAGScheduler

[GitHub] spark pull request #17533: [WIP][SPARK-20219] Schedule tasks based on size o...

2017-04-14 Thread jinxing64
Github user jinxing64 commented on a diff in the pull request: https://github.com/apache/spark/pull/17533#discussion_r111545406 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1080,6 +1122,25 @@ class DAGScheduler
