[GitHub] spark pull request #16099: [SPARK-18665][SQL] set statement state to "ERROR"...

2018-02-04 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/16099#discussion_r165878592 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -241,6 +241,8

[GitHub] spark issue #16099: [SPARK-18665][SQL] set statement state to "ERROR" after ...

2018-02-04 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/16099 @gatorsmile two years have passed... I don't know what to say.

[GitHub] spark pull request #16099: [SPARK-18665][SQL] set statement state to "ERROR"...

2018-02-04 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/16099#discussion_r165877094 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala --- @@ -241,6 +241,8

[GitHub] spark issue #14129: [SPARK-16280][SQL] Implement histogram_numeric SQL funct...

2017-12-08 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14129 Is this PR still available?

[GitHub] spark pull request #19756: [SPARK-22527][SQL] Reuse coordinated exchanges if...

2017-11-16 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/19756#discussion_r151615572 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/Exchange.scala --- @@ -109,3 +109,67 @@ case class ReuseExchange(conf: SQLConf

[GitHub] spark pull request #19756: [SPARK-22527][SQL] Reuse coordinated exchanges if...

2017-11-16 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/19756#discussion_r151612678 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/Exchange.scala --- @@ -109,3 +109,67 @@ case class ReuseExchange(conf: SQLConf

[GitHub] spark pull request #19756: [SPARK-22527][SQL] Reuse coordinated exchanges if...

2017-11-16 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/19756#discussion_r151611386 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/Exchange.scala --- @@ -109,3 +109,67 @@ case class ReuseExchange(conf: SQLConf

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-10-08 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/18270 @gatorsmile

[GitHub] spark issue #19301: [SPARK-22084][SQL] Fix performance regression in aggrega...

2017-09-21 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/19301 Should `sum(mt_cnt)` and `sum(ele_cnt)` be computed again?

[GitHub] spark issue #19301: [SPARK-22084][SQL] Fix performance regression in aggrega...

2017-09-21 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/19301 I don't know whether my case can be optimized or not.

[GitHub] spark issue #19301: [SPARK-22084][SQL] Fix performance regression in aggrega...

2017-09-21 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/19301 my case: ```sql select dt, geohash_of_latlng, sum(mt_cnt), sum(ele_cnt), round(sum(mt_cnt) * 1.0 * 100 / sum(mt_cnt_all), 2), round(sum(ele_cnt) * 1.0 * 100 / sum
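
For reference, the query above is cut off in the archive; the following is a minimal, hypothetical reconstruction of the pattern in question (the table name and the final denominator column are assumptions, and a SparkSession named `spark` is assumed). The point is that the same aggregate appears both on its own and inside derived expressions, and `explain` can show how the planner handles the repeated aggregates.

```scala
// Hypothetical reconstruction of the pattern above (the archived query is truncated);
// `some_table` and `ele_cnt_all` are assumptions, not the author's real schema.
val df = spark.sql("""
  SELECT dt,
         geohash_of_latlng,
         SUM(mt_cnt)                                           AS mt_cnt,
         SUM(ele_cnt)                                          AS ele_cnt,
         ROUND(SUM(mt_cnt)  * 1.0 * 100 / SUM(mt_cnt_all), 2)  AS mt_ratio,
         ROUND(SUM(ele_cnt) * 1.0 * 100 / SUM(ele_cnt_all), 2) AS ele_ratio
  FROM   some_table
  GROUP  BY dt, geohash_of_latlng
""")
// Inspect whether the repeated SUM expressions show up once or several times in the plan.
df.explain(true)
```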

[GitHub] spark pull request #19301: [SPARK-22084][SQL] Fix performance regression in ...

2017-09-21 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/19301#discussion_r140155406 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala --- @@ -38,7 +38,7 @@ import

[GitHub] spark pull request #18193: [SPARK-15616] [SQL] CatalogRelation should fallba...

2017-09-19 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/18193#discussion_r139861601 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -140,6 +141,62 @@ class DetermineTableStats(session: SparkSession

[GitHub] spark pull request #18193: [SPARK-15616] [SQL] CatalogRelation should fallba...

2017-09-17 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/18193#discussion_r139312866 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -139,6 +138,54 @@ class DetermineTableStats(session: SparkSession

[GitHub] spark pull request #19219: [SPARK-21993][SQL][WIP] Close sessionState when f...

2017-09-16 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/19219#discussion_r139281683 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala --- @@ -42,7 +42,7 @@ class HiveSessionStateBuilder(session

[GitHub] spark issue #17924: [SPARK-20682][SQL] Support a new faster ORC data source ...

2017-09-03 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/17924 @dongjoon-hyun I have a question: does this ORC data source reader support a table that contains multiple file formats? For example: table/ day=2017-09-01 RCFile day=2017
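
For context, the layout being asked about, one partitioned table whose older partitions are RCFile and newer ones ORC, can be produced with Hive DDL roughly like the sketch below (table and partition names are hypothetical; the statements are usually run from the Hive CLI or beeline). Whether the new native ORC reader handles such a table is exactly the open question here.

```scala
// Hive DDL (hypothetical names) sketching a table whose partitions use different
// file formats; normally run from the Hive CLI/beeline rather than through Spark.
val mixedFormatTableDdl: Seq[String] = Seq(
  "CREATE TABLE logs (msg STRING) PARTITIONED BY (day STRING) STORED AS RCFILE",
  "ALTER TABLE logs ADD PARTITION (day = '2017-09-01')",                // stays RCFile
  "ALTER TABLE logs ADD PARTITION (day = '2017-09-02')",
  "ALTER TABLE logs PARTITION (day = '2017-09-02') SET FILEFORMAT ORC"  // newer data in ORC
)
```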

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-09-02 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/18270 @jinxing64 I think you may revert the changes in Spark and use the same grouping__id logic as Hive, keeping the wrong result consistent with what Hive does.

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-09-02 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/18270 @gatorsmile I had already tried to resolve grouping__id in ResolveFunctions, but ResolveFunctions runs after ResolveGroupingAnalytics, and grouping__id may change in ResolveGroupingAnalytics

[GitHub] spark pull request #18270: [SPARK-21055][SQL] replace grouping__id with grou...

2017-09-02 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/18270#discussion_r136695110 --- Diff: sql/core/src/test/resources/sql-tests/results/group-analytics.sql.out --- @@ -223,12 +223,19 @@ grouping_id() can only be used

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-09-02 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/18270 Why did it fail? Can't I add an ORDER BY? ```java org.scalatest.exceptions.TestFailedException: Expected "...Y CUBE(course, year)[ ORDER BY grouping__id, course, year]", but got

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-31 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/18270 I can't see any comment at 77d4f7c.

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-30 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/18270 retest this please

[GitHub] spark pull request #18270: [SPARK-21055][SQL] replace grouping__id with grou...

2017-08-30 Thread cenyuhai
GitHub user cenyuhai reopened a pull request: https://github.com/apache/spark/pull/18270 [SPARK-21055][SQL] replace grouping__id with grouping_id() ## What changes were proposed in this pull request? Spark does not support grouping__id; it has grouping_id() instead
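
As a hedged illustration of the substitution the title describes (assuming a SparkSession named `spark` and a hypothetical table `sales(course, year)`), Hive's virtual column `grouping__id` is written in Spark as the `grouping_id()` function:

```scala
// Hive style: a virtual column named `grouping__id` (two underscores).
val hiveStyle  = "SELECT course, year, grouping__id  FROM sales GROUP BY course, year WITH CUBE"
// Spark style: the same information comes from the grouping_id() function.
val sparkStyle = "SELECT course, year, grouping_id() FROM sales GROUP BY course, year WITH CUBE"
spark.sql(sparkStyle).show()
```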

[GitHub] spark pull request #19087: [SPARK-21055][SQL] replace grouping__id with grou...

2017-08-30 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/19087

[GitHub] spark pull request #19087: [SPARK-21055][SQL] replace grouping__id with grou...

2017-08-30 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/19087 [SPARK-21055][SQL] replace grouping__id with grouping_id() ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch

[GitHub] spark pull request #18270: [SPARK-21055][SQL] replace grouping__id with grou...

2017-08-30 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/18270

[GitHub] spark pull request #18270: [SPARK-21055][SQL] replace grouping__id with grou...

2017-08-30 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/18270#discussion_r136082775 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -954,6 +951,12 @@ class Analyzer( try

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-08-30 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/18270 OK, I will update it.

[GitHub] spark pull request #18193: [SPARK-15616] [SQL] CatalogRelation should fallba...

2017-06-25 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/18193#discussion_r123903846 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -139,6 +138,54 @@ class DetermineTableStats(session: SparkSession

[GitHub] spark issue #16832: [SPARK-19490][SQL] ignore case sensitivity when filterin...

2017-06-16 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/16832 @taklwu this PR is complete; you can apply it yourself. A committer told me that another PR has fixed this bug, so my PR will not be merged.

[GitHub] spark issue #18270: [SPARK-21055][SQL] replace grouping__id with grouping_id...

2017-06-12 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/18270 Why did it fail?

[GitHub] spark pull request #18270: [SPARK-21055][SQL] replace grouping__id with grou...

2017-06-11 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/18270 [SPARK-21055][SQL] replace grouping__id with grouping_id() ## What changes were proposed in this pull request? Spark does not support grouping__id; it has grouping_id() instead

[GitHub] spark issue #17362: [SPARK-20033][SQL] support hive permanent function

2017-03-23 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/17362 OK, I think @weiqingy's PR will resolve this problem.

[GitHub] spark pull request #17362: [SPARK-20033][SQL] support hive permanent functio...

2017-03-23 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/17362#discussion_r107828225 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala --- @@ -135,6 +142,35 @@ private[sql] class HiveSessionCatalog

[GitHub] spark issue #17362: [SPARK-20033][SQL] support hive permanent function

2017-03-23 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/17362 @gatorsmile Hi, Spark only supports Hive UDFs whose resources are on the local disk, or added via spark-sql --jars xxx.jar or similar. But I think Spark doesn't support the Hive permanent function which
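
For context, a Hive permanent function is registered in the metastore together with its resources, which typically live on a shared filesystem such as HDFS rather than the local disk. A hypothetical sketch (made-up names and paths, SparkSession named `spark` assumed):

```scala
// Hypothetical Hive permanent function whose JAR lives on HDFS, not the local disk.
// Class name, database, function name, and path are all made up for illustration.
spark.sql("""
  CREATE FUNCTION mydb.normalize_city AS 'com.example.udf.NormalizeCity'
  USING JAR 'hdfs:///shared/udfs/normalize-city-1.0.jar'
""")
// What a user would then expect to work from Spark SQL:
spark.sql("SELECT mydb.normalize_city(city) FROM mydb.orders").show()
```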

[GitHub] spark pull request #17362: [SPARK-20033][SQL] support hive permanent functio...

2017-03-20 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/17362 [SPARK-20033][SQL] support hive permanent function ## What changes were proposed in this pull request? support hive permanent function ## How was this patch tested? You can

[GitHub] spark issue #16832: [SPARK-19490][SQL] change hive column names to lower cas...

2017-02-09 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/16832 @rxin I think it is safe; it is only used to check whether the schema contains the columns, and Hive columns are not case-sensitive.
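
A minimal sketch (not the PR's actual diff) of the rationale: because Hive column names are case-insensitive, a schema-containment check should ignore case when comparing names.

```scala
// Minimal sketch (not the PR's actual code) of a case-insensitive containment check.
def schemaContains(schemaColumns: Seq[String], requested: Seq[String]): Boolean = {
  val normalized = schemaColumns.map(_.toLowerCase).toSet
  requested.forall(col => normalized.contains(col.toLowerCase))
}

// Example: a file footer may report "UserId" while Hive reports "userid".
schemaContains(Seq("UserId", "eventTime"), Seq("userid", "EVENTTIME"))  // true
```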

[GitHub] spark pull request #16832: [SPARK-19490][SQL] change column names to lower c...

2017-02-07 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/16832 [SPARK-19490][SQL] change column names to lower case ## What changes were proposed in this pull request? change column names to lower case ## How was this patch tested? You can

[GitHub] spark pull request #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter...

2017-01-11 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/16481#discussion_r95538748 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -413,17 +413,22 @@ case class DataSource

[GitHub] spark issue #15109: [SPARK-17501][CORE] Record executor heartbeat timestamp ...

2016-12-15 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/15109 OK, I will close this PR. This is not a big problem.

[GitHub] spark pull request #15109: [SPARK-17501][CORE] Record executor heartbeat tim...

2016-12-15 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/15109

[GitHub] spark pull request #16099: [SPARK-18665][SQL] set statement state to error a...

2016-12-01 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/16099 [SPARK-18665][SQL] set statement state to error after user canceled job ## What changes were proposed in this pull request? Set the statement state to ERROR after the user cancels the job.

[GitHub] spark pull request #16097: [SPARK-18665] set job to "ERROR" when job is canc...

2016-11-30 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/16097

[GitHub] spark pull request #16097: [SPARK-18665] set job to "ERROR" when job is canc...

2016-11-30 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/16097 [SPARK-18665] set job to "ERROR" when job is canceled ## What changes were proposed in this pull request? Set the job state to "ERROR" when the job is canceled.

[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-23 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/15041 OK, I will close this PR.

[GitHub] spark pull request #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when th...

2016-09-23 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/15041

[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-23 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/15041 The driver is OK, but the executor runs out of memory; this method is called on the executor. Our maximum memory limit is 15 GB.

[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-23 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/15041 For data-security reasons, we can't fetch all the data; every SQL query is limited to 10 million records. But sometimes it still OOMs... What I need is to avoid the OOM. @srowen Do you have any ideas?

[GitHub] spark pull request #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when th...

2016-09-20 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/15041#discussion_r79754097 --- Diff: core/src/main/scala/org/apache/spark/util/collection/Utils.scala --- @@ -30,10 +34,22 @@ private[spark] object Utils { * Returns

[GitHub] spark pull request #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when th...

2016-09-20 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/15041#discussion_r79753174 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1384,14 +1385,15 @@ abstract class RDD[T: ClassTag]( * @param ord the implicit

[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-15 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14969 OK, so it's still not certain that this will never happen again, because SparkQA can't tell whether a developer has added all the excludes.

[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-15 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14969 Ah, I was confused by MimaExcludes.scala. I asked @liancheng; he told me to just add these to MimaExcludes.scala, which is imported from Spark 2.0. I see your HOTFIX; you just removed what I

[GitHub] spark pull request #15109: [SPARK-17501][CORE] Record executor heartbeat tim...

2016-09-15 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/15109 [SPARK-17501][CORE] Record executor heartbeat timestamp when received heartbeat event. ## What changes were proposed in this pull request? Record executor's latest heartbeat timestamp when
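
A minimal, hypothetical sketch of the idea in the title: record each executor's most recent heartbeat time as heartbeat events arrive, so staleness can later be checked against a timeout. This is illustrative only, not the PR's code.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical sketch: track the latest heartbeat timestamp per executor.
object HeartbeatTimes {
  private val lastHeartbeatMs = TrieMap.empty[String, Long]

  def onHeartbeat(executorId: String, receivedAtMs: Long = System.currentTimeMillis()): Unit =
    lastHeartbeatMs(executorId) = receivedAtMs

  // Executors whose last heartbeat is older than `timeoutMs` as of `nowMs`.
  def expired(nowMs: Long, timeoutMs: Long): Seq[String] =
    lastHeartbeatMs.collect { case (id, ts) if nowMs - ts > timeoutMs => id }.toSeq
}
```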

[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-13 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14969 OK

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-12 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r78360879 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -38,47 +37,68 @@ private[ui] class ExecutorsTab(parent: SparkUI) extends

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-12 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r78358827 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -38,47 +37,68 @@ private[ui] class ExecutorsTab(parent: SparkUI) extends

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-12 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r78357353 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -38,47 +37,68 @@ private[ui] class ExecutorsTab(parent: SparkUI) extends

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-12 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r78355162 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -38,47 +37,68 @@ private[ui] class ExecutorsTab(parent: SparkUI) extends

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-12 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r78352242 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -38,47 +37,68 @@ private[ui] class ExecutorsTab(parent: SparkUI) extends

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-12 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r78351884 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -38,47 +37,68 @@ private[ui] class ExecutorsTab(parent: SparkUI) extends

[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-09-12 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14737 thank you!

[GitHub] spark pull request #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when th...

2016-09-10 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/15041#discussion_r78272386 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -493,8 +494,7 @@ abstract class RDD[T: ClassTag]( * * @param weights

[GitHub] spark pull request #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when th...

2016-09-09 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/15041 [SPARK-17488][CORE] TakeAndOrder will OOM when the data is very large ## What changes were proposed in this pull request? The function Utils.takeOrdered sorts all the data in memory; when
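
For background, a top-k over an iterator can be kept in a bounded max-heap so per-partition memory grows with k rather than with the data size; a minimal sketch (not Spark's actual Utils.takeOrdered) follows. With k around 10 million, as in the per-query limit discussed in this thread, even the bounded buffer is a real memory cost.

```scala
import scala.collection.mutable

// Minimal sketch (not Spark's Utils.takeOrdered): keep only the k smallest
// elements seen so far in a max-heap, evicting the current largest when a
// smaller element arrives.
def takeOrderedSketch[T](it: Iterator[T], k: Int)(implicit ord: Ordering[T]): List[T] =
  if (k <= 0) Nil
  else {
    val heap = mutable.PriorityQueue.empty[T](ord) // max-heap: head is the largest kept element
    it.foreach { x =>
      if (heap.size < k) heap.enqueue(x)
      else if (ord.lt(x, heap.head)) { heap.dequeue(); heap.enqueue(x) }
    }
    heap.dequeueAll.reverse.toList // ascending order
  }

takeOrderedSketch(Iterator(5, 1, 9, 3, 7), 3) // List(1, 3, 5)
```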

[GitHub] spark issue #15014: [SPARK-17429][SQL] use ImplicitCastInputTypes with funct...

2016-09-09 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/15014 In this case, we store a business type as an int (to decrease record size); for example, some values are machine error types and some are application types.

[GitHub] spark issue #15014: [SPARK-17429][SQL] use ImplicitCastInputTypes with funct...

2016-09-09 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/15014 @hvanhovell Why is Hive so popular? Because Hive is compatible and stable. From the user's point of view, Hive is easy to use; users don't need to care about types all the time. I agree that Hive

[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-09-08 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14737 @srowen If it is OK, can you merge this PR to master? Thank you.

[GitHub] spark issue #14969: [SPARK-17406][WEB UI] limit timeline executor events

2016-09-08 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14969 @srowen I removed the parallel maps; please review the latest code. Thank you!

[GitHub] spark pull request #15014: [SPARK-17429][SQL] use ImplicitCastInputTypes wit...

2016-09-08 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/15014 [SPARK-17429][SQL] use ImplicitCastInputTypes with function Length ## What changes were proposed in this pull request? select length(11); select length(2.0); these SQL statements will return
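
The two statements from the description, wrapped so the repro can be run directly against a SparkSession named `spark`; the archived message is cut off before the reported results, so no expected output is claimed here.

```scala
// Repro statements copied from the PR description; results intentionally not asserted.
spark.sql("SELECT length(11)").show()
spark.sql("SELECT length(2.0)").show()
```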

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-06 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r77652830 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -70,15 +72,33 @@ class ExecutorsListener(storageStatusListener

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-06 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r77652206 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala --- @@ -123,55 +123,55 @@ private[ui] class AllJobsPage(parent: JobsTab) extends

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-06 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r77651971 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -70,15 +72,33 @@ class ExecutorsListener(storageStatusListener

[GitHub] spark pull request #14969: [SPARK-17406][WEB UI] limit timeline executor eve...

2016-09-06 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14969#discussion_r77651759 --- Diff: core/src/main/scala/org/apache/spark/ui/exec/ExecutorsTab.scala --- @@ -59,7 +59,9 @@ class ExecutorsListener(storageStatusListener

[GitHub] spark issue #14969: [SPARK-17406][WEB-UI] limit timeline executor events

2016-09-06 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14969 [error] * method executorIdToData()scala.collection.mutable.HashMap in class org.apache.spark.ui.exec.ExecutorsListener does not have a correspondent in current version [error]filter

[GitHub] spark pull request #14969: [SPARK-17406][WEB-UI] limit timeline executor eve...

2016-09-05 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/14969 [SPARK-17406][WEB-UI] limit timeline executor events ## What changes were proposed in this pull request? The job page will be too slow to open when there are thousands of executor events
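
A minimal, hypothetical sketch of the general idea in the title: cap how many executor events are kept for the timeline so the jobs page does not have to render thousands of them. This is illustrative only, not the PR's code.

```scala
import scala.collection.mutable

// Hypothetical bounded buffer: only the most recent `maxEvents` events are kept.
class BoundedEventBuffer[T](maxEvents: Int) {
  private val events = mutable.ArrayBuffer.empty[T]

  def add(event: T): Unit = {
    events += event
    if (events.size > maxEvents) events.remove(0, events.size - maxEvents)
  }

  def snapshot: Seq[T] = events.toList
}

val buf = new BoundedEventBuffer[String](maxEvents = 1000)
(1 to 5000).foreach(i => buf.add(s"executor-event-$i"))
buf.snapshot.size // 1000: older events were dropped
```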

[GitHub] spark pull request #14966: Merge pull request #8 from apache/master

2016-09-05 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/14966

[GitHub] spark issue #14966: Merge pull request #8 from apache/master

2016-09-05 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14966 Sorry, I made a mistake... I wanted to merge a pull request into my own fork.

[GitHub] spark pull request #14966: Merge pull request #8 from apache/master

2016-09-05 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/14966 Merge pull request #8 from apache/master ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested

[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-08-25 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14737 @srowen Why did I set this value to 2? Because a "JOIN" action needs 2 elements. Users usually don't care how many partitions the graph has; they just want to know the relation

[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-08-25 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14737 I am very sorry about that: the first picture is for the stage, the second picture is for the job, but both are from the same job, "select count(1) from partitionedTables".

[GitHub] spark issue #14737: [SPARK-17171][WEB UI] DAG will list all partitions in th...

2016-08-25 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14737 @srowen Please review the latest code, thank you.

[GitHub] spark issue #14739: [SPARK-17176][WEB UI]set default task sort column to "St...

2016-08-23 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14739 @srowen Can we make it an option, defaulting to "Index", so users can choose "Status" or anything else?

[GitHub] spark issue #14737: [Spark-17171][WEB UI] DAG will list all partitions in th...

2016-08-21 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14737 @srowen

[GitHub] spark pull request #14737: [Spark-17171][WEB UI] DAG will list all partition...

2016-08-21 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14737#discussion_r75593904 --- Diff: core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala --- @@ -119,18 +119,47 @@ private[ui] object RDDOperationGraph extends

[GitHub] spark pull request #14737: [Spark-17171][WEB UI] DAG will list all partition...

2016-08-21 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/14737#discussion_r75593902 --- Diff: core/src/main/scala/org/apache/spark/ui/scope/RDDOperationGraph.scala --- @@ -119,18 +119,47 @@ private[ui] object RDDOperationGraph extends

[GitHub] spark issue #14739: [SPARK-17176][WEB UI]set default task sort column to "St...

2016-08-21 Thread cenyuhai
Github user cenyuhai commented on the issue: https://github.com/apache/spark/pull/14739 Yes, "FAILED" will come before "RUNNING". That is what I want, because knowing why a task failed matters more to us than sorting by ID; the ID is just a unique identifier f

[GitHub] spark pull request #14739: [SPARK-17176][WEB UI]set default task sort column...

2016-08-21 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/14739 [SPARK-17176][WEB UI] set default task sort column to "Status" ## What changes were proposed in this pull request? Tasks are sorted by "Index" on the Stage page, but users are alw

[GitHub] spark pull request #14737: [Spark-17171][WEB UI] DAG will list all partition...

2016-08-20 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/14737 [Spark-17171][WEB UI] DAG will list all partitions in the graph ## What changes were proposed in this pull request? The DAG lists all partitions in the graph, which is too slow and makes the graph hard to read

[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...

2016-05-07 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/11546

[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...

2016-05-06 Thread cenyuhai
Github user cenyuhai commented on the pull request: https://github.com/apache/spark/pull/11546#issuecomment-217467789 ok to test

[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...

2016-05-06 Thread cenyuhai
Github user cenyuhai commented on the pull request: https://github.com/apache/spark/pull/11546#issuecomment-217466944 @andrewor14 I altered the code as you said, but the test failed because of a timeout. It seems the failure is unrelated to my change...

[GitHub] spark pull request: [SPARK-13566][CORE] Avoid deadlock between Blo...

2016-04-19 Thread cenyuhai
Github user cenyuhai commented on the pull request: https://github.com/apache/spark/pull/11546#issuecomment-212208592 @jamesecahill I don't know whether @JoshRosen will provide any other patch for this issue, but I have fixed this bug in my production environment with this PR

[GitHub] spark pull request: [Spark-13772][SQL] fix data type mismatch for ...

2016-04-15 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/11605

[GitHub] spark pull request: [SPARK-13566] Avoid deadlock between BlockMana...

2016-03-10 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/11546#discussion_r55668498 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -227,6 +228,17 @@ private[spark] class Executor( logError

[GitHub] spark pull request: [Spark-13772] fix data type mismatch for decim...

2016-03-10 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/11605#discussion_r55665217 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercionSuite.scala --- @@ -299,6 +299,19 @@ class

[GitHub] spark pull request: [SPARK-13566] Avoid deadlock between BlockMana...

2016-03-10 Thread cenyuhai
Github user cenyuhai commented on a diff in the pull request: https://github.com/apache/spark/pull/11546#discussion_r55664945 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -227,6 +228,17 @@ private[spark] class Executor( logError

[GitHub] spark pull request: [Spark-13772] fix data type mismatch for decim...

2016-03-09 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/11605 [Spark-13772] fix data type mismatch for decimal Fix a data type mismatch for decimal; this is a patch for branch 1.6.

[GitHub] spark pull request: [SPARK-13566] Avoid deadlock between BlockMana...

2016-03-06 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/11546 [SPARK-13566] Avoid deadlock between BlockManager and Executor Thread A temporary patch for branch 1.6 to avoid a deadlock between the BlockManager and the executor thread.

[GitHub] spark pull request: Multi user

2015-06-14 Thread cenyuhai
Github user cenyuhai closed the pull request at: https://github.com/apache/spark/pull/6812

[GitHub] spark pull request: Multi user

2015-06-14 Thread cenyuhai
Github user cenyuhai commented on the pull request: https://github.com/apache/spark/pull/6812#issuecomment-111811208 I am so sorry; I just pushed my commits to my branch. I didn't know this would happen.

[GitHub] spark pull request: Multi user

2015-06-14 Thread cenyuhai
GitHub user cenyuhai opened a pull request: https://github.com/apache/spark/pull/6812 Multi user You can merge this pull request into a Git repository by running: $ git pull https://github.com/cenyuhai/spark MultiUser
