[GitHub] spark pull request #17581: [SPARK-20248][SQL] Spark SQL add limit parameter ...
Github user shaolinliu closed the pull request at: https://github.com/apache/spark/pull/17581
[GitHub] spark issue #17581: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 ok.
[GitHub] spark pull request #17581: [SPARK-20248][SQL] Spark SQL add limit parameter ...
Github user shaolinliu commented on a diff in the pull request: https://github.com/apache/spark/pull/17581#discussion_r111080385

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -359,6 +359,16 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
+
+  val THRIFTSERVER_RESULT_LIMIT =
+    buildConf("spark.sql.thriftserver.retainedResults")
--- End diff --

In Hive, the parameter "hive.fetch.task.conversion=minimal" makes Hive read results directly from the MR job's output on disk; in that mode Hive does not collect the result into memory, which avoids crashing the Hive process. And I cannot think of a better name.
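For reference, a completed version of the entry under discussion might look like the following sketch; the builder chain after buildConf and the doc wording are assumptions pieced together from the PR description and neighboring SQLConf entries, not the actual patch:

    // Sketch only: the default of 200 and the 0-means-disabled semantics
    // come from the PR description; the exact builder chain is assumed.
    val THRIFTSERVER_RESULT_LIMIT =
      buildConf("spark.sql.thriftserver.retainedResults")
        .internal()
        .doc("The maximum number of rows the Thrift Server returns for a query " +
          "that has no LIMIT clause; 0 means no implicit limit is added.")
        .intConf
        .createWithDefault(200)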
[GitHub] spark issue #17581: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 Ok, I have modified the description.
[GitHub] spark issue #17581: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 Sorry, I was wrong. It only increases the user's query time; it does not occupy the resources.
[GitHub] spark issue #17581: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 Within a department we cannot constrain every user, but when we start the Thrift Server (ts2) with this parameter, it does not matter even if a user makes a mistake. We have used the SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT parameter, but it slowed down the actual business processing, so we added this parameter instead.
[GitHub] spark issue #17581: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 My opinion is: in production, users often run SELECT without a LIMIT, which often takes the service offline. This is a common situation, so I added the parameter. When SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT is enabled, the flow is: beeline[get] -> hs2[get] -> executor[ret] -> hs2[ret] -> beeline[ret], and throughout that flow the executors' resources stay occupied. When it is disabled, the flow is: beeline[get] -> hs2[collect] -> beeline[ret]. This is similar to Redis pipelining as a way to improve performance; reducing the elapsed time reduces resource usage as well.
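To make the two flows above concrete, here is a simplified Scala sketch of the difference between incremental and eager result materialization on the driver; it illustrates the mechanism only and is not the actual SparkExecuteStatementOperation code:

    import scala.collection.JavaConverters._
    import org.apache.spark.sql.{DataFrame, Row}

    // Incremental mode (THRIFTSERVER_INCREMENTAL_COLLECT enabled): rows are
    // pulled partition by partition, so executors stay occupied until the
    // client has fetched the last row.
    def incremental(df: DataFrame): Iterator[Row] = df.toLocalIterator().asScala

    // Eager mode (the default): the whole result is collected into driver
    // memory up front, so executors are released sooner, but an unbounded
    // result set can exhaust the driver's memory.
    def eager(df: DataFrame): Iterator[Row] = df.collect().iterator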
[GitHub] spark issue #17581: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 @ueshin please take a look at this pr, thanks.
[GitHub] spark pull request #17581: [SPARK-20248][SQL] Spark SQL add limit parameter ...
GitHub user shaolinliu opened a pull request: https://github.com/apache/spark/pull/17581 [SPARK-20248][SQL] Spark SQL add limit parameter to enhance the reliability.

## What changes were proposed in this pull request?
Add a parameter, "spark.sql.thriftServer.retainedResults", with a default value of 200. When a user runs a query without a LIMIT, this implicitly adds a limit to the query. When a user runs a query with a LIMIT, we do nothing. If the parameter is set to 0, we also do nothing.

## How was this patch tested?
Manual tests.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/shaolinliu/spark SPARK-20248
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/17581.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #17581

commit b47cf92fee79a57fbec37bbf9c7d35753f5c7d75
Author: liu shaolin
Date: 2017-04-07T05:25:21Z
modified: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
modified: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala

commit f556bf64b46b40beb58872ebb2fafa69a97c43ec
Author: liu shaolin
Date: 2017-04-07T08:19:04Z
repair the compile error.

commit 81cd6333164b0915a0fb492a34ab84b973f7cd0d
Author: liu shaolin
Date: 2017-04-07T09:22:30Z
modify the doc.

commit 59a1b1ae96809076916849c9a1d396dc7d40251d
Author: liu shaolin
Date: 2017-04-10T02:01:03Z
Merge branch 'liu' into SPARK-20248
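A minimal sketch of how such an implicit limit could be applied when the Thrift Server executes a statement; the helper name and the GlobalLimit check are illustrative assumptions and may differ from the actual patch:

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.catalyst.plans.logical.GlobalLimit

    // Hypothetical helper: add a LIMIT only when the query has none and the
    // configured cap (spark.sql.thriftServer.retainedResults) is non-zero.
    def withImplicitLimit(df: DataFrame, maxRows: Int): DataFrame = {
      val alreadyLimited = df.queryExecution.logical.isInstanceOf[GlobalLimit]
      if (maxRows > 0 && !alreadyLimited) df.limit(maxRows) else df
    }

    // Usage inside SparkExecuteStatementOperation.execute() (sketch):
    //   val df = sqlContext.sql(statement)
    //   resultList = Some(withImplicitLimit(df, maxRows).collect())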
[GitHub] spark pull request #17561: [SPARK-20248][SQL] Spark SQL add limit parameter ...
Github user shaolinliu closed the pull request at: https://github.com/apache/spark/pull/17561
[GitHub] spark issue #17561: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17561 ok.
[GitHub] spark issue #17561: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17561 @ueshin SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT wastes cluster resources: the cluster's resources are only released when the query finishes, so the executors' memory and CPU stay occupied the whole time, and in general it is slower.
[GitHub] spark pull request #17561: [SPARK-20248][SQL] Spark SQL add limit parameter ...
Github user shaolinliu commented on a diff in the pull request: https://github.com/apache/spark/pull/17561#discussion_r110346704

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -359,6 +359,13 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
+
+  val THRIFTSERVER_RESULT_LIMIT =
+    buildConf("spark.sql.thriftServer.retainedResults")
+      .internal()
+      .doc("The number of sql results returned by Thrift Server, and 0 is unlimited.")
--- End diff --

OK, I will modify it.
[GitHub] spark issue #17561: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17561 I have fixed the error and retested the use case.
[GitHub] spark issue #17561: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17561 @ueshin please take a look at this pr, thanks.
[GitHub] spark pull request #17560: [SPARK-20248][SQL] Spark SQL add limit parameter ...
Github user shaolinliu closed the pull request at: https://github.com/apache/spark/pull/17560
[GitHub] spark issue #17560: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17560 @ueshin I have resubmitted the PR; please close this one.
[GitHub] spark pull request #17561: [SPARK-20248][SQL] Spark SQL add limit parameter ...
GitHub user shaolinliu opened a pull request: https://github.com/apache/spark/pull/17561 [SPARK-20248][SQL] Spark SQL add limit parameter to enhance the reliability.

## What changes were proposed in this pull request?
Add a parameter, "spark.sql.thriftServer.retainedResults", with a default value of 200. When a user runs a query without a LIMIT, this implicitly adds a limit to the query. When a user runs a query with a LIMIT, we do nothing. If the parameter is set to 0, we also do nothing.

## How was this patch tested?
Manual tests.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/shaolinliu/spark liu
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/17561.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #17561

commit b47cf92fee79a57fbec37bbf9c7d35753f5c7d75
Author: liu shaolin
Date: 2017-04-07T05:25:21Z
modified: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
modified: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala
[GitHub] spark issue #17560: [SPARK-20248][SQL] Spark SQL add limit parameter to enha...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17560 Yes, I am fixing it, thanks.
[GitHub] spark pull request #17560: [SPARK-20248][SQL] Spark SQL add limit parameter ...
GitHub user shaolinliu opened a pull request: https://github.com/apache/spark/pull/17560 [SPARK-20248][SQL] Spark SQL add limit parameter to enhance the reliability.

## What changes were proposed in this pull request?
Add a parameter, "spark.sql.thriftServer.retainedResults", with a default value of 200. When a user runs a query without a LIMIT, this implicitly adds a limit to the query. When a user runs a query with a LIMIT, we do nothing. If the parameter is set to 0, we also do nothing.

## How was this patch tested?
Manual tests.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/shaolinliu/spark SPARK-20248
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/17560.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #17560

commit 2ce4f358b6cc70c45a715214b9f1024ee724b804
Author: liu shaolin
Date: 2017-04-07T05:25:21Z
modified: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
modified: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala
[GitHub] spark issue #17258: [SPARK-19807][Web UI] Add reason for cancellation when a ...
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17258 Thank you for the advice; your wording for the result description seems simpler and more appropriate. I have pushed the change to the PR.
[GitHub] spark pull request #17258: [SPARK-19807][Web UI] Add reason for cancellation ...
GitHub user shaolinliu opened a pull request: https://github.com/apache/spark/pull/17258 [SPARK-19807][Web UI] Add reason for cancellation when a stage is killed using web UI

## What changes were proposed in this pull request?
When a user kills a stage using the web UI (on the Stages page), StagesTab.handleKillRequest asks SparkContext to cancel the stage without giving a reason. SparkContext has cancelStage(stageId: Int, reason: String), which Spark could use to pass this information along for monitoring/debugging purposes.

## How was this patch tested?
Manual tests.

Please review http://spark.apache.org/contributing.html before opening a pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/shaolinliu/spark SPARK-19807
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/17258.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #17258

commit f43d1d689800d4d6fabdd7c3c4a85065f93bc34c
Author: lvdongr
Date: 2017-03-11T11:37:00Z
[SPARK-19807][Web UI]Add reason for cancellation when a stage is killed using web UI
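The change the PR describes amounts to passing a reason string through the existing overload in StagesTab.handleKillRequest; a sketch, where the exact reason wording is an assumption:

    // Before: the stage is cancelled with no recorded reason.
    sc.foreach(_.cancelStage(stageId))

    // After (sketch): use the SparkContext.cancelStage(stageId: Int,
    // reason: String) overload so the reason appears in monitoring and
    // debugging output.
    sc.foreach(_.cancelStage(stageId, "killed from the Web UI"))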