[GitHub] spark pull request #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2018-01-18 Thread shaolinliu
Github user shaolinliu closed the pull request at:

https://github.com/apache/spark/pull/17581


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-10-27 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17581
  
ok.


---



[GitHub] spark pull request #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-12 Thread shaolinliu
Github user shaolinliu commented on a diff in the pull request:

https://github.com/apache/spark/pull/17581#discussion_r111080385
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -359,6 +359,16 @@ object SQLConf {
   .booleanConf
   .createWithDefault(false)
 
+  val THRIFTSERVER_RESULT_LIMIT =
+buildConf("spark.sql.thriftserver.retainedResults")
--- End diff --

In Hive, we use the parameter "hive.fetch.task.conversion=minimal" to fetch 
results from the MR job's output (from disk); in this mode Hive does not collect 
the result into memory, which avoids crashing the Hive process. And I cannot 
think of a better name.
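
For context, the truncated diff above adds a new entry to SQLConf. A completed 
version of such an entry might read as follows; this is a sketch using Spark's 
internal config-builder API, with the doc text and default taken from the PR 
description rather than the final diff:

```scala
// Sketch of the proposed SQLConf entry (not the PR's final diff).
// buildConf / .intConf / .createWithDefault are Spark's internal
// config-builder API inside object SQLConf.
val THRIFTSERVER_RESULT_LIMIT =
  buildConf("spark.sql.thriftserver.retainedResults")
    .internal()
    .doc("Maximum number of results the Thrift Server returns for a query " +
      "without an explicit LIMIT; 0 means unlimited.")
    .intConf
    .createWithDefault(200)
```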


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-10 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17581
  
Ok, I have modified the description.


---



[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17581
  
Sorry, I was wrong. It just increases the user's query time; it does not occupy 
the resources.


---



[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17581
  
In a department we cannot constrain every user, but when we start the Thrift 
server with this parameter, it does not matter even if a user makes a mistake. 
We have used the SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT parameter, but it 
slowed down the actual business process, so we added this parameter.


---



[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17581
  
My opinion is:
In production, users often run SELECT without a LIMIT, which frequently takes 
the service offline. This is a common situation, so I added the parameter.
When SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT is enabled, the process is:
beeline[get] -> hs2[get] -> executor[ret] -> hs2[ret] -> beeline[ret]
and during this process the executor's resources stay occupied. When it is 
disabled, the process is:
beeline[get] -> hs2[collect] -> beeline[ret].
This is similar to Redis's pipelining for enhancing performance; reducing the 
time also reduces resource usage.
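
The two fetch paths described above can be modeled with plain Scala collections; 
this is a self-contained sketch, not Spark's actual code, and FetchModes and 
partitions are illustrative names:

```scala
// Sketch contrasting the two fetch paths: incremental (streamed, executors
// stay busy until the client has pulled everything) vs. full collect
// (materialized on the driver up front, executors released immediately).
object FetchModes {
  // Simulated result partitions, standing in for executor output.
  val partitions: Seq[Seq[Int]] = Seq(Seq(1, 2, 3), Seq(4, 5, 6))

  // Incremental collect: rows are handed to the client one at a time,
  // so resources backing the iterator stay held until it is exhausted.
  def incrementalFetch(): Iterator[Int] = partitions.flatten.iterator

  // Full collect: everything is materialized at once, then the client
  // fetches from driver memory.
  def fullCollect(): Seq[Int] = partitions.flatten
}
```

Calling `FetchModes.incrementalFetch().take(2)` consumes only two rows, leaving the rest "pending"; `fullCollect()` always pays the full materialization cost up front.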


---



[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17581
  
@ueshin please take a look at this PR, thanks.


---



[GitHub] spark pull request #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-09 Thread shaolinliu
GitHub user shaolinliu opened a pull request:

https://github.com/apache/spark/pull/17581

[SPARK-20248][ SQL]Spark SQL add limit parameter to enhance the reliability.

## What changes were proposed in this pull request?

Add a parameter "spark.sql.thriftServer.retainedResults" with a default value 
of 200. When a user runs a query without a LIMIT, this implicitly adds a limit 
to the query. When the user runs a query with a LIMIT, we do nothing; if this 
parameter is set to 0, we also do nothing.
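
The behavior described above can be sketched as a pure function; this is a 
hypothetical illustration of the rule (the real PR operates on the query plan 
inside the Thrift server, not on SQL strings, and ImplicitLimit is an invented 
name):

```scala
// Hypothetical sketch of the described rule: add an implicit LIMIT only
// when the statement has none and the configured limit is non-zero.
object ImplicitLimit {
  def applyLimit(sql: String, retainedResults: Int): String = {
    // Naive check for an existing LIMIT clause, for illustration only.
    val hasLimit = sql.toLowerCase.split("\\s+").contains("limit")
    if (retainedResults == 0 || hasLimit) sql       // 0 disables the feature
    else s"$sql LIMIT $retainedResults"             // implicit cap applied
  }
}
```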

## How was this patch tested?

manual tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shaolinliu/spark SPARK-20248

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17581.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17581


commit b47cf92fee79a57fbec37bbf9c7d35753f5c7d75
Author: liu shaolin 
Date:   2017-04-07T05:25:21Z

modified:   
sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
modified:   
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala

commit f556bf64b46b40beb58872ebb2fafa69a97c43ec
Author: liu shaolin 
Date:   2017-04-07T08:19:04Z

repair the compile error.

commit 81cd6333164b0915a0fb492a34ab84b973f7cd0d
Author: liu shaolin 
Date:   2017-04-07T09:22:30Z

modify the doc.

commit 59a1b1ae96809076916849c9a1d396dc7d40251d
Author: liu shaolin 
Date:   2017-04-10T02:01:03Z

Merge branch 'liu' into SPARK-20248




---



[GitHub] spark pull request #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-09 Thread shaolinliu
Github user shaolinliu closed the pull request at:

https://github.com/apache/spark/pull/17561


---



[GitHub] spark issue #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17561
  
ok.


---



[GitHub] spark issue #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-07 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17561
  
@ueshin the SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT option wastes cluster 
resources, because the cluster's resources are only released when the query 
finishes, so the executor's memory and CPU stay occupied; in general it is 
slower.


---



[GitHub] spark pull request #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-07 Thread shaolinliu
Github user shaolinliu commented on a diff in the pull request:

https://github.com/apache/spark/pull/17561#discussion_r110346704
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -359,6 +359,13 @@ object SQLConf {
   .booleanConf
   .createWithDefault(false)
 
+  val THRIFTSERVER_RESULT_LIMIT =
+buildConf("spark.sql.thriftServer.retainedResults")
+  .internal()
+  .doc("The number of sql results returned by Thrift Server, and 0 is 
unlimited.")
--- End diff --

OK, I will modify it.


---



[GitHub] spark issue #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-07 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17561
  
I have fixed the error and retested the use case.


---



[GitHub] spark issue #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-06 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17561
  
@ueshin please take a look at this PR, thanks.


---



[GitHub] spark pull request #17560: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-06 Thread shaolinliu
Github user shaolinliu closed the pull request at:

https://github.com/apache/spark/pull/17560


---



[GitHub] spark issue #17560: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-06 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17560
  
@ueshin I have resubmitted the PR; please close this one.


---



[GitHub] spark pull request #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-06 Thread shaolinliu
GitHub user shaolinliu opened a pull request:

https://github.com/apache/spark/pull/17561

[SPARK-20248][ SQL]Spark SQL add limit parameter to enhance the reliability.


## What changes were proposed in this pull request?

Add a parameter "spark.sql.thriftServer.retainedResults" with a default value 
of 200. When a user runs a query without a LIMIT, this implicitly adds a limit 
to the query. When the user runs a query with a LIMIT, we do nothing; if this 
parameter is set to 0, we also do nothing.

## How was this patch tested?

manual tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shaolinliu/spark liu

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17561.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17561


commit b47cf92fee79a57fbec37bbf9c7d35753f5c7d75
Author: liu shaolin 
Date:   2017-04-07T05:25:21Z

modified:   
sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
modified:   
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala




---



[GitHub] spark issue #17560: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-06 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17560
  
Yes, I am fixing it, thanks.


---



[GitHub] spark pull request #17560: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-06 Thread shaolinliu
GitHub user shaolinliu opened a pull request:

https://github.com/apache/spark/pull/17560

[SPARK-20248][ SQL]Spark SQL add limit parameter to enhance the reliability.

## What changes were proposed in this pull request?
Add a parameter "spark.sql.thriftServer.retainedResults" with a default value 
of 200. When a user runs a query without a LIMIT, this implicitly adds a limit 
to the query. When the user runs a query with a LIMIT, we do nothing; if this 
parameter is set to 0, we also do nothing.


## How was this patch tested?
manual tests


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shaolinliu/spark SPARK-20248

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17560.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17560


commit 2ce4f358b6cc70c45a715214b9f1024ee724b804
Author: liu shaolin 
Date:   2017-04-07T05:25:21Z

modified:   
sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
modified:   
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala




---



[GitHub] spark issue #17258: [SPARK-19807][Web UI]Add reason for cancellation when a ...

2017-03-13 Thread shaolinliu
Github user shaolinliu commented on the issue:

https://github.com/apache/spark/pull/17258
  
Thank you for the advice; your result description seems simpler and more 
appropriate. I have pushed the change to the PR.


---



[GitHub] spark pull request #17258: [SPARK-19807][Web UI]Add reason for cancellation ...

2017-03-11 Thread shaolinliu
GitHub user shaolinliu opened a pull request:

https://github.com/apache/spark/pull/17258

[SPARK-19807][Web UI]Add reason for cancellation when a stage is killed 
using web UI


## What changes were proposed in this pull request?

When a user kills a stage using the web UI (on the Stages page), 
StagesTab.handleKillRequest asks SparkContext to cancel the stage without 
giving a reason. SparkContext has cancelStage(stageId: Int, reason: String), 
which Spark could use to pass that information along for monitoring/debugging 
purposes.
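
The change described can be sketched as follows; FakeSparkContext, KillHandler, 
and the reason string are illustrative stand-ins, not the PR's actual diff, 
which calls the real SparkContext.cancelStage(stageId, reason) overload:

```scala
// Stand-in for SparkContext, recording the last cancellation request so the
// behavior can be demonstrated without a running cluster.
class FakeSparkContext {
  var lastCancellation: Option[(Int, String)] = None
  def cancelStage(stageId: Int, reason: String): Unit =
    lastCancellation = Some((stageId, reason))
}

object KillHandler {
  // Mirrors the idea behind StagesTab.handleKillRequest: cancel with a
  // human-readable reason so it appears in monitoring/debugging output.
  def handleKillRequest(sc: FakeSparkContext, stageId: Int): Unit =
    sc.cancelStage(stageId, "killed via the Web UI")
}
```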

## How was this patch tested?

manual tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shaolinliu/spark SPARK-19807

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17258.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17258


commit f43d1d689800d4d6fabdd7c3c4a85065f93bc34c
Author: lvdongr 
Date:   2017-03-11T11:37:00Z

[SPARK-19807][Web UI]Add reason for cancellation when a stage is killed 
using web UI




---