Re: Reply: Limit Query Performance Suggestion

2017-01-18 Thread Liang-Chi Hsieh
hua > > > -----Original Message----- > From: Liang-Chi Hsieh [mailto: > viirya@ > ] > Sent: January 18, 2017 15:48 > To: > dev@.apache > Subject: Re: Limit Query Performance Suggestion > > > Hi Sujith, > > I saw your updated post. Seems it makes sense to me now. >

Reply: Limit Query Performance Suggestion

2017-01-18 Thread wangzhenhua (G)
: January 18, 2017 15:48 To: dev@spark.apache.org Subject: Re: Limit Query Performance Suggestion Hi Sujith, I saw your updated post. Seems it makes sense to me now. If you use a very big limit number, the shuffling before `GlobalLimit` would be a bottleneck for performance, of course, even it can even
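The bottleneck described above comes from how Spark plans a plain `LIMIT n`: each partition first keeps at most `n` rows (`LocalLimit`), then every surviving row is shuffled to a single partition where `GlobalLimit` trims again. The following is a plain-Python sketch of that two-phase behavior, not Spark code; the function names and partition sizes are illustrative assumptions:

```python
# Plain-Python sketch (NOT Spark code) of the two-phase LIMIT plan the
# message above discusses: a per-partition LocalLimit, then a shuffle of
# all surviving rows to ONE partition where GlobalLimit trims the result.

def local_limit(partitions, limit):
    """Each partition independently keeps at most `limit` rows."""
    return [rows[:limit] for rows in partitions]

def global_limit(partitions, limit):
    """All partitions collapse into a single partition (the shuffle),
    which then keeps only the first `limit` rows."""
    single_partition = [row for rows in partitions for row in rows]
    return single_partition[:limit]

# 8 partitions of 1000 rows each, with a fairly large limit of 500.
partitions = [list(range(p * 1000, p * 1000 + 1000)) for p in range(8)]
limit = 500

locally_limited = local_limit(partitions, limit)
shuffled_rows = sum(len(rows) for rows in locally_limited)
result = global_limit(locally_limited, limit)

# Up to num_partitions * limit rows cross the shuffle even though only
# `limit` of them survive, which is why a big limit stresses the shuffle.
print(shuffled_rows)  # 4000 rows shuffled
print(len(result))    # 500 rows kept
```

With a small limit the shuffle is cheap, but as `limit` grows the single receiving partition must absorb up to `num_partitions * limit` rows, which is the performance concern raised in this thread.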

Re: Limit Query Performance Suggestion

2017-01-17 Thread sujith71955
ort with sample data and also figuring out a solution for this problem. Please let me know for any clarifications or suggestions regarding this issue. Regards, Sujith -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Limit-Query-Performance-Suggest

Re: Limit Query Performance Suggestion

2017-01-15 Thread Liang-Chi Hsieh
s or solution. > > Thanks in advance, > Sujith ----- Liang-Chi Hsieh | @viirya Spark Technology Center http://www.spark.tc/ -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Limit-Query-Performance-Suggestion-tp20570p20607.html Sent

Limit Query Performance Suggestion

2017-01-12 Thread sujith chacko
When a limit is added at the terminal of the physical plan, there is a possibility of a memory bottleneck if the limit value is too large, since the system will try to aggregate all of the partition limit values into a single partition. Description: E.g.: create table src_temp as select * from src
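The memory concern above can be made concrete with a back-of-the-envelope estimate: in the worst case the single aggregating partition buffers `num_partitions * limit` rows before trimming. A minimal sketch, where all concrete numbers (partition count, limit, average row size) are made-up illustration values, not figures from this thread:

```python
# Back-of-the-envelope sketch of the memory pressure described above.
# All numbers here are illustrative assumptions, not figures from the thread.

def global_limit_buffer_rows(num_partitions, limit):
    # Worst case: every partition contributes a full `limit` rows,
    # and all of them land on the single aggregating partition.
    return num_partitions * limit

rows = global_limit_buffer_rows(num_partitions=200, limit=1_000_000)
row_size_bytes = 100  # assumed average row size

print(rows)                            # 200,000,000 rows buffered
print(rows * row_size_bytes / 2**30)   # roughly 18.6 GiB on one task
```

Even with these modest assumptions, a single task ends up holding tens of gigabytes, which matches the "memory bottleneck" the original post warns about.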