Re: Is LIMIT n in Spark SQL useful?

2015-05-05 Thread Robin East
Michael

Are there plans to add LIMIT push down? It's quite a natural thing to do in 
interactive querying.

 On 4 May 2015, at 22:57, Michael Armbrust mich...@databricks.com wrote:
 
 The JDBC interface for Spark SQL does not support pushing down limits today.
 
 On Mon, May 4, 2015 at 8:06 AM, Robin East robin.e...@xense.co.uk wrote:
 And a further question: have you tried running this query in psql? What’s 
 the performance like there?
 
 On 4 May 2015, at 16:04, Robin East robin.e...@xense.co.uk wrote:
 
 What query are you running? It may be the case that your query requires 
 PostgreSQL to do a large amount of work before identifying the first n rows.
 On 4 May 2015, at 15:52, Yi Zhang zhangy...@yahoo.com.INVALID wrote:
 
 I am trying to query PostgreSQL using LIMIT(n) to reduce memory usage and 
 improve query performance, but I found it took as long as querying 
 without LIMIT. It confused me. Does anybody know why?
 
 Thanks.
 
 Regards,
 Yi
 


Re: Is LIMIT n in Spark SQL useful?

2015-05-04 Thread Robin East
What query are you running? It may be the case that your query requires 
PostgreSQL to do a large amount of work before identifying the first n rows.
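
To illustrate the point above, here is a minimal sketch in plain Python (not PostgreSQL itself). It assumes a toy in-memory table and a hypothetical scan() helper that counts how many rows are read. A plain LIMIT can stop after n rows, while something like ORDER BY ... LIMIT without a usable index has to read and sort every row first.

```python
from itertools import islice


def scan(table, counter):
    """Yield rows one at a time, counting how many were actually read."""
    for row in table:
        counter["read"] += 1
        yield row


# Toy table (an assumption for illustration), stored in descending id order.
table = [{"id": i, "name": "emp%d" % i} for i in range(10000, 0, -1)]

# Plain LIMIT 100: the scan stops as soon as 100 rows have been produced.
plain = {"read": 0}
first_rows = list(islice(scan(table, plain), 100))

# ORDER BY id LIMIT 100 with no index: all rows are read and sorted first.
ordered = {"read": 0}
top_rows = sorted(scan(table, ordered), key=lambda r: r["id"])[:100]

print(plain["read"], ordered["read"])  # prints: 100 10000
```

If the query is a plain select with LIMIT and nothing else, this kind of extra work should not apply, which is why seeing the actual query matters.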
 On 4 May 2015, at 15:52, Yi Zhang zhangy...@yahoo.com.INVALID wrote:
 
 I am trying to query PostgreSQL using LIMIT(n) to reduce memory usage and 
 improve query performance, but I found it took as long as querying 
 without LIMIT. It confused me. Does anybody know why?
 
 Thanks.
 
 Regards,
 Yi



Re: Is LIMIT n in Spark SQL useful?

2015-05-04 Thread Robin East
And a further question: have you tried running this query in psql? What’s the 
performance like there?
 On 4 May 2015, at 16:04, Robin East robin.e...@xense.co.uk wrote:
 
 What query are you running? It may be the case that your query requires 
 PostgreSQL to do a large amount of work before identifying the first n rows.
 On 4 May 2015, at 15:52, Yi Zhang zhangy...@yahoo.com.INVALID wrote:
 
 I am trying to query PostgreSQL using LIMIT(n) to reduce memory usage and 
 improve query performance, but I found it took as long as querying 
 without LIMIT. It confused me. Does anybody know why?
 
 Thanks.
 
 Regards,
 Yi
 



Re: Is LIMIT n in Spark SQL useful?

2015-05-04 Thread Yi Zhang
Robin,

My query statement is as below:

select id, name, trans_date, gender, hobby, job, country from Employees LIMIT 100

In PostgreSQL, it works very well. For 10M records in the DB, it took less 
than 20ms, but in Spark SQL it took a long time.

Michael,

Got it. For me, that is not good news. Anyway, thanks.

Regards,
Yi



 On Tuesday, May 5, 2015 5:59 AM, Michael Armbrust mich...@databricks.com 
wrote:
   

 The JDBC interface for Spark SQL does not support pushing down limits today.
On Mon, May 4, 2015 at 8:06 AM, Robin East robin.e...@xense.co.uk wrote:

And a further question: have you tried running this query in psql? What’s the 
performance like there?

On 4 May 2015, at 16:04, Robin East robin.e...@xense.co.uk wrote:
What query are you running? It may be the case that your query requires 
PostgreSQL to do a large amount of work before identifying the first n rows.

On 4 May 2015, at 15:52, Yi Zhang zhangy...@yahoo.com.INVALID wrote:
I am trying to query PostgreSQL using LIMIT(n) to reduce memory usage and 
improve query performance, but I found it took as long as querying 
without LIMIT. It confused me. Does anybody know why?

Thanks.

Regards,
Yi







  

Re: Is LIMIT n in Spark SQL useful?

2015-05-04 Thread Michael Armbrust
The JDBC interface for Spark SQL does not support pushing down limits today.
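
One possible workaround, sketched under assumptions: the JDBC source's dbtable option is documented to accept anything valid in a FROM clause, including a subquery in parentheses, so the LIMIT can be written into that subquery and evaluated by PostgreSQL. The helper names limited_dbtable and load_limited and the alias "limited" are hypothetical, the url is whatever connection string you already use, and the reader calls shown are the DataFrameReader API from Spark 1.4 onwards rather than the API current at the time of this thread.

```python
def limited_dbtable(query, n, alias="limited"):
    """Wrap a SELECT in a derived table with LIMIT so the database applies it."""
    n = int(n)
    if n < 0:
        raise ValueError("n must be non-negative")
    # PostgreSQL has traditionally required an alias on a subquery in FROM.
    return "({0} LIMIT {1}) AS {2}".format(query.strip().rstrip(";"), n, alias)


def load_limited(spark, url, query, n):
    """Load at most n rows through the JDBC source.

    `spark` is an existing SparkSession (or a 1.4+ SQLContext); it is not
    created here, so this function only runs where Spark is available.
    """
    return (spark.read.format("jdbc")
            .option("url", url)
            .option("dbtable", limited_dbtable(query, n))
            .load())


query = "select id, name, trans_date, gender, hobby, job, country from Employees"
print(limited_dbtable(query, 100))
```

With this, only the rows returned by the subquery cross the JDBC connection. A LIMIT applied afterwards on the Spark side is still not pushed down.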

On Mon, May 4, 2015 at 8:06 AM, Robin East robin.e...@xense.co.uk wrote:

 And a further question: have you tried running this query in psql? What’s
 the performance like there?

 On 4 May 2015, at 16:04, Robin East robin.e...@xense.co.uk wrote:

 What query are you running? It may be the case that your query requires
 PostgreSQL to do a large amount of work before identifying the first n rows.

 On 4 May 2015, at 15:52, Yi Zhang zhangy...@yahoo.com.INVALID wrote:

 I am trying to query PostgreSQL using LIMIT(n) to reduce memory usage and
 improve query performance, but I found it took as long as querying
 without LIMIT. It confused me. Does anybody know why?

 Thanks.

 Regards,
 Yi