[ 
https://issues.apache.org/jira/browse/SPARK-19655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873093#comment-15873093
 ] 

hosein edited comment on SPARK-19655 at 2/18/17 10:37 AM:
----------------------------------------------------------

I have a Vertica database with 100 million rows and I run this code in spark:

df = (spark.read.format("jdbc")
      .option("url", vertica_jdbc_url)
      .option("dbtable", "test_table")
      .option("user", "spark_user")
      .option("password", "password")
      .load())

result = df.filter(df['id'] > 100).count()

print(result)

Monitoring queries on the Vertica side, I see that Spark generates this query:

SELECT 1 FROM test_table WHERE ("id" > 100)

This query returns about 100 million rows of "1", and Spark counts them locally instead of letting the database compute the count. I think this is not suitable.
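Until Spark pushes the aggregate down itself, one workaround is to hand the database a parenthesized subquery through the JDBC "dbtable" option, so only the count crosses the wire. A minimal sketch (the helper name and the `cnt`/`agg` aliases are illustrative, not part of any Spark API):

```python
# Build a JDBC "dbtable" value that pushes COUNT(*) into the database.
# Spark's JDBC source accepts a parenthesized subquery with an alias here,
# so the database computes the aggregate and returns a single row.
def pushdown_count(table, predicate):
    return '(SELECT COUNT(*) AS cnt FROM {} WHERE {}) AS agg'.format(
        table, predicate)

query = pushdown_count("test_table", '"id" > 100')
# query is now:
# (SELECT COUNT(*) AS cnt FROM test_table WHERE "id" > 100) AS agg
```

Passing `query` as the "dbtable" option in the reader above should yield a one-row DataFrame whose `cnt` column holds the count, instead of 100 million rows of "1".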

was (Author: hosein_ey):
I have a Vertica database with 100 million rows and I run this code in spark:

df = (spark.read.format("jdbc")
      .option("url", vertica_jdbc_url)
      .option("dbtable", "test_table")
      .option("user", "spark_user")
      .option("password", "password")
      .load())

result = df.filter(df['id'] > 100).count()

print(result)

Monitoring queries on the Vertica side, I see that Spark generates this query:

SELECT 1 FROM test_table WHERE ("int_id" > 100)

This query returns about 100 million rows of "1", and Spark counts them locally instead of letting the database compute the count. I think this is not suitable.

> select count(*) , requests 1 for each row
> -----------------------------------------
>
>                 Key: SPARK-19655
>                 URL: https://issues.apache.org/jira/browse/SPARK-19655
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: hosein
>            Priority: Minor
>
> When I run a SELECT COUNT(*) query over JDBC and monitor queries on the
> database side, I see Spark sends SELECT 1 to the target table.
> That means one row is fetched per source row, which is not optimized.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
