Some more input: the problem could be in the Spark SQL Thrift server.
When I submit the SQL query from the Spark console, it takes 10 seconds and
runs a reasonable number of tasks.
import com.datastax.spark.connector._;
val cc = new CassandraSQLContext(sc);
cc.sql("select su.user_id from appdata.site_u
>version
We are on DSE 4.7 (Cassandra 2.1) and Spark 1.2.1.
>cqlsh
select * from site_users
returns fast, subsecond, only 3 rows
>Can you show some code how you're doing the reads?
dse beeline
!connect ...
select * from site_users
-- table has 3 rows, several columns in each row. Spark runs 769 t
Can you show some code for how you're doing the reads? Have you successfully
read other data from Cassandra (i.e. do you have a lot of experience with
this path and this particular table is causing issues, or are you trying to
figure out the right way to do a read)?
What version of Spark and Cassandra are you using?
Hi, can somebody suggest a way to reduce the number of tasks?
2015-06-15 18:26 GMT+02:00 Serega Sheypak :
> Hi, I'm running Spark SQL against a Cassandra table. I have 3 C* nodes, and
> each of them has a Spark worker.
> The problem is that Spark runs 869 tasks to read 3 rows: select bar from
> foo.
>
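
One possible explanation for a task count in this range can be sketched with simple arithmetic. This is only a sketch under assumptions the thread does not confirm: that virtual nodes are enabled on the cluster with num_tokens set to 256 per node, and that the scan ends up with roughly one Spark task per token range. The object and method names below are hypothetical and are not part of the Spark Cassandra Connector API.

```scala
// Hypothetical helper for illustration only; not a connector API.
object TaskEstimate {
  // Assumption: with vnodes enabled, each node owns `tokensPerNode` token
  // ranges, and a full table scan creates about one Spark task per range.
  def minTokenRanges(nodes: Int, tokensPerNode: Int): Int =
    nodes * tokensPerNode

  def main(args: Array[String]): Unit = {
    // 3 nodes comes from the thread; 256 tokens per node is an assumed value.
    val estimate = minTokenRanges(nodes = 3, tokensPerNode = 256)
    println(s"Estimated token ranges: $estimate")
  }
}
```

If these assumptions hold, the estimate is 3 * 256 = 768 token ranges, which is close to the 769 tasks reported elsewhere in the thread, regardless of how few rows the table contains. Checking num_tokens in cassandra.yaml on the nodes would confirm or rule this out.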
Hi, I'm running Spark SQL against a Cassandra table. I have 3 C* nodes, and
each of them has a Spark worker.
The problem is that Spark runs 869 tasks to read 3 rows: select bar from
foo.
I've tried these properties:
# try to avoid 769 tasks per dummy "select foo from bar" query
spark.cassandra.input.split.si