Hi, I'm running Spark SQL against a Cassandra table. I have 3 C* nodes, and
each of them runs a Spark worker.
The problem is that Spark runs 869 tasks to read 3 rows: select bar from
foo.
I've tried these properties:

#try to avoid 769 tasks per dummy select foo from bar query
#note: this property expects a plain integer number of MB, not "32mb"
spark.cassandra.input.split.size_in_mb=32
spark.cassandra.input.fetch.size_in_rows=1000
spark.cassandra.input.split.size=10000

but it doesn't help.

Here are the mean metrics for the job:
input1= 8388608.0 TB
input2 = -320 B
input3 = -400 B

I'm confused by the input metrics, since there are only 3 rows in the C* table.
I definitely don't have 8388608.0 TB of data :)
