Hi, can somebody suggest a way to reduce the number of tasks?

2015-06-15 18:26 GMT+02:00 Serega Sheypak <serega.shey...@gmail.com>:
> Hi, I'm running Spark SQL against a Cassandra table. I have 3 C* nodes, each
> of them running a Spark worker.
> The problem is that Spark runs 869 tasks to read 3 rows: select bar from foo.
> I've tried these properties:
>
> #try to avoid 869 tasks per dummy "select bar from foo" query
> spark.cassandra.input.split.size_in_mb=32mb
> spark.cassandra.input.fetch.size_in_rows=1000
> spark.cassandra.input.split.size=10000
>
> but it doesn't help.
>
> Here are the mean metrics for the job:
> input1 = 8388608.0 TB
> input2 = -320 B
> input3 = -400 B
>
> I'm confused by the input metrics; there are only 3 rows in the C* table.
> I definitely don't have 8388608.0 TB of data :)
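One thing that may explain why the properties above have no effect: in the DataStax spark-cassandra-connector, `spark.cassandra.input.split.size_in_mb` expects a bare number of megabytes, so a value with a unit suffix like "32mb" is likely not parsed as intended. A minimal sketch of the corrected settings (assuming the spark-cassandra-connector; adjust values to your data size):

```
# spark-defaults.conf, or pass each line via spark-submit --conf
# Bare number of megabytes per input split; "32mb" with a unit suffix is not parsed.
spark.cassandra.input.split.size_in_mb=32
# Rows fetched per round trip to Cassandra (paging), not rows per Spark partition.
spark.cassandra.input.fetch.size_in_rows=1000
```

Note that the split size bounds the *data per task*, not the task count directly; on a nearly empty table the connector's size estimates can still yield many tiny partitions, so calling `coalesce(n)` on the resulting RDD/DataFrame is another way to collapse them before the action runs.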