...@gmail.com
wrote:
Hi, can somebody suggest a way to reduce the number of tasks?
2015-06-15 18:26 GMT+02:00 Serega Sheypak serega.shey...@gmail.com:
Hi, I'm running Spark SQL against a Cassandra table. I have 3 C* nodes,
each of which runs a Spark worker.
The problem is that Spark runs 869 tasks
serega.shey...@gmail.com:
Hi, I'm running Spark SQL against a Cassandra table. I have 3 C* nodes,
each of which runs a Spark worker.
The problem is that Spark runs 869 tasks to read 3 rows: select bar from
foo.
I've tried these properties:
#try to avoid 769 tasks per dummy select foo from bar query
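
A rough sanity check on the task count, offered as a guess rather than a diagnosis: if the cluster uses virtual nodes with the Cassandra 2.x default of `num_tokens: 256` (the thread does not say so), three nodes own 3 x 256 = 768 token ranges, which is close to the 769 mentioned in the comment above. The sketch below only does that arithmetic. The function name is made up for illustration, and nothing here queries Cassandra or Spark.

```python
# Hypothetical helper: counts the token ranges in a vnode-enabled ring.
# Assumption (not stated in the thread): every node uses the Cassandra 2.x
# default of num_tokens = 256.
DEFAULT_NUM_TOKENS = 256


def token_range_count(nodes: int, num_tokens: int = DEFAULT_NUM_TOKENS) -> int:
    """Return the number of token ranges in a ring of `nodes` nodes."""
    if nodes < 1 or num_tokens < 1:
        raise ValueError("nodes and num_tokens must be positive")
    # Each token owned by a node starts exactly one range on the ring.
    return nodes * num_tokens


if __name__ == "__main__":
    # 3 C* nodes, as in the original post.
    print(token_range_count(3))
```

If the connector ends up with roughly one Spark partition per token range, even a three-row table could be scheduled as several hundred tasks. Whether that is what happens here depends on the connector version and its input split settings, so check its documentation before drawing conclusions.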
Hi, spark-sql estimated the input for a Cassandra table with 3 rows as 8 TB.
Sometimes it's estimated as -167 B.
I'm running it on a laptop; I don't have 8 TB of space for the data.
, 2015 at 4:37 AM, Serega Sheypak serega.shey...@gmail.com
wrote:
Hi, can somebody suggest a way to reduce the number of tasks?
2015-06-15 18:26 GMT+02:00 Serega Sheypak serega.shey...@gmail.com:
Hi, I'm running Spark SQL against a Cassandra table. I have 3 C* nodes,
each of which runs a Spark
I would like to execute SQL queries on Cassandra using Spark SQL. Is it
possible to get just Spark SQL to run on top of Cassandra, without Spark? My
goal is to access Cassandra data with BI tools. Spark SQL looks like the
perfect tool for this.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-on-Cassandra-tp13696.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
For me, calling toArray on the Cassandra RDD takes forever, as I have
a million records in Cassandra. Is there a better way of doing this? How
can I optimize it?
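
For context on the question above: `toArray` copies every row of the RDD to the driver, so its cost grows with the table. A common alternative is to materialize only a bounded number of rows, or to stream through them. The sketch below shows that idea in plain Python, with a generator standing in for the table. It is an analogy, not Spark or connector code, and all names in it are hypothetical. In the Spark RDD API, `take(n)` and `toLocalIterator` play roughly the roles that `islice` and lazy iteration play here.

```python
from itertools import islice
from typing import Callable, Iterator


def fake_rows(total: int) -> Iterator[dict]:
    """Stand-in for a large table: yields rows lazily, one at a time."""
    for i in range(total):
        yield {"id": i, "bar": f"value-{i}"}


def first_n(rows: Iterator[dict], n: int) -> list:
    """Materialize only the first n rows (analogous to RDD.take(n))."""
    return list(islice(rows, n))


def count_matching(rows: Iterator[dict], predicate: Callable[[dict], bool]) -> int:
    """Stream through the rows without holding them all in memory."""
    return sum(1 for row in rows if predicate(row))


if __name__ == "__main__":
    # Only three rows are ever built here, even though the source is large.
    print(first_n(fake_rows(1_000_000), 3))
    # Every row is visited, but none of them is kept.
    print(count_matching(fake_rows(1_000_000), lambda r: r["id"] % 2 == 0))
```

Filtering and projecting on the Cassandra side before anything reaches the driver is usually the bigger win. How to do that depends on the connector version in use.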