
I have a Spark batch job that reads time-series data from a Cassandra table 
with 50,000 rows.

JavaRDD<String> cassandraRowsRDD = javaFunctions.cassandraTable("iotdata", 
                .map(new Function<CassandraRow, String>() {
                    public String call(CassandraRow cassandraRow) throws Exception {
                        return cassandraRow.toString();
                    }
                });

List<String> lm = cassandraRowsRDD.collect();

I am testing in local mode, where I observe Spark creating 770870 tasks 
(one job, one stage), which takes many hours to complete. Can anyone please 
suggest what the possible issues could be?
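
For scale, here is a minimal sketch of the arithmetic behind those two numbers. The class and method names are only illustrative and are not part of the job itself; the figures are the row count and task count quoted above.

```java
public class TaskDensity {
    // Average number of rows per task, assuming rows are spread evenly.
    public static double rowsPerTask(long rows, long tasks) {
        if (tasks <= 0) {
            throw new IllegalArgumentException("tasks must be positive");
        }
        return (double) rows / tasks;
    }

    public static void main(String[] args) {
        long rows = 50_000L;    // rows in the Cassandra table
        long tasks = 770_870L;  // tasks reported for the single stage
        System.out.printf("rows per task: %.4f%n", rowsPerTask(rows, tasks));
        System.out.printf("tasks per row: %.2f%n", (double) tasks / rows);
    }
}
```

In other words, there are roughly 15 tasks for every row, so the large majority of tasks cannot have any data to read.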

[Spark UI stage table. Columns: Stage Id, Tasks: Succeeded/Total, Shuffle Read, 
Shuffle Write. Row: collect at, 2016/03/10 21:01:15, 9 s]


Thank You