How does Spark calculate partition count automatically?

2015-01-12 Thread rajnish
Hi, when I run a job that loads data from Cassandra, Spark creates almost 9 million partitions. How does Spark decide the partition count? I have read in a presentation that it is good to have 1,000 to 10,000 partitions. Regards, Raj
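For a Cassandra source the partition count comes from the connector's token-range splits rather than from Spark itself, so the usual fix is to tune the connector's split size or to coalesce the RDD afterwards. Below is a hedged, pure-Python sketch of the arithmetic for picking a partition count in the 1,000-10,000 range the presentation suggests; `target_partitions` and its defaults are illustrative names, not a Spark API:

```python
# Illustrative sketch: choose a partition count so each partition holds
# roughly target_mb of data, clamped to the 1,000-10,000 range mentioned
# in the thread. This is plain arithmetic, not a Spark API call.
def target_partitions(total_bytes, target_mb=64, lo=1000, hi=10000):
    want = max(1, total_bytes // (target_mb * 1024 * 1024))
    return max(lo, min(hi, want))

# With a real RDD you would then reduce the partition count without a
# shuffle, e.g. for a ~2 TB dataset:
# rdd = rdd.coalesce(target_partitions(2 * 1024**4))
```

`coalesce` avoids a full shuffle when shrinking the partition count; `repartition` forces one but balances the data more evenly.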

Find S3 file attributes with Spark

2015-01-08 Thread rajnish
Hi, we have a file in an AWS S3 bucket that is loaded frequently. When accessing that file from Spark, can we get the file properties by some method in Spark? Regards, Raj
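Spark itself does not expose object metadata, but the Hadoop `FileSystem` API that Spark uses underneath does (`getFileStatus` returns length, modification time, and so on), and it is reachable from PySpark through the JVM gateway. A hedged sketch, with a small pure-Python helper for splitting the URI; the commented PySpark portion assumes a live `SparkContext` named `sc`:

```python
# Illustrative helper: split an S3 URI into (scheme, bucket, key).
def split_s3_uri(uri):
    scheme, rest = uri.split("://", 1)
    bucket, _, key = rest.partition("/")
    return scheme, bucket, key

# With a running PySpark driver (untested sketch; follows the Hadoop
# FileSystem API, which Spark already bundles):
# path = sc._jvm.org.apache.hadoop.fs.Path("s3n://my-bucket/data.csv")
# fs = path.getFileSystem(sc._jsc.hadoopConfiguration())
# status = fs.getFileStatus(path)
# print(status.getLen(), status.getModificationTime())
```

Alternatively, the AWS SDK can fetch the same metadata on the driver without involving Spark at all.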

Timeout Exception in standalone cluster

2015-01-05 Thread rajnish
Hi, I am getting the following exception in a Spark (1.1.0) job running on a standalone cluster. My cluster configuration is five machines, each with an Intel(R) 2.50 GHz 4-core CPU and 16 GB RAM. Exception in thread "main" java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs at
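Timeouts on a standalone cluster in this Spark era were often worked around by raising the Akka and connection-ack timeouts. A hedged `spark-defaults.conf` sketch; the property names are from the Spark 1.x configuration, but the values are assumed starting points to tune, not recommendations:

```
# spark-defaults.conf (sketch; values in seconds, chosen for illustration)
spark.akka.timeout                       300
spark.akka.askTimeout                    60
spark.core.connection.ack.wait.timeout   300
```

If the timeouts coincide with long GC pauses, increasing executor memory or reducing partition sizes can address the root cause rather than the symptom.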

Re: Api to get the status of spark workers

2015-01-05 Thread rajnish
You can use port 4040, which serves information about the currently running application, including a detailed summary of the currently running executors.
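The port-4040 UI is per-application and lives on the driver; in later Spark releases (1.4+) the same information is also exposed as JSON under `/api/v1` on that port. A hedged sketch that builds the endpoint URL; the commented request assumes a live driver, which is why it is not executed here:

```python
# Illustrative sketch: build the URL for the per-application JSON API
# served on the driver's UI port (available in Spark 1.4+).
import json
import urllib.request

def executors_url(host="localhost", port=4040, app_id=None):
    base = f"http://{host}:{port}/api/v1/applications"
    return f"{base}/{app_id}/executors" if app_id else base

# Against a live driver (untested here):
# with urllib.request.urlopen(executors_url(app_id="app-20150105000000-0000")) as r:
#     print(json.load(r))
```

For cluster-level (rather than application-level) status, the standalone master's web UI on port 8080 is the place to look.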

Re: Failed to read chunk exception

2014-12-29 Thread rajnish
I am facing the same issue in Spark 1.1.0: 14/12/29 20:44:31 INFO scheduler.TaskSetManager: Starting task 5.0 in stage 1.1 (TID 1373, X.X.X.X, ANY, 2185 bytes) 14/12/29 20:44:31 WARN scheduler.TaskSetManager: Lost task 6.0 in stage 3.0 (TID 1367, X.X.X.X): java.io.IOException: failed to