Hi, I am running the SVD function on a 45 GB CSV with 8.9M rows. I have configured BLAS and ARPACK. The job has now been running for 14.8 hours, and in the Spark UI on port 4040 I see:
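For reference, the call in question is presumably MLlib's RowMatrix.computeSVD (the treeAggregate at RowMatrix.scala in the UI points at that code path). A minimal local sketch, with a toy 100x10 matrix and an illustrative object name standing in for the real 45 GB input:

```scala
// Minimal sketch, assuming Spark MLlib is on the classpath. The object name
// SvdSketch and the toy 100x10 random matrix are stand-ins for the real
// 8.9M-row CSV-backed RDD[Vector].
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

object SvdSketch {
  // Runs computeSVD for the top k singular values and returns how many
  // singular values came back.
  def singularValueCount(k: Int): Int = {
    val sc = new SparkContext(
      new SparkConf().setAppName("svd-sketch").setMaster("local[2]"))
    try {
      // Toy stand-in for the CSV-backed rows: 100 rows x 10 columns.
      val rows = sc.parallelize(0 until 100).map { _ =>
        Vectors.dense(Array.fill(10)(scala.util.Random.nextDouble()))
      }
      val mat = new RowMatrix(rows)
      // On a matrix this small MLlib computes the Gramian locally; on a tall
      // matrix with small k it uses ARPACK, whose Lanczos iterations each
      // issue a treeAggregate job -- which is why the job count keeps
      // climbing and is not known up front.
      val svd = mat.computeSVD(k, computeU = false)
      svd.s.size
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(singularValueCount(3))
}
```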
  Cores:               160
  Memory per Executor: 15.0 GB
  State:               RUNNING
  Duration:            14.4 h

In the Jobs UI I am seeing:

  Job Id:                                  1091
  Description:                             treeAggregate at RowMatrix.scala:93
  Submitted:                               2017/10/26 21:21:32
  Duration:                                46 s
  Stages: Succeeded/Total:                 2/2
  Tasks (for all stages): Succeeded/Total: 383/383

The treeAggregate jobs started from job id 6, and the count is now past 1091 and climbing. In the logs I am getting messages like the following:

  [Stage 2179:===================================================>(363 + 2) / 365]
  [Stage 2179:===================================================>(364 + 1) / 365]
  [Stage 2209:===================================================>(364 + 1) / 365]

My questions:

1. How can I identify how many jobs are left for this operation?
2. There is also one .toLocalIterator task that will run after this. Is my understanding correct that the number of .toLocalIterator jobs will equal the number of cores in my system?
3. Why is it so slow?

Best Regards,
Abdullah Bashir
Senior Software Engineer
Foretheta, LLC.