What does "Skipped Stages" mean in the job status below? Can anyone help clarify? I was expecting 3 stages to succeed, but only 2 completed and one was skipped.

Status: SUCCEEDED
Completed Stages: 2
Skipped Stages: 1
Scala REPL code used (accounts is a basic RDD that contains weblog text data):

    var accountsByID = accounts.
      map(line => line.split(',')).
      map(values => (values(0), values(4) + ',' + values(3)));

    var userreqs = sc.
      textFile("/loudacre/weblogs/*6").
      map(line => line.split(' ')).
      map(words => (words(2), 1)).
      reduceByKey((v1, v2) => v1 + v2);

    var accounthits = accountsByID.join(userreqs).map(pair => pair._2)

    accounthits.saveAsTextFile("/loudacre/userreqs")

Output of toDebugString:

    scala> accounthits.toDebugString
    res15: String =
    (32) MapPartitionsRDD[24] at map at <console>:28 []
     |   MapPartitionsRDD[23] at join at <console>:28 []
     |   MapPartitionsRDD[22] at join at <console>:28 []
     |   CoGroupedRDD[21] at join at <console>:28 []
     +-(15) MapPartitionsRDD[15] at map at <console>:25 []
     |  |   MapPartitionsRDD[14] at map at <console>:24 []
     |  |   /loudacre/accounts/* MapPartitionsRDD[13] at textFile at <console>:21 []
     |  |   /loudacre/accounts/* HadoopRDD[12] at textFile at <console>:21 []
     |   ShuffledRDD[20] at reduceByKey at <console>:25 []
     +-(32) MapPartitionsRDD[19] at map at <console>:24 []
        |   MapPartitionsRDD[18] at map at <console>:23 []
        |   /loudacre/weblogs/*6 MapPartitionsRDD[17] at textFile at <console>:22 []
        |   /loudacre/weblogs/*6 HadoopRDD[16] at textFile at <con...