What does the Skipped Stage below mean? Can anyone help clarify?
I was expecting 3 stages to succeed, but only 2 of them completed
while one was skipped.
Status: SUCCEEDED
Completed Stages: 2
Skipped Stages: 1
Scala REPL Code Used:
accounts is a basic RDD that contains weblog text data.
var accountsByID = accounts.
  map(line => line.split(',')).
  map(values => (values(0), values(4) + ',' + values(3)))

var userreqs = sc.
  textFile("/loudacre/weblogs/*6").
  map(line => line.split(' ')).
  map(words => (words(2), 1)).
  reduceByKey((v1, v2) => v1 + v2)

var accounthits = accountsByID.join(userreqs).map(pair => pair._2)

accounthits.saveAsTextFile("/loudacre/userreqs")
scala> accounthits.toDebugString
res15: String =
(32) MapPartitionsRDD[24] at map at <console>:28 []
 |   MapPartitionsRDD[23] at join at <console>:28 []
 |   MapPartitionsRDD[22] at join at <console>:28 []
 |   CoGroupedRDD[21] at join at <console>:28 []
 +-(15) MapPartitionsRDD[15] at map at <console>:25 []
 |  |   MapPartitionsRDD[14] at map at <console>:24 []
 |  |   /loudacre/accounts/* MapPartitionsRDD[13] at textFile at <console>:21 []
 |  |   /loudacre/accounts/* HadoopRDD[12] at textFile at <console>:21 []
 |   ShuffledRDD[20] at reduceByKey at <console>:25 []
 +-(32) MapPartitionsRDD[19] at map at <console>:24 []
    |   MapPartitionsRDD[18] at map at <console>:23 []
    |   /loudacre/weblogs/*6 MapPartitionsRDD[17] at textFile at <console>:22 []
    |   /loudacre/weblogs/*6 HadoopRDD[16] at textFile at <con...
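
One common reason Spark reports a stage as skipped is that the shuffle output for that stage already exists from an earlier job in the same session, so it is reused rather than recomputed. Whether that applies to the run above is not confirmed. The following is only a minimal sketch of that situation, assuming a local master; the object name `SkippedStageSketch` and the input data are hypothetical and do not come from the code above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name and made-up input data, for illustration only.
object SkippedStageSketch {
  def run(): Map[String, Int] = {
    // Assumption: a local master, so the sketch runs without a cluster.
    val conf = new SparkConf()
      .setAppName("skipped-stage-sketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val counts = sc.parallelize(Seq("a", "b", "a", "c"))
        .map(word => (word, 1))
        .reduceByKey(_ + _) // shuffle boundary: each job has a map-side stage and a result stage

      // Job 1: runs the map-side stage (writing shuffle files) and the result stage.
      counts.count()

      // Job 2: same lineage, so the map-side stage can reuse the existing
      // shuffle files; the web UI is expected to list that stage as skipped.
      counts.collect().toMap
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

If this is what happened in the session above, the skipped stage did not fail. Its output would simply have been taken from an earlier job, for example an action that had already been run on userreqs or accountsByID.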