Web UI doesn't show some stages

2014-08-20 Thread Grzegorz Białek
Hi, I am wondering why some stages (like join, filter) are not visible in the web UI. For example, this code:

    val simple = sc.parallelize(Array.range(0,100))
    val simple2 = sc.parallelize(Array.range(0,100))
    val toJoin = simple.map(x => (x, x.toString + x.toString))
    val rdd = simple2 .map(x =

Re: Web UI doesn't show some stages

2014-08-20 Thread Patrick Wendell
The reason is that some operators get pipelined into a single stage. For example, rdd.map(XX).filter(YY) executes in a single stage, since no data movement is needed between these operations. If you call toDebugString on the final RDD, it will give you some information about the exact lineage.
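
As a rough illustration of the pipelining described above, here is a minimal sketch. It assumes a local Spark installation on the classpath; the object name PipelineDemo, the app name, and the sample data are hypothetical and do not come from the original question. The map and filter calls need no shuffle and so should share a stage, while reduceByKey moves data between partitions and should start a new one.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; names and data are illustrative only.
// Assumes Spark 1.3 or later. Older releases also need
// `import org.apache.spark.SparkContext._` for reduceByKey.
object PipelineDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("pipeline-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // map and filter are narrow transformations: no data movement,
      // so they are expected to be pipelined into one stage.
      val pipelined = sc.parallelize(0 until 100)
        .map(x => (x % 10, x))
        .filter { case (_, v) => v % 2 == 0 }

      // reduceByKey requires a shuffle, which introduces a stage boundary.
      val shuffled = pipelined.reduceByKey(_ + _)

      // Print the lineage of each RDD so the two can be compared.
      println(pipelined.toDebugString)
      println(shuffled.toDebugString)

      // Run an action so the resulting stages show up in the web UI.
      shuffled.count()
    } finally {
      sc.stop()
    }
  }
}
```

The web UI lists stages rather than individual operators, so in a run like this map and filter would not appear as separate entries. The exact layout of the toDebugString output varies between Spark versions.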

Re: Web UI doesn't show some stages

2014-08-20 Thread Zhan Zhang
To try to answer your other question: one sortByKey is triggered by the range partitioner, which samples the data to calculate the range boundaries, and that in turn triggers the first reduceByKey. The second sortByKey does the real work of sorting based on the calculated partitions, which again triggers the
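
A minimal sketch of how one might observe the sampling behaviour mentioned in this reply, assuming a local Spark installation. The object name SortByKeyJobsDemo and the sample data are hypothetical. It relies on the assumption that, in the Spark versions this thread concerns, building the range partitioner for sortByKey runs a sampling job as soon as the transformation is defined, before any action is called.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; names and data are illustrative only.
// Assumes Spark 1.3 or later. Older releases also need
// `import org.apache.spark.SparkContext._` for the pair RDD functions.
object SortByKeyJobsDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("sortbykey-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Four input partitions, so the range partitioner has boundaries to compute.
      val counts = sc.parallelize(0 until 100, 4)
        .map(x => (x % 10, 1))
        .reduceByKey(_ + _)

      // Assumption: defining sortByKey samples `counts` to find range
      // boundaries, which runs a job that has to evaluate reduceByKey.
      val sorted = counts.sortByKey()
      println("sortByKey defined; look for a sampling job in the web UI")

      // The action runs the actual sort as a separate job.
      sorted.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Since the web UI goes away when the context stops, it may be easier to paste the body of main into spark-shell (using the shell's existing sc) and inspect the job list after each step.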