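I believe it's the Scala REPL (the shell), not Spark, that is truncating the output: toDebugString returns the full string, and the REPL shortens it when echoing the result. See http://blog.ssanj.net/posts/2016-10-16-output-in-scala-repl-is-truncated.html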
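One workaround, as a minimal sketch: print the string yourself instead of relying on the REPL's echo of the result value, since output written with println is not shortened by the REPL. The snippet below is meant to be pasted into spark-shell (where `sc` is already defined). The toy contents of `vertexRDD` and `edgeRDD` are my own stand-ins, not your data.

```scala
// Paste into spark-shell, where the SparkContext `sc` already exists.
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD

// Hypothetical toy data standing in for the original vertexRDD / edgeRDD.
val vertexRDD: RDD[(Long, (String, Double))] =
  sc.parallelize(Seq((1L, ("a", 1.0)), (2L, ("b", 2.0)), (3L, ("c", 3.0))))
val edgeRDD: RDD[Edge[Int]] =
  sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1)))

val graph: Graph[(String, Double), Int] = Graph(vertexRDD, edgeRDD)

// Keep the lineage in a val, then print it explicitly. The REPL may still
// shorten its own echo of `lineage`, but println writes the whole string.
val lineage: String = graph.vertices.toDebugString
println(lineage)
```

The same applies to any long String result in the shell, not only toDebugString.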
On Sun, Nov 13, 2016 at 1:56 AM Anirudh Perugu <anirudh.per...@stonybrook.edu> wrote:

> Hello all,
>
> I am trying to understand how graphx works internally.
>
> I created a small program in graphx :
> 1. I create a new graph
> val graph: Graph[(String, Double), Int] = Graph(vertexRDD, edgeRDD)
> 2. Now I want to see how my vertices were created, hence I use
> scala> graph.vertices.toDebugString
> res11: String =
> (48) VertexRDDImpl[11] at RDD at VertexRDD.scala:57 []
> | VertexRDD, VertexRDD ZippedPartitionsRDD2[9] at zipPartitions at VertexRDD.scala:322 []
> | CachedPartitions: 48; MemorySize: 328.0 KB; ExternalBlockStoreSize: 0.0 B; DiskSize: 0.0 B
> | ShuffledRDD[5] at partitionBy at VertexRDD.scala:319 []
> +-(48) ParallelCollectionRDD[0] at parallelize at <console>:45 []
> | MapPartitionsRDD[8] at mapPartitions at VertexRDD.scala:361 []
> | ShuffledRDD[7] at partitionBy at VertexRDD.scala:361 []
> +-(48) VertexRDD.createRoutingTables - vid2pid (aggregation) MapPartitionsRDD[6] at mapPartitions at VertexRDD.scala:356 []
> | EdgeRDD, EdgeRDD MapPartitionsRDD[2] at mapPartitionsWithIndex at EdgeRDD.scala:105 []
> | ParallelCollectionRDD[1] at parallelize at <cons...
> scala>
> But this doesn't give me the whole picture as you can see it is clipped
> (10 lines I guess is the default),
> (a) is there an option to increase this number so that I can see the whole
> output.
> (b) I know that indentations indicate a shuffle boundary & the parentheses
> indicate parallelism at each step of this physical plan so does this mean
> the above can be put into a picture like :
> RDD A (VertexRDD.cre..) [48 partitions]
>       \
>        --- RDD C (VertexRDD, VertexRDD Zipped...)[48 partitions]
>       /
> RDD B (ParallelCollecti..) [48 partitions]
>
> I am fairly new to spark, so please feel free to correct!
>
> Thanks
> Anirudh
>