Kanchan, the `toDebugString` output looks unformatted because PySpark returns it as a bytes object rather than a string, so you need to decode it before printing. I suggest printing the RDD lineage with `print(rdd.toDebugString().decode("utf-8"))` instead (note: this only occurs in PySpark).
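
A minimal sketch of that decoding step, in case it helps. It runs without a Spark session: `lineage_text` is a hypothetical helper name, and `sample` is a made-up byte string standing in for what `rdd.toDebugString()` returns, not real Spark output.

```python
def lineage_text(debug_string):
    """Return an RDD lineage as printable text.

    Accepts bytes (decoded as UTF-8), an already-decoded str,
    or None (returned as an empty string).
    """
    if debug_string is None:
        return ""
    if isinstance(debug_string, bytes):
        return debug_string.decode("utf-8")
    return debug_string


# Made-up stand-in for rdd.toDebugString() output (illustration only).
sample = b"(2) ChildRDD[1] []\n |  ParentRDD[0] []"

print(sample)                # raw bytes: a single line with literal \n escapes
print(lineage_text(sample))  # decoded text: one RDD per line
```

With a real RDD, the equivalent call would be `print(lineage_text(rdd.toDebugString()))`.
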
About the other question, you can use `rdd.getNumPartitions()`.

On Sat, Apr 20, 2019 at 2:40 PM kanchan tewary <kanchan.tew...@gmail.com> wrote:

> Dear All,
>
> Greetings!
>
> I am new to Apache Spark and working on RDDs using pyspark. I am trying to
> understand the logical plan provided by toDebugString function, but I find
> two issues a) the output is not formatted when I print the result
> b) I do not see number of partitions shown.
>
> Can anyone direct me to any reference documentation to understand the
> logical plan better? Or, do you suggest to use DAG from spark UI instead?
>
>
> Thanks & Best Regards,
> Kanchan
> Data Engineer, IBM