Re: zip in pyspark truncates RDD to number of processors

2014-06-21 Thread Kan Zhang
> [...] doesn't happen without calling map on b.
>
> Any ideas?
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/zip-in-pyspark-truncates-RDD-to-number-of-processors-tp8069.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

zip in pyspark truncates RDD to number of processors

2014-06-21 Thread madeleine
[...]er. By calling c.collect(), I see the RDD has simply been truncated to the first 4 entries. Weirdly, this doesn't happen without calling map on b.

Any ideas?