Diana Carroll created SPARK-2334:
------------------------------------

             Summary: Attribute Error calling PipelinedRDD.id() in pyspark
                 Key: SPARK-2334
                 URL: https://issues.apache.org/jira/browse/SPARK-2334
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.0.0
            Reporter: Diana Carroll

Calling the id() function of a PipelinedRDD causes an error in PySpark. (Works fine in Scala.)

The second id() call here fails, the first works:
{code}
r1 = sc.parallelize([1,2,3])
r1.id()
r2=r1.map(lambda i: i+1)
r2.id()
{code}

Error:
{code}
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-a0cf66fcf645> in <module>()
----> 1 r2.id()

/usr/lib/spark/python/pyspark/rdd.py in id(self)
    180         A unique ID for this RDD (within its SparkContext).
    181         """
--> 182         return self._id
    183
    184     def __repr__(self):

AttributeError: 'PipelinedRDD' object has no attribute '_id'
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
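
For illustration only, here is a minimal pure-Python sketch of how this kind of AttributeError can arise. It does not use PySpark and is not the Spark source. The class names BaseRDD, BrokenPipelined, and FixedPipelined are hypothetical, and the sketch assumes the cause is a subclass constructor that never sets the `_id` attribute read by the inherited id() method. The lazy assignment in FixedPipelined is one possible remedy under that assumption, not necessarily the fix that should go into Spark.

```python
import itertools

# Hypothetical stand-ins; none of these classes exist in PySpark.
_id_counter = itertools.count()


class BaseRDD:
    """Base class whose constructor records a unique ID."""

    def __init__(self):
        self._id = next(_id_counter)

    def id(self):
        # Mirrors the line shown in the traceback: relies on _id being set.
        return self._id

    def map(self, func):
        return BrokenPipelined(self, func)


class BrokenPipelined(BaseRDD):
    """Subclass whose constructor skips BaseRDD.__init__ (assumed cause)."""

    def __init__(self, prev, func):
        # _id is never assigned here, so the inherited id() will fail.
        self.prev = prev
        self.func = func


class FixedPipelined(BrokenPipelined):
    """Same constructor, but id() assigns the ID lazily on first use."""

    def id(self):
        if not hasattr(self, "_id"):
            self._id = next(_id_counter)
        return self._id


r1 = BaseRDD()
r1.id()                          # works: _id was set in BaseRDD.__init__

r2 = r1.map(lambda i: i + 1)
try:
    r2.id()
except AttributeError as exc:
    print("broken:", exc)        # the object has no attribute '_id'

r3 = FixedPipelined(r1, lambda i: i + 1)
print("fixed:", r3.id())         # same value on every later call
```

In this sketch the first id() call succeeds because the base constructor ran, while the mapped object fails for the same reason given in the traceback above: the attribute was never created on that instance.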