Diana Carroll created SPARK-2334:
------------------------------------

             Summary: Attribute Error calling PipelinedRDD.id() in pyspark
                 Key: SPARK-2334
                 URL: https://issues.apache.org/jira/browse/SPARK-2334
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.0.0
            Reporter: Diana Carroll


Calling the id() function of a PipelinedRDD raises an AttributeError in PySpark.
(The equivalent call works fine in Scala.)

In the following snippet, the first id() call works and the second fails:
{code}
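# sc is the SparkContext provided by the pyspark shell
r1 = sc.parallelize([1, 2, 3])
r1.id()  # works
r2 = r1.map(lambda i: i + 1)  # map() returns a PipelinedRDD
r2.id()  # raises AttributeError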
{code}

Error:

{code}
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-a0cf66fcf645> in <module>()
----> 1 r2.id()

/usr/lib/spark/python/pyspark/rdd.py in id(self)
    180         A unique ID for this RDD (within its SparkContext).
    181         """
--> 182         return self._id
    183 
    184     def __repr__(self):

AttributeError: 'PipelinedRDD' object has no attribute '_id'
{code}
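
The traceback shows that id() reads self._id and that the PipelinedRDD instance has no such attribute. One plausible explanation, which I have not confirmed against the PySpark source, is that the subclass defines its own constructor and never assigns _id. The sketch below reproduces that pattern without Spark. BaseRDD, PipelinedSketch, and try_id are hypothetical names used only for illustration; they are not PySpark classes or functions.

```python
class BaseRDD(object):
    """Hypothetical stand-in for an RDD whose constructor assigns _id."""

    def __init__(self, rdd_id):
        self._id = rdd_id

    def id(self):
        # Same shape as the method in the traceback: it just reads self._id.
        return self._id


class PipelinedSketch(BaseRDD):
    """Hypothetical subclass with its own constructor that never sets _id."""

    def __init__(self, prev):
        # Assumption for illustration: the parent constructor is not called,
        # so the _id attribute is never created on this instance.
        self.prev = prev


def try_id(rdd):
    """Return rdd.id(), or the AttributeError message if the lookup fails."""
    try:
        return rdd.id()
    except AttributeError as exc:
        return str(exc)


r1 = BaseRDD(0)
r2 = PipelinedSketch(r1)
first = try_id(r1)   # the inherited method finds _id on r1
second = try_id(r2)  # the inherited method finds no _id on r2
```

If this is what is happening, either assigning _id in the subclass constructor or overriding id() in the subclass should avoid the error. Which of the two fits PySpark better is left to the maintainers.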



