Maciej Szymkiewicz created SPARK-17756:
------------------------------------------

             Summary: java.lang.ClassCastException when using cartesian with DStream.transform
                 Key: SPARK-17756
                 URL: https://issues.apache.org/jira/browse/SPARK-17756
             Project: Spark
          Issue Type: Bug
          Components: PySpark, Streaming
    Affects Versions: 2.0.0
            Reporter: Maciej Szymkiewicz


Steps to reproduce:

{code}

from pyspark.streaming import StreamingContext

sc = spark.sparkContext  # `spark` is the SparkSession provided by the PySpark shell
ssc = StreamingContext(sc, 10)  # 10-second batch interval
(ssc
    .queueStream([sc.range(10)])
    .transform(lambda rdd: rdd.cartesian(rdd))
    .pprint())

ssc.start()

## 16/10/01 21:34:30 ERROR JobScheduler: Error generating jobs for time 1475350470000 ms
## java.lang.ClassCastException: org.apache.spark.api.java.JavaPairRDD cannot be cast to org.apache.spark.api.java.JavaRDD
##      at com.sun.proxy.$Proxy15.call(Unknown Source)
##    ....
{code}

A dummy fix is to add {{map(lambda x: x)}}, which suggests the problem is 
similar to https://issues.apache.org/jira/browse/SPARK-16589.
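
A minimal sketch of that dummy fix, assuming the identity map is appended directly after {{cartesian}} inside the function passed to {{transform}} (the report does not say where it goes). The helper names {{cartesian_for_transform}} and {{run_workaround}} are hypothetical and only serve the illustration:

```python
def cartesian_for_transform(rdd):
    """Self-cartesian of `rdd`, followed by an identity map.

    Assumption: the trailing map() is what avoids the reported
    JavaPairRDD -> JavaRDD cast error; the report only says that adding
    map(lambda x: x) helps, not where it has to be placed.
    """
    return rdd.cartesian(rdd).map(lambda x: x)


def run_workaround(spark, batch_interval=10):
    """Hypothetical driver: the reproduction pipeline with the map added."""
    # Imported lazily so the helper above can be used without a Spark install.
    from pyspark.streaming import StreamingContext

    sc = spark.sparkContext
    ssc = StreamingContext(sc, batch_interval)
    (ssc
        .queueStream([sc.range(10)])
        .transform(cartesian_for_transform)
        .pprint())
    ssc.start()
    return ssc
```

In a PySpark shell this would be started with {{run_workaround(spark)}}. Whether it avoids the exception on every affected version has not been checked here.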


