Edmond La Chance created SPARK-29315:
----------------------------------------

             Summary: RDD.cache() called early creates problems
                 Key: SPARK-29315
                 URL: https://issues.apache.org/jira/browse/SPARK-29315
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.4
         Environment: Apache Spark 2.4.4, Windows 10
            Reporter: Edmond La Chance


This is the first issue I have posted here.

When I call RDD.cache() early in my code, the results are all wrong. If I remove the call to cache(), or move it later in the code, after the first map transformation, the results are correct.

The graph is created from a data structure that already contains the random values.

I have posted versions that work and versions that don't work in this gist:

[https://gist.github.com/mitchi/edd9637687cf47fac2616bb72932f8e7]


--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org