[ https://issues.apache.org/jira/browse/SPARK-29315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Edmond La Chance updated SPARK-29315:
-------------------------------------
    Description:
First issue I post here. I noticed that when I call RDD.cache() early in my code, the results are all wrong!
If I remove the call to cache(), or I add cache later in the code, after the first map transformation, it works fine.
The graph is created from a data structure that already contains the random.

I have posted versions that work, and versions that don't work here in this gist.
[https://gist.github.com/mitchi/edd9637687cf47fac2616bb72932f8e7 ]

here is an output that works :
_Colors of the graph_
_3 2 1 3 2 1 1 4 2 3_

and an output that doesn't work :
_Colors of the graph_
_25 16 36 49 3 1 6 15 10 3_

  was:
First issue I post here. I noticed that when I call RDD.cache() early in my code, the results are all wrong!
If I remove the call to cache(), or I add cache later in the code, after the first map transformation, it works fine.
The graph is created from a data structure that already contains the random.

I have posted versions that work, and versions that don't work here in this gist.
[https://gist.github.com/mitchi/edd9637687cf47fac2616bb72932f8e7]

> RDD.cache() called early creates problems
> -----------------------------------------
>
>                 Key: SPARK-29315
>                 URL: https://issues.apache.org/jira/browse/SPARK-29315
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.4.4
>         Environment: Apache Spark 2.4.4
> Windows 10
>            Reporter: Edmond La Chance
>            Priority: Minor
>
> First issue I post here. I noticed that when I call RDD.cache() early in my
> code, the results are all wrong!
> If I remove the call to cache(), or I add cache later in the code, after the
> first map transformation, it works fine.
> The graph is created from a data structure that already contains the random.
>
> I have posted versions that work, and versions that don't work here in this
> gist.
> [https://gist.github.com/mitchi/edd9637687cf47fac2616bb72932f8e7 ]
> here is an output that works :
> _Colors of the graph_
> _3 2 1 3 2 1 1 4 2 3_
> and an output that doesn't work :
> _Colors of the graph_
> _25 16 36 49 3 1 6 15 10 3_
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)