Github user srowen commented on the pull request:

    https://github.com/apache/spark/pull/2083#issuecomment-54124826
  
    @nchammas to be clear, the question isn't really about ordering. The issue 
is that the result of the same RDD in this example changes when it is reevaluated. 
It's more like having a ResultSet change under you while iterating. @mateiz 
explained that this is working as intended. I think some straightforward uses 
of `zipWithIndex` may surprise people, then. For example, I add an index to each 
datum, do some computation, and later go back to look up a datum by index on 
the very same RDD. I am not likely to get back the same datum -- unless the RDD 
has been persisted. Something to keep in mind, to see whether it actually bites 
people regularly.
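
To make the scenario concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark: `RecomputedDataset`, `zip_with_index`, `persist` and `lookup` are hypothetical names invented for illustration. The rotation in `_compute` is just an assumed stand-in for any upstream step whose output order is not stable across evaluations. Unlike Spark's lazy `persist()`, the toy `persist` materializes eagerly to keep the example short.

```python
class RecomputedDataset:
    """Toy stand-in for an RDD that is recomputed on every evaluation."""

    def __init__(self, items):
        self._items = list(items)
        self._evaluations = 0
        self._cached = None

    def _compute(self):
        # Assumption for illustration: each evaluation yields the same
        # elements in a different order (rotated by the evaluation count).
        k = self._evaluations % len(self._items) if self._items else 0
        self._evaluations += 1
        return self._items[k:] + self._items[:k]

    def persist(self):
        # Simplification: materialize once, eagerly, and reuse afterwards.
        if self._cached is None:
            self._cached = self._compute()
        return self

    def zip_with_index(self):
        # Indices are assigned from whatever order this evaluation produced.
        data = self._cached if self._cached is not None else self._compute()
        return [(item, index) for index, item in enumerate(data)]


def lookup(dataset, index):
    """Return the datum carrying `index`; triggers a fresh evaluation."""
    return next(item for item, i in dataset.zip_with_index() if i == index)


# Without persisting, two lookups of index 0 see two different orderings.
unpersisted = RecomputedDataset(["a", "b", "c"])
first = lookup(unpersisted, 0)
second = lookup(unpersisted, 0)

# After persisting, every lookup sees the same materialized ordering.
persisted = RecomputedDataset(["a", "b", "c"]).persist()
stable_first = lookup(persisted, 0)
stable_second = lookup(persisted, 0)
```

In this model `first` and `second` differ, while `stable_first` and `stable_second` agree, which is the surprise described above: the index is only meaningful relative to one particular evaluation.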


