[ https://issues.apache.org/jira/browse/SPARK-17633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513110#comment-15513110 ]
Anshul commented on SPARK-17633:
--------------------------------

The RDD is not cached in this scenario.

> texFile() and wholeTextFiles() count difference
> -----------------------------------------------
>
>                 Key: SPARK-17633
>                 URL: https://issues.apache.org/jira/browse/SPARK-17633
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 1.6.2
>         Environment: Unix/Linux
>            Reporter: Anshul
>
> sc.textFile() creates an RDD of strings from a text file.
> When count is then performed on it, the line count is correct. But if more
> than one line is manually appended to the file, counting the same RDD of
> strings again increases the result by only 1.
> With sc.wholeTextFiles(), the result is correct.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
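
The reported behaviour (the count of an uncached textFile() RDD growing by only 1 after several lines are appended) can be modelled without Spark. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that the RDD keeps the byte-range split (start offset and length) it computed on first use, and that the line reader keeps reading whole lines while its position is at or before the end of that split. It is plain Python, not Spark or Hadoop code, and the names `count_lines_in_split` and `demo` are hypothetical.

```python
import os
import tempfile


def count_lines_in_split(path, start, length):
    """Count lines in a byte-range split of a file.

    Simplified model (an assumption, not Spark/Hadoop source): keep reading
    whole lines while the current position is <= the end of the split.
    Only start=0 is modelled here.
    """
    end = start + length
    count = 0
    with open(path, "rb") as f:
        f.seek(start)
        while f.tell() <= end:
            line = f.readline()
            if not line:  # reached end of file
                break
            count += 1
    return count


def demo():
    """Return (count before append, count with stale split, count with fresh split)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"a\nb\nc\n")

        # Split computed once, when the file had three lines.
        stale_length = os.path.getsize(path)
        before = count_lines_in_split(path, 0, stale_length)

        # Append three more lines by hand.
        with open(path, "ab") as f:
            f.write(b"d\ne\nf\n")

        # Re-counting with the old split sees exactly one extra line.
        stale = count_lines_in_split(path, 0, stale_length)

        # Recomputing the split from the current file size sees all lines.
        fresh = count_lines_in_split(path, 0, os.path.getsize(path))
        return before, stale, fresh
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Under this model, a reader that re-derives its range from the current file size on every evaluation would return the full count, which would be consistent with the correct result reported for sc.wholeTextFiles().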