I agree with the previous statements: you cannot rely on any ordering 
guarantee, so you need to restore the ordering of the original file 
yourself. Internally Spark uses the Hadoop client libraries, even if you do 
not have Hadoop installed, because they are a flexible, transparent way to 
access many file systems, including the local one. In the case you mentioned it 
is the TextInputFormat that returns a key and a value for each line. The key 
is the byte offset of the line within the file. This means you can sort by the 
key. However, to access this key you must use the hadoopFile method of Spark 
(on SparkContext) together with the TextInputFormat.
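
A minimal sketch of what I mean, in Scala. It is untested against your data 
and makes a few assumptions: a single input file (offsets are per file, so 
keys from several files would collide), the old "mapred" TextInputFormat, and 
placeholder names of my own ("input.txt", "output", the "teh" -> "the" 
replacement, the OrderedRewrite object).

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.TextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object OrderedRewrite {
  def main(args: Array[String]): Unit = {
    // Local master only for illustration; drop setMaster when using spark-submit.
    val conf = new SparkConf().setAppName("ordered-rewrite").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // hadoopFile exposes the (key, value) pairs that textFile hides:
    // key = byte offset of the line in the file, value = the line itself.
    val lines = sc
      .hadoopFile[LongWritable, Text, TextInputFormat]("input.txt") // hypothetical path
      // Hadoop reuses Writable instances, so copy to plain Long/String right away.
      .map { case (offset, text) => (offset.get, text.toString) }

    // Placeholder for the real word corrections; only the value is touched.
    val corrected = lines.mapValues(_.replace("teh", "the"))

    // Sorting by the offset restores the original line order before writing.
    corrected.sortByKey().values.saveAsTextFile("output") // hypothetical output dir

    sc.stop()
  }
}
```

The output is still written as several part files; read in part-number order 
they should give the lines in the original order, because sortByKey range 
partitions the data by key.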

> On 27 Jan 2017, at 10:44, Soheila S. <soheila...@gmail.com> wrote:
> 
> Hi All,
> I read a test file using sparkContext.textfile(filename) and assign it to an 
> RDD and process the RDD (replace some words) and finally write it to a text 
> file using rdd.saveAsTextFile(output).
> Is there any way to be sure the order of the sentences will not be changed? I 
> need to have the same text with some corrected words.
> 
> thanks!
> 
> Soheila
