Hi all,

Assume I have read the lines of a text file into an RDD:
    textFile = sc.textFile("SomeArticle.txt")

Also assume that the sentence breaks in SomeArticle.txt were done by machine and have some errors, such as the break at "Fig." in the sample text below.

    Index   Text
    N       ...as shown in Fig.
    N+1     1.
    N+2     The figure shows...

What I want is an RDD with:

    Index   Text
    N       ...as shown in Fig. 1.
    N+1     The figure shows...

Is there some way a filter() can look at neighboring elements in an RDD? That way I could look, in parallel, at neighboring elements and come up with a new RDD that may have a different number of elements. Or do I just have to iterate through the RDD sequentially?

Thanks,
Ron
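
For reference, a rough sketch of one possible approach, not a definitive answer. filter() only sees one element at a time, so the idea here is to merge lines inside each partition with mapPartitions(), which hands the function an iterator over consecutive elements. The BAD_BREAK pattern and the merge_bad_breaks name are hypothetical, made up for illustration. The sketch also assumes that a bad break never falls exactly on a partition boundary, since such a break would not be repaired.

```python
import re

# Hypothetical heuristic: a line ending in an abbreviation such as "Fig."
# was probably split by mistake. Extend the pattern to suit the data.
BAD_BREAK = re.compile(r"\b(Fig|Eq|Sec)\.$")


def merge_bad_breaks(lines):
    """Yield sentences, gluing a line onto the next one when it ends in a bad break."""
    pending = None
    for line in lines:
        # Attach the current line to any fragment carried over from before.
        current = line if pending is None else pending + " " + line
        current = current.rstrip()
        if BAD_BREAK.search(current):
            pending = current      # still incomplete, wait for the next line
        else:
            pending = None
            yield current
    if pending is not None:
        yield pending              # trailing fragment with nothing to attach to


# Plain-Python check using the sample from the question.
sample = ["...as shown in Fig.", "1.", "The figure shows..."]
merged = list(merge_bad_breaks(sample))
print(merged)

# With Spark, the same function could be applied per partition (assumption:
# no bad break straddles a partition boundary):
#   fixed = textFile.mapPartitions(merge_bad_breaks)
```

The partitions are processed in parallel, while lines inside each partition are scanned in order. If breaks across partition boundaries matter, one fallback is to coalesce to a single partition first, at the cost of making the scan sequential.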