Accessing neighboring elements in an RDD

2014-09-03 Thread Daniel, Ronald (ELS-SDG)
Hi all, Assume I have read the lines of a text file into an RDD: textFile = sc.textFile("SomeArticle.txt") Also assume that the sentence breaks in SomeArticle.txt were done by machine and have some errors, such as the break at "Fig." in the sample text below. Index Text N...as shown
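
One way to look at each line's neighbor is to give every line an index, shift a copy of the indices by one, and join the two on the index. The sketch below is not taken from the thread. It uses plain Python lists as a stand-in so it runs without a cluster, and the comments name the RDD calls (zipWithIndex, map, leftOuterJoin) that would play the same role in PySpark. The helper names, the sample lines, and the rule that a line ending in "Fig." marks a false break are all assumptions made for illustration.

```python
def with_index(lines):
    # RDD analogue: lines.zipWithIndex().map(lambda p: (p[1], p[0]))
    return [(i, line) for i, line in enumerate(lines)]


def pair_with_next(lines):
    """Return (index, (line, next_line)) pairs; the last line gets None."""
    indexed = with_index(lines)
    # Re-key every line by index - 1 so it lines up with its predecessor.
    # RDD analogue: indexed.map(lambda kv: (kv[0] - 1, kv[1]))
    shifted = {i - 1: line for i, line in indexed}
    # RDD analogue: indexed.leftOuterJoin(shifted)
    return [(i, (line, shifted.get(i))) for i, line in indexed]


def merge_false_breaks(lines, abbreviations=("Fig.",)):
    """Glue a line onto its successor when it ends in a known abbreviation.

    The abbreviation list is a hypothetical heuristic, not the poster's rule.
    This loop is sequential and only illustrates the idea locally.
    """
    merged, buffer = [], ""
    for _, (line, nxt) in pair_with_next(lines):
        buffer = line if not buffer else buffer + " " + line
        if nxt is not None and line.endswith(tuple(abbreviations)):
            continue  # false break: keep accumulating into the buffer
        merged.append(buffer)
        buffer = ""
    return merged


# Hypothetical sample lines, not the text from SomeArticle.txt.
sample = [
    "The results are as shown in Fig.",
    "3 and discussed below.",
    "A second sentence.",
]
print(merge_false_breaks(sample))
```

Only the index-shift-and-join step is meant to carry over to an RDD. The sequential merge loop would need a different formulation on a distributed dataset, for example by grouping joined rows, and pairs that straddle partition boundaries are handled by the join itself.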

Re: Accessing neighboring elements in an RDD

2014-09-03 Thread Victor Tso-Guillen
Interestingly, there was an almost identical question posed on Aug 22 by cjwang. Here's the link to the archive: http://apache-spark-user-list.1001560.n3.nabble.com/Finding-previous-and-next-element-in-a-sorted-RDD-td12621.html#a12664 On Wed, Sep 3, 2014 at 10:33 AM, Daniel, Ronald (ELS-SDG)

Re: Accessing neighboring elements in an RDD

2014-09-03 Thread Chris Gore
There is support for Spark in ElasticSearch’s Hadoop integration package. http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/spark.html Maybe you could split and insert all of your documents from Spark and then query for “MoreLikeThis” on the ElasticSearch index. I haven’t

RE: Accessing neighboring elements in an RDD

2014-09-03 Thread Daniel, Ronald (ELS-SDG)
, Ronald (ELS-SDG) Cc: user@spark.apache.org Subject: Re: Accessing neighboring elements in an RDD Interestingly, there was an almost identical question posed on Aug 22 by cjwang. Here's the link to the archive: http://apache-spark-user-list.1001560.n3.nabble.com/Finding-previous-and-next-element

Re: Accessing neighboring elements in an RDD

2014-09-03 Thread Xiangrui Meng
neighboring elements in an RDD Interestingly, there was an almost identical question posed on Aug 22 by cjwang. Here's the link to the archive: http://apache-spark-user-list.1001560.n3.nabble.com/Finding-previous-and-next-element-in-a-sorted-RDD-td12621.html#a12664 On Wed, Sep 3, 2014 at 10

RE: Accessing neighboring elements in an RDD

2014-09-03 Thread Daniel, Ronald (ELS-SDG)
) Cc: user@spark.apache.org Subject: Re: Accessing neighboring elements in an RDD Interestingly, there was an almost identical question posed on Aug 22 by cjwang. Here's the link to the archive: http://apache-spark-user-list.1001560.n3.nabble.com/Finding-previous-and-next-element