Hi all,
Assume I have read the lines of a text file into an RDD:
textFile = sc.textFile("SomeArticle.txt")
Also assume that the sentence breaks in SomeArticle.txt were done by machine
and have some errors, such as the break at Fig. in the sample text below.
Index Text
N...as shown
Interestingly, there was an almost identical question posed on Aug 22 by
cjwang. Here's the link to the archive:
http://apache-spark-user-list.1001560.n3.nabble.com/Finding-previous-and-next-element-in-a-sorted-RDD-td12621.html#a12664
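For what it's worth, here is a minimal sketch of one way to look at each line together with its successor. It is not taken from the linked thread. It assumes a PySpark version that provides zipWithIndex, and the helper names (pair_with_next, is_bad_break) and the abbreviation list are made up for illustration.

```python
# Hypothetical helpers; the names and the abbreviation list are illustrative only.
ABBREVIATIONS = ("Fig.", "Eq.", "et al.")  # assumed endings that do not close a sentence


def pair_with_next(rdd):
    """Return an RDD of (line, next_line) pairs; next_line is None for the last line."""
    # (index, line), where indices follow the RDD's existing order
    indexed = rdd.zipWithIndex().map(lambda pair: (pair[1], pair[0]))
    # Re-key each line by the index of its predecessor: (index - 1, line)
    shifted = indexed.map(lambda kv: (kv[0] - 1, kv[1]))
    # Joining on the index places each line next to its successor
    return indexed.leftOuterJoin(shifted).sortByKey().values()


def is_bad_break(line):
    """True if the line ends with a listed abbreviation, so the break is suspect."""
    return line.rstrip().endswith(ABBREVIATIONS)


# Possible usage, given an existing SparkContext `sc`:
#   textFile = sc.textFile("SomeArticle.txt")
#   suspect = pair_with_next(textFile).filter(lambda p: is_bad_break(p[0]))
#   merged = suspect.map(lambda p: p[0] + " " + (p[1] or ""))
```

The join causes a shuffle, so this may be slow on large inputs; it is meant only to show the idea of shifting the index by one.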
On Wed, Sep 3, 2014 at 10:33 AM, Daniel, Ronald (ELS-SDG)
There is support for Spark in ElasticSearch’s Hadoop integration package.
http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/spark.html
Maybe you could split and insert all of your documents from Spark and then
query for “MoreLikeThis” on the ElasticSearch index. I haven’t
, Ronald (ELS-SDG)
Cc: user@spark.apache.org
Subject: Re: Accessing neighboring elements in an RDD