GitHub user wingchen opened a pull request:
https://github.com/apache/spark/pull/3408
Supporting elasticsearch API in this page in pyspark
I am trying to support the Elasticsearch API defined in this file in PySpark:
https://github.com/elasticsearch/elasticsearch-hadoop/blob/master/spark/src/main/scala/org/elasticsearch/spark/rdd/api/java/JavaEsSpark.scala
The proof-of-concept code committed in this pull request implements
`def esRDD(self, resource, query)`, and it is working well. I can add the
remaining functions too.
However, I understand that introduction of a new dependency is not
generally accepted:
> Introduction of dependencies: Due to the complex nature of Spark, we are
> conservative about introducing new dependencies. If patches add new
> dependencies to Spark, they may not be merged.
I would also be happy to hear from the community about a third-party way to
do this.
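For discussion, here is a minimal sketch of what such a third-party route
might look like, without adding anything to Spark itself. It is not the code
in this pull request. It assumes the elasticsearch-hadoop jar is supplied on
the classpath at launch time and goes through PySpark's existing
`newAPIHadoopRDD`. The helper names `es_read_conf` and `es_rdd` and the
default node `localhost` are made up for illustration.

```python
# Sketch only: read from Elasticsearch through the Hadoop InputFormat that
# ships with elasticsearch-hadoop, so Spark itself needs no new dependency.
# es_read_conf and es_rdd are hypothetical names, not part of any library.


def es_read_conf(resource, query=None, nodes="localhost"):
    """Build the Hadoop configuration dict that elasticsearch-hadoop reads."""
    conf = {
        "es.resource": resource,  # "index/type" to read from
        "es.nodes": nodes,        # assumed default; point at your cluster
    }
    if query is not None:
        conf["es.query"] = query  # URI query or query DSL string
    return conf


def es_rdd(sc, resource, query=None, nodes="localhost"):
    """Hypothetical stand-in for the proposed esRDD(resource, query).

    `sc` is an existing SparkContext. The result is an RDD of
    (document id, document fields) pairs.
    """
    return sc.newAPIHadoopRDD(
        inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=es_read_conf(resource, query, nodes),
    )
```

With this approach the jar would be passed when the job is submitted, for
example through the `--jars` option of `spark-submit`, rather than being
declared in Spark's own build.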
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wingchen/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/3408.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3408
----
commit 585348553726669f82bf7a24a7c7c23c8e77d5ba
Author: Winston Chen <[email protected]>
Date: 2014-11-11T02:17:08Z
get the dependency ready
commit a3cdbc4ff5798fdc8792c120a7b2753e1f402411
Author: Winston Chen <[email protected]>
Date: 2014-11-14T22:37:55Z
suceeded in hooking up esRDD function
commit d6c7807e3054f305645412c6c0ee2d89b650e1c5
Author: Winston Chen <[email protected]>
Date: 2014-11-14T23:24:53Z
refine the function hack
commit e8c5412e973c4b19f06629621aa7cd7f6529f78f
Author: Winston Chen <[email protected]>
Date: 2014-11-21T20:40:22Z
remove the test node
----