GitHub user gresockj opened a pull request: https://github.com/apache/nifi/pull/733

    NIFI-2417: Implementing QueryElasticsearchHttp and ScrollElasticsearchHttp

    I have implemented these processors for my own project, and thought it might be useful to submit them to NiFi. They are based on FetchElasticsearchHttp, and have the following execution designs:

    - QueryElasticsearchHttp - submits an ES query and pages through the results in a single execution, emitting one flow file per document. It can run with flow file input (in case the flow file has an attribute with the query to run) or without input.
    - ScrollElasticsearchHttp - submits an ES query and uses the scroll API to scroll through the results. The scroll_id for each page is kept in the processor's state management, and each subsequent execution of the processor emits a single page of documents as a flow file. We found this to be the most efficient way to scroll through a huge result set, as when reindexing Elasticsearch, without losing our place if NiFi goes down. The only quirk is that the processor state must be cleared before another query can be run, but this is documented in the processor, and it fits the use case of only being needed for rare events like a reindex.

    Since the processors already work correctly in our system, I am no longer authorized to put time into making major modifications to the code. As a result, if any redesign of this code is desired, I will be unable to put time toward it.

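To make the ScrollElasticsearchHttp execution design concrete, here is a minimal sketch of the "one page per execution, resume from saved state" idea. It is not the code in this pull request: the class `ScrollSketch`, the state keys `scrollId` and `finished`, and the in-memory `fetchPage` stand-in for Elasticsearch are all hypothetical, and a plain `Map` stands in for NiFi's processor state. The "finished" flag is only one possible way to model the requirement that state be cleared before another query can run.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ScrollSketch {
    // Hypothetical state keys; the real processor may name these differently.
    static final String SCROLL_ID_KEY = "scrollId";
    static final String FINISHED_KEY = "finished";

    /** Stand-in for Elasticsearch: here the "scroll id" is just the offset of the next page. */
    static List<String> fetchPage(List<String> index, int offset, int pageSize) {
        int end = Math.min(offset + pageSize, index.size());
        return new ArrayList<>(index.subList(Math.min(offset, end), end));
    }

    /** One execution: emits at most one page and records where the next execution resumes. */
    static List<String> onTrigger(List<String> index, int pageSize, Map<String, String> state) {
        if ("true".equals(state.get(FINISHED_KEY))) {
            // The query already ran to completion; nothing happens until state is cleared.
            return List.of();
        }
        int offset = Integer.parseInt(state.getOrDefault(SCROLL_ID_KEY, "0"));
        List<String> page = fetchPage(index, offset, pageSize);
        if (page.isEmpty()) {
            state.put(FINISHED_KEY, "true");
        } else {
            state.put(SCROLL_ID_KEY, String.valueOf(offset + page.size()));
        }
        return page;
    }

    public static void main(String[] args) {
        List<String> index = List.of("d1", "d2", "d3", "d4", "d5");
        Map<String, String> state = new HashMap<>();

        System.out.println(onTrigger(index, 2, state)); // first page

        // Simulate a restart: only the persisted state survives, and scrolling resumes from it.
        Map<String, String> restored = new HashMap<>(state);
        List<String> page;
        while (!(page = onTrigger(index, 2, restored)).isEmpty()) {
            System.out.println(page);
        }

        // Clearing state allows the query to run again from the beginning.
        restored.clear();
        System.out.println(onTrigger(index, 2, restored));
    }
}
```

Because the position lives in the state map rather than in the running process, a restart between executions does not lose the place in the result set, which is the property described above.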
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gresockj/nifi NIFI-2417

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/733.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #733

----
commit 5bbe09e2a7c4689bfa01588260ea89d2375e8356
Author: Joe Gresock <joseph.gres...@lmco.com>
Date:   2016-07-28T11:44:29Z

    NIFI-2417: Implementing QueryElasticsearchHttp and ScrollElasticsearchHttp

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---