GitHub user gresockj opened a pull request:

    https://github.com/apache/nifi/pull/733

    NIFI-2417: Implementing QueryElasticsearchHttp and ScrollElasticsearchHttp

    I have implemented these processors for my own project, and thought it 
might be useful to submit them to NiFi.  They are based on 
FetchElasticsearchHttp, and have the following execution designs:
    
    - QueryElasticsearchHttp - submits an ES query and pages through the 
results in a single execution, emitting one flow file per document.  Allows 
both flow file input (in case the flow file has an attribute with the query to 
run) and non-input execution.
    - ScrollElasticsearchHttp - submits an ES query and uses the scroll API to 
scroll through the results.  The scroll_id returned with each page is kept in 
the processor's managed state, and each subsequent execution of the 
processor emits a single page of documents as a flow file.  We found this to be 
the most efficient way to scroll through a huge result set, as in the case of 
reindexing Elasticsearch, without losing our place if NiFi goes down.  The only 
quirk is that the processor state must be cleared before another query 
can be run, but this is documented in the processor, and it jibes with the use 
case, since the processor is only needed for rare events like a reindex.
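
    As a rough sketch of the two execution designs above (illustrative Python, 
not the processor code): every function name, the `finished` state key, and the 
in-memory fake backend are assumptions made for this example.  Only the 
response shape (`_scroll_id`, `hits.hits`) follows the Elasticsearch search 
response.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Assumed page shape, modelled on the Elasticsearch search response:
# {"_scroll_id": "...", "hits": {"hits": [doc, ...]}}
Page = Dict


def query_all(search_page: Callable[[int, int], Page],
              page_size: int = 2) -> Iterator[dict]:
    """QueryElasticsearchHttp-style: page through every hit in a single
    execution, yielding one document at a time (one flow file each)."""
    offset = 0
    while True:
        hits = search_page(offset, page_size)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        offset += len(hits)


def scroll_once(state: Dict[str, str],
                start_scroll: Callable[[], Page],
                next_scroll: Callable[[str], Page]) -> Optional[List[dict]]:
    """ScrollElasticsearchHttp-style: one call stands for one processor
    execution. Returns a single page and stores the scroll_id in `state`."""
    if state.get("finished") == "true":  # hypothetical marker key
        return None  # nothing more to emit until the state is cleared
    scroll_id = state.get("scroll_id")
    page = start_scroll() if scroll_id is None else next_scroll(scroll_id)
    hits = page["hits"]["hits"]
    if not hits:
        state["finished"] = "true"
        return None
    state["scroll_id"] = page["_scroll_id"]
    return hits


# In-memory stand-in for Elasticsearch, for demonstration only. A real
# scroll_id is opaque; here it simply encodes the next offset.
DOCS = [{"_id": str(i)} for i in range(5)]


def fake_search_page(offset: int, size: int) -> Page:
    return {"hits": {"hits": DOCS[offset:offset + size]}}


def fake_start_scroll() -> Page:
    return {"_scroll_id": "2", "hits": {"hits": DOCS[0:2]}}


def fake_next_scroll(scroll_id: str) -> Page:
    start = int(scroll_id)
    return {"_scroll_id": str(start + 2),
            "hits": {"hits": DOCS[start:start + 2]}}
```

    Calling `scroll_once` repeatedly with the same `state` dict walks the 
result set one page per call.  Because the position lives in `state` rather 
than in a local variable, a restart between calls resumes at the stored 
scroll_id, and an emptied `state` starts the query over.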
    
    Since the processors already work correctly in our system, I am no longer 
authorized to put time into making major modifications to the code.  As a 
result, if any redesign of this code is desired, I will be unable to put time 
toward it.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gresockj/nifi NIFI-2417

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/733.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #733
    
----
commit 5bbe09e2a7c4689bfa01588260ea89d2375e8356
Author: Joe Gresock <joseph.gres...@lmco.com>
Date:   2016-07-28T11:44:29Z

    NIFI-2417: Implementing QueryElasticsearchHttp and ScrollElasticsearchHttp

----


