[ 
https://issues.apache.org/jira/browse/NIFI-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305633#comment-16305633
 ] 

ASF GitHub Bot commented on NIFI-4428:
--------------------------------------

Github user vakshorton commented on the issue:

    https://github.com/apache/nifi/pull/2310
  
    @MikeThomsen There is an indexing service that works with Kafka. However, 
for high-volume use cases, this requires a lot of Kafka resources just to 
ingest data into Druid. Presumably, all of that data would already have flowed 
through and been processed by NiFi. Providing the capability to send the data 
directly into Druid can substantially reduce the required hardware resources, 
management overhead, and usage complexity.


> Implement PutDruid Processor and Controller
> -------------------------------------------
>
>                 Key: NIFI-4428
>                 URL: https://issues.apache.org/jira/browse/NIFI-4428
>             Project: Apache NiFi
>          Issue Type: New Feature
>    Affects Versions: 1.3.0
>            Reporter: Vadim Vaks
>            Assignee: Matt Burgess
>
> Implement a PutDruid Processor and Controller using the Tranquility API. This 
> will enable NiFi to index the contents of flow files in Druid. The 
> implementation should also be able to handle late-arriving data (the event 
> timestamp points to a Druid indexing task that has already closed because its 
> segment granularity and grace window period have expired). Late-arriving data 
> is typically dropped. NiFi should allow late-arriving data to be diverted to 
> a FAILED or DROPPED relationship. That would allow late-arriving data to be 
> stored on HDFS or S3 until a re-indexing task can merge it into the correct 
> segment in deep storage.
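
To make the late-arrival rule in the description concrete, here is a minimal sketch in Python. It is not the PutDruid implementation and does not use the Tranquility API. It assumes a simplified model in which the indexing task for a segment closes once the end of the segment interval plus the grace window period has passed. The function names (`segment_end`, `route`) and the relationship names `"success"` and `"dropped"` are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def segment_end(event_time, granularity):
    """Return the end of the fixed-size segment interval containing event_time."""
    # Count whole granularity-sized buckets since the epoch, then step to the
    # end of the bucket the event falls into.
    buckets = (event_time - EPOCH) // granularity
    return EPOCH + (buckets + 1) * granularity


def route(event_time, now, granularity, window_period):
    """Pick a relationship for an event (simplified, assumed model).

    The indexing task is assumed to stay open until the segment end plus the
    grace window period; anything arriving later is treated as late.
    """
    closes_at = segment_end(event_time, granularity) + window_period
    return "success" if now < closes_at else "dropped"


if __name__ == "__main__":
    hour = timedelta(hours=1)
    grace = timedelta(minutes=10)
    event = datetime(2017, 12, 28, 12, 30, tzinfo=timezone.utc)
    # The segment is 12:00-13:00, so the task is assumed open until 13:10.
    print(route(event, datetime(2017, 12, 28, 13, 5, tzinfo=timezone.utc), hour, grace))
    print(route(event, datetime(2017, 12, 28, 13, 15, tzinfo=timezone.utc), hour, grace))
```

Under this model, flow files routed to the hypothetical `"dropped"` relationship could be written to HDFS or S3 and picked up later by a re-indexing task, as the description proposes.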



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
