[ 
https://issues.apache.org/jira/browse/NIFI-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15940247#comment-15940247
 ] 

Joseph Witt commented on NIFI-3644:
-----------------------------------

Bjorn,

We can add you to the contributors list in JIRA so that you can assign items to 
yourself.  However, in the meantime you can definitely contribute and work on 
tasks without this.  For this concept, please note that you should only need 
to create an HBase-backed implementation of the DistributedMapCache, not a 
new processor.  By design, DetectDuplicate can use any implementation of 
that interface.
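
To illustrate the idea, here is a minimal sketch, not NiFi's actual API. The 
names SimpleMapCache, RowStore, InMemoryRowStore, RowStoreMapCache and 
DuplicateCheckDemo are all hypothetical; a real implementation would target 
NiFi's own cache client interface and wrap the HBase client instead of the 
in-memory stand-in used here so the example can run on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified stand-in for the cache interface a processor would depend on. */
interface SimpleMapCache {
    /** Stores the value only if the key is absent; returns true if it was stored. */
    boolean putIfAbsent(String key, String value);

    boolean containsKey(String key);

    /** Returns the stored value, or null if the key is unknown. */
    String get(String key);
}

/** Hypothetical abstraction over a single table; a real version would wrap an HBase client. */
interface RowStore {
    /** Atomically writes the value if the row does not exist yet; returns true on write. */
    boolean checkAndPut(byte[] rowKey, byte[] value);

    /** Returns the stored bytes, or null if the row does not exist. */
    byte[] read(byte[] rowKey);
}

/** In-memory RowStore (assumption: used only so the sketch runs without a cluster). */
class InMemoryRowStore implements RowStore {
    private final Map<String, byte[]> rows = new ConcurrentHashMap<>();

    @Override
    public boolean checkAndPut(byte[] rowKey, byte[] value) {
        String key = new String(rowKey, StandardCharsets.UTF_8);
        return rows.putIfAbsent(key, value) == null;
    }

    @Override
    public byte[] read(byte[] rowKey) {
        return rows.get(new String(rowKey, StandardCharsets.UTF_8));
    }
}

/** The cache implementation: the only new piece needed; the processor stays unchanged. */
class RowStoreMapCache implements SimpleMapCache {
    private final RowStore store;

    RowStoreMapCache(RowStore store) {
        this.store = store;
    }

    @Override
    public boolean putIfAbsent(String key, String value) {
        return store.checkAndPut(
                key.getBytes(StandardCharsets.UTF_8),
                value.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public boolean containsKey(String key) {
        return store.read(key.getBytes(StandardCharsets.UTF_8)) != null;
    }

    @Override
    public String get(String key) {
        byte[] raw = store.read(key.getBytes(StandardCharsets.UTF_8));
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }
}

/** Caller that depends only on the interface, as a duplicate-detecting processor would. */
class DuplicateCheckDemo {
    /** Returns true if the identifier was already recorded by an earlier call. */
    static boolean isDuplicate(SimpleMapCache cache, String fileId) {
        return !cache.putIfAbsent(fileId, "seen");
    }

    public static void main(String[] args) {
        SimpleMapCache cache = new RowStoreMapCache(new InMemoryRowStore());
        System.out.println(isDuplicate(cache, "hash-abc"));
        System.out.println(isDuplicate(cache, "hash-abc"));
    }
}
```

Because the caller only sees the interface, swapping the in-memory store for 
an HBase-backed one would not require any change on the processor side.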

Thanks
Joe

> Add DetectDuplicateUsingHBase processor
> ---------------------------------------
>
>                 Key: NIFI-3644
>                 URL: https://issues.apache.org/jira/browse/NIFI-3644
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Bjorn Olsen
>            Priority: Minor
>
> The DetectDuplicate processor makes use of a distributed map cache for 
> maintaining a list of unique file identifiers (such as hashes).
> The distributed map cache functionality could be provided by an HBase table, 
> which then allows for reliably storing a huge volume of file identifiers and 
> auditing information. The downside of this approach is of course that HBase 
> is required.
> Storing the unique file identifiers in a reliable, query-able manner along 
> with some audit information is of benefit to several use cases.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
