[ https://issues.apache.org/jira/browse/NIFI-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952671#comment-15952671 ]
ASF GitHub Bot commented on NIFI-3644:
--------------------------------------

GitHub user baolsen opened a pull request:

    https://github.com/apache/nifi/pull/1645

    NIFI-3644 - Added HBase_1_1_2_ClientMapCacheService

    Added HBase_1_1_2_ClientMapCacheService which implements DistributedMapCacheClient.
    The DetectDuplicate processor can now make use of HBase_1_1_2_ClientMapCacheService for storing the duplicate cache on HBase.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/baolsen/nifi DistributedMapCacheHBaseClientService

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/1645.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1645

----
commit 8c0285b5efb6afd1607bb050650b758fed7d06e3
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-23T12:35:43Z

    Update HBaseClientService.java

    Added "get" function call for doing single row lookup on HBase (HBase get)

commit 03d1b36376c6954d8bdcf4056314fced0cf0d1fc
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-23T13:20:41Z

    Update HBase_1_1_2_ClientService.java

    Implemented "get" function for retrieval of single HBase rows.
commit 6dbca10e82b3b6b8ac94f8f0152b8fff85008082
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-23T13:33:15Z

    Update HBase_1_1_2_ClientService.java

commit df30a22a3ba71fedfe1dffedefcc0eb64c3670b0
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-23T13:40:08Z

    Update HBase_1_1_2_ClientService.java

commit 6d8036cc03ef49e41b92dbb5fa7e0de41cc15c3d
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-23T13:44:12Z

    Update MockHBaseClientService.java

    Implemented "get" function with UnsupportedException

commit 4bcb26fd6a99a23852097f4f3db02cbeb6b8a3b5
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-23T13:46:23Z

    Update HBase_1_1_2_ClientService.java

commit 4b266d9d1d112e2bf8aa198f87253d17c055dbbc
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-23T13:50:09Z

    Update MockHBaseClientService.java

commit 2ef850bc7c2bce5f9dd35fc9ce5cf08c7ecf07c4
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-29T08:51:11Z

    Test

commit e802f147bcd19664b9053e240ec1476ff7a61e7b
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-29T08:52:35Z

    Test

commit 4cabff26658090c08d813e74d27894a9fd684c57
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-31T07:59:50Z

    Completed initial development of HBase_1_1_2_ClientMapCacheService.java which is compatible with DetectDuplicate (and other processors)

    Still need to implement value deletion

commit 7790d3f5a8d56f0801d40ad2c836a8db7c123e1b
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-31T08:31:06Z

    Undid changes to files for an earlier attempt at this

commit 594dc059cdbe708f10849c794b826d24e83e787d
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-03-31T08:33:47Z

    Undid changes to files for an earlier attempt at this

commit fbd3034e736ecdd1d721cc788e5c984eee6560c7
Author: baolsen <bjorn.ols...@gmail.com>
Date:   2017-04-02T13:01:21Z

    Added remove() for cache and Documentation

----

> Add DetectDuplicateUsingHBase processor
> ---------------------------------------
>
>                 Key: NIFI-3644
>                 URL: https://issues.apache.org/jira/browse/NIFI-3644
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Bjorn Olsen
>            Priority: Minor
>
> The DetectDuplicate processor makes use of a distributed map cache for
> maintaining a list of unique file identifiers (such as hashes).
> The distributed map cache functionality could be provided by an HBase table,
> which then allows for reliably storing a huge volume of file identifiers and
> auditing information. The downside of this approach is of course that HBase
> is required.
> Storing the unique file identifiers in a reliable, query-able manner along
> with some audit information is of benefit to several use cases.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
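
For readers unfamiliar with the idea in the issue description, here is a minimal sketch of how a map cache can back duplicate detection. It is not the code from the pull request and does not use the NiFi or HBase APIs: the class MapCacheSketch and its methods are hypothetical, and a ConcurrentHashMap stands in for the HBase table (row key = file identifier, cell value = audit information).

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a table-backed map cache.
// Assumption: row key = file identifier (e.g. a hash), cell value = audit info.
final class MapCacheSketch {

    // Stands in for the HBase table: one entry per row key.
    private final ConcurrentMap<String, byte[]> rows = new ConcurrentHashMap<>();

    /** Stores the value only if the key is new; returns true when it was stored. */
    boolean putIfAbsent(String key, String value) {
        return rows.putIfAbsent(key, value.getBytes(StandardCharsets.UTF_8)) == null;
    }

    /** Returns the stored value for the key, if any. */
    Optional<String> get(String key) {
        byte[] cell = rows.get(key);
        return cell == null
                ? Optional.empty()
                : Optional.of(new String(cell, StandardCharsets.UTF_8));
    }

    /** Removes the key; returns true when an entry existed. */
    boolean remove(String key) {
        return rows.remove(key) != null;
    }

    /** A file counts as a duplicate when its identifier was already cached. */
    boolean isDuplicate(String fileHash, String auditInfo) {
        return !putIfAbsent(fileHash, auditInfo);
    }

    public static void main(String[] args) {
        MapCacheSketch cache = new MapCacheSketch();
        System.out.println(cache.isDuplicate("hash-1", "first seen in flow A")); // false
        System.out.println(cache.isDuplicate("hash-1", "seen again in flow B")); // true
        System.out.println(cache.get("hash-1").orElse("<missing>")); // first seen in flow A
        cache.remove("hash-1");
        System.out.println(cache.isDuplicate("hash-1", "seen after removal")); // false
    }
}
```

The point of the sketch is that the check and the write happen in one atomic put-if-absent step, so two flows that see the same identifier at the same time cannot both treat it as new. A table-backed implementation would need an equivalent atomic operation. How the pull request achieves this is not shown in this message.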