Adding HBase Support for AtomicDistributedMapCacheClient

2019-04-24 Thread Shawn Weeks
This seems like it should be fairly easy for HBase 2.x with the checkAndMutate 
functionality, and I was wondering if there is already a Jira for it. 
Otherwise I might make an attempt at it. It would be good to be able to support 
Wait/Notify and other components that need AtomicDistributedMapCacheClient using 
an Apache-developed product commonly found in a Hadoop cluster.

Thanks
Shawn


Re: HBase Atomic Cache

2019-04-24 Thread Bryan Bende
I'm not sure there is a real reason it can't be implemented. I
think HBase has the necessary compare-and-set operations to make it
work; it may just be that someone needs to do the work to implement it.
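
To make the compare-and-set idea concrete, here is a minimal in-memory sketch of the fetch-then-conditional-replace pattern an atomic map cache relies on. It is only an illustration under stated assumptions: the class AtomicCacheSketch, its Entry type, and the fetch/replace methods are hypothetical names, not NiFi's or HBase's API. A ConcurrentHashMap stands in for the HBase table, and its conditional replace plays the role that a check-and-mutate call would play against a real cluster.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for an atomic map cache (not a NiFi or HBase class).
final class AtomicCacheSketch {

    // A cached value together with the revision it was stored under.
    static final class Entry {
        final String value;
        final long revision;

        Entry(String value, long revision) {
            this.value = value;
            this.revision = revision;
        }
    }

    // Stand-in for the backing table; a real implementation would talk to HBase here.
    private final ConcurrentMap<String, Entry> store = new ConcurrentHashMap<>();

    // Returns the current entry for the key, or null if the key is absent.
    Entry fetch(String key) {
        return store.get(key);
    }

    // Writes newValue only if the stored entry is still the one the caller fetched.
    // Passing expected == null means "insert only if the key is absent".
    boolean replace(String key, Entry expected, String newValue) {
        if (expected == null) {
            return store.putIfAbsent(key, new Entry(newValue, 0L)) == null;
        }
        // Entry does not override equals, so this compares by identity: the swap
        // succeeds only if nobody has replaced the entry since it was fetched.
        return store.replace(key, expected, new Entry(newValue, expected.revision + 1));
    }

    public static void main(String[] args) {
        AtomicCacheSketch cache = new AtomicCacheSketch();
        cache.replace("signal", null, "0");

        // Two writers fetch the same revision; only the first conditional write wins.
        Entry seenByA = cache.fetch("signal");
        Entry seenByB = cache.fetch("signal");
        System.out.println("writer A succeeded: " + cache.replace("signal", seenByA, "1"));
        System.out.println("writer B succeeded: " + cache.replace("signal", seenByB, "2"));
    }
}
```

The losing writer is expected to fetch again and retry, which is the behavior components such as Wait/Notify depend on when they update shared counters.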

On Wed, Apr 24, 2019 at 3:18 PM Shawn Weeks wrote:
>
> With NiFi 1.9 and HBase 2.x, what is the current limitation preventing 
> implementation of an atomic map cache? If there’s already a Jira discussing 
> it, I’d like to read through it.
>
> Thanks
> Shawn Weeks


Re: Lookup services for cleanup and other tasks

2019-04-24 Thread Mike Thomsen
Andy,

Without going into specifics, we have been using this on a client project
where the cleanup needs are "non-trivial" and often require writing custom
lookup services to repair fields and things like that. So far, it seems to
be working very well for this sort of advanced cleanup need. Before I
dropped it in, we were seeing very long delays with the standard bundle
because a 100MB input CSV would have to go through 20+ UpdateRecord
and LookupRecord processors (mainly the latter) to get all of the cleanup done.

So it might be a good idea to revisit this.

On Mon, Jan 21, 2019 at 7:51 PM Andy LoPresto wrote:

> Mike,
>
> This looks useful to me as an offering, but also very complex and almost
> abstracting an arbitrary DSL out. One of the patterns of NiFi has been the
> Unix tool model of each component doing a single activity and separating
> responsibilities between “black box” processors. This seems a bit complex
> to add into core NiFi, but I think it would be great to offer from your
> repository for users who could take advantage of it. Perhaps with the
> coming Extension Registry, a component marketplace will emerge and it will
> gain traction there.
>
> Thanks.
>
>
> Andy LoPresto
> alopre...@apache.org
> alopresto.apa...@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> > On Jan 21, 2019, at 9:45 AM, Mike Thomsen wrote:
> >
> > A lot of the data we get is very dirty and in need of cleanup to such a
> > point that UpdateRecord cannot handle it. So we use lookup services and
> > LookupRecord to do things like normalize certain fields that are often
> > wildly off from what we need. I wrote this processor over the weekend,
> > and am wondering if I should add it to a PR:
> >
> > https://github.com/MikeThomsen/nifi-record-enhancement-bundle
> >
> > It's similar to LookupRecord, but it's designed for one purpose: let a
> > user apply a bunch of cleanup tasks in a single hop through the NiFi graph.
> >
> > Any interest?
> >
> > Thanks,
> >
> > Mike
>
>