Hey Mike, I recall looking at Mark's PR in May and believe it fit the bill for a processor that live-ingests data into Accumulo. Just be cautious of minor and major compaction storms that may cause backpressure. Bulk load is also possible -- and probably even easier with the record-oriented paradigm -- but this is a great step.
If you have a relatively large batch size and want to reduce the number of mutations (keep in mind that internally the batch writer will make a copy of that mutation), there may be some benefit in reusing *mutation* [1] if those puts are all within the same row or a small set of rows within a table. I don't suspect this will be a limiting factor for most, so it probably depends on your situation.

[1] https://github.com/apache/nifi/commit/114c8578e097c338b44d31e38454b199f2bb2660#diff-b68a6e83fae66bb7f1617ade75661fa4R243

On Mon, Sep 17, 2018 at 3:43 PM Mike Thomsen <[email protected]> wrote:

> David,
>
> Any progress? Been flirting with the idea of looking at Accumulo and wanted
> to sync up.
>
> Thanks,
>
> Mike
>
> On Sun, May 27, 2018 at 5:06 PM DAVID SMITH
> <[email protected]> wrote:
>
> > Ok, thanks I will keep that in mind when I (or whoever on my team writes
> > these) get to a point where we would like a review before trying to submit
> > back to the OS.
> > Dave
> >
> >
> > On Sunday, 27 May 2018, 21:51, Mike Thomsen <[email protected]>
> > wrote:
> >
> >
> > If you do a PR, ping @joshelser. He works on HBase and Accumulo and
> > mentioned to me that he might be up for a code review on the Accumulo
> > processors.
> >
> > On Sun, May 27, 2018 at 4:38 PM DAVID SMITH
> > <[email protected]> wrote:
> >
> > > Mike
> > > Thanks for the suggestion I will certainly have a look at the HBase code,
> > > I'm not really familiar with Accumulo, I am currently reading the docs on
> > > the apache web site.
> > > Dave
> > >
> > >
> > > On Sunday, 27 May 2018, 21:25, Mike Thomsen <[email protected]>
> > > wrote:
> > >
> > >
> > > Dave,
> > >
> > > I don't know how far along Mark's code is, but you might find some useful
> > > code you can borrow from the HBase commit(s) that added visibility label
> > > support. For example, there is code for handling visibility labels w/
> > > PutHBaseRecord that might be reusable if you want to create a put record
> > > processor for Accumulo. It won't help you with the Accumulo APIs, but it
> > > could provide useful strategies for how to identify and assign labels from
> > > user input.
> > >
> > > On Sun, May 27, 2018 at 3:14 PM DAVID SMITH
> > > <[email protected]> wrote:
> > >
> > > > Mark
> > > > Thanks for the link, I have downloaded the code from Github, it will be a
> > > > good basis to start with.
> > > > Many thanks
> > > > Dave
> > > >
> > > >
> > > > On Saturday, 26 May 2018, 20:22, Mark Payne <[email protected]>
> > > > wrote:
> > > >
> > > >
> > > > Hi Dave,
> > > >
> > > > I do have a branch in Github with the work that I had done:
> > > > https://github.com/apache/nifi/tree/NIFI-818
> > > > To be perfectly honest, though, I have absolutely no idea what state the
> > > > code is in, if it's been tested, etc.
> > > > But you're welcome to take it and run with it, if you'd like.
> > > >
> > > > Thanks
> > > > -Mark
> > > >
> > > >
> > > > On May 26, 2018, at 12:05 PM, davidrsmith
> > > > <[email protected]<mailto:[email protected]>> wrote:
> > > >
> > > > Hi
> > > >
> > > > A team at work has a need to interface with accumulo, has anyone tried
> > > > this, I know a while ago Mark Payne raised nifi jira ticket 818 but as far
> > > > as I am aware this was never completed.
> > > > I would be grateful if anyone can help or point me in the direction of
> > > > Mark's code that will give us a start.
> > > >
> > > > Many thanks
> > > > Dave
> > > >
> > > >
> > > > Sent from Samsung tablet
