Edgardo,

There are really two ways to do batching: calling session.get(50) and using the @SupportsBatching annotation.

I realize that it seems weird to have 2 different mechanisms, but each provides advantages/tradeoffs.

In general you should now just use the @SupportsBatching annotation.

When that annotation is present, the framework handles all of the batching for you, so you can just call "FlowFile flowFile = session.get();" and not worry about batching at all. It also means that the user is able to choose how much batching should occur.

EvaluateXQuery uses "session.get(50)" for a very specific reason: it has to compile the XQuery each time that onTrigger is called (because the compiled object is not thread-safe). As a result, it wants to pull a bunch of FlowFiles (up to 50) and process them all. If it just used @SupportsBatching it would end up having to recompile the XQuery for every single FlowFile,
which would be very inefficient.
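
To make the cost difference concrete, here is a minimal, self-contained sketch. It does not use the NiFi API: CompiledQuery, compile(), get() and onTrigger() below are hypothetical stand-ins for the compiled XQuery object, the compile step, session.get(n) and the processor's onTrigger method. It only counts how many compilations are needed to drain a queue when each trigger pulls one item versus up to 50.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model only; none of these types come from NiFi.
final class BatchingSketch {

    // Stand-in for an expensive, non-thread-safe compiled query object.
    static final class CompiledQuery {
        String evaluate(String content) {
            return content.toUpperCase();
        }
    }

    // Counts how many times the "expensive" compile step has run.
    static int compileCount = 0;

    static CompiledQuery compile() {
        compileCount++;
        return new CompiledQuery();
    }

    // Stand-in for session.get(max): removes and returns up to max queued items.
    static List<String> get(Queue<String> queue, int max) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < max && !queue.isEmpty()) {
            batch.add(queue.poll());
        }
        return batch;
    }

    // One simulated onTrigger call: compile once, then process the whole batch.
    // Returns the number of items processed (0 if the queue was empty).
    static int onTrigger(Queue<String> queue, int batchSize) {
        List<String> batch = get(queue, batchSize);
        if (batch.isEmpty()) {
            return 0;
        }
        CompiledQuery query = compile();
        for (String content : batch) {
            query.evaluate(content);
        }
        return batch.size();
    }

    // Drains a queue of the given size and reports how many compiles it took.
    static int compilesNeeded(int queued, int batchSize) {
        Queue<String> queue = new ArrayDeque<>();
        for (int i = 0; i < queued; i++) {
            queue.add("flowfile-" + i);
        }
        compileCount = 0;
        while (onTrigger(queue, batchSize) > 0) {
            // keep triggering until the queue is empty
        }
        return compileCount;
    }

    public static void main(String[] args) {
        System.out.println("one item per trigger:   " + compilesNeeded(120, 1));
        System.out.println("up to 50 per trigger:   " + compilesNeeded(120, 50));
    }
}
```

With 120 queued items, the one-per-trigger case compiles 120 times, while the batch-of-50 case compiles only 3 times (50 + 50 + 20). Note that the stand-in get() returns "up to" the requested number, so the last batch can be smaller than 50.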

Does this answer your question?

If something doesn't make sense or if you have any further questions, I'm happy to help!

Thanks
-Mark


------ Original Message ------
From: "Edgardo Vega" <[email protected]>
To: [email protected]
Sent: 4/21/2015 11:32:44 AM
Subject: Batching Questions

I am creating a custom processor and was attempting to support batching. From reading the code for the standard NiFi processors, it seems like all I
have to do is:

final List<FlowFile> flowFileBatch = session.get(50);

Is this correct?

The documentation states "If this annotation is present, the user will be able to choose whether they prefer high throughput or lower latency in the Processor’s Scheduling tab." I pulled in the EvaluateXQuery processor and
I do not see where that is set. Is that feature not currently enabled?

Also, if I decided not to allow batching in the Scheduling tab, what would
happen in the processor when I do session.get(50)? Will it really only get
1 FlowFile, or would it always get 50?

Thanks for the help ahead of time.
