Jack,

For 1, check one of the existing Split processors, such as SplitRecord, for
the "fragment.*" attribute pattern. SplitPCAP should be consistent with that
behavior and code.
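
As a rough illustration of that pattern (not the actual SplitRecord code), the
attribute set can be modeled in plain Java without any NiFi classes. The class
and helper names below are made up for the example. The attribute keys are the
standard fragment ones, but whether fragment.index starts at 0 or 1 should be
checked against the processor you copy from.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone sketch: models the "fragment.*" attributes with a plain Map.
// FragmentAttributesSketch and fragmentAttributes() are hypothetical names,
// not part of the NiFi API.
final class FragmentAttributesSketch {

    // Builds the attributes one split FlowFile would carry.
    // 'count' is the total number of splits, which is why it is only known
    // once the whole input has been read.
    static Map<String, String> fragmentAttributes(String fragmentId,
                                                  int index,
                                                  int count,
                                                  String originalFilename) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("fragment.identifier", fragmentId);        // shared by all splits of one input
        attrs.put("fragment.index", String.valueOf(index));  // position of this split
        attrs.put("fragment.count", String.valueOf(count));  // total number of splits
        attrs.put("segment.original.filename", originalFilename);
        return attrs;
    }
}
```

Every split of the same PCAP would share one fragment.identifier (typically a
UUID), with only fragment.index differing between them.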

For 2, if an invalid packet should "spoil the bunch", so to speak, you can
call session.rollback(), then transfer the original FlowFile to the failure
relationship and commit that.

Regards,
Matt

On Mon, Jun 24, 2024 at 10:03 AM Jack Hinton <jack.hin...@smartdcsit.co.uk>
wrote:

> Hi all,
>
> I've been doing some further testing with the SplitPCAP processor, and
> I've found that with file sizes larger than around 3GB it tends to error
> out with the response that some packet or other in the main PCAP file is
> invalid. I've been unable to determine precisely why an invalid packet
> error is returned rather than a framework-generated error, but I have found
> that if any resultant flowfiles are transferred as soon as they're split
> from the original then this issue no longer occurs.
>
> To remedy this, I propose that flowfiles should be transferred in
> configurably-sized batches during the process of splitting the main file
> rather than being collated and sent after processing is complete. This will
> also have the effect that the amount of RAM that is dedicated to the task
> of splitting PCAPs can be determined by the user rather than by how big the
> original PCAP file is. There are some issues with this, however:
>
>
>   1. 'Split' processors mark every resultant flowfile with a 'number X of
>      Y' attribute, which means the resultant flowfiles can't be sent off
>      until it's known how many there are in total.
>   2. As the 'split' flowfiles would be transferred as they're created, if
>      there is an invalid packet later in the original PCAP then a situation
>      could arise where flowfiles are transferred both to the 'split'
>      relationship and the 'failure' relationship.
>
> Does anyone have any thoughts on how to address those problems?
>
> Thanks,
>
> Jack Hinton
>
