Hi all,

I've been doing some further testing with the SplitPCAP processor, and I've 
found that with files larger than around 3GB it tends to fail, reporting that 
a packet in the main PCAP file is invalid. I haven't been able to determine 
why an invalid-packet error is returned rather than a framework-generated 
error, but I have found that if the resultant flowfiles are transferred as 
soon as they're split from the original, the issue no longer occurs.

To remedy this, I propose that flowfiles be transferred in configurably-sized 
batches while the main file is being split, rather than being collated and 
sent after processing is complete. This would also mean that the amount of RAM 
dedicated to splitting PCAPs can be controlled by the user rather than being 
dictated by the size of the original PCAP file. There are some issues with 
this, however:


  1. 'Split' processors mark every resultant flowfile with a 'number X of Y' 
     attribute, which means the resultant flowfiles can't be sent off until 
     it's known how many there are in total.
  2. As the 'split' flowfiles would be transferred as they're created, an 
     invalid packet later in the original PCAP could lead to flowfiles being 
     transferred to both the 'split' relationship and the 'failure' 
     relationship.

Does anyone have any thoughts on how to address those problems?
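
To make the proposal a bit more concrete, here is a minimal sketch of the 
batching idea in plain Java. It is illustrative only: BatchingSketch, 
transferInBatches, batchSize and the sink callback are hypothetical names, not 
part of NiFi or of the current SplitPCAP code, and sink simply stands in for 
whatever the processor would do to hand a batch of flowfiles on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper, not part of NiFi: flushes items in fixed-size batches. */
public class BatchingSketch {

    /**
     * Passes items to the sink in batches of at most batchSize, so that no
     * more than batchSize items are buffered at any one time.
     * Returns the number of batches handed to the sink.
     */
    public static <T> int transferInBatches(Iterable<T> items, int batchSize,
                                            Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<T> buffer = new ArrayList<>();
        int batches = 0;
        for (T item : items) {
            buffer.add(item);
            if (buffer.size() == batchSize) {
                // Hand over a copy so the buffer can be reused safely.
                sink.accept(new ArrayList<>(buffer));
                buffer.clear();
                batches++;
            }
        }
        if (!buffer.isEmpty()) {
            // Flush the final, possibly short, batch.
            sink.accept(new ArrayList<>(buffer));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Integers stand in for split flowfiles (an assumption for the demo).
        List<List<Integer>> seen = new ArrayList<>();
        int n = transferInBatches(List.of(1, 2, 3, 4, 5), 2, seen::add);
        System.out.println(n + " batches: " + seen);
    }
}
```

With a batch size of 2 and five items, the sink is called three times and the 
last batch holds a single item. Note that the sketch does nothing about either 
of the two issues above: it can't know the total count up front, and batches 
already handed on are not recalled if a later item turns out to be invalid.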

Thanks,

Jack Hinton
