Hello,

I am having a hard time designing processors correctly. I find it difficult 
to decide whether a processor should, for example, process a single line per 
flow file or also handle flow files with multiple lines of data (e.g. CSV 
files). Another point is the handling of header rows. A third point is data 
provenance events: which event is the correct one to use when modifying 
attributes, content, or both?

Is there a guide that outlines best practices for such cases? I have the 
feeling that many of the processors handle these issues quite differently. I 
think there should either be some sort of standard, or the differences should 
be well documented. Although very good documentation is available for the 
project, some processors take quite a bit of experimentation to get right, 
because they behave differently or follow a different philosophy that one has 
to understand first.

I would appreciate any feedback, advice, or pointers to documentation.

Uwe
