Hi all,
I want to write a generic Pig script that uses a UDF to process CSV files whose number of fields differs from file to file. Each run of the script processes only one type of input file. The UDF produces a bag containing two tuples, and the number of fields in each tuple depends on the UDF's internal logic.

My problem is that I can't pass any temporary variable from the exec() method to the outputSchema(Schema input) method of the same UDF class. That variable holds the information needed to build a valid output schema inside outputSchema(), e.g. the size of the tuples, the field names, and the data types.

Is there a way to do this, or a more efficient way to solve the problem? Thank you.