Hi Lorand

Thanks for the reply. My use case has around 100 columns and growing,
and I didn't want to make the script ugly and error-prone by
defining the schema for all 100 columns in it.

My idea was that the UDF would return a tuple for each record, with a
self-explanatory schema returned by outputSchema(), which I could then
write directly into a Hive table with HCatStorer(). HCatStorer expects
each field in the Pig script to have the same name as the corresponding
column in the Hive table schema. Hence, it would be helpful if
outputSchema() could provide a tuple whose field names match the column
names, without the null:: prefix.


Regards
Narayanan
