When you write a custom LoadFunc, there are two ways to determine the
schema of the data it returns:
1. Implement LoadMetadata and tell Pig the exact schema your LoadFunc
will produce. Your LoadFunc must then convert its output so that each
field matches the type you declared to Pig.
2. Do not implement LoadMetadata. Pig will then assume that everything
from your LoadFunc is bytearray, and will do the type conversion
according to the "AS" clause in your "LOAD" statement.
It seems you implement LoadMetadata, but your LoadFunc still returns
bytearray values. If that's the case, either convert the data to match
your declared schema, or stop implementing LoadMetadata and let Pig do
the type conversion.
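As a rough sketch of the first fix (converting the data to match the
declared schema), assuming every declared field is a float: the snippet
below is plain Java with no Pig dependency, so a List<Object> stands in
for the Tuple your loader would return. The names TypedFieldSketch,
toFloat and toTypedRecord are hypothetical, and mapping malformed input
to null is an assumption, not something Pig requires. In a real loader
the same conversion would happen inside getNext(), before the values go
into the tuple.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of Pig: plain Java lists stand in
// for the Tuple a LoadFunc would build in getNext().
class TypedFieldSketch {

    // Parse one raw field into a Float so it matches a schema that
    // declares the field as float. Empty or malformed input becomes null
    // (an assumption made for this sketch).
    static Float toFloat(byte[] raw) {
        if (raw == null || raw.length == 0) {
            return null;
        }
        try {
            return Float.valueOf(new String(raw, StandardCharsets.UTF_8).trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Convert every raw field of a record before it is handed on, instead
    // of passing the raw bytes through unchanged.
    static List<Object> toTypedRecord(List<byte[]> rawFields) {
        List<Object> typed = new ArrayList<>(rawFields.size());
        for (byte[] raw : rawFields) {
            typed.add(toFloat(raw));
        }
        return typed;
    }

    public static void main(String[] args) {
        List<byte[]> raw = new ArrayList<>();
        raw.add("1.5".getBytes(StandardCharsets.UTF_8));
        raw.add("2.25".getBytes(StandardCharsets.UTF_8));
        // Each element is now a java.lang.Float rather than a byte[].
        System.out.println(toTypedRecord(raw));
    }
}
```

The point is only that the values leaving the loader are already Float
objects, so a function such as SUM sees the type the schema promised.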
Daniel
Jae Lee wrote:
Hi,
I'm finding that my custom loader fails with SUM() on fields it returns as
DataByteArray (as PigStorage does) when its LoadMetadata declares the field
type as float.
I've also tried changing all the field types to chararray, but then Pig can't
find a SUM function for chararray.
So is it true that automatic type conversion only kicks in when your LoadFunc
doesn't implement LoadMetadata?
i.e. if your LoadFunc implements LoadMetadata, then the schema needs to declare
the right type for each field, and the tuple it returns needs to hold values of
that type?
J