When you write a custom LoadFunc, there are two ways to determine the schema of the data it returns:

1. Implement LoadMetadata and tell Pig the exact schema your LoadFunc will produce. You must then convert the LoadFunc output yourself so that it matches the schema you declared.

2. Do not implement LoadMetadata. Pig will then assume that every field from your LoadFunc is a bytearray, and it will do the type conversion according to the "AS" clause in your "LOAD" statement.

It seems you implement LoadMetadata but still return bytearray from your LoadFunc. If that's the case, either convert the data to match your declared schema, or drop LoadMetadata and let Pig do the type conversion.
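
To make option 1 concrete, here is a minimal sketch of the conversion step in plain Java. It deliberately uses stand-in types rather than the Pig API: FieldType, convert and convertRow are hypothetical names invented for this illustration. In a real loader the equivalent logic would run inside getNext() before the tuple is returned, using the same types that your LoadMetadata schema declares. Returning null for unparseable input is an assumption of this sketch, not something stated above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Sketch only: plain-Java stand-ins, NOT the Pig API.
final class TypedFieldConversion {

    // Hypothetical stand-in for the types a loader declares in its schema.
    public enum FieldType { FLOAT, CHARARRAY, BYTEARRAY }

    // Convert one raw field into the Java object matching its declared type.
    // Assumption: unparseable numeric input becomes null instead of failing.
    public static Object convert(byte[] raw, FieldType declared) {
        if (raw == null) {
            return null;
        }
        switch (declared) {
            case FLOAT:
                try {
                    return Float.valueOf(new String(raw, StandardCharsets.UTF_8).trim());
                } catch (NumberFormatException e) {
                    return null;
                }
            case CHARARRAY:
                return new String(raw, StandardCharsets.UTF_8);
            default:
                return raw; // BYTEARRAY: pass through untouched
        }
    }

    // Convert a whole row so every field matches the declared schema.
    public static List<Object> convertRow(List<byte[]> rawFields, FieldType[] schema) {
        if (rawFields.size() != schema.length) {
            throw new IllegalArgumentException("row has " + rawFields.size()
                    + " fields, schema declares " + schema.length);
        }
        List<Object> typed = new ArrayList<>(schema.length);
        for (int i = 0; i < schema.length; i++) {
            typed.add(convert(rawFields.get(i), schema[i]));
        }
        return typed;
    }

    public static void main(String[] args) {
        FieldType[] schema = { FieldType.CHARARRAY, FieldType.FLOAT };
        List<byte[]> raw = new ArrayList<>();
        raw.add("widget".getBytes(StandardCharsets.UTF_8));
        raw.add("1.5".getBytes(StandardCharsets.UTF_8));
        List<Object> row = convertRow(raw, schema);
        // The second field is now a Float, so numeric aggregation can use it.
        System.out.println(row.get(0) + " -> " + row.get(1).getClass().getSimpleName());
    }
}
```

The point of the sketch is only that the declared type and the object actually placed in the tuple have to agree. If a field is declared float but the tuple still carries raw bytes, a function like SUM() has nothing numeric to work with.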

Daniel

Jae Lee wrote:
Hi,

I'm finding that my custom loader fails on SUM() for fields returned as
DataByteArray (as in PigStorage) when its LoadMetadata says the field type
is float.

I've also tried changing all the field types to chararray, but then Pig
can't find a SUM function for chararray.

So is it true that automatic type conversion only kicks in when your LoadFunc
doesn't implement LoadMetadata?

i.e. if your LoadFunc implements LoadMetadata, then the schema needs to
declare the right type for each field, and the tuple it returns needs to hold
values of that type?

J

