PigStorage (the loader you are using) creates all values as
bytearrays, which are represented in Java as DataByteArray objects.
So when you get the "id" element of your map, it is a DataByteArray,
not a Long, and the (Long) cast in your UDF fails.
If all you really want to do is cast from bytearray to long, you
don't need a UDF for that. Changing the 3rd line in your Pig Latin
script to
C = foreach B generate (long)fld#'id';
will do the same thing.
If this is just an example script and you really want to do more in
your UDF than cast it (which I'm guessing is the case based on the UDF
name), then I would suggest casting it with Pig (as shown above) and
passing the result to your UDF to do your UDF-specific logic.
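If the conversion does have to happen inside the UDF, the idea would
be to parse the value rather than cast it. The following is only a
rough sketch: MapIdCast, FakeByteArray and toLong are hypothetical
names, and FakeByteArray is a stand-in so the example compiles
without Pig on the classpath. It assumes that DataByteArray.toString()
returns the underlying bytes as text, which you should verify against
your Pig version.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class MapIdCast {
    // Hypothetical stand-in for org.apache.pig.data.DataByteArray.
    // Assumption: the real class's toString() also yields the bytes as text.
    static final class FakeByteArray {
        private final byte[] data;

        FakeByteArray(String s) {
            data = s.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String toString() {
            return new String(data, StandardCharsets.UTF_8);
        }
    }

    // Convert a map value to Long by parsing its text form instead of
    // casting. Returns null for a null or non-numeric value.
    static Long toLong(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Long) {
            return (Long) value;
        }
        try {
            return Long.valueOf(value.toString().trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> m = new HashMap<>();
        m.put("id", new FakeByteArray("5978"));
        System.out.println(toLong(m.get("id"))); // prints 5978
    }
}
```

In a real UDF the body of exec would call something like
toLong(m.get("id")) in place of the (Long) cast. Casting in Pig Latin
is still the simpler route.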
Alan.
On Feb 16, 2010, at 10:39 AM, Kelvin Moss wrote:
Could someone please point out the mistake in this UDF?
package UDF;
import java.io.IOException;
import java.util.Map;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;
public class NetworkId extends EvalFunc<Long>
{
    public Long exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try {
            Map m = (Map) input.get(0);
            return (Long) m.get("id");
        } catch (Exception e) {
            throw WrappedIOException.wrap("Caught exception processing input row ", e);
        }
    }
}
cat data
[id#5978]
[id#5979]
grunt> A = LOAD 'data' AS fld:bytearray;
grunt> B = FOREACH A GENERATE (map[])fld;
grunt> C = FOREACH B generate UDF.NetworkId(fld);
grunt> dump C;
2010-02-16 18:38:18,113 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No column pruned for A
2010-02-16 18:38:18,113 [main] INFO org.apache.pig.impl.logicalLayer.optimizer.PruneColumns - No map keys pruned for A
2010-02-16 18:38:18,160 [main] WARN org.apache.pig.impl.io.FileLocalizer - FileLocalizer.create: failed to create /tmp/temp-1969643253
2010-02-16 18:38:18,200 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2078: Caught error from UDF: UDF.NetworkId [Caught exception processing input row [org.apache.pig.data.DataByteArray cannot be cast to java.lang.Long]]
Thanks!