Thanks to Jeremy Karn, we figured out that the problem was the name of the Python script. The script was named 'test.py', and apparently a different test.py was picked up from the Python path at runtime. That would explain the KeyError: 'square', since the module that actually got loaded does not define a square function. Renaming the script fixed the problem.
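
For anyone who runs into the same thing, here is a rough way to check which file a module name resolves to on the Python path. This is only a sketch: it assumes Python 3.4 or later (it uses importlib.util.find_spec), and the helper name resolve_module_origin is made up for illustration. On Python 2.7, imp.find_module plays a similar role.

```python
import importlib.util


def resolve_module_origin(name):
    """Return the path that `import name` would load, or None if not found.

    Hypothetical helper for illustration; it is not part of Pig or pig_util.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing on sys.path provides a module with this name.
        return None
    # For ordinary modules and packages this is a file path; it can be
    # None for namespace packages.
    return spec.origin


if __name__ == "__main__":
    # Check the name of the UDF script without its .py extension.
    print(resolve_module_origin("test"))
```

If the printed path is not your UDF script, the name is shadowed by another module. CPython's standard library ships a package called test, so that name is a likely candidate for a collision. The check is only meaningful when run with the same interpreter and Python path that the job uses.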
Nezih

From: Yigitbasi, Nezih
Sent: Thursday, November 7, 2013 8:27 AM
To: [email protected]
Subject: problem with simple cpython udf

Hi,

I am having problems running a very simple cpython udf with Pig 0.12, Python 2.7.3, and Hadoop 1.2.1. I have the following cpython udf:

    from pig_util import outputSchema

    @outputSchema("as:int")
    def square(num):
        if num == None:
            return None
        return ((num) * (num))

And then in my pig script:

    a = load '/etc/passwd' using PigStorage(':');
    register 'test.py' using org.apache.pig.scripting.streaming.python.PythonScriptEngine as myfuncs;
    b = foreach a generate myfuncs.square(3);
    dump b

I get the error:

    java.lang.Exception: org.apache.pig.impl.streaming.StreamingUDFException: LINE : KeyError: 'square'
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
    Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE : KeyError: 'square'

I also tried to run it in MapReduce mode but I still get the same error.

Any ideas?

Thanks,
Nezih
