Thanks to Jeremy Karn, we figured out that the problem was the name of the 
Python script. The script was named 'test.py', and apparently a different 
test.py was picked up from the Python path at runtime. Renaming the script 
fixed the problem.
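
For anyone hitting the same thing, here is a rough sketch of how one might check for this kind of name clash before registering a script. It is not part of Pig; the helper names (resolve_module_origin, is_shadowed) are made up for illustration, and it assumes Python 3.4+ for importlib.util.find_spec (the original setup was Python 2.7.3, where this call is not available).

```python
import importlib.util
import os


def resolve_module_origin(module_name):
    """Return the file that `import module_name` would load, or None.

    Hypothetical helper: it only asks the import system where the name
    resolves, it does not execute the module.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


def is_shadowed(script_path):
    """Return True if the script's module name resolves to a different file.

    Hypothetical helper: the module name is derived from the file name,
    e.g. '/some/dir/test.py' -> 'test'.
    """
    module_name = os.path.splitext(os.path.basename(script_path))[0]
    origin = resolve_module_origin(module_name)
    if origin is None:
        # Nothing else on the Python path claims this name.
        return False
    return os.path.realpath(origin) != os.path.realpath(script_path)
```

Calling is_shadowed with the path of the UDF script would return True when another module of the same name is found first on the Python path, which is a hint that the script should be renamed.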

Nezih

From: Yigitbasi, Nezih
Sent: Thursday, November 7, 2013 8:27 AM
To: [email protected]
Subject: problem with simple cpython udf

Hi,
I am having problems running a very simple CPython UDF with Pig 0.12, Python 
2.7.3, and Hadoop 1.2.1.

I have the following CPython UDF:

from pig_util import outputSchema

@outputSchema("as:int")
def square(num):
    if num is None:
        return None
    return num * num

And then in my pig script:

a  = load '/etc/passwd' using PigStorage(':');
register 'test.py' using org.apache.pig.scripting.streaming.python.PythonScriptEngine as myfuncs;
b = foreach a generate myfuncs.square(3);
dump b;

I get the error:
java.lang.Exception: org.apache.pig.impl.streaming.StreamingUDFException: LINE : KeyError: 'square'
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE : KeyError: 'square'

I also tried running it in MapReduce mode, but I still get the same error. Any 
ideas?

Thanks,
Nezih
