[ https://issues.apache.org/jira/browse/PIG-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039803#comment-13039803 ]
Woody Anderson commented on PIG-2098:
-------------------------------------

To be clear on the parens issue, Nicolas Torzec cleared that up:

In Python, a tuple is recognized by the commas that separate its elements, not by its surrounding parentheses, which are just used for grouping expressions... That’s why both “t = (key, )” and “t = key, ” work, but not “t = (key)”.
Nicolas.

> jython - problem with single item tuple in bag
> ----------------------------------------------
>
>                 Key: PIG-2098
>                 URL: https://issues.apache.org/jira/browse/PIG-2098
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1, 0.9.0
>            Reporter: Vivek Padmanabhan
>            Assignee: Woody Anderson
>
> While using a Python UDF, if I create a tuple with a single field, Pig
> execution fails with a ClassCastException.
> Caused by: java.io.IOException: Error executing function:
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Cannot convert
> jython type to pig datatype java.lang.ClassCastException: java.lang.String
> cannot be cast to org.apache.pig.data.Tuple
>         at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:111)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:245)
> An example to reproduce the issue:
> Pig Script
> {code}
> register 'mapkeys.py' using jython as mapkeys;
> A = load 'mapkeys.data' using PigStorage() as ( aMap: map[] );
> C = foreach A generate mapkeys.keys(aMap);
> dump C;
> {code}
> mapkeys.py
> {code}
> @outputSchema("keys:bag{t:tuple(key:chararray)}")
> def keys(map):
>   print "mapkeys.py:keys:map:", map
>   outBag = []
>   for key in map.iterkeys():
>     t = (key) ## doesn't work, causes Pig to crash
>     #t = (key,) ## adding empty value works :-/
>     outBag.append(t)
>   print "mapkeys.py:keys:outBag:", outBag
>   return outBag
> {code}
> Input data 'mapkeys.data'
> [name#John,phone#5551212]
> In the UDF, t = (key) makes the item inside the bag a string instead of a
> tuple, which causes the class cast exception.
> If I provide an additional comma, t = (key,), then the script goes through
> fine.
> From the code, what I can see is that for "t = (key,)", pythonToPig(..) receives
> the pyObject as [(u'name',), (u'phone',)] from the PyFunction call.
> But for "t = (key)" the return from the PyFunction call is [u'name', u'phone'].
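
For anyone who wants to check the tuple behaviour outside of Pig, here is a minimal sketch in plain Python. It assumes nothing from Pig: the helper name keys_as_bag is hypothetical, the @outputSchema decorator is left out because it only exists inside Pig's Jython runtime, and a plain list of 1-tuples stands in for the bag the UDF would return.

```python
def keys_as_bag(mapping):
    """Return the keys of `mapping` as a list of 1-tuples.

    Hypothetical stand-in for the UDF body; a list of tuples is the
    shape the schema bag{t:tuple(key:chararray)} expects.
    """
    out_bag = []
    for key in mapping:
        out_bag.append((key,))  # the trailing comma makes this a 1-tuple
    return out_bag


key = "name"
grouped = (key)   # parentheses only group the expression: still a str
single = (key,)   # the comma creates a tuple of length 1
bare = key,       # same tuple, no parentheses needed

bag = keys_as_bag({"name": "John", "phone": "5551212"})
```

With t = (key) the list would hold bare strings, which matches the [u'name', u'phone'] return value described in the report; with the trailing comma every element is a tuple.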