[ https://issues.apache.org/jira/browse/PIG-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates updated PIG-1942:
----------------------------

    Status: Open  (was: Patch Available)

Marking open pending response to Thejas' comments.

> script UDF (jython) should utilize the intended output schema to more
> directly convert Py objects to Pig objects
> ----------------------------------------------------------------------
>
>                 Key: PIG-1942
>                 URL: https://issues.apache.org/jira/browse/PIG-1942
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.9.0, 0.8.0
>            Reporter: Woody Anderson
>            Assignee: Woody Anderson
>            Priority: Minor
>              Labels: python, schema, udf
>         Attachments: 1942.patch, 1942_with_junit.patch
>
>
> From https://issues.apache.org/jira/browse/PIG-1824:
> {code}
> import re
> @outputSchema("y:bag{t:tuple(word:chararray)}")
> def strsplittobag(content,regex):
>     return re.compile(regex).split(content)
> {code}
> This does not work because split returns a list of strings. However, the
> output schema is known, and it would be quite simple to implicitly promote
> each string element to a tuple containing that element.
> Also, a list/array/tuple/set etc. are all equally convertible to a bag, and
> list/array/tuple are equally convertible to a Tuple; with the schema
> available, this conversion can be done in a much less rigid way.
> This allows much easier re-use of existing Python code and avoids the memory
> overhead of creating intermediate objects just to re-convert types.
> I wrote the code to do this a while back as part of my version of the
> jython script framework; I'll isolate that and attach it.
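
The conversion described in the issue can be illustrated with a minimal pure-Python sketch. It is not the attached patch and does not use Pig's classes: the nested-tuple schema encoding, the `SCALARS` table, and the `to_pig` helper are all hypothetical stand-ins, with a Python list standing in for a bag and a Python tuple for a Pig Tuple. It only shows the idea of letting the declared output schema drive the promotion of a bare element to a one-field tuple.

```python
import re

# Hypothetical schema encoding (an assumption, not Pig's Schema class):
#   ("chararray",) | ("int",) | ("double",)
#   ("tuple", [field_schema, ...])
#   ("bag", tuple_schema)
WORD_BAG = ("bag", ("tuple", [("chararray",)]))

# Scalar Pig type names mapped to Python constructors (illustrative only).
SCALARS = {"chararray": str, "int": int, "double": float}


def to_pig(value, schema):
    """Convert a Python value to the shape the schema asks for."""
    kind = schema[0]
    if kind == "bag":
        # Any list/tuple/set is accepted as a bag; strings are not,
        # since iterating a string would yield single characters.
        if not isinstance(value, (list, tuple, set, frozenset)):
            raise TypeError("cannot convert %r to a bag" % (value,))
        return [to_pig(item, schema[1]) for item in value]
    if kind == "tuple":
        fields = schema[1]
        if isinstance(value, (list, tuple)):
            if len(value) != len(fields):
                raise ValueError("expected %d fields, got %d"
                                 % (len(fields), len(value)))
            return tuple(to_pig(v, f) for v, f in zip(value, fields))
        if len(fields) == 1:
            # Implicitly promote a bare element to a one-field tuple.
            return (to_pig(value, fields[0]),)
        raise TypeError("cannot convert %r to a tuple" % (value,))
    return SCALARS[kind](value)


def strsplittobag(content, regex):
    # Same body as the UDF in the issue, with the schema applied by hand.
    return to_pig(re.compile(regex).split(content), WORD_BAG)
```

Under these assumptions, the list of strings returned by `split` becomes a list of one-field tuples without the UDF author wrapping each element by hand.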