script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
----------------------------------------------------------------------------------------------------------------

                 Key: PIG-1942
                 URL: https://issues.apache.org/jira/browse/PIG-1942
             Project: Pig
          Issue Type: Improvement
          Components: impl
    Affects Versions: 0.8.0, 0.9.0
            Reporter: Woody Anderson
            Priority: Minor
             Fix For: 0.9.0

From https://issues.apache.org/jira/browse/PIG-1824:

{code}
import re

@outputSchema("y:bag{t:tuple(word:chararray)}")
def strsplittobag(content, regex):
    # split() returns a list of str, not a list of tuples
    return re.compile(regex).split(content)
{code}

This does not work because split returns a list of strings, while the declared schema is a bag of tuples. However, the output schema is known, so it would be simple to implicitly promote each string element to a single-field tuple.

Also, a list, array, tuple, or set is equally convertible to a Bag, and a list, array, or tuple is equally convertible to a Tuple. With the schema available, this conversion can be done far less rigidly. That makes it much easier to reuse existing Python code and avoids the memory overhead of building intermediate objects just to re-convert types.

I wrote code to do this a while back as part of my version of the Jython script framework. I'll isolate it and attach it.
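
To illustrate the schema-driven promotion described in this issue, here is a minimal sketch in plain Python. It is not the attached patch and does not use Pig's Schema classes: the nested-tuple schema model and the to_pig function are hypothetical names invented for the example, and the result uses Python lists and tuples as stand-ins for Pig Bags and Tuples.

```python
import re

# Hypothetical, simplified schema model (an assumption, NOT Pig's Schema API):
#   "chararray", "int", ...           scalar type name, value passed through
#   ("tuple", [field_schema, ...])    becomes a Python tuple
#   ("bag", tuple_schema)             becomes a list of tuples

def to_pig(value, schema):
    """Shape a Python value according to the declared output schema."""
    if isinstance(schema, str):
        # Scalar field: no structural conversion needed in this sketch.
        return value

    kind, inner = schema
    if kind == "tuple":
        if isinstance(value, (list, tuple)):
            # Sequences map field by field onto the tuple schema.
            if len(value) != len(inner):
                raise ValueError("expected %d fields, got %d"
                                 % (len(inner), len(value)))
            return tuple(to_pig(v, s) for v, s in zip(value, inner))
        if len(inner) == 1:
            # Implicit promotion: a bare element becomes a one-field tuple.
            return (to_pig(value, inner[0]),)
        raise TypeError("cannot convert %r to a %d-field tuple"
                        % (value, len(inner)))

    if kind == "bag":
        if isinstance(value, (list, tuple, set, frozenset)):
            # Any of these containers is acceptable as a bag.
            return [to_pig(item, inner) for item in value]
        raise TypeError("cannot convert %r to a bag" % (value,))

    raise ValueError("unknown schema kind: %r" % (kind,))


# Mirrors y:bag{t:tuple(word:chararray)} from the example UDF.
schema = ("bag", ("tuple", ["chararray"]))
words = re.compile(r"\s+").split("a b c")
result = to_pig(words, schema)  # [("a",), ("b",), ("c",)]
```

Because the conversion walks the schema rather than inspecting Python types alone, the UDF body can return whatever container is natural for the Python code, and the promotion to tuples happens once at the boundary.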