[ https://issues.apache.org/jira/browse/PIG-958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ankur updated PIG-958:
----------------------

    Attachment: 958.v4.patch

1. When run in cluster mode, the static variable PigMapReduce.sJobConf is null when checked in the UDF constructor but NOT null when the UDF is actually invoked. This caused the FileSystem object 'fs' to be incorrectly initialized to the local filesystem, which made the test fail. Moved the 'fs' initialization to the intijobSpecificParams() method.

2. Deleting the temporary directory manually in finish() causes the job to fail. Removed the manual deletion. As a side effect, the user-specified PARENT output directory in the UDF will contain empty part-* files. These should be deleted manually by the user.

Verified that the UDF works correctly and that the unit tests pass.

> Splitting output data on key field
> ----------------------------------
>
>                 Key: PIG-958
>                 URL: https://issues.apache.org/jira/browse/PIG-958
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Ankur
>         Attachments: 958.v3.patch, 958.v4.patch
>
>
> Pig users often face the need to split the output records into a bunch of
> files and directories depending on the type of record. Pig's SPLIT operator
> is useful when record types are few and known in advance. In cases where type
> is not directly known but is derived dynamically from values of a key field
> in the output tuple, a custom store function is a better solution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
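
For readers following item 1 of the comment above, here is a minimal sketch of the constructor-time versus invocation-time initialization problem. It is not the code from 958.v4.patch. JobContext, EagerUdf, LazyUdf and ensureInitialized() are hypothetical names, a plain Map stands in for the job configuration that PigMapReduce.sJobConf holds, and a String stands in for the Hadoop FileSystem object, so the example runs without Pig or Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a static job configuration owned by the framework.
// It is null until the framework sets it, which may happen after UDF construction.
final class JobContext {
    static Map<String, String> conf;
}

// Problem pattern: the filesystem choice is fixed in the constructor,
// so a configuration that is still null locks in the local filesystem.
class EagerUdf {
    private final String fs;

    EagerUdf() {
        fs = (JobContext.conf == null)
                ? "local"
                : JobContext.conf.getOrDefault("fs.default.name", "local");
    }

    String exec() {
        return fs;
    }
}

// Fix pattern: defer the choice until the UDF is first invoked,
// by which time the configuration has been set.
class LazyUdf {
    private String fs; // resolved on first exec() call

    private void ensureInitialized() {
        if (fs != null) {
            return;
        }
        fs = (JobContext.conf == null)
                ? "local"
                : JobContext.conf.getOrDefault("fs.default.name", "local");
    }

    String exec() {
        ensureInitialized();
        return fs;
    }
}

class LazyInitDemo {
    public static void main(String[] args) {
        // Both UDFs are constructed before the configuration exists.
        JobContext.conf = null;
        EagerUdf eager = new EagerUdf();
        LazyUdf lazy = new LazyUdf();

        // The framework sets the configuration later (illustrative value).
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.default.name", "hdfs://namenode:8020");
        JobContext.conf = conf;

        System.out.println("eager: " + eager.exec()); // eager: local
        System.out.println("lazy:  " + lazy.exec());  // lazy:  hdfs://namenode:8020
    }
}
```

In a real UDF the deferred step would presumably obtain the FileSystem from the job configuration at that point instead of returning a string; the sketch only shows why the timing of the check matters.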