support expression of indirect dependency between statements -------------------------------------------------------------
Key: PIG-2212 URL: https://issues.apache.org/jira/browse/PIG-2212 Project: Pig Issue Type: New Feature Reporter: Thejas M Nair There needs to be a better way to support following use case, than using exec. It should be possible to express a dependency between a store statement and another statement, to ensure that the store happens first. It is worth considering if allowing user to specify dependency between any two statements is going to be useful. {noformat} I have some data that I would like to store into a file and then load it in a UDF to do some operations in the next pig statement. For example, doc_ids = FOREACH docs GENERATE doc_id; STORE doc_ids INTO '$TEMP'; modifieddocs = FOREACH docs GENERATE myUDF('$TEMP', doc_id); where myUDF loads doc_ids stored in '$TEMP' and does some operation using $TEMP and doc_id. Now I need to make sure that the "STORE doc_ids INTO '$TEMP';" occurs before the FOREACH statement, so that loading the index occurs smoothly. Is there anyway to guarantee that that can happen? {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira