[ https://issues.apache.org/jira/browse/PIG-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582412#comment-13582412 ]
Cheolsoo Park edited comment on PIG-3169 at 2/20/13 6:57 PM:
-------------------------------------------------------------
The issue is that the test cases generate input files under /tmp and use them
across MR jobs:
{code}
pig.registerQuery("A = LOAD '"
        + Util.generateURI(tmpFile.toString(), pig.getPigContext()) + "';");
pig.registerQuery("Split A into A1 if $0<=10, A2 if $0>10;");
pig.registerQuery("Store A1 into '"
        + FileLocalizer.getTemporaryPath(pigContext) + "';");
pig.openIterator("A2");
{code}
The "Store A1" statement runs as the 1st job, and openIterator("A2") runs as
the 2nd job. Because the input file is deleted after the 1st job,
openIterator("A2") fails to load it.
Attached is a patch that removes deleteTempFiles() from PigServer. I think
having it only in GruntParser serves the original motivation of this JIRA.
Please let me know if anyone thinks otherwise.
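The failure mode can be reproduced without Pig. The following is only a minimal
sketch in plain Java, not the actual PigServer code: the class name, runJob(),
secondJobSucceeds(), and the standalone deleteTempFiles(Path) helper are all
hypothetical stand-ins. It assumes that two consecutive jobs read the same
temporary input and that cleanup may run between them.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.List;

public class TempFileLifetimeSketch {

    /** Hypothetical stand-in for one MR job: reads the shared input and counts matching rows. */
    static long runJob(Path input, boolean lowHalf) throws IOException {
        return Files.readAllLines(input).stream()
                .mapToInt(Integer::parseInt)
                .filter(v -> lowHalf ? v <= 10 : v > 10)
                .count();
    }

    /** Hypothetical stand-in for per-job cleanup; it removes the shared input file. */
    static void deleteTempFiles(Path input) throws IOException {
        Files.deleteIfExists(input);
    }

    /** Runs two jobs over one temp input and reports whether the second job could read it. */
    static boolean secondJobSucceeds(boolean cleanUpAfterEachJob) {
        try {
            Path input = Files.createTempFile("pig-input", ".txt");
            try {
                Files.write(input, List.of("5", "10", "15", "20"));
                runJob(input, true);            // plays the role of "Store A1"
                if (cleanUpAfterEachJob) {
                    deleteTempFiles(input);     // cleanup between the two jobs
                }
                runJob(input, false);           // plays the role of openIterator("A2")
                return true;
            } catch (NoSuchFileException e) {
                return false;                   // the input vanished before the 2nd job
            } finally {
                Files.deleteIfExists(input);    // cleanup at "session end"
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cleanup after each job, 2nd job succeeds: " + secondJobSucceeds(true));
        System.out.println("cleanup at session end, 2nd job succeeds: " + secondJobSucceeds(false));
    }
}
```

In this sketch the second job fails only when cleanup runs between the two
jobs, which is the situation the test cases above hit; deferring cleanup until
both jobs have finished avoids it.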
> Remove temporary files that are not needed
> ------------------------------------------
>
> Key: PIG-3169
> URL: https://issues.apache.org/jira/browse/PIG-3169
> Project: Pig
> Issue Type: Improvement
> Reporter: Mark Wagner
> Assignee: Mark Wagner
> Priority: Minor
> Fix For: 0.12
>
> Attachments: PIG-3169.1.patch, PIG-3169-hotfix.patch
>
>
> When using Grunt, intermediate data and distributed cache files are left in
> 'pig.temp.dir' until the session is closed. It would be nice to clean up
> files as soon as they are no longer needed.