[ https://issues.apache.org/jira/browse/PIG-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689716#comment-13689716 ]
Cheolsoo Park commented on PIG-3288:
------------------------------------

[~aniket486], thank you very much for your feedback!
# I like your suggestion regarding the name of the property/counter. I'll probably change it to "pig.exec.termination.counter.limit". Let me know if you have a better suggestion.
# The storefunc (PigStorageWithFileCount) that I wrote is just for the e2e test, and _with this storefunc_, it is true that a new storefunc is initialized for each new file. Again, how the counter is incremented depends _entirely_ on the storage implementation. For example, if you're using CombinedOutputFormat, it's your responsibility to increment the counter properly in your storage. I documented this clearly.

> Kill jobs if the number of output files is over a configurable limit
> --------------------------------------------------------------------
>
>                 Key: PIG-3288
>                 URL: https://issues.apache.org/jira/browse/PIG-3288
>             Project: Pig
>          Issue Type: Wish
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12
>
>         Attachments: PIG-3288-2.patch, PIG-3288-3.patch, PIG-3288-4.patch, PIG-3288.patch
>
>
> I ran into a situation where a Pig job tried to create too many files on HDFS
> and overloaded the NameNode. To prevent such events, it would be nice if we
> could set an upper limit on the number of files that a Pig job can create.
> In fact, Hive has a property called "hive.exec.max.created.files". The idea
> is that each mapper/reducer increments a counter every time it creates a
> file. Then, MRLauncher periodically checks whether the number of files
> created so far has exceeded the upper limit. If so, we kill the running jobs
> and exit.
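
For illustration, the counter-and-limit idea described in the issue could be sketched roughly as follows. This is not the code from the attached patches: it is a plain-Java simulation with no Hadoop or Pig dependencies, and every name in it (CreatedFilesCounter, CountingStore, exceedsLimit, runUntilKilled) is hypothetical. It also assumes that a non-positive limit disables the check, which the issue does not state.

```java
import java.util.concurrent.atomic.AtomicLong;

public class FileCountLimitSketch {

    // Hypothetical stand-in for a job-wide counter shared by all tasks.
    static final class CreatedFilesCounter {
        private final AtomicLong value = new AtomicLong();
        void increment() { value.incrementAndGet(); }
        long get() { return value.get(); }
    }

    // Hypothetical stand-in for a store function. As the comment notes,
    // incrementing the counter is the storage implementation's job.
    static final class CountingStore {
        private final CreatedFilesCounter counter;
        CountingStore(CreatedFilesCounter counter) { this.counter = counter; }
        void createFile(String name) {
            // A real storage implementation would open the output file here.
            counter.increment();
        }
    }

    // The check a launcher would run on each polling cycle.
    // Assumption: a limit <= 0 means "no limit".
    public static boolean exceedsLimit(long created, long limit) {
        return limit > 0 && created > limit;
    }

    // Simulates a job that creates files while a launcher polls the counter
    // after every pollEvery files. Returns the counter value at which the
    // job was killed, or -1 if the job finished without exceeding the limit.
    public static long runUntilKilled(int filesToCreate, long limit, int pollEvery) {
        CreatedFilesCounter counter = new CreatedFilesCounter();
        CountingStore store = new CountingStore(counter);
        for (int i = 1; i <= filesToCreate; i++) {
            store.createFile("part-" + i);
            boolean pollNow = (i % pollEvery == 0);
            if (pollNow && exceedsLimit(counter.get(), limit)) {
                return counter.get(); // launcher kills the job here
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long killedAt = runUntilKilled(10, 3, 2);
        System.out.println(killedAt < 0
                ? "job completed"
                : "killed after " + killedAt + " files");
        // prints "killed after 4 files"
    }
}
```

Because the check is periodic, the job in this sketch is stopped at 4 files even though the limit is 3: the limit can be overshot by whatever is created between two polls.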