[ 
https://issues.apache.org/jira/browse/PIG-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-3288:
-------------------------------

    Attachment: PIG-3288-3.patch

I updated my patch as follows:
* I removed the counter logic from PigTextOutputFormat, but I didn't push it 
into StoreFunc. The reason is that I think this logic is storage-specific, 
so I wanted to leave the implementation to the individual storages. Even though 
we can provide a default implementation in the StoreFunc class, it won't be 
useful unless other storages subclass it.
* However, I still needed a storage that increments the counter for testing. 
So I wrote one by wrapping PigStorage with StoreFuncWrapper, and I updated my 
e2e test to use this storage.
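
For illustration only, the wrapping idea in the second bullet can be sketched 
in plain Java. This is not the code in the patch and does not use Pig's 
StoreFunc or StoreFuncWrapper classes: RecordStore, InMemoryStore and 
CountingStore are hypothetical stand-ins, and the sketch assumes that one 
call to open() corresponds to one created output file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified stand-in for a store function (not Pig's API). */
interface RecordStore {
    void open(String fileName); // start writing a new output file
    void put(String record);    // write one record to the open file
}

/** Plays the role of the wrapped storage; keeps its output in memory. */
class InMemoryStore implements RecordStore {
    final List<String> lines = new ArrayList<>();

    public void open(String fileName) {
        lines.add("# " + fileName);
    }

    public void put(String record) {
        lines.add(record);
    }
}

/** Delegates every call to the wrapped store and counts each file it opens. */
class CountingStore implements RecordStore {
    private final RecordStore wrapped;
    private final AtomicLong createdFiles; // stand-in for a job counter

    CountingStore(RecordStore wrapped, AtomicLong createdFiles) {
        this.wrapped = wrapped;
        this.createdFiles = createdFiles;
    }

    public void open(String fileName) {
        // Assumption: one open() call means one new output file.
        createdFiles.incrementAndGet();
        wrapped.open(fileName);
    }

    public void put(String record) {
        wrapped.put(record);
    }
}
```

The point of the wrapper is that the counting stays out of the wrapped 
storage, so a storage that does not care about the limit is left unchanged.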

Tests done:
* Ran the new e2e test case _TooManyFilesCreatedErrors\_1_ on a cluster.
* Ran test-commit.

Thanks!
                
> Kill jobs if the number of output files is over a configurable limit
> --------------------------------------------------------------------
>
>                 Key: PIG-3288
>                 URL: https://issues.apache.org/jira/browse/PIG-3288
>             Project: Pig
>          Issue Type: Wish
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12
>
>         Attachments: PIG-3288-2.patch, PIG-3288-3.patch, PIG-3288.patch
>
>
> I ran into a situation where a Pig job tried to create too many files on HDFS 
> and overloaded the NameNode (NN). To prevent such events, it would be nice if 
> we could set an upper limit on the number of files that a Pig job can create.
> In fact, Hive has a property called "hive.exec.max.created.files". The idea 
> is that each mapper/reducer increments a counter every time it creates a 
> file. Then, MRLauncher periodically checks whether the number of files 
> created so far has exceeded the upper limit. If so, we kill the running jobs 
> and exit.
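
A minimal sketch of the check described in the issue, in plain Java. It is 
not Pig's MRLauncher code: CreatedFilesCounter and CreatedFilesMonitor are 
hypothetical names, the shared counter stands in for a job-wide counter, and 
the launcher is assumed to call poll() on each of its periodic progress 
checks.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a job-wide "files created" counter. */
class CreatedFilesCounter {
    private final AtomicLong created = new AtomicLong();

    /** Called by a mapper or reducer each time it creates an output file. */
    void increment() {
        created.incrementAndGet();
    }

    long get() {
        return created.get();
    }
}

/** Hypothetical launcher-side check against a configurable upper limit. */
class CreatedFilesMonitor {
    private final CreatedFilesCounter counter;
    private final long maxCreatedFiles;
    private boolean killed = false;

    CreatedFilesMonitor(CreatedFilesCounter counter, long maxCreatedFiles) {
        this.counter = counter;
        this.maxCreatedFiles = maxCreatedFiles;
    }

    /**
     * One periodic check. Returns true once the number of created files
     * has exceeded the limit, meaning the running jobs should be killed.
     */
    boolean poll() {
        if (!killed && counter.get() > maxCreatedFiles) {
            killed = true; // a real launcher would kill the jobs here
        }
        return killed;
    }
}
```

With a limit of 2, poll() stays false after two increments and turns true 
after the third, because the check is "exceeded", not "reached".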

