Aniruddha,

You can use a custom storage agent to configure how and where you want to 
checkpoint. You can set your custom storage agent using the STORAGE_AGENT 
(https://www.datatorrent.com/docs/apidocs/com/datatorrent/api/Context.OperatorContext.html#STORAGE_AGENT)
attribute.
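
For illustration, here is a minimal sketch of what such an agent might look like, writing serialized checkpoints to a cluster-mounted directory (e.g. an HA NAS mount) instead of HDFS. The method signatures mirror those of com.datatorrent.api.StorageAgent (save/load/delete/getWindowIds) as I understand them; the class name and directory path are hypothetical, and a real agent would implement the StorageAgent interface directly.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical checkpoint agent backed by a mounted directory.
// In a real Apex application this would implement
// com.datatorrent.api.StorageAgent.
public class NasStorageAgent implements Serializable {
  private final String basePath; // e.g. "/mnt/nas/checkpoints" (hypothetical)

  public NasStorageAgent(String basePath) {
    this.basePath = basePath;
  }

  // One file per (operator, window): <basePath>/<operatorId>/<windowId hex>
  private Path file(int operatorId, long windowId) {
    return Paths.get(basePath, String.valueOf(operatorId), Long.toHexString(windowId));
  }

  // Serialize the operator state for the given checkpoint window.
  public void save(Object object, int operatorId, long windowId) throws IOException {
    Path path = file(operatorId, windowId);
    Files.createDirectories(path.getParent());
    try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(path))) {
      oos.writeObject(object);
    }
  }

  // Restore the operator state checkpointed at the given window.
  public Object load(int operatorId, long windowId) throws IOException {
    try (ObjectInputStream ois =
        new ObjectInputStream(Files.newInputStream(file(operatorId, windowId)))) {
      return ois.readObject();
    } catch (ClassNotFoundException e) {
      throw new IOException(e);
    }
  }

  // Remove a checkpoint that is no longer needed.
  public void delete(int operatorId, long windowId) throws IOException {
    Files.deleteIfExists(file(operatorId, windowId));
  }

  // List the window ids for which checkpoints exist for an operator.
  public long[] getWindowIds(int operatorId) throws IOException {
    Path dir = Paths.get(basePath, String.valueOf(operatorId));
    if (!Files.isDirectory(dir)) {
      return new long[0];
    }
    List<Long> ids = new ArrayList<>();
    try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir)) {
      for (Path p : ds) {
        ids.add(Long.parseLong(p.getFileName().toString(), 16));
      }
    }
    long[] result = new long[ids.size()];
    for (int i = 0; i < result.length; i++) {
      result[i] = ids.get(i);
    }
    return result;
  }
}
```

If I recall correctly, you would then attach it in populateDAG with something like dag.setAttribute(OperatorContext.STORAGE_AGENT, new NasStorageAgent("/mnt/nas/checkpoints")); please double-check against the javadoc linked above.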

HTH

Thanks
- Gaurav

> On Jan 20, 2016, at 7:03 AM, Sandesh Hegde <[email protected]> wrote:
> 
> It is already supported; refer to the following JIRA for more information:
> 
> https://issues.apache.org/jira/browse/APEXCORE-283
> 
> 
> 
> On Tue, Jan 19, 2016 at 10:43 PM Aniruddha Thombare <
> [email protected]> wrote:
> 
>> Hi,
>> 
>> Is it possible to save checkpoints in any other highly available
>> distributed file system (which may be mounted directories across the
>> cluster) other than HDFS?
>> If yes, is it configurable?
>> 
>> AFAIK, there is no configurable option available to achieve that.
>> If that's the case, can we have that feature?
>> 
>> This is with the intention to recover the applications faster and do away
>> with HDFS's small files problem as described here:
>> 
>> http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
>> 
>> http://snowplowanalytics.com/blog/2013/05/30/dealing-with-hadoops-small-files-problem/
>> http://inquidia.com/news-and-info/working-small-files-hadoop-part-1
>> 
>> If we could save checkpoints in some other distributed file system (or even
>> a HA NAS box) geared for small files, we could achieve -
>> 
>>   - Better NameNode & HDFS performance for production usage (read:
>>   production data I/O & not temp files)
>>   - Faster application recovery in case of planned shutdown / unplanned
>>   restarts
>> 
>> Please, send your comments, suggestions or ideas.
>> 
>> Thanks,
>> 
>> 
>> Aniruddha
>> 
