Hi,
Now that this has been discussed, will the checkpointed data be purged
when we kill the application forcefully? In our current usage, we
forcefully kill the app after it processes a certain batch of data, and I
see that these small files are created under the /user/datatorrent
directory and are not removed.
In another scenario, when some of the containers keep failing, we have
observed a state where data is continuously checkpointed into small files;
when we kill the app, that data is still there.
We have received concerns that this is impacting NameNode performance,
since these small files are stored in HDFS, so we manually remove this
checkpointed data at regular intervals, roughly along the lines of the
sketch below.
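This is only a simplified sketch of that cleanup, not the exact script we
run; the /user/datatorrent root and the seven-day threshold are examples to
adjust, and it should only be pointed at directories of applications that
are no longer running, since deleting live checkpoints would break recovery.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CheckpointCleanup {
      public static void main(String[] args) throws IOException {
        // Root directory holding the leftover checkpoint data (adjust as needed).
        Path root = new Path(args.length > 0 ? args[0] : "/user/datatorrent");
        // Anything untouched for more than seven days is considered stale here.
        long cutoff = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000;

        FileSystem fs = FileSystem.get(new Configuration());
        for (FileStatus status : fs.listStatus(root)) {
          if (status.isDirectory() && status.getModificationTime() < cutoff) {
            System.out.println("Removing " + status.getPath());
            fs.delete(status.getPath(), true); // recursive delete
          }
        }
        fs.close();
      }
    }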
-Venkatesh
-----Original Message-----
From: Amol Kekre [mailto:[email protected]]
Sent: Monday, February 01, 2016 7:49 AM
To: [email protected]; [email protected]
Subject: Re: Possibility of saving checkpoints on other distributed filesystems
Aniruddha,
We have not heard this request from users yet. It may be because our
checkpointing has a purge, i.e. the small files are not left lying around.
The small-files problem has been around in Hadoop for a long time and
relates to storing small files in HDFS for a long period (more likely
forever).
Thks,
Amol
On Mon, Feb 1, 2016 at 6:05 AM, Aniruddha Thombare < [email protected]>
wrote:
> Hi Community,
>
> Or let me say, BigFoots: do you think this feature should be available?
>
> The reason for bringing this up was discussed at the start of this thread:
>
> > This is with the intention to recover the applications faster and do away
> > with HDFS's small files problem as described here:
> >
> > http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
> > http://snowplowanalytics.com/blog/2013/05/30/dealing-with-hadoops-small-files-problem/
> > http://inquidia.com/news-and-info/working-small-files-hadoop-part-1
> >
> > If we could save checkpoints in some other distributed file system (or
> > even an HA NAS box) geared for small files, we could achieve -
> >
> > - Better performance of NN & HDFS for the production usage (read:
> >   production data I/O & not temp files)
> > - Faster application recovery in case of planned shutdown / unplanned
> >   restarts
>
> If you feel the need for this feature, please cast your opinions and ideas
> so that it can be converted into a JIRA.
>
>
>
> Thanks,
>
>
> Aniruddha
>
> On Thu, Jan 21, 2016 at 11:19 PM, Gaurav Gupta
> <[email protected]>
> wrote:
>
> > Aniruddha,
> >
> > Currently we don't have any support for that.
> >
> > Thanks
> > -Gaurav
> >
> > On Thu, Jan 21, 2016 at 12:24 AM, Tushar Gosavi
> > <[email protected]>
> > wrote:
> >
> > > The default FSStorageAgent could be used, as it can work with the local
> > > filesystem, but as far as I know there is no support for specifying the
> > > directory through the XML file; by default it uses the application
> > > directory on HDFS.
> > >
> > > Not sure if we could specify a storage agent along with its properties
> > > through the configuration at the DAG level.
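> > > Programmatically, though, something along these lines should be
> > > possible. This is only a rough, untested sketch: the STORAGE_AGENT
> > > attribute and the FSStorageAgent constructor are written from memory,
> > > and the file:// mount path is just a placeholder that would have to be
> > > visible on every node running a container.
> > >
> > >   import org.apache.hadoop.conf.Configuration;
> > >   import com.datatorrent.api.Context;
> > >   import com.datatorrent.api.DAG;
> > >   import com.datatorrent.api.StreamingApplication;
> > >   import com.datatorrent.common.util.FSStorageAgent;
> > >
> > >   public class NasCheckpointApp implements StreamingApplication {
> > >     @Override
> > >     public void populateDAG(DAG dag, Configuration conf) {
> > >       // Point the checkpoint storage agent at a mounted NAS path instead
> > >       // of the default application directory on HDFS (placeholder path).
> > >       String path = "file:///mnt/ha-nas/apex-checkpoints";
> > >       dag.setAttribute(Context.OperatorContext.STORAGE_AGENT,
> > >           new FSStorageAgent(path, conf));
> > >       // ... add operators and streams as usual ...
> > >     }
> > >   }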
> > >
> > > - Tushar.
> > >
> > >
> > > On Thu, Jan 21, 2016 at 12:14 PM, Aniruddha Thombare <
> > > [email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > Do we have any storage agent which I can use readily, configurable
> > > > through dt-site.xml?
> > > >
> > > > I am looking for something that would save checkpoints in a mounted
> > > > file system [e.g. HA-NAS], which is basically just another directory
> > > > for Apex.
> > > >
> > > >
> > > >
> > > >
> > > > Thanks,
> > > >
> > > >
> > > > Aniruddha
> > > >
> > > > On Wed, Jan 20, 2016 at 8:33 PM, Sandesh Hegde <
> > [email protected]>
> > > > wrote:
> > > >
> > > > > It is already supported; refer to the following JIRA for more
> > > > > information:
> > > > >
> > > > > https://issues.apache.org/jira/browse/APEXCORE-283
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jan 19, 2016 at 10:43 PM Aniruddha Thombare <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Is it possible to save checkpoints in any other highly available
> > > > > > distributed file system (which may be a mounted directory across
> > > > > > the cluster) other than HDFS?
> > > > > > If yes, is it configurable?
> > > > > >
> > > > > > AFAIK, there is no configurable option available to achieve that.
> > > > > > If that's the case, can we have that feature?
> > > > > >
> > > > > > This is with the intention to recover the applications faster and
> > > > > > do away with HDFS's small files problem as described here:
> > > > > >
> > > > > > http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
> > > > > > http://snowplowanalytics.com/blog/2013/05/30/dealing-with-hadoops-small-files-problem/
> > > > > > http://inquidia.com/news-and-info/working-small-files-hadoop-part-1
> > > > > >
> > > > > > If we could save checkpoints in some other distributed file system
> > > > > > (or even an HA NAS box) geared for small files, we could achieve -
> > > > > >
> > > > > > - Better performance of NN & HDFS for the production usage (read:
> > > > > >   production data I/O & not temp files)
> > > > > > - Faster application recovery in case of planned shutdown /
> > > > > >   unplanned restarts
> > > > > >
> > > > > > Please send your comments, suggestions, or ideas.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > >
> > > > > > Aniruddha
> > > > > >
> > > > >
> > > >
> > >
> >
>