Flume removes data files once there are no more events to be read from them. However, once 2 or more files have been created, at least 2 will always remain; that is just a safety net.

Thanks, Hari

On Mon, Nov 10, 2014 at 1:23 AM, Needham, Guy <[email protected]> wrote:

> Is there a concept of a data file which is 'done'? Does Flume remove data
> files it no longer needs, or will these build up?
>
> Regards,
>
> Guy Needham | Data Discovery
> Virgin Media | Enterprise Data, Design & Management
> Bartley Wood Business Park, Hook, Hampshire RG27 9UP
> D 01256 75 3362
>
> I welcome VSRE emails. Learn more at http://vsre.info/
> ________________________________
> From: Hari Shreedharan [mailto:[email protected]]
> Sent: 10 November 2014 09:15
> To: [email protected]
> Cc: [email protected]
> Subject: RE: File channels creating many large files
>
> That value is in bytes. At 500k, you will likely end up with too many files.
> You should set it as high as you can.
>
> Thanks, Hari
>
> On Mon, Nov 10, 2014 at 1:05 AM, Needham, Guy <[email protected]> wrote:
>
> Hari, Jeff,
>
> thanks for your replies. It's Flume 1.5.0, I'll use the maxFileSize parameter
> to fix this. Is there any impact on channel optimisation from setting it to
> say 500000?
>
> Regards,
>
> Guy Needham | Data Discovery
> ________________________________
> From: Hari Shreedharan [mailto:[email protected]]
> Sent: 07 November 2014 17:59
> To: [email protected]
> Cc: [email protected]
> Subject: Re: File channels creating many large files
>
> Flume will leave at least 2 files per data directory. Once you have enough
> events to cause 2 files to be created, there will be at least 2 per dir. You
> can use maxFileSize parameter to control the size of these files.
>
> Thanks, Hari
>
> On Fri, Nov 7, 2014 at 10:25 AM, Jeff Lord <[email protected]> wrote:
>
> Guy,
>
> What version of flume is this?
>
> -Jeff
>
> On Fri, Nov 7, 2014 at 1:19 AM, Needham, Guy <[email protected]> wrote:
>
> Hi all,
>
> I have a configuration with a file channel configured such that:
>
> a1.channels.ch1.type = file
> a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
> a1.channels.ch1.dataDirs = /hadoop/user/flume/channels/data
> a1.channels.ch1.capacity = 100000
> a1.channels.ch1.transactionCapacity = 5000
>
> It's been running since October 28th with no issues, but when I looked today
> in /hadoop/user/flume/channels/data I saw that the file channel was building
> up large files which had been processed and not deleting them:
>
> [rdd@hadoop-kn-p2-m01 flume]$ ls -lh channels/data/
> total 1.6G
> -rw-r----- 1 rdd rdd 1.5G Oct 28 16:10 log-1
> -rw-r----- 1 rdd rdd   47 Oct 28 16:10 log-1.meta
> -rw-r----- 1 rdd rdd  72M Oct 31 16:28 log-2
> -rw-r----- 1 rdd rdd   47 Oct 31 16:29 log-2.meta
>
> It seems like for each day that data landed (we're still in testing so data
> not landing constantly) a data file has been created but not deleted when
> reading was completed.
>
> Is this expected behaviour? Is there a way to stop large files building up
> and still use the file channel?
>
> Regards,
>
> Guy Needham | Data Discovery
>
> --------------------------------------------------------------------
> Save Paper - Do you really need to print this e-mail?
> Visit www.virginmedia.com for more information, and more fun.
> This email and any attachments are or may be confidential and legally privileged
> and are sent solely for the attention of the addressee(s). If you have received this
> email in error, please delete it from your system: its use, disclosure or copying is
> unauthorised.
> Statements and opinions expressed in this email may not represent
> those of Virgin Media. Any representations or commitments in this email are
> subject to contract.
> Registered office: Media House, Bartley Wood Business Park, Hook, Hampshire,
> RG27 9UP
> Registered in England and Wales with number 2591237
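
For reference, the maxFileSize advice in this thread could be applied to the original configuration roughly as follows. This is only a sketch: the 1 GiB value is an illustrative assumption, not a figure recommended anywhere in the thread. The only guidance given is that the value is in bytes, that 500000 is likely to produce too many files, and that it should be set as high as you can. The Flume User Guide lists a default of 2146435071 bytes for this property, so any smaller value trades a smaller per-file footprint for more frequent file rolls.

```properties
# File channel from the original message, unchanged
a1.channels.ch1.type = file
a1.channels.ch1.checkpointDir = /hadoop/user/flume/channels/checkpoint
a1.channels.ch1.dataDirs = /hadoop/user/flume/channels/data
a1.channels.ch1.capacity = 100000
a1.channels.ch1.transactionCapacity = 5000

# Maximum size of a single data (log-N) file, in bytes.
# 1073741824 (1 GiB) is an assumed example value; pick one that suits
# your disk budget, bearing in mind that at least 2 files per data
# directory are kept once 2 have been created.
a1.channels.ch1.maxFileSize = 1073741824
```

With a setting like this, the steady-state disk usage per data directory would be bounded by roughly the number of retained files times maxFileSize, rather than by the two multi-gigabyte files seen in the listing above.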
