Yes, I understand the concerns with this use case. If so, and we need to configure failover in this scenario, can we have it at the channel level or the sink level?
Does Flume support configuring failover in case the channel fills up?

On Mon, Oct 27, 2014 at 3:54 PM, Ahmed Vila <[email protected]> wrote:

> Hi,
>
> In fact, this is not a problem with Flume.
>
> No solution will function reliably for your use case, simply because all
> of them will have to do some sort of tail -f or streaming on a file, and if
> they can't keep up with it (they mostly can't at high-speed entry points),
> they will drop some entries.
> Please, be kind to yourself and plan for failures: if you need to restart
> Flume or any other solution, you'll face dropped entries that you won't
> be able to re-ingest easily, as in most cases you won't know which ones
> you've dropped.
>
>
> Regards,
> Ahmed
>
> On Mon, Oct 27, 2014 at 11:13 AM, SaravanaKumar TR <[email protected]>
> wrote:
>
>> Thanks for the comments, Ahmed.
>>
>> So from your comments, I take it that Flume doesn't have any reliable
>> source option for the use case I described.
>>
>> If Flume can't provide it, can you suggest any other log collector
>> solutions that I could consider here to move real-time data to HDFS?
>>
>>
>>
>> On Mon, Oct 27, 2014 at 3:37 PM, Ahmed Vila <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Then you're out of luck, in my opinion, as there is no way other than
>>> tail -f.
>>> The problem with tail -f is that tail will not wait for the source/channel
>>> to keep up with it. If the channel is full, it will back off to the source
>>> and then the source will just stop ingesting.
>>>
>>> There is a possibility to hack up the tail -f into another file and then
>>> custom-rotate that duplicate file.
>>> But I wouldn't recommend that.
>>>
>>> Just a side note: if you're operating a Java application (Tomcat or
>>> similar), then you can create multiple output files via log4j.properties
>>> configuration without the application itself knowing anything about it.
>>>
>>> Regards,
>>> Ahmed
>>>
>>>
>>> On Mon, Oct 27, 2014 at 10:56 AM, SaravanaKumar TR <
>>> [email protected]> wrote:
>>>
>>>> Ahmed,
>>>>
>>>> Here in my case, the application will rename the existing file to
>>>> <logfile>.yesterdaydate and create a new file as <logfile> at 00:00.
>>>>
>>>> I can't change the log rotation policy of the application for now. So I
>>>> guess I should rule out the option of using the spooling directory
>>>> source in my case.
>>>>
>>>> Can you suggest any options other than the spooling directory source?
>>>>
>>>> Thanks,
>>>>
>>>> On Mon, Oct 27, 2014 at 3:10 PM, Ahmed Vila <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> It all depends on how log rotation is done and how the application
>>>>> producing the log file handles log rotation.
>>>>> Most applications just reopen the log file when they receive a
>>>>> kill signal. For example, nginx reopens the log file when it receives
>>>>> the USR1 signal, but it doesn't stop the process. Some applications
>>>>> might restart as a result.
>>>>>
>>>>> If the application just reopens the log file, then you can change your
>>>>> log rotation policy to be per minute.
>>>>> The logrotate daemon won't handle that case, so you'll have
>>>>> to make a cron job to do it.
>>>>> In that setup, you would separate the finished logs location from the
>>>>> live log location so the spooling directory source doesn't freak out
>>>>> about the active log file being appended to.
>>>>>
>>>>> Anyway, the spooling directory source is the way to go, as it will
>>>>> leave log files in place, just renamed.
>>>>>
>>>>> Regards,
>>>>> Ahmed
>>>>>
>>>>>
>>>>> On Mon, Oct 27, 2014 at 10:21 AM, SaravanaKumar TR <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am using Apache Flume 1.5.0. Quick setup explanation here.
>>>>>>
>>>>>> Source: exec, tail -F command for a logfile.
>>>>>>
>>>>>> Channel: file channel
>>>>>>
>>>>>> Sink: HDFS
>>>>>>
>>>>>> Use case: to move real-time data from a logfile to HDFS.
>>>>>>
>>>>>>
>>>>>> It appears that exec is not a reliable source, as we may have data
>>>>>> loss if the channel/source is down.
>>>>>>
>>>>>>
>>>>>> So I tried the other option, "spooling directory source", which is
>>>>>> mentioned as a reliable source. But here I have a single logfile that
>>>>>> data gets appended to, so I don't see an option of moving the file to
>>>>>> the spool directory.
>>>>>>
>>>>>>
>>>>>> Can anyone help me by providing any other reliable source option for
>>>>>> the case where the logfile gets appended with data and logfile rotation
>>>>>> happens only at the end of the day?
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Saravana
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> This e-mail and any attachment is for authorised use by the intended
>>>>> recipient(s) only. This email contains confidential information. It should
>>>>> not be copied, disclosed to, retained or used by, any party other than the
>>>>> intended recipient. Any unauthorised distribution, dissemination or
>>>>> copying of this E-mail or its attachments, and/or any use of any
>>>>> information contained in them, is strictly prohibited and may be illegal.
>>>>> If you are not an intended recipient then please promptly delete this
>>>>> e-mail and any attachment and all copies and inform the sender directly
>>>>> via email. Any emails that you send to us may be monitored by systems or
>>>>> persons other than the named communicant for the purposes of ascertaining
>>>>> whether the communication complies with the law and company policies.
>>>>
>>>
>>
>
