Yes, I agree. I think no log collection component, such as a source in Flume or a producer in Kafka, has a marking feature that records the exact point up to which it has consumed a logfile, so that after a failure it can resume reading from that same point.
This is the main point where failures are difficult to ignore. Am I right?

On Mon, Oct 27, 2014 at 4:51 PM, Ahmed Vila <[email protected]> wrote:

> Hi,
>
> You can use a spillable channel, which stores events in memory and spills
> to disk once memory fills up.
> You can also use a file channel, but it is only as fast as your disk, and
> a separate disk (preferably an SSD) is suggested for it because of its
> high IO.
>
> But that will not solve the issue you might run into: if Flume fails for
> whatever reason, you'll never be able to continue from the exact point
> where it failed.
> Yes, the file channel preserves its state, so it will continue with
> whatever it has already received, but what about the time while it was
> down?
>
> If you cannot change anything about the application that produces the
> logs, then this has to be accepted as a trade-off.
>
>
> On Mon, Oct 27, 2014 at 12:09 PM, SaravanaKumar TR <[email protected]> wrote:
>
>> Yes, I understand the concerns with this use case.
>>
>> If so, we need to configure failover in this scenario. Can we have it
>> at the channel level or the sink level?
>>
>> Does Flume support configuring failover in case a channel fills up?
>>
>>
>>
>> On Mon, Oct 27, 2014 at 3:54 PM, Ahmed Vila <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> In fact, this is not a problem with Flume.
>>>
>>> No solution will function reliably for your use case, simply because all
>>> of them have to do some sort of tail -f or streaming on a file, and if
>>> they can't keep up with it (they mostly can't at high-speed entry
>>> points), they will drop some entries.
>>> Please be kind to yourself and plan for failures: if you need to
>>> restart Flume or any other solution, you'll face dropped entries that
>>> you won't be able to re-ingest easily, as in most cases you won't know
>>> which ones you've dropped.
>>>
>>>
>>> Regards,
>>> Ahmed
>>>
>>> On Mon, Oct 27, 2014 at 11:13 AM, SaravanaKumar TR <[email protected]> wrote:
>>>
>>>> Thanks for the comments, Ahmed.
>>>>
>>>> So from your comments, I take it that Flume doesn't have any reliable
>>>> source option for the use case I described.
>>>>
>>>> If Flume can't provide it, can you point me to any other log collector
>>>> solutions that I could consider for moving real-time data to HDFS?
>>>>
>>>>
>>>>
>>>> On Mon, Oct 27, 2014 at 3:37 PM, Ahmed Vila <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Then you're out of luck in my opinion, as there is no way other than
>>>>> tail -f.
>>>>> The problem with tail -f is that tail will not wait for the
>>>>> source/channel to keep up with it. If the channel is full, it will
>>>>> back off to the source, and then the source will just stop ingesting.
>>>>>
>>>>> There is a possibility to hack up the tail -f into another file and
>>>>> then custom-rotate that duplicate file.
>>>>> But I wouldn't recommend that.
>>>>>
>>>>> Just a side note: if you're operating a Java application (Tomcat or
>>>>> similar), you can create multiple output files via log4j.properties
>>>>> configuration without the application itself knowing anything about it.
>>>>>
>>>>> Regards,
>>>>> Ahmed
>>>>>
>>>>>
>>>>> On Mon, Oct 27, 2014 at 10:56 AM, SaravanaKumar TR <[email protected]> wrote:
>>>>>
>>>>>> Ahmed,
>>>>>>
>>>>>> In my case, the application renames the existing file to
>>>>>> <logfile>.yesterdaydate and creates a new file named <logfile> at 00:00.
>>>>>>
>>>>>> I can't change the log rotation policy of the application for now, so
>>>>>> I guess I should rule out the spooling directory source in my case.
>>>>>>
>>>>>> Can you suggest any options other than the spooling directory source?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On Mon, Oct 27, 2014 at 3:10 PM, Ahmed Vila <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> It all depends on how log rotation is done and how the application
>>>>>>> producing the log file handles log rotation.
>>>>>>> Most applications just reopen the log file when they receive a
>>>>>>> signal. For example, nginx reopens the log file when it receives the
>>>>>>> USR1 signal, but it doesn't stop the process. Some applications might
>>>>>>> restart as a result.
>>>>>>>
>>>>>>> If the application just reopens the log file, then you can change
>>>>>>> your log rotation policy to be per minute.
>>>>>>> The logrotate daemon won't handle that, so you'll have to make a cron
>>>>>>> job to do it.
>>>>>>> In that case, you would separate the finished-logs location from the
>>>>>>> live-log location so the spooling directory source doesn't freak out
>>>>>>> about the active log file being appended to.
>>>>>>>
>>>>>>> Anyway, the spooling directory source is the way to go, as it will
>>>>>>> leave log files in place, just renamed.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Ahmed
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 27, 2014 at 10:21 AM, SaravanaKumar TR <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am using Apache Flume 1.5.0. Quick setup explanation here.
>>>>>>>>
>>>>>>>> Source: exec, tail -F command for a logfile.
>>>>>>>>
>>>>>>>> Channel: file channel
>>>>>>>>
>>>>>>>> Sink: HDFS
>>>>>>>>
>>>>>>>> Use case: to move real-time data from a logfile to HDFS.
>>>>>>>>
>>>>>>>>
>>>>>>>> It appears that exec is not a reliable source, as we may have data
>>>>>>>> loss if the channel/source is down.
>>>>>>>>
>>>>>>>>
>>>>>>>> So I tried the other option, "spooling directory source", which is
>>>>>>>> described as a reliable source. But here I have a single logfile that
>>>>>>>> data gets appended to, so I don't see the option of moving the file
>>>>>>>> to the spool directory.
>>>>>>>>
>>>>>>>>
>>>>>>>> Can anyone help me with any other reliable source option for the
>>>>>>>> case where the logfile gets appended with data and logfile rotation
>>>>>>>> happens only at the end of the day?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Saravana
>>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> This e-mail and any attachment is for authorised use by the intended
>>>>>>> recipient(s) only. This email contains confidential information. It
>>>>>>> should not be copied, disclosed to, retained or used by, any party
>>>>>>> other than the intended recipient. Any unauthorised distribution,
>>>>>>> dissemination or copying of this E-mail or its attachments, and/or
>>>>>>> any use of any information contained in them, is strictly prohibited
>>>>>>> and may be illegal. If you are not an intended recipient then please
>>>>>>> promptly delete this e-mail and any attachment and all copies and
>>>>>>> inform the sender directly via email. Any emails that you send to us
>>>>>>> may be monitored by systems or persons other than the named
>>>>>>> communicant for the purposes of ascertaining whether the
>>>>>>> communication complies with the law and company policies.
>>>>>>
>>>>>
>>>>
>>>
>>
>
> --
>
> Best regards,
> Ahmed Vila | Senior software developer
> DevLogic | Sarajevo | Bosnia and Herzegovina
>
> Office : +387 33 942 123
> Mobile: +387 62 139 348
>
> Website: www.devlogic.eu
> E-mail : [email protected]
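
To make the "marking" idea from the top of this thread concrete, here is a minimal Python sketch of a reader that records how far it has consumed a logfile and resumes from that point after a restart. It is only an illustration, not a Flume or Kafka API: the function names read_new_lines and commit, the JSON position file, and the inode-based rotation check are all assumptions made for this example, and it assumes a POSIX filesystem. It also does not read any unread tail of a file that has already been renamed by rotation.

```python
import json
import os


def read_new_lines(log_path, pos_path):
    """Return (lines, new_pos) for complete lines added since the saved position."""
    # Load the last committed marker (inode + byte offset), if there is one.
    try:
        with open(pos_path) as f:
            pos = json.load(f)
    except (FileNotFoundError, ValueError):
        pos = {"inode": None, "offset": 0}

    with open(log_path, "rb") as log:
        st = os.fstat(log.fileno())
        # A different inode or a shorter file suggests rotation or truncation,
        # so this sketch restarts from the beginning of the current file.
        if pos["inode"] != st.st_ino or st.st_size < pos["offset"]:
            offset = 0
        else:
            offset = pos["offset"]
        log.seek(offset)
        data = log.read()

    # Hand out complete lines only; a partial last line waits for the next call.
    end = data.rfind(b"\n") + 1
    lines = [raw.decode("utf-8", errors="replace") for raw in data[:end].splitlines()]
    return lines, {"inode": st.st_ino, "offset": offset + end}


def commit(pos_path, new_pos):
    """Persist the marker atomically: write a temp file, then rename over the old one."""
    tmp = pos_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(new_pos, f)
    os.replace(tmp, pos_path)
```

A caller would loop: read, deliver the lines to the next hop, and only then call commit. A crash between delivery and commit then causes the same lines to be read again (duplicates) rather than lost.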
