Hi Ari,
Thanks for your reply.
We did indeed run into some problems using CharFileTailingAdaptorUTF8.
Here is part of the tailFile method in FileTailingAdaptor:

    RandomAccessFile newReader = new RandomAccessFile(toWatch, "r");

    len = reader.length();
    long newLength = newReader.length();
    if (newLength < len && fileReadOffset >= len) {
      if (reader != null) {
        reader.close();
      }

      reader = newReader;
      fileReadOffset = 0L;
      log.debug("Adaptor|" + adaptorID + "| File size mismatched, rotating: "
          + toWatch.getAbsolutePath());

When the file tailing adaptor finds that the log file has rotated, reader is
reassigned to the new reader. Does this mean that the data in the old log file
that hasn't been sent yet is lost?
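
To make the question concrete, here is a minimal sketch of what I would expect
to happen before the swap: anything still unread through the old handle gets
read first. This is not Chukwa code. RotationDrain and drainRemaining are
hypothetical names, and the sketch assumes the old handle still refers to the
rotated-out file (as it does after a rename on POSIX systems) and that the
remainder is small enough to fit in one byte array.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not part of Chukwa: reads whatever is left in the old
// file handle so the caller can emit it before switching to the new file.
class RotationDrain {

  // Returns the bytes between fileReadOffset and the end of the old file.
  // Assumption: the remainder is smaller than 2 GB, so the int cast is safe.
  static byte[] drainRemaining(RandomAccessFile oldReader, long fileReadOffset)
      throws IOException {
    long len = oldReader.length();
    if (fileReadOffset >= len) {
      return new byte[0]; // nothing unread through the old handle
    }
    byte[] rest = new byte[(int) (len - fileReadOffset)];
    oldReader.seek(fileReadOffset);
    oldReader.readFully(rest);
    return rest;
  }
}
```

If drainRemaining returns an empty array, switching to newReader loses nothing
that was readable through the old handle. Whether data that was read but not
yet committed is also safe is a separate question.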
On Wed, May 16, 2012 at 2:51 PM, Ariel Rabkin <[email protected]> wrote:
> Rotation is a bit of a mess.
>
> We've tried a couple strategies to handle it, none of which are perfect.
> One approach is to have a modified logger that explicitly invokes
> chukwa, starting and stopping adaptors.
> The other is that the FileTailingAdaptors keep not only a physical
> "how long is the file" offset, but a logical "what is the byte number
> of the first byte of the file" -- the idea is that if the file
> rotates, the adaptor should add the length of the rotated-out section
> to the length of the current file.
>
> This is a bit fragile, since the adaptor has to guess which was the
> previously-rotated file. I believe we use timestamps for that. I
> suspect it won't always work.
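
As I understand the second approach, it could be sketched roughly like this.
This is only my reading of the description above, not the actual
FileTailingAdaptor code. LogicalOffset and its method names are hypothetical,
and the sketch assumes the adaptor has already worked out the length of the
rotated-out file, which is the fragile part.

```java
// Hypothetical illustration of tracking a logical offset across rotations.
class LogicalOffset {
  // Logical byte number of the first byte of the current file.
  private long offsetOfFirstByte = 0L;
  // Physical read position within the current file.
  private long fileReadOffset = 0L;

  // Called after reading bytesRead bytes from the current file.
  void advance(long bytesRead) {
    fileReadOffset += bytesRead;
  }

  // Called when rotation is detected; rotatedOutLength is the final length
  // of the file that was rotated out (assumed to be known here).
  void onRotate(long rotatedOutLength) {
    offsetOfFirstByte += rotatedOutLength;
    fileReadOffset = 0L;
  }

  // Position in the overall stream; never decreases across rotations.
  long logicalOffset() {
    return offsetOfFirstByte + fileReadOffset;
  }

  public static void main(String[] args) {
    LogicalOffset o = new LogicalOffset();
    o.advance(100L);   // read 100 bytes of the first file
    o.onRotate(100L);  // first file rotated out at length 100
    o.advance(30L);    // read 30 bytes of the new file
    System.out.println(o.logicalOffset()); // prints 130
  }
}
```
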
>
> --Ari
>
> On Tue, May 15, 2012 at 11:45 PM, IvyTang <[email protected]> wrote:
> > After reading the source code, I'm confused about the checkpoint file.
> >
> > The file tailer puts chunks into the memlimitqueue, and the httpsender
> > takes chunks from the memlimitqueue to send. After the httpsender has
> > reliably sent the chunks to the collector, reportCommit(Adaptor src,
> > long uuid) is called.
> >
> > In reportCommit(Adaptor src, long uuid), src is the adaptor and uuid is
> > the offset in the file up to which chunks have been sent. If uuid >
> > adaptor.offset, that means some chunks have been sent, so adaptor.offset
> > is set to uuid.
> >
> > This works fine when the log file is not rotating.
> >
> > But if the log file is rotating (I mean the way log4j does it: move the
> > file to *.1 and create a new file named *), adaptor.offset is the offset
> > of the chunks sent from the previous file, which is of course very big,
> > while uuid is the offset of the chunks sent from the current file, so
> > uuid is smaller than adaptor.offset.
> >
> > So the checkpoint file won't change.
> >
> > Chukwa keeps sending chunks to the collector, but if chukwa is
> > restarted, the checkpoint offset is larger than the log file size, so
> > the log file will be sent again.
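
The stall I mean can be reproduced with a minimal sketch of the commit rule as
I described it. CommitTracker is a hypothetical stand-in, not the Chukwa
class, and the offsets are made-up numbers.

```java
// Hypothetical stand-in for the adaptor offset kept for the checkpoint.
class CommitTracker {
  private long offset;

  CommitTracker(long initialOffset) {
    this.offset = initialOffset;
  }

  // Mirrors the rule described above: the stored offset only moves forward.
  // Returns true if the checkpointed offset changed.
  boolean reportCommit(long uuid) {
    if (uuid > offset) {
      offset = uuid;
      return true;
    }
    return false;
  }

  long offset() {
    return offset;
  }

  public static void main(String[] args) {
    // 5000 bytes were committed from the file before it rotated.
    CommitTracker t = new CommitTracker(5000L);
    // After rotation, commits are reported relative to the new, smaller file.
    System.out.println(t.reportCommit(100L)); // prints false
    System.out.println(t.reportCommit(200L)); // prints false
    System.out.println(t.offset());           // prints 5000
  }
}
```

The stored offset stays at the pre-rotation value until the new file grows
past it, so a restart in between would see a checkpoint larger than the file.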
> >
> >
> >
> > On Mon, May 14, 2012 at 7:01 PM, IvyTang <[email protected]> wrote:
> >>
> >> The gamelog size is 158023223, but the checkpoint file contains:
> >>
> >> ADD adaptor_2963225a90653a309cf779d4a1d815a3 = org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8 Gamelog 0 /var/log/dataproxy/gamelog 229406124
> >>
> >> The gamelog didn't rotate, I'm sure.
> >>
> >> But the offset in the checkpoint file is bigger than the file size, and
> >> chukwa logged:
> >> WARN Thread-2 FileTailingAdaptor - Adaptor|adaptor_2963225a90653a309cf779d4a1d815a3| file: /var/log/dataproxy/gamelog, has rotated and no detection - reset counters to 0L
> >> The agent then began to transfer the whole log file again.
> >>
> >> I'm just confused about why the agent generated an offset bigger than
> >> the log size when the gamelog did not rotate.
> >>
> >> The chukwa version is 0.4.0
> >>
> >> --
> >> Best regards,
> >>
> >> Ivy Tang
> >>
> >>
> >>
> >
> >
> >
> > --
> > Best regards,
> >
> > Ivy Tang
> >
> >
> >
>
>
>
> --
> Ari Rabkin [email protected]
> UC Berkeley Computer Science Department
>
--
Best regards,
Ivy Tang