Rotation is a bit of a mess. We've tried a couple of strategies to handle it, neither of which is perfect. One approach is to have a modified logger that explicitly invokes Chukwa, starting and stopping adaptors. The other is for the FileTailingAdaptors to keep not only a physical "how long is the file" offset, but also a logical "what is the byte number of the first byte of the file" offset. The idea is that if the file rotates, the adaptor should add the length of the rotated-out section to the length of the current file.
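
The logical-offset idea can be sketched roughly as below. This is only an illustration of the bookkeeping, not Chukwa's actual FileTailingAdaptor code; the class name `LogicalOffsetTracker`, its field, and its methods are all made up for the example, and it assumes the caller already knows a rotation happened and how long the rotated-out file was.

```java
// Hypothetical sketch: not Chukwa's actual FileTailingAdaptor code.
final class LogicalOffsetTracker {
    // Logical byte number of the first byte of the current file (assumed name).
    private long fileStartOffset = 0;

    // Called when a rotation is detected; rotatedOutLength is the size in
    // bytes of the file that was just rotated out.
    void onRotation(long rotatedOutLength) {
        fileStartOffset += rotatedOutLength;
    }

    // Converts a physical position in the current file into a logical offset
    // that keeps growing across rotations.
    long logicalOffset(long physicalOffset) {
        return fileStartOffset + physicalOffset;
    }

    public static void main(String[] args) {
        LogicalOffsetTracker t = new LogicalOffsetTracker();
        System.out.println(t.logicalOffset(1000)); // before rotation: 1000
        t.onRotation(1000);                        // a 1000-byte file is rotated out
        System.out.println(t.logicalOffset(200));  // 200 bytes into the new file: 1200
    }
}
```

The sketch leaves out the hard part, which is working out which file was rotated out and how long it was.
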
This is a bit fragile, since the adaptor has to guess which file was the previously rotated one. I believe we use timestamps for that. I suspect it won't always work.

--Ari

On Tue, May 15, 2012 at 11:45 PM, IvyTang <[email protected]> wrote:
> After reading the source code, I'm confused about the checkpoint file.
>
> The file tailer generates the chunks into the memlimitqueue, and the
> httpsender gets the chunks to send from the memlimitqueue. After the
> httpsender sends the chunks to the collector reliably, reportCommit(Adaptor
> src, long uuid) is called.
>
> In this reportCommit(Adaptor src, long uuid) method, src is the
> adaptor and uuid is the offset in the file of the chunks that have been
> sent. If uuid > adaptor.offset, that means some chunks have been
> sent, so adaptor.offset is set to uuid.
>
> This works fine when the log file is not rotating.
>
> But if the log file is rotating (I mean the way log4j does it: move this
> file to *.1 and generate a new file named *), adaptor.offset is the offset
> of the chunks sent from the last file, which is of course very big, but uuid
> is the offset of the chunks sent from this file, so uuid is smaller than
> adaptor.offset.
>
> So the checkpoint file won't change.
>
> Even though chukwa is still sending chunks to the collector, if chukwa
> is restarted, the checkpoint is larger than the log file size, and the log
> file will be sent again.
>
> On Mon, May 14, 2012 at 7:01 PM, IvyTang <[email protected]> wrote:
>>
>> The gamelog size is 158023223, but the checkpoint file is
>>
>> ADD adaptor_2963225a90653a309cf779d4a1d815a3 =
>> org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
>> Gamelog 0 /var/log/dataproxy/gamelog 229406124
>>
>> The gamelog didn't rotate, I'm sure.
>>
>> But the offset in the checkpoint file is bigger than the file size, so chukwa logs
>>
>> WARN Thread-2 FileTailingAdaptor -
>> Adaptor|adaptor_2963225a90653a309cf779d4a1d815a3| file:
>> /var/log/dataproxy/gamelog, has rotated and no detection - reset counters to
>> 0L
>>
>> and the agent began to transfer the whole log file.
>>
>> I just feel confused about why the agent generates an offset that is bigger
>> than the log size when the gamelog did not rotate.
>>
>> The chukwa version is 0.4.0
>>
>> --
>> Best regards,
>>
>> Ivy Tang
>
> --
> Best regards,
>
> Ivy Tang

--
Ari Rabkin [email protected]
UC Berkeley Computer Science Department
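
For reference, the commit behavior described in the quoted message can be modeled as below. This is a simplified, hypothetical model written from that description and from the WARN line, not Chukwa's actual checkpoint code. The class `ForwardOnlyCheckpoint` and the helper `resumePosition` are invented names, `reportCommit` is reduced to a single argument, and the numbers are borrowed from the thread purely for illustration.

```java
// Simplified model of the behavior described in the quoted message;
// hypothetical, not Chukwa's actual checkpoint code.
final class ForwardOnlyCheckpoint {
    private long offset = 0;

    // Mirrors the described rule: the checkpoint only ever moves forward.
    void reportCommit(long uuid) {
        if (uuid > offset) {
            offset = uuid;
        }
    }

    long offset() {
        return offset;
    }

    // Assumed restart rule, based on the WARN message in the thread: a
    // checkpoint past the end of the file is treated as an undetected
    // rotation and tailing restarts from 0.
    static long resumePosition(long checkpoint, long fileLength) {
        return checkpoint > fileLength ? 0L : checkpoint;
    }

    public static void main(String[] args) {
        ForwardOnlyCheckpoint cp = new ForwardOnlyCheckpoint();
        cp.reportCommit(229406124L); // committed while tailing the old file
        cp.reportCommit(1000L);      // after rotation, offsets restart small
        cp.reportCommit(2000L);      // still smaller, so still ignored
        System.out.println(cp.offset());                             // 229406124
        System.out.println(resumePosition(cp.offset(), 158023223L)); // 0
    }
}
```

Under these assumptions the checkpoint stays at the old file's offset after a rotation, and a restart then resends the current file from the beginning.
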
