[
https://issues.apache.org/jira/browse/FLUME-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528519#comment-17528519
]
Satoshi Iijima commented on FLUME-3320:
---------------------------------------
I've met the same problem recently. This is because inode, filesize and
lastModifiedTime are retrieved at different times.
To resolve this completely, it is necessary to retrieve these at the same time.
I created a [PR|https://github.com/apache/flume/pull/368] for this.
> Taildir source reset position bug
> ---------------------------------
>
> Key: FLUME-3320
> URL: https://issues.apache.org/jira/browse/FLUME-3320
> Project: Flume
> Issue Type: Bug
> Components: Master, Sinks+Sources
> Affects Versions: 1.7.0, 1.8.0
> Reporter: zzf
> Priority: Major
> Attachments: error.log
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> 1) background
> we use flume to transport logs from local file to kafka
> every night at 0 p.m , local file will be moved to another directory to be
> compressed as backup , after that , old file's inode will be released. The
> monitor directory will generate new log file with the same filename.Since
> flume monitor files with inode , never seen any conflict before.
> 2 ) problem
> 2019-02-23 23:59:34,157 (Log-BackgroundWorker-c1) [INFO -
> org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint(EventQueueBackingStoreFile.java:230)]
> Start checkpoint for /data/flume/filechannel/checkpoint/checkpoint, elements
> to sync = 3406
> 2019-02-23 23:59:34,167 (Log-BackgroundWorker-c1) [INFO -
> org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint(EventQueueBackingStoreFile.java:255)]
> Updating checkpoint metadata: logWriteOrderID: 1550778307223, queueSize: 0,
> queueHead: 1754854
> 2019-02-23 23:59:34,176 (Log-BackgroundWorker-c1) [INFO -
> org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:1065)] Updated
> checkpoint for file: /data/flume/filechannel/data/log-68 position: 149727224
> logWriteOrderID: 1550778307223
> *2019-02-24 00:00:01,550 (PollableSourceRunner-TaildirSource-r1) [INFO -
> org.apache.flume.source.taildir.ReliableTaildirEventReader.updateTailFiles(ReliableTaildirEventReader.java:257)]
> Pos 15927858851 is larger than file size! Restarting from pos 0, file:
> /data/logs/nginx/stm-collect.cn.miaozhen.com.log, inode: 3932190*
> *2019-02-24 00:00:01,550 (PollableSourceRunner-TaildirSource-r1) [INFO -
> org.apache.flume.source.taildir.TailFile.updatePos(TailFile.java:126)]
> Updated position, file: /data/logs/nginx/stm-collect.cn.miaozhen.com.log,
> inode: 3932190, pos: 0*
> 2019-02-24 00:00:04,177 (Log-BackgroundWorker-c1) [INFO -
> org.apache.flume.channel.file.EventQueueBackingStoreFile.beginCheckpoint(EventQueueBackingStoreFile.java:230)]
> Start checkpoint for /data/flume/filechannel/checkpoint/checkpoint, elements
> to sync = 33326
> 2019-02-24 00:00:04,189 (Log-BackgroundWorker-c1) [INFO -
> org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint(EventQueueBackingStoreFile.java:255)]
> Updating checkpoint metadata: logWriteOrderID: 1550778350086, queueSize:
> 24452, queueHead: 1763726
> 2019-02-24 00:00:04,213 (Log-BackgroundWorker-c1) [INFO -
> org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:1065)] Updated
> checkpoint for file: /data/flume/filechannel/data/log-68 position: 190948775
> logWriteOrderID: 1550778350086
> normally,flume will open the new file and get it's inode. if inode in
> Map<inode,taildir> ,and taildir position larger than file size. the position
> will be reset to 0.The system went well for a long time.
> but in this time,flume failed to get the new file's inode.In the log above
> ,you can see there was no 'Opening file ...' . And somehow the old log
> file's position was reset and all the data in 23,Feb sent double.
> 3)question
> Logically , flume relates the real file and object tailfile with inode ,even
> though I get the old file's inode,I still can not figure out why the
> tailfile.getPos() will be larger than f.length().
> has this happened before? should i blame this to java NIO?
--
This message was sent by Atlassian Jira
(v8.20.7#820007)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]