----- Original Message ----- > From: "Raghavendra Gowdappa" <rgowd...@redhat.com> > To: "George Lian (Nokia - CN/Hangzhou)" <george.l...@nokia.com> > Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>, "Bingxuan Zhang (Nokia > - CN/Hangzhou)" <bingxuan.zh...@nokia.com>, Gluster-devel@gluster.org, "Jan > Zizka (Nokia - CZ/Prague)" > <jan.zi...@nokia.com> > Sent: Tuesday, November 1, 2016 7:46:47 AM > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the > really size of the syslog file > > > > ----- Original Message ----- > > From: "George Lian (Nokia - CN/Hangzhou)" <george.l...@nokia.com> > > To: "Raghavendra Gowdappa" <rgowd...@redhat.com>, "Jan Zizka (Nokia - > > CZ/Prague)" <jan.zi...@nokia.com>, "Bingxuan > > Zhang (Nokia - CN/Hangzhou)" <bingxuan.zh...@nokia.com> > > Cc: "Pranith Kumar Karampuri" <pkara...@redhat.com>, > > "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>, > > Gluster-devel@gluster.org > > Sent: Tuesday, November 1, 2016 6:35:10 AM > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the > > really size of the syslog file > > > > Hi, Raghavendra, > > > > Thanks a lots for your update! > > > > >IIUC, the "tail issue" can happen if 'tail -f' reads a stat with st_size > > >lesser than previously read value (and hence the complaint - file > > >truncated). In this case, even though fstat at T2 doesn't account the > > >write > > >at T0, it doesn't prove that st_size of fstat at T2 is lesser than that at > > >any time before T2. > > > > I just mean the st_size of fstat maybe less than the previously read value > > in > > that time, and it will lead to the "tail truncated" issue. Do you agree > > with > > me? > > Yes. But, in your example there is only one fstat. For this to happen we need > atleast two fstats and the latest st_size should be less than the oldest > one. Am I missing anything here? > > > > > >As to the relative ordering of write at T0 and fstat at T2, POSIX leaves > > >it > > >undefined. Unless write and fstat happen from same > > >thread/single-threaded-application there is no requirement for maintaining > > >that order (If they are issued from same thread fstat should account write > > >at T0). Also note that it is not mentioned here fstat at T2 is issued > > >_after_ write at T0 is _complete_. If that is the case, mdc_writev_cbk > > >would've updated correct stat in cache and fstat would get correct value. > > >If it is not the case, then there is no well defined order here. > > > > >So, I don't think there is a bug here, unless I've missed out something. > > > > Do you mean the GlusterFS not conflict with the requirement, so that the > > application like "tail" should consider the case in network file system? > > No. Applications shouldn't do anything different to work on Glusterfs. > Otherwise its a bug :). What I am saying is that the issue with 'tail -f' > might be because of a different bug than the example you gave. In other > words, the RCA you posted may not be correct. It might be because of issues > with write-behind (and other xlators) as I posted in other mail. > > Priliminary testing by Pranith showed that Elasticsearch works fine with just > write-behind.
with patch http://review.gluster.org/15757 applied. > So, that's a progress. Will keep you posted with our efforts > on getting Elasticsearch working on Gluster. I've a feeling that, it will > solve your issue (tail -f) too. > > regards, > Raghavendra > > > > > @Jan & @Bingxuan, do you have some comments for the above information? > > > > > > Best Regards, > > George > > > > -----Original Message----- > > From: Raghavendra Gowdappa [mailto:rgowd...@redhat.com] > > Sent: Monday, October 31, 2016 6:35 PM > > To: Lian, George (Nokia - CN/Hangzhou) <george.l...@nokia.com> > > Cc: Pranith Kumar Karampuri <pkara...@redhat.com>; > > I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>; Zhang, Bingxuan (Nokia > > - > > CN/Hangzhou) <bingxuan.zh...@nokia.com>; Gluster-devel@gluster.org; Zizka, > > Jan (Nokia - CZ/Prague) <jan.zi...@nokia.com> > > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the > > really size of the syslog file > > > > > > > > ----- Original Message ----- > > > From: "George Lian (Nokia - CN/Hangzhou)" <george.l...@nokia.com> > > > To: "Pranith Kumar Karampuri" <pkara...@redhat.com>, "Raghavendra > > > Gowdappa" > > > <rgowd...@redhat.com> > > > Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>, "Bingxuan Zhang > > > (Nokia > > > - CN/Hangzhou)" <bingxuan.zh...@nokia.com>, Gluster-devel@gluster.org, > > > "Jan > > > Zizka (Nokia - CZ/Prague)" > > > <jan.zi...@nokia.com> > > > Sent: Monday, October 31, 2016 2:32:34 PM > > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than > > > the > > > really size of the syslog file > > > > > > Hi, > > > > > > I suppose there seems a defect on mdc_writev_cbk and mdc_fstat > > > Let’s assume in 2 timestamp which called write and fstat operation in > > > application: > > > T0: write (process a) > > > T1: read (process b) with the data of T0 of process a. > > > T2: fstat (process c) > > > In my view, mdc_write is non-block operation and have some lock to > > > protect > > > in > > > afr xlator, because mdc_fstat not check the lock in AFR xaltor, so > > > mdc_writev_cbk which called “mdc_inode_iatt_set_validate” maybe later > > > than > > > mdc_fstat. > > > Such like > > > T3: fstat result of T2 without the “mdc_inode_iatt_set_validate” of T0 > > > when > > > stat-prefetch options is on. > > > T4: “mdc_inode_iatt_set_validate” is called of T0 in mdc_writev_cbk. > > > > > > Lets’ assume T0<T1<T2<T3<T4, is the above assumption is reasonable case > > > when > > > in multi-process environment and the load of CPU is high? > > > If it is reasonable, then issue of “tail issue” will be happened. > > > > IIUC, the "tail issue" can happen if 'tail -f' reads a stat with st_size > > lesser than previously read value (and hence the complaint - file > > truncated). In this case, even though fstat at T2 doesn't account the write > > at T0, it doesn't prove that st_size of fstat at T2 is lesser than that at > > any time before T2. > > > > As to the relative ordering of write at T0 and fstat at T2, POSIX leaves it > > undefined. Unless write and fstat happen from same > > thread/single-threaded-application there is no requirement for maintaining > > that order (If they are issued from same thread fstat should account write > > at T0). Also note that it is not mentioned here fstat at T2 is issued > > _after_ write at T0 is _complete_. If that is the case, mdc_writev_cbk > > would've updated correct stat in cache and fstat would get correct value. > > If > > it is not the case, then there is no well defined order here. > > > > So, I don't think there is a bug here, unless I've missed out something. > > > > > > > > > > So maybe a fix suggestion is on mdc_fstat operation , we should add an > > > operation to check whether the writev operation is ongoing or not, if > > > write-operation is ongoing, should goto uncached label in mdc_fstat > > > function. > > > > > > Could you please confirm the above assumption and suggestion? > > > > > > > > > Thanks & Best Regards, > > > George > > > > > > > > > From: Lian, George (Nokia - CN/Hangzhou) > > > Sent: Monday, October 31, 2016 4:25 PM > > > To: Pranith Kumar Karampuri <pkara...@redhat.com>; Raghavendra Gowdappa > > > <rgowd...@redhat.com> > > > Cc: I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>; Zhang, Bingxuan > > > (Nokia > > > - > > > CN/Hangzhou) <bingxuan.zh...@nokia.com>; Gluster-devel@gluster.org; > > > Zizka, > > > Jan (Nokia - CZ/Prague) <jan.zi...@nokia.com> > > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than > > > the > > > really size of the syslog file > > > > > > Hi, > > > > > > How can we enable debug.trace so that we can inspect the debug data on > > > different xlator? > > > I just set “debug.trace on” and “debug.log-file yes” seems not work now. > > > > > > And one more update for this issue, if we set performance.stat-prefetch > > > to > > > off, the issue will not be occurred. (our previous test maybe not > > > correct☺ > > > ) > > > > > > Thanks & Best Regards, > > > George > > > > > > From: Pranith Kumar Karampuri [mailto:pkara...@redhat.com] > > > Sent: Friday, October 28, 2016 2:39 PM > > > To: Lian, George (Nokia - CN/Hangzhou) > > > <george.l...@nokia.com<mailto:george.l...@nokia.com>> > > > Cc: Raghavendra Gowdappa > > > <rgowd...@redhat.com<mailto:rgowd...@redhat.com>>; > > > I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>; > > > Zhang, Bingxuan (Nokia - CN/Hangzhou) > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>; > > > Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org>; Zizka, Jan > > > (Nokia - CZ/Prague) <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than > > > the > > > really size of the syslog file > > > > > > hi George, > > > It would help if we can identify the bare minimum xlators which > > > are > > > contributing to the issue like Raghavendra was mentioning earlier. > > > We > > > were wondering if it is possible for you to help us in identifying > > > the issue by running the workload on a modified setup? We can > > > suggest > > > testing out using custom volfiles so that we can slowly build the > > > graph which could be causing this issue. We would like you guys to > > > try out this problem with just posix-xlator and fuse and nothing > > > else. > > > > > > On Thu, Oct 27, 2016 at 1:40 PM, Lian, George (Nokia - CN/Hangzhou) > > > <george.l...@nokia.com<mailto:george.l...@nokia.com>> wrote: > > > Hi, Raghavendra, > > > > > > Could you please give some suggestion for this issue? we try to find the > > > clue > > > for this issue for a long time, but it has no progress:( > > > > > > Thanks & Best Regards, > > > George > > > > > > -----Original Message----- > > > From: Lian, George (Nokia - CN/Hangzhou) > > > Sent: Wednesday, October 19, 2016 4:40 PM > > > To: 'Raghavendra Gowdappa' > > > <rgowd...@redhat.com<mailto:rgowd...@redhat.com>> > > > Cc: Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org>; > > > I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>; > > > Zhang, Bingxuan (Nokia - CN/Hangzhou) > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>; Zizka, Jan > > > (Nokia - CZ/Prague) <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than > > > the > > > really size of the syslog file > > > > > > Hi, Raghavendra > > > > > > Just now, we test it with glusterfs log with debug-level "TRACE", and let > > > some application trigger "glusterfs" produce large log, in that case, > > > when > > > we set write-behind and stat-prefetch both OFF, > > > Tail the glusterfs log such like mnt-{VOLUME-NAME}.log, it still failed > > > with > > > "file truncated", > > > > > > So that means if file's IO in huge amount, the issue will still be there > > > even > > > write-behind and stat-prefetch both OFF. > > > > > > Best Regards, > > > George > > > > > > -----Original Message----- > > > From: Raghavendra Gowdappa > > > [mailto:rgowd...@redhat.com<mailto:rgowd...@redhat.com>] > > > Sent: Wednesday, October 19, 2016 2:54 PM > > > To: Lian, George (Nokia - CN/Hangzhou) > > > <george.l...@nokia.com<mailto:george.l...@nokia.com>> > > > Cc: Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org>; > > > I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>; > > > Zhang, Bingxuan (Nokia - CN/Hangzhou) > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>; Zizka, Jan > > > (Nokia - CZ/Prague) <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than > > > the > > > really size of the syslog file > > > > > > > > > > > > ----- Original Message ----- > > > > From: "George Lian (Nokia - CN/Hangzhou)" > > > > <george.l...@nokia.com<mailto:george.l...@nokia.com>> > > > > To: "Raghavendra Gowdappa" > > > > <rgowd...@redhat.com<mailto:rgowd...@redhat.com>> > > > > Cc: Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org>, > > > > "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" > > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>, > > > > "Bingxuan Zhang (Nokia - CN/Hangzhou)" > > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>, "Jan Zizka > > > > (Nokia - CZ/Prague)" <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > > Sent: Wednesday, October 19, 2016 12:05:01 PM > > > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less than > > > > the > > > > really size of the syslog file > > > > > > > > Hi, Raghavendra, > > > > > > > > Thanks a lots for your quickly update! > > > > In my case, there are so many process(write) is writing to the syslog > > > > file, > > > > it do involve the writer is in the same host and writing in same mount > > > > point > > > > while the tail(reader) is reading it. > > > > > > > > The bug I just guess is: > > > > When a writer write the data with write-behind, it call the call-back > > > > function " mdc_writev_cbk" and called "mdc_inode_iatt_set_validate" to > > > > validate the "iatt" data, but with the code I mentioned last mail, it > > > > do > > > > nothing. > > > > > > mdc_inode_iatt_set_validate has following code > > > > > > <snippet> > > > if (!iatt || !iatt->ia_ctime) { > > > mdc->ia_time = 0; > > > goto unlock; > > > } > > > </snippet> > > > > > > Which means a NULL iatt sets mdc->ia_time to 0. This results in > > > subsequent > > > lookup/stat calls to be NOT served from md-cache. Instead, the stat is > > > served from backend bricks. So, I don't see an issue here. > > > > > > However, one case where a NULL iatt is different from a valid iatt (which > > > differs from the value stored in md-cache) is that the latter results in > > > a > > > call to inode_invalidate. This invalidation propagates to kernel and all > > > dentry and page cache corresponding to file is purged. So, I am > > > suspecting > > > whether the stale stat you saw was served from kernel cache (not from > > > glusterfs). If this is the case, having mount options > > > "attribute-timeout=0" > > > and "entry-timeout=0" should've helped. > > > > > > I am still at loss to point out the RCA for this issue. > > > > > > > > > > And in same time, the reader(tail) read the "iatt" data, but in case of > > > > the > > > > cache-time is not timeout, it will return the "iatt" data without the > > > > last > > > > change. > > > > > > > > Do your think it is a possible bug? > > > > > > > > Thanks & Best Regards, > > > > George > > > > > > > > -----Original Message----- > > > > From: Raghavendra Gowdappa > > > > [mailto:rgowd...@redhat.com<mailto:rgowd...@redhat.com>] > > > > Sent: Wednesday, October 19, 2016 2:06 PM > > > > To: Lian, George (Nokia - CN/Hangzhou) > > > > <george.l...@nokia.com<mailto:george.l...@nokia.com>> > > > > Cc: Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org>; > > > > I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS > > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>; > > > > Zhang, Bingxuan (Nokia - > > > > CN/Hangzhou) > > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>; > > > > Zizka, Jan (Nokia - CZ/Prague) > > > > <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > > Subject: Re: [Gluster-devel] Issue about the size of fstat is less than > > > > the > > > > really size of the syslog file > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > From: "George Lian (Nokia - CN/Hangzhou)" > > > > > <george.l...@nokia.com<mailto:george.l...@nokia.com>> > > > > > To: "Raghavendra Gowdappa" > > > > > <rgowd...@redhat.com<mailto:rgowd...@redhat.com>> > > > > > Cc: Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org>, > > > > > "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" > > > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>, > > > > > "Bingxuan Zhang (Nokia > > > > > - CN/Hangzhou)" > > > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>, "Jan > > > > > Zizka > > > > > (Nokia - CZ/Prague)" > > > > > <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > > > Sent: Wednesday, October 19, 2016 10:51:24 AM > > > > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less > > > > > than > > > > > the > > > > > really size of the syslog file > > > > > > > > > > Hi, Raghavendra, > > > > > > > > > > When we disable md-cache(gluster volume set log > > > > > performance.md-cache-timeout > > > > > 0), the issue seems gone. > > > > > (we can't disable with " gluster volume set log performance.md-cache > > > > > off" > > > > > why?) > > > > > > > > Please use > > > > #gluster volume set log performance.stat-prefetch off > > > > > > > > > > > > > > So I double confuse that the code I abstract in last mail maybe have > > > > > some > > > > > issue for this case. > > > > > Could you please share your comments? > > > > > > > > Please find my comments below. > > > > > > > > > > > > > > Thanks & Best Regards, > > > > > George > > > > > > > > > > -----Original Message----- > > > > > From: Lian, George (Nokia - CN/Hangzhou) > > > > > Sent: Friday, October 14, 2016 1:44 PM > > > > > To: 'Raghavendra Gowdappa' > > > > > <rgowd...@redhat.com<mailto:rgowd...@redhat.com>> > > > > > Cc: Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org>; > > > > > I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS > > > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>; > > > > > Zhang, Bingxuan (Nokia > > > > > - > > > > > CN/Hangzhou) > > > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>; > > > > > Zizka, Jan (Nokia - CZ/Prague) > > > > > <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > > > Subject: RE: [Gluster-devel] Issue about the size of fstat is less > > > > > than > > > > > the > > > > > really size of the syslog file > > > > > > > > > > Hi, Raghavendra, > > > > > > > > > > Our version of GlusterFS is 3.6.9, and I also check the newest code > > > > > of > > > > > main > > > > > branch, the function of " mdc_inode_iatt_set_validate" is almost > > > > > same, > > > > > from > > > > > the following code of this function, > > > > > We could see a "TODO" comments inline, does it mean if we enhance > > > > > write-behind feature, the "iatt" field in callback will be NULL, so > > > > > that > > > > > inode_invalidate will not be called? So the size of file will not > > > > > update > > > > > since "write behind" enabled ? > > > > > Is it the root cause for "tail" application failed with "file > > > > > truncated" > > > > > issue ? > > > > > > > > > > LOCK (&mdc->lock); > > > > > { > > > > > if (!iatt || !iatt->ia_ctime) { > > > > > mdc->ia_time = 0; > > > > > goto unlock; > > > > > } > > > > > > > > > > /* > > > > > * Invalidate the inode if the mtime or ctime has changed > > > > > * and the prebuf doesn't match the value we have cached. > > > > > * TODO: writev returns with a NULL iatt due to > > > > > * performance/write-behind, causing invalidation on > > > > > writes. > > > > > */ > > > > > > > > The issue explained in this comment is hit only when writes are done. > > > > But, > > > > in > > > > your use-case only "tail" is the application running on the mount (If I > > > > am > > > > not wrong, the writer is running on a different mountpoint). So, I > > > > doubt > > > > you are hitting this issue. But, you are saying that the issue goes > > > > away > > > > when write-behind/md-cache is turned off pointing to some interaction > > > > between md-cache and write-behind causing the issue. I need more time > > > > to > > > > look into this issue. Can you file a bug on this? > > > > > > > > > if (IA_ISREG(inode->ia_type) && > > > > > ((iatt->ia_mtime != mdc->md_mtime) || > > > > > (iatt->ia_ctime != mdc->md_ctime))) > > > > > if (!prebuf || (prebuf->ia_ctime != > > > > > mdc->md_ctime) > > > > > || > > > > > (prebuf->ia_mtime != mdc->md_mtime)) > > > > > inode_invalidate(inode); > > > > > > > > > > mdc_from_iatt (mdc, iatt); > > > > > > > > > > time (&mdc->ia_time); > > > > > } > > > > > > > > > > Best Regards, > > > > > George > > > > > -----Original Message----- > > > > > From: Raghavendra Gowdappa > > > > > [mailto:rgowd...@redhat.com<mailto:rgowd...@redhat.com>] > > > > > Sent: Thursday, October 13, 2016 8:58 PM > > > > > To: Lian, George (Nokia - CN/Hangzhou) > > > > > <george.l...@nokia.com<mailto:george.l...@nokia.com>> > > > > > Cc: Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org>; > > > > > I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS > > > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>; > > > > > Zhang, Bingxuan (Nokia > > > > > - > > > > > CN/Hangzhou) > > > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>; > > > > > Zizka, Jan (Nokia - CZ/Prague) > > > > > <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > > > Subject: Re: [Gluster-devel] Issue about the size of fstat is less > > > > > than > > > > > the > > > > > really size of the syslog file > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > From: "George Lian (Nokia - CN/Hangzhou)" > > > > > > <george.l...@nokia.com<mailto:george.l...@nokia.com>> > > > > > > To: Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org> > > > > > > Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" > > > > > > <i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com<mailto:i_ext_mbb_wcdma_swd3_da1_mat...@internal.nsn.com>>, > > > > > > "Bingxuan Zhang > > > > > > (Nokia > > > > > > - CN/Hangzhou)" > > > > > > <bingxuan.zh...@nokia.com<mailto:bingxuan.zh...@nokia.com>>, "Jan > > > > > > Zizka (Nokia - > > > > > > CZ/Prague)" > > > > > > <jan.zi...@nokia.com<mailto:jan.zi...@nokia.com>> > > > > > > Sent: Thursday, October 13, 2016 2:33:53 PM > > > > > > Subject: [Gluster-devel] Issue about the size of fstat is less than > > > > > > the > > > > > > really size of the syslog file > > > > > > > > > > > > Hi, Dear Expert, > > > > > > We have use glusterfs as a network filesystem, and syslog store in > > > > > > there, > > > > > > some clients on different host may write the syslog file via > > > > > > “glusterfs” > > > > > > mount point. > > > > > > Now we encounter an issue when we “tail” the syslog file, it will > > > > > > occasional > > > > > > failed with error “ file truncated ” > > > > > > As we study and trace with the “tail” source code, it failed with > > > > > > the > > > > > > following code: > > > > > > if ( S_ISREG (mode) && stats.st_size < f[i].size ) > > > > > > { > > > > > > error (0, 0, _("%s: file truncated"), quotef (name)); > > > > > > /* Assume the file was truncated to 0, > > > > > > and therefore output all "new" data. */ > > > > > > xlseek (fd, 0, SEEK_SET, name); > > > > > > f[i].size = 0; > > > > > > } > > > > > > When stats.st_size < f[i].size, what mean the size report by fstat > > > > > > is > > > > > > less > > > > > > than “tail” had read, it lead to “file truncated”, we also use > > > > > > “strace” > > > > > > tools to trace the tail application, the related tail strace log as > > > > > > the > > > > > > below: > > > > > > nanosleep({1, 0}, NULL) = 0 > > > > > > fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0 > > > > > > nanosleep({1, 0}, NULL) = 0 > > > > > > fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0 > > > > > > nanosleep({1, 0}, NULL) = 0 > > > > > > fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0 > > > > > > nanosleep({1, 0}, NULL) = 0 > > > > > > fstat(3, {st_mode=S_IFREG|0644, st_size=192544549, ...}) = 0 > > > > > > read(3, " Data … -"..., 8192) = 1444 > > > > > > read(3, " Data.. "..., 8192) = 720 > > > > > > read(3, "", 8192) = 0 > > > > > > fstat(3, {st_mode=S_IFREG|0644, st_size=192544789, ...}) = 0 > > > > > > write(1, “DATA…..” ) = 2164 > > > > > > write(2, "tail: ", 6tail: ) = 6 > > > > > > write(2, "/mnt/log/master/syslog: file tru"..., > > > > > > 38/mnt/log/master/syslog: > > > > > > file truncated) = 38 > > > > > > as the above strace log, tail has read 1444+720=2164 bytes, > > > > > > but fstat tell “tail” 192544789 – 192543105 = 1664 which less than > > > > > > 2164, > > > > > > so > > > > > > it lead to “tail” application “file truncated”. > > > > > > And if we turn off “write-behind” feature, the issue will not be > > > > > > reproduced > > > > > > any more. > > > > > > > > > > That seems strange. There are no writes happening on the fd/inode > > > > > through > > > > > which tail is reading/stating from. So, it seems strange that > > > > > write-behind > > > > > is involved here. I suspect whether any of > > > > > md-cache/read-ahead/io-cache > > > > > is > > > > > causing the issue. Can you, > > > > > > > > > > 1. Turn off md-cache, read-ahead, io-cache xlators > > > > > 2. mount glusterfs with --attribute-timeout=0 > > > > > 3. set write-behind on > > > > > > > > > > and rerun the tests? If you don't hit the issue, you can experiment > > > > > by > > > > > turning on/off of md-cache, read-ahead and io-cache translators and > > > > > see > > > > > what > > > > > are the minimal number of xlators that need to be turned off to not > > > > > hit > > > > > the > > > > > issue (with write-behind on)? > > > > > > > > > > regards, > > > > > Raghavendra > > > > > > > > > > > So we think it may be related to cache consistence issue due to > > > > > > performance > > > > > > consider, but we still have concern that: > > > > > > The syslog file is used only with “Append” mode, so the size of > > > > > > file > > > > > > shouldn’t be reduced, when a client read the file, why “fstat” > > > > > > can’t > > > > > > return > > > > > > the really size match to the cache? > > > > > > From current investigation, we doubt that the current implement of > > > > > > “glusterfs” has a bug on “fstat” when cache is on. > > > > > > Your comments is our highly appreciated! > > > > > > Thanks & Best Regards > > > > > > George > > > > > > > > > > > > _______________________________________________ > > > > > > Gluster-devel mailing list > > > > > > Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org> > > > > > > http://www.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > > > > _______________________________________________ > > > Gluster-devel mailing list > > > Gluster-devel@gluster.org<mailto:Gluster-devel@gluster.org> > > > http://www.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > > > > -- > > > Pranith > > > > > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel@gluster.org > http://www.gluster.org/mailman/listinfo/gluster-devel _______________________________________________ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel