On Wed, Jan 24, 2018 at 2:24 PM, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:
> hi,
>       In the same commit you mentioned earlier, there was this code earlier:
>
> -/* Returns 1 if the stat seems to be filled with zeroes. */
> -int
> -nfs_zero_filled_stat (struct iatt *buf)
> -{
> -        if (!buf)
> -                return 1;
> -
> -        /* Do not use st_dev because it is transformed to store the xlator id
> -         * in place of the device number. Do not use st_ino because by this
> -         * time we've already mapped the root ino to 1 so it is not
> -         * guaranteed to be 0.
> -         */
> -        if ((buf->ia_nlink == 0) && (buf->ia_ctime == 0))
> -                return 1;
> -
> -        return 0;
> -}
>
> I moved this to a common library function that can be used in afr as well.
> Why was it there in NFS? +Niels for answering that question.
>
> If I give you a patch which will assert the error condition, would it be
> possible for you to figure out the first xlator which is unwinding the iatt
> with nlink count as zero but ctime as non-zero?

Hey, I went through the code. I think the gf_fuse_stat2attr() function could be
sending both ctime and nlink count as zero to the kernel, like you were
mentioning. So I guess we should wait for Niels' answer about the need for
marking the nlink count as zero. We may need to patch the fuse code differently
and set the entry/attr valid seconds/nseconds to zero, so that we get a lookup
on the entry.

> On Wed, Jan 24, 2018 at 1:03 PM, Lian, George (NSB - CN/Hangzhou) <
> george.l...@nokia-sbell.com> wrote:
>
>> Hi, Pranith Kumar,
>>
>> Can you tell me why buf->ia_nlink needs to be set to "0" in the function
>> gf_zero_fill_stat()? Which API or application will care about it?
>>
>> If I remove this line and also update the corresponding check in
>> gf_is_zero_filled_stat(), the issue seems gone, but I can't confirm
>> whether it will lead to other issues.
>>
>> So could you please double-check it and give your comments?
>> My change is as below:
>>
>> gf_boolean_t
>> gf_is_zero_filled_stat (struct iatt *buf)
>> {
>>         if (!buf)
>>                 return 1;
>>
>>         /* Do not use st_dev because it is transformed to store the xlator id
>>          * in place of the device number. Do not use st_ino because by this
>>          * time we've already mapped the root ino to 1 so it is not
>>          * guaranteed to be 0.
>>          */
>>         // if ((buf->ia_nlink == 0) && (buf->ia_ctime == 0))
>>         if (buf->ia_ctime == 0)
>>                 return 1;
>>
>>         return 0;
>> }
>>
>> void
>> gf_zero_fill_stat (struct iatt *buf)
>> {
>>         // buf->ia_nlink = 0;
>>         buf->ia_ctime = 0;
>> }
>>
>> Thanks & Best Regards
>> George
>>
>> *From:* Lian, George (NSB - CN/Hangzhou)
>> *Sent:* Friday, January 19, 2018 10:03 AM
>> *To:* Pranith Kumar Karampuri <pkara...@redhat.com>; Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>
>> *Cc:* Li, Deqian (NSB - CN/Hangzhou) <deqian...@nokia-sbell.com>; Gluster-devel@gluster.org; Sun, Ping (NSB - CN/Hangzhou) <ping....@nokia-sbell.com>
>> *Subject:* RE: [Gluster-devel] a link issue maybe introduced in a bug fix "Don't let NFS cache stat after writes"
>>
>> Hi,
>>
>> >>> Cool, this works for me too. Send me a mail off-list once you are
>> >>> available and we can figure out a way to get into a call and work on this.
>>
>> Have you reproduced the issue per the steps I listed in
>> https://bugzilla.redhat.com/show_bug.cgi?id=1531457 and in my last mail?
>>
>> If not, I would like you to try it yourself; the only difference between
>> your setup and mine is creating 2 bricks instead of 6 bricks.
>>
>> And Cynthia can have a session with you if needed, since I am not
>> available next Monday and Tuesday.
>> Thanks & Best Regards,
>> George
>>
>> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@gluster.org] *On Behalf Of *Pranith Kumar Karampuri
>> *Sent:* Thursday, January 18, 2018 6:03 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.l...@nokia-sbell.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>; Li, Deqian (NSB - CN/Hangzhou) <deqian...@nokia-sbell.com>; Gluster-devel@gluster.org; Sun, Ping (NSB - CN/Hangzhou) <ping....@nokia-sbell.com>
>> *Subject:* RE: [Gluster-devel] a link issue maybe introduced in a bug fix "Don't let NFS cache stat after writes"
>>
>> On Thu, Jan 18, 2018 at 12:17 PM, Lian, George (NSB - CN/Hangzhou) <
>> george.l...@nokia-sbell.com> wrote:
>>
>> Hi,
>>
>> >>> I actually tried it with replica-2 and replica-3 and then distributed
>> >>> replica-2 before replying to the earlier mail. We can have a debugging
>> >>> session if you are okay with it.
>>
>> It is fine if you can't reproduce the issue in your environment.
>> I have attached the detailed reproduction log to the Bugzilla entry, FYI.
>>
>> But I am sorry I may be out of office on Monday and Tuesday next week, so
>> a debug session next Wednesday would work for me.
>>
>> Cool, this works for me too. Send me a mail off-list once you are
>> available and we can figure out a way to get into a call and work on this.
>>
>> Pasting the detailed reproduction log here, FYI:
>>
>> root@ubuntu:~# gluster peer probe ubuntu
>> peer probe: success.
>> Probe on localhost not needed
>>
>> root@ubuntu:~# gluster v create test replica 2 ubuntu:/home/gfs/b1 ubuntu:/home/gfs/b2 force
>> volume create: test: success: please start the volume to access data
>> root@ubuntu:~# gluster v start test
>> volume start: test: success
>> root@ubuntu:~# gluster v info test
>>
>> Volume Name: test
>> Type: Replicate
>> Volume ID: fef5fca3-81d9-46d3-8847-74cde6f701a5
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: ubuntu:/home/gfs/b1
>> Brick2: ubuntu:/home/gfs/b2
>> Options Reconfigured:
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> root@ubuntu:~# gluster v status
>> Status of volume: test
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick ubuntu:/home/gfs/b1                   49152     0          Y       7798
>> Brick ubuntu:/home/gfs/b2                   49153     0          Y       7818
>> Self-heal Daemon on localhost               N/A       N/A        Y       7839
>>
>> Task Status of Volume test
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> root@ubuntu:~# gluster v set test cluster.consistent-metadata on
>> volume set: success
>>
>> root@ubuntu:~# ls /mnt/test
>> ls: cannot access '/mnt/test': No such file or directory
>> root@ubuntu:~# mkdir -p /mnt/test
>> root@ubuntu:~# mount -t glusterfs ubuntu:/test /mnt/test
>>
>> root@ubuntu:~# cd /mnt/test
>> root@ubuntu:/mnt/test# echo "abc">aaa
>> root@ubuntu:/mnt/test# cp aaa bbb;link bbb ccc
>>
>> root@ubuntu:/mnt/test# kill -9 7818
>> root@ubuntu:/mnt/test# cp aaa ddd;link ddd eee
>> link: cannot create link 'eee' to 'ddd': No such file or directory
>> Best Regards,
>> George
>>
>> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@gluster.org] *On Behalf Of *Pranith Kumar Karampuri
>> *Sent:* Thursday, January 18, 2018 2:40 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.l...@nokia-sbell.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>; Gluster-devel@gluster.org; Li, Deqian (NSB - CN/Hangzhou) <deqian...@nokia-sbell.com>; Sun, Ping (NSB - CN/Hangzhou) <ping....@nokia-sbell.com>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug fix "Don't let NFS cache stat after writes"
>>
>> On Thu, Jan 18, 2018 at 6:33 AM, Lian, George (NSB - CN/Hangzhou) <
>> george.l...@nokia-sbell.com> wrote:
>>
>> Hi,
>>
>> I suppose the brick count in your testing is six, and you just shut down
>> 3 of the processes.
>>
>> When I reproduce the issue, I only create a replicated volume with 2
>> bricks, let only ONE brick work, and set cluster.consistent-metadata on.
>> With these 2 test conditions, the issue is 100% reproducible.
>>
>> Hi,
>>      I actually tried it with replica-2 and replica-3 and then distributed
>> replica-2 before replying to the earlier mail. We can have a debugging
>> session if you are okay with it.
>>
>> I am in the middle of a customer issue myself (that is the reason for this
>> delay :-( ) and am thinking of wrapping it up early next week. Would that
>> be fine with you?
>> 16:44:28 :) ⚡ gluster v status
>> Status of volume: r2
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick localhost.localdomain:/home/gfs/r2_0  49152     0          Y       5309
>> Brick localhost.localdomain:/home/gfs/r2_1  49154     0          Y       5330
>> Brick localhost.localdomain:/home/gfs/r2_2  49156     0          Y       5351
>> Brick localhost.localdomain:/home/gfs/r2_3  49158     0          Y       5372
>> Brick localhost.localdomain:/home/gfs/r2_4  49159     0          Y       5393
>> Brick localhost.localdomain:/home/gfs/r2_5  49160     0          Y       5414
>> Self-heal Daemon on localhost               N/A       N/A        Y       5436
>>
>> Task Status of Volume r2
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> root@dhcp35-190 - ~
>> 16:44:38 :) ⚡ kill -9 5309 5351 5393
>>
>> Best Regards,
>> George
>>
>> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@gluster.org] *On Behalf Of *Pranith Kumar Karampuri
>> *Sent:* Wednesday, January 17, 2018 7:27 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.l...@nokia-sbell.com>
>> *Cc:* Li, Deqian (NSB - CN/Hangzhou) <deqian...@nokia-sbell.com>; Gluster-devel@gluster.org; Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>; Sun, Ping (NSB - CN/Hangzhou) <ping....@nokia-sbell.com>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug fix "Don't let NFS cache stat after writes"
>>
>> On Mon, Jan 15, 2018 at 1:55 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>> On Mon, Jan 15, 2018 at 8:46 AM, Lian, George (NSB - CN/Hangzhou) <
>> george.l...@nokia-sbell.com> wrote:
>>
>> Hi,
>>
>> Have you reproduced this issue? If yes, could you please confirm whether
>> it is an issue or not?
>> Hi,
>>      I tried recreating this on my laptop, on both master and 3.12,
>> and I am not able to recreate the issue :-(.
>> Here is the execution log:
>> https://paste.fedoraproject.org/paste/-csXUKrwsbrZAVW1KzggQQ
>> Since I was doing this on my laptop, I changed shutting down of the
>> replica to killing the brick process to simulate this test.
>> Let me know if I missed something.
>>
>> Sorry, I am held up with some issue at work, so I think I will get some
>> time the day after tomorrow to look at this. In the meantime I am adding
>> more people who know about afr, to see if they get a chance to work on
>> this before me.
>>
>> And if it is an issue, do you have any solution for it?
>>
>> Thanks & Best Regards,
>> George
>>
>> *From:* Lian, George (NSB - CN/Hangzhou)
>> *Sent:* Thursday, January 11, 2018 2:01 PM
>> *To:* Pranith Kumar Karampuri <pkara...@redhat.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>; Gluster-devel@gluster.org; Li, Deqian (NSB - CN/Hangzhou) <deqian...@nokia-sbell.com>; Sun, Ping (NSB - CN/Hangzhou) <ping....@nokia-sbell.com>
>> *Subject:* RE: [Gluster-devel] a link issue maybe introduced in a bug fix "Don't let NFS cache stat after writes"
>>
>> Hi,
>>
>> Please see the detailed test steps at
>> https://bugzilla.redhat.com/show_bug.cgi?id=1531457
>>
>> Steps to Reproduce:
>> 1. Create a volume named "test" with replication.
>> 2. Set the volume option cluster.consistent-metadata to on:
>>    gluster v set test cluster.consistent-metadata on
>> 3. Mount volume test on the client at /mnt/test.
>> 4. Create a file aaa with size more than 1 byte:
>>    echo "1234567890" >/mnt/test/aaa
>> 5. Shut down a replica node, say sn-1, leaving only sn-0 working.
>> 6.
>> cp /mnt/test/aaa /mnt/test/bbb; link /mnt/test/bbb /mnt/test/ccc
>>
>> BRs,
>> George
>>
>> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@gluster.org] *On Behalf Of *Pranith Kumar Karampuri
>> *Sent:* Thursday, January 11, 2018 12:39 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.l...@nokia-sbell.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>; Gluster-devel@gluster.org; Li, Deqian (NSB - CN/Hangzhou) <deqian...@nokia-sbell.com>; Sun, Ping (NSB - CN/Hangzhou) <ping....@nokia-sbell.com>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug fix "Don't let NFS cache stat after writes"
>>
>> On Thu, Jan 11, 2018 at 6:35 AM, Lian, George (NSB - CN/Hangzhou) <
>> george.l...@nokia-sbell.com> wrote:
>>
>> Hi,
>>
>> >>> In which protocol are you seeing this issue? Fuse/NFS/SMB?
>>
>> It is fuse, on a mountpoint created by the "mount -t glusterfs ..." command.
>>
>> Could you let me know the test you did so that I can try to re-create it
>> and see what exactly is going on? The configuration of the volume and the
>> steps to re-create the issue you are seeing would be helpful in debugging
>> the issue further.
>> Thanks & Best Regards,
>> George
>>
>> *From:* gluster-devel-boun...@gluster.org [mailto:gluster-devel-bounces@gluster.org] *On Behalf Of *Pranith Kumar Karampuri
>> *Sent:* Wednesday, January 10, 2018 8:08 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.l...@nokia-sbell.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.z...@nokia-sbell.com>; Zhong, Hua (NSB - CN/Hangzhou) <hua.zh...@nokia-sbell.com>; Li, Deqian (NSB - CN/Hangzhou) <deqian...@nokia-sbell.com>; Gluster-devel@gluster.org; Sun, Ping (NSB - CN/Hangzhou) <ping....@nokia-sbell.com>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug fix "Don't let NFS cache stat after writes"
>>
>> On Wed, Jan 10, 2018 at 11:09 AM, Lian, George (NSB - CN/Hangzhou) <
>> george.l...@nokia-sbell.com> wrote:
>>
>> Hi, Pranith Kumar,
>>
>> I have created a bug on Bugzilla:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1531457
>>
>> After my investigation of this link issue, I suppose it comes from your
>> change to afr-dir-write.c for the issue "Don't let NFS cache stat after
>> writes". Your fix is like:
>> --------------------------------------
>>         if (afr_txn_nothing_failed (frame, this)) {
>>                 /* if it did pre-op, it will do post-op changing ctime */
>>                 if (priv->consistent_metadata &&
>>                     afr_needs_changelog_update (local))
>>                         afr_zero_fill_stat (local);
>>                 local->transaction.unwind (frame, this);
>>         }
>>
>> The above fix sets ia_nlink to '0' if the option consistent-metadata is
>> set to "on". Hard-linking a file which was just created then leads to an
>> error, and the error is raised in the kernel function "vfs_link":
>>
>>         if (inode->i_nlink == 0 && !(inode->i_state & I_LINKABLE))
>>                 error = -ENOENT;
>>
>> Could you please have a check and give some comments here?
>> When a stat is "zero filled", the understanding is that the higher-layer
>> protocol doesn't send the stat value to the kernel, and a separate lookup
>> is sent by the kernel to get the latest stat value. In which protocol are
>> you seeing this issue? Fuse/NFS/SMB?
>>
>> Thanks & Best Regards,
>> George
>>
>> --
>> Pranith
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel