Hi Atin,
Thank you for your reply.


As you said, the info file can only be changed sequentially in
glusterd_store_volinfo() because of the big lock.


I found the similar issue that you mentioned:
https://bugzilla.redhat.com/show_bug.cgi?id=1308487


In a comment there you said: "This issue is hit at part of the negative
testing where while gluster volume set was executed at the same point of time
glusterd in another instance was brought down. In the faulty node we could see
/var/lib/glusterd/vols/<volname>info file been empty whereas the info.tmp file
has the correct contents."

I have two questions for you.

1. Were you able to reproduce this issue by running gluster volume set while
glusterd was being brought down?
2. Are you certain that this issue is caused by rename being interrupted in
the kernel?

In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both
empty.
But as I understand it, only one rename can be running at a time because of
the big lock.
Why are two files empty?


Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be
running in two threads?

Thanks,
Xin




At 2016-11-11 14:36:40, "Atin Mukherjee" <amukh...@redhat.com> wrote:





On Fri, Nov 11, 2016 at 8:33 AM, songxin <songxin_1...@126.com> wrote:

Hi Atin,


Thank you for your reply.
I have two questions for you.


1. Are the two files info and info.tmp only created or changed in the
function glusterd_store_volinfo()? I did not find any other place where the
two files are changed.


If we are talking about the volume's info file then yes, the mentioned
function takes care of it.
 

2. I found that glusterd_store_volinfo() is called from many places in
glusterd. Is there a thread synchronization problem? If so, one thread may
open the same info.tmp file with the O_TRUNC flag while another thread is
writing info.tmp. Could this happen?


 In glusterd, threads are protected by the big lock, and I don't see a
possibility (theoretically) of having two glusterd_store_volinfo() calls at a
given point in time.
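
For illustration, the serialization described above can be sketched as
follows. This assumes a plain pthread mutex as a stand-in for the big lock
(glusterd's actual lock implementation is not shown in this thread), and
store_volinfo_sketch, store_worker and run_store_workers are hypothetical
names. The sketch only counts how many threads are inside the store function
at once.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the big lock (assumption: a plain mutex). */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

static int writers_inside; /* threads currently inside the store function */
static int max_writers;    /* highest value writers_inside ever reached */

/* Hypothetical placeholder for a function that rewrites info.tmp. */
static void store_volinfo_sketch(void)
{
    writers_inside++;
    if (writers_inside > max_writers)
        max_writers = writers_inside;
    /* ... the real function would open, write and rename files here ... */
    writers_inside--;
}

/* Every caller takes the big lock first, so the calls are serialized. */
void *store_worker(void *arg)
{
    int iterations = *(int *)arg;

    for (int i = 0; i < iterations; i++) {
        pthread_mutex_lock(&big_lock);
        store_volinfo_sketch();
        pthread_mutex_unlock(&big_lock);
    }
    return NULL;
}

/* Run nthreads workers concurrently. Returns the peak number of threads
 * seen inside the store function, or -1 if a thread could not be created. */
int run_store_workers(int nthreads, int iterations)
{
    pthread_t tids[16];
    int created = 0;
    int failed = 0;

    assert(nthreads > 0 && nthreads <= 16);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, store_worker, &iterations) != 0) {
            failed = 1;
            break;
        }
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    return failed ? -1 : max_writers;
}
```

As long as every path into the store function holds the lock, the peak stays
at one writer; a caller that skipped the lock would break that guarantee.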
 



Thanks,
Xin



At 2016-11-10 21:41:06, "Atin Mukherjee" <amukh...@redhat.com> wrote:

Did you run out of disk space by any chance? AFAIK, the code writes the new
content to a .tmp file and then renames it to the original file. In case of a
disk space issue I would expect both files to be of non-zero size. Having
said that, I vaguely remember a similar issue (in the form of a bug or an
email) that came up once, but we couldn't reproduce it, so my guess is that
something is wrong with the atomic update here. I'd be glad if you have a
reproducer, and then we can dig into it further.
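
To make the pattern above concrete, here is a minimal sketch of a
write-to-.tmp-then-rename update in C. It is not the glusterd code: the name
atomic_store, the buffer size and the fsync() call before rename() are
assumptions made for illustration. Whether a missing flush explains the empty
info file reported here is not established in this thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not a glusterd function: replace `path` with `data`
 * by writing "<path>.tmp" first and then renaming it over `path`.
 * Returns 0 on success, -1 on failure. */
int atomic_store(const char *path, const char *data)
{
    char tmp[4096];
    size_t len, off = 0;
    int fd, n;

    assert(path != NULL && data != NULL);

    n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;

    /* O_TRUNC discards whatever a previous, unfinished update left behind */
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    len = strlen(data);
    while (off < len) {
        ssize_t written = write(fd, data + off, len - off);
        if (written < 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        off += (size_t)written;
    }

    /* Flush the data to disk before the rename makes it visible */
    if (fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    /* Replace the target name in one step */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

A reader that opens the target concurrently sees either the old file or the
new one, since rename() replaces the name atomically; durability across a
crash additionally depends on the filesystem.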



On Thu, Nov 10, 2016 at 1:32 PM, songxin <songxin_1...@126.com> wrote:

Hi,
When I start glusterd, some errors occur.
The log follows.

[2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main] 
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: 
/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO) 
[2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init] 
0-management: Maximum allowed open file descriptors set to 65536 
[2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init] 
0-management: Using /system/glusterd as working directory
[2016-11-08 07:58:35.024508] I [MSGID: 106514] 
[glusterd-store.c:2075:glusterd_restore_op_version] 0-management: Upgrade 
detected. Setting op-version to minimum : 1 
[2016-11-08 07:58:35.025356] E [MSGID: 106206] 
[glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed to 
get next store iter 
[2016-11-08 07:58:35.025401] E [MSGID: 106207] 
[glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed to 
update volinfo for c_glusterfs volume 
[2016-11-08 07:58:35.025463] E [MSGID: 106201] 
[glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to 
restore volume: c_glusterfs 
[2016-11-08 07:58:35.025544] E [MSGID: 101019] [xlator.c:428:xlator_init] 
0-management: Initialization of volume 'management' failed, review your volfile 
again 
[2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init] 0-management: 
initializing translator failed 
[2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate] 0-graph: 
init failed 
[2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit] 
(-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718] 
-->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8] 
-->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-: received 
signum (0), shutting down 




Then I found that the size of vols/volume_name/info is 0, which causes
glusterd to shut down.
But the size of vols/volume_name/info.tmp is not 0.
I also found that the size of a brick file, vols/volume_name/bricks/xxxx.brick,
is 0, but the size of vols/volume_name/bricks/xxxx.brick.tmp is not 0.


I read the code of the function glusterd_store_volinfo() in glusterd-store.c.
I know that info.tmp is renamed to info in the function
glusterd_store_volume_atomic_update().


But my question is: why is the size of the info file 0 while the size of
info.tmp is not 0?




Thanks,
Xin




 


_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users




--



~ Atin (atinm)





 




--



~ Atin (atinm)