On Fri, Nov 11, 2016 at 1:15 PM, songxin <songxin_1...@126.com> wrote:
> Hi Atin,
> Thank you for your reply.
> Actually it is very difficult to reproduce because I don't know when there
> was an ongoing commit happening. It was just a coincidence.
> But I want to find the root cause.
> I'll give it another try to see if this situation can be simulated/reproduced,
> and will keep you posted.
>
> So I would be grateful if you could answer my questions below.
>
> You said in a comment that "This issue is hit at part of the negative testing
> where while gluster volume set was executed at the same point of time
> glusterd in another instance was brought down. In the faulty node we could
> see /var/lib/glusterd/vols/<volname>info file been empty whereas the
> info.tmp file has the correct contents."
>
> I have two questions for you.
>
> 1. Could you reproduce this issue by running gluster volume set while
> glusterd was brought down?
> 2. Are you certain that this issue is caused by a rename() being interrupted
> in the kernel?
>
> In my case there are two files, info and 10.32.1.144.-opt-lvmdir-c2-brick,
> that are both empty.
> But in my view only one rename can be running at a time because of the big
> lock.
> Why are both files empty?
>
> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick")
> be running in two threads?
>
> Thanks,
> Xin
>
> On 2016-11-11 15:27:03, "Atin Mukherjee" <amukh...@redhat.com> wrote:
>
> On Fri, Nov 11, 2016 at 12:38 PM, songxin <songxin_1...@126.com> wrote:
>>
>> Hi Atin,
>> Thank you for your reply.
>>
>> As you said, the info file can only be changed in glusterd_store_volinfo(),
>> sequentially, because of the big lock.
>>
>> I have found the similar issue you mentioned:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1308487
>>
>
> Great, so this is what I was actually trying to refer to in my first email,
> that I saw a similar issue. Have you had a chance to look at
> https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ?
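For readers following the thread, the write-to-.tmp-then-rename pattern under
discussion can be sketched as below. This is a minimal standalone
illustration, not glusterd's actual code, and the helper name
save_atomically is made up. Because rename() atomically replaces the target,
a concurrent reader sees either the complete old file or the complete new
one, never a partial write.

```c
/* Minimal sketch of the atomic-update pattern (NOT glusterd's code):
 * write the full new contents to a temporary file, then rename() it
 * over the real file in a single atomic step. */
#include <stdio.h>

static int save_atomically(const char *path, const char *tmp_path,
                           const char *contents)
{
    FILE *fp = fopen(tmp_path, "w");
    if (!fp)
        return -1;
    if (fputs(contents, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;
    /* Atomic replacement: on success, "path" now holds the new contents
     * and the temporary file no longer exists. */
    return rename(tmp_path, path);
}
```

Under this pattern, and with the big lock serializing writers, neither the
tmp file nor the target should ever be observed empty by a reader.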
> But in your case, did you try to bring down glusterd while there was an
> ongoing commit happening?
>
>>
>> You said in a comment that "This issue is hit at part of the negative
>> testing where while gluster volume set was executed at the same point of
>> time glusterd in another instance was brought down. In the faulty node we
>> could see /var/lib/glusterd/vols/<volname>info file been empty whereas the
>> info.tmp file has the correct contents."
>>
>> I have two questions for you.
>>
>> 1. Could you reproduce this issue by running gluster volume set while
>> glusterd was brought down?
>> 2. Are you certain that this issue is caused by a rename() being
>> interrupted in the kernel?
>>
>> In my case there are two files, info and 10.32.1.144.-opt-lvmdir-c2-brick,
>> that are both empty.
>> But in my view only one rename can be running at a time because of the big
>> lock.
>> Why are both files empty?
>>
>> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick")
>> be running in two threads?
>>
>> Thanks,
>> Xin
>>
>> On 2016-11-11 14:36:40, "Atin Mukherjee" <amukh...@redhat.com> wrote:
>>
>> On Fri, Nov 11, 2016 at 8:33 AM, songxin <songxin_1...@126.com> wrote:
>>
>>> Hi Atin,
>>>
>>> Thank you for your reply.
>>> I have two questions for you.
>>>
>>> 1. Are the two files info and info.tmp only created or changed in
>>> glusterd_store_volinfo()? I did not find any other place where the two
>>> files are changed.
>>
>> If we are talking about the volume's info file, then yes, the mentioned
>> function takes care of it.
>>
>>> 2. I found that glusterd_store_volinfo() is called from many places in
>>> glusterd. Is there a thread-synchronization problem here? If so, one
>>> thread may open the same info.tmp file with the O_TRUNC flag while
>>> another thread is writing info.tmp. Could this happen?
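The truncation hazard Xin asks about can be seen in isolation with a small
standalone sketch (an illustration of O_TRUNC semantics, not glusterd code;
size_after_trunc_open is a made-up helper): opening an existing file with
O_TRUNC zeroes it immediately, before any new byte is written, so an
unsynchronized concurrent opener, or a process killed in that window, would
leave an empty file behind. This is exactly why the store path needs to be
serialized.

```c
/* Demonstrates that O_TRUNC empties a file at open() time, before any
 * write happens.  A racer or a crash between open() and write() would
 * therefore leave a zero-length file. */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the size of "path" observed right after an O_TRUNC open,
 * without writing anything. */
static long size_after_trunc_open(const char *path)
{
    int fd = open(path, O_WRONLY | O_TRUNC);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    close(fd);                /* closed before anything was written */
    return (long)st.st_size;  /* 0: the old contents are already gone */
}
```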
>> In glusterd, threads are protected by the big lock, so I don't see a
>> possibility (theoretically) of two glusterd_store_volinfo() calls running
>> at the same point in time.
>>
>>> Thanks,
>>> Xin
>>>
>>> At 2016-11-10 21:41:06, "Atin Mukherjee" <amukh...@redhat.com> wrote:
>>>
>>> Did you run out of disk space by any chance? AFAIK, the code writes the
>>> new content to a .tmp file and then renames it over the original file.
>>> In case of a disk-space issue I would expect both files to be of non-zero
>>> size. That said, I vaguely remember a similar issue (in the form of a bug
>>> or an email) landing once, but we couldn't reproduce it, so my guess is
>>> that something is wrong with the atomic update here. I'll be glad if you
>>> have a reproducer for it, and then we can dig into it further.
>>>
>>> On Thu, Nov 10, 2016 at 1:32 PM, songxin <songxin_1...@126.com> wrote:
>>>
>>>> Hi,
>>>> Some errors happened when I started glusterd.
>>>> The log follows.
>>>>
>>>> [2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main]
>>>> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6
>>>> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
>>>> [2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init]
>>>> 0-management: Maximum allowed open file descriptors set to 65536
>>>> [2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init]
>>>> 0-management: Using /system/glusterd as working directory
>>>> [2016-11-08 07:58:35.024508] I [MSGID: 106514]
>>>> [glusterd-store.c:2075:glusterd_restore_op_version] 0-management:
>>>> Upgrade detected.
>>>> Setting op-version to minimum : 1
>>>> [2016-11-08 07:58:35.025356] E [MSGID: 106206]
>>>> [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management:
>>>> Failed to get next store iter
>>>> [2016-11-08 07:58:35.025401] E [MSGID: 106207]
>>>> [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management:
>>>> Failed to update volinfo for c_glusterfs volume
>>>> [2016-11-08 07:58:35.025463] E [MSGID: 106201]
>>>> [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management:
>>>> Unable to restore volume: c_glusterfs
>>>> [2016-11-08 07:58:35.025544] E [MSGID: 101019]
>>>> [xlator.c:428:xlator_init] 0-management: Initialization of volume
>>>> 'management' failed, review your volfile again
>>>> [2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init]
>>>> 0-management: initializing translator failed
>>>> [2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate]
>>>> 0-graph: init failed
>>>> [2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit]
>>>> (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718]
>>>> -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8]
>>>> -->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-:
>>>> received signum (0), shutting down
>>>>
>>>> I then found that the size of vols/volume_name/info is 0, which caused
>>>> glusterd to shut down, while vols/volume_name/info.tmp is not 0.
>>>> I also found a brick file vols/volume_name/bricks/xxxx.brick whose
>>>> size is 0, while vols/volume_name/bricks/xxxx.brick.tmp is not 0.
>>>>
>>>> I read the code of glusterd_store_volinfo() in glusterd-store.c.
>>>> I know that info.tmp is renamed to info in
>>>> glusterd_store_volume_atomic_update().
>>>>
>>>> But my question is: why is the info file 0 bytes while info.tmp is not?
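One explanation often discussed for "the rename succeeded but the target is
empty" (offered here only as an assumption, not a confirmed root cause for
this report) is a crash or power loss on a journaling filesystem with
delayed allocation: the rename can be committed to the journal before the
tmp file's data blocks reach disk, so after reboot the renamed file comes
back zero-length, while a later run's freshly written .tmp file survives
intact. The usual defence is to fsync() the temporary file before renaming
it; save_durably below is a hypothetical sketch of that, not glusterd's
actual code.

```c
/* Hypothetical sketch: fsync() the tmp file before rename() so the data
 * is on disk before the atomic swap.  Without the fsync(), a crash in
 * the window after the rename can leave a zero-length target file on
 * filesystems with delayed allocation. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int save_durably(const char *path, const char *tmp_path,
                        const char *contents, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, contents, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;
    return rename(tmp_path, path);  /* data is on disk before the swap */
}
```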
>>>> Thanks,
>>>> Xin
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users@gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>> --
>>> ~ Atin (atinm)
>>
>> --
>> ~ Atin (atinm)
>
> --
> ~ Atin (atinm)

--
~ Atin (atinm)