Re: [Gluster-users] question about info and info.tmp

Atin Mukherjee Wed, 16 Nov 2016 04:59:48 -0800

On Tue, Nov 15, 2016 at 1:53 PM, songxin <songxin_1...@126.com> wrote:


> ok, thank you.
>
>
>
>
> 在 2016-11-15 16:12:34，"Atin Mukherjee" <amukh...@redhat.com> 写道：
>
>
>
> On Tue, Nov 15, 2016 at 12:47 PM, songxin <songxin_1...@126.com> wrote:
>
>>
>> Hi Atin,
>>
>> I think the root cause is in the function glusterd_import_friend_volume
>> as below.
>>
>> int32_t
>> glusterd_import_friend_volume (dict_t *peer_data, size_t count)
>> {
>> ...
>>         ret = glusterd_volinfo_find (new_volinfo->volname, &old_volinfo);
>>         if (0 == ret) {
>>                 (void) gd_check_and_update_rebalance_info (old_volinfo,
>>                                                            new_volinfo);
>>                 (void) glusterd_delete_stale_volume (old_volinfo,
>> new_volinfo);
>>         }
>> ...
>>         ret = glusterd_store_volinfo (new_volinfo,
>> GLUSTERD_VOLINFO_VER_AC_NONE);
>>         if (ret) {
>>                 gf_msg (this->name, GF_LOG_ERROR, 0,
>>                         GD_MSG_VOLINFO_STORE_FAIL, "Failed to store "
>>                         "volinfo for volume %s", new_volinfo->volname);
>>                 goto out;
>>         }
>> ...
>> }
>>
>> glusterd_delete_stale_volume will remove the info and bricks/* and the
>> glusterd_store_volinfo will create the new one.
>> But if glusterd is killed before rename the info will is empty.
>>
>> And glusterd will start failed because the infois empty in the next time
>> you start the glusterd.
>>
>> Any idea， Atin？
>>
>
> Give me some time, will check it out, but reading at this analysis looks
> very well possible if a volume is changed when the glusterd was done on
> node a and when the same comes up during peer handshake we update the
> volinfo and during that time glusterd goes down once again. I'll confirm it
> by tomorrow.
>
>
I checked the code and it does look like you have got the right RCA for the
issue which you simulated through those two scripts. However this can
happen even when you try to create a fresh volume and while glusterd tries
to write the content into the store and goes down before renaming the
info.tmp file you get into the same situation.

I'd really need to think through if this can be fixed. Suggestions are
always appreciated.



>
> BTW, excellent work Xin!
>
>
>> Thanks,
>> Xin
>>
>>
>> 在 2016-11-15 12:07:05，"Atin Mukherjee" <amukh...@redhat.com> 写道：
>>
>>
>>
>> On Tue, Nov 15, 2016 at 8:58 AM, songxin <songxin_1...@126.com> wrote:
>>
>>> Hi Atin,
>>> I have some clues about this issue.
>>> I could reproduce this issue use the scrip that mentioned in
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .
>>>
>>
>> I really appreciate your help in trying to nail down this issue. While I
>> am at your email and going through the code to figure out the possible
>> cause for it, unfortunately I don't see any script in the attachment of the
>> bug.  Could you please cross check?
>>
>>
>>>
>>> After I added some debug print,which like below, in glusterd-store.c and
>>> I found that the /var/lib/glusterd/vols/xxx/info and
>>> /var/lib/glusterd/vols/xxx/bricks/* are removed.
>>> But other files in /var/lib/glusterd/vols/xxx/ will not be remove.
>>>
>>> int32_t
>>> glusterd_store_volinfo (glusterd_volinfo_t *volinfo,
>>> glusterd_volinfo_ver_ac_t ac)
>>> {
>>>         int32_t                 ret = -1;
>>>
>>>         GF_ASSERT (volinfo)
>>>
>>>         ret = access("/var/lib/glusterd/vols/gv0/info", F_OK);
>>>         if(ret < 0)
>>>         {
>>>                 gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "info is not
>>> exit(%d)", errno);
>>>         }
>>>         else
>>>         {
>>>                 ret = stat("/var/lib/glusterd/vols/gv0/info", &buf);
>>>                 if(ret < 0)
>>>                 {
>>>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "stat
>>> info error");
>>>                 }
>>>                 else
>>>                 {
>>>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "info
>>> size is %lu, inode num is %lu", buf.st_size, buf.st_ino);
>>>                 }
>>>         }
>>>
>>>         glusterd_perform_volinfo_version_action (volinfo, ac);
>>>         ret = glusterd_store_create_volume_dir (volinfo);
>>>         if (ret)
>>>                 goto out;
>>>
>>> ...
>>> }
>>>
>>> So it is easy to understand why  the info or 10.32.1.144.-opt-lvmdir-c2-
>>> brick sometimes is empty.
>>> It is becaue the info file is not exist, and it will be create by “fd =
>>> open (path, O_RDWR | O_CREAT | O_APPEND, 0600);” in function
>>> gf_store_handle_new.
>>> And the info file is empty before rename.
>>> So the info file is empty if glusterd shutdown before rename.
>>>
>>>
>>
>>> My question is following.
>>> 1.I did not find the point the info is removed.Could you tell me the
>>> point where the info and /bricks/* are removed?
>>> 2.why the file info and bricks/* is removed?But other files in 
>>> var/lib/glusterd/vols/xxx/
>>> are not be removed?
>>>
>>
>> AFAIK, we never delete the info file and hence this file is opened with
>> O_APPEND flag. As I said I will go back and cross check the code once again.
>>
>>
>>
>>
>>> Thanks,
>>> Xin
>>>
>>>
>>> 在 2016-11-11 20:34:05，"Atin Mukherjee" <amukh...@redhat.com> 写道：
>>>
>>>
>>>
>>> On Fri, Nov 11, 2016 at 4:00 PM, songxin <songxin_1...@126.com> wrote:
>>>
>>>> Hi Atin,
>>>>
>>>> Thank you for your support.
>>>> Sincerely wait for your reply.
>>>>
>>>> By the way, could you make sure that the issue, file info is empty,
>>>> cause by rename is interrupted in kernel?
>>>>
>>>
>>> As per my RCA on that bug, it looked to be.
>>>
>>>
>>>>
>>>> Thanks,
>>>> Xin
>>>>
>>>> 在 2016-11-11 15:49:02，"Atin Mukherjee" <amukh...@redhat.com> 写道：
>>>>
>>>>
>>>>
>>>> On Fri, Nov 11, 2016 at 1:15 PM, songxin <songxin_1...@126.com> wrote:
>>>>
>>>>> Hi Atin,
>>>>> Thank you for your reply.
>>>>> Actually it is very difficult to reproduce because I don't know when there
>>>>> was an ongoing commit happening.It is just a coincidence.
>>>>> But I want to make sure the root cause.
>>>>>
>>>>
>>>> I'll give it a another try and see if this situation can be
>>>> simulated/reproduced and will keep you posted.
>>>>
>>>>
>>>>>
>>>>> So I would be grateful if you could answer my questions below.
>>>>>
>>>>> You said that "This issue is hit at part of the negative testing where
>>>>> while gluster volume set was executed at the same point of time glusterd 
>>>>> in
>>>>> another instance was brought down. In the faulty node we could see
>>>>> /var/lib/glusterd/vols/<volname>info file been empty whereas the
>>>>> info.tmp file has the correct contents." in comment.
>>>>>
>>>>> I have two questions for you.
>>>>>
>>>>> 1.Could you reproduce this issue by gluster volume set glusterd which was 
>>>>> brought down?
>>>>> 2.Could you be certain that this issue is cause by rename is interrupted 
>>>>> in kernel?
>>>>>
>>>>> In my case there are two files, info and 
>>>>> 10.32.1.144.-opt-lvmdir-c2-brick, are both empty.
>>>>> But in my view only one rename can be running at the same time because of 
>>>>> the big lock.
>>>>> Why there are two files are empty?
>>>>>
>>>>>
>>>>> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") 
>>>>> be running in two thread?
>>>>>
>>>>> Thanks,
>>>>> Xin
>>>>>
>>>>>
>>>>> 在 2016-11-11 15:27:03，"Atin Mukherjee" <amukh...@redhat.com> 写道：
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Nov 11, 2016 at 12:38 PM, songxin <songxin_1...@126.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> Hi Atin,
>>>>>> Thank you for your reply.
>>>>>>
>>>>>> As you said that the info file can only be changed in the 
>>>>>> glusterd_store_volinfo()
>>>>>> sequentially because of the big lock.
>>>>>>
>>>>>> I have found the similar issue as below that you mentioned.
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1308487
>>>>>>
>>>>>
>>>>> Great, so this is what I was actually trying to refer in my first
>>>>> email that I saw a similar issue. Have you got a chance to look at
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your
>>>>> case, did you try to bring down glusterd when there was an ongoing commit
>>>>> happening?
>>>>>
>>>>>
>>>>>>
>>>>>> You said that "This issue is hit at part of the negative testing
>>>>>> where while gluster volume set was executed at the same point of time
>>>>>> glusterd in another instance was brought down. In the faulty node we 
>>>>>> could
>>>>>> see /var/lib/glusterd/vols/<volname>info file been empty whereas the
>>>>>> info.tmp file has the correct contents." in comment.
>>>>>>
>>>>>> I have two questions for you.
>>>>>>
>>>>>> 1.Could you reproduce this issue by gluster volume set glusterd which 
>>>>>> was brought down?
>>>>>> 2.Could you be certain that this issue is cause by rename is interrupted 
>>>>>> in kernel?
>>>>>>
>>>>>> In my case there are two files, info and 
>>>>>> 10.32.1.144.-opt-lvmdir-c2-brick, are both empty.
>>>>>> But in my view only one rename can be running at the same time because 
>>>>>> of the big lock.
>>>>>> Why there are two files are empty?
>>>>>>
>>>>>>
>>>>>> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", 
>>>>>> "xxx-brick") be running in two thread?
>>>>>>
>>>>>> Thanks,
>>>>>> Xin
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 在 2016-11-11 14:36:40，"Atin Mukherjee" <amukh...@redhat.com> 写道：
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 11, 2016 at 8:33 AM, songxin <songxin_1...@126.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Atin,
>>>>>>>
>>>>>>> Thank you for your reply.
>>>>>>> I have two questions for you.
>>>>>>>
>>>>>>> 1.Are the two files info and info.tmp are only to be created or
>>>>>>> changed in function glusterd_store_volinfo()? I did not find other 
>>>>>>> point in
>>>>>>> which the two file are changed.
>>>>>>>
>>>>>>
>>>>>> If we are talking about info file volume then yes, the mentioned
>>>>>> function actually takes care of it.
>>>>>>
>>>>>>
>>>>>>> 2.I found that glusterd_store_volinfo() will be call in many point
>>>>>>> by glusterd.Is there a problem of thread synchronization?If so, one 
>>>>>>> thread
>>>>>>> may open a same file info.tmp using O_TRUNC flag when another
>>>>>>> thread is writing the info,tmp.Could this case happen?
>>>>>>>
>>>>>>
>>>>>>  In glusterd threads are big lock protected and I don't see a
>>>>>> possibility (theoretically) to have two glusterd_store_volinfo () calls 
>>>>>> at
>>>>>> a given point of time.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Xin
>>>>>>>
>>>>>>>
>>>>>>> At 2016-11-10 21:41:06, "Atin Mukherjee" <amukh...@redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Did you run out of disk space by any chance? AFAIK, the code is like
>>>>>>> we write new stuffs to .tmp file and rename it back to the original 
>>>>>>> file.
>>>>>>> In case of a disk space issue I expect both the files to be of non zero
>>>>>>> size. But having said that I vaguely remember a similar issue (in the 
>>>>>>> form
>>>>>>> of a bug or an email) landed up once but we couldn't reproduce it, so
>>>>>>> something is wrong with the atomic update here is what I guess. I'll be
>>>>>>> glad if you have a reproducer for the same and then we can dig into it
>>>>>>> further.
>>>>>>>
>>>>>>> On Thu, Nov 10, 2016 at 1:32 PM, songxin <songxin_1...@126.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> When I start the glusterd some error happened.
>>>>>>>> And the log is following.
>>>>>>>>
>>>>>>>> [2016-11-08 07:58:34.989365] I [MSGID: 100030]
>>>>>>>> [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running
>>>>>>>> /usr/sbin/glusterd version 3.7.6 (args: /usr/sbin/glusterd -p
>>>>>>>> /var/run/glusterd.pid --log-level INFO)
>>>>>>>> [2016-11-08 07:58:34.998356] I [MSGID: 106478]
>>>>>>>> [glusterd.c:1350:init] 0-management: Maximum allowed open file 
>>>>>>>> descriptors
>>>>>>>> set to 65536
>>>>>>>> [2016-11-08 07:58:35.000667] I [MSGID: 106479]
>>>>>>>> [glusterd.c:1399:init] 0-management: Using /system/glusterd as working
>>>>>>>> directory
>>>>>>>> [2016-11-08 07:58:35.024508] I [MSGID: 106514]
>>>>>>>> [glusterd-store.c:2075:glusterd_restore_op_version] 0-management:
>>>>>>>> Upgrade detected. Setting op-version to minimum : 1
>>>>>>>> *[2016-11-08 07:58:35.025356] E [MSGID: 106206]
>>>>>>>> [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: 
>>>>>>>> Failed
>>>>>>>> to get next store iter *
>>>>>>>> *[2016-11-08 07:58:35.025401] E [MSGID: 106207]
>>>>>>>> [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: 
>>>>>>>> Failed
>>>>>>>> to update volinfo for c_glusterfs volume *
>>>>>>>> *[2016-11-08 07:58:35.025463] E [MSGID: 106201]
>>>>>>>> [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management:
>>>>>>>> Unable to restore volume: c_glusterfs *
>>>>>>>> *[2016-11-08 07:58:35.025544] E [MSGID: 101019]
>>>>>>>> [xlator.c:428:xlator_init] 0-management: Initialization of volume
>>>>>>>> 'management' failed, review your volfile again *
>>>>>>>> *[2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init]
>>>>>>>> 0-management: initializing translator failed *
>>>>>>>> *[2016-11-08 07:58:35.025629] E
>>>>>>>> [graph.c:661:glusterfs_graph_activate] 0-graph: init failed *
>>>>>>>> [2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit]
>>>>>>>> (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260)
>>>>>>>> [0x1000a718] -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8)
>>>>>>>> [0x1000a5a8] -->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c)
>>>>>>>> [0x100098bc] ) 0-: received signum (0), shutting down
>>>>>>>>
>>>>>>>>
>>>>>>>> And then I found that the size of vols/volume_name/info is 0.It
>>>>>>>> cause glusterd shutdown.
>>>>>>>> But I found that vols/volume_name_info.tmp is not 0.
>>>>>>>> And I found that there is a brick file 
>>>>>>>> vols/volume_name/bricks/xxxx.brick
>>>>>>>> is 0, but vols/volume_name/bricks/xxxx.brick.tmp is not 0.
>>>>>>>>
>>>>>>>> I read the function code glusterd_store_volinfo () in
>>>>>>>> glusterd-store.c .
>>>>>>>> I know that the info.tmp will be rename to info in function
>>>>>>>> glusterd_store_volume_atomic_update().
>>>>>>>>
>>>>>>>> But my question is that why the info file is 0 but info.tmp is not
>>>>>>>> 0.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Xin
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Gluster-users mailing list
>>>>>>>> Gluster-users@gluster.org
>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> ~ Atin (atinm)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> ~ Atin (atinm)
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> ~ Atin (atinm)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> ~ Atin (atinm)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> ~ Atin (atinm)
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>>
>> ~ Atin (atinm)
>>
>>
>>
>>
>>
>
>
>
> --
>
> ~ Atin (atinm)
>
>
>
>
>



-- 

~ Atin (atinm)

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] question about info and info.tmp

Reply via email to