Re: [Gluster-users] question about info and info.tmp

2016-11-10 Thread Atin Mukherjee
On Fri, Nov 11, 2016 at 1:15 PM, songxin  wrote:

> Hi Atin,
> Thank you for your reply.
> Actually it is very difficult to reproduce because I don't know when an
> ongoing commit was happening. It was just a coincidence.
> But I want to be sure of the root cause.
>

I'll give it another try to see if this situation can be
simulated/reproduced, and will keep you posted.


>
> So I would be grateful if you could answer my questions below.
>
> You said that "This issue is hit at part of the negative testing where
> while gluster volume set was executed at the same point of time glusterd in
> another instance was brought down. In the faulty node we could see
> /var/lib/glusterd/vols/info file been empty whereas the info.tmp
> file has the correct contents." in comment.
>
> I have two questions for you.
>
> 1. Could you reproduce this issue by running gluster volume set while
> glusterd was brought down?
> 2. Are you certain that this issue is caused by a rename() being interrupted
> in the kernel?
>
> In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both
> empty.
> But in my view only one rename can be running at a time because of the big
> lock.
> Why are both files empty?
>
>
> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be
> running in two threads?
>
> Thanks,
> Xin
>
>
> On 2016-11-11 15:27:03, "Atin Mukherjee" wrote:
>
>
>
> On Fri, Nov 11, 2016 at 12:38 PM, songxin  wrote:
>
>>
>> Hi Atin,
>> Thank you for your reply.
>>
>> As you said, the info file can only be changed in glusterd_store_volinfo(),
>> sequentially, because of the big lock.
>>
>> I have found the similar issue you mentioned, below:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1308487
>>
>
> Great, this is the similar issue I was trying to refer to in my first email.
> Have you had a chance to look at
> https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your
> case, did you bring down glusterd while an ongoing commit was happening?
>
>
>>
>> You said that "This issue is hit at part of the negative testing where
>> while gluster volume set was executed at the same point of time glusterd in
>> another instance was brought down. In the faulty node we could see
>> /var/lib/glusterd/vols/info file been empty whereas the
>> info.tmp file has the correct contents." in comment.
>>
>> I have two questions for you.
>>
>> 1. Could you reproduce this issue by running gluster volume set while
>> glusterd was brought down?
>> 2. Are you certain that this issue is caused by a rename() being
>> interrupted in the kernel?
>>
>> In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both
>> empty.
>> But in my view only one rename can be running at a time because of
>> the big lock.
>> Why are both files empty?
>>
>>
>> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be
>> running in two threads?
>>
>> Thanks,
>> Xin
>>
>>
>>
>>
>> On 2016-11-11 14:36:40, "Atin Mukherjee" wrote:
>>
>>
>>
>> On Fri, Nov 11, 2016 at 8:33 AM, songxin  wrote:
>>
>>> Hi Atin,
>>>
>>> Thank you for your reply.
>>> I have two questions for you.
>>>
>>> 1. Are the two files info and info.tmp only created or changed in
>>> glusterd_store_volinfo()? I did not find any other point at which
>>> the two files are changed.
>>>
>>
>> If we are talking about the volume's info file then yes, the mentioned
>> function takes care of it.
>>
>>
>>> 2. I found that glusterd_store_volinfo() is called at many points by
>>> glusterd. Is there a thread-synchronization problem? If so, one thread
>>> may open the same file info.tmp with the O_TRUNC flag while another
>>> thread is writing info.tmp. Could this happen?
>>>
>>
>> In glusterd, threads are protected by the big lock and I don't see a
>> possibility (theoretically) of two glusterd_store_volinfo() calls at a
>> given point in time.
>>
>>
>>>
>>> Thanks,
>>> Xin
>>>
>>>
>>> At 2016-11-10 21:41:06, "Atin Mukherjee"  wrote:
>>>
>>> Did you run out of disk space by any chance? AFAIK, the code writes the
>>> new contents to a .tmp file and renames it over the original file. In
>>> case of a disk-space issue I would expect both files to be of non-zero
>>> size. That said, I vaguely remember a similar issue (in the form of a bug
>>> or an email) landing once that we couldn't reproduce, so my guess is that
>>> something is wrong with the atomic update here. I'll be glad if you have
>>> a reproducer for it, and then we can dig into it further.
>>>
>>> On Thu, Nov 10, 2016 at 1:32 PM, songxin  wrote:
>>>
 Hi,
 When I start the glusterd some error happened.
 And the log is following.

 [2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main]
 

Re: [Gluster-users] question about info and info.tmp

2016-11-10 Thread songxin
Hi Atin,
Thank you for your reply.
Actually it is very difficult to reproduce because I don't know when an 
ongoing commit was happening. It was just a coincidence.
But I want to be sure of the root cause.


So I would be grateful if you could answer my questions below.


You said that "This issue is hit at part of the negative testing where while 
gluster volume set was executed at the same point of time glusterd in another 
instance was brought down. In the faulty node we could see 
/var/lib/glusterd/vols/info file been empty whereas the info.tmp file 
has the correct contents." in comment.
I have two questions for you.

1. Could you reproduce this issue by running gluster volume set while glusterd 
was brought down?
2. Are you certain that this issue is caused by a rename() being interrupted in 
the kernel?
In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both 
empty.
But in my view only one rename can be running at a time because of the big 
lock.
Why are both files empty?


Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be 
running in two threads?
Thanks,
Xin



On 2016-11-11 15:27:03, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 12:38 PM, songxin  wrote:



Hi Atin,
Thank you for your reply.


As you said, the info file can only be changed in glusterd_store_volinfo(), 
sequentially, because of the big lock.


I have found the similar issue you mentioned, below:
https://bugzilla.redhat.com/show_bug.cgi?id=1308487


Great, this is the similar issue I was trying to refer to in my first email. 
Have you had a chance to look at 
https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your case, did 
you bring down glusterd while an ongoing commit was happening?
 



You said that "This issue is hit at part of the negative testing where while 
gluster volume set was executed at the same point of time glusterd in another 
instance was brought down. In the faulty node we could see 
/var/lib/glusterd/vols/info file been empty whereas the info.tmp file 
has the correct contents." in comment.
I have two questions for you.

1. Could you reproduce this issue by running gluster volume set while glusterd 
was brought down?
2. Are you certain that this issue is caused by a rename() being interrupted in 
the kernel?
In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both 
empty.
But in my view only one rename can be running at a time because of the big 
lock.
Why are both files empty?


Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be 
running in two threads?

Thanks,
Xin




On 2016-11-11 14:36:40, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 8:33 AM, songxin  wrote:

Hi Atin,


Thank you for your reply.
I have two questions for you.


1. Are the two files info and info.tmp only created or changed in 
glusterd_store_volinfo()? I did not find any other point at which the two 
files are changed.


If we are talking about the volume's info file then yes, the mentioned 
function takes care of it.
 

2. I found that glusterd_store_volinfo() is called at many points by 
glusterd. Is there a thread-synchronization problem? If so, one thread may 
open the same file info.tmp with the O_TRUNC flag while another thread is 
writing info.tmp. Could this happen?


 In glusterd, threads are protected by the big lock and I don't see a 
possibility (theoretically) of two glusterd_store_volinfo() calls at a given 
point in time.
 



Thanks,
Xin



At 2016-11-10 21:41:06, "Atin Mukherjee"  wrote:

Did you run out of disk space by any chance? AFAIK, the code writes the new 
contents to a .tmp file and renames it over the original file. In case of a 
disk-space issue I would expect both files to be of non-zero size. That said, 
I vaguely remember a similar issue (in the form of a bug or an email) landing 
once that we couldn't reproduce, so my guess is that something is wrong with 
the atomic update here. I'll be glad if you have a reproducer for it, and then 
we can dig into it further.



On Thu, Nov 10, 2016 at 1:32 PM, songxin  wrote:

Hi,
When I start the glusterd some error happened.
And the log is following.

[2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main] 
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: 
/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO) 
[2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init] 
0-management: Maximum allowed open file descriptors set to 65536 
[2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init] 
0-management: Using /system/glusterd as working directory
[2016-11-08 07:58:35.024508] I [MSGID: 106514] 
[glusterd-store.c:2075:glusterd_restore_op_version] 

Re: [Gluster-users] getting "Transport endpoint is not connected" in glusterfs mount log file.

2016-11-10 Thread ABHISHEK PALIWAL
Hi Pranith,

Could you please point me to the logs showing that the mount is not
able to connect to both bricks?

On Fri, Nov 11, 2016 at 12:05 PM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

> As per the logs, the mount is not able to connect to either of the bricks.
> Are the connections fine?
>
> On Fri, Nov 11, 2016 at 10:20 AM, ABHISHEK PALIWAL <
> abhishpali...@gmail.com> wrote:
>
>> Hi,
>>
>> It's an urgent case.
>>
>> At least provide your views on this.
>>
>> On Wed, Nov 9, 2016 at 11:08 AM, ABHISHEK PALIWAL <
>> abhishpali...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We can see that syncing of the GlusterFS bricks is failing with the
>>> error trace "Transport endpoint is not connected".
>>>
>>> [2016-10-31 04:06:03.627395] E [MSGID: 114031]
>>> [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9:
>>> remote operation failed [Transport endpoint is not connected]
>>> [2016-10-31 04:06:03.628381] I [socket.c:3308:socket_submit_request]
>>> 0-c_glusterfs-client-9: not connected (priv->connected = 0)
>>> [2016-10-31 04:06:03.628432] W [rpc-clnt.c:1586:rpc_clnt_submit]
>>> 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f5f Program:
>>> GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport
>>> (c_glusterfs-client-9)
>>> [2016-10-31 04:06:03.628466] E [MSGID: 114031]
>>> [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9:
>>> remote operation failed [Transport endpoint is not connected]
>>> [2016-10-31 04:06:03.628475] I [MSGID: 108019]
>>> [afr-lk-common.c:1086:afr_lock_blocking] 0-c_glusterfs-replicate-0:
>>> unable to lock on even one child
>>> [2016-10-31 04:06:03.628539] I [MSGID: 108019]
>>> [afr-transaction.c:1224:afr_post_blocking_inodelk_cbk]
>>> 0-c_glusterfs-replicate-0: Blocking inodelks failed.
>>> [2016-10-31 04:06:03.628630] W [fuse-bridge.c:1282:fuse_err_cbk]
>>> 0-glusterfs-fuse: 20790: FLUSH() ERR => -1 (Transport endpoint is not
>>> connected)
>>> [2016-10-31 04:06:03.629149] E [rpc-clnt.c:362:saved_frames_unwind]
>>> (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58]
>>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90]
>>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10]
>>> (--> 
>>> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18]
>>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808]
>>> ) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3)
>>> op(LOOKUP(27)) called at 2016-10-31 04:06:03.624346 (xid=0x7f5a)
>>> [2016-10-31 04:06:03.629183] I [rpc-clnt.c:1847:rpc_clnt_reconfig]
>>> 0-c_glusterfs-client-9: changing port to 49391 (from 0)
>>> [2016-10-31 04:06:03.629210] W [MSGID: 114031]
>>> [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9:
>>> remote operation failed. Path: 
>>> /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews
>>> (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not
>>> connected]
>>>
>>>
>>> Could you please tell us why we are getting these traces and how to
>>> resolve this?
>>>
>>> Logs are attached here please share your analysis.
>>>
>>> Thanks in advance
>>>
>>> --
>>> Regards
>>> Abhishek Paliwal
>>>
>>
>>
>>
>> --
>>
>>
>>
>>
>> Regards
>> Abhishek Paliwal
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
> --
> Pranith
>



-- 




Regards
Abhishek Paliwal
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] question about info and info.tmp

2016-11-10 Thread Atin Mukherjee
On Fri, Nov 11, 2016 at 12:38 PM, songxin  wrote:

>
> Hi Atin,
> Thank you for your reply.
>
> As you said, the info file can only be changed in glusterd_store_volinfo(),
> sequentially, because of the big lock.
>
> I have found the similar issue you mentioned, below:
> https://bugzilla.redhat.com/show_bug.cgi?id=1308487
>

Great, this is the similar issue I was trying to refer to in my first email.
Have you had a chance to look at
https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your case,
did you bring down glusterd while an ongoing commit was happening?


>
> You said that "This issue is hit at part of the negative testing where
> while gluster volume set was executed at the same point of time glusterd in
> another instance was brought down. In the faulty node we could see
> /var/lib/glusterd/vols/info file been empty whereas the info.tmp
> file has the correct contents." in comment.
>
> I have two questions for you.
>
> 1. Could you reproduce this issue by running gluster volume set while
> glusterd was brought down?
> 2. Are you certain that this issue is caused by a rename() being interrupted
> in the kernel?
>
> In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both
> empty.
> But in my view only one rename can be running at a time because of the big
> lock.
> Why are both files empty?
>
>
> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be
> running in two threads?
>
> Thanks,
> Xin
>
>
>
>
> On 2016-11-11 14:36:40, "Atin Mukherjee" wrote:
>
>
>
> On Fri, Nov 11, 2016 at 8:33 AM, songxin  wrote:
>
>> Hi Atin,
>>
>> Thank you for your reply.
>> I have two questions for you.
>>
>> 1. Are the two files info and info.tmp only created or changed in
>> glusterd_store_volinfo()? I did not find any other point at which
>> the two files are changed.
>>
>
> If we are talking about the volume's info file then yes, the mentioned
> function takes care of it.
>
>
>> 2. I found that glusterd_store_volinfo() is called at many points by
>> glusterd. Is there a thread-synchronization problem? If so, one thread
>> may open the same file info.tmp with the O_TRUNC flag while another
>> thread is writing info.tmp. Could this happen?
>>
>
> In glusterd, threads are protected by the big lock and I don't see a
> possibility (theoretically) of two glusterd_store_volinfo() calls at a
> given point in time.
>
>
>>
>> Thanks,
>> Xin
>>
>>
>> At 2016-11-10 21:41:06, "Atin Mukherjee"  wrote:
>>
>> Did you run out of disk space by any chance? AFAIK, the code writes the
>> new contents to a .tmp file and renames it over the original file. In
>> case of a disk-space issue I would expect both files to be of non-zero
>> size. That said, I vaguely remember a similar issue (in the form of a bug
>> or an email) landing once that we couldn't reproduce, so my guess is that
>> something is wrong with the atomic update here. I'll be glad if you have
>> a reproducer for it, and then we can dig into it further.
>>
>> On Thu, Nov 10, 2016 at 1:32 PM, songxin  wrote:
>>
>>> Hi,
>>> When I start the glusterd some error happened.
>>> And the log is following.
>>>
>>> [2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main]
>>> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6
>>> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
>>> [2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init]
>>> 0-management: Maximum allowed open file descriptors set to 65536
>>> [2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init]
>>> 0-management: Using /system/glusterd as working directory
>>> [2016-11-08 07:58:35.024508] I [MSGID: 106514]
>>> [glusterd-store.c:2075:glusterd_restore_op_version] 0-management:
>>> Upgrade detected. Setting op-version to minimum : 1
>>> [2016-11-08 07:58:35.025356] E [MSGID: 106206]
>>> [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed
>>> to get next store iter
>>> [2016-11-08 07:58:35.025401] E [MSGID: 106207]
>>> [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed
>>> to update volinfo for c_glusterfs volume
>>> [2016-11-08 07:58:35.025463] E [MSGID: 106201]
>>> [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management:
>>> Unable to restore volume: c_glusterfs
>>> [2016-11-08 07:58:35.025544] E [MSGID: 101019]
>>> [xlator.c:428:xlator_init] 0-management: Initialization of volume
>>> 'management' failed, review your volfile again
>>> [2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init]
>>> 0-management: initializing translator failed
>>> [2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate]
>>> 0-graph: init failed
>>> [2016-11-08 07:58:35.026109] W 

Re: [Gluster-users] getting "Transport endpoint is not connected" in glusterfs mount log file.

2016-11-10 Thread ABHISHEK PALIWAL
Hi Rafi KC,

I have already attached all the logs in my first mail, and I am getting
these logs on the 25th board.

You can find the logs at
logs/d/usr/002500_glusterfiles/varlog_glusterfs/brick/


//Abhishek

On Fri, Nov 11, 2016 at 11:50 AM, Mohammed Rafi K C 
wrote:

> Hi Abhishek,
>
> Could you please check whether your bricks are healthy or not? You can do a
> gluster volume status or look into the logs. If the bricks are not running,
> can you please attach the brick logs from /var/log/glusterfs/bricks/ ?
>
>
> Rafi KC
>
> On 11/11/2016 10:20 AM, ABHISHEK PALIWAL wrote:
>
> Hi,
>
> It's an urgent case.
>
> At least provide your views on this.
>
> On Wed, Nov 9, 2016 at 11:08 AM, ABHISHEK PALIWAL <
> abhishpali...@gmail.com> wrote:
>
>> Hi,
>>
>> We can see that syncing of the GlusterFS bricks is failing with the error
>> trace "Transport endpoint is not connected".
>>
>> [2016-10-31 04:06:03.627395] E [MSGID: 114031]
>> [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9:
>> remote operation failed [Transport endpoint is not connected]
>> [2016-10-31 04:06:03.628381] I [socket.c:3308:socket_submit_request]
>> 0-c_glusterfs-client-9: not connected (priv->connected = 0)
>> [2016-10-31 04:06:03.628432] W [rpc-clnt.c:1586:rpc_clnt_submit]
>> 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f5f Program:
>> GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport
>> (c_glusterfs-client-9)
>> [2016-10-31 04:06:03.628466] E [MSGID: 114031]
>> [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9:
>> remote operation failed [Transport endpoint is not connected]
>> [2016-10-31 04:06:03.628475] I [MSGID: 108019]
>> [afr-lk-common.c:1086:afr_lock_blocking] 0-c_glusterfs-replicate-0:
>> unable to lock on even one child
>> [2016-10-31 04:06:03.628539] I [MSGID: 108019]
>> [afr-transaction.c:1224:afr_post_blocking_inodelk_cbk]
>> 0-c_glusterfs-replicate-0: Blocking inodelks failed.
>> [2016-10-31 04:06:03.628630] W [fuse-bridge.c:1282:fuse_err_cbk]
>> 0-glusterfs-fuse: 20790: FLUSH() ERR => -1 (Transport endpoint is not
>> connected)
>> [2016-10-31 04:06:03.629149] E [rpc-clnt.c:362:saved_frames_unwind] (-->
>> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58]
>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90]
>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10]
>> (--> 
>> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18]
>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808]
>> ) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3)
>> op(LOOKUP(27)) called at 2016-10-31 04:06:03.624346 (xid=0x7f5a)
>> [2016-10-31 04:06:03.629183] I [rpc-clnt.c:1847:rpc_clnt_reconfig]
>> 0-c_glusterfs-client-9: changing port to 49391 (from 0)
>> [2016-10-31 04:06:03.629210] W [MSGID: 114031]
>> [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9:
>> remote operation failed. Path: 
>> /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews
>> (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not
>> connected]
>>
>>
>> Could you please tell us why we are getting these traces and how to
>> resolve this?
>>
>> Logs are attached here please share your analysis.
>>
>> Thanks in advance
>>
>> --
>> Regards
>> Abhishek Paliwal
>>
>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
>
>
> ___
> Gluster-users mailing 
> listGluster-users@gluster.orghttp://www.gluster.org/mailman/listinfo/gluster-users
>
>
>


-- 




Regards
Abhishek Paliwal
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] question about info and info.tmp

2016-11-10 Thread songxin


Hi Atin,
Thank you for your reply.


As you said, the info file can only be changed in glusterd_store_volinfo(), 
sequentially, because of the big lock.


I have found the similar issue you mentioned, below:
https://bugzilla.redhat.com/show_bug.cgi?id=1308487


You said that "This issue is hit at part of the negative testing where while 
gluster volume set was executed at the same point of time glusterd in another 
instance was brought down. In the faulty node we could see 
/var/lib/glusterd/vols/info file been empty whereas the info.tmp file 
has the correct contents." in comment.

I have two questions for you.

1. Could you reproduce this issue by running gluster volume set while glusterd 
was brought down?
2. Are you certain that this issue is caused by a rename() being interrupted in 
the kernel?

In my case two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both 
empty.
But in my view only one rename can be running at a time because of the big 
lock.
Why are both files empty?


Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be 
running in two thread?

Thanks,
Xin




On 2016-11-11 14:36:40, "Atin Mukherjee" wrote:





On Fri, Nov 11, 2016 at 8:33 AM, songxin  wrote:

Hi Atin,


Thank you for your reply.
I have two questions for you.


1. Are the two files info and info.tmp only created or changed in 
glusterd_store_volinfo()? I did not find any other point at which the two 
files are changed.


If we are talking about the volume's info file then yes, the mentioned 
function takes care of it.
 

2. I found that glusterd_store_volinfo() is called at many points by 
glusterd. Is there a thread-synchronization problem? If so, one thread may 
open the same file info.tmp with the O_TRUNC flag while another thread is 
writing info.tmp. Could this happen?


 In glusterd, threads are protected by the big lock and I don't see a 
possibility (theoretically) of two glusterd_store_volinfo() calls at a given 
point in time.
 



Thanks,
Xin



At 2016-11-10 21:41:06, "Atin Mukherjee"  wrote:

Did you run out of disk space by any chance? AFAIK, the code writes the new 
contents to a .tmp file and renames it over the original file. In case of a 
disk-space issue I would expect both files to be of non-zero size. That said, 
I vaguely remember a similar issue (in the form of a bug or an email) landing 
once that we couldn't reproduce, so my guess is that something is wrong with 
the atomic update here. I'll be glad if you have a reproducer for it, and then 
we can dig into it further.



On Thu, Nov 10, 2016 at 1:32 PM, songxin  wrote:

Hi,
When I start the glusterd some error happened.
And the log is following.

[2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main] 
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: 
/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO) 
[2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init] 
0-management: Maximum allowed open file descriptors set to 65536 
[2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init] 
0-management: Using /system/glusterd as working directory
[2016-11-08 07:58:35.024508] I [MSGID: 106514] 
[glusterd-store.c:2075:glusterd_restore_op_version] 0-management: Upgrade 
detected. Setting op-version to minimum : 1 
[2016-11-08 07:58:35.025356] E [MSGID: 106206] 
[glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed to 
get next store iter 
[2016-11-08 07:58:35.025401] E [MSGID: 106207] 
[glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed to 
update volinfo for c_glusterfs volume 
[2016-11-08 07:58:35.025463] E [MSGID: 106201] 
[glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to 
restore volume: c_glusterfs 
[2016-11-08 07:58:35.025544] E [MSGID: 101019] [xlator.c:428:xlator_init] 
0-management: Initialization of volume 'management' failed, review your volfile 
again 
[2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init] 0-management: 
initializing translator failed 
[2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate] 0-graph: 
init failed 
[2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit] 
(-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718] 
-->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8] 
-->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-: received 
signum (0), shutting down 




Then I found that the size of vols/volume_name/info is 0, which caused glusterd 
to shut down.
But vols/volume_name/info.tmp is not 0.
I also found that the brick file vols/volume_name/bricks/.brick is 0 bytes, 
while vols/volume_name/bricks/.brick.tmp is not.


I read the code of glusterd_store_volinfo() in glusterd-store.c.
I know that 

Re: [Gluster-users] question about info and info.tmp

2016-11-10 Thread Atin Mukherjee
On Fri, Nov 11, 2016 at 8:33 AM, songxin  wrote:

> Hi Atin,
>
> Thank you for your reply.
> I have two questions for you.
>
> 1. Are the two files info and info.tmp only created or changed in
> glusterd_store_volinfo()? I did not find any other point at which the
> two files are changed.
>

If we are talking about the volume's info file then yes, the mentioned
function takes care of it.


> 2. I found that glusterd_store_volinfo() is called at many points by
> glusterd. Is there a thread-synchronization problem? If so, one thread
> may open the same file info.tmp with the O_TRUNC flag while another
> thread is writing info.tmp. Could this happen?
>

 In glusterd, threads are protected by the big lock and I don't see a
possibility (theoretically) of two glusterd_store_volinfo() calls at a given
point in time. Conceptually the store path is serialized as in the sketch
below.
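
This is only an illustration with a plain pthread mutex, not glusterd's
actual code; it just shows why two stores cannot overlap while a single
process-wide lock is held around the whole write-tmp/rename sequence
(build with cc -pthread):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

static void *store_volinfo(void *name)
{
    pthread_mutex_lock(&big_lock);
    /* Critical section: write info.tmp, fsync it, rename it to info.
     * While we hold the lock, no other thread can open info.tmp with
     * O_TRUNC, so overlapping stores cannot truncate each other. */
    printf("%s: store done\n", (const char *)name);
    pthread_mutex_unlock(&big_lock);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, store_volinfo, "thread-1");
    pthread_create(&t2, NULL, store_volinfo, "thread-2");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}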


>
> Thanks,
> Xin
>
>
> At 2016-11-10 21:41:06, "Atin Mukherjee"  wrote:
>
> Did you run out of disk space by any chance? AFAIK, the code writes the
> new contents to a .tmp file and renames it over the original file. In case
> of a disk-space issue I would expect both files to be of non-zero size.
> That said, I vaguely remember a similar issue (in the form of a bug or an
> email) landing once that we couldn't reproduce, so my guess is that
> something is wrong with the atomic update here. I'll be glad if you have a
> reproducer for it, and then we can dig into it further.
>
> On Thu, Nov 10, 2016 at 1:32 PM, songxin  wrote:
>
>> Hi,
>> When I start the glusterd some error happened.
>> And the log is following.
>>
>> [2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main]
>> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6
>> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
>> [2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init]
>> 0-management: Maximum allowed open file descriptors set to 65536
>> [2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init]
>> 0-management: Using /system/glusterd as working directory
>> [2016-11-08 07:58:35.024508] I [MSGID: 106514]
>> [glusterd-store.c:2075:glusterd_restore_op_version] 0-management:
>> Upgrade detected. Setting op-version to minimum : 1
>> [2016-11-08 07:58:35.025356] E [MSGID: 106206]
>> [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed
>> to get next store iter
>> [2016-11-08 07:58:35.025401] E [MSGID: 106207]
>> [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed
>> to update volinfo for c_glusterfs volume
>> [2016-11-08 07:58:35.025463] E [MSGID: 106201]
>> [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management:
>> Unable to restore volume: c_glusterfs
>> [2016-11-08 07:58:35.025544] E [MSGID: 101019]
>> [xlator.c:428:xlator_init] 0-management: Initialization of volume
>> 'management' failed, review your volfile again
>> [2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init]
>> 0-management: initializing translator failed
>> [2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate]
>> 0-graph: init failed
>> [2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit]
>> (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718]
>> -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8]
>> -->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-:
>> received signum (0), shutting down
>>
>>
>> Then I found that the size of vols/volume_name/info is 0, which caused
>> glusterd to shut down.
>> But vols/volume_name/info.tmp is not 0.
>> I also found that the brick file vols/volume_name/bricks/.brick is 0
>> bytes, while vols/volume_name/bricks/.brick.tmp is not.
>>
>> I read the code of glusterd_store_volinfo() in glusterd-store.c.
>> I know that info.tmp is renamed to info in
>> glusterd_store_volume_atomic_update().
>>
>> But my question is: why is the info file 0 bytes while info.tmp is not?
>>
>>
>> Thanks,
>> Xin
>>
>>
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>
>
> --
>
> ~ Atin (atinm)
>
>
>
>
>



-- 

~ Atin (atinm)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] getting "Transport endpoint is not connected" in glusterfs mount log file.

2016-11-10 Thread Pranith Kumar Karampuri
As per the logs, the mount is not able to connect to either of the bricks.
Are the connections fine?

On Fri, Nov 11, 2016 at 10:20 AM, ABHISHEK PALIWAL 
wrote:

> Hi,
>
> It's an urgent case.
>
> At least provide your views on this.
>
> On Wed, Nov 9, 2016 at 11:08 AM, ABHISHEK PALIWAL wrote:
>
>> Hi,
>>
>> We can see that syncing of the GlusterFS bricks is failing with the error
>> trace "Transport endpoint is not connected".
>>
>> [2016-10-31 04:06:03.627395] E [MSGID: 114031]
>> [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9:
>> remote operation failed [Transport endpoint is not connected]
>> [2016-10-31 04:06:03.628381] I [socket.c:3308:socket_submit_request]
>> 0-c_glusterfs-client-9: not connected (priv->connected = 0)
>> [2016-10-31 04:06:03.628432] W [rpc-clnt.c:1586:rpc_clnt_submit]
>> 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f5f Program:
>> GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport
>> (c_glusterfs-client-9)
>> [2016-10-31 04:06:03.628466] E [MSGID: 114031]
>> [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9:
>> remote operation failed [Transport endpoint is not connected]
>> [2016-10-31 04:06:03.628475] I [MSGID: 108019]
>> [afr-lk-common.c:1086:afr_lock_blocking] 0-c_glusterfs-replicate-0:
>> unable to lock on even one child
>> [2016-10-31 04:06:03.628539] I [MSGID: 108019]
>> [afr-transaction.c:1224:afr_post_blocking_inodelk_cbk]
>> 0-c_glusterfs-replicate-0: Blocking inodelks failed.
>> [2016-10-31 04:06:03.628630] W [fuse-bridge.c:1282:fuse_err_cbk]
>> 0-glusterfs-fuse: 20790: FLUSH() ERR => -1 (Transport endpoint is not
>> connected)
>> [2016-10-31 04:06:03.629149] E [rpc-clnt.c:362:saved_frames_unwind] (-->
>> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58]
>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90]
>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10]
>> (--> 
>> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18]
>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808]
>> ) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3)
>> op(LOOKUP(27)) called at 2016-10-31 04:06:03.624346 (xid=0x7f5a)
>> [2016-10-31 04:06:03.629183] I [rpc-clnt.c:1847:rpc_clnt_reconfig]
>> 0-c_glusterfs-client-9: changing port to 49391 (from 0)
>> [2016-10-31 04:06:03.629210] W [MSGID: 114031]
>> [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9:
>> remote operation failed. Path: 
>> /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews
>> (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not
>> connected]
>>
>>
>> Could you please tell us why we are getting these traces and how to
>> resolve this?
>>
>> Logs are attached here please share your analysis.
>>
>> Thanks in advance
>>
>> --
>> Regards
>> Abhishek Paliwal
>>
>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"

2016-11-10 Thread Mohammed Rafi K C


On 11/10/2016 09:05 PM, Raghavendra G wrote:
>
>
> On Thu, Nov 10, 2016 at 8:57 PM, Vijay Bellur wrote:
>
> On Thu, Nov 10, 2016 at 3:17 AM, Nithya Balachandran
> > wrote:
> >
> >
> > On 8 November 2016 at 20:21, Kyle Johnson wrote:
> >>
> >> Hey there,
> >>
> >> We have a number of processes which daily walk our entire directory tree
> >> and perform operations on the found files.
> >>
> >> Pre-gluster, this process was able to complete within 24 hours of
> >> starting. After outgrowing that single server and moving to a gluster
> >> setup (two bricks, two servers, distribute, 10gig uplink), the process
> >> became unusable.
> >>
> >> After turning this option on, we were back to normal run times, with the
> >> process completing within 24 hours.
> >>
> >> Our data is heavily nested in a large number of subfolders under
> >> /media/ftp.
> >
> >
> > Thanks for getting back to us - this is very good information. Can you
> > provide a few more details?
> >
> > How deep is your directory tree and roughly how many directories do you
> > have at each level?
> > Are all your files in the lowest level dirs or do they exist on several
> > levels?
> > Would you be willing to provide the gluster volume info output for this
> > volume?
> >>
>
>
> I have had a performance improvement with this option when the first
> level below the root consisted of several thousand directories
> without any files. IIRC, I was testing this in a 16 x 2 setup.
>
>
> Yes Vijay. I remember you mentioning it. This option is expected to
> boost readdir performance only on a directory containing
> subdirectories; for files it has no effect.
>
> On a similar note, I think we can also skip linkto files in readdirp
> (on the brick) as dht_readdirp picks the dentry from the subvol
> containing the data file.

Doing so will break tier_readdirp.

Rafi KC

>
>
> Regards,
> Vijay
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
>
> -- 
> Raghavendra G
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] getting "Transport endpoint is not connected" in glusterfs mount log file.

2016-11-10 Thread Raghavendra Talur
On 11-Nov-2016 11:50, "Mohammed Rafi K C"  wrote:
>
> Hi Abhishek,
>
> Could you please check whether your bricks are healthy or not? You can do a
gluster volume status or look into the logs. If the bricks are not running,
can you please attach the brick logs from /var/log/glusterfs/bricks/ ?
>
>
> Rafi KC

Also, please provide details of the Gluster version and setup.
>
>
> On 11/11/2016 10:20 AM, ABHISHEK PALIWAL wrote:
>>
>> Hi,
>>
>> It's an urgent case.
>>
>> At least provide your views on this.
>>
>> On Wed, Nov 9, 2016 at 11:08 AM, ABHISHEK PALIWAL <
abhishpali...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> We can see that syncing of the GlusterFS bricks is failing with the
error trace "Transport endpoint is not connected".
>>>
>>> [2016-10-31 04:06:03.627395] E [MSGID: 114031]
[client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9:
remote operation failed [Transport endpoint is not connected]
>>> [2016-10-31 04:06:03.628381] I [socket.c:3308:socket_submit_request]
0-c_glusterfs-client-9: not connected (priv->connected = 0)
>>> [2016-10-31 04:06:03.628432] W [rpc-clnt.c:1586:rpc_clnt_submit]
0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f5f Program:
GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport
(c_glusterfs-client-9)
>>> [2016-10-31 04:06:03.628466] E [MSGID: 114031]
[client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9:
remote operation failed [Transport endpoint is not connected]
>>> [2016-10-31 04:06:03.628475] I [MSGID: 108019]
[afr-lk-common.c:1086:afr_lock_blocking] 0-c_glusterfs-replicate-0: unable
to lock on even one child
>>> [2016-10-31 04:06:03.628539] I [MSGID: 108019]
[afr-transaction.c:1224:afr_post_blocking_inodelk_cbk]
0-c_glusterfs-replicate-0: Blocking inodelks failed.
>>> [2016-10-31 04:06:03.628630] W [fuse-bridge.c:1282:fuse_err_cbk]
0-glusterfs-fuse: 20790: FLUSH() ERR => -1 (Transport endpoint is not
connected)
>>> [2016-10-31 04:06:03.629149] E [rpc-clnt.c:362:saved_frames_unwind]
(-->
/usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58]
(--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90]
(--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10]
(-->
/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18]
(--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808]
) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3)
op(LOOKUP(27)) called at 2016-10-31 04:06:03.624346 (xid=0x7f5a)
>>> [2016-10-31 04:06:03.629183] I [rpc-clnt.c:1847:rpc_clnt_reconfig]
0-c_glusterfs-client-9: changing port to 49391 (from 0)
>>> [2016-10-31 04:06:03.629210] W [MSGID: 114031]
[client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9:
remote operation failed. Path:
/loadmodules_norepl/CXC1725605_P93A001/cello/emasviews
(b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not connected]
>>>
>>>
>>> Could you please tell us why we are getting these traces and
how to resolve this?
>>>
>>> Logs are attached here please share your analysis.
>>>
>>> Thanks in advance
>>>
>>> --
>>> Regards
>>> Abhishek Paliwal
>>
>>
>>
>>
>> --
>>
>>
>>
>>
>> Regards
>> Abhishek Paliwal
>>
>>
>> ___ Gluster-users mailing
list Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] getting "Transport endpoint is not connected" in glusterfs mount log file.

2016-11-10 Thread Mohammed Rafi K C
Hi Abhishek,

Could you please check whether your bricks are healthy or not? You can
do a gluster volume status or look into the logs. If the bricks are not
running, can you please attach the brick logs from
/var/log/glusterfs/bricks/ ?


Rafi KC


On 11/11/2016 10:20 AM, ABHISHEK PALIWAL wrote:
> Hi,
>
> It's an urgent case.
>
> At least provide your views on this.
>
> On Wed, Nov 9, 2016 at 11:08 AM, ABHISHEK PALIWAL wrote:
>
> Hi,
>
> We can see that syncing of the GlusterFS bricks is failing with the
> error trace "Transport endpoint is not connected".
>
> [2016-10-31 04:06:03.627395] E [MSGID: 114031]
> [client-rpc-fops.c:1673:client3_3_finodelk_cbk]
> 0-c_glusterfs-client-9: remote operation failed [Transport
> endpoint is not connected]
> [2016-10-31 04:06:03.628381] I
> [socket.c:3308:socket_submit_request] 0-c_glusterfs-client-9: not
> connected (priv->connected = 0)
> [2016-10-31 04:06:03.628432] W [rpc-clnt.c:1586:rpc_clnt_submit]
> 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f5f
> Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport
> (c_glusterfs-client-9)
> [2016-10-31 04:06:03.628466] E [MSGID: 114031]
> [client-rpc-fops.c:1673:client3_3_finodelk_cbk]
> 0-c_glusterfs-client-9: remote operation failed [Transport
> endpoint is not connected]
> [2016-10-31 04:06:03.628475] I [MSGID: 108019]
> [afr-lk-common.c:1086:afr_lock_blocking]
> 0-c_glusterfs-replicate-0: unable to lock on even one child
> [2016-10-31 04:06:03.628539] I [MSGID: 108019]
> [afr-transaction.c:1224:afr_post_blocking_inodelk_cbk]
> 0-c_glusterfs-replicate-0: Blocking inodelks failed.
> [2016-10-31 04:06:03.628630] W [fuse-bridge.c:1282:fuse_err_cbk]
> 0-glusterfs-fuse: 20790: FLUSH() ERR => -1 (Transport endpoint is
> not connected)
> [2016-10-31 04:06:03.629149] E
> [rpc-clnt.c:362:saved_frames_unwind] (-->
> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58]
> (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90]
> (-->
> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10]
> (-->
> 
> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18]
> (-->
> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808]
> ) 0-c_glusterfs-client-9: forced unwinding frame
> type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31
> 04:06:03.624346 (xid=0x7f5a)
> [2016-10-31 04:06:03.629183] I [rpc-clnt.c:1847:rpc_clnt_reconfig]
> 0-c_glusterfs-client-9: changing port to 49391 (from 0)
> [2016-10-31 04:06:03.629210] W [MSGID: 114031]
> [client-rpc-fops.c:2971:client3_3_lookup_cbk]
> 0-c_glusterfs-client-9: remote operation failed. Path:
> /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews
> (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not
> connected]
>
>
> Could you please tell us why we are getting these traces
> and how to resolve this?
>
> Logs are attached here please share your analysis.
>
> Thanks in advance
>
> -- 
> Regards
> Abhishek Paliwal
>
>
>
>
> -- 
>
>
>
>
> Regards
> Abhishek Paliwal
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] question about info and info.tmp

2016-11-10 Thread songxin
Hi Atin,


Thank you for your reply.
I have two questions for you.


1. Are the two files info and info.tmp only created or changed in 
glusterd_store_volinfo()? I did not find any other point at which the two 
files are changed.
2. I found that glusterd_store_volinfo() is called at many points by 
glusterd. Is there a thread-synchronization problem? If so, one thread may 
open the same file info.tmp with the O_TRUNC flag while another thread is 
writing info.tmp. Could this happen?
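
To make question 2 concrete, the interleaving I am asking about would look
like the stand-alone illustration below. It is plain POSIX C, not glusterd
code; it only shows that if two stores could overlap, the second open() with
O_TRUNC would make the first store's rename() publish an empty info file
(build with cc -pthread):

#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

/* First store: writes info.tmp, then renames it to info. */
static void *storer(void *arg)
{
    (void)arg;
    int fd = open("info.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd >= 0) {
        const char *buf = "volume info contents\n";
        write(fd, buf, strlen(buf));
        close(fd);
    }
    usleep(200000);              /* window before the rename */
    rename("info.tmp", "info");  /* publishes whatever info.tmp holds now */
    return NULL;
}

/* Second store starting inside that window: its O_TRUNC empties
 * info.tmp, so the rename above publishes a zero-byte file. */
static void *truncator(void *arg)
{
    (void)arg;
    usleep(100000);
    int fd = open("info.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd >= 0)
        close(fd);               /* stops before writing anything */
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&b, NULL, storer, NULL);
    pthread_create(&a, NULL, truncator, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;                    /* "info" ends up 0 bytes */
}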


Thanks,
Xin



At 2016-11-10 21:41:06, "Atin Mukherjee"  wrote:

Did you run out of disk space by any chance? AFAIK, the code writes the new 
contents to a .tmp file and renames it over the original file. In case of a 
disk-space issue I would expect both files to be of non-zero size. That said, 
I vaguely remember a similar issue (in the form of a bug or an email) landing 
once that we couldn't reproduce, so my guess is that something is wrong with 
the atomic update here. I'll be glad if you have a reproducer for it, and then 
we can dig into it further.



On Thu, Nov 10, 2016 at 1:32 PM, songxin  wrote:

Hi,
When I start the glusterd some error happened.
And the log is following.

[2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main] 
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: 
/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO) 
[2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init] 
0-management: Maximum allowed open file descriptors set to 65536 
[2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init] 
0-management: Using /system/glusterd as working directory
[2016-11-08 07:58:35.024508] I [MSGID: 106514] 
[glusterd-store.c:2075:glusterd_restore_op_version] 0-management: Upgrade 
detected. Setting op-version to minimum : 1 
[2016-11-08 07:58:35.025356] E [MSGID: 106206] 
[glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed to 
get next store iter 
[2016-11-08 07:58:35.025401] E [MSGID: 106207] 
[glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed to 
update volinfo for c_glusterfs volume 
[2016-11-08 07:58:35.025463] E [MSGID: 106201] 
[glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to 
restore volume: c_glusterfs 
[2016-11-08 07:58:35.025544] E [MSGID: 101019] [xlator.c:428:xlator_init] 
0-management: Initialization of volume 'management' failed, review your volfile 
again 
[2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init] 0-management: 
initializing translator failed 
[2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate] 0-graph: 
init failed 
[2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit] 
(-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718] 
-->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8] 
-->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-: received 
signum (0), shutting down 




Then I found that the size of vols/volume_name/info is 0, which caused glusterd 
to shut down.
But vols/volume_name/info.tmp is not 0.
I also found that the brick file vols/volume_name/bricks/.brick is 0 bytes, 
while vols/volume_name/bricks/.brick.tmp is not.


I read the code of glusterd_store_volinfo() in glusterd-store.c.
I know that info.tmp is renamed to info in 
glusterd_store_volume_atomic_update().


But my question is: why is the info file 0 bytes while info.tmp is not?




Thanks,
Xin




 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users




--



~ Atin (atinm)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Can not install gluster-server on CentOS 7

2016-11-10 Thread Alexandr Porunov
Thank you very much! It works

Sincerely,
Alexandr

On Thu, Nov 10, 2016 at 9:33 PM, David Gossage 
wrote:

> On Thu, Nov 10, 2016 at 1:24 PM, Alexandr Porunov <
> alexandr.poru...@gmail.com> wrote:
>
>> Hello,
>>
>> I am following the Quick Start Guide. I have installed epel-release-7.8 but
>> I cannot install glusterfs. What shall I do?
>>
>> Here is the output:
>> # yum install glusterfs-server
>> Loaded plugins: fastestmirror, priorities
>> Loading mirror speeds from cached hostfile
>>  * base: mirror.mirohost.net
>>  * extras: mirror.besthosting.ua
>>  * updates: mirror.besthosting.ua
>> No package glusterfs-server available.
>> Error: Nothing to do
>>
>
> Did you go here? CentOS 7 uses the SIG repo:
> https://wiki.centos.org/SpecialInterestGroup/Storage
>
> yum install centos-release-gluster (though I'd probably
> install centos-release-gluster38)
> yum install glusterfs-server
>
>>
>> Sincerely,
>> Alexandr
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Can not install gluster-server on CentOS 7

2016-11-10 Thread David Gossage
On Thu, Nov 10, 2016 at 1:24 PM, Alexandr Porunov <
alexandr.poru...@gmail.com> wrote:

> Hello,
>
> I am following the Quick Start Guide. I have installed epel-release-7.8 but I
> cannot install glusterfs. What shall I do?
>
> Here is the output:
> # yum install glusterfs-server
> Loaded plugins: fastestmirror, priorities
> Loading mirror speeds from cached hostfile
>  * base: mirror.mirohost.net
>  * extras: mirror.besthosting.ua
>  * updates: mirror.besthosting.ua
> No package glusterfs-server available.
> Error: Nothing to do
>

Did you go here? CentOS 7 uses the SIG repo:
https://wiki.centos.org/SpecialInterestGroup/Storage

yum install centos-release-gluster (though I'd probably
install centos-release-gluster38)
yum install glusterfs-server

>
> Sincerely,
> Alexandr
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Can not install gluster-server on CentOS 7

2016-11-10 Thread Alexandr Porunov
Hello,

I am following the Quick Start Guide. I have installed epel-release-7.8 but I
cannot install glusterfs. What shall I do?

Here is the output:
# yum install glusterfs-server
Loaded plugins: fastestmirror, priorities
Loading mirror speeds from cached hostfile
 * base: mirror.mirohost.net
 * extras: mirror.besthosting.ua
 * updates: mirror.besthosting.ua
No package glusterfs-server available.
Error: Nothing to do

Sincerely,
Alexandr
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"

2016-11-10 Thread Raghavendra G
On Thu, Nov 10, 2016 at 8:57 PM, Vijay Bellur  wrote:

> On Thu, Nov 10, 2016 at 3:17 AM, Nithya Balachandran
>  wrote:
> >
> >
> > On 8 November 2016 at 20:21, Kyle Johnson  wrote:
> >>
> >> Hey there,
> >>
> >> We have a number of processes which daily walk our entire directory tree
> >> and perform operations on the found files.
> >>
> >> Pre-gluster, this process was able to complete within 24 hours of
> >> starting. After outgrowing that single server and moving to a gluster
> >> setup (two bricks, two servers, distribute, 10gig uplink), the process
> >> became unusable.
> >>
> >> After turning this option on, we were back to normal run times, with the
> >> process completing within 24 hours.
> >>
> >> Our data is heavily nested in a large number of subfolders under
> >> /media/ftp.
> >
> >
> > Thanks for getting back to us - this is very good information. Can you
> > provide a few more details?
> >
> > How deep is your directory tree and roughly how many directories do you
> > have at each level?
> > Are all your files in the lowest level dirs or do they exist on several
> > levels?
> > Would you be willing to provide the gluster volume info output for this
> > volume?
> >>
>
>
> I have had a performance improvement with this option when the first
> level below the root consisted of several thousand directories
> without any files. IIRC, I was testing this in a 16 x 2 setup.
>

Yes Vijay. I remember you mentioning it. This option is expected to boost
readdir performance only on a directory containing subdirectories; for files
it has no effect.

On a similar note, I think we can also skip linkto files in readdirp (on the
brick) as dht_readdirp picks the dentry from the subvol containing the data
file. The idea, roughly, is sketched below.
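
A simplified stand-alone sketch of that idea follows. It is not the actual
DHT code, and the xattr name and skip logic here are illustrative
assumptions: a linkto file is a zero-byte placeholder whose linkto xattr
names the subvolume holding the real data, so a server-side listing can drop
any entry carrying that xattr.

#include <dirent.h>
#include <stdio.h>
#include <sys/xattr.h>

/* Illustrative xattr name; the real one is defined by DHT. */
#define DHT_LINKTO_XATTR "trusted.glusterfs.dht.linkto"

static int list_skipping_linkto(const char *brickdir)
{
    DIR *d = opendir(brickdir);
    if (!d)
        return -1;
    struct dirent *e;
    char path[4096];
    while ((e = readdir(d)) != NULL) {
        snprintf(path, sizeof(path), "%s/%s", brickdir, e->d_name);
        /* An entry carrying the linkto xattr is a placeholder; the
         * data lives on another subvolume, so skip the dentry. */
        if (getxattr(path, DHT_LINKTO_XATTR, NULL, 0) >= 0)
            continue;
        puts(e->d_name);
    }
    closedir(d);
    return 0;
}

int main(void)
{
    return list_skipping_linkto(".") ? 1 : 0;
}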


> Regards,
> Vijay
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Raghavendra G
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"

2016-11-10 Thread Vijay Bellur
On Thu, Nov 10, 2016 at 3:17 AM, Nithya Balachandran
 wrote:
>
>
> On 8 November 2016 at 20:21, Kyle Johnson  wrote:
>>
>> Hey there,
>>
>> We have a number of processes which daily walk our entire directory tree
>> and perform operations on the found files.
>>
>> Pre-gluster, this process was able to complete within 24 hours of
>> starting. After outgrowing that single server and moving to a gluster setup
>> (two bricks, two servers, distribute, 10gig uplink), the process became
>> unusable.
>>
>> After turning this option on, we were back to normal run times, with the
>> process completing within 24 hours.
>>
>> Our data is heavily nested in a large number of subfolders under /media/ftp.
>
>
> Thanks for getting back to us - this is very good information. Can you
> provide a few more details?
>
> How deep is your directory tree and roughly how many directories do you have
> at each level?
> Are all your files in the lowest level dirs or do they exist on several
> levels?
> Would you be willing to provide the gluster volume info output for this
> volume?
>>


I have had a performance improvement with this option when the first
level below the root consisted of several thousand directories
without any files. IIRC, I was testing this in a 16 x 2 setup.

Regards,
Vijay
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Transport endpoint is not connected

2016-11-10 Thread Joe Julian
Your first step is to look at your client logs.

On November 10, 2016 2:31:02 AM PST, Cory Sanders 
 wrote:
>We removed a server from our cluster: node4
>
>
>Now, on node1,  when I type df -h
>I get this:
>
>root@node1:/mnt/pve/machines# df -h
>
>df: `/mnt/pve/machines0': Transport endpoint is not connected
>
>typing # mount
>
>Produced this:
>
>node4:machines0 on /mnt/pve/machines0 type fuse.glusterfs
>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>
>
>I did this: # umount /mnt/pve/machines0
>
>And now a df -h produces nothing.  The screen just hangs there with no
>information.
>
>The same on node0 and node3.  On node0 I did not unmount anything and I
>get this:
>
>root@node0:/mnt/pve/machines1# df -h
>df: `/mnt/pve/machines0': Transport endpoint is not connected
>
>node0 mount entry is this:
>
>
>node4:machines0 on /mnt/pve/machines0 type fuse.glusterfs
>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>
>node3 mount entry is this:
>
>
>node4:machines0 on /mnt/pve/machines0 type fuse.glusterfs
>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
>
>
>
>My Load Averages are in the 8s and should be in the 1s
>
>Thanks.
>
>___
>Gluster-users mailing list
>Gluster-users@gluster.org
>http://www.gluster.org/mailman/listinfo/gluster-users

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] question about info and info.tmp

2016-11-10 Thread Atin Mukherjee
Did you run out of disk space by any chance? AFAIK, the code writes the new
contents to a .tmp file and then renames it over the original file. In the
case of a disk space issue I would expect both files to be of non-zero size.
Having said that, I vaguely remember a similar issue (in the form of a bug
or an email) landing up once that we couldn't reproduce, so my guess is that
something is wrong with the atomic update here. I'll be glad if you have a
reproducer for it, and then we can dig into it further.
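
To make that pattern concrete, here is a minimal stand-alone sketch of it
(the helper name and the sample contents are illustrative, not glusterd's
actual code):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write the new contents to <path>.tmp, flush, then rename over <path>.
 * rename() replaces the old file atomically, but without the fsync() a
 * badly timed crash can journal the rename before the tmp file's data
 * blocks reach disk, leaving a zero-length target on some filesystems. */
static int store_atomic_update(const char *path, const char *buf, size_t len)
{
    char tmp[4096];
    int fd, ret = -1;

    snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, buf, len) == (ssize_t)len && fsync(fd) == 0 &&
        rename(tmp, path) == 0)
        ret = 0;    /* an fsync() of the parent dir would add durability */
    close(fd);
    return ret;
}

int main(void)
{
    const char *info = "type=2\ncount=2\nstatus=1\n";

    return store_atomic_update("info", info, strlen(info));
}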

On Thu, Nov 10, 2016 at 1:32 PM, songxin  wrote:

> Hi,
> When I start glusterd, some errors happen.
> The log follows.
>
> [2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main]
> 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6
> (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
> [2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init]
> 0-management: Maximum allowed open file descriptors set to 65536
> [2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init]
> 0-management: Using /system/glusterd as working directory
> [2016-11-08 07:58:35.024508] I [MSGID: 106514] 
> [glusterd-store.c:2075:glusterd_restore_op_version]
> 0-management: Upgrade detected. Setting op-version to minimum : 1
> [2016-11-08 07:58:35.025356] E [MSGID: 106206]
> [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed
> to get next store iter
> [2016-11-08 07:58:35.025401] E [MSGID: 106207]
> [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed
> to update volinfo for c_glusterfs volume
> [2016-11-08 07:58:35.025463] E [MSGID: 106201]
> [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management:
> Unable to restore volume: c_glusterfs
> [2016-11-08 07:58:35.025544] E [MSGID: 101019] [xlator.c:428:xlator_init]
> 0-management: Initialization of volume 'management' failed, review your
> volfile again
> [2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init]
> 0-management: initializing translator failed
> [2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate]
> 0-graph: init failed
> [2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit]
> (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718]
> -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8]
> -->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-:
> received signum (0), shutting down
>
>
> And then I found that the size of vols/volume_name/info is 0. That causes
> glusterd to shut down.
> But I found that vols/volume_name/info.tmp is not 0.
> And I found that the brick file vols/volume_name/bricks/.brick is also
> 0, while vols/volume_name/bricks/.brick.tmp is not 0.
>
> I read the code of glusterd_store_volinfo() in glusterd-store.c.
> I know that info.tmp is renamed to info in
> glusterd_store_volume_atomic_update().
>
> But my question is: why is the info file 0 while info.tmp is not 0?
>
>
> Thanks,
> Xin
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 

~ Atin (atinm)
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] FSFE pads to github wiki / alternative etherpad - info. required

2016-11-10 Thread Saravanakumar Arumugam

Hi,

I am working on moving the fsfe pads to the github wiki (as discussed in
the Gluster community meeting yesterday).

I have identified the following links in the fsfe etherpad.

I need your help to check whether each link (maybe created by you) needs
to be moved to the github wiki.

Also, please let me know if there is any other link you wish to add.

Note:
Only items which will "not change" will be moved to the github wiki
(example - meeting status, meeting template).

Items which need to be updated (read: real-time collaboration) will NOT be
moved to the github wiki
(example - bugs to triage, updated in real time by multiple users).
We need to identify an alternative "etherpad" for these.

Now the links:
=
https://public.pad.fsfe.org/p/gluster-bug-triage

https://public.pad.fsfe.org/p/gluster-bugs-to-triage

https://public.pad.fsfe.org/p/gluster-community-meetings

https://public.pad.fsfe.org/p/gluster-3.7-hangouts

https://public.pad.fsfe.org/p/glusterfs-release-process-201606

https://public.pad.fsfe.org/p/glusterfs-compound-fops

https://public.pad.fsfe.org/p/glusterfs-3.8-release-notes

https://public.pad.fsfe.org/p/gluster-spurious-failures

https://public.pad.fsfe.org/p/gluster-automated-bug-workflow

https://public.pad.fsfe.org/p/gluster-3.8-features

https://public.pad.fsfe.org/p/gluster-login-issues

https://public.pad.fsfe.org/p/dht_lookup_optimize

https://public.pad.fsfe.org/p/gluster-gerrit-migration

https://public.pad.fsfe.org/p/gluster-component-release-checklist

https://public.pad.fsfe.org/p/glusterfs-bitrot-notes

https://public.pad.fsfe.org/p/review-for-glusterfs-3.7

https://public.pad.fsfe.org/p/gluster-xattr-categorization

https://public.pad.fsfe.org/p/Snapshots_in_glusterfs

https://public.pad.fsfe.org/p/gluster-gd2-kaushal

https://public.pad.fsfe.org/p/gluster-events

https://public.pad.fsfe.org/p/gluster-slogans

https://public.pad.fsfe.org/p/gluster-weekly-news

https://public.pad.fsfe.org/p/gluster-next-planning

https://public.pad.fsfe.org/p/gluster-heketi
==


Thanks,
Saravanakumar




___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Transport endpoint is not connected

2016-11-10 Thread Cory Sanders
We removed a server from our cluster: node4


Now, on node1, when I type df -h
I get this:

root@node1:/mnt/pve/machines# df -h

df: `/mnt/pve/machines0': Transport endpoint is not connected

Typing # mount

produced this:

node4:machines0 on /mnt/pve/machines0 type fuse.glusterfs 
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)


I did this: # umount /mnt/pve/machines0

And now a df -h produces nothing.  The screen just hangs there with no 
information.

The same happens on node0 and node3.  On node0 I did not unmount anything and I get 
this:

root@node0:/mnt/pve/machines1# df -h
df: `/mnt/pve/machines0': Transport endpoint is not connected

node0 mount entry is this:


node4:machines0 on /mnt/pve/machines0 type fuse.glusterfs 
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

node3 mount entry is this:


node4:machines0 on /mnt/pve/machines0 type fuse.glusterfs 
(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)



My load averages are in the 8s when they should be in the 1s.

Thanks.








___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Feedback on DHT option "cluster.readdir-optimize"

2016-11-10 Thread Nithya Balachandran
On 8 November 2016 at 20:21, Kyle Johnson  wrote:

> Hey there,
>
> We have a number of processes which daily walk our entire directory tree
> and perform operations on the found files.
>
> Pre-gluster, this process was able to complete within 24 hours of
> starting.  After outgrowing that single server and moving to a gluster
> setup (two bricks, two servers, distribute, 10gig uplink), the processes
> became unusable.
>
> After turning this option on, we were back to normal run times, with the
> process completing within 24 hours.
>
> Our data is heavily nested in a large number of subfolders under /media/ftp.
>

Thanks for getting back to us - this is very good information. Can you
provide a few more details?

How deep is your directory tree and roughly how many directories do you
have at each level?
Are all your files in the lowest level dirs or do they exist on several
levels?
Would you be willing to provide the gluster volume info output for this
volume?

>
> A subset of our data:
>
> 15T of files in 48163 directories under /media/ftp/dig_dis.
>
> Without readdir-optimize:
>
> [root@colossus dig_dis]# time ls|wc -l
> 48163
>
> real    13m1.582s
> user    0m0.294s
> sys     0m0.205s
>
>
> With readdir-optimize:
>
> [root@colossus dig_dis]# time ls | wc -l
> 48163
>
> real    0m23.785s
> user    0m0.296s
> sys     0m0.108s
>
>
> Long story short - this option is super important to me as it resolved an
> issue that would have otherwise made me move my data off of gluster.
>
>
> Thank you for all of your work,
>
> Kyle
>
>
>
>
>
> On 11/07/2016 10:07 PM, Raghavendra Gowdappa wrote:
>
>> Hi all,
>>
>> We have an option called "cluster.readdir-optimize" which alters the
>> behavior of readdirp in DHT. This value affects how storage/posix treats
>> dentries corresponding to directories (not those for files).
>>
>> When this value is on,
>> * DHT asks only one subvol/brick to return dentries corresponding to
>> directories.
>> * Other subvols/bricks filter dentries corresponding to directories and
>> send only dentries corresponding to files.
>>
>> When this value is off (this is the default value),
>> * All subvols return all dentries stored on them. IOW, bricks don't
>> filter any dentries.
>> * Since a directory has one dentry representing it on each subvol, dht
>> (loaded on the client) picks up the dentry only from the hashed subvol.
>>
>> Note that irrespective of value of this option, _all_ subvols return
>> dentries corresponding to files which are stored on them.
>>
>> This option was introduced to boost readdir performance: when set on,
>> filtering of dentries happens on the bricks and hence there is reduced:
>> 1. network traffic (the redundant dentry information is filtered out)
>> 2. number of readdir calls between client and server for the same number
>> of dentries returned to the application (if filtering happens on the
>> client, each result carries fewer dentries and hence more readdir calls
>> are needed; IOW, the result buffer is not filled to capacity).
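
A toy model of that merge, with a made-up hash standing in for DHT's real
layout-based hashing and made-up dentries, shows every subvol shipping the
directory dentry and the client dropping all but the hashed subvol's copy:

/* Toy model of the client-side merge with readdir-optimize off: every
 * subvol returns every dentry it holds, and the client keeps a directory
 * dentry only when it came from that directory's hashed subvol. The other
 * copies crossed the wire just to be dropped here. */
#include <stdio.h>

#define NSUBVOLS 2

struct dentry { const char *name; int is_dir; };

static unsigned toy_hash(const char *name)   /* stand-in, not DHT's hash */
{
    unsigned h = 5381;
    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h % NSUBVOLS;
}

/* Merge one subvol's readdirp result into the listing. */
static void merge_from_subvol(int subvol, const struct dentry *ents, int n)
{
    for (int i = 0; i < n; i++) {
        if (ents[i].is_dir && toy_hash(ents[i].name) != (unsigned)subvol)
            continue;    /* duplicate directory dentry: drop it */
        printf("keep %-6s (from subvol %d)\n", ents[i].name, subvol);
    }
}

int main(void)
{
    /* The directory "docs" exists on both subvols; each file on one. */
    const struct dentry sv0[] = { { "docs", 1 }, { "a.txt", 0 } };
    const struct dentry sv1[] = { { "docs", 1 }, { "b.txt", 0 } };

    merge_from_subvol(0, sv0, 2);
    merge_from_subvol(1, sv1, 2);
    return 0;
}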
>>
>> We want to hear from you whether you've used this option, and if yes:
>> 1. Did it really boost readdir performance?
>> 2. Do you have any performance data showing the percentage of
>> improvement (or deterioration)?
>> 3. What data set did you have (number of files and directories, and how
>> the directories were organised)?
>>
>> If we find out that this option is really helping you, we can spend our
>> energies on fixing the issues that arise when it is set to on. One common
>> issue with turning this option on is that some directories might not show
>> up in directory listings [1]. The reason for this is that:
>> 1. If a directory can be created on its hashed subvol, the mkdir (as
>> reported to the application) is successful, irrespective of the result of
>> mkdir on the rest of the subvols.
>> 2. So, the one subvol we pick to give us directory dentries need not
>> contain all the directories, and we may miss the absent ones in the
>> listing.
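
A small made-up scenario shows how a directory can then drop out of the
listing once only one subvol is asked for directory dentries:

/* Why a directory can vanish from listings with readdir-optimize on:
 * mkdir is reported successful as long as it succeeded on the hashed
 * subvol, so the one subvol DHT asks for directory dentries may be
 * missing some directories. Names and subvol choice here are made up. */
#include <stdio.h>
#include <string.h>

struct subvol { const char *dirs[4]; };

static int has_dir(const struct subvol *sv, const char *name)
{
    for (int i = 0; sv->dirs[i] != NULL; i++)
        if (strcmp(sv->dirs[i], name) == 0)
            return 1;
    return 0;
}

int main(void)
{
    /* "photos" was created while subvol 0 was down; mkdir still returned
     * success because it worked on the hashed subvol (subvol 1 here). */
    const struct subvol sv[2] = {
        { { "docs", "music", NULL } },
        { { "docs", "music", "photos", NULL } },
    };

    puts("listing with readdir-optimize on (dirs taken from subvol 0 only):");
    for (int i = 0; sv[0].dirs[i] != NULL; i++)
        puts(sv[0].dirs[i]);                /* "photos" never shows up */

    printf("\n'photos' exists on subvol 1? %s\n",
           has_dir(&sv[1], "photos") ? "yes" : "no");
    return 0;
}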
>>
>> Your feedback is important for us and will help us to prioritize and
>> improve things.
>>
>> [1] https://www.gluster.org/pipermail/gluster-users/2016-October
>> /028703.html
>>
>> regards,
>> Raghavendra
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>
> ___
> Gluster-devel mailing list
> gluster-de...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] question about info and info.tmp

2016-11-10 Thread songxin
Hi,
When I start glusterd, some errors happen.
The log follows.

[2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main] 
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: 
/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO) 
[2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init] 
0-management: Maximum allowed open file descriptors set to 65536 
[2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init] 
0-management: Using /system/glusterd as working directory
[2016-11-08 07:58:35.024508] I [MSGID: 106514] 
[glusterd-store.c:2075:glusterd_restore_op_version] 0-management: Upgrade 
detected. Setting op-version to minimum : 1 
[2016-11-08 07:58:35.025356] E [MSGID: 106206] 
[glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed to 
get next store iter 
[2016-11-08 07:58:35.025401] E [MSGID: 106207] 
[glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed to 
update volinfo for c_glusterfs volume 
[2016-11-08 07:58:35.025463] E [MSGID: 106201] 
[glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to 
restore volume: c_glusterfs 
[2016-11-08 07:58:35.025544] E [MSGID: 101019] [xlator.c:428:xlator_init] 
0-management: Initialization of volume 'management' failed, review your volfile 
again 
[2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init] 0-management: 
initializing translator failed 
[2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate] 0-graph: 
init failed 
[2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit] 
(-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718] 
-->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8] 
-->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-: received 
signum (0), shutting down 




And then I found that the size of vols/volume_name/info is 0. That causes
glusterd to shut down.
But I found that vols/volume_name/info.tmp is not 0.
And I found that the brick file vols/volume_name/bricks/.brick is also 0,
while vols/volume_name/bricks/.brick.tmp is not 0.


I read the code of glusterd_store_volinfo() in glusterd-store.c.
I know that info.tmp is renamed to info in
glusterd_store_volume_atomic_update().


But my question is: why is the info file 0 while info.tmp is not 0?
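
Purely as an illustration of the state you describe (empty info, intact
info.tmp), and not anything glusterd actually does, a startup check along
these lines could fall back to the surviving tmp copy; the path is just an
example:

/* Illustrative only, not glusterd code: at startup, if <path> is zero
 * bytes after a crash but <path>.tmp survived with content, reinstate
 * the tmp copy before parsing. A real implementation would also have
 * to validate the tmp contents, since a half-written tmp is possible. */
#include <stdio.h>
#include <sys/stat.h>

static int recover_if_truncated(const char *path)
{
    char tmp[4096];
    struct stat st, stt;

    if (stat(path, &st) != 0 || st.st_size > 0)
        return 0;                      /* target missing or non-empty */
    snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    if (stat(tmp, &stt) != 0 || stt.st_size == 0)
        return -1;                     /* no usable copy either */
    return rename(tmp, path);          /* reinstate the surviving copy */
}

int main(void)
{
    const char *info = "/var/lib/glusterd/vols/c_glusterfs/info";

    if (recover_if_truncated(info) != 0)
        fprintf(stderr, "info is empty and no usable info.tmp found\n");
    return 0;
}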




Thanks,
Xin
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users