Re: [Gluster-devel] Core from gNFS process
On 01/15/2016 08:38 AM, Soumya Koduri wrote:
> On 01/15/2016 06:52 PM, Soumya Koduri wrote:
>> On 01/14/2016 08:41 PM, Vijay Bellur wrote:
>>> On 01/14/2016 04:11 AM, Jiffin Tony Thottan wrote:
>>>> On 14/01/16 14:28, Jiffin Tony Thottan wrote:
>>>>> Hi,
>>>>>
>>>>> The core is generated when the encryption xlator is enabled:
>>>>>
>>>>> [2016-01-14 08:13:15.740835] E [crypt.c:4298:master_set_master_vol_key] 0-test1-crypt: FATAL: missing master key
>>>>> [2016-01-14 08:13:15.740859] E [MSGID: 101019] [xlator.c:429:xlator_init] 0-test1-crypt: Initialization of volume 'test1-crypt' failed, review your volfile again
>>>>> [2016-01-14 08:13:15.740890] E [MSGID: 101066] [graph.c:324:glusterfs_graph_init] 0-test1-crypt: initializing translator failed
>>>>> [2016-01-14 08:13:15.740904] E [MSGID: 101176] [graph.c:670:glusterfs_graph_activate] 0-graph: init failed
>>>>> [2016-01-14 08:13:15.741676] W [glusterfsd.c:1231:cleanup_and_exit] (-->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x307) [0x40d287] -->/usr/sbin/glusterfs(glusterfs_process_volfp+0x117) [0x4086c7] -->/usr/sbin/glusterfs(cleanup_and_exit+0x4d) [0x407e1d] ) 0-: received signum (0), shutting down
>>>>
>>>> Forgot to mention this in the last mail: the crypt xlator needs a master key before the translator is enabled, which is what causes the issue.
>>>
>>> Irrespective of the problem, the nfs process should not crash. Can we check why there is memory corruption during cleanup_and_exit()?
>>
>> That's right. This issue was reported quite a few times earlier on gluster-devel, and it is not specific to the gluster-nfs process. As noted in [1], we have raised bug 1293594 [2] against the lib-gcc team to investigate this further.

The segmentation fault in gcc occurs while attempting to print a backtrace after glusterfs receives a SIGSEGV. It would be good to isolate the reason for the initial SIGSEGV whose signal handler causes the further crash.

-Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client hangs on rsyncing lots of file
Another observation: if rsyncing is resumed after the hang, rsync itself hangs much faster, because it stats the already-copied files. So the cause may not be the writing itself but the massive stat load on the GlusterFS volume as well.

15.01.2016 09:40, Oleksandr Natalenko wrote:
> While doing rsync over millions of files from an ordinary partition to a
> GlusterFS volume, the first hang happens just after approximately the first
> 2 million files, and the following info appears in dmesg:
>
> ===
> [17075038.924481] INFO: task rsync:10310 blocked for more than 120 seconds.
> [17075038.931948] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [17075038.940748] rsync D 88207fc13680 0 10310 10309 0x0080
> [17075038.940752] 8809c578be18 0086 8809c578bfd8 00013680
> [17075038.940756] 8809c578bfd8 00013680 880310cbe660 881159d16a30
> [17075038.940759] 881e3aa25800 8809c578be48 881159d16b10 88087d553980
> [17075038.940762] Call Trace:
> [17075038.940770] [] schedule+0x29/0x70
> [17075038.940797] [] __fuse_request_send+0x13d/0x2c0 [fuse]
> [17075038.940801] [] ? fuse_get_req_nofail_nopages+0xc0/0x1e0 [fuse]
> [17075038.940805] [] ? wake_up_bit+0x30/0x30
> [17075038.940809] [] fuse_request_send+0x12/0x20 [fuse]
> [17075038.940813] [] fuse_flush+0xff/0x150 [fuse]
> [17075038.940817] [] filp_close+0x34/0x80
> [17075038.940821] [] __close_fd+0x78/0xa0
> [17075038.940824] [] SyS_close+0x23/0x50
> [17075038.940828] [] system_call_fastpath+0x16/0x1b
> ===
>
> rsync blocks in D state, and to kill it I have to do umount --lazy on the
> GlusterFS mountpoint and then kill the corresponding glusterfs client
> process. Then rsync exits.
> Here is the GlusterFS volume info:
>
> ===
> Volume Name: asterisk_records
> Type: Distributed-Replicate
> Volume ID: dc1fe561-fa3a-4f2e-8330-ec7e52c75ba4
> Status: Started
> Number of Bricks: 3 x 2 = 6
> Transport-type: tcp
> Bricks:
> Brick1: server1:/bricks/10_megaraid_0_3_9_x_0_4_3_hdd_r1_nolvm_hdd_storage_01/asterisk/records
> Brick2: server2:/bricks/10_megaraid_8_5_14_x_8_6_16_hdd_r1_nolvm_hdd_storage_01/asterisk/records
> Brick3: server1:/bricks/11_megaraid_0_5_4_x_0_6_5_hdd_r1_nolvm_hdd_storage_02/asterisk/records
> Brick4: server2:/bricks/11_megaraid_8_7_15_x_8_8_20_hdd_r1_nolvm_hdd_storage_02/asterisk/records
> Brick5: server1:/bricks/12_megaraid_0_7_6_x_0_13_14_hdd_r1_nolvm_hdd_storage_03/asterisk/records
> Brick6: server2:/bricks/12_megaraid_8_9_19_x_8_13_24_hdd_r1_nolvm_hdd_storage_03/asterisk/records
> Options Reconfigured:
> cluster.lookup-optimize: on
> cluster.readdir-optimize: on
> client.event-threads: 2
> network.inode-lru-limit: 4096
> server.event-threads: 4
> performance.client-io-threads: on
> storage.linux-aio: on
> performance.write-behind-window-size: 4194304
> performance.stat-prefetch: on
> performance.quick-read: on
> performance.read-ahead: on
> performance.flush-behind: on
> performance.write-behind: on
> performance.io-thread-count: 2
> performance.cache-max-file-size: 1048576
> performance.cache-size: 33554432
> features.cache-invalidation: on
> performance.readdir-ahead: on
> ===
>
> The issue reproduces each time I rsync such an amount of files.
>
> How could I debug this issue better?

___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
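On the "how could I debug this" question, a common first step is to capture a statedump of the hung client and a backtrace of its threads at hang time. The commands below are a sketch assuming root access on the client and server nodes; the pgrep pattern is illustrative and should be adjusted to match the actual mount process.

```shell
# The glusterfs client dumps its internal state (pending frames, locks,
# mem pools) on SIGUSR1; the dump lands under /var/run/gluster by default.
kill -USR1 "$(pgrep -f 'glusterfs.*asterisk_records')"

# On a server node, dump brick-side state for the volume as well.
gluster volume statedump asterisk_records

# Capture what every thread of the client process is doing while hung.
gdb -p "$(pgrep -f 'glusterfs.*asterisk_records')" \
    -batch -ex 'thread apply all bt'
```

The pending-frames section of the statedump shows which fop is stuck and in which xlator, which narrows down whether the hang is in write-behind, the network layer, or the bricks.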
Re: [Gluster-devel] Core from gNFS process
On 01/15/2016 06:52 PM, Soumya Koduri wrote:
> [snip: quoted crypt-xlator log and earlier discussion, unchanged from the rest of this thread]
>
> That's right. This issue was reported quite a few times earlier on gluster-devel, and it is not specific to the gluster-nfs process. As noted in [1], we have raised bug 1293594 [2] against the lib-gcc team to investigate this further.
>
> As requested in [1], kindly upload the core to the bug along with a backtrace taken with gcc debuginfo packages installed. It might help to get their attention and bring closure on this issue sooner.

Here is the bug link: https://bugzilla.redhat.com/show_bug.cgi?id=1293594

Request Raghavendra/Ravi to update it.
Thanks,
Soumya

[1] http://article.gmane.org/gmane.comp.file-systems.gluster.devel/13298
Re: [Gluster-devel] Core from gNFS process
On 01/14/2016 08:41 PM, Vijay Bellur wrote:
> [snip: quoted crypt-xlator log, unchanged from the first message in this thread]
>
> Irrespective of the problem, the nfs process should not crash. Can we check why there is memory corruption during cleanup_and_exit()?

That's right. This issue was reported quite a few times earlier on gluster-devel, and it is not specific to the gluster-nfs process. As noted in [1], we have raised bug 1293594 [2] against the lib-gcc team to investigate this further.

As requested in [1], kindly upload the core to the bug along with a backtrace taken with gcc debuginfo packages installed. It might help to get their attention and bring closure on this issue sooner.

Thanks,
Soumya

[1] http://article.gmane.org/gmane.comp.file-systems.gluster.devel/13298
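For completeness, the kind of backtrace being requested can be produced roughly as follows. This is a sketch: the package names assume an RPM-based distribution with debuginfo repos enabled, and the core path is a placeholder.

```shell
# Pull in debug symbols so backtrace frames resolve to source lines.
debuginfo-install -y glusterfs glusterfs-libs glibc

# Extract a full backtrace of every thread from the core file.
gdb /usr/sbin/glusterfs /path/to/core \
    -batch -ex 'thread apply all bt full' > bt-full.txt
```

Attaching bt-full.txt to the bugzilla entry gives the lib-gcc maintainers resolved symbols instead of raw addresses like the 0x40d287 frames in the log above.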
[Gluster-devel] Blog post on GlusterFS-Quota
Hi all,

I have written an initial blog post [1] on GlusterFS quota. I will update the blog, or add a few more posts, to cover the remaining content. It has also been shared on planet.gluster.org.

Suggestions/comments are welcome :-)

[1] https://manikandanselvaganesh.wordpress.com/category/glusterfs/

--
Thanks & Regards,
Manikandan Selvaganesh.
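For readers who want to try the feature alongside the post, the basic quota workflow on the CLI looks like this (the volume name and directory path here are illustrative):

```shell
# Enable the quota feature on a volume.
gluster volume quota myvol enable

# Set a 10GB usage limit on a directory (path is relative to the volume root).
gluster volume quota myvol limit-usage /projects 10GB

# Show configured limits and current usage.
gluster volume quota myvol list
```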
[Gluster-devel] Gluster AFR volume write performance has been seriously affected by GLUSTERFS_WRITE_IS_APPEND in afr_writev
Setting GLUSTERFS_WRITE_IS_APPEND in the afr_writev function on the glusterfs client causes posix_writev on the server to handle IO write fops serially rather than in parallel. That is, multiple io-worker threads carrying out write fops are blocked in posix_writev and execute the final pwrite/pwritev in the __posix_writev function ONE AFTER ANOTHER. For example:

thread1: iot_worker -> ... -> posix_writev() |
thread2: iot_worker -> ... -> posix_writev() |
thread3: iot_worker -> ... -> posix_writev() -> __posix_writev()
thread4: iot_worker -> ... -> posix_writev() |

Here four iot_worker threads are performing 128KB write fops as above, but only one at a time can execute the __posix_writev function; the others have to wait.

However, if the AFR volume is configured with storage.linux-aio (off by default), the iot_worker uses posix_aio_writev instead of posix_writev to write data. The posix_aio_writev function is not affected by GLUSTERFS_WRITE_IS_APPEND, and AFR volume write performance goes up.

So my questions are: can an AFR volume work correctly with the storage.linux-aio configuration, which bypasses the GLUSTERFS_WRITE_IS_APPEND setting made in afr_writev, and why does glusterfs keep posix_aio_writev behaving differently from posix_writev?

Any replies to clear up my confusion would be appreciated; thanks in advance.

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel