Hi,

I'm experiencing a problem when gluster graph is changed as a result of a replace-brick operation (probably with any other operation that changes the graph) while the client is also doing other tasks, like writing a file.

When the operation starts, I see that the replaced brick is disconnected, but writes continue working normally with one brick less.

At some point, another graph is created and comes online. Remaining bricks on the old graph are disconnected and the old graph is destroyed. I see how new write requests are sent to the new graph.

This seems correct. However, there's a point where I see this:

[2014-12-24 11:29:58.541130] T [fuse-bridge.c:2305:fuse_write_resume] 0-glusterfs-fuse: 2234: WRITE (0x16dcf3c, size=131072, offset=255721472)
[2014-12-24 11:29:58.541156] T [ec-helpers.c:101:ec_trace] 2-ec: WIND(INODELK) 0x7f8921b7a9a4(0x7f8921b78e14) [refs=5, winds=3, jobs=1] frame=0x7f8932e92c38/0x7f8932e9e6b0, min/exp=3/3, err=0 state=1 {111:000:000} idx=0
[2014-12-24 11:29:58.541292] T [rpc-clnt.c:1384:rpc_clnt_record] 2-patchy-client-0: Auth Info: pid: 0, uid: 0, gid: 0, owner: d025e932897f0000
[2014-12-24 11:29:58.541296] T [io-cache.c:133:ioc_inode_flush] 2-patchy-io-cache: locked inode(0x16d2810)
[2014-12-24 11:29:58.541354] T [rpc-clnt.c:1241:rpc_clnt_record_build_header] 2-rpc-clnt: Request fraglen 152, payload: 84, rpc hdr: 68
[2014-12-24 11:29:58.541408] T [io-cache.c:137:ioc_inode_flush] 2-patchy-io-cache: unlocked inode(0x16d2810)
[2014-12-24 11:29:58.541493] T [io-cache.c:133:ioc_inode_flush] 2-patchy-io-cache: locked inode(0x16d2810)
[2014-12-24 11:29:58.541536] T [io-cache.c:137:ioc_inode_flush] 2-patchy-io-cache: unlocked inode(0x16d2810)
[2014-12-24 11:29:58.541537] T [rpc-clnt.c:1577:rpc_clnt_submit] 2-rpc-clnt: submitted request (XID: 0x17 Program: GlusterFS 3.3, ProgVers: 330, Proc: 29) to rpc-transport (patchy-client-0)
[2014-12-24 11:29:58.541646] W [fuse-bridge.c:2271:fuse_writev_cbk] 0-glusterfs-fuse: 2234: WRITE => -1 (Input/output error)

It seems that fuse still has a write request pending on graph 0. The request is resumed, but it returns EIO without even calling the xlator stack (the operations seen between the two log messages belong to other requests and are sent to graph 2). I'm not sure why this happens or how I should avoid it.

I tried the same scenario with replicate and it seems to work, so the problem must be somewhere in disperse, but I don't see where it could be.

Any ideas?

Thanks,

Xavi
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel