Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-28 Thread qingwei wei
I did one random read test (~10k shards in one replicate group) but so far no error has been reported. I will try a few more tests over the weekend to confirm this. Just a quick question: does the full heal process heal files in sequence, sorted by file name? Thanks. Cwtan On Thu, Oct 27, 2016 at

Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-27 Thread Krutika Dhananjay
This should work without any issues. It is possible that the shard(s) would get created with different gfids but the ones on the lagging brick will eventually (by the time heal-info returns all zeroes) get replaced with shards having the correct gfids. Have you tried it yet? Did you face any

Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-27 Thread qingwei wei
Hi, My final goal of the test is to see the impact of brick replacement while IO is still running. One scenario that I can think of is as below: 1. random read IO is performed on gluster volume (3 replicas) 2. 1 brick down and IO still ongoing 3. Perform brick replacement and IO still ongoing 4.

Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-27 Thread Krutika Dhananjay
Found the RC. The problem seems to be that the sharding translator attempts to create non-existent shards in the read/write codepaths, with a newly generated gfid attached to the create request in case the shard is absent. The replicate translator, which sits below sharding on the stack, takes this request and
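A toy model may help picture the gfid mismatch described above. This is only a sketch under stated assumptions, not GlusterFS code: bricks are modelled as plain dictionaries mapping a shard name to its gfid, and `create_if_absent`, `gfids_seen` and the shard name `shard.1` are hypothetical names made up for illustration. It shows only that a create issued with a freshly generated gfid leaves the emptied brick disagreeing with the other two; what the replicate translator does with that request is not modelled.

```python
import uuid


def create_if_absent(bricks, shard_name):
    """Send a create carrying a freshly generated gfid to every brick.

    Assumption for illustration: a brick that already holds the shard keeps
    its existing gfid, while a brick that lost the shard accepts the new one.
    """
    new_gfid = uuid.uuid4()
    for brick in bricks:
        brick.setdefault(shard_name, new_gfid)
    return new_gfid


def gfids_seen(bricks, shard_name):
    """Return the set of distinct gfids the bricks hold for one shard."""
    return {brick[shard_name] for brick in bricks if shard_name in brick}


original = uuid.uuid4()
shard = "shard.1"  # hypothetical shard name

# Three replicas; the shard has been deleted from the third brick's backend.
bricks = [{shard: original}, {shard: original}, {}]

create_if_absent(bricks, shard)

# The replicas now disagree on the gfid for the same shard.
print(len(gfids_seen(bricks, shard)))
```

In this model the two untouched bricks still report the original gfid, so the set returned by `gfids_seen` has more than one member, which is the "different gfids" situation mentioned elsewhere in the thread.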

Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-27 Thread Krutika Dhananjay
Now it's reproducible, thanks. :) I think I know the RC. Let me confirm it through tests and report back. -Krutika On Thu, Oct 27, 2016 at 10:42 AM, qingwei wei wrote: > Hi, > > I did few more test runs and it seems that it happens during this sequence > > 1.populate data

Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-26 Thread Krutika Dhananjay
Do you also have the brick logs? Looks like the bricks are returning EINVAL on lookup which AFR is subsequently converting into an EIO. And sharding is merely delivering the same error code upwards. -Krutika On Wed, Oct 26, 2016 at 6:38 AM, qingwei wei wrote: > Hi, > > Pls

Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-25 Thread qingwei wei
Hi, Pls see the client log below. [2016-10-24 10:29:51.111603] I [fuse-bridge.c:5171:fuse_graph_setup] 0-fuse: switched to graph 0 [2016-10-24 10:29:51.111662] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-testHeal-client-2: Server lk version = 1 [2016-10-24

Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-25 Thread Krutika Dhananjay
Tried it locally on my setup. Worked fine. Could you please attach the mount logs? -Krutika On Tue, Oct 25, 2016 at 6:55 PM, Pranith Kumar Karampuri < pkara...@redhat.com> wrote: > +Krutika > > On Mon, Oct 24, 2016 at 4:10 PM, qingwei wei wrote: > >> Hi, >> >> I am

Re: [Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-25 Thread Pranith Kumar Karampuri
+Krutika On Mon, Oct 24, 2016 at 4:10 PM, qingwei wei wrote: > Hi, > > I am currently running a simple gluster setup using one server node > with multiple disks. I realize that if i delete away all the .shard > files in one replica in the backend, my application (dd) will

[Gluster-devel] Input/output error when files in .shard folder are deleted

2016-10-24 Thread qingwei wei
Hi, I am currently running a simple gluster setup using one server node with multiple disks. I found that if I delete all the .shard files in one replica in the backend, my application (dd) reports an Input/Output error even though I have 3 replicas. My gluster version is 3.7.16 gluster