Re: [Gluster-users] Self Heal fails...

2011-09-16 Thread Robert Krig

I'm using GlusterFS version 3.2.3, built from the sources available on
the gluster.org website.

I think I've found a way. I shut down my volume, detached the peers,
and basically recreated my storage volume from scratch.
This time I started the setup by probing the peer from the node that
had the up-to-date data in its underlying storage directory.

Then I created the volume again from scratch, this time listing
node2:/export first and node1:/export second.
Then I mounted the Gluster volume locally and am currently running the
find one-liner on it.
Judging from the logs, it seems to be rebuilding.
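
For the record, the sequence was roughly the following (a sketch, not a
transcript; it assumes the volume is named GLSTORAGE, as the log
messages suggest, and that the bricks sit at /export on both nodes):

# run from node2, the node with the good data
gluster volume stop GLSTORAGE
gluster volume delete GLSTORAGE
gluster peer detach node1
gluster peer probe node1
gluster volume create GLSTORAGE replica 2 transport tcp node2:/export node1:/export
gluster volume start GLSTORAGE
# mount the volume somewhere and walk it to kick off self-heal
# (mount point and server are whatever is convenient; the client talks to all bricks)
mount -t glusterfs node2:/GLSTORAGE /storage
find /storage -noleaf -print0 | xargs --null stat >/dev/null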

I'm just wondering if there is perhaps a more elegant way to force a
resync. It would be nice if there were a feature or a command that let
you say: node2, you are the authoritative source; node1, take whatever
node2 has.
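
As far as I know, 3.2.x has no single command for this. The workaround
people usually describe is to inspect the AFR changelog xattrs on the
backend bricks and, for a file that is genuinely split-brained, remove
the copy on the brick you do not trust and then look it up again
through the mount. A rough, hypothetical sketch (the file path is only
an example, and this only makes sense for regular files, not for the
brick root):

# on both nodes, as root, dump the replication changelog xattrs for a suspect file
getfattr -d -m trusted.afr -e hex /export/path/to/file
# on the brick you consider stale only (node1 here), move the bad copy aside
mv /export/path/to/file /root/stale-copies/
# from a client mount, stat the file so AFR recreates it from the good brick
stat /storage/path/to/file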



On 09/16/2011 08:31 PM, Burnash, James wrote:
> Hi Robert.
>
> Can you tell us what version you are running? That helps nail down if this is 
> a known bug in a specific version.
>
> James Burnash
> Unix Engineer
> Knight Capital Group
>

Re: [Gluster-users] Self Heal fails...

2011-09-16 Thread Burnash, James
Hi Robert.

Can you tell us what version you are running? That helps nail down if this is a 
known bug in a specific version.

James Burnash
Unix Engineer
Knight Capital Group


Re: [Gluster-users] Self Heal fails...

2011-09-16 Thread Robert Krig

Well, the find process has finished in the meantime, and as expected, it
didn't fix anything.
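
A crude way to confirm that, assuming /export is the backend brick path
on both nodes, is to run the same checks on node1 and node2 and compare
the output:

# rough comparison of the two bricks
du -s /export
find /export -type f | wc -l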

Here are the last few lines of the client mount log:
##
[2011-09-16 18:48:45.287954] I
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 18:48:45.288394] I
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 18:48:45.288921] I
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 18:48:45.289535] I
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 18:48:45.290063] I
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 18:48:45.290649] I
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 18:48:45.291126] I
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 20:14:52.289901] W [rpc-common.c:64:xdr_to_generic]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x7d) [0x7f4702a751ad]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7f4702a74de5]
(-->/usr/local/lib/glusterfs/3.2.3/xlator/protocol/client.so(client3_1_statfs_cbk+0x71)
[0x7f46ff88b741]))) 0-xdr: XDR decoding failed
[2011-09-16 20:14:52.289928] E
[client3_1-fops.c:624:client3_1_statfs_cbk] 0-GLSTORAGE-client-0: error
[2011-09-16 20:14:52.289939] I
[client3_1-fops.c:637:client3_1_statfs_cbk] 0-GLSTORAGE-client-0: remote
operation failed: Invalid argument
#

[Gluster-users] Self Heal fails...

2011-09-16 Thread Robert Krig

Hi there. I'm new to GlusterFS. I'm currently evaluating it for
production usage.

I have two storage servers which use JFS as the filesystem for the
underlying export.

The setup is supposed to be replicated.

I've been experimenting with various settings for benchmarking and such,
as well as trying out different failure scenarios.

Anyway, the export directory on node1 is out of sync with node2, so I
mounted the storage volume via the glusterfs client on node1 in another
directory.

The FUSE-mounted directory is /storage.
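
The mount itself is just the standard native FUSE mount; assuming the
volume is named GLSTORAGE, as the log lines below suggest, something
like:

# the server named here is only used to fetch the volume file;
# the client then talks to all bricks directly
mount -t glusterfs node1:/GLSTORAGE /storage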

As per the manual I tried doing the "find <gluster-mount> -noleaf
-print0 | xargs --null stat >/dev/null" dance; however, the logs throw
a bunch of errors:
#
[2011-09-16 18:29:33.759729] E
[client3_1-fops.c:1216:client3_1_inodelk_cbk] 0-GLSTORAGE-client-0: error
[2011-09-16 18:29:33.759747] I
[client3_1-fops.c:1226:client3_1_inodelk_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 18:29:33.759942] E
[afr-self-heal-metadata.c:672:afr_sh_metadata_post_nonblocking_inodelk_cbk]
0-GLSTORAGE-replicate-0: Non Blocking metadata inodelks failed for /.
[2011-09-16 18:29:33.759961] E
[afr-self-heal-metadata.c:674:afr_sh_metadata_post_nonblocking_inodelk_cbk]
0-GLSTORAGE-replicate-0: Metadata self-heal failed for /.
[2011-09-16 18:29:33.760167] W [rpc-common.c:64:xdr_to_generic]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x7d) [0x7f4702a751ad]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)
[0x7f4702a74de5]
(-->/usr/local/lib/glusterfs/3.2.3/xlator/protocol/client.so(client3_1_entrylk_cbk+0x52)
[0x7f46ff88a572]))) 0-xdr: XDR decoding failed
[2011-09-16 18:29:33.760200] E
[client3_1-fops.c:1292:client3_1_entrylk_cbk] 0-GLSTORAGE-client-0: error
[2011-09-16 18:29:33.760215] I
[client3_1-fops.c:1303:client3_1_entrylk_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
[2011-09-16 18:29:33.760417] E
[afr-self-heal-entry.c:2292:afr_sh_post_nonblocking_entry_cbk]
0-GLSTORAGE-replicate-0: Non Blocking entrylks failed for /.
[2011-09-16 18:29:33.760447] E
[afr-self-heal-common.c:1554:afr_self_heal_completion_cbk]
0-GLSTORAGE-replicate-0: background  meta-data entry self-heal failed on /
[2011-09-16 18:29:33.760808] I
[client3_1-fops.c:2228:client3_1_lookup_cbk] 0-GLSTORAGE-client-0:
remote operation failed: Invalid argument
###


Is this normal? The directory in question already has 150GB of data, so
the find command is still running. Will it be OK once it finishes?
From what I understand of the manual, the files should be repaired as
the find process runs, or did I misinterpret that?

If self-heal fails, is there a failsafe method to ensure that both
nodes are in sync again?



___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users