On 2015/9/29 15:18, gjprabu wrote:
> Hi Joseph,
>
> We have 7 nodes in total, and this problem occurs on multiple nodes
> simultaneously, not on one particular node. We checked the network and it
> is fine. When we remount the ocfs2 partition the problem is fixed
> temporarily, but it reoccurs after some time.
>
> We also have a problem while unmounting: the umount process goes into "D"
> state, and then I need to restart the server itself. Is there any solution
> for this issue?
>
> I have tried running fsck.ocfs2 on the problematic machine, but it throws
> an error:
>
> fsck.ocfs2 1.8.0
> fsck.ocfs2: I/O error on channel while opening "/zoho/build/downloads"

IMO, this can happen if the mountpoint is offline.
> Please refer to the latest logs from one node.
>
> [258418.054204] o2cb: o2dlm has evicted node 7 from domain A895BC216BE641A8A7E20AA89D57E051
> [258418.957738] o2cb: o2dlm has evicted node 7 from domain A895BC216BE641A8A7E20AA89D57E051
> [264056.408719] o2dlm: Node 7 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 2 3 4 7 ) 5 nodes
> [264464.605542] o2dlm: Node 7 leaves domain A895BC216BE641A8A7E20AA89D57E051 ( 1 2 3 4 ) 4 nodes
> [275619.497198] o2dlm: Node 7 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 2 3 4 7 ) 5 nodes
> [426628.076148] o2cb: o2dlm has evicted node 1 from domain A895BC216BE641A8A7E20AA89D57E051
> [426628.885084] o2dlm: Begin recovery on domain A895BC216BE641A8A7E20AA89D57E051 for node 1
> [426628.891170] o2dlm: Node 3 (me) is the Recovery Master for the dead node 1 in domain A895BC216BE641A8A7E20AA89D57E051
> [426634.182384] o2dlm: End recovery on domain A895BC216BE641A8A7E20AA89D57E051
> [427001.383315] o2dlm: Node 1 joins domain A895BC216BE641A8A7E20AA89D57E051 ( 1 2 3 4 7 ) 5 nodes

The messages above show that nodes in your cluster are frequently joining and
leaving. I suggest you check the cluster config on each node
(/etc/ocfs2/cluster.conf as well as /sys/kernel/config/cluster/<cluster_name>/node/).
I haven't used ocfs2 together with Ceph RBD, so I am not sure whether this is
related to RBD.

> Regards
> G.J
>
> ---- On Fri, 25 Sep 2015 06:26:57 +0530 Joseph Qi <joseph...@huawei.com> wrote ----
>
> On 2015/9/24 18:30, gjprabu wrote:
> > Hi All,
> >
> > Can someone tell me what kind of issue this is?
> >
> > Regards
> > Prabu GJ
> >
> > ---- On Wed, 23 Sep 2015 18:26:13 +0530 gjprabu <gjpr...@zohocorp.com> wrote ----
> >
> > Hi All,
> >
> > We face this issue on local machines as well, but only on two of the
> > ocfs2 clients, not on all of them.
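The cluster.conf consistency check suggested in the reply above can be sketched as a small script. This is a hypothetical helper, not part of the ocfs2-tools package: it parses an o2cb-style cluster.conf and verifies that the stanza count matches `node_count` and that node numbers, names, and IPs are unique. The sample text is trimmed from the config posted later in this thread.

```python
# Minimal sketch (assumption: simple "key = value" stanzas as in the
# cluster.conf posted in this thread; not the official o2cb parser).

SAMPLE = """\
cluster:
    node_count = 3
    heartbeat_mode = local
    name = ocfs2

node:
    ip_port = 7777
    ip_address = 10.1.1.50
    number = 1
    name = integ-hm5
    cluster = ocfs2

node:
    ip_port = 7777
    ip_address = 10.1.1.51
    number = 2
    name = integ-hm9
    cluster = ocfs2

node:
    ip_port = 7777
    ip_address = 10.1.1.52
    number = 3
    name = integ-hm2
    cluster = ocfs2
"""

def parse(conf):
    """Split the config into a list of stanza dicts."""
    stanzas = []
    for line in conf.splitlines():
        if not line.strip():
            continue
        if line.rstrip().endswith(":"):      # "cluster:" / "node:" header
            stanzas.append({"type": line.strip()[:-1]})
        else:
            key, _, val = line.partition("=")
            stanzas[-1][key.strip()] = val.strip()
    return stanzas

def check(conf):
    """Raise AssertionError if the config is internally inconsistent."""
    stanzas = parse(conf)
    cluster = next(s for s in stanzas if s["type"] == "cluster")
    nodes = [s for s in stanzas if s["type"] == "node"]
    assert int(cluster["node_count"]) == len(nodes), "node_count mismatch"
    for key in ("number", "ip_address", "name"):
        vals = [n[key] for n in nodes]
        assert len(vals) == len(set(vals)), "duplicate " + key
    return True

print(check(SAMPLE))  # True when the config is internally consistent
```

Running this against the copy of cluster.conf fetched from each node (and diffing the copies against each other) would catch the kind of mismatch that makes nodes join and leave repeatedly.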
> > Regards
> > Prabu GJ
> >
> > ---- On Wed, 23 Sep 2015 17:49:51 +0530 gjprabu <gjpr...@zohocorp.com> wrote ----
> >
> > Hi All,
> >
> > We are using ocfs2 on an RBD mount and everything works fine, but after
> > writing or moving data via scripts it shows the error below. Can anybody
> > please help with this issue?
> >
> > # ls -althr
> > ls: cannot access MICKEYLITE_3_0_M4_1_TEST: Input/output error
> > ls: cannot access MICKEYLITE_3_0_M4_1_OLD: Input/output error
> > total 0
> > d????????? ? ? ? ? ? MICKEYLITE_3_0_M4_1_TEST
> > d????????? ? ? ? ? ? MICKEYLITE_3_0_M4_1_OLD
> >
> > Partition details:
> >
> > /dev/rbd0 ocfs2 9.6T 140G 9.5T 2% /zoho/build/downloads
> >
> > /etc/ocfs2/cluster.conf:
> > cluster:
> > node_count = 7
> > heartbeat_mode = local
> > name = ocfs2
> >
> > node:
> > ip_port = 7777
> > ip_address = 10.1.1.50
> > number = 1
> > name = integ-hm5
> > cluster = ocfs2
> >
> > node:
> > ip_port = 7777
> > ip_address = 10.1.1.51
> > number = 2
> > name = integ-hm9
> > cluster = ocfs2
> >
> > node:
> > ip_port = 7777
> > ip_address = 10.1.1.52
> > number = 3
> > name = integ-hm2
> > cluster = ocfs2
> >
> > node:
> > ip_port = 7777
> > ip_address = 10.1.1.53
> > number = 4
> > name = integ-ci-1
> > cluster = ocfs2
> >
> > node:
> > ip_port = 7777
> > ip_address = 10.1.1.54
> > number = 5
> > name = integ-cm2
> > cluster = ocfs2
> >
> > node:
> > ip_port = 7777
> > ip_address = 10.1.1.55
> > number = 6
> > name = integ-cm1
> > cluster = ocfs2
> >
> > node:
> > ip_port = 7777
> > ip_address = 10.1.1.56
> > number = 7
> > name = integ-hm8
> > cluster = ocfs2
> >
> > Error on dmesg:
> >
> > [516421.342393] (dlm_thread,51005,25):dlm_flush_asts:599 ERROR: status = -112
> > [517119.689992] (httpd,64399,31):dlm_do_master_request:1383 ERROR: link to 1 went down!
> > [517119.690003] (dlm_thread,51005,25):dlm_send_proxy_ast_msg:482 ERROR: A895BC216BE641A8A7E20AA89D57E051: res S000000000000000000000200000000, error -112 send AST to node 1
> > [517119.690028] (dlm_thread,51005,25):dlm_flush_asts:599 ERROR: status = -112
> > [517119.690034] (dlm_thread,51005,25):dlm_send_proxy_ast_msg:482 ERROR: A895BC216BE641A8A7E20AA89D57E051: res S000000000000000000000200000000, error -107 send AST to node 1
> > [517119.690036] (dlm_thread,51005,25):dlm_flush_asts:599 ERROR: status = -107
> > [517119.700425] (httpd,64399,31):dlm_get_lock_resource:968 ERROR: status = -112
> > [517517.894949] (dlm_thread,51005,25):dlm_send_proxy_ast_msg:482 ERROR: A895BC216BE641A8A7E20AA89D57E051: res S000000000000000000000200000000, error -112 send AST to node 1
> > [517517.899640] (dlm_thread,51005,25):dlm_flush_asts:599 ERROR: status = -112
>
> These error messages mean that the connection between this node and node 1
> has a problem. You have to check the network.
>
> > Regards
> > Prabu GJ
> >
> > _______________________________________________
> > Ocfs2-users mailing list
> > Ocfs2-users@oss.oracle.com
> > https://oss.oracle.com/mailman/listinfo/ocfs2-users

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-users
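[Editor's note] The negative status codes in the dmesg output above are negated Linux errno values, which is what makes "check the network" the right advice: on Linux, 112 is EHOSTDOWN and 107 is ENOTCONN, both transport-level failures on the o2net link between nodes. A quick decode using Python's standard errno tables:

```python
import errno
import os

# o2dlm reports negated Linux errno values; decode the two codes seen
# in the log above. On Linux, 112 -> EHOSTDOWN ("Host is down") and
# 107 -> ENOTCONN ("Transport endpoint is not connected").
for status in (-112, -107):
    code = -status
    print(code, errno.errorcode.get(code, "?"), os.strerror(code))
```

Both decode to network-layer failures, consistent with node 1 repeatedly dropping off the o2net port (7777 in the posted cluster.conf) rather than with a filesystem-level fault.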