Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-09 Thread Gang He
Hi Steven and David,


>>> 
> Hi,
> 
> 
> On 09/04/18 06:02, Gang He wrote:
>> Hello David,
>>
>> If the user sets "protocol=tcp" in the configuration file /etc/dlm/dlm.conf
>> in a two-ring cluster environment, the DLM kernel module fails to start
>> with the following error messages:
>> [   43.696924] DLM installed
>> [  149.552039] ocfs2: Registered cluster interface user
>> [  149.559579] dlm: TCP protocol can't handle multi-homed hosts, try SCTP  <<== fails here
>> [  149.559589] dlm: cannot start dlm lowcomms -22
>> [  149.559612] (mount.ocfs2,2593,3):ocfs2_dlm_init:3120 ERROR: status = -22
>> [  149.559629] (mount.ocfs2,2593,3):ocfs2_mount_volume:1845 ERROR: status = -22
>>
>> Could we modify the code so that this case still works by using only one
>> ring address? Or is the code written this way on purpose?
>> In lowcomms.c:
>> static int tcp_listen_for_all(void)
>> {
>>     struct socket *sock = NULL;
>>     struct connection *con = nodeid2con(0, GFP_NOFS);
>>     int result = -EINVAL;
>>
>>     if (!con)
>>         return -ENOMEM;
>>
>>     /* We don't support multi-homed hosts */
>>     if (dlm_local_addr[1] != NULL) {   <<== here, could we get rid of this limitation?
>>         log_print("TCP protocol can't handle multi-homed hosts, "
>>                   "try SCTP");
>>         return -EINVAL;
>>     }
>>
>>     log_print("Using TCP for communications");
>>
>>     sock = tcp_create_listen_sock(con, dlm_local_addr[0]);
>>     if (sock) {
>>         add_sock(sock, con);
>>         result = 0;
>>     }
>>     else {
>>         result = -EADDRINUSE;
>>     }
>>
>>
>> Thanks
>> Gang
>>
> There is already a patch set to allow multi-homing for TCP. Mark and
> Dave can comment on its current status and how far it is from being
> merged.
Thanks for your update on this problem; hopefully we can see the related
patches in Linus' git tree soon.

Thanks
Gang

> 
> Steve.




Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-09 Thread Steven Whitehouse

Hi,


On 09/04/18 06:02, Gang He wrote:

> Hello David,
>
> If the user sets "protocol=tcp" in the configuration file /etc/dlm/dlm.conf
> in a two-ring cluster environment, the DLM kernel module fails to start
> with the following error messages:
> [   43.696924] DLM installed
> [  149.552039] ocfs2: Registered cluster interface user
> [  149.559579] dlm: TCP protocol can't handle multi-homed hosts, try SCTP  <<== fails here
> [  149.559589] dlm: cannot start dlm lowcomms -22
> [  149.559612] (mount.ocfs2,2593,3):ocfs2_dlm_init:3120 ERROR: status = -22
> [  149.559629] (mount.ocfs2,2593,3):ocfs2_mount_volume:1845 ERROR: status = -22
>
> Could we modify the code so that this case still works by using only one
> ring address? Or is the code written this way on purpose?
> In lowcomms.c:
> static int tcp_listen_for_all(void)
> {
>     struct socket *sock = NULL;
>     struct connection *con = nodeid2con(0, GFP_NOFS);
>     int result = -EINVAL;
>
>     if (!con)
>         return -ENOMEM;
>
>     /* We don't support multi-homed hosts */
>     if (dlm_local_addr[1] != NULL) {   <<== here, could we get rid of this limitation?
>         log_print("TCP protocol can't handle multi-homed hosts, "
>                   "try SCTP");
>         return -EINVAL;
>     }
>
>     log_print("Using TCP for communications");
>
>     sock = tcp_create_listen_sock(con, dlm_local_addr[0]);
>     if (sock) {
>         add_sock(sock, con);
>         result = 0;
>     }
>     else {
>         result = -EADDRINUSE;
>     }
>
>
> Thanks
> Gang

There is already a patch set to allow multi-homing for TCP. Mark and
Dave can comment on its current status and how far it is from being
merged.


Steve.



Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-08 Thread Gang He
Hello David,

If the user sets "protocol=tcp" in the configuration file /etc/dlm/dlm.conf
in a two-ring cluster environment, the DLM kernel module fails to start
with the following error messages:
[   43.696924] DLM installed
[  149.552039] ocfs2: Registered cluster interface user
[  149.559579] dlm: TCP protocol can't handle multi-homed hosts, try SCTP  <<== fails here
[  149.559589] dlm: cannot start dlm lowcomms -22
[  149.559612] (mount.ocfs2,2593,3):ocfs2_dlm_init:3120 ERROR: status = -22
[  149.559629] (mount.ocfs2,2593,3):ocfs2_mount_volume:1845 ERROR: status = -22

Could we modify the code so that this case still works by using only one
ring address? Or is the code written this way on purpose?
In lowcomms.c:
static int tcp_listen_for_all(void)
{
    struct socket *sock = NULL;
    struct connection *con = nodeid2con(0, GFP_NOFS);
    int result = -EINVAL;

    if (!con)
        return -ENOMEM;

    /* We don't support multi-homed hosts */
    if (dlm_local_addr[1] != NULL) {   <<== here, could we get rid of this limitation?
        log_print("TCP protocol can't handle multi-homed hosts, "
                  "try SCTP");
        return -EINVAL;
    }

    log_print("Using TCP for communications");

    sock = tcp_create_listen_sock(con, dlm_local_addr[0]);
    if (sock) {
        add_sock(sock, con);
        result = 0;
    }
    else {
        result = -EADDRINUSE;
    }
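
To make the question concrete, here is a minimal user-space sketch of the
two behaviours being compared. It is not kernel code and not the pending
patch set: the function names tcp_listen_strict() and
tcp_listen_first_ring(), the string-based address table, and the extra
check for an empty table are all assumptions made for illustration. The
strict variant mirrors the check in the excerpt above and returns -EINVAL,
which is the -22 seen in the log on Linux; the fallback variant simply
uses the first configured address and ignores the rest.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's dlm_local_addr[] table:
 * a NULL-terminated array of configured local ring addresses.
 */

/* Mirrors the quoted check: refuse to start if a second address exists. */
static int tcp_listen_strict(const char *const local_addr[])
{
    if (local_addr[0] == NULL)
        return -EINVAL;   /* assumption: an empty table is also an error */

    if (local_addr[1] != NULL)
        return -EINVAL;   /* multi-homed host: rejected, -22 on Linux */

    return 0;
}

/* Hypothetical fallback: ignore extra rings and listen on the first one. */
static int tcp_listen_first_ring(const char *const local_addr[],
                                 const char **chosen)
{
    if (local_addr[0] == NULL)
        return -EINVAL;

    *chosen = local_addr[0];   /* remaining addresses are left unused */
    return 0;
}
```

With two addresses configured, the strict variant fails while the fallback
variant succeeds and selects the first ring. Whether silently dropping the
second ring is acceptable is exactly the design question raised above.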


Thanks
Gang


>>> 
> On Mon, Apr 02, 2018 at 08:01:24PM -0600, Gang He wrote:
>> OK, I got your point.
>> But could we have an appropriate way to let users know the status of
>> the SCTP protocol?
> 
> I think this is a case where suse/rh/etc need to have their own
> distro-specific approaches for specifying the usage parameters that they
> have tested and found to be supportable.  Other companies have previously
> found their specific use of SCTP to be acceptable.  RH does not properly
> support dlm+SCTP for similar reasons as you've found, although I've more
> recently encouraged customers to try dlm+SCTP with a single path in order
> to debug or diagnose potential networking issues.




Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-02 Thread Gang He
Hi David,



>>> 
> On Thu, Mar 22, 2018 at 10:27:56PM -0600, Gang He wrote:
>> Hello David,
>> 
>> Do you agree with adding this prompt for the user?
>> Customers have sometimes attempted to set up the SCTP protocol with two
>> rings but could not get the expected result, which may raise concerns
>> among customers about the quality of DLM.
> 
> I don't think the kernel message is a good way to communicate this to users.
> Dave
OK, I got your point.
But could we have an appropriate way to let users know the status of the
SCTP protocol?

Thanks
Gang

> 
> 
>> > As you know, the DLM module can use either the TCP or the SCTP protocol
>> > to communicate within the cluster.
>> > However, according to our testing, SCTP should still be considered
>> > experimental, since not all aspects work correctly and it is not
>> > fully tested.
>> > For example, switching the SCTP connection channel hangs for about
>> > 5 minutes when one connection (ring) is broken.
>> > I therefore suggest adding a kernel print that tells the user that
>> > the SCTP protocol for DLM should be considered experimental and is
>> > not recommended in production environments.
>> > 
>> > Signed-off-by: Gang He 
>> > ---
>> >  fs/dlm/lowcomms.c | 1 +
>> >  1 file changed, 1 insertion(+)
>> > 
>> > diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
>> > index cff79ea..18fd85d 100644
>> > --- a/fs/dlm/lowcomms.c
>> > +++ b/fs/dlm/lowcomms.c
>> > @@ -1307,6 +1307,7 @@ static int sctp_listen_for_all(void)
>> >return -ENOMEM;
>> >  
>> >log_print("Using SCTP for communications");
>> > +  log_print("SCTP protocol is experimental, use at your own risk");
>> >  
>> >result = sock_create_kern(&init_net, dlm_local_addr[0]->ss_family,
>> >  SOCK_STREAM, IPPROTO_SCTP, &sock);
>> > -- 
>> > 1.8.5.6




Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-04-02 Thread David Teigland
On Thu, Mar 22, 2018 at 10:27:56PM -0600, Gang He wrote:
> Hello David,
> 
> Do you agree with adding this prompt for the user?
> Customers have sometimes attempted to set up the SCTP protocol with two
> rings but could not get the expected result, which may raise concerns
> among customers about the quality of DLM.

I don't think the kernel message is a good way to communicate this to users.
Dave


> > As you know, the DLM module can use either the TCP or the SCTP protocol
> > to communicate within the cluster.
> > However, according to our testing, SCTP should still be considered
> > experimental, since not all aspects work correctly and it is not
> > fully tested.
> > For example, switching the SCTP connection channel hangs for about
> > 5 minutes when one connection (ring) is broken.
> > I therefore suggest adding a kernel print that tells the user that
> > the SCTP protocol for DLM should be considered experimental and is
> > not recommended in production environments.
> > 
> > Signed-off-by: Gang He 
> > ---
> >  fs/dlm/lowcomms.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> > index cff79ea..18fd85d 100644
> > --- a/fs/dlm/lowcomms.c
> > +++ b/fs/dlm/lowcomms.c
> > @@ -1307,6 +1307,7 @@ static int sctp_listen_for_all(void)
> > return -ENOMEM;
> >  
> > log_print("Using SCTP for communications");
> > +   log_print("SCTP protocol is experimental, use at your own risk");
> >  
> > result = sock_create_kern(&init_net, dlm_local_addr[0]->ss_family,
> >   SOCK_STREAM, IPPROTO_SCTP, &sock);
> > -- 
> > 1.8.5.6



Re: [Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-03-22 Thread Gang He
Hello David,

Do you agree with adding this prompt for the user?
Customers have sometimes attempted to set up the SCTP protocol with two
rings but could not get the expected result, which may raise concerns
among customers about the quality of DLM.


Thanks
Gang


>>> 
> As you know, the DLM module can use either the TCP or the SCTP protocol
> to communicate within the cluster.
> However, according to our testing, SCTP should still be considered
> experimental, since not all aspects work correctly and it is not
> fully tested.
> For example, switching the SCTP connection channel hangs for about
> 5 minutes when one connection (ring) is broken.
> I therefore suggest adding a kernel print that tells the user that
> the SCTP protocol for DLM should be considered experimental and is
> not recommended in production environments.
> 
> Signed-off-by: Gang He 
> ---
>  fs/dlm/lowcomms.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
> index cff79ea..18fd85d 100644
> --- a/fs/dlm/lowcomms.c
> +++ b/fs/dlm/lowcomms.c
> @@ -1307,6 +1307,7 @@ static int sctp_listen_for_all(void)
>   return -ENOMEM;
>  
>   log_print("Using SCTP for communications");
> + log_print("SCTP protocol is experimental, use at your own risk");
>  
>   result = sock_create_kern(&init_net, dlm_local_addr[0]->ss_family,
> SOCK_STREAM, IPPROTO_SCTP, &sock);
> -- 
> 1.8.5.6




[Cluster-devel] [PATCH] dlm: prompt the user SCTP is experimental

2018-03-19 Thread Gang He
As you know, the DLM module can use either the TCP or the SCTP protocol
to communicate within the cluster.
However, according to our testing, SCTP should still be considered
experimental, since not all aspects work correctly and it is not
fully tested.
For example, switching the SCTP connection channel hangs for about
5 minutes when one connection (ring) is broken.
I therefore suggest adding a kernel print that tells the user that
the SCTP protocol for DLM should be considered experimental and is
not recommended in production environments.

Signed-off-by: Gang He 
---
 fs/dlm/lowcomms.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index cff79ea..18fd85d 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1307,6 +1307,7 @@ static int sctp_listen_for_all(void)
 		return -ENOMEM;
 
 	log_print("Using SCTP for communications");
+	log_print("SCTP protocol is experimental, use at your own risk");
 
 	result = sock_create_kern(&init_net, dlm_local_addr[0]->ss_family,
 				  SOCK_STREAM, IPPROTO_SCTP, &sock);
-- 
1.8.5.6