Hi, I found this interesting, but I don't know whether it is normal or not.
I ran this command on 3 cluster nodes:

tcpdump -i eth0 ip multicast

and for some reason I am seeing the same output on all 3 servers, for example:

11:26:13.700399 IP http1.xxxxx.local.5149 > 239.192.2.185.netsupport: UDP, length 118

The same output appears on every one of the 3 servers. Is this normal output? (Here http1 is the node that is having trouble locating or relocating services in the cluster.) So basically, whatever I see on the http1 server I also see on the rest. Here 239.192.2.185 is the multicast address of the cluster.

Thanks
fosiul

On 27 September 2010 18:37, fosiul alam <[email protected]> wrote:

> Hi, Addition to my previous email have a look to this one
>
> from http1 ( where i am trying to relocate a service)
>
> [r...@http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
> Member http1.xxxx.local trying to enable service:httpd1...Success
> Warning: service:httpd1 is now running on mail01.xxxx.local
>
> so, its saying its Success..
> but it actually no..
>
> Thanks again
>
> On 27 September 2010 18:31, fosiul alam <[email protected]> wrote:
>
>> Hi
>> Thanks for your advise,
>> Currently i got this
>>
>> luci-0.12.2-12.el5.centos.1
>> ricci-0.12.2-12.el5.centos.1
>>
>> is this the same rpm as
>>
>> luci-0.12.2-12.el5_5.4.i386.rpm ?
>> ricci-0.12.2-12.el5_5.4.i386.rpm ?
>>
>> Thanks
>>
>> On 27 September 2010 17:55, Paul M. Dyer <[email protected]> wrote:
>>
>>> http://rhn.redhat.com/errata/RHBA-2010-0716.html
>>>
>>> It appears that this problem has been fixed in this errata.
>>>
>>> I installed the luci and ricci updates and did some lite testing. So
>>> far, the timeout 11111 error has not shown up.
>>>
>>> Paul
>>>
>>> ----- Original Message -----
>>> From: "fosiul alam" <[email protected]>
>>> To: "linux clustering" <[email protected]>
>>> Sent: Monday, September 27, 2010 10:48:27 AM
>>> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
>>>
>>> Hi
>>> i am trying to patch ricci .
>>> let see how it goes
>>>
>>> but clusvcadm is failing as well
>>>
>>> [r...@http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
>>> Member http1.xxxx.local trying to enable service:httpd1...Invalid operation for resource
>>>
>>> here, http1 , where i was trying to run the service from luci
>>>
>>> what could be the problem ?
>>> is there any way to find out if there is any problem with config ??
>>>
>>> On 27 September 2010 16:26, Ben Turner < [email protected] > wrote:
>>>
>>> RHEL 5.6 hasn't been released yet so your package probably contains the
>>> problem. I'm not sure how in sync Centos is with RHEL or if they patch
>>> earlier so I cannot give you a time frame when it will be in Centos or
>>> if they have already patched it. The problem in that BZ is more of an
>>> annoyance, you usually just have to retry a time or two and it works. If
>>> you can't get Luci working properly with your service at all you should
>>> try enabling the service through the command line with clusvcadm -e. If
>>> it is not working from the command line either then there is a problem
>>> with the service config.
>>>
>>> -Ben
>>>
>>> ----- "fosiul alam" < [email protected] > wrote:
>>>
>>> > Hi Ben
>>> > Thanks
>>> >
>>> > I named this cluster as mysql-server but i have not installed mysql
>>> > database in their yet
>>> >
>>> > and both luci and ricci on luci server and node1 is running this
>>> > version
>>> >
>>> > luci-0.12.2-12.el5.centos.1
>>> > ricci-0.12.2-12.el5.centos.1
>>> >
>>> > do you think this version has problem as well ??
>>> >
>>> > thanks for your help
>>> >
>>> > On 24 September 2010 15:33, Ben Turner < [email protected] > wrote:
>>> >
>>> > There is an issue with ricci timeouts that was fixed recently:
>>> >
>>> > https://bugzilla.redhat.com/show_bug.cgi?id=564490
>>> >
>>> > I'm not sure but you may be hitting that bug.
>>> > Symptoms include: luci
>>> > isn't able to get the status from the node, timeouts when querying
>>> > ricci, etc. The fix should be released with 5.6
>>> >
>>> > On the mysql service there are some options that you need to set. Here
>>> > are all the options available to that agent:
>>> >
>>> > mysql
>>> > Defines a MySQL database server
>>> >
>>> > Attribute              Description
>>> > config_file            Define configuration file
>>> > listen_address         Define an IP address for MySQL server. If the address
>>> >                        is not given then first IP address from the service is taken.
>>> > mysqld_options         Other command-line options for mysqld
>>> > name                   Name
>>> > ref                    Reference to existing mysql resource in the resources section.
>>> > service_name           Inherit the service name.
>>> > shutdown_wait          Wait X seconds for correct end of service shutdown
>>> > startup_wait           Wait X seconds for correct end of service startup
>>> > __enforce_timeouts     Consider a timeout for operations as fatal.
>>> > __failure_expire_time  Amount of time before a failure is forgotten.
>>> > __independent_subtree  Treat this and all children as an independent subtree.
>>> > __max_failures         Maximum number of failures before returning a
>>> >                        failure to a status check.
>>> >
>>> > If I recall correctly you may need to tweak:
>>> >
>>> > shutdown_wait          Wait X seconds for correct end of service shutdown
>>> > startup_wait           Wait X seconds for correct end of service startup
>>> >
>>> > There can be problems relocating the DB if it takes too long to
>>> > start/shutdown. If you are having problems relocating with luci it may
>>> > be a good idea to test with:
>>> >
>>> > # clusvcadm -r <service name> -m <cluster node>
>>> >
>>> > -Ben
>>> >
>>> > ----- "fosiul alam" < [email protected] > wrote:
>>> >
>>> > > Hi
>>> > > I have 4 nodes cluster,
>>> > > It was running fine.
>>> > > but today one nodes is giving trouble
>>> > >
>>> > > From luci Gui interface, when i try to relocate service into this node
>>> > > and trying to relocate from this nodes to another nodes
>>> > >
>>> > > from luci gui interface, its showing :
>>> > >
>>> > > Unable to retrieve batch 1908047789 status from
>>> > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
>>> > > Starting cluster service "httpd1" on node "http1.domain.local" -- You
>>> > > will be redirected in 5 seconds.
>>> > > also
>>> > >
>>> > > The ricci agent for this node is unresponsive. Node-specific
>>> > > information is not available at this time. :
>>> > >
>>> > > but ricci is running on problematic node ,
>>> > > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
>>> > >
>>> > > there is not any firewall running.
>>> > >
>>> > > iptables -L
>>> > > Chain INPUT (policy ACCEPT)
>>> > > target     prot opt source               destination
>>> > >
>>> > > Chain FORWARD (policy ACCEPT)
>>> > > target     prot opt source               destination
>>> > >
>>> > > Chain OUTPUT (policy ACCEPT)
>>> > > target     prot opt source               destination
>>> > >
>>> > > Chain RH-Firewall-1-INPUT (0 references)
>>> > > target     prot opt source               destination
>>> > >
>>> > > port 11111 is runningg
>>> > >
>>> > > netstat -an | grep 11111
>>> > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
>>> > >
>>> > > but still ricci is very unstable , and i cant relocate any service on
>>> > > this node or i cant relocate any service away from this node.
>>> > >
>>> > > from problematic node if i type this
>>> > >
>>> > > clustat
>>> > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
>>> > > Member Status: Quorate
>>> > >
>>> > > Member Name              ID   Status
>>> > > ------ ----              ---- ------
>>> > > beaver.xxx.local         1    Online, rgmanager ::: luci is running from this server
>>> > > publicdns1.xxxx.local    2    Online, rgmanager
>>> > > http1.xxxx.local         3    Online, Local, rgmanager
>>> > > mail01.xxxxx.local       4    Online, rgmanager
>>> > >
>>> > > Service Name             Owner (Last)              State
>>> > > ------- ----             ----- ------              -----
>>> > > service:httpd1           mail01.xxxx.local         started
>>> > > service:mysql-server     http1.xxxx.local          started ------------------- this is the problematic node
>>> > > service:public-dns       publicdns1.xxxxxx.local   started
>>> > >
>>> > > I cant move that service mysql-server from this node or cant relocate
>>> > > any service on this node ..
>>> > > I am very confused.
>>> > >
>>> > > what shall i do to fix this issue ??
>>> > >
>>> > > thanks for your advise.
>>> > >
>>> > > -- Linux-cluster mailing list
>>> > > [email protected]
>>> > > https://www.redhat.com/mailman/listinfo/linux-cluster
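
A rough way to double-check where a service really ended up (for example after clusvcadm reports Success but then warns that the service is running on another member, as in the quoted message above) is to read the service rows of clustat. The sketch below is only an illustration under stated assumptions: the function names parse_services and runs_on are made up, it assumes the plain-text row layout shown in the quoted clustat output (service name, owner, state), and it works on pasted text rather than calling clustat itself.

```python
# Sample rows copied from the clustat output quoted above.
SAMPLE = """\
Service Name          Owner (Last)             State
------- ----          ----- ------             -----
service:httpd1        mail01.xxxx.local        started
service:mysql-server  http1.xxxx.local         started
service:public-dns    publicdns1.xxxxxx.local  started
"""


def parse_services(clustat_text):
    """Map each 'service:NAME' row to an (owner, state) tuple."""
    services = {}
    for line in clustat_text.splitlines():
        fields = line.split()
        # Assumed row layout: service:<name> <owner> <state>
        if len(fields) >= 3 and fields[0].startswith("service:"):
            services[fields[0]] = (fields[1], fields[2])
    return services


def runs_on(clustat_text, service, node):
    """True only if the service is started and owned by the given node."""
    owner, state = parse_services(clustat_text).get(service, (None, None))
    return owner == node and state == "started"


if __name__ == "__main__":
    # httpd1 was requested on http1 but is owned by mail01, so this prints False.
    print(runs_on(SAMPLE, "service:httpd1", "http1.xxxx.local"))
```

Comparing the owner column against the member passed to clusvcadm -m gives a quick yes/no answer that does not depend on the "Success" text alone.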
--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster
