Following are a few logs from /var/log/messages.

A malformed packet is received and the connection is closed by the controller.

//Controller Logs
Jul  2 13:14:19 VEM-1 osafimmd[2641]: Node 11d0f request sync sync-pid:2841 
epoch:0
Jul  2 13:14:20 VEM-1 osafimmnd[2656]: Announce sync, epoch:3
Jul  2 13:14:20 VEM-1 osafimmnd[2656]: SERVER STATE: IMM_SERVER_READY --> 
IMM_SERVER_SYNC_SERVER
Jul  2 13:14:20 VEM-1 osafimmd[2641]: Successfully announced sync. New ruling 
epoch:3
Jul  2 13:14:20 VEM-1 osafimmnd[2656]: NODE STATE-> IMM_NODE_R_AVAILABLE
Jul  2 13:14:20 VEM-1 immload: Sync starting
Jul  2 13:14:20 VEM-1 immload: Synced 1541 objects in total
Jul  2 13:14:20 VEM-1 osafimmnd[2656]: NODE STATE-> IMM_NODE_FULLY_AVAILABLE 
12197
Jul  2 13:14:20 VEM-1 osafimmnd[2656]: Epoch set to 3 in ImmModel
Jul  2 13:14:20 VEM-1 osafimmd[2641]: ACT: New Epoch for IMMND process at node 
1010f old epoch: 2  new epoch:3
Jul  2 13:14:20 VEM-1 immload: Sync ending normally
Jul  2 13:14:21 VEM-1 osafimmnd[2656]: SERVER STATE: IMM_SERVER_SYNC_SERVER --> 
IMM_SERVER_READY
Jul  2 13:14:21 VEM-1 osafdtmd[2593]: DTM:dtm_comm_socket_recv() failed rc : 79
Jul  2 13:14:21 VEM-1 osafclmd[2707]: Node 72975 doesn't exist
Jul  2 13:14:21 VEM-1 osafimmnd[2656]: Global discard node received for 
nodeId:11d0f pid:2841
Jul  2 13:22:18 VEM-1 osafimmnd[2656]: Global discard node received for 
nodeId:11d0f pid:0
Jul  2 13:22:34 VEM-1 osafimmd[2641]: Node 11d0f request sync sync-pid:3382 
epoch:0
Jul  2 13:22:35 VEM-1 osafimmnd[2656]: Announce sync, epoch:4
Jul  2 13:22:35 VEM-1 osafimmnd[2656]: SERVER STATE: IMM_SERVER_READY --> 
IMM_SERVER_SYNC_SERVER
Jul  2 13:22:35 VEM-1 osafimmd[2641]: Successfully announced sync. New ruling 
epoch:4

//Payload Logs
Jul  2 13:14:16 SDB-1 opensafd: Starting OpenSAF Services
Jul  2 13:14:18 SDB-1 osafdtmd[2823]: Started
Jul  2 13:14:19 SDB-1 osafimmnd[2841]: Started
Jul  2 13:14:19 SDB-1 osafimmnd[2841]: Director Service is up
Jul  2 13:14:19 SDB-1 osafimmnd[2841]: SERVER STATE: IMM_SERVER_ANONYMOUS --> 
IMM_SERVER_CLUSTER_WAITING
Jul  2 13:14:19 SDB-1 osafimmnd[2841]: SERVER STATE: IMM_SERVER_CLUSTER_WAITING 
--> IMM_SERVER_LOADING_PENDING
Jul  2 13:14:19 SDB-1 osafimmnd[2841]: SERVER STATE: IMM_SERVER_LOADING_PENDING 
--> IMM_SERVER_SYNC_PENDING
Jul  2 13:14:19 SDB-1 osafimmnd[2841]: NODE STATE-> IMM_NODE_ISOLATED
Jul  2 13:14:20 SDB-1 osafimmnd[2841]: NODE STATE-> IMM_NODE_W_AVAILABLE
Jul  2 13:14:20 SDB-1 osafimmnd[2841]: SERVER STATE: IMM_SERVER_SYNC_PENDING 
--> IMM_SERVER_SYNC_CLIENT
Jul  2 13:14:21 SDB-1 osafdtmd[2823]: DTM: Malformed packet recd, Ident : 
1097688659, ver : 101
Jul  2 13:14:21 SDB-1 osafdtmd[2823]: DTM: Malformed packet recd, Ident : 
1634956110, ver : 97
Jul  2 13:14:21 SDB-1 osafdtmd[2823]: DTM: Malformed packet recd, Ident : 
1883464806, ver : 0
Jul  2 13:14:21 SDB-1 osafimmnd[2841]: Director Service in NOACTIVE state
Jul  2 13:14:21 SDB-1 osafimmnd[2841]: Director Service is down
Jul  2 13:22:18 SDB-1 opensafd[2810]: Timed-out for response from IMMND
Jul  2 13:22:18 SDB-1 opensafd[2810]:
Jul  2 13:22:18 SDB-1 opensafd[2810]: Going for recovery
Jul  2 13:22:18 SDB-1 opensafd[2810]: Trying To RESPAWN 
/usr/lib64/opensaf/clc-cli/osaf-immnd attempt #1
Jul  2 13:22:18 SDB-1 opensafd[2810]: Sending SIGKILL to IMMND, pid=2829
Jul  2 13:22:34 SDB-1 osafimmnd[3382]: Started
Jul  2 13:22:34 SDB-1 osafimmnd[3382]: Director Service is up
Jul  2 13:22:34 SDB-1 osafimmnd[3382]: SERVER STATE: IMM_SERVER_ANONYMOUS --> 
IMM_SERVER_CLUSTER_WAITING
Jul  2 13:22:34 SDB-1 osafimmnd[3382]: SERVER STATE: IMM_SERVER_CLUSTER_WAITING 
--> IMM_SERVER_LOADING_PENDING
Jul  2 13:22:34 SDB-1 osafimmnd[3382]: SERVER STATE: IMM_SERVER_LOADING_PENDING 
--> IMM_SERVER_SYNC_PENDING
Jul  2 13:22:34 SDB-1 osafimmnd[3382]: NODE STATE-> IMM_NODE_ISOLATED
Jul  2 13:22:35 SDB-1 osafimmnd[3382]: NODE STATE-> IMM_NODE_W_AVAILABLE
Jul  2 13:22:35 SDB-1 osafimmnd[3382]: SERVER STATE: IMM_SERVER_SYNC_PENDING 
--> IMM_SERVER_SYNC_CLIENT
Jul  2 13:22:36 SDB-1 osafdtmd[2823]: DTM: Malformed packet recd, Ident : 
1231908161, ver : 116
Jul  2 13:22:36 SDB-1 osafimmnd[3382]: Director Service in NOACTIVE state
Jul  2 13:22:36 SDB-1 osafimmnd[3382]: Director Service is down
Jul  2 13:25:36 SDB-1 osafimmnd[3382]: Director Service is down
Jul  2 13:30:34 SDB-1 opensafd[2810]: Timed-out for response from IMMND
Jul  2 13:30:34 SDB-1 opensafd[2810]: Could Not RESPAWN IMMND
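
From what I can tell, the "DTM: Malformed packet recd, Ident : X, ver : Y" lines mean that dtmd checks a fixed identifier and version field at the start of every received message and drops the connection when they do not match. A rough sketch of that kind of check is below; this is only my own illustration, and EXPECTED_IDENT / EXPECTED_VER are placeholders, not the real OpenSAF DTM constants.

/* Illustration only: a generic header check of the kind that would
 * produce a "DTM: Malformed packet recd, Ident : X, ver : Y" log.
 * EXPECTED_IDENT and EXPECTED_VER are placeholders, not the actual
 * OpenSAF DTM values. */
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl() */
#include <syslog.h>

#define EXPECTED_IDENT 0x56455231u  /* placeholder identifier */
#define EXPECTED_VER   1u           /* placeholder version    */

/* Returns 0 if the header looks sane, -1 if the packet is malformed. */
static int check_dtm_header(const uint8_t *buf, size_t len)
{
        uint32_t ident;
        uint8_t ver;

        if (len < sizeof(ident) + 1)
                return -1;

        memcpy(&ident, buf, sizeof(ident));
        ident = ntohl(ident);
        ver = buf[sizeof(ident)];

        if (ident != EXPECTED_IDENT || ver != EXPECTED_VER) {
                syslog(LOG_ERR,
                       "DTM: Malformed packet recd, Ident : %u, ver : %u",
                       (unsigned)ident, (unsigned)ver);
                return -1;
        }
        return 0;
}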

Getting this resolved is critical to our product.
Does anyone have any idea what is happening here?

Thanks,
Nivrutti

-----Original Message-----
From: Anders Björnerstedt [mailto:[email protected]] 
Sent: Thursday, July 02, 2015 2:14 PM
To: Nivrutti Kale; [email protected]
Subject: RE: [users] Timeout for response from IMMD

OK, 1600 objects is nothing.
The sync should be done within seconds of being started.
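Even assuming an average size of a few hundred bytes per object, 1600 objects is only about 1600 x 300 = 480 000 bytes, i.e. roughly half a megabyte to transfer.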

So you apparently have some configuration problem or communication problem.
Hard to say what it is.

/AndersBj

-----Original Message-----
From: Nivrutti Kale [mailto:[email protected]]
Sent: den 2 juli 2015 09:26
To: Anders Björnerstedt; [email protected]
Subject: RE: [users] Timeout for response from IMMD

I have around 1600 objects which synced up correctly.

When I enabled the dtmd trace between the nodes, I can see that some malformed packets are received on the payload node and the connection between the controller and the payload is closed.
After this there is no dtmd trace for those 8 minutes, and after 8 minutes the "Time-out for response from IMMD" log appears in /var/log/messages.
Any idea what is happening here?

Is it possible that there is a communication break between the two VMs? Every time, the delay is almost exactly 8 minutes.
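
For context on where the 8 minutes comes from: the "Timed-out for response from IMMND" line is logged by opensafd itself, so it looks like a supervisor-side timer expiring while it waits for IMMND to report in, rather than anything at the TCP level. Schematically it would be something like the sketch below; the 480-second value and the names are placeholders of my own, not the actual opensafd code.

/* Schematic only: how a supervising process might wait for a child
 * service to report "ready" before giving up and starting recovery.
 * The 480 s value and the function name are placeholders, not the
 * actual opensafd/nid implementation. */
#include <poll.h>
#include <stdio.h>

#define RESPONSE_TIMEOUT_MS (480 * 1000)  /* ~8 minutes, placeholder */

/* fd is e.g. the read end of a pipe the child writes to when ready. */
static int wait_for_service_ready(int fd)
{
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int rc = poll(&pfd, 1, RESPONSE_TIMEOUT_MS);

        if (rc == 0) {
                fprintf(stderr, "Timed-out for response from IMMND\n");
                return -1;  /* caller goes for recovery / respawn */
        }
        return (rc > 0) ? 0 : -1;
}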

Thanks,
Nivrutti

-----Original Message-----
From: Anders Björnerstedt [mailto:[email protected]]
Sent: Tuesday, June 30, 2015 1:45 PM
To: Nivrutti Kale; [email protected]
Subject: RE: [users] Timeout for response from IMMD

Eight minutes is extremely long for a sync.
Sync time of course depends on the volume to be synced.

Roughly how much data are you using?
That is, the number of IMM objects and the average size per IMM object.

The IMM programmer's reference doc points out that the IMM is not suitable for storing large volumes of data.
It is regularly tested to cope with 300 000 objects of 300 bytes average size.
If you go beyond that, you are stretching the use case.
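For scale, 300 000 objects of 300 bytes each is roughly 90 MB of object data in total.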

The IMM is intended only for storing config data and runtime data for configuring and reflecting services running in that OpenSAF cluster.

/AndersBj



-----Original Message-----
From: Nivrutti Kale [mailto:[email protected]]
Sent: den 30 juni 2015 10:05
To: [email protected]
Subject: Re: [users] Timeout for response from IMMD

Hi All,

Sometimes I am getting the "Time-out for response from IMMD" issue while starting one of the payloads.
I am using OpenSAF 4.5, though I have seen this issue with 4.2.2 as well.

I also want to understand how IMM sync works between the payload and the controller.
In this case the payload waits for almost 8 minutes before any error recovery.

Please find attached the logs for the controller and the payload.

Can we control these timeout values, or can we turn off the IMM sync, so that the payload does not have to wait this long before error recovery?


Thanks,
Nivrutti
