On Tue, Nov 15, 2011 at 3:29 AM, Nick Khamis <sym...@gmail.com> wrote:
> Hello Andrew,
>
> Thank you so much for your response. I wanted to clarify, I am running the
> pacemaker stack,

If you are running pacemaker on top of cman, do not use ocfs2_controld.pcmk.
Is that clearer?
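
As a rough illustration of that rule, here is a minimal Python sketch that
reads the "Stack:" line from crm_mon output and picks the matching daemon.
The function names and the mapping table are hypothetical. Only the cman
entry comes from this thread; the openais entry is an assumption.

```python
import re

# Hypothetical lookup table. The "cman" entry follows the advice in this
# thread; the "openais" entry is an assumption and is not stated here.
CONTROLD_BY_STACK = {
    "cman": "ocfs2_controld.cman",
    "openais": "ocfs2_controld.pcmk",  # assumption
}


def detect_stack(crm_mon_text):
    """Return the first word after 'Stack:' in crm_mon output, or None."""
    match = re.search(r"^\s*Stack:\s*(\S+)", crm_mon_text, re.MULTILINE)
    return match.group(1).lower() if match else None


def pick_controld(stack):
    """Return the daemon name for a stack, or None if the stack is unknown."""
    return CONTROLD_BY_STACK.get(stack)


if __name__ == "__main__":
    sample = "Stack: cman\nCurrent DC: astdrbd1 - partition with quorum\n"
    print(pick_controld(detect_stack(sample)))
```

Fed the crm_mon output quoted further down (Stack: cman), this picks
ocfs2_controld.cman. An unrecognised stack yields None rather than a guess.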

> and experiencing errors with ocf:pacemaker:o2cb and
> ocfs2_controld.pcmk. Tracking some of the o2cb processes, I wanted to say that:
>
> * aisexec does contain:
>
> export COROSYNC_DEFAULT_CONFIG_IFACE="openaisserviceenableexperimental:corosync_parser"
> corosync "$@"
>
> And when issuing ocfs2_controld.pcmk -D, I am receiving the following error:
>
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional service options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'corosync_quorum' for option: name
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional service options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'corosync_cman' for option: name
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional service options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'openais_clm' for option: name
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional service options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'openais_evt' for option: name
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional service options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'openais_ckpt' for option: name
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional service options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'openais_msg' for option: name
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional service options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'openais_lck' for option: name
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional service options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'openais_tmr' for option: name
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next: No
> additional configuration supplied for: service
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: config_find_next:
> Processing additional quorum options...
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_config_opt: Found
> 'quorum_cman' for option: provider
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_cluster_type:
> Detected an active 'cman' cluster
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: get_local_node_name:
> Using CMAN node name: astdrbd1
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info:
> init_ais_connection_once: Connection to 'cman': established
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: crm_new_peer: Node
> astdrbd1 now has id: 1
> ocfs2_controld[10601]: 2011/11/14_11:26:22 info: crm_new_peer: Node 1
> is now known as astdrbd1
> ocfs2_controld[10601]: 2011/11/14_11:26:22 ERROR: crm_abort:
> send_ais_text: Triggered assert at corosync.c:352 : dest !=
> crm_msg_ais
> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[10601]: 2011/11/14_11:26:22 ERROR: send_ais_text:
> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[10601]: 2011/11/14_11:26:22 ERROR: crm_abort:
> send_ais_text: Triggered assert at corosync.c:352 : dest !=
> crm_msg_ais
> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[10601]: 2011/11/14_11:26:22 ERROR: send_ais_text:
> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> 1321287982 setup_stack@170: Cluster connection established.  Local node id: 1
> 1321287982 setup_stack@174: Added Pacemaker as client 1 with fd -1
>
> Thanks in Advance,
>
> Nick.
>
>
>
> On Sun, Nov 13, 2011 at 7:44 PM, Andrew Beekhof <and...@beekhof.net> wrote:
>> On Mon, Nov 14, 2011 at 11:12 AM, Nick Khamis <sym...@gmail.com> wrote:
>>> Hello Andrew,
>>>
>>> Thank you so much for your response. I am using ocfs-tools 1.6, and it only
>>> includes pcmk and cman ocfs2 controld:
>>>
>>> ocfs2_controld.cman  ocfs2_controld.pcmk  ocfs2_hb_ctl
>>>
>>> Which stack provides the standard ocfs2_controld?
>>
>> If you're running cman, use the cman one
>>
>>>
>>> Thanks for Everything!
>>>
>>> Nick.
>>>
>>> If it's cman
>>>
>>> On Sun, Nov 13, 2011 at 6:49 PM, Andrew Beekhof <and...@beekhof.net> wrote:
>>>> On Sat, Nov 12, 2011 at 12:06 AM, Nick Khamis <sym...@gmail.com> wrote:
>>>>> Hello Andrew,
>>>>>
>>>>> I do apologize for this, and really appreciate how far I have got into
>>>>> this project thanks to everyone's help. Just as a quick summary:
>>>>>
>>>>> the patch that you suggested did in fact fix the following (ais.c:346):
>>>>>
>>>>> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: crm_abort:
>>>>> send_ais_text: Triggered assert at ais.c:346 : dest != crm_msg_ais
>>>>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>>>>> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: send_ais_text:
>>>>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>>>>> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: crm_abort:
>>>>> send_ais_text: Triggered assert at ais.c:346 : dest != crm_msg_ais
>>>>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>>>>> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: send_ais_text:
>>>>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>>>>> 1320247939 setup_stack@170: Cluster connection established.  Local node id: 1
>>>>> 1320247939 setup_stack@174: Added Pacemaker as client 1 with fd -1
>>>>>
>>>>> The run-time error I am getting now is in (corosync.c:352):
>>>>>
>>>>> ocfs2_controld[6883]: 2011/11/03_16:34:20 info: crm_new_peer: Node 1
>>>>> is now known as astdrbd1
>>>>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
>>>>> send_ais_text: Triggered assert at corosync.c:352 : dest !=
>>>>> crm_msg_ais
>>>>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>>>>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
>>>>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>>>>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
>>>>> send_ais_text: Triggered assert at corosync.c:352 : dest !=
>>>>> crm_msg_ais
>>>>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>>>>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
>>>>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>>>>> 1320352460 setup_stack@170: Cluster connection established.  Local node id: 1
>>>>> 1320352460 setup_stack@174: Added Pacemaker as client 1 with fd -1
>>>>>
>>>>>
>>>>> * The controld RA is using the standard dlm_controld, and this is now
>>>>>   working.
>>>>> * The o2cb RA is using ocfs2_controld.pcmk, and this is where I am
>>>>>   running into the runtime error with corosync.c
>>>>
>>>> As I mentioned in the last email, you're not supposed to use
>>>> ocfs2_controld.pcmk with cman.
>>>> You must use the standard ocfs2_controld
>>>>
>>>>>
>>>>>>
>>>>>> IMO (and as Florian alluded to in another message), you'd probably save
>>>>>> yourself a lot of trouble taking prebuilt packages from a distro where
>>>>>> the pieces you need are known to work together.
>>>>>
>>>>>> Indeed.
>>>>>
>>>>> There is no resenting that! But I am so close. Actually, I do have things
>>>>> working without the o2cb primitive, i.e., pcmk is starting the dual-primary
>>>>> drbd and the cloned dlm, and mounting the cloned ocfs2 filesystem:
>>>>>
>>>>> root@astdrbd1:~# /etc/init.d/cman start
>>>>> Starting cluster:
>>>>>   Checking if cluster has been disabled at boot... [  OK  ]
>>>>>   Checking Network Manager... [  OK  ]
>>>>>   Global setup... [  OK  ]
>>>>>   Loading kernel modules... [  OK  ]
>>>>>   Mounting configfs... [  OK  ]
>>>>>   Starting cman... [  OK  ]
>>>>>   Waiting for quorum... [  OK  ]
>>>>>   Starting fenced... [  OK  ]
>>>>>   Starting dlm_controld... [  OK  ]
>>>>>   Unfencing self... [  OK  ]
>>>>>   Joining fence domain... [  OK  ]
>>>>>
>>>>> root@astdrbd1:~# /etc/init.d/pacemaker start
>>>>> Starting Pacemaker Cluster Manager: touch: missing file operand
>>>>> Try `touch --help' for more information.
>>>>> [  OK  ]
>>>>>
>>>>>
>>>>> ============
>>>>> Last updated: Fri Nov 11 07:36:11 2011
>>>>> Last change: Fri Nov 11 07:33:06 2011 via crmd on astdrbd1
>>>>> Stack: cman
>>>>> Current DC: astdrbd1 - partition with quorum
>>>>> Version: 1.1.6-2d8fad5
>>>>> 2 Nodes configured, 2 expected votes
>>>>> 7 Resources configured.
>>>>> ============
>>>>>
>>>>> Online: [ astdrbd1 astdrbd2 ]
>>>>>
>>>>> astIP   (ocf::heartbeat:IPaddr2):       Started astdrbd1
>>>>>  Master/Slave Set: msASTDRBD [astDRBD]
>>>>>     Masters: [ astdrbd2 astdrbd1 ]
>>>>>  Clone Set: astDLMClone [astDLM]
>>>>>     Started: [ astdrbd2 astdrbd1 ]
>>>>>  Clone Set: astFilesystemClone [astFilesystem]
>>>>>     Started: [ astdrbd2 astdrbd1 ]
>>>>>
>>>>>
>>>>> Of course, o2cb is not pcmk cluster aware right now and needs to be
>>>>> started manually.
>>>>>
>>>>> Vladislav, if you are getting this, I can test for the kernel bug that slows
>>>>> down ocfs2, which you reported earlier. Is there any test you would like me
>>>>> to perform?
>>>>>
>>>>>
>>>>> Kind Regards,
>>>>>
>>>>> Nick.
>>>>> _______________________________________________
>>>>> Linux-HA mailing list
>>>>> Linux-HA@lists.linux-ha.org
>>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>>> See also: http://linux-ha.org/ReportingProblems
>>>>>