Re: [Openais] strerror_r usage

2011-06-24 Thread Dietmar Maurer
Can anybody see that problem (or do I misunderstand the manual page)? > -Original Message- > From: openais-boun...@lists.linux-foundation.org [mailto:openais- > boun...@lists.linux-foundation.org] On Behalf Of Dietmar Maurer > Sent: Thursday, 23 June 2011 07:5

[Openais] corosync and CONFIG_RT_GROUP_SCHED

2011-06-23 Thread Dietmar Maurer
I just noticed that corosync is unable to set the RT scheduler priority when I enable CONFIG_RT_GROUP_SCHED (it returns EPERM). Is that expected? kernel: http://download.openvz.org/kernel/branches/rhel6-2.6.32/042stab018.1/ - Dietmar ___ Openais mailing

[Openais] strerror_r usage

2011-06-22 Thread Dietmar Maurer
There are two versions of strerror_r ('man strerror_r'). corosync's configure.ac defines -D_GNU_SOURCE, so we use the GNU-specific version. The manual page claims: The GNU-specific strerror_r() returns a pointer to a string containing the error message. This may be either a pointer
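The practical consequence of the GNU variant can be sketched as follows. This is a minimal example, assuming glibc with _GNU_SOURCE defined (as the message says the corosync build does); the helper names describe_errno and log_errno are hypothetical and not taken from corosync. The caller has to use the returned pointer, because the GNU version is not required to fill the supplied buffer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the message text for errnum.
 * Assumes the GNU-specific strerror_r (glibc with _GNU_SOURCE), which
 * returns a char * that may point into buf or to a static string. */
static const char *describe_errno(int errnum, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    return strerror_r(errnum, buf, len);
}

/* Hypothetical usage: print the returned pointer, not buf itself. */
static void log_errno(const char *what, int errnum)
{
    char buf[128];
    fprintf(stderr, "%s: %s\n", what,
            describe_errno(errnum, buf, sizeof(buf)));
}
```

With the XSI-compliant variant the return type is int and the buffer is always filled, so code written for one variant does not carry over to the other unchanged.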

[Openais] fix log message

2011-06-22 Thread Dietmar Maurer
--- corosync-1.3.1/exec/main.c.org 2011-06-23 06:58:38.0 +0200 +++ corosync-1.3.1/exec/main.c 2011-06-23 06:58:44.0 +0200 @@ -1230,10 +1230,10 @@ if (res == -1) { char error_str[100]; strerror_r (errno, error_str,

[Openais] missing git tag v1.3.1

2011-06-22 Thread Dietmar Maurer
see http://www.corosync.org/git/?p=corosync.git;a=summary I thought there should be a tag for v1.3.1? And what is the difference between head 'flatiron' and 'flatiron-1.3'? - Dietmar

Re: [Openais] corosync 1.3.0 not stable under load

2010-12-22 Thread Dietmar Maurer
> I pload is not destructive, but it does not play nicely. It is a tool to > compare > sending messages without IPC being involved (compare with cpgbench, not to be > used at the same time). Many thanks for the explanation. I just wanted to do some stress testing, and thought the pload service was

Re: [Openais] corosync 1.3.0 not stable under load

2010-12-22 Thread Dietmar Maurer
> > Any idea? > > Don't run pload. That's the issue. It's not meant to be executed on something > that needs to survive. Ok, thanks (I guess it would be a good idea to mention that in the manual page). - Dietmar

Re: [Openais] corosync 1.3.0 not stable under load

2010-12-22 Thread Dietmar Maurer
osync-pload Any idea? > > Fabio > > On 12/22/2010 9:52 AM, Dietmar Maurer wrote: > > Corosync v1.3.0 (single node) > > > > Debian Squeeze AMD64 with latest 2.6.32 kernel > > > > > > > > When I run "corosync-pload" it prints: >

[Openais] corosync 1.3.0 not stable under load

2010-12-22 Thread Dietmar Maurer
Corosync v1.3.0 (single node) Debian Squeeze AMD64 with latest 2.6.32 kernel When I run "corosync-pload" it prints: # corosync-pload Init result 1 The process never stops (though I can stop it with Ctrl-C), but it seems to work anyway: Dec 22 09:32:46 maui corosync[2409]: [PLOAD ] 150 Wri

Re: [Openais] question abou qb_ipcs_msg_process_fn()

2010-12-09 Thread Dietmar Maurer
> > I thought the return value is used to indicate errors, but if I simply > > return a negative number my client hangs forever. > > Currently if you return anything but -ENOBUFS or -EAGAIN the server will > remove the socket from the poll loop. I can't reproduce that behavior. My server continues

[Openais] question abou qb_ipcs_msg_process_fn()

2010-12-09 Thread Dietmar Maurer
Hi all, I am playing around with libqb, writing my first test server. Normal operation works quite well so far. I just wonder how to handle errors on the server side, especially in qb_ipcs_msg_process_fn: typedef int32_t (*qb_ipcs_msg_process_fn) (qb_ipcs_connection_t *c, void *da

[Openais] newlines in qb_util_log messages

2010-12-03 Thread Dietmar Maurer
I assume those newlines "\n" should not be there? --- ipcs.c.org 2010-12-03 09:32:58.0 +0100 +++ ipcs.c 2010-12-03 09:33:08.0 +0100 @@ -168,7 +168,7 @@ assert(s->ref_count > 0); free_it = qb_atomic_int_dec_and_test(&s->ref_count); if (free_it) { -

[Openais] confdb_reload_notify

2010-12-01 Thread Dietmar Maurer
Hi all, I am trying to track confdb changes, i.e. I want to get notified about cman config changes. corosync-objctl -t cluster Type "q" to finish object_deleted>cluster I only get one "confdb_object_delete_change_notify" notification after running "cman_tool version -r -S". Is that expected behavio

Re: [Openais] using libcoroipcs with epoll

2010-11-20 Thread Dietmar Maurer
> If I use libqb, it creates a second thread pool, so I end up running twice as > many threads? Oh, I guess this only matters if I write a corosync service. Simply ignore that question. - Dietmar

Re: [Openais] using libcoroipcs with epoll

2010-11-20 Thread Dietmar Maurer
> I'd recommend using libqb if you want to use epoll with ipc - it's already > available. It's far improved and where we are headed in corosync 2.0. Thanks for that hint. I want to use it inside an application which uses CPG (and corosync 1.2), so I thought if I use libcoroipcs/ipcc I share t

Re: [Openais] multiple CPG leave messages

2010-11-20 Thread Dietmar Maurer
> No, you should either get a LEAVE or a PROCDOWN but not both. I have > verified this is indeed the case on master. Could you describe your test case > in more detail? Well, I will try to extract a simple test which shows the behavior next week. - Dietmar

[Openais] using libcoroipcs with epoll

2010-11-19 Thread Dietmar Maurer
Hi all, is there some example code on how to use libcoroipcs with epoll? Is that expected to work? Or do I need to use poll() instead? - Dietmar

[Openais] multiple CPG leave messages

2010-11-18 Thread Dietmar Maurer
Hi all, I just noticed that I get two CPG leave messages for the same configuration. For example, I get: members: 1 2 left: 3 (CPG_REASON_LEAVE) and then: members: 1 2 left: 3 (CPG_REASON_PROCDOWN) Is that expected behavior? - Dietmar

Re: [Openais] announcement of the vinzvault project

2010-04-27 Thread Dietmar Maurer
> Our project is focused around one goal: providing a small footprint > (10kloc) highly available block storage area for virtual machines > optimized for Linux data-centers. Our plans don't depend on SAN > hardware, software, hardware fencing devices, or any other hardware > than > is commonly ava

Re: [Openais] Add a graph hash table to libqb

2010-04-14 Thread Dietmar Maurer
> The long term goal of this work is to enable a replicated structured > memory-based key-value storage that maintains consistency after a merge > from a network partition. This allows IPC speed reads, and network > speed writes of key/value pairs with full availability of key/value > data on all

Re: [Openais] does self-fencing makes sense?

2010-02-25 Thread Dietmar Maurer
Do you have an idea what the best place to implement self-fencing is? Can we simply use softdog inside the quorum service (to trigger a reboot when we lose quorum)? Or is that too simple? Or is fenced the better place? - Dietmar > > But what I've heard so far is that many users do not understand > >

Re: [Openais] stale CPG members in confchg callback

2010-02-25 Thread Dietmar Maurer
sync 1.2.0. Is somebody able to reproduce that issue? - Dietmar > Regards, > Honza > > Dietmar Maurer wrote: > > Just found the following commit: > > > > > http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commitdiff; > h=bcc5fdef8473d99399c624a7b

Re: [Openais] stale CPG members in confchg callback

2010-02-23 Thread Dietmar Maurer
d after IPC is finished. > > > > Maybe it is bug. Do you have any reproduces? > > > > Thanks, > > Honza > > > > Dietmar Maurer wrote: > > >> Inside my CPG application, The confchg callback is called with > > 'dead' > > &

Re: [Openais] stale CPG members in confchg callback

2010-02-23 Thread Dietmar Maurer
d after IPC is finished. > > Maybe it is bug. Do you have any reproduces? > > Thanks, > Honza > > Dietmar Maurer wrote: > >> Inside my CPG application, The confchg callback is called with > 'dead' > >> members: > >> > >

Re: [Openais] does self-fencing makes sense?

2010-02-23 Thread Dietmar Maurer
> > Currently there are two choices 1) power fencing 2) no fencing. > > Well, you could configure both. > But you'd end up with the node being power fenced after committing > seppuku, which wouldn't be ideal. But we can do: 1.) use regular fencing if configured 2.) If there is no fencing conf

Re: [Openais] does self-fencing makes sense?

2010-02-22 Thread Dietmar Maurer
> > There are thousands of interactions with power fencing and every one > > of them needs to work perfectly for power fencing to work. > > That's not the problem. > It's the false positives you need to worry about (devices that report > success when power fencing failed). > > When power fencing fa

Re: [Openais] stale CPG members in confchg callback

2010-02-22 Thread Dietmar Maurer
> Dietmar, > process *should* be removed after IPC is finished. > > Maybe it is a bug. Do you have a reproducer? > It happens quite often here, but I have no easy way to reproduce it. Where is the 'IPC' code responsible for removing the process (maybe I can debug it myself)? - Dietmar

Re: [Openais] stale CPG members in confchg callback

2010-02-22 Thread Dietmar Maurer
> Inside my CPG application, the confchg callback is called with 'dead' > members: > > [debug] cpg member node 3 pid 1132 > [debug] cpg member node 3 pid 14640 > > For example, process 1132 does not exist any longer on node 3. Any idea > what > can cause such 'ghost' entries? If I run corosync-c

[Openais] stale CPG members in confchg callback

2010-02-22 Thread Dietmar Maurer
Hi all, I am currently testing cman (3.0.7) and corosync 1.2.0 on a Debian squeeze box. Inside my CPG application, the confchg callback is called with 'dead' members: [debug] cpg member node 3 pid 1132 [debug] cpg member node 3 pid 14640 For example, process 1132 does not exist any longer on node 3.

[Openais] does self-fencing makes sense?

2010-02-19 Thread Dietmar Maurer
Hi all, I just found a whitepaper from XenServer - it seems they implement some kind of self-fencing: -text from XenServer High Availability Whitepaper--- The worst-case scenario for HA is the situation where a host is thought to be off-line but is actually still writing to the shared storage

Re: [Openais] configuring nodes in corosync

2010-02-19 Thread Dietmar Maurer
> On 19/02/10 01:30, Steven Dake wrote: > > On Thu, 2010-02-18 at 16:04 -0800, Alan Jones wrote: > >> Friends, > >> Is there any (undocumented) configuration option for corosync to > list > >> the nodes in the cluster similar to heartbeat? > >> Alan > >> > > > > No but clearly this is a gap. This

Re: [Openais] quorum service question

2010-01-28 Thread Dietmar Maurer
> >>> But that can't work. corosync (when using cman) needs cluster.conf > to > >>> start up. So if the filesystem depends on corosync you have a > catch- > >> 22 > >>> situation: the filesystem can't be mounted because corosync isn't > >>> running > >> > >> That is no problem. You can mount the fi

Re: [Openais] quorum service question

2010-01-28 Thread Dietmar Maurer
> -Original Message- > From: Dietmar Maurer > Sent: Thursday, 28 January 2010 11:31 > To: 'Christine Caulfield' > Cc: openais@lists.linux-foundation.org > Subject: RE: [Openais] quorum service question > > > But that can't work. corosy

Re: [Openais] quorum service question

2010-01-28 Thread Dietmar Maurer
> But that can't work. corosync (when using cman) needs cluster.conf to > start up. So if the filesystem depends on corosync you have a catch-22 > situation: the filesystem can't be mounted because corosync isn't > running That is no problem. You can mount the filesystem, but it is read-only as lo

Re: [Openais] quorum service question

2010-01-28 Thread Dietmar Maurer
> -Original Message- > From: Christine Caulfield [mailto:ccaul...@redhat.com] > Sent: Thursday, 28 January 2010 10:49 > To: Dietmar Maurer > Cc: openais@lists.linux-foundation.org > Subject: Re: [Openais] quorum service question > > On 28/01/10 09:38, Dietmar M

Re: [Openais] quorum service question

2010-01-28 Thread Dietmar Maurer
> >>> But who calls 'cman_tool version' to actually sync the config? > >> > >> You do ;-) > > > > Really? I have to connect to each node and call 'cman_tool version'? > > No, just on one node. But what if the cluster is partitioned?

Re: [Openais] quorum service question

2010-01-28 Thread Dietmar Maurer
> >> It's called ccs_sync and is part of the ricci package. > >> > >> cman_tool will automatically call this (if it's installed) when you > >> update the version number and call 'cman_tool version'. > > > > Ok, found that code. Seems that ricci does the following: > > > > - generate new config file

Re: [Openais] quorum service question

2010-01-28 Thread Dietmar Maurer
> > I just tried to find the code which syncs cluster.conf to all nodes > > in current Red Hat Cluster, but was not able to find it. Please can > someone > > give me a hint what source files implement that? Seems there is no > ccsd in > > cluster3? > > > > > > It's called ccs_sync and is part of t

Re: [Openais] quorum service question

2010-01-27 Thread Dietmar Maurer
> > >> The quorum service really works best if you have a common > > configuration > > >> system so that the values are the same on all nodes. > > > > > > The question is how to do that. Quorum settings are read from > > /etc/corosync/corosync.conf, but there is no way to write that file - > > the c

Re: [Openais] quorum.h or votequorum.h

2009-11-18 Thread Dietmar Maurer
> >>> these weren't available for openais whitetank, so we have a custom > >>> version that we will "soon" replace with the one from corosync. > >> > >> Great! I guess you will use votequorum? > > > > It would be configurable I imagine. > > > If all you need to know is "do we have quorum?" then y

Re: [Openais] cpg_join and CS_ERR_TRY_AGAIN

2009-11-18 Thread Dietmar Maurer
> On Wed, 2009-11-18 at 14:01 +0100, Dietmar Maurer wrote: > > I just noticed that cpg_join returns CS_ERR_TRY_AGAIN, but it joins > the group anyway. Is that expected? > > > No. How do you know it joined the group? I got a configuration change event. > How do you re

Re: [Openais] quorum service question

2009-11-18 Thread Dietmar Maurer
> > So quorum settings should be the same on all nodes. What other > settings should be the same on all nodes? Are the totem settings local > or should they be the same on all nodes? > > > > - Dietmar > > > > Each node's config file should be an exact copy, including totem > network > settings. I

[Openais] cpg_join and CS_ERR_TRY_AGAIN

2009-11-18 Thread Dietmar Maurer
I just noticed that cpg_join returns CS_ERR_TRY_AGAIN, but it joins the group anyway. Is that expected? Also, I get that CS_ERR_TRY_AGAIN many times, so how should I call it: while ((result = cpg_join(mdb->cpg_handle, &mdb->cpg_group_name)) == CS_ERR_TRY_AGAIN) sleep (1); Or is there
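A bounded variant of the retry loop in the message above could look like the sketch below. It is illustrative only: demo_status_t, demo_join_fn, join_with_retry and flaky_join are hypothetical stand-ins, so the example compiles without libcpg. Real code would call cpg_join and compare against CS_ERR_TRY_AGAIN instead.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-ins for corosync status codes (values arbitrary). */
typedef enum { DEMO_OK, DEMO_TRY_AGAIN, DEMO_FAILED } demo_status_t;

/* Hypothetical join callback; real code would wrap cpg_join here. */
typedef demo_status_t (*demo_join_fn)(void *ctx);

/* Retry while the call reports "try again", up to max_attempts,
 * sleeping delay_sec seconds between attempts. */
static demo_status_t join_with_retry(demo_join_fn join, void *ctx,
                                     int max_attempts, unsigned delay_sec)
{
    demo_status_t res = DEMO_TRY_AGAIN;

    assert(join != NULL && max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        res = join(ctx);
        if (res != DEMO_TRY_AGAIN)
            break;                      /* success or a hard error */
        if (attempt + 1 < max_attempts)
            sleep(delay_sec);           /* back off before retrying */
    }
    return res;
}

/* Test stub: reports "try again" until the counter reaches zero. */
static demo_status_t flaky_join(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return DEMO_TRY_AGAIN;
    }
    return DEMO_OK;
}
```

Capping the number of attempts lets the caller report a persistent failure instead of spinning forever when the daemon never becomes ready.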

Re: [Openais] quorum service question

2009-11-18 Thread Dietmar Maurer
I am trying to answer that myself, > > > That way you use 2 different services - votequorum and CPG. Is there > > > really a defined order between those messages? > > > > There are no quorum messages, but correlating events between cpg and > > quorum is a problem, > > I am thinking about injecting a mes

Re: [Openais] quorum service question

2009-11-18 Thread Dietmar Maurer
> >> The quorum service really works best if you have a common > configuration > >> system so that the values are the same on all nodes. > > > > The question is how to do that. Quorum settings are read from > /etc/corosync/corosync.conf, but there is no way to write that file - > the confdb interfac

Re: [Openais] quorum.h or votequorum.h

2009-11-18 Thread Dietmar Maurer
> >> We hope there will be more in the future to fulfil various needs. > > > > Which is used by pacemaker? > > neither, yet. > these weren't available for openais whitetank, so we have a custom > version that we will "soon" replace with the one from corosync. Great! I guess you will use votequoru

Re: [Openais] quorum.h or votequorum.h

2009-11-18 Thread Dietmar Maurer
> Currently there are three quorum providers: > > votequorum > YKD > testquorum I can find votequorum and testquorum, but where is YKD (what source files)? > testquorum is (as it says) just a test - it can flip the quorum state > based on an objdb value, but the other two provide quorum services

Re: [Openais] quorum service question

2009-11-17 Thread Dietmar Maurer
> > That way you use 2 different services - votequorum and CPG. Is there > > really a defined order between those messages? > > There are no quorum messages, but correlating events between cpg and > quorum is a problem, I am thinking about injecting a message into the CPG when quorum changes. But this

Re: [Openais] quorum service question

2009-11-17 Thread Dietmar Maurer
> >> The quorum service really works best if you have a common > configuration > >> system so that the values are the same on all nodes. > > > > The question is how to do that. Quorum settings are read from > /etc/corosync/corosync.conf, but there is no way to write that file - > the confdb interfac

Re: [Openais] quorum service question

2009-11-17 Thread Dietmar Maurer
> > So what is the suggested way to block some CPG operation when quorum > is lost (how to integrate quorum into a CPG application)? > > > > Use the quorum library, and check the cluster is quorate before sending > messages. You don't need to call quorum_getquorate() every > time, you ca
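One way to read the advice above is to cache the quorum state and consult the cached flag before each send. The sketch below assumes that reading; the quoted reply is cut off, so this may not be what its author went on to suggest. All names (on_quorum_change, send_if_quorate, counting_send) are hypothetical and do not use the real quorum or CPG libraries; in a real application the flag would be updated from whatever quorum notification mechanism is in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical send callback; real code would wrap a CPG multicast. */
typedef int (*send_fn)(const void *msg, size_t len);

/* Cached quorum state, changed only by the notification handler. */
static bool cluster_quorate = false;

/* Hypothetical handler, to be invoked whenever quorum changes. */
static void on_quorum_change(bool quorate)
{
    cluster_quorate = quorate;
}

/* Refuse to send while inquorate; otherwise forward to the sender. */
static int send_if_quorate(send_fn send, const void *msg, size_t len)
{
    assert(send != NULL);
    if (!cluster_quorate)
        return -1;                      /* blocked: no quorum */
    return send(msg, len);
}

/* Test stub that only counts how often it was called. */
static int sent_count = 0;

static int counting_send(const void *msg, size_t len)
{
    (void)msg;
    (void)len;
    sent_count++;
    return 0;
}
```

The cached flag avoids a library round trip per message, at the cost of a short window in which the flag lags behind the actual quorum state.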

Re: [Openais] quorum service question

2009-11-17 Thread Dietmar Maurer
> > But AFAIK this information is local (not copied to other nodes). And > the votequorum seems to add new nodes dynamically. So how can I make > sure that all nodes in the cluster use the same values? I guess it is > important that all nodes start with the same value? > > > > Or is it better to no

[Openais] quorum.h or votequorum.h

2009-11-17 Thread Dietmar Maurer
I can see the difference in the API, but what exactly is the difference? Are both APIs into the votequorum service? - Dietmar

[Openais] cpg_leave hangs

2009-11-17 Thread Dietmar Maurer
Hi all, I just noticed that cpg_leave hangs forever when I stop the corosync daemon. Is that expected behavior? If so, is it safe to just call cpg_finalize - does that free all resources? - Dietmar

Re: [Openais] quorum service question

2009-11-02 Thread Dietmar Maurer
> > So CFG, CPG and VOTEQUORUM work normally when the cluster is > inquorate? But EVS does not? > > > > > Yes, that's how it currently is - I suspect that all openais services > are blocked too. > > It might be the case that some other subsystems need to operate when > there is no quorum - when q

Re: [Openais] quorum service question

2009-11-02 Thread Dietmar Maurer
> > What operations are blocked - I can't find that in the source code? > > > > See in main.c > > static int corosync_sending_allowed() # grep CS_LIB_ALLOW_INQUORATE services/*.[ch] services/cfg.c: .allow_inquorate= CS_LIB_ALLOW_INQUORATE, services/confdb.c: .allow_

Re: [Openais] quorum service question

2009-10-30 Thread Dietmar Maurer
After some testing, I noticed that I can't reproduce the bug anymore. I don't know if this is good or bad ;-) > -Original Message- > From: Fabio Massimo Di Nitto [mailto:fabbi...@fabbione.net] > Sent: Friday, 30 October 2009 06:10 > To: Dietmar Maurer > Cc: Chr

Re: [Openais] partition recovery question

2009-10-29 Thread Dietmar Maurer
And what is the rationale behind this check: "if lowest processor identifier of the old ring in the new ring" it is used in the checkpoint recovery algorithm: http://www.openais.org/doku.php?id=dev:partition_recovery_checkpoint:checkpoint - Dietmar

Re: [Openais] quorum service question

2009-10-29 Thread Dietmar Maurer
> > but with > > > > # ./testvotequorum2 5 > > votequorum_qdisk_getinfo error 12: OK > > qdisk votes 4 > > state0 > > name QDISK > > > > the cluster freezes (corosync does not respond anymore)? > > > > (using corosync-1.1.2 on debian lenny). > > > > > I don't know what would be c

Re: [Openais] partition recovery question

2009-10-29 Thread Dietmar Maurer
> I still recommend using CPG for all use since it receives the most > testing. Ah, OK - many thanks for your help. - Dietmar

Re: [Openais] partition recovery question

2009-10-29 Thread Dietmar Maurer
> > And what is the difference between EVS and CPG? > > > > EVS is a passthrough without process id membership. cpg provides > process group membership meaning that both node id and process id > information is given for membership information. So if I plan to only run one instance per node its b

Re: [Openais] quorum service question

2009-10-29 Thread Dietmar Maurer
Can anyone else reproduce that error? Any suggestions on how to debug such a thing? > -Original Message- > From: Christine Caulfield [mailto:ccaul...@redhat.com] > Sent: Thursday, 29 October 2009 15:31 > To: Dietmar Maurer > Cc: openais@lists.linux-foundation.org > Sub

Re: [Openais] quorum service question

2009-10-29 Thread Dietmar Maurer
Btw, when I start # testvotequorum2 votequorum_qdisk_getinfo error 12: OK votequorum_qdisk_getinfo error 12: OK but with # ./testvotequorum2 5 votequorum_qdisk_getinfo error 12: OK qdisk votes 4 state0 name QDISK the cluster freezes (corosync does not respond anymore)? (usin

Re: [Openais] quorum service question

2009-10-29 Thread Dietmar Maurer
> Actually it does block some API operations when there is no quorum - or > it should. What operations are blocked - I can't find that in the source code? - Dietmar

[Openais] quorum service question

2009-10-29 Thread Dietmar Maurer
Hi all, I am just testing the quorum service and have some basic questions. First, I use the following configuration in corosync.conf: quorum { provider: corosync_votequorum expected_votes: 3 votes: 1 } But AFAIK this information is local (not copied to other nodes). And

[Openais] partition recovery question

2009-10-29 Thread Dietmar Maurer
Hi all, The evs_overview man page says: >Virtual Synchrony allows the current configuration to be used to make >decisions in partitions >and merges. Since the configuration is sent in the stream of messages to the >application, the >application can alter its behavior based upon the configuratio

Re: [Openais] cherrypicking into flatiron discussion - post 1.1.0

2009-09-24 Thread Dietmar Maurer
> This is how I think it could work, based on the idea that developers > use > trunk and integration use flatiron. > > - bug is reported/fixed against trunk: > * verify right away if the bug affects Flatiron or not. > * If it does, cherry pick the fix in Flatiron for integration people > t

Re: [Openais] corosync trunk request user to generate entropy

2009-08-17 Thread Dietmar Maurer
> -Original Message- > From: openais-boun...@lists.linux-foundation.org [mailto:openais- > boun...@lists.linux-foundation.org] On Behalf Of Dietmar Maurer > Sent: Tuesday, 18 August 2009 07:54 > To: sd...@redhat.com; open...@lists.osdl.org > Subject: Re: [Opena

Re: [Openais] corosync trunk request user to generate entropy

2009-08-17 Thread Dietmar Maurer
We need to install corosync without user interaction, so that solution is even worse than the previous behavior. Can't we use /dev/urandom instead (AFAIK even ssh uses that to generate private keys)? - Dietmar > -Original Message- > From: openais-boun...@lists.linux-foundation.org [mail

Re: [Openais] OpenAIS uses a LOT of CPU

2009-07-16 Thread Dietmar Maurer
> When I start openais+pacemaker on my virtual machines (vbox) it uses a > lot of > CPU power. top shows 30 - 50% of CPU usage only for ais. > > The virtual machines are not the slowest (~4000 bogomips). Please > could > anybody explain the problem to me? When I use heartbeat as the cluster > stack

Re: [Openais] Partition Recovery and CPG

2009-04-21 Thread Dietmar Maurer
> As I mentioned in another mail, some apps may allow re-syncing without > a complete node reset or rejoining the cpg. Other apps (like dlm and gfs > that I work on) require a full node reset to re-sync, because they export > their state through the dlm and gfs and up into random But exporting s

Re: [Openais] cpg log example tarball

2009-04-20 Thread Dietmar Maurer
> To replicate state using virtual synchrony / cpg you must have agreed > ordering > of messages and configuration changes, which you don't get by using > separate > cpgs for each. It uses only one cpg to send (application) messages, so you still have the VS guarantee. The other cpg is only used t

Re: [Openais] Partition Recovery and CPG

2009-04-20 Thread Dietmar Maurer
> The key point is that two cpg members (A and B) with different event > histories > will have inconsistent state, assuming that the state is derived from > the > history. So, after they are partitioned and merged, A needs to replace > its > own state/history with B's or v.v. Right. That is the i

Re: [Openais] howto distribute data accross all nodes?

2009-04-20 Thread Dietmar Maurer
> > > And even worse, a section contains data from different partitions > (old > > > data mixed with new data)? And there is no notification that such > things > > > happen? > > That ckpt behavior is nonsensical for most real applications I'd wager. > I'm going to have to go check whether my apps ar

Re: [Openais] Partition Recovery and CPG

2009-04-20 Thread Dietmar Maurer
> Merging/reconciling a partition is *fundamentally* not possible for any > application that requires virtual synchrony. You cannot rewrite > history, But I can re-sync the state. > and VS is all about event history. It has nothing to do with mechanism or > design, it has everything to do with

Re: [Openais] Partition Recovery and CPG

2009-04-20 Thread Dietmar Maurer
> If your application requires virtual synchrony, then the only way to > handle > partitions is to terminate one part (and force it to restart with no > history) > to preserve a common history among the remaining nodes. Merging > doesn't make > sense because the VS history has already diverged. A

Re: [Openais] cpg log example tarball

2009-04-20 Thread Dietmar Maurer
> Here is an example CPG application which shows how to replicate state > between multiple processes on multiple machines. It is just a start, > but have a look. If you have interest in CPG, it may help you > understand the basics of the API. Interesting, especially the use of 2 process groups.

Re: [Openais] Partition Recovery and CPG

2009-04-18 Thread Dietmar Maurer
> It took us a long time to develop some patterns around application > programming over VS. But there is a good book out there that can help: > "Building Secure and Reliable Network Applications" by Ken Birman > (whose Cornell research team invented VS) Thanks, will take a look at that book. - Di

Re: [Openais] howto distribute data accross all nodes?

2009-04-18 Thread Dietmar Maurer
> The SA Forum doesn't consider at all how to handle partitions in a > network or at least not very suitably (up to designer of SA Forum > services). They assume that applications will be using the AMF, and > rely on the AMF functionality to reboot partitioned nodes (fencing) so > this condition d

Re: [Openais] howto distribute data accross all nodes?

2009-04-18 Thread Dietmar Maurer
> > Is there any guarantee that all sections inside a recovered > checkpoint > > are from the same cluster partition (I can't see such restriction in > the > > algorithm)? > > > > no > > The algorithm will merge sections created in both partitions into the > single checkpoint. At least the SA Fo

Re: [Openais] howto distribute data accross all nodes?

2009-04-18 Thread Dietmar Maurer
> An older version of the algorithm is described here: > http://www.openais.org/doku.php?id=dev:partition_recovery_checkpoint:ch > eckpoint > > It has been updated to deal with some race conditions, but the document > is pretty close. > > As you can see, designing the recovery state machine is co

Re: [Openais] howto distribute data accross all nodes?

2009-04-18 Thread Dietmar Maurer
> An older version of the algorithm is described here: > http://www.openais.org/doku.php?id=dev:partition_recovery_checkpoint:ch > eckpoint > > It has been updated to deal with some race conditions, but the document > is pretty close. > > As you can see, designing the recovery state machine is co

Re: [Openais] Partition Recovery and CPG

2009-04-18 Thread Dietmar Maurer
> In my opinion to merge states reliably, the designer of the application > has to do the hard work of designing a synchronization protocol that > happens each time a configuration change occurs. Yes. (That is why I am confused about the checkpoint service - how can that service merge states relia

Re: [Openais] Partition Recovery and CPG

2009-04-18 Thread Dietmar Maurer
> Yes, forcing the losers to reset and start from scratch is a must, but > we end up doing that a layer above corosync. That means the losers often > reappear again through corosync/cpg prior to being forced out. Are you talking about an implementation bug, or a 'babbling idiot' which simply joi

Re: [Openais] howto distribute data accross all nodes?

2009-04-18 Thread Dietmar Maurer
> > > > like a 'merge' function? Seems the algorithm for checkpoint > recovery > > > > always uses the state from the node with the lowest processor id? > > > > > > > Yes that is right. > > > > So if I have the following cluster: > > > > Part1: node2 node3 node4 > > Part2: node1 > > > > Let's assume

Re: [Openais] Partition Recovery and CPG

2009-04-18 Thread Dietmar Maurer
> Once a partition exists, a merge back together doesn't change the fact > that > the disagreement has already occured (at partition time) and that > disagreement > can only be resolved (to maintain VS) by killing nodes that don't agree > with > one version of the history. Sure, but the whole prob

Re: [Openais] howto distribute data accross all nodes?

2009-04-17 Thread Dietmar Maurer
> > like a 'merge' function? Seems the algorithm for checkpoint recovery > > always uses the state from the node with the lowest processor id? > > > Yes that is right. So if I have the following cluster: Part1: node2 node3 node4 Part2: node1 Let's assume Part1 is running for some time and has gath

[Openais] Partition Recovery and CPG

2009-04-16 Thread Dietmar Maurer
Let's assume the cluster is partitioned: Part1: node1 node2 node3 Part2: node4 node5 After recovery, what join/leave messages do I receive with a CPG: A.) JOIN: node4 node5 or B.) JOIN: node1 node2 node3 or anything else? - Dietmar

Re: [Openais] howto distribute data accross all nodes?

2009-04-16 Thread Dietmar Maurer
> Check out: > http://www.openais.org/doku.php?id=dev:paritition_recovery > > Instead of using the token callback method, you could write your own > methodology for executing the state machine. Ah, OK - I think that is what I already do. What I miss is something like a 'merge' function? Seems th

Re: [Openais] howto distribute data accross all nodes?

2009-04-15 Thread Dietmar Maurer
> You might try taking a look at exec/sync.c > > it is a synchronization engine. Basically it takes configuration > changes into account to call sync_init, sync_process, sync_abort, or > sync_activate. These 4 states then activate the new data model. > This > could easily be done as an addon a

Re: [Openais] howto distribute data accross all nodes?

2009-04-14 Thread Dietmar Maurer
> Cool idea if it wasn't totally integrated with CPG but instead some
> external service which people could use in addition to CPG.

That's the plan. My current problem is the state merge function. There should be a standard way to merge state (when the user does not want to provide his own merge

Re: [Openais] howto distribute data accross all nodes?

2009-04-14 Thread Dietmar Maurer
> You might try taking a look at exec/sync.c
>
> it is a synchronization engine. Basically it takes configuration
> changes into account to call sync_init, sync_process, sync_abort, or
> sync_activate. These 4 states then activate the new data model. This
> could easily be done as an addon a

Re: [Openais] howto distribute data accross all nodes?

2009-04-14 Thread Dietmar Maurer
> > > Yes. I'm not sure if a generic service for this would be used much or
> > > not...
> > > maybe.
> >
> > Btw, how large is an average checkpoint in dlm? Just wonder how much
> > data needs to be transferred.
>
> I've never measured, but it's a trivially small amount of data. Around 32
>

Re: [Openais] howto distribute data accross all nodes?

2009-04-14 Thread Dietmar Maurer
> > When a new node joins, CPG immediately changes mode to
> > DFSM_MODE_SYNC. Then all members send their state.
> >
> > When a node has received the states of all members, it computes the new
> > state by merging all received states (dfsm_state_merge_fn), and
> > finally switches mode to DFS
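
The join-time synchronization quoted here (enter sync mode, have every member send its state, merge once all states are in, return to work mode) can be sketched as a small state machine. Only the names DFSM_MODE_SYNC, DFSM_MODE_WORK and dfsm_state_merge_fn come from the thread; the `Member` class, `union_merge`, and `run_sync` are hypothetical, and message delivery is simulated in-process rather than going through CPG.

```python
MODE_WORK = "DFSM_MODE_WORK"
MODE_SYNC = "DFSM_MODE_SYNC"


class Member:
    def __init__(self, node_id, state, merge_fn):
        self.node_id = node_id
        self.state = state
        self.merge_fn = merge_fn  # plays the role of dfsm_state_merge_fn
        self.mode = MODE_WORK
        self.expected = set()
        self.received = {}

    def on_membership_change(self, member_ids):
        """A node joined: enter sync mode and wait for everyone's state."""
        self.mode = MODE_SYNC
        self.expected = set(member_ids)
        self.received = {}

    def on_state_message(self, sender_id, state):
        """Collect one member's state; merge once all have arrived."""
        if self.mode != MODE_SYNC or sender_id not in self.expected:
            return
        self.received[sender_id] = state
        if set(self.received) == self.expected:
            ordered = [self.received[i] for i in sorted(self.received)]
            self.state = self.merge_fn(ordered)
            self.mode = MODE_WORK


def union_merge(states):
    """Example merge function (an assumption): union of set-valued states."""
    merged = set()
    for s in states:
        merged |= s
    return merged


def run_sync(initial_states):
    """Simulate one sync round for node id -> initial state."""
    members = {i: Member(i, s, union_merge) for i, s in initial_states.items()}
    ids = list(members)
    for m in members.values():
        m.on_membership_change(ids)
    # Snapshot states first so every receiver sees the pre-merge value.
    snapshots = {i: set(m.state) for i, m in members.items()}
    for sender in ids:
        for receiver in members.values():
            receiver.on_state_message(sender, snapshots[sender])
    return members


members = run_sync({1: {"a"}, 2: {"b"}, 3: set()})
```

Because every member merges the same set of states in the same (sorted) order, all members end the round with identical state, provided the merge function is deterministic.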

Re: [Openais] howto distribute data accross all nodes?

2009-04-14 Thread Dietmar Maurer
states (dfsm_state_merge_fn), and finally switches mode to DFSM_MODE_WORK.

Does that make sense?

- Dietmar

> On Thu, Apr 09, 2009 at 09:00:08PM +0200, Dietmar Maurer wrote:
> > > If new, normal read/write messages to the replicated state continue while
> > > the new

[Openais] message order

2009-04-14 Thread Dietmar Maurer
If I send 2 messages with CPG_TYPE_AGREED:

node1: cpg_mcast_joined (A)
node1: cpg_mcast_joined (B)

Is there any guarantee that they arrive in the same order (A before B)?

- Dietmar

Re: [Openais] detecting cpg joiners

2009-04-09 Thread Dietmar Maurer
> guarantees you seek, and if it doesn't, it is defective. The only
> exception might be if the new process reuses the same PID since the
> pid/nodeid/group are the uniquifiers and if pid is the same, there is no
> way to detect the new process (and remove the old one).

PID reuse happens more ofte

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Dietmar Maurer
> > Ah, that probably works. But can lead to very high memory usage if traffic
> > is high.
>
> If that's a problem you could block normal activity during the sync
> period.

Wow, that 'virtual synchrony' sounds nice at first, but gets incredibly complex soon ;-)

> > > Is somebody really using tha

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Dietmar Maurer
> 1. Have an old cpg member (e.g. the one with the lowest nodeid) send messages
> containing the state to the new node after it's joined. These "sync messages"
> are separate from the messages used to read/write the replicated state during
> normal operation.

This is not bulletproof. State

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Dietmar Maurer
> > > need for locks. An example of why not is creation of a resource called
> > > "datasetA".
> > >
> > > 3 nodes:
> > > node A sends "create datasetA"
> > > node B sends "create datasetA"
> > > node C sends "create datasetA"
> > >
> > > Only one of those nodes create dataset will arrive firs
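
The "create datasetA" race in this quote can be illustrated with a toy replica, assuming agreed ordering delivers the same message sequence to every node. `apply_in_order` and the tuple layout are hypothetical, and the delivery order shown is arbitrary; the point is only that whichever create is delivered first wins identically on every node, without a lock.

```python
def apply_in_order(messages):
    """Apply totally ordered (sender, command, name) messages to a replica.

    The first "create" delivered for a name wins; later ones are ignored.
    """
    owners = {}
    for sender, command, name in messages:
        if command == "create" and name not in owners:
            owners[name] = sender
    return owners


# One possible agreed delivery order; every node sees this same sequence.
delivery_order = [
    ("B", "create", "datasetA"),
    ("A", "create", "datasetA"),
    ("C", "create", "datasetA"),
]
replicas = {node: apply_in_order(delivery_order) for node in "ABC"}
```

Each replica applies the same sequence independently and reaches the same owner for datasetA.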

Re: [Openais] howto distribute data accross all nodes?

2009-04-09 Thread Dietmar Maurer
> What I recommend here is to place your local node id in the message
> contents (retrieved via cpg_local_get) and then compare that nodeid to
> incoming messages

Why do you include the local node id in the message? I can compare the local node id with the sending node id without that, for examp
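
The alternative raised in this reply (compare the local node id with the sender's node id, without embedding it in the payload) might look like the sketch below. It assumes the delivery callback is handed the sender's node id alongside the payload; `make_deliver_callback` and its arguments are hypothetical, and no real CPG call is made.

```python
def make_deliver_callback(local_nodeid, own, others):
    """Build a delivery callback that separates our own messages from others'.

    local_nodeid stands in for the value cpg_local_get would return.
    """
    def deliver(sender_nodeid, payload):
        if sender_nodeid == local_nodeid:
            own.append(payload)      # our own multicast coming back to us
        else:
            others.append(payload)   # sent by some other node
    return deliver


own, others = [], []
deliver = make_deliver_callback(1, own, others)
deliver(1, "x")
deliver(2, "y")
```

Note that the earlier PID-reuse discussion suggests a node id alone may not distinguish two processes on the same node, so a real check would likely need the pid as well.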
