Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Christian Balzer
On Tue, 31 Jul 2012 16:54:48 +0100 James Gibbon wrote: > On Tue, 31 Jul 2012 17:21:14 +0200 > Felix Frank wrote: > > > On 07/31/2012 05:19 PM, James Gibbon wrote: > > > Is that right? > > > > Yes. > > > > I'd still consider losing the split brain resolution option. > > > > Many thanks, Felix

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread James Gibbon
On Tue, 31 Jul 2012 17:21:14 +0200 Felix Frank wrote: > On 07/31/2012 05:19 PM, James Gibbon wrote: > > Is that right? > > Yes. > > I'd still consider losing the split brain resolution option. > Many thanks, Felix. This has all been a useful learning experience as well as a problem resolution

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Jake Smith
Felix - thank you for correcting that! Jake Original Message From: Felix Frank Sent: Tue, 31/07/2012 09:25 AM To: Jake Smith CC: James Gibbon; drbd-user@lists.linbit.com Subject: Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
On 07/31/2012 05:19 PM, James Gibbon wrote: > Is that right? Yes. I'd still consider losing the split brain resolution option. ___ drbd-user mailing list drbd-user@lists.linbit.com http://lists.linbit.com/mailman/listinfo/drbd-user

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread James Gibbon
Hello, On Tue, 31 Jul 2012 17:04:02 +0200 Felix Frank wrote: > I stand by this assessment. You got "lucky" insofar that both > nodes were primary when they saw each other again. There is no > autorecovery from that. If for some freak reason your "good" > node would have been in a demoted state

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
Hi, On 07/31/2012 04:23 PM, James Gibbon wrote: > > Thanks again for your reply, Felix. I'm still a bit confused though, understandably so I guess ;-) > I'm sorry to say. In your earlier reply to Jake, you said: > > >> > >>> > > after-sb-1pri discard-secondary; >> > >> > thi

Re: [DRBD-user] how can i reduce the time doing initial full sync

2012-07-31 Thread Lars Ellenberg
On Tue, Jul 31, 2012 at 01:06:30PM +0200, Sebastian Riemer wrote: > On 31.07.2012 12:14, David Coulson wrote: > > Does it matter? Your data is accessible from both nodes while it syncs. > > > > I suppose you have a risk of the primary failing until it has a full > > copy on the other node, but sinc

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread James Gibbon
On Tue, 31 Jul 2012 15:27:52 +0200 Felix Frank wrote: > Hi, > > On 07/31/2012 03:23 PM, James Gibbon wrote: > > OK - thanks again. Since the master is already up and running, > > I'm hoping that the broken secondary box isn't going to get > > promoted - is that a reasonable assumption? > > it's

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
Hi, On 07/31/2012 03:23 PM, James Gibbon wrote: > OK - thanks again. Since the master is already up and running, > I'm hoping that the broken secondary box isn't going to get > promoted - is that a reasonable assumption? it's already promoted. Hence split brain. It's not an issue though. It's ea

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
Hi, On 07/31/2012 03:07 PM, Jake Smith wrote: > ## sb-1pri says to discard data on the secondary and resync if there has only > been the one primary since the last uptodate status uhm, no. It's a policy for "what to do if there is only 1 primary at the time of connecting". If there has been only

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread James Gibbon
On Tue, 31 Jul 2012 15:03:22 +0200 Felix Frank wrote: > On 07/31/2012 02:49 PM, James Gibbon wrote: > > Can someone tell me what the "become-primary-on-both" part > > means? I'm fairly anxious to ensure that the second node > > doesn't attempt to become primary when it's restarted as its > > data

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Jake Smith
- Original Message - > From: "James Gibbon" > To: "James Gibbon" > Cc: drbd-user@lists.linbit.com > Sent: Tuesday, July 31, 2012 8:49:52 AM > Subject: Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating > > > Can someone tell me what the "become-primary-on-both" part me

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread James Gibbon
Can someone tell me what the "become-primary-on-both" part means? I'm fairly anxious to ensure that the second node doesn't attempt to become primary when it's restarted, as its data will be out of date. startup { wfc-timeout 15; degr-wfc-timeout 60;

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
On 07/31/2012 02:49 PM, James Gibbon wrote: > Can someone tell me what the "become-primary-on-both" part means? I'm > fairly anxious to ensure that the second node doesn't attempt to become > primary when it's restarted, as its data will be out of date. It does what you assume. The initscript will

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread James Gibbon
Hi Felix, On Tue, 31 Jul 2012 13:24:38 +0200 Felix Frank wrote: > > I see. Quite unusual I'd say, to have two drbd nodes that each > use a NAS as backing device. But it looks sound, judging from > the config. Thanks for that. > Thanks a lot for taking a look! I take it the strategy we've alre

[DRBD-user] Fwd: Re: how can i reduce the time doing initial full sync

2012-07-31 Thread Felix Frank
Original Message Subject: Re: [DRBD-user] how can i reduce the time doing initial full sync Date: Tue, 31 Jul 2012 11:37:22 +0100 From: Philip Gaw To: Felix Frank I usually bond 4+ gigabit links together with balance-rr to lower sync time (assuming you dont already have 10G bet

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
Hi, On 07/31/2012 01:10 PM, James Gibbon wrote: > OK. The VM images for the cluster are stored in two Cisco NAS > units. Both the physical servers are connected to these using > iSCSI, through two Cisco gigabit switches. So for example a > particular VM disk image might be visible, from both serv

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread James Gibbon
Hi Felix, On Tue, 31 Jul 2012 12:12:30 +0200 Felix Frank wrote: > > > Not sure I even can copy the second node's data - it's not > > communicating with the storage at all. But I only need the > > first node's data. > > This too makes me edgy. The "storage"? In most DRBD setups, the > data res

Re: [DRBD-user] how can i reduce the time doing initial full sync

2012-07-31 Thread Sebastian Riemer
On 31.07.2012 12:14, David Coulson wrote: > Does it matter? Your data is accessible from both nodes while it syncs. > > I suppose you have a risk of the primary failing until it has a full > copy on the other node, but since it is a one time, on build, thing it > isn't a big deal. > > Short answer

Re: [DRBD-user] how can i reduce the time doing initial full sync

2012-07-31 Thread David Coulson
Does it matter? Your data is accessible from both nodes while it syncs. I suppose you have a risk of the primary failing until it has a full copy on the other node, but since it is a one time, on build, thing it isn't a big deal. Short answer is bigger pipe, faster drives, plenty of cpu. And

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
On 07/31/2012 12:07 PM, JAMES GIBBON wrote: > But for sure, there aren't any running now on the broken node. The > first node is running all of the VMs, quite smoothly fortunately. Good, so it's probably safe to trash the other node's data. > Not sure I even can copy the second node's data - it's

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread JAMES GIBBON
On 31 July 2012 10:33, Felix Frank wrote: > What's bugging me a bit is the relatively high number of "out > of sync" blocks as reported by your second node. Are you > absolutely certain that no VM has been started on this node > since you've lost connectivity? For that matter, how certain > are y

Re: [DRBD-user] how can i reduce the time doing initial full sync

2012-07-31 Thread Felix Frank
Hi, On 07/31/2012 12:02 PM, Mia Lueng wrote: > Hi All: > I have two storage systems with 10T capacity. When I do the initial > full sync, it takes a whole night. How can I reduce the time of the > initial full sync? One night for 10TB doesn't sound bad at all to me, to be honest ;-) Anywhere,
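As a rough illustration of why one night for 10 TB is not unusually slow, the sketch below estimates a full-sync duration from the volume size and a sustained throughput. The helper name sync_time and the throughput figures (about 110 MB/s for a single gigabit link, about 1000 MB/s for a 10G link) are assumptions chosen for illustration, not measurements from this thread. The real rate is also bounded by disk speed and by the configured syncer rate.

```shell
#!/bin/sh
# Hypothetical helper (not part of DRBD): estimate full-sync duration.
#   $1 = volume size in GB (decimal)
#   $2 = sustained throughput in MB/s (assumed, not measured)
sync_time() {
    secs=$(( $1 * 1000 / $2 ))    # GB -> MB, then divide by MB/s
    printf '%dh%02dm\n' $(( secs / 3600 )) $(( secs % 3600 / 60 ))
}

sync_time 10000 110     # 10 TB over roughly one gigabit link
sync_time 10000 1000    # 10 TB over roughly one 10G link
```

The arithmetic is integer-only, so the result is a lower bound rounded down to the minute; it is meant for order-of-magnitude planning only.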

[DRBD-user] how can i reduce the time doing initial full sync

2012-07-31 Thread Mia Lueng
Hi All: I have two storage systems with 10T capacity. When I do the initial full sync, it takes a whole night. How can I reduce the time of the initial full sync? Thanks

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
Hi, On 07/31/2012 11:27 AM, JAMES GIBBON wrote: > Ah, thanks Felix. > > So - I think I need to do, on the broken node: > > # drbdadm disconnect all > # drbdadm secondary all > # drbdadm connect --discard-my-data all without checking back with the Guide, this *looks* sound to me. Disconnect
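The sequence discussed above can be sketched as a small shell function to run on the node whose data is to be thrown away. This is a hedged sketch, not a verified procedure for this cluster: the function name and the DRBDADM override variable are hypothetical, and the argument order for the final step follows my reading of the DRBD 8.3 guide (`drbdadm -- --discard-my-data connect`), whereas the 8.4 guide writes it as `drbdadm connect --discard-my-data`. Check the guide for the installed version before running anything like this.

```shell
#!/bin/sh
# DRBDADM can be overridden (e.g. DRBDADM=echo) for a dry run.
# The variable is an assumption of this sketch, not a DRBD convention.
: "${DRBDADM:=drbdadm}"

# Run ONLY on the split-brain victim: its local changes are discarded.
#   $1 = resource name, or "all"
discard_local_and_reconnect() {
    res="$1"
    $DRBDADM disconnect "$res" &&
    $DRBDADM secondary "$res" &&
    $DRBDADM -- --discard-my-data connect "$res"
}

# Dry run:  DRBDADM=echo; discard_local_and_reconnect all
```

If the surviving node is also in StandAlone state, it additionally needs a plain `drbdadm connect` for the same resource before the resync can start.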

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread JAMES GIBBON
On Tue, 31 Jul 2012 10:57:47 +0200 Dirk Bonenkamp - ProActive wrote: > > Since the problem is on the slave, I wouldn't worry too much. > Fix the IP on the slave and issue the 'drbdadm connect all' on > the slave. If it has been the slave all the time, it should > connect and synchronise without co

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
On 07/31/2012 10:57 AM, Dirk Bonenkamp - ProActive wrote: > split brain that >> needs resolving. See >> http://www.drbd.org/users-guide/s-resolve-split-brain.html and It's all laid out in the above link. You *are* in split brain. This probably won't mess with performance. Be sure your syncer rate

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Dirk Bonenkamp - ProActive
On 31-7-2012 10:51, JAMES GIBBON wrote: > On Tue, 31 Jul 2012 09:32:48 +0200 > Felix Frank wrote: > > > > > Judging from your log excerpt, there might be a connectivity > > issue, but this could very well be a pure split brain that > > needs resolving. See > > http://www

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread JAMES GIBBON
On Tue, 31 Jul 2012 09:32:48 +0200 Felix Frank wrote: > > Judging from your log excerpt, there might be a connectivity > issue, but this could very well be a pure split brain that > needs resolving. See > http://www.drbd.org/users-guide/s-resolve-split-brain.html and > note that you will likely l

[DRBD-user] local disk flush failed with status -5

2012-07-31 Thread BSD test
Hi! 2 x Ubuntu 8.04 drbd 8.0.11 After several reboots of each node because of power failures there was a split brain and the connection state on both nodes became StandAlone. I performed the steps from here: http://www.drbd.org/users-guide-8.3/s-resolve-split-brain.html . Everything worked fi

Re: [DRBD-user] DRBD - one half of Proxmox cluster miscommunicating

2012-07-31 Thread Felix Frank
Hi, On 07/30/2012 10:06 PM, JAMES GIBBON wrote: > version: 8.3.7 (api:88/proto:86-91) > srcversion: EE47D8BF18AC166BE219757 > 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r > ns:0 nr:0 dw:27568823 dr:156762105 al:309656 bm:309639 lo:0 pe:0 > ua:0 ap:0 ep:1 wo:b oos:1018463
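For readers unfamiliar with the status line quoted above, the sketch below pulls individual fields (cs = connection state, ro = roles, ds = disk states) out of such a line. The helper name drbd_field is hypothetical, and the sample line is copied from the post as it appears here, including its truncated flags; on a live node the input would come from /proc/drbd instead.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "name:value" field
# from a DRBD 8.x status line.
#   $1 = field name (cs, ro, ds), $2 = status line
drbd_field() {
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1://p"
}

# Sample line as quoted in the post (flags truncated in the archive).
line='0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r'

drbd_field cs "$line"   # connection state
drbd_field ro "$line"   # local/peer role
drbd_field ds "$line"   # local/peer disk state
```

A connection state other than Connected (here WFConnection, i.e. waiting for the peer) together with an Unknown peer role is what a node reports when it cannot see, or refuses to talk to, its partner.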