Re: [Ocfs2-users] servers blocked on ocfs2

2010-12-10 Thread Joel Becker
On Fri, Dec 10, 2010 at 08:42:19AM +0100, frank wrote:
> Hi Joel, thanks for your answer.
> Usually when the cable is unplugged, link-related messages appear in the
> logs, and there are none. I also suspected an OpenVZ issue, but I can't
> find any evidence of that.

Yeah, I'm just wondering what could make traffic not pass over
the link.

> There is a dedicated link, as I mentioned, used for OCFS2 and
> heartbeat/Pacemaker; maybe the latter hampered the interface.

I don't see how either one would have hampered the interface.
If no other traffic (iSCSI, http, whatever) is over that interface,
there just shouldn't be that much.

> Anyway, if there was a cut in the heartbeat or something similar, one of
> the nodes should have fenced itself, shouldn't it? Why did the nodes
> stall? Can we avoid that?

If both nodes saw the network go down, but the disk heartbeat
was still working, the higher-numbered node should have fenced.  Was there
no fencing?  Did both nodes simply hang?  How were they hung?  All
operations, or just ocfs2 operations?
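
For illustration only, the tie-break described above can be sketched as
follows.  This is a simplified model and not the actual o2cb code: the
function name and its arguments are made up for the example, and it
assumes a node keeps running when it can reach a majority of the nodes
still heartbeating on disk, with an even split resolved in favour of the
side holding the lowest-numbered node.

```python
def should_self_fence(my_node, heartbeating, reachable):
    """Return True if my_node should fence itself (hypothetical helper).

    heartbeating: node numbers still seen on the disk heartbeat.
    reachable:    node numbers my_node can still reach over the network.
    """
    alive = set(heartbeating) | {my_node}
    connected = (set(reachable) & alive) | {my_node}

    if 2 * len(connected) > len(alive):
        return False  # we can see a majority: keep running
    if 2 * len(connected) < len(alive):
        return True   # we are in a minority: fence
    # Even split: the side with the lowest-numbered node survives.
    return min(alive) not in connected


# Two nodes, network down, disk heartbeat still working:
print(should_self_fence(0, {0, 1}, {0}))  # node 0's decision
print(should_self_fence(1, {0, 1}, {1}))  # node 1's decision
```

In the two-node case this model has node 1 fence while node 0 waits for
it to do so.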

Joel

-- 

"War doesn't determine who's right; war determines who's left."

Joel Becker
Senior Development Manager
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127

___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] servers blocked on ocfs2

2010-12-10 Thread Joel Becker
On Fri, Dec 10, 2010 at 11:38:04AM -0800, Joel Becker wrote:
> On Fri, Dec 10, 2010 at 08:42:19AM +0100, frank wrote:
> > Anyway, if there was a cut in the heartbeat or something similar, one of
> > the nodes should have fenced itself, shouldn't it? Why did the nodes
> > stall? Can we avoid that?
> 
>   If both nodes saw the network go down, but the disk heartbeat
> was still working, the higher-numbered node should have fenced.  Was there
> no fencing?  Did both nodes simply hang?  How were they hung?  All
> operations, or just ocfs2 operations?

Oh, I see.  While node 0 was waiting for node 1 to kill itself,
node 1 managed to reconnect.  The invalid lock stuff was weird, though.
After this, did all operations return to normal, or were many operations
permanently frozen?

Joel

-- 

"Sometimes one pays most for the things one gets for nothing."
- Albert Einstein

Joel Becker
Senior Development Manager
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127


Re: [Ocfs2-users] Reservation conflicts

2010-12-10 Thread brad hancock
My setup has the SCSI controller set to Physical so the guests can be on
different hosts, but I do not have the disk set up as Independent. I am going
to change that setting in VMware and see if it makes a difference.

On Thu, Dec 9, 2010 at 5:29 PM, Ulf Zimmermann wrote:
>
> > -----Original Message-----
> > From: ocfs2-users-boun...@oss.oracle.com [mailto:ocfs2-users-
> > boun...@oss.oracle.com] On Behalf Of Joel Becker
> > Sent: Thursday, December 09, 2010 2:54 PM
> > To: brad hancock
> > Cc: ocfs2-users@oss.oracle.com
> > Subject: Re: [Ocfs2-users] Reservation conflicts
> >
> > On Thu, Dec 09, 2010 at 04:45:25PM -0600, brad hancock wrote:
> > > Yeah, both guests have the same hard drive attached, with the virtual
> > > SCSI controller configured as Physical, which sets a policy allowing
> > > the virtual disk to be used simultaneously by multiple virtual machines.
> > >
> > > as /dev/sdb1
> >
> >   It sure seems like VMware is caching some data somewhere.
> > That's my best guess.  These are on the same host, right?
> >
> > Joel
>
> I have configured:
>
> SCSI Controller 1 Virtual (Virtual disks can be shared between any virtual
> machines on the same server.)
> Disks are configured as Independent.
>
> That works for my test cluster using OCFS inside of VMware.
>
>

Re: [Ocfs2-users] servers blocked on ocfs2

2010-12-10 Thread Thomas.Zimolong
Hi!

> First of all, thanks for the answer, but are you saying that it is better
> to use a switch than a direct cable between nodes?
> I thought that using a switch adds an extra point of failure and also
> uses up a couple of switch ports unnecessarily.
> What is the problem with using crossover cables?

One problem with using a crossover cable is that when one node goes down,
the other node loses link on the corresponding interface as well.

"No link" usually makes the node "think" it has a problem itself (while in
fact only the other node has a problem). This may lead to self-fencing.

Using a switch avoids this situation, since a switch will still deliver link
to the "good" node, even when the other node is down.

In my opinion it's not "unnecessary" to dedicate some switch ports to this.
And if you think consistently about high availability (and scalability), you
should consider making your network environment highly available (i.e.
redundant) anyway, which means that there would be no extra point of failure.
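
To see what the kernel currently reports for link on the interconnect,
one option on Linux is to read /sys/class/net/<iface>/carrier. A minimal
sketch, assuming Python is available on the node; the helper names are
made up for this example, and "eth1" is only a placeholder for whatever
interface your cluster link uses:

```python
from pathlib import Path


def parse_carrier(text):
    """Interpret the contents of a sysfs 'carrier' file: '1' means link up."""
    return text.strip() == "1"


def has_link(iface, sysfs_root="/sys/class/net"):
    """Return True if the kernel reports carrier on iface (hypothetical helper)."""
    try:
        return parse_carrier(Path(sysfs_root, iface, "carrier").read_text())
    except OSError:
        # The interface does not exist, or it is administratively down
        # (reading 'carrier' fails in that case).
        return False


if __name__ == "__main__":
    # "eth1" is a placeholder interface name.
    print("link up" if has_link("eth1") else "no link")
```

With a crossover cable this should report no link as soon as the peer
goes down; behind a switch it should keep reporting link.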

Greetz,
Zimo
