[Gluster-users] Problem with creating constraints

2011-09-21 Thread Uwe Weiss
Hello List,

 

I have a problem creating a constraint and hope that someone can help me out
with a hint.

 

I have three resources (A, B, C) and two cluster nodes (node0, node1). Resource
A can run only on node0 and resource B can run only on node1.

I managed this with a resource-stickiness of INFINITY for each resource.

 

Resource C can run on both nodes, but on node0 only if resource A is up and
running, or on node1 if resource B is up and running.

Additionally, resource C may only start after resource A or B.

 

How do I have to create the resource order and resource location constraints
so that they meet the requirements of resource C?

 

Thx in advance

Uwe

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Problem with creating constraints

2011-09-21 Thread Uwe Weiss
Oops,

 

sorry, wrong list.

Please disregard it.

 

Uwe

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] glusterfs, pacemaker and Filesystem RA

2011-09-12 Thread Uwe Weiss
Hello List,
due to a mistake my post from yesterday was cut off. That is why I am sending
it again and opening it as a new thread. I hope it will work this time.

 Original posted mail starts here 

Hello Marcel, hello Samuel,

sorry for my late answer, but I was away for two months and could only
continue my tests last week.

First of all, thank you for your patch of the Filesystem RA. It works like a
charm, but I have a few small remarks.

What I found is that the filesystem access check via OCF_CHECK_LEVEL does not
work with glusterfs.

If I use the nvpair OCF_CHECK_LEVEL with a value of 10, I get an error message
with the content:
' 192.168.51.1:/gl_vol0 is not a block device, monitor 10 is noop'

If I use the nvpair OCF_CHECK_LEVEL with a value of 20, I get an error message
with the content:
 ' ERROR: dd said: dd: opening
`/virtfs0/.Filesystem_status/res_glusterfs_sp0:0_0_vmhost1': Invalid
argument'
After that the resource tries to restart permanently.

Unfortunately I am not familiar enough with scripting to fix this myself and
contribute a patch.
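
For reference, this is roughly how OCF_CHECK_LEVEL is attached to the monitor
operation in my setup (a minimal sketch; the resource name, interval and
timeout are placeholders, the device/directory values are the ones from the
errors above):

# deep monitoring of the glusterfs mount via the Filesystem RA
crm configure primitive res_glusterfs ocf:heartbeat:Filesystem \
    params device="192.168.51.1:/gl_vol0" directory="/virtfs0" fstype="glusterfs" \
    op monitor interval="20s" timeout="60s" OCF_CHECK_LEVEL="10"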

Another item I would like to discuss is a bit more general.
As Samuel pointed out, the Filesystem RA (with the native client) only needs
the gluster node it connects to (via the device attribute of the Filesystem
RA) to be up and running at startup of the client.

After that, the native client detects by itself whether a gluster node has
gone away. This is correct so far, but in my setup this could be a SPOF.

I would like to build a cluster of three machines (A, B, C) and start a
Filesystem RA clone on all three cluster nodes. Each of these nodes is a
glusterfs server offering a replicated glusterfs share and also a glusterfs
client which mounts that share (from server A initially). If all three
servers are up, there is no problem.

Even if one of the servers goes down, everything will keep working fine.

But if the node the clients connected to on startup (server A) crashes, and I
afterwards need to reboot one of the remaining servers (B or C), that server
is not able to reconnect as a client because node A is still down.

From my point of view, a solution could be to provide a set of nvpairs (e.g.
glusterhost1=IPorName1ofNodeA:/glustervolume,
glusterhost2=IPorName2ofNodeB:/glustervolume, ...
glusterhostN=IPorNameNofNodeN:/glustervolume).

These pairs could be pinged and/or tested before the Filesystem RA tries to
connect to them. If one of these nodes is not reachable or does not respond
to the connection attempt, the RA could try the next nvpair.
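
A very rough shell sketch of what I have in mind inside the RA (the
glusterhostN parameters are only the ones proposed above, they do not exist
in the current Filesystem RA):

# try the proposed volfile servers one after another until a mount succeeds
for vol in "$OCF_RESKEY_glusterhost1" "$OCF_RESKEY_glusterhost2" "$OCF_RESKEY_glusterhost3"; do
    [ -z "$vol" ] && continue
    host="${vol%%:*}"    # the IP or name in front of ":/glustervolume"
    if ping -c 1 -W 1 "$host" >/dev/null 2>&1; then
        mount -t glusterfs "$vol" "$OCF_RESKEY_directory" && break
    fi
done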

Background:

I would like to build an openais/pacemaker cluster consisting of three nodes.

On each node there should run a gluster server providing a replicated
glusterfs share, a glusterfs client (Filesystem RA clone) connected to this
share, and one or more KVM VMs.

For load reasons the VMs should be distributed across the cluster. If one of
these servers crashes, the affected VMs shall fail over to the remaining
nodes.
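
Just to illustrate the intended layout, a minimal sketch (resource names, the
VM definition and the volume name are placeholders, not my actual
configuration):

# clone the glusterfs mount across all cluster nodes
crm configure primitive p_glusterfs ocf:heartbeat:Filesystem \
    params device="A:/glustervolume" directory="/virtfs" fstype="glusterfs"
crm configure clone cl_glusterfs p_glusterfs
# a VM that must run where the glusterfs mount is active, and only after it
crm configure primitive p_vm1 ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/vm1.xml"
crm configure colocation col_vm1_on_fs inf: p_vm1 cl_glusterfs
crm configure order ord_fs_before_vm1 inf: cl_glusterfs p_vm1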

I hope I was able to explain my concerns, and that you or anybody else can
give me a hint on how to solve my problem.

Thx in advance
Uwe



-----Original Message-----
From: gluster-users-boun...@gluster.org
[mailto:gluster-users-boun...@gluster.org] On behalf of Marcel Pennewiß
Sent: Monday, 18 July 2011 14:00
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] glusterfs and pacemaker

On Monday 18 July 2011 13:26:00 samuel wrote:
 I don't know from which version on, but if you use the native client
 for mounting the volumes, it is only required to have the IP active at
 the moment of mounting. After that, the native client will transparently
 manage node failures.

ACK, that's why we use this shared IP (e.g. for backup purposes via NFS).
AFAIR glusterFS retrieves the volfile (via the shared IP) and then connects to
the nodes.

Marcel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] glusterfs and pacemaker

2011-07-18 Thread Uwe Weiss
Hello Marcel,

thank you very much for the patch. Great Job.

It worked on the first shot. Mounting and migration of the filesystem work. So
far I could not test a hard reset of a cluster node, because a colleague is
currently using the cluster.

I applied the following parameters:

Fstype: glusterfs
Mountdir: /virtfs
Glustervolume: 192.168.50.1:/gl_vol1
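
(In crm shell terms this corresponds roughly to the following sketch, where
Glustervolume maps to the RA's device parameter and Mountdir to directory;
the resource name and monitor interval are placeholders:)

crm configure primitive res_glusterfs ocf:heartbeat:Filesystem \
    params device="192.168.50.1:/gl_vol1" directory="/virtfs" fstype="glusterfs" \
    op monitor interval="20s"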

Maybe you can answer a question for my better understanding?

My second node is 192.168.50.2, but in the Filesystem RA I have referenced
192.168.50.1 (see above). During my first test node1 was up and running, but
what happens if node1 is completely gone and its address is unreachable?
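
(For what it's worth, newer glusterfs mount helpers apparently accept a backup
volfile server for exactly this case, e.g.

mount -t glusterfs -o backupvolfile-server=192.168.50.2 192.168.50.1:/gl_vol1 /virtfs

but I have not verified that this option exists in my version, so treat it as
an assumption; the Filesystem RA would also have to pass it through via its
options parameter.)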

Thx
Uwe


Uwe Weiss
weiss edv-consulting
Lattenkamp 14
22299 Hamburg
Phone:  +49 40 51323431
Fax:    +49 40 51323437
eMail:  u.we...@netz-objekte.de

-----Original Message-----
From: gluster-users-boun...@gluster.org
[mailto:gluster-users-boun...@gluster.org] On behalf of Marcel Pennewiß
Sent: Sunday, 17 July 2011 17:37
To: gluster-users@gluster.org
Subject: Re: [Gluster-users] glusterfs and pacemaker

On Friday 15 July 2011 13:12:02 Marcel Pennewiß wrote:
  My idea is, that pacemaker starts and monitors the glusterfs
  mountpoints and migrates some resources to the remaining node if one or
  more mountpoint(s) fails.

 For using mountpoints, please have a look at the OCF Filesystem agent.

Uwe informed me (via PM) that this didn't work - we had not used it ourselves
until now. After some investigation you will see that ocf::Filesystem does not
detect/work with glusterfs shares :(

A few changes are necessary to create basic support for glusterfs.
@Uwe: Please have a look at [1] and try to patch your Filesystem OCF script
(which may be located in /usr/lib/ocf/resource.d/heartbeat).

[1] http://subversion.fem.tu-ilmenau.de/repository/fem-overlay/trunk/sys-
cluster/resource-agents/files/filesystem-glusterfs-support.patch
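
(Roughly like this -- the target path and the way the patch applies are
assumptions, adjust them to your installation:)

cd /usr/lib/ocf/resource.d/heartbeat
wget http://subversion.fem.tu-ilmenau.de/repository/fem-overlay/trunk/sys-cluster/resource-agents/files/filesystem-glusterfs-support.patch
patch Filesystem < filesystem-glusterfs-support.patch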

best regards
Marcel
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users



[Gluster-users] glusterfs and pacemaker

2011-07-15 Thread Uwe Weiss
Hello List,

 

I have a running two-node cluster running openais, pacemaker and glusterfs.

 

Currently glusterfs is manually started and mounted.

 

Now I would like to integrate glusterfs into this pacemaker environment but
I can't find a glusterfs resource agent for pacemaker.

 

Has anyone an idea how to integrate glusterfs into pacemaker?

 

My idea is that pacemaker starts and monitors the glusterfs mountpoints and
migrates some resources to the remaining node if one or more mountpoints
fail.

 

Thx in advance

ewuewu

 

 

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] glusterfs 3.2.1 nfs hangs on opensuse 11.4

2011-07-12 Thread Uwe Weiss
Hi list,

I have some trouble with glusterfs 3.2.1, compiled from the tarball, on two
openSUSE 11.4 boxes.
Compiling and installing glusterfs went fine.
Afterwards I configured a distributed replicated volume named gluster_vol1
(the peers are of course connected to each other).
As long as I only use the native gluster client everything works fine and I
am really happy with glusterfs.
But if I try to use NFS I run into trouble.
In detail: I have two hosts called vmhost1 (192.168.50.1) and vmhost2
(192.168.50.2). glusterfs is started on both hosts.
On both hosts I have mounted a directory named /nfs_mount with:
mount -t nfs -o mountproto=tcp 192.168.50.1:/gluster_vol1 /nfs_mount
Afterwards I can use ls, df, du and so on in this directory on both hosts.
Even copying small files mostly works, but not every time.

But if I try to create big files, for example with dd [], the connection
hangs after one or two minutes. Sometimes it helps to restart glusterfs on
the mounting node, sometimes only a reset of the master node solves the
problem.
This happens when I run the copy or the creation of a big file on any node
that has mounted the gluster NFS share, regardless of whether I run the
command from a single node or concurrently from multiple nodes.
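
(For illustration only -- the exact dd invocation was cut off above; what I
mean is something of this shape on the NFS mount:)

dd if=/dev/zero of=/nfs_mount/bigfile bs=1M count=2048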

Here is the logfile from my master node after starting glusterfs on both
nodes:

cat etc-glusterfs-glusterd.vol.log
[2011-07-12 15:19:08.455046] I [glusterd.c:564:init] 0-management: Using
/etc/glusterd as working directory
[2011-07-12 15:19:08.467129] E [rpc-transport.c:676:rpc_transport_load]
0-rpc-transport: /usr/local/lib/glusterfs/3.2.1/rpc-transport/rdma.so:
cannot open shared object file: No such file or directory
[2011-07-12 15:19:08.467151] E [rpc-transport.c:680:rpc_transport_load]
0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not
valid or not found on this machine
[2011-07-12 15:19:08.467164] W [rpcsvc.c:1288:rpcsvc_transport_create]
0-rpc-service: cannot create listener, initing the transport failed
[2011-07-12 15:19:08.475337] I [glusterd.c:88:glusterd_uuid_init]
0-glusterd: retrieved UUID: 358770a7-d1ee-4fbd-8552-b354a2295d79
[2011-07-12 15:19:09.852048] E
[glusterd-store.c:1567:glusterd_store_retrieve_volume] 0-: Unknown key:
brick-0
[2011-07-12 15:19:09.852093] E
[glusterd-store.c:1567:glusterd_store_retrieve_volume] 0-: Unknown key:
brick-1
[2011-07-12 15:19:10.72382] I [glusterd-handler.c:3399:glusterd_friend_add]
0-glusterd: connect returned 0
[2011-07-12 15:19:10.74324] I
[glusterd-utils.c:1092:glusterd_volume_start_glusterfs] 0-: About to start
glusterfs for brick 192.168.50.1:/sp1/glusterfs
[2011-07-12 15:19:10.98987] I
[glusterd-utils.c:2217:glusterd_nfs_pmap_deregister] 0-: De-registered
MOUNTV3 successfully
[2011-07-12 15:19:10.100043] I
[glusterd-utils.c::glusterd_nfs_pmap_deregister] 0-: De-registered
MOUNTV1 successfully
[2011-07-12 15:19:10.101123] I
[glusterd-utils.c:2227:glusterd_nfs_pmap_deregister] 0-: De-registered NFSV3
successfully
Given volfile:
+------+
  1: volume management
  2: type mgmt/glusterd
  3: option working-directory /etc/glusterd
  4: option transport-type socket,rdma
  5: option transport.socket.keepalive-time 10
  6: option transport.socket.keepalive-interval 2
  7: end-volume
  8:

+------+
[2011-07-12 15:19:10.115334] E [socket.c:1685:socket_connect_finish]
0-management: connection to  failed (Connection refused)
[2011-07-12 15:19:10.193987] I [glusterd-pmap.c:237:pmap_registry_bind]
0-pmap: adding brick /sp1/glusterfs on port 24010
[2011-07-12 15:19:10.224800] W [socket.c:1494:__socket_proto_state_machine]
0-socket.management: reading from socket failed. Error (Transport endpoint
is not connected), peer (192.168.50.1:1023)
[2011-07-12 15:21:39.983189] W [socket.c:1494:__socket_proto_state_machine]
0-socket.management: reading from socket failed. Error (Transport endpoint
is not connected), peer (192.168.50.2:1021)
[2011-07-12 15:21:40.318042] I
[glusterd-handshake.c:317:glusterd_set_clnt_mgmt_program] 0-: Using Program
glusterd clnt mgmt, Num (1238433), Version (1)

Here is my nfs.log before mounting anything:
cat nfs.log
[2011-07-12 15:19:10.213621] I [nfs.c:675:init] 0-nfs: NFS service started
[2011-07-12 15:19:10.213743] W [write-behind.c:3029:init]
0-gluster_vol1-write-behind: disabling write-behind for first 0 bytes
[2011-07-12 15:19:10.216610] I [client.c:1935:notify]
0-gluster_vol1-client-0: parent translators are ready, attempting connect on
transport
[2011-07-12 15:19:10.220675] I [client.c:1935:notify]
0-gluster_vol1-client-1: parent translators are ready, attempting connect on
transport
Given volfile:
+------+
  1: volume gluster_vol1-client-0
  2: type protocol/client
  3: option remote-host 192.168.50.1
  4: option remote-subvolume