[Gluster-users] Can't set volume options: operation failed

2011-11-10 Thread Daniel Manser
We've had some strange issues on our two-node replicated cluster
recently. Now I'm trying to reconfigure the network.ping-timeout
setting on a volume, which results in operation failed:

gluster volume info vol0_web1
Volume Name: vol0_web1
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: glu1.example.org:/mnt/vol0/web1
Brick2: glu2.example.org:/mnt/vol0/web1
Options Reconfigured:
network.ping-timeout: 1

gluster volume set vol0_web1 network.ping-timeout 20
operation failed

The log message in /var/log/glusterfs/etc-glusterfs-glusterd.vol.log:
[2011-11-10 14:11:19.464338] E [glusterd-handler.c:1900:glusterd_handle_set_volume] 0-: Unable to set cli op: 16
[2011-11-10 14:11:19.465716] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (127.0.0.1:987)
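
The obvious first steps would presumably be something like the following
(assuming glusterd is managed by the standard init script; paths may differ
per distro), but I wanted to ask before restarting anything:

  # verify that both glusterd daemons still see each other
  gluster peer status

  # look for errors around the time of the failed command
  tail -n 50 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log

  # last resort: restart the management daemon on both nodes
  /etc/init.d/glusterd restart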

Does anyone have an idea what might be wrong here?
Daniel


Re: [Gluster-users] Expand replicated bricks to more nodes/change replica count

2011-10-21 Thread Daniel Manser
 I believe you will have to re-create your volume.
 I have a feature request in for this too.
 [...]

Thanks Jeff. That's definitely a feature I'd love to see. I was expecting
this to be possible without service downtime, since replicated data should,
IMO, give you some flexibility to expand or shrink the cluster.
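
For the record, the re-create route would presumably look something like the
sketch below (volume and brick names are placeholders); the downtime while
the volume is stopped and rebuilt is exactly what I was hoping to avoid:

  # stop and delete the existing replica-2 volume (file data stays on the bricks)
  gluster volume stop myvol
  gluster volume delete myvol

  # re-create it as replica 4 across the old and the new bricks, then restart it
  gluster volume create myvol replica 4 \
      node1:/export/myvol node2:/export/myvol \
      node3:/export/myvol node4:/export/myvol
  gluster volume start myvol

  # the new bricks start out empty, so a full self-heal is needed afterwards
  # to copy the existing data onto node3 and node4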


[Gluster-users] Expand replicated bricks to more nodes/change replica count

2011-10-20 Thread Daniel Manser
Hi list

I have set up several volumes on a two-node Gluster setup using
replica 2 configurations. I would like to add two more nodes to the
trusted pool so that all volumes are replicated across 4 nodes. I was
wondering whether that can be done online, but after some research I
found no evidence that it's possible to change the replica count
_after_ a volume has been created.
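
For illustration, the kind of online workflow I was hoping for would look
roughly like this (host and brick names are placeholders; as far as I can
tell, current releases do not accept a replica option on add-brick):

  # add the two new nodes to the trusted pool
  gluster peer probe node3
  gluster peer probe node4

  # raise the replica count while adding the new bricks
  # (wishful syntax -- add-brick does not take a replica count today)
  gluster volume add-brick myvol replica 4 \
      node3:/export/myvol node4:/export/myvol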

Would I have to stop the volumes, set up a new volume with a replica
of 4, and then start the volume again? Or is there a way to do it
online?

Thanks
Daniel


Re: [Gluster-users] Crossover cable: single point of failure?

2011-06-14 Thread Daniel Manser

Hi Whit,

Thanks for your reply.

I do know that it's not the Gluster-standard thing to use a crossover link.
(Seems to me it's the obvious best way to do it, but it's not a
configuration they're committed to.) It's possible that if you were doing
your replication over the LAN rather than the crossover that Gluster would
handle a disconnected system better. Might be worth testing.


The behavior is still the same even if no crossover cable is used and all
traffic goes through an Ethernet switch: the client can't write to the
gluster volume anymore. I discovered that the NFS volume seems to be
read-only in this state:


  client01:~# rm debian-6.0.1a-i386-DVD-1.iso
  rm: cannot remove `debian-6.0.1a-i386-DVD-1.iso': Read-only file system


So all traffic goes through one interface (NFS to the client, glusterfs 
replication, corosync).


I can reproduce the issue with the NFS client on VMware ESXi and with 
the NFS client on my Linux desktop.
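
On the Linux desktop the volume is mounted over plain NFSv3, roughly like
this (server name and mount point are just examples):

  mount -t nfs -o vers=3,proto=tcp,nolock gluster1:/vmware /mnt/vmware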


My config:

  Volume Name: vmware
  Type: Replicate
  Status: Started
  Number of Bricks: 2
  Transport-type: tcp
  Bricks:
  Brick1: gluster1:/mnt/gvolumes/vmware
  Brick2: gluster2:/mnt/gvolumes/vmware

Regards,
Daniel


Re: [Gluster-users] Crossover cable: single point of failure?

2011-06-14 Thread Daniel Manser

Hi

Thanks for your reply.


 Can you confirm whether your backend filesystem is healthy? Can you delete
 the file from the backend?


I was able to delete files on the server.


Also, try setting a lower ping-timeout and see if
it helps in the crossover-cable failover test.


I set it to 5 seconds, but the result is still the same.
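
For reference, the option was set with the usual volume-set command:

  gluster volume set vmware network.ping-timeout 5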

  Volume Name: vmware
  Type: Replicate
  Status: Started
  Number of Bricks: 2
  Transport-type: tcp
  Bricks:
  Brick1: gluster1:/mnt/gvolumes/vmware
  Brick2: gluster2:/mnt/gvolumes/vmware
  Options Reconfigured:
  network.ping-timeout: 5

Daniel


Re: [Gluster-users] Crossover cable: single point of failure?

2011-06-13 Thread Daniel Manser
I disconnected the crossover (replication) link again and the problem
reappeared. When I reconnect it, it takes a few seconds and then
Gluster NFS works again. If this behavior is normal, the
replication link becomes a single point of failure. Any suggestions?



[Gluster-users] Crossover cable: single point of failure?

2011-06-10 Thread Daniel Manser

Dear community,

I have a 2-node gluster cluster with one replicated volume shared to a
client via NFS. I discovered that if the replication link (an Ethernet
crossover cable) between the Gluster nodes breaks, the whole storage
becomes unavailable.


I am using Pacemaker/corosync with two virtual IPs (service IPs exposed
to the clients), so each node has its corresponding virtual IP; if
one node fails, corosync assigns the failed node's IP to the remaining
node. This mechanism has worked pretty well so far.


So, I have:

  gluster1: IP 10.196.150.251 and virtual IP 10.196.150.250
  gluster2: IP 10.196.150.252 and virtual IP 10.196.150.254

Now I am using DNS round-robin to distribute the load on both gluster 
nodes (name: gluster.mycompany.tld).
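
Something along these lines (an illustrative crm snippet; resource names and
netmask are examples, not my exact configuration):

  crm configure primitive vip_gluster1 ocf:heartbeat:IPaddr2 \
      params ip=10.196.150.250 cidr_netmask=24 \
      op monitor interval=10s
  crm configure primitive vip_gluster2 ocf:heartbeat:IPaddr2 \
      params ip=10.196.150.254 cidr_netmask=24 \
      op monitor interval=10s
  # location constraints pinning each VIP to its preferred node omitted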


If one node goes down, the virtual IP is handed over to the remaining
node, and the client keeps working without any disruption. However, if the
replication link between the gluster nodes breaks, we observed a service
disruption: the client was then unable to write data to the cluster. The
replication link between the gluster nodes seems to be a single point of
failure. Is that correct?


Daniel


Re: [Gluster-users] Crossover cable: single point of failure?

2011-06-10 Thread Daniel Manser

Can you please share the NFS and brick logs from around the time the link
went down? Gluster should have worked in the situation you described.


Brick log on gluster1:
[2011-06-10 13:12:08.57634] W [socket.c:204:__socket_rwv] 0-tcp.vmware-server: readv failed (Connection timed out)
[2011-06-10 13:12:08.57674] W [socket.c:1494:__socket_proto_state_machine] 0-tcp.vmware-server: reading from socket failed. Error (Connection timed out), peer (192.168.150.252:1022)
[2011-06-10 13:12:08.57712] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one.vmx
[2011-06-10 13:12:08.57778] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/vmware-3.log
[2011-06-10 13:12:08.57796] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one.vmdk
[2011-06-10 13:12:08.57820] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one-flat.vmdk
[2011-06-10 13:12:08.57848] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one.nvram
[2011-06-10 13:12:08.57866] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/.lck-98894e05b84ec3a1
[2011-06-10 13:12:08.57887] I [server.c:438:server_rpc_notify] 0-vmware-server: disconnected connection from 192.168.150.252:1022
[2011-06-10 13:12:08.57933] I [server-helpers.c:783:server_connection_destroy] 0-vmware-server: destroyed connection of gluster2-4038-2011/06/10-08:31:09:180937-vmware-client-0
[2011-06-10 13:12:19.3036] I [server-handshake.c:534:server_setvolume] 0-vmware-server: accepted client from 192.168.150.252:1021


Brick log on gluster2:
[2011-06-10 13:12:01.467191] W [socket.c:204:__socket_rwv] 0-tcp.vmware-server: readv failed (Connection timed out)
[2011-06-10 13:12:01.467236] W [socket.c:1494:__socket_proto_state_machine] 0-tcp.vmware-server: reading from socket failed. Error (Connection timed out), peer (192.168.150.251:1021)
[2011-06-10 13:12:01.467279] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one.vmx
[2011-06-10 13:12:01.467345] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one.vmxf
[2011-06-10 13:12:01.467362] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one.vmdk
[2011-06-10 13:12:01.467379] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one-flat.vmdk
[2011-06-10 13:12:01.467413] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/vmware-4.log
[2011-06-10 13:12:01.467431] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/debian one.nvram
[2011-06-10 13:12:01.467447] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/vmware-5.log
[2011-06-10 13:12:01.467463] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/vmware-6.log
[2011-06-10 13:12:01.467478] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/vmware.log
[2011-06-10 13:12:01.467494] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/.lck-98894e05b84ec3a1
[2011-06-10 13:12:01.467510] I [server-helpers.c:485:do_fd_cleanup] 0-vmware-server: fd cleanup on /debian one/.lck-b14dcb6d11640f7e
[2011-06-10 13:12:01.467526] I [server.c:438:server_rpc_notify] 0-vmware-server: disconnected connection from 192.168.150.251:1021
[2011-06-10 13:12:01.467546] I [server-helpers.c:783:server_connection_destroy] 0-vmware-server: destroyed connection of gluster1-3974-2011/06/10-08:31:35:778159-vmware-client-1
[2011-06-10 13:12:18.705503] I [server-handshake.c:534:server_setvolume] 0-vmware-server: accepted client from 192.168.150.251:1021


NFS log on gluster2:
[2011-06-10 13:11:56.708490] W [socket.c:204:__socket_rwv] 0-testvolume-client-0: readv failed (Connection timed out)
[2011-06-10 13:11:56.708530] W [socket.c:1494:__socket_proto_state_machine] 0-testvolume-client-0: reading from socket failed. Error (Connection timed out), peer (192.168.150.251:24009)
[2011-06-10 13:11:56.708564] I [client.c:1883:client_rpc_notify] 0-testvolume-client-0: disconnected
[2011-06-10 13:11:56.709490] W [socket.c:204:__socket_rwv] 0-vmware-client-0: readv failed (Connection timed out)
[2011-06-10 13:11:56.709521] W [socket.c:1494:__socket_proto_state_machine] 0-vmware-client-0: reading from socket failed. Error (Connection timed out), peer (192.168.150.251:24013)
[2011-06-10 13:11:56.709553] I [client.c:1883:client_rpc_notify] 0-vmware-client-0: disconnected
[2011-06-10 13:12:07.701752] I [client-handshake.c:1080:select_server_supported_programs] 0-testvolume-client-0: Using Program GlusterFS-3.1.0, Num (1298437), Version (310)
[2011-06-10 13:12:07.702059] I [client-handshake.c:913:client_setvolume_cbk] 0-testvolume-client-0: Connected to