[Gluster-users] volume started but not 'startable', not 'stoppable'

2012-10-06 Thread harry mangalam
...and should have added:

The rebalance log is active as well (the volume claimed to be rebalancing before 
I shut it down, but was idle or wedged at that time), with roughly one 
"1 subvolumes down -- not fixing" warning for every three informational messages:

[2012-10-06 22:05:35.396650] I [dht-rebalance.c:1058:gf_defrag_migrate_data] 0-gli-dht: migrate data called on /nlduong/nduong2-t-illiac/workspace/m5_sim/trunk/src/arch/.svn/tmp/wcprops

[2012-10-06 22:05:35.451925] I [dht-layout.c:593:dht_layout_normalize] 0-gli-dht: found anomalies in /nlduong/nduong2-t-illiac/workspace/m5_sim/trunk/src/arch/.svn/wcprops. holes=1 overlaps=0

[2012-10-06 22:05:35.451957] W [dht-selfheal.c:875:dht_selfheal_directory] 0-gli-dht: 1 subvolumes down -- not fixing
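
A rough way to confirm that ratio is to count message levels directly; this is 
only a sketch, and the rebalance-log filename below is an assumption that may 
differ on your install:
==
# Sketch: count warning vs. informational lines in the rebalance log.
# The filename is an assumption; adjust to your actual rebalance log.
LOG=/var/log/glusterfs/gli-rebalance.log
echo "warnings:      $(grep -c '] W \[' "$LOG")"
echo "informational: $(grep -c '] I \[' "$LOG")"
==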


previously... (the original message is quoted in full below)


[Gluster-users] volume started but not 'startable', not 'stoppable'

2012-10-06 Thread harry mangalam
Gluster 3.3, running on Ubuntu 10.04, was working OK; I had to shut it down for 
a power outage.

When I tried to shut it down, it insisted that it was rebalancing, but seemed 
wedged - no activity in the logs.

I was able to shut it down, though.

After power was restored, I tried to restart the volume, but although the 4 
peers claimed to be visible and could ping each other, etc.:
==
Sat Oct 06 21:38:07 [0.81 0.71 0.58]  root@pbs2:/var/log/glusterfs/bricks
567 $ gluster peer status
Number of Peers: 3

Hostname: pbs3ib
Uuid: c79c4084-d6b9-4af9-b975-40dd6aa99b42
State: Peer in Cluster (Connected)

Hostname: 10.255.77.2
Uuid: 3fcd023c-9cc9-4d1c-84c4-babfb4492e38
State: Peer in Cluster (Connected)

Hostname: pbs4ib
Uuid: 2a593581-bf45-446c-8f7c-212c53297803
State: Peer in Cluster (Connected)
==

and the volume info seemed to be OK:
==
Sat Oct 06 21:36:11 [0.75 0.67 0.56]  root@pbs2:/var/log/glusterfs/bricks
565 $ gluster volume info gli

Volume Name: gli
Type: Distribute
Volume ID: 76cc5e88-0ac4-42ac-a4a3-31bf2ba611d4
Status: Started
Number of Bricks: 4
Transport-type: tcp,rdma
Bricks:
Brick1: pbs1ib:/bducgl
Brick2: pbs2ib:/bducgl
Brick3: pbs3ib:/bducgl
Brick4: pbs4ib:/bducgl
Options Reconfigured:
performance.write-behind-window-size: 1024MB
performance.flush-behind: on
performance.cache-size: 268435456
nfs.disable: on
performance.io-thread-count: 64
performance.quick-read: on
performance.io-cache: on
==
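
Since volume info shows "Status: Started", one cross-check is whether each 
server actually has a glusterfsd brick process up. This is only a sketch; the 
process check and the brick-log filename are assumptions about a stock 3.3 
layout:
==
# Sketch: on each brick server, check whether a glusterfsd process is
# actually exporting /bducgl, and peek at the end of its brick log.
# The log filename is an assumption (brick path with '/' mapped to '-').
for h in pbs1ib pbs2ib pbs3ib pbs4ib; do
    echo "--- $h ---"
    ssh "$h" 'ps aux | grep "[g]lusterfsd"; tail -n 5 /var/log/glusterfs/bricks/bducgl.log'
done
==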
Some utilities claim that it was not started, even though some clients /are 
using the volume/ (though there are some file oddities).
(from a client):

-rw-r--r-- 1 hmangala hmangala   32935 Jun 23  2010 INSTALL.txt
?- ? ??  ?? R-2.15.0
drwxr-xr-x 2 hmangala hmangala  18 Sep 10 14:20 bonnie/
drwxr-xr-x 2 root root  18 Sep 10 13:41 bonnie2/

drwx-- 2 spoorkas spoorkas  8211 Jun  2 00:22 QPSK_2Tx_2Rx_BH_Method2/
?- ? ???? QPSK_2Tx_2Rx_ML_Method1
drwx-- 2 spoorkas spoorkas  8237 Jun  3 11:22 QPSK_2Tx_2Rx_ML_Method2/
drwx-- 2 spoorkas spoorkas 12288 Jun  4 01:24 QPSK_2Tx_3Rx_BH/
drwx-- 2 spoorkas spoorkas  4232 Jun  2 00:26 QPSK_2Tx_3Rx_BH_Method1/
drwx-- 2 spoorkas spoorkas  8274 Jun  2 00:34 QPSK_2Tx_3Rx_BH_Method2/
?- ? ???? QPSK_2Tx_3Rx_ML_Method1
?- ? ???? QPSK_2Tx_3Rx_ML_Method2
-rw-r--r-- 1 spoorkas spoorkas 0 Apr 17 14:16 simple.sh.e1802207

(These files appear to be intact on the individual bricks, though.)
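
For reference, that per-brick check can be scripted; this is only a sketch, and 
PATH_IN_VOL is a placeholder for the directory's path relative to the volume 
root (the full path was not shown above):
==
# Sketch: look at one of the "?" entries directly on each brick's backend.
# PATH_IN_VOL is a placeholder; substitute the real path under the volume.
PATH_IN_VOL="path/under/volume/QPSK_2Tx_2Rx_ML_Method1"
for h in pbs1ib pbs2ib pbs3ib pbs4ib; do
    echo "--- $h ---"
    ssh "$h" "ls -ld /bducgl/$PATH_IN_VOL" 2>/dev/null
done
==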

==
Sat Oct 06 21:38:18 [0.76 0.71 0.58]  root@pbs2:/var/log/glusterfs/bricks
568 $ gluster volume status
Volume gli is not started
==
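
The next steps I can think of are sketched below, not yet run - in particular 
I haven't verified how 'force' behaves on 3.3 against a volume in this state, 
so corrections are welcome:
==
# Sketch only -- not yet run; please say if any of this is unsafe here.
gluster volume rebalance gli status   # is the old rebalance still registered?
gluster volume rebalance gli stop     # if so, clear it
gluster volume start gli force        # restart brick processes for a volume
                                      # already marked Started
gluster volume status gli             # re-check
==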