Re: [Gluster-users] Setting up 3.1

2011-01-12 Thread Marcus Bointon
Thanks for responding, Amar.

On 13 Jan 2011, at 06:49, Amar Tumballi wrote:

> The default transport-type is 'tcp' in the 3.1.x versions, so you don't need
> to specify any transport type for tcp.

That's good to know - where is that documented?

> What was the earlier error in glusterd's log file? 'glusterd.vol' should not
> be edited by anyone, hence it's not documented anywhere. Also, we changed the
> default port from 6996 to 24007 in the 3.1.x versions, so obviously there is
> nothing on 6996 after you upgraded.

This was the initial (default) config:

Given volfile:
+--+
  1: volume management
  2: type mgmt/glusterd
  3: option working-directory /etc/glusterd
  4: option transport-type socket,rdma
  5: option transport.socket.keepalive-time 10
  6: option transport.socket.keepalive-interval 2
  7: end-volume
  8: 

and it reported errors like this:

[2011-01-07 21:03:06.219877] I [glusterfsd.c:672:cleanup_and_exit] glusterfsd: 
shutting down
[2011-01-07 21:03:08.286880] I [glusterd.c:275:init] management: Using 
/etc/glusterd as working directory
[2011-01-07 21:03:08.289496] E [socket.c:322:__socket_server_bind] 
socket.management: binding to  failed: Address already in use
[2011-01-07 21:03:08.289544] E [socket.c:325:__socket_server_bind] 
socket.management: Port is already in use
[2011-01-07 21:03:08.289652] I [glusterd.c:87:glusterd_uuid_init] glusterd: 
retrieved UUID: 8563bfd6-4ce9-4aa3-bef5-a20597dda496
[2011-01-07 21:03:08.292233] W [rpc-transport.c:849:rpc_transport_load] 
rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2011-01-07 21:03:08.293277] I [glusterd-handler.c:2600:glusterd_friend_add] 
glusterd: connect returned 0
[2011-01-07 21:03:08.296098] W [rpc-transport.c:849:rpc_transport_load] 
rpc-transport: missing 'option transport-type'. defaulting to "socket"

I changed transport-type to 'tcp' and all the errors went away.

Now that I look, it is indeed working on port 24007. I changed the transport-type
back to the original, restarted it and found that it still works - but why is
it now using a transport-type that's not listed? Searching the docs again, I
can still only find documentation on transport-type for gluster versions older
than 3.0.

> > Is there some documentation or howto for getting gluster 3.1.1 working over
> > tcp? I don't think my situation is exactly rare!
>
> All the docs present are updated for version 3.1.x, and are kept up to date.
> Please refer to
> http://www.gluster.com/community/documentation/index.php/Gluster_3.0_to_3.1_Upgrade_Guide
> and let us know what exactly the problem in upgrading was.

The main problem I found with that page is that it doesn't actually say much
that is useful - it's a leaf node (not a root) of a whole tree of docs that
you then need to discover and read in order to do the upgrade - for example,
the need for a server pool is not mentioned at all, even though it's a vital
new concept. I'm installing on Ubuntu, but I have now found the firewall info
on the Red Hat install page (and nowhere else, according to a search). I don't
think firewall config belongs on OS-specific pages. You could show how to open
ports on each specific OS, but the general requirements should be documented
independently and findably - searching for 'firewall port' doesn't find it.
Overall I found the older (obsolete) user guide much more usable, but obviously
that's out of date.

A nicer upgrade path would be one that reads a volfile (maybe just the volgen 
line) and emits a series of gluster client commands to build the same structure.
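The idea above can be sketched in a few lines of shell. Everything here is hypothetical: it assumes a 3.0-style *client* volfile in which each brick appears as a protocol/client volume carrying 'option remote-host' and 'option remote-subvolume' lines, and the volume name, replica count and transport in the emitted command are invented placeholders, not derived from the volfile.

```shell
# Hypothetical sketch of a volfile-to-CLI translator. The sample volfile and
# the 'myvol'/'replica 2' values below are made-up placeholders.
volfile='volume client1
  type protocol/client
  option remote-host server1
  option remote-subvolume /data/export1
end-volume
volume client2
  type protocol/client
  option remote-host server2
  option remote-subvolume /data/export2
end-volume'

cmd=$(printf '%s\n' "$volfile" | awk '
  /option remote-host/      { host = $3 }                        # remember the server
  /option remote-subvolume/ { bricks = bricks " " host ":" $3 }  # pair it with its brick
  END { printf "gluster volume create myvol replica 2 transport tcp%s\n", bricks }
')
echo "$cmd"
```

Run against a real volfile this would at best produce a starting point to edit by hand, not a finished command.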

I shall persevere...

Marcus
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Share volume through rdma and tcp simultaneously

2011-01-12 Thread Beat Rubischon
Hello!

On 12.01.11 15:20, Claudio Baeza Retamal wrote:
> Is it possible in gluster to share a volume through rdma and tcp simultaneously?

I realized such a setup with GlusterFS 2.x by using two server volumes
in the same volfile:

volume tcp-server
  option transport-type tcp
  ...
end-volume

volume ib-server
  option transport-type rdma
  ...
end-volume

With GlusterFS 3.1 and its cool management frontend, I didn't find a
way to realize such a setup without hacking the volfile.
*hint-to-the-developers* :-)
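For anyone searching the archives: the 3.1.x CLI does take a transport argument at volume-creation time, so something like the following may cover the tcp+rdma case. Treat it as an untested sketch (the volume name and brick paths are invented), not a confirmed recipe:

```shell
# Untested sketch: 'transport tcp,rdma' at create time. 'testvol' and the
# brick paths are placeholders.
gluster volume create testvol replica 2 transport tcp,rdma \
  server1:/export/brick1 server2:/export/brick1
```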

Beat

-- 
 \|/   Beat Rubischon 
   ( 0-0 ) http://www.0x1b.ch/~beat/
oOO--(_)--OOo---
Meine Erlebnisse, Gedanken und Traeume: http://www.0x1b.ch/blog/


[Gluster-users] active directory integration and gluster config questions

2011-01-12 Thread Ellis, Albert Luther
Hey Gluster users,

I have a few questions after doing a test install of Gluster Storage
Platform 3.0.5...

1) Can I use Active Directory permissions pushed from a Windows machine on a
CIFS-exported Gluster share? If so, any how-tos or pointers? I am a tad lost
at the moment =/

2) I notice that SSH is up on the Storage Platform install I did, but I can't
seem to log in to the shell...? The web frontend is working fine (and I've
changed the password).

Thanks for any info!

Sincerely,

Scotty


[Gluster-users] strange problem

2011-01-12 Thread Koleszár Ádám

Hi

I have a strange problem. I am using glusterfs 3.0.5 on 3 machines with
CentOS 5.4. If I use the mounted glusterfs intensively it uses
more and more cache. I think that is normal. But as more and more cache is
used, it gets slower and slower. The glusterfsd process uses
500-700% CPU (normally 1-10%; btw it's an 8-core machine). On the mounted
glusterfs a simple directory change takes about half a minute, and every
operation is very, very slow.


If I execute the following command:
sync && echo 3 > /proc/sys/vm/drop_caches

Almost all of the system memory is freed, glusterfsd's CPU usage falls
back to 1-10%, and everything works fine... for a while.


Now I have had to put this command into cron to run every night.

Has someone encountered the same problem?
Is there any other solution?
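In case it saves anyone typing, the nightly workaround above can be wired up as a cron entry; this is just the command from the message in crontab form (the 04:00 schedule is an arbitrary choice):

```shell
# Example root crontab entry (crontab -e): drop caches nightly at 04:00.
# The time is arbitrary; pick a quiet period.
0 4 * * * sync && echo 3 > /proc/sys/vm/drop_caches
```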

Regards,
Adam Koleszar


[Gluster-users] gluster peer probe

2011-01-12 Thread Piotr Skurczak
Hello everyone,

So this is my first email here. Recently I downloaded glusterfs-3.1 and
tried to install it on my servers. I did not have any problems with the
installation, but the configuration is a little bit unclear to me. I would
like to ask whether you have encountered the same problems I have, and
if so, how you resolved them.

My configuration: 5 servers scattered around the globe: 2 x VPS in
the UK and USA, 1 (fully) virtual server (with a bridged eth adapter, on
account of which my internal ifconfig shows one ip while the public ip is
different, but the server is pingable and the ports are accessible too), and 2
servers behind routers in a so-called "demilitarized zone".

Since glusterfs does not require any special file system (at least it was never
mentioned in the installation instructions) I decided to set up a directory
(/data) where I am supposed to keep my data and replicate it onto the
/data directories on the other servers.

So we have 5 ip addresses that look like :

83.31.111.111
93.20.112.120
(...)

All the ips can ping one another. I can ssh from one to another. I checked the
iptables rules to allow gluster to communicate on whatever ports it needs.
Now what I do is "gluster peer probe HOSTNAME", and this is where it
either gets stuck, says that it failed, or gives strange error logs.

This is an excerpt from gluster logfile :

[2011-01-09 14:59:34.273183] I
[glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received
cli list req
[2011-01-09 14:59:44.896781] I
[glusterd-handler.c:716:glusterd_handle_cli_get_volume] glusterd: Received
get vol req
[2011-01-09 15:16:16.11424] I
[glusterd-handler.c:2387:glusterd_handle_probe_query] glusterd: Received
probe from uuid: 637b41c6-3349-4202-8cab-2d1293de782c
[2011-01-09 15:16:16.11472] I [glusterd-handler.c:386:glusterd_friend_find]
glusterd: Unable to find peer by uuid
[2011-01-09 15:16:16.443504] I [glusterd-handler.c:398:glusterd_friend_find]
glusterd: Unable to find hostname: 93.20.112.120
[2011-01-09 15:16:16.443526] I
[glusterd-handler.c:2401:glusterd_handle_probe_query] glusterd: Unable to
find peerinfo for host: 93.20.112.120 (24007)
[2011-01-09 15:16:16.444769] W [rpc-transport.c:849:rpc_transport_load]
rpc-transport: missing 'option transport-type'. defaulting to "socket"
[2011-01-09 15:16:16.447238] I [glusterd-handler.c:2600:glusterd_friend_add]
glusterd: connect returned 0
[2011-01-09 15:16:16.447278] I
[glusterd-handler.c:2422:glusterd_handle_probe_query] glusterd: Responded to
83.31.111.111, op_ret: 0, op_errno: 0, ret: 0
[2011-01-09 15:16:16.491402] E [socket.c:1656:socket_connect_finish]
management: connection to  failed (No route to host)
[2011-01-09 15:16:16.491726] I
[glusterd-handler.c:2131:glusterd_handle_incoming_friend_req] glusterd:
Received probe from uuid: 637b41c6-3349-4202-8cab-2d1293de782c
[2011-01-09 15:16:16.491751] I [glusterd-handler.c:386:glusterd_friend_find]
glusterd: Unable to find peer by uuid
[2011-01-09 15:16:16.491767] I
[glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
93.20.112.120 found.. state: 0
[2011-01-09 15:18:40.413343] I
[glusterd-handler.c:674:glusterd_handle_cli_list_friends] glusterd: Received
cli list req
[2011-01-09 15:18:40.413449] I
[glusterd-handler.c:2768:glusterd_xfer_friend_add_resp] glusterd: Responded
to 83.31.111.111 (0), ret: 0
[2011-01-09 15:18:40.413639] I
[glusterd-utils.c:2101:glusterd_friend_find_by_hostname] glusterd: Friend
93.20.112.120 found.. state: 2

On all of my servers I can telnet HOST 24007 and it works. The glusterd
process is up and running on all servers.

What I did was add my own names into /etc/hosts, like node1, node2,
node3... it did not help.
There is an interesting video over here:
http://www.youtube.com/user/GlusterStorage#p/u/3/sSCCZLzNnUQ where the gluster
guys just hit enter and bam, a peer is in the cluster. Reality, however, tells
another story. Either the instructions are incomplete and don't cover all the
tips and tricks, OR I am just missing something.
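One thing worth double-checking in a setup like this (an assumption on my part, pieced together from the 24007 default mentioned elsewhere in these threads): 3.1.x reportedly uses 24007 for glusterd, 24008, and then one port per brick starting at 24009, so opening only 24007 lets the probe start but the follow-up connections fail with "No route to host"-style errors. A hedged iptables sketch:

```shell
# Hedged sketch: open the ports glusterfs 3.1.x is commonly reported to use.
# 24007 = glusterd, 24008, 24009+ = one per brick (the range here assumes at
# most 8 bricks per server; adjust to your brick count).
iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT
iptables -A INPUT -p tcp --dport 24009:24016 -j ACCEPT
```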

Any help in this matter would be highly appreciated.
Peter


Re: [Gluster-users] Setting up 3.1

2011-01-12 Thread Amar Tumballi
Hi Marcus,

Replies inline.

On Thu, Jan 13, 2011 at 2:59 AM, Marcus Bointon
wrote:

> [Repost - last time this didn't seem to work]
>
> I've been running gluster for a couple of years, so I'm quite used to 3.0.x
> and earlier. I'm looking to upgrade to 3.1.1 for more stability (I'm getting
> frequent 'file has vanished' errors when rsyncing from 3.0.6) on a
> bog-standard 2-node dist/rep config. So far it's not going well. I'm running
> on Ubuntu Lucid x64 using the current 3.1.1 package. It seems that a lot of
> the basics are not covered in the docs - for example, it doesn't seem to
> work with tcp out of the box - the vital transport-type option doesn't even
> appear in the 3.1 docs as far as I can find!
>
>
The default transport-type is 'tcp' in the 3.1.x versions, so you don't need
to specify any transport type for tcp.



> So now I've got as far as tweaking that in the glusterd.vol file (also not
> documented) and glusterd starts with no errors, but netstat tells me nothing
> is running on that port and it's not responding on port 6996 if I try to
> telnet to it.
>
>
What was the earlier error in glusterd's log file? 'glusterd.vol' should
not be edited by anyone, hence it's not documented anywhere. Also, we changed
the default port from 6996 to 24007 in the 3.1.x versions, so obviously there
is nothing on 6996 after you upgraded.



> Is there some documentation or howto for getting gluster 3.1.1 working over
> tcp? I don't think my situation is exactly rare!
>
All the docs present are updated for version 3.1.x, and are kept up to
date. Please refer to
http://www.gluster.com/community/documentation/index.php/Gluster_3.0_to_3.1_Upgrade_Guide
and let us know what exactly the problem in upgrading was.

Regards,
Amar


[Gluster-users] GlusterSP 3.1.1 Progress?

2011-01-12 Thread MIKE SHELDON
Any update on Gluster Storage Platform 3.1.1 release date?


[Gluster-users] Setting up 3.1

2011-01-12 Thread Marcus Bointon
[Repost - last time this didn't seem to work]

I've been running gluster for a couple of years, so I'm quite used to 3.0.x and 
earlier. I'm looking to upgrade to 3.1.1 for more stability (I'm getting 
frequent 'file has vanished' errors when rsyncing from 3.0.6) on a bog-standard 
2-node dist/rep config. So far it's not going well. I'm running on Ubuntu Lucid 
x64 using the current 3.1.1 package. It seems that a lot of the basics are not 
covered in the docs - for example, it doesn't seem to work with tcp out of the 
box - the vital transport-type option doesn't even appear in the 3.1 docs as 
far as I can find!

So now I've got as far as tweaking that in the glusterd.vol file (also not 
documented) and glusterd starts with no errors, but netstat tells me nothing is 
running on that port and it's not responding on port 6996 if I try to telnet to 
it.

Is there some documentation or howto for getting gluster 3.1.1 working over 
tcp? I don't think my situation is exactly rare!

Marcus
-- 
Marcus Bointon
Synchromedia Limited: Creators of http://www.smartmessages.net/
UK resellers of i...@hand CRM solutions
mar...@synchromedia.co.uk | http://www.synchromedia.co.uk/




Re: [Gluster-users] Frequent "stale nfs file handle" error

2011-01-12 Thread Fabricio Cannini
On Wednesday, 12 January 2011, at 09:05:00, Łukasz Jagiełło wrote:
> On 12 January 2011, at 11:19, Amar Tumballi wrote:
> >> Got same problem at 3.1.1 - 3.1.2qa4
> > 
> > Can you paste the logs? Also, when you say problem, what are the
> > user-application errors seen?
> 
> No errors/notice logs at gluster, just client side where nfs is
> mounted. When I try list directory got "stale nfs file handle".
> 'mount /dir -o remount' helps but thats not solution.

One thing that I noticed on reading the documentation about AFR again is that
the first node of my cluster was the 'lock server' for all nodes. At the high
loads that are common there, it is surely possible that a single machine could
not manage the locks in a timely manner, right?
If so, how can I set a specific node as the 'lock server' of a given
subvolume?

http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator

I haven't found that in the docs; I only found how to increase the number of
'lock servers' in a subvolume:

http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator#Locking_options
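My reading of that same page (unverified; treat the option name below as an assumption from the 3.0-era docs) is that the lock servers are simply the first N subvolumes in the 'subvolumes' line, so reordering the subvolumes chooses *which* node locks, and the *-lock-server-count options choose *how many*. A sketch:

```text
volume replicate0
  type cluster/replicate
  # Assumption from the 3.0-era AFR docs: locks are held by the first
  # data-lock-server-count subvolumes in the list below, so the subvolume
  # order decides which node acts as lock server.
  option data-lock-server-count 2
  subvolumes brick-on-lighter-node brick-on-busy-node
end-volume
```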

TIA.


[Gluster-users] Debian, 3.1.1, duplicate files

2011-01-12 Thread phil cryer
I'm now running gluster 3.1.1 on Debian. A directory that was running
under 3.0.4 had duplicate files, and I've remounted things now that
we're running 3.1.1 in the hope that it would fix them, but so far it has
not:

# ls -l /mnt/glusterfs/www/0/0descriptionofta581unit
total 37992
-rwxr-xr-x 1 www-data www-data   796343 Jun 23  2010
0descriptionofta581unit_bw.pdf
-rwxr-xr-x 1 www-data www-data   796343 Jun 23  2010
0descriptionofta581unit_bw.pdf
---------T 1 root root         1497 Jun 24  2010
0descriptionofta581unit_dc.xml
---------T 1 root root         1497 Jun 24  2010
0descriptionofta581unit_dc.xml
---------T 1 www-data www-data   577050 Jun 24  2010
0descriptionofta581unit.djvu
---------T 1 www-data www-data   577050 Jun 24  2010
0descriptionofta581unit.djvu
-rwxr-xr-x 1 www-data www-data    33272 Jun 22  2010
0descriptionofta581unit_djvu.txt
-rwxr-xr-x 1 www-data www-data    33272 Jun 22  2010
0descriptionofta581unit_djvu.txt
-rwxr-xr-x 1 www-data www-data 4445 Jun 23  2010
0descriptionofta581unit_files.xml
-rwxr-xr-x 1 www-data www-data 4445 Jun 23  2010
0descriptionofta581unit_files.xml
-rwxr-xr-x 1 www-data www-data 5011 Jun 22  2010
0descriptionofta581unit_marc.xml
-rwxr-xr-x 1 www-data www-data 5011 Jun 22  2010
0descriptionofta581unit_marc.xml
-rwxr-xr-x 1 www-data www-data  360 Jun 23  2010
0descriptionofta581unit_metasource.xml
-rwxr-xr-x 1 www-data www-data  360 Jun 23  2010
0descriptionofta581unit_metasource.xml
-rwxr-xr-x 1 www-data www-data 2848 Jun 22  2010
0descriptionofta581unit_meta.xml
-rwxr-xr-x 1 www-data www-data 2848 Jun 22  2010
0descriptionofta581unit_meta.xml
-rwxr-xr-x 1 www-data www-data 16916480 Jun 22  2010
0descriptionofta581unit_orig_jp2.tar
-rwxr-xr-x 1 www-data www-data 16916480 Jun 22  2010
0descriptionofta581unit_orig_jp2.tar
-rwxr-xr-x 1 www-data www-data  1051810 Jun 22  2010 0descriptionofta581unit.pdf
-rwxr-xr-x 1 www-data www-data  1051810 Jun 22  2010 0descriptionofta581unit.pdf

While running the latest version, 3.1.1, I noticed log entries that said:

[..]
[2011-01-12 15:24:33.325546] I
[afr-common.c:613:afr_lookup_self_heal_check] bhl-volume-replicate-69:
size differs for
/www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
[2011-01-12 15:24:33.325558] I [afr-common.c:716:afr_lookup_done]
bhl-volume-replicate-69: background  meta-data data self-heal
triggered. path:
/www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
[2011-01-12 15:24:33.364501] I
[afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
bhl-volume-replicate-66: background  meta-data data self-heal
completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu
[2011-01-12 15:24:33.364881] I
[afr-self-heal-common.c:1526:afr_self_heal_completion_cbk]
bhl-volume-replicate-69: background  meta-data data self-heal
completed on /www/0/0descriptionofta581unit/0descriptionofta581unit.djvu

I assumed it was fixing that, but it didn't. Here's the full logs that
include all the gluster.log work it did in this directory:
http://pastebin.com/8X52Em7Y

Question: how can I 'fix' this, or is the best bet to remove
everything and start over? It's going to set us back, but I'd rather
do it now than keep banging on this without any resolution.
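One hedged thing to try before wiping everything (an assumption on my part, not an official procedure): force a lookup on every file under the mount, since lookup is what kicks off the AFR self-heal seen in the logs above. MOUNT is a placeholder for your glusterfs mount point:

```shell
# Hedged sketch: stat every entry under the glusterfs mount to force
# lookups, which trigger AFR self-heal. MOUNT is a placeholder; the guard
# keeps this from erroring when the path doesn't exist.
MOUNT=${MOUNT:-/mnt/glusterfs}
if [ -d "$MOUNT" ]; then
  find "$MOUNT" -print0 | xargs -0 stat --format='%n' > /dev/null
fi
```

If the duplicates are DHT link-file leftovers from the 3.0 layout rather than pending self-heals, this may not help, in which case the list is probably the right place to ask before deleting anything.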

Thanks for the help - I really like the new gluster command, very nice!

P
-- 
http://philcryer.com


Re: [Gluster-users] strange problem

2011-01-12 Thread Vikas Gorur

On Jan 12, 2011, at 1:25 AM, Koleszár Ádám wrote:

> Hi,
> 
> I have a strange problem. I am using glusterfs 3.0.5 on 3 machines with 
> CentOS 5.4. If I am using the mounted glusterfs intensively it is using more 
> and more cache. I think it is normal. But as more and more cache used, it's 
> getting slower and slower.
> The glusterfsd process uses 500-700% CPU (normal 1-10%, btw it's a 8 core 
> machine). On the mounted glusterfs a simple directory change takes about half 
> minute, and every operation very very slow on the mounted glusterfs.
> 
> If I execute the following command:
> sync && echo 3 > /proc/sys/vm/drop_caches
> 
> Almost all of the system memory is freed, glusterfsd's CPU usage falls back
> to 1-10%, and everything works fine... for a while.
> 
> Now I have had to put this command into cron to run every night.
> 
> Has someone encountered the same problem?
> Is there any other solution?


You can tune the sysctl parameter vfs_cache_pressure to control how aggressively
the kernel reclaims the dentry and inode caches (higher values make the kernel
reclaim them at a higher rate). We've seen better latencies with this set to
something like 1. You can also experiment with setting vm.swappiness to 0.

/etc/sysctl.conf

vm.vfs_cache_pressure=1
vm.swappiness=0

# sysctl -p

--
Vikas Gorur
Engineer - Gluster, Inc.
--


Re: [Gluster-users] 4 node replica 2 crash

2011-01-12 Thread rickytato rickytato
This is the stack trace I found in syslog:

Jan 10 18:08:24 www3 kernel: [2773721.043130] INFO: task nginx:22664 blocked
for more than 120 seconds.
Jan 10 18:08:24 www3 kernel: [2773721.043152] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 10 18:08:24 www3 kernel: [2773721.043176] nginx D
0001108733fc 0 22664   3107 0x0004
Jan 10 18:08:24 www3 kernel: [2773721.043179]  880058d7db68
0082 8800 00015980
Jan 10 18:08:24 www3 kernel: [2773721.043181]  880058d7dfd8
00015980 880058d7dfd8 880018492dc0
Jan 10 18:08:24 www3 kernel: [2773721.043184]  00015980
00015980 880058d7dfd8 00015980
Jan 10 18:08:24 www3 kernel: [2773721.043186] Call Trace:
Jan 10 18:08:24 www3 kernel: [2773721.043192]  []
request_wait_answer+0x85/0x240
Jan 10 18:08:24 www3 kernel: [2773721.043196]  [] ?
autoremove_wake_function+0x0/0x40
Jan 10 18:08:24 www3 kernel: [2773721.043199]  []
fuse_request_send+0x7c/0x90
Jan 10 18:08:24 www3 kernel: [2773721.043202]  []
fuse_dentry_revalidate+0x179/0x2b0
Jan 10 18:08:24 www3 kernel: [2773721.043204]  []
do_lookup+0x84/0x280
Jan 10 18:08:24 www3 kernel: [2773721.043206]  []
link_path_walk+0x12e/0xab0
Jan 10 18:08:24 www3 kernel: [2773721.043208]  []
do_filp_open+0x143/0x660
Jan 10 18:08:24 www3 kernel: [2773721.043212]  [] ?
default_spin_lock_flags+0x9/0x10
Jan 10 18:08:24 www3 kernel: [2773721.043216]  [] ?
sys_recvfrom+0xe1/0x170
Jan 10 18:08:24 www3 kernel: [2773721.043220]  [] ?
_raw_spin_lock+0xe/0x20
Jan 10 18:08:24 www3 kernel: [2773721.043222]  [] ?
alloc_fd+0x10a/0x150
Jan 10 18:08:24 www3 kernel: [2773721.043226]  []
do_sys_open+0x69/0x170
Jan 10 18:08:24 www3 kernel: [2773721.043229]  []
sys_open+0x20/0x30
Jan 10 18:08:24 www3 kernel: [2773721.043232]  []
system_call_fastpath+0x16/0x1b


2011/1/12 rickytato rickytato 

> Some other info:
> O.S.: Ubuntu 10.10 64bit
> GlusterFS compiled from source
>
> Client and server are the same machines; they are simple webservers
> with Nginx + PHP-FPM, and only one directory of static content is exported
> via GlusterFS; the PHP code is local only.
>
> The servers have 2 x 1Gbit NICs in bonding.
>
> Anything else you need?
>
> The very strange thing is that it was only about 4 hours after adding the
> new nodes that Nginx stopped responding.
>
> Any suggestions?
>
>
> rr
>
> 2011/1/11 rickytato rickytato 
>
> Hi,
>> I've been using a simple 2-node replica 2 cluster for about 4 weeks; I'm
>> using glusterfs 3.1.1 built on Dec 9 2010 15:41:32, repository revision:
>> v3.1.1.
>> I use it to serve images through Nginx.
>> All works well.
>>
>> Today I added 2 new bricks and rebalanced the volume. It worked for about 4
>> hours, after which Nginx hung; I rebooted all the servers but nothing
>> helped.
>>
>> When I removed the two bricks everything returned to normal (I manually
>> copied the files from the "old" bricks back to the originals).
>>
>>
>> What's wrong?
>>
>
>


Re: [Gluster-users] 4 node replica 2 crash

2011-01-12 Thread rickytato rickytato
Some other info:
O.S.: Ubuntu 10.10 64bit
GlusterFS compiled from source

Client and server are the same machines; they are simple webservers
with Nginx + PHP-FPM, and only one directory of static content is exported
via GlusterFS; the PHP code is local only.

The servers have 2 x 1Gbit NICs in bonding.

Anything else you need?

The very strange thing is that it was only about 4 hours after adding the
new nodes that Nginx stopped responding.

Any suggestions?


rr

2011/1/11 rickytato rickytato 

> Hi,
> I've been using a simple 2-node replica 2 cluster for about 4 weeks; I'm
> using glusterfs 3.1.1 built on Dec 9 2010 15:41:32, repository revision:
> v3.1.1.
> I use it to serve images through Nginx.
> All works well.
>
> Today I added 2 new bricks and rebalanced the volume. It worked for about 4
> hours, after which Nginx hung; I rebooted all the servers but nothing helped.
>
> When I removed the two bricks everything returned to normal (I manually
> copied the files from the "old" bricks back to the originals).
>
>
> What's wrong?
>
>


[Gluster-users] Share volume through rdma and tcp simultaneously

2011-01-12 Thread Claudio Baeza Retamal

Dear friends,

I have been reading some documents about gluster, but I have not found
an answer to one question.


In gluster, is it possible to share a volume through rdma and tcp simultaneously?

Would something like the following work?

volume home1-server
  type protocol/server
  option transport-type rdma,tcp
  option auth.addr./lustre/int.allow *
  subvolumes /san/int
end-volume


saludos

claudio

--
Claudio Baeza Retamal
CTO
National Laboratory for High Performance Computing (NLHPC)
Center for Mathematical Modeling (CMM)
School of Engineering and Sciences
Universidad de Chile





Re: [Gluster-users] Frequent "stale nfs file handle" error

2011-01-12 Thread Fabricio Cannini
On Wednesday, 12 January 2011, at 08:19:50, Amar Tumballi wrote:
> > > If anybody can make sense of why this is happening, I'd be really
> > > really thankful.
> >
> > We fixed many issues in the 3.1.x releases compared to 3.0.5 (even some
> > issues fixed in 3.0.6). Please consider testing/upgrading to a newer
> > version.

I'm thinking about upgrading, but I'd rather stay with the Debian stock
packages if possible.
I'll talk with Patrick Matthäi, Debian's gluster maintainer, and see whether
it is possible to backport the fixes.
Also, if there is any work-around available, please tell us.

> > Got same problem at 3.1.1 - 3.1.2qa4
> 
> Can you paste the logs ?? Also, when you say problem, what is the user
> application errors seen?

I've put a bunch of log messages here: http://pastebin.com/gkf3CmK9 and here:
http://pastebin.com/wDgF74j8 .

> Regards,
> Amar


Re: [Gluster-users] Frequent "stale nfs file handle" error

2011-01-12 Thread Łukasz Jagiełło
On 12 January 2011, at 11:19, Amar Tumballi wrote:
>> Got same problem at 3.1.1 - 3.1.2qa4
>>
>
> Can you paste the logs? Also, when you say problem, what are the
> user-application errors seen?

No error/notice logs on the gluster side, just on the client side where the
nfs share is mounted. When I try to list a directory I get "stale nfs file
handle". 'mount /dir -o remount' helps, but that's not a solution.

-- 
Łukasz Jagiełło
lukaszjagielloorg


Re: [Gluster-users] Frequent "stale nfs file handle" error

2011-01-12 Thread Amar Tumballi
> > If anybody can make sense of why this is happening, I'd be really really
> > thankful.

We fixed many issues in the 3.1.x releases compared to 3.0.5 (even some issues
fixed in 3.0.6). Please consider testing/upgrading to a newer version.



> Got same problem at 3.1.1 - 3.1.2qa4
>
>
Can you paste the logs? Also, when you say problem, what are the
user-application errors seen?


Regards,
Amar


Re: [Gluster-users] Frequent "stale nfs file handle" error

2011-01-12 Thread Łukasz Jagiełło
2011/1/11 Fabricio Cannini :
> Hi all.
>
> I've been having this error very frequently, at least once in a week.
> Whenever this happens, restarting all the gluster daemons makes things work
> again.
>
> This is the hardware i'm using:
>
> 22 nodes, each with
> 2x Intel Xeon 5420 2.5GHz, 16GB DDR2 ECC, and 1 SATA2 HD of 750GB,
> of which ~600GB is a partition (/glstfs) dedicated to gluster. Each node
> has 1 Mellanox MT25204 [InfiniHost III Lx] InfiniBand DDR HCA used by
> gluster through the 'verbs' interface. The switch is a Voltaire ISR 9024S/D.
> Each node is also a client of the gluster volume, which is accessed through
> the '/scratch' mount-point.
> The machine itself is a scientific cluster, with all nodes and the head
> running Debian Squeeze amd64, with stock 3.0.5 packages.
>
> These are the server and client configs:
>
> Client config
> http://pastebin.com/6d4BjQwd
>
> Server config
> http://pastebin.com/4ZmX9ir1
>
> And here are some of the messages in the head node log:
> http://pastebin.com/gkf3CmK9
>
> If anybody can make sense of why this is happening, I'd be really really
> thankful.

Got same problem at 3.1.1 - 3.1.2qa4

-- 
Łukasz Jagiełło
lukaszjagielloorg


[Gluster-users] strange problem

2011-01-12 Thread Koleszár Ádám

Hi,

I have a strange problem. I am using glusterfs 3.0.5 on 3 machines with
CentOS 5.4. If I use the mounted glusterfs intensively it uses
more and more cache. I think that is normal. But as more and more cache is
used, it gets slower and slower.
The glusterfsd process uses 500-700% CPU (normally 1-10%; btw it's an 8-core
machine). On the mounted glusterfs a simple directory change takes
about half a minute, and every operation is very, very slow.


If I execute the following command:
sync && echo 3 > /proc/sys/vm/drop_caches

Almost all of the system memory is freed, glusterfsd's CPU usage falls
back to 1-10%, and everything works fine... for a while.


Now I have had to put this command into cron to run every night.

Has someone encountered the same problem?
Is there any other solution?

Regards,
Adam Koleszar