Re: [Gluster-devel] Gluster 1.3.12 and Xen!

2008-11-25 Thread Raghavendra G
Hi, Can you send us glusterfs log files? regards, On Mon, Nov 24, 2008 at 7:32 PM, Enrico Valsecchi [EMAIL PROTECTED] wrote: Dear All, I have installed Gluster FS v. 1.3.12 on 2 servers with Debian 4.05, in AFR with client-side replication, as explained on this page:

[Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Fred Hucht
Hi devels! We consider GlusterFS as parallel file server (8 server nodes) for our parallel Opteron cluster (88 nodes, ~500 cores), as well as for a unified nufa /scratch distributed over all nodes. We use the cluster within a scientific environment (theoretical physics) and use

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Basavanagowda Kanur
Fred, Can you also provide us server logs? -- gowda On Tue, Nov 25, 2008 at 4:57 PM, Fred Hucht [EMAIL PROTECTED] wrote: Hi devels! We consider GlusterFS as parallel file server (8 server nodes) for our parallel Opteron cluster (88 nodes, ~500 cores), as well as for a unified nufa

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Fred Hucht
Hi! The glusterfsd.log on all nodes are virtually empty, the only entry on 2008-11-25 reads 2008-11-25 03:13:48 E [io-threads.c:273:iot_flush] sc1-ioth: fd context is NULL, returning EBADFD on all nodes. I don't think that this is related to our problems. Regards, Fred On

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Joe Landman
Fred Hucht wrote: Hi! The glusterfsd.log on all nodes are virtually empty, the only entry on 2008-11-25 reads 2008-11-25 03:13:48 E [io-threads.c:273:iot_flush] sc1-ioth: fd context is NULL, returning EBADFD on all nodes. I don't think that this is related to our problems. Regards,

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Fred Hucht
Hello Harald! I didn't test Infiniband transport until now, as I don't want to interfere with the parallel applications which are running over Infiniband. Gigabit Ethernet throughput would be sufficient for us at the moment. Today only three nodes were affected, yesterday it was nine

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Fred Hucht
Hi, crawling through all /var/log/messages, I found on one of the failing nodes (node68) Nov 25 04:04:12 node68 kernel: INFO: task pw.x:20052 blocked for more than 120 seconds. Nov 25 04:04:12 node68 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 25

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Fred Hucht
Hi, I forgot to say: A # umount -f /scratch; umount /scratch; mount /scratch resolves the problems. No need to reload fuse. Fred On 25.11.2008, at 13:42, Joe Landman wrote: Fred Hucht wrote: Hi! The glusterfsd.log on all nodes are virtually empty, the only entry on 2008-11-25 reads

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Harald Stürzebecher
Hello! 2008/11/25 Fred Hucht [EMAIL PROTECTED]: Hello Harald! I didn't test Infiniband transport until now, as I don't want to interfere with the parallel applications which are running over Infiniband. Gigabit Ethernet throughput would be sufficient for us at the moment. Today only three

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Joe Landman
Fred Hucht wrote: Hi, crawling through all /var/log/messages, I found on one of the failing nodes (node68) Does your setup use local disk? Is it possible that the backing store is failing? If you run mcelog > /tmp/mce.log 2>&1 on the failing node, do you get any output in

[Gluster-devel] Namespace cache size ratio

2008-11-25 Thread Daniel van Ham Colchete
Hey y'all, I had a bunch of problems here that delayed my tests. Hardware problems, other systems and so on. Yesterday I started the tests and what I'm learning is really making it worthwhile. I will send all the results later but for now the biggest find is that the XFS filesystem is *the* best for

Re: [Gluster-devel] GlusterFS hangs/fails: Transport endpoint is not connected

2008-11-25 Thread Fred Hucht
Hi, I installed and ran mcelog and only found DIMM problems on one node, which are not related to our problems. I don't think that the problems are related to only a few nodes. The problems occur on the nodes where the test job runs, and the queue scheduler always selects different nodes.

Re: [Gluster-devel] rdiff-backup to glusterfs share doesn't work at all

2008-11-25 Thread Hannes Dorbath
Hello, I'm sorry for the noise. The problem seems to be completely unrelated to GlusterFS. An upgrade of lib pyxattr 0.2 to 0.4 fixed everything. So the problem was between rdiff-backup 1.2.2 and pyxattr 0.2. I'm sorry for blaming GlusterFS without proper debugging. Hannes Dorbath wrote:

Re: [Gluster-devel] Namespace cache size ratio

2008-11-25 Thread Amar S. Tumballi
Hi Daniel, 2008/11/25 Daniel van Ham Colchete [EMAIL PROTECTED] Yesterday I started the tests and what I'm learning is really making it worthwhile. I will send all the results later but for now the biggest find is that the XFS filesystem is *the* best for maildir storage in comparison from