Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Sunil Mushran
What version? File a bugzilla (oss.oracle.com/bugzilla) with all the version details, etc. Easier to track issues that-a-way.
mike wrote:
> Here's another issue:
>
> I have a client with 9049 files/dirs in a specific dir. The first time
> I did an ls /that/dir/ it froze up - and hitting control C

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
Here's another issue: I have a client with 9049 files/dirs in a specific dir. The first time I did an ls /that/dir/ it froze up - and hitting control C actually made it behave interestingly. Every two times I hit control C, I got one of these permission denied lines (the files exist, I can open t

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
Is there any OCFS2 debugging I could turn on as well? To log perhaps every file request and how long it took, or if something hits a threshold, etc? I think I can turn on the webserver's debugging (if not, the author is pretty responsive) and hopefully find the bottleneck through there. On 4/21/
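The idea of logging how long each file request took, and flagging anything over a threshold, can be approximated from user space. The following is only a rough sketch under stated assumptions: it is not an OCFS2 debugging facility, and the function name `timed_listdir` and the one-second default threshold are hypothetical choices made for illustration.

```python
import logging
import os
import time


def timed_listdir(path, threshold_s=1.0, clock=time.monotonic):
    """List `path` and return (entries, elapsed_seconds).

    Hypothetical helper: logs a warning when the listing takes at least
    `threshold_s` seconds, which is an assumed cut-off, not a value
    taken from the thread.
    """
    start = clock()
    entries = os.listdir(path)
    elapsed = clock() - start
    if elapsed >= threshold_s:
        logging.warning(
            "listdir(%s) took %.3fs (%d entries)", path, elapsed, len(entries)
        )
    return entries, elapsed
```

Calling this periodically against the shared mount would give timestamps to line up against the proxy's timeout messages.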

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Herbert van den Bergh
Does the web server that the proxy server talks to have any extended debugging you can turn on? In particular, would it be able to log timestamps of things it does, so you can narrow down where the hiccup occurs? A brute-force method to do this would be to run strace -T on all server process
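With `strace -T`, each traced system call ends with the time spent in that call in angle brackets, for example `<0.000120>`. A minimal sketch of filtering such output for slow calls might look like the following; the helper name `slow_syscalls` and the 0.5 second threshold are assumptions made for illustration, and the pattern only handles simple single-line entries.

```python
import re

# Matches lines such as:  name(args...) = result <seconds>
# optionally prefixed by "[pid N] " as printed when following children.
_LINE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?(?P<call>\w+)\(.*<(?P<secs>\d+\.\d+)>\s*$"
)


def slow_syscalls(lines, threshold_s=0.5):
    """Return (syscall_name, seconds) for calls at or above threshold_s.

    Hypothetical helper; threshold_s is an assumed cut-off.
    """
    slow = []
    for line in lines:
        match = _LINE.match(line)
        if match is None:
            continue  # signals, unfinished calls, and other non-call lines
        seconds = float(match.group("secs"))
        if seconds >= threshold_s:
            slow.append((match.group("call"), seconds))
    return slow
```

Feeding it the saved trace of each server process would show which calls stall around the time of a proxy timeout.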

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
You're right, it -is- possible, but if you look at it (and I can log it for hours) it only seems to do that right before I get a timeout message from the proxy. The two appear to be related. I will continue to monitor this and make sure that my hypothesis is correct. Something is flaking out every

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Herbert van den Bergh
Mike, Are you sure it's not possible for sdb to be idle for just 1 second? If you look at the interval right after the one you pointed out, you'll see r/s is 2.97 and w/s is .99, so it did 3 reads and 1 write in that one second interval. The device appears to be used very little. I think i
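The arithmetic behind that reading of the iostat output can be written out as a small sketch: a per-second rate multiplied by the sampling interval gives the number of operations in that interval. The function name `ops_in_interval` is hypothetical, and the one-second interval is taken from the description above.

```python
def ops_in_interval(rate_per_s, interval_s=1.0):
    """Convert a per-second rate into a whole number of operations."""
    return round(rate_per_s * interval_s)


reads = ops_in_interval(2.97)   # r/s from the interval discussed above
writes = ops_in_interval(0.99)  # w/s from the same interval
print(reads, writes)            # prints: 3 1
```

A device doing only a handful of operations per second can plausibly show a fully idle one-second sample now and then.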

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
Thanks. If I have the opportunity to run the (buggy) new kernel again I will try this. That is definitely a problem and I think I need to set the oracle behavior to crash and not auto reboot for this to be effective, right? That is just one issue. 1) 2.6.24-16 with load completely crashes node pr

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Sunil Mushran
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/networking/netconsole.txt;h=3c2f2b3286385337ce5ec24afebd4699dd1e6e0a;hb=HEAD
netconsole is a facility to capture oops traces. It is not a console per se and does not require a head/gtk/x11 etc to work. The link
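netconsole sends kernel messages as UDP datagrams to a remote logging host, so the receiving side only needs something listening on a UDP port. A minimal receiver sketch follows, assuming the sender targets port 6666 (the default target port in the netconsole documentation); the function names are hypothetical, and a tool such as netcat or a syslog daemon would serve the same purpose.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket for incoming netconsole datagrams.

    Port 6666 is assumed to match the sender's configured target port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_message(sock, bufsize=65535):
    """Block until one datagram arrives; return (sender_ip, text)."""
    data, (sender_ip, _sender_port) = sock.recvfrom(bufsize)
    return sender_ip, data.decode("utf-8", errors="replace")
```

Looping over `receive_message` and appending the text to a file on a separate machine would preserve an oops trace even if the crashing node reboots.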

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
Well, these are headless production servers, CLI only. No GTK, no X11. Also, I am not running the newer kernels (and I can't...). It looks like I cannot run a hybrid of 2.6.24-16 and 2.6.22-19; whichever one has mounted the drive first is the winner. If I mix them, I can get the 2.6.24's to mount, th

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Sunil Mushran
Setting up netconsole does not require a reboot. The idea is to catch the oops trace when the oops happens. Without that trace, we are flying blind.
mike wrote:
> Since these are production I can't do much.
>
> But I did get an error (it's not happening as much but it still blips
> here and there)

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
Since these are production I can't do much. But I did get an error (it's not happening as much but it still blips here and there) Notice that /dev/sdb (my iscsi target using ocfs2) hits 0.00% utilization, 3 seconds before my proxy says "hey, timeout" - every other second there is -always- some ut

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Sunil Mushran
Do you have the panic output... kernel stack trace. We'll need that to figure this out. Without that, we can only speculate.
mike wrote:
> On 4/21/08, Tao Ma <[EMAIL PROTECTED]> wrote:
>
>> mike wrote:
>>
>>> I have changed my kernel back to 2.6.22-14-server, and now I don't get
>>> the ke

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
> On 4/21/08, Tao Ma <[EMAIL PROTECTED]> wrote:
> Also please provide more details about it.
I am using nginx for a frontend load balancer, and nginx for a webserver as well. This doesn't seem to be related to the webserver at all though, it was happening before this. lvs01 proxies traffic in to

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
On 4/21/08, Tao Ma <[EMAIL PROTECTED]> wrote:
> mike wrote:
> > I have changed my kernel back to 2.6.22-14-server, and now I don't get
> > the kernel panics. It seems like an issue with 2.6.24-16 and some i/o
> > made it crash...
> >
> >
> OK, so it seems that it is a bug for ocfs2 kernel, not the

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Tao Ma
mike wrote:
> I have changed my kernel back to 2.6.22-14-server, and now I don't get
> the kernel panics. It seems like an issue with 2.6.24-16 and some i/o
> made it crash...
>
OK, so it seems that it is a bug for ocfs2 kernel, not the ocfs2-tools. :) Then could you please describe it in more d

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Joel Becker
On Mon, Apr 21, 2008 at 05:02:33PM +0800, Tao Ma wrote:
> Then there is only one thing maybe. Have you modify
> /etc/sysconfig/o2cb(This is the place for RHEL, not sure the place in
> ubuntu)? I have checked the rpm package for RHEL, it will update
> /etc/sysconfig/o2cb and this file has some ti

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
I have changed my kernel back to 2.6.22-14-server, and now I don't get the kernel panics. It seems like an issue with 2.6.24-16 and some i/o made it crash... However I am still getting file access timeouts once in a while. I am nervous about putting more load on the setup. [EMAIL PROTECTED] .ba

Re: [Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread Tao Ma
Hi Mike, Are you sure it is caused by the update of ocfs2-tools? AFAIK, the ocfs2-tools only include tools like mkfs, fsck and tunefs etc. So if you don't make any change to the disk (by using these new tools), it shouldn't cause a kernel panic since they are all user space to

[Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

2008-04-21 Thread mike
Hi, I'm running into a big issue. I believe it is OCFS2; I can get my machines to kernel panic consistently. Before, I was running Ubuntu Gutsy (7.10) with ocfs2-tools 1.2.4. Now I am running Ubuntu Hardy (8.04) with ocfs2-tools 1.3.9. I am even running the same kernel (2.6.22-14), but the behavior has cha