Re: [Ocfs2-users] Non-blocking error when installing OCFS2 1.4.1 from source

2008-11-03 Thread Sunil Mushran
's safe to ignore this error? > > -Original Message- > From: Sunil Mushran [mailto:[EMAIL PROTECTED] > Sent: Monday, 3 November 2008 21:16 > To: Wessel > CC: ocfs2-users@oss.oracle.com > Subject: Re: [Ocfs2-users] Non-blocking error when installing OCFS2

Re: [Ocfs2-users] Non-blocking error when installing OCFS2 1.4.1 from source

2008-11-03 Thread Sunil Mushran
ocfs2_stackglue was added in 2.6.27 as part of the userspace cluster stack support. Not relevant in earlier kernels. Wessel wrote: > Hi all, > > We've been using OCFS2 in our cluster setup for quite some time now, and > recently I've been thinking about upgrading the cluster to the new 1.4.1 > rel

Re: [Ocfs2-users] Question regarding HA of OCFS2

2008-10-31 Thread Sunil Mushran
ocfs2 is a shared disk clustered file system. As the disk is accessible by all nodes, there is no need to keep the data in sync. Instead the fs needs to ensure nodes coordinate access such that multiple nodes reading and writing concurrently do not corrupt the fs. You may want to read the ocfs2 1.

Re: [Ocfs2-users] Another node is heartbeating in our slot! errors with LUN removal/addition

2008-10-30 Thread Sunil Mushran
Device "dm-30": another node is heartbeating in our slot!" > messages keep flowing through syslog until I manually remove the > heartbeat. > > I can reproduce this over and over. > > TIA, > > Daniel > > >> -Original Message- >> From: Sun

Re: [Ocfs2-users] Differences in 1.2 vs 1.4

2008-10-30 Thread Sunil Mushran
Definitely upgrade to 1.2.9-1. No point sticking with 1.2.8. 1.2.9 has been out since June. This is a no-brainer. To 1.4 or not, with less than 2 weeks before go-live, is a harder call. If nothing else, it would also require a kernel upgrade. We fix bugs in both releases. Actually, all three whe

Re: [Ocfs2-users] Soft lockup preceding directory becoming inaccessible.

2008-10-27 Thread Sunil Mushran
find and du are not the same. As in, a find may only need to read the directory entries and thus only lock the dir inode. OTOH, du needs to read all the inodes in the dir tree. The stack shows a soft lockup that could have been triggered if the process was extending, ftruncate(), the file to a ver

Re: [Ocfs2-users] frequent production node reboots

2008-10-27 Thread Sunil Mushran
Do upgrade to ocfs2 1.2.9-1. It has a fix for oss bugzilla#919 that could be causing the timeouts. The symptom for that issue is o2net spinning at 100% shortly before the timeout/fence. Saranya Sivakumar wrote: > Hi All, > > We have been having frequent node reboots in our 4 node production RAC

Re: [Ocfs2-users] Another node is heartbeating in our slot! errors with LUN removal/addition

2008-10-24 Thread Sunil Mushran
So that's the problem. The heartbeat is not stopping because of the segfault. I reviewed the code change in this tool (1.2.7 to 1.4.1) and it is quite limited. As in, I have no idea as to why it is segfaulting. Now you could stop the heartbeat manually. You have to be careful though because stoppi

Re: [Ocfs2-users] Another node is heartbeating in our slot! errors with LUN removal/addition

2008-10-22 Thread Sunil Mushran
Are you mounting the snapshotted lun on more than one node? If not, then use tunefs.ocfs2 to also make it mount local. That is, do it at the time you are changing the label and uuid. This will avoid the problem as the fs will not start hb for local mounts. However, this just avoids the issue. To res

Re: [Ocfs2-users] How are people using OCFS2 - any limitations

2008-10-13 Thread Sunil Mushran
From the dev point of view, make sure you use ocfs2 1.4. It would mean upgrading the servers to (RH)EL5 U2 (or SLES10 SP2). I'll let actual users answer the qs you have asked. Patrick Kelly wrote: > > University of California, Davis, runs a campus sakai application. > Sakai is a group of module

Re: [Ocfs2-users] New node..new problems

2008-10-09 Thread Sunil Mushran
Yeah the cluster timeouts are not consistent. Update and restart the cluster on the new node (or all nodes as the case might be). Hint: cat /sys/kernel/config/cluster//idle_timeout_ms to see the active heartbeat threshold. Dante Garro wrote: > Hi all, because problems with ocfs2 release of Debian

Re: [Ocfs2-users] Who rebooted?

2008-10-06 Thread Sunil Mushran
Two issues. softlockup and oops in jbd. Safe to say the issue is between jbd and ocfs2. I'll need more info to proceed. Do: $ objdump -DSl /lib/modules/`uname -r`/kernel/fs/jbd/jbd.ko >/tmp/jbd.out $ ./stat_sysdir -d device >/tmp/sysdir.out http://oss.oracle.com/~smushran/.debug/scripts/stat_sysd

Re: [Ocfs2-users] ocfs2 kernel BUG

2008-10-03 Thread Sunil Mushran
This is the same issue as bugzilla 1012: http://oss.oracle.com/bugzilla/show_bug.cgi?id=1012 Is this happening frequently? We have failed to reproduce it in our test cluster. If you can reproduce it, I could give you a potential fix for testing. Let me know. Sunil Christian van Barneveld wrote: > Hi, > > Th

Re: [Ocfs2-users] 2 node cluster reboot

2008-10-03 Thread Sunil Mushran
No. Only available on sles10 sp2 and (rh)el5 u2. For more, please refer to: http://oss.oracle.com/projects/ocfs2/ David Coulson wrote: > Is OCFS2 1.4 available for RHEL4? > > David > > Sunil Mushran wrote: >> For starters, (RH)EL5 2.6.18 is not the same as 2.6.18 mainlin

Re: [Ocfs2-users] 2 node cluster reboot

2008-10-03 Thread Sunil Mushran
.2 (kernel 2.6.18.92.1.13) with ocfs2 1.2 or with ocfs2 > 1.4? > 2 Debian's 2.6.26 kernel with ocfs2 1.2 or with ocfs2 1.4? > - if I switch to Debian's 2.6.26, what version of ocfs2 can I use? > > Thank you, > > Dante > > > > > -----Original Message--

Re: [Ocfs2-users] ocfs2 filesystem seems out of sync

2008-09-25 Thread Sunil Mushran
Which kernel/distro/ocfs2 version? When you say writes, are you referring to files/dirs mismatch or filedata mismatch. Can you expand on what you are noticing. BTW, you can use debugfs.ocfs2 to help in your debugging. It allows one to read the contents of the fs directly from the volume. It has c

Re: [Ocfs2-users] server crash : Assertion failure in do_get_write_access (kernel 2.6.9-42.0.2.ELs

2008-09-24 Thread Sunil Mushran
Do you have a netconsole server setup? If not, it is recommended that you do because it captures the full oops logs. For example, if we had the full oops log, we would not only know the component (ext3 or ocfs2) that triggered this and also the potential fix. The non-auto-restart is because you ha

Re: [Ocfs2-users] help needed with ocfs2 on centos5.2

2008-09-24 Thread Sunil Mushran
The ocfs2 kernel package installed does not correspond to the running kernel. Well, you say you have centos 5.2 running but the kernel version# mentioned is 5.0. Read the "Getting Started" section in the ocfs2 docs to determine the appropriate ocfs2 kernel package. ritesh sinha wrote: > Hi, > I w

Re: [Ocfs2-users] Availability of the Open-Sharedroot Cluster Project for Novell SLES10 with OCFS2

2008-09-23 Thread Sunil Mushran
Are the comoonics packages open sourced (GPL or otherwise)? Wondering as I could not find the sources on the site. Marc Grimme wrote: > Hello, > I just wanted to inform you that we have successfully ported the > Open-Sharedroot Cluster to be used with Novell SLES10 SP2 with OCFS2 1.4.1 > (SuSE V

Re: [Ocfs2-users] 2 node cluster reboot

2008-09-23 Thread Sunil Mushran
will shut down the > service in an orderly way. In this case, with just one node running, is there a way > to keep it running and avoid it panicking? > > Dante > > > > -Original Message- > From: Sunil Mushran [mailto:[EMAIL PROTECTED] > Sent: Tuesday, 23 Sep

Re: [Ocfs2-users] fsck in startup scripts

2008-09-23 Thread Sunil Mushran
Hmm... meaning the corrupted area was not touched by the running fs. Because if it had, the fs would have been remounted ro with a message asking the user to run fsck. Remember, ocfs2 is a journaled fs. So there is no need to run fsck on a regular basis. If a node dies, the surviving node recovers

Re: [Ocfs2-users] 2 node cluster reboot

2008-09-23 Thread Sunil Mushran
have it repaired/serviced. > Is there a solution for my case with ocfs? > > Dante > > > > -Original Message- > From: Sunil Mushran [mailto:[EMAIL PROTECTED] > Sent: Tuesday, 02 September 2008 05:09 p.m. > To: Dante Garro > CC: 'ocfs2-users@o

Re: [Ocfs2-users] (no subject)

2008-09-22 Thread Sunil Mushran
The fencing is because the io write took more than 2 mins. Since you have provided only a snippet of the logs, all I can say is that multipathd detecting the path failure and o2hb fencing are 90 secs apart. I don't see the barely timed out bit. Check your multipath setting/configuration. Daniel Ke

Re: [Ocfs2-users] Lost write in archive logs: has it ever happened?

2008-09-22 Thread Sunil Mushran
No, I've never heard of an end user running into it. Or, any bug we've fixed that addresses this. Are you multiplexing the archivelogs? Lost write could be due to any layer from the userspace to the disk array. By multiplexing archivelogs and mirroring redologs, you will reduce the chances of get

Re: [Ocfs2-users] o2hb_do_disk_heartbeat:982:ERROR

2008-09-19 Thread Sunil Mushran
Ensure the cluster.conf is the same across the cluster. If it is not, edit and restart the cluster. The "transport endpoint" error means that the tcpip connect failed. It could be because of incorrect ip, firewall, or a bad cluster.conf. The dmesg errors indicate that the cluster.conf could be mi

Re: [Ocfs2-users] Ocfs2 cluster and sw level...

2008-09-19 Thread Sunil Mushran
The network protocol differs between 1.2 and 1.4. It won't work. Marco Mililotti wrote: > Hi all, > > is it possible/acceptable to mount and use an Ocfs2 filesystem on two > machines running *different* software level? I.e.: > - M1 that runs ocfs2 1.2.3-0.7, kernel drv ver: 1.2.5-SLES-r2997 > - M

Re: [Ocfs2-users] OCFS2 and Xen - aio error -14

2008-09-17 Thread Sunil Mushran
One more thing. Do: $ strace -ff -o /tmp/save xm save Zip up the traces and attach to bugzilla. Lastly, is the guest hvm or pvm, 32-bit or 64-bit. From the packages the host machine appears to be 64-bit. Sunil Mushran wrote: > Enable some tracing: > > $ debugfs.ocfs2 -l ENTRY E

Re: [Ocfs2-users] OCFS2 and Xen - aio error -14

2008-09-16 Thread Sunil Mushran
Enable some tracing: $ debugfs.ocfs2 -l ENTRY EXIT INODE DISK_ALLOC SUPER FILE_IO NAMEI AIO allow $ xm save $ debugfs.ocfs2 -l ENTRY EXIT deny INODE DISK_ALLOC SUPER FILE_IO NAMEI AIO off File a bugzilla and attach the syslog. Brett Worth wrote: > Sunil Mushran wrote: > >

Re: [Ocfs2-users] OCFS2 and Xen - aio error -14

2008-09-16 Thread Sunil Mushran
Which kernel? The error message itself needs to be silenced. The OR should be changed to an AND.

    if (ret != -EFAULT || ret != -ENOSPC)
            mlog_errno(ret);

But that just means we are treating this as a user error. However, as the same
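
Why the OR has to become an AND can be checked with a small truth table. The following is only a sketch that models the C condition in Python; the two function names are made up for illustration, and the errno values come from Python's standard errno module rather than from the kernel source.

```python
import errno

def logs_with_or(ret):
    # Models the quoted condition: ret != -EFAULT || ret != -ENOSPC
    # No single value can equal both constants, so this is always True.
    return ret != -errno.EFAULT or ret != -errno.ENOSPC

def logs_with_and(ret):
    # Models the suggested fix: ret != -EFAULT && ret != -ENOSPC
    # True only for errors other than the two expected ones.
    return ret != -errno.EFAULT and ret != -errno.ENOSPC

for ret in (-errno.EFAULT, -errno.ENOSPC, -errno.EIO):
    print(ret, logs_with_or(ret), logs_with_and(ret))
```

With the OR, every return value is logged, including the two the check was meant to keep quiet. With the AND, only other errors (here -EIO) reach the logging call.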

Re: [Ocfs2-users] OCFS2

2008-09-11 Thread Sunil Mushran
What version? $ modinfo ocfs2 $ rpm -qa | grep ocfs2 $ uname -a Sunil Marshall, Richard wrote: > > Hello: > > We have a 3 node cluster, on one of the nodes the LAN cable was > accidentally disconnected, and the node hard booted (i.e. power > off/on). There were no indications from ocfs2 and o2

Re: [Ocfs2-users] Version of ocfs2 in vanilla?

2008-09-08 Thread Sunil Mushran
That fix went into 2.6.26. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ffda89a3bf3b968bdc268584c6bc1da5c173cf12 http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a4a4891164d4f6f383cc17e7c90828a7ca6a1146 http://git.kernel.org/?p=linu

Re: [Ocfs2-users] cant see parameters values

2008-09-08 Thread Sunil Mushran
What version of ocfs2-tools are you running? Doron Tamir wrote: > > Hi all , > > > > When I type : > > cat /etc/sysconfig/o2cb > > > > I see only > > 1. # O2CB_ENABELED: 'true' means to load the driver on boot. > 2. O2CB_ENABLED=true > 3.4.

Re: [Ocfs2-users] 2 node cluster reboot

2008-09-07 Thread Sunil Mushran
rify it's still connected with a ping node and continue about > its duties (it can even maintain read/write in a DRBD context) > > H > > Sunil Mushran wrote: >> Check out qs 80/81 in the faq. >> http://oss.oracle.com/projects/ocfs2/dist/documentation/v1.2/ocfs2_faq.h

Re: [Ocfs2-users] [Ocfs-users] Hard system restart when DRBD connection fails while in use

2008-09-07 Thread Sunil Mushran
Repeat the test. This time run the following on Node A after you have killed Node B. $ ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN If we are lucky we'll get to see where that process is waiting. Henri Cook wrote: > Hi all, > > I have two nodes (A+B) running a DRBD file system (using OCFS2) on

Re: [Ocfs2-users] VM node won't talk to host

2008-09-04 Thread Sunil Mushran
That will be so if KVM is buffering the ios. Which it must be doing for performance reasons. Bret Baptist wrote: > On Friday 29 August 2008 18:38:08 Bret Baptist wrote: > >> On Thursday 28 August 2008 18:59:07 Sunil Mushran wrote: >> >>> If the VM is not see

Re: [Ocfs2-users] 2 node cluster reboot

2008-09-02 Thread Sunil Mushran
Check out qs 80/81 in the faq. http://oss.oracle.com/projects/ocfs2/dist/documentation/v1.2/ocfs2_faq.html#QUORUM Which version, distro, etc.? Dante Garro wrote: > Hi Sirs, I have a 2 nodes cluster. > > Seems (but I'm not sure)when I've rebooted one of nodes the other reboots > itself too. > > Is

Re: [Ocfs2-users] (no subject)

2008-09-02 Thread Sunil Mushran
ively, if you've rebooted the system that holds the lock > would the others reclaim locks held and carry on as normal? > >Andy > > > > -Original Message- > From: Sunil Mushran [mailto:[EMAIL PROTECTED] > Sent: Tue 02/09/2008 05:21 > To: Andrew Phillips >

Re: [Ocfs2-users] (no subject)

2008-09-01 Thread Sunil Mushran
So in 1.4, we have a much improved debugging infrastructure for such issues. Check out the write-up on dlm debugging in the 1.4 user's guide in the chapter titled notes. In short, you have correctly identified the lock resource. But we need to go a step further and get the info from the dlm and see a

Re: [Ocfs2-users] Using OCFS2 with More than Two Nodes

2008-08-29 Thread Sunil Mushran
Define export? Do you want the nodes to be part of a cluster? As in, want local fs semantics across nodes. If so, use a shared device that can accommodate more than 2 nodes. drbd8 is great for what it does but also limits users to 2 nodes. Zack Gilburd wrote: > Hi all, > > I have ocfs2 atop drbd8

Re: [Ocfs2-users] migration methods (ocfs <-> ocfs2)

2008-08-27 Thread Sunil Mushran
No, the fscat tools can only read certain unmounted file systems. They cannot write. You can use it to copy data from ocfs to ocfs2 on a box running the 2.6 kernel (sles9/10, el4/5). Mehmet Can ÖNAL wrote: > > Hi everyone; > > > > we have a production system with 6 nodes of RAC upon ocfs file sy

Re: [Ocfs2-users] VM node won't talk to host

2008-08-25 Thread Sunil Mushran
No, the device names have nothing to do with it. When you mount, mount.ocfs2 kicks off the heartbeat. When other nodes see a new node heartbeating, o2net attempts to connect to the node. That connect is necessary for the mount to succeed. My investigation would start with disk heartbeat. # watch -d -n2

Re: [Ocfs2-users] ocfs2 issue? : unexplained reboots of RHEL 4 server (kernel:2.6.9-42.0.2.ELs)

2008-08-23 Thread Sunil Mushran
ms to do submit_bio for read > Index 8: took 120303 ms to do waiting for read completion > *** ocfs2 is very sorry to be fencing this system by restarting *** > Bootdata ok (command line is ro root=/dev/VolGroup_ID_12182/LogVol1 > console=ttyS0,9600n8) > > > ##

Re: [Ocfs2-users] formatting and mounting ocfs2 on 2 rac nodes

2008-08-21 Thread Sunil Mushran
What does "mounted.ocfs2 -d" return on the two nodes? Corne Lombard wrote: > > I ran into a problem in section 16 (Install & Configure Oracle Cluster > File System (OCFS2)) of the following article – > > http://www.oracle.com/technology/pub/articles/hunter_rac10gr2_iscsi_2.html#16. > > I have 2 R

Re: [Ocfs2-users] Weird messages at kernel.log

2008-08-20 Thread Sunil Mushran
2.6.18 is very, very old. We've fixed many, many bugs since then. Upgrade to a more recent kernel. Earliest I would say 2.6.21. Or anything after that. Dante Garro wrote: > > Hi all! > > I've just configured a new 2 node cluster and I found messages at > kernel.log like the following: > > Aug 19 19:

Re: [Ocfs2-users] new server and version ocfs2

2008-08-19 Thread Sunil Mushran
Actually, the mount will fail. The clusterstack detects mismatches during the handshake. The 1.3.9 tools correspond with the file system that comes with the kernel. It was the best release at that time. However, software development is a constant process. We are adding new features and fixing bug

[Ocfs2-users] OCFS2 1.4 is released

2008-08-19 Thread Sunil Mushran
All, We are pleased to announce the release of OCFS2 1.4. This release has been available with Novell's SUSE Linux Enterprise Server (SLES10 SP2) for some time now. Today we are announcing the release of the same for Red Hat's and Oracle's Enterprise Linux (EL5 U2) distributions. Before upgrading

Re: [Ocfs2-users] OCFS2 random kernel error on Fedora 8

2008-08-18 Thread Sunil Mushran
Thanks. This looks like a new issue. Please log it in the bugzilla. http://oss.oracle.com/bugzilla Wessel wrote: > Hello All, > > I’ve been running a 4-node OCFS2 cluster for about a month now, and recently > I’ve had a total of 3 kernel errors on random nodes. This causes the machine > to lock up

Re: [Ocfs2-users] ocfs2 issue? : unexplained reboots of RHEL 4 server (kernel:2.6.9-42.0.2.ELs)

2008-08-18 Thread Sunil Mushran
Configure a netdump or netconsole server. It will catch the relevant messages. Derek Hazell wrote: > > Dear OCFS2 forum > > We run ocfs2 version 1.2.9-1 as an ocfs2 cluster on four Linux servers > running RHEL 4 (kernel: 2.6.9-42.0.2.ELs) > > We are getting unexpected reboots of one of the L

Re: [Ocfs2-users] Bug in OCFS2 1.3.3

2008-08-15 Thread Sunil Mushran
Please can you file a bugzilla and attach this stack trace. Also attach the output of the following: $ objdump -DSl /lib/modules/`uname -r`/kernel/fs/ocfs2/ocfs2.ko >/tmp/ocfs2.out Paulo Rodrigues wrote: > Got the same error again today. > > BUG: unable to handle kernel NULL pointer dereference

Re: [Ocfs2-users] Linux-x86_64 Error: 4: Interrupted system call

2008-08-14 Thread Sunil Mushran
This may not be related to ocfs2. Check the oracle doc for the version of the database you are running. It will have all the appropriate kernel settings. Daniel Keisling wrote: > Greetings, > > When attempting to start up a database on an OCFS2 filesystem, Oracle > complains with the following

Re: [Ocfs2-users] Bug in OCFS2 1.3.3

2008-08-13 Thread Sunil Mushran
It does not look like you used the force option. Or, you ran with the file system mounted. Umount the fs on all nodes and do: $ fsck.ocfs2 -f /dev/dm-1 Paulo Rodrigues wrote: > Hello Sunil, > > fsck says its clean: > > Checking OCFS2 filesystem in /dev/dm-1: > label: /var/lib/dovecot/spool

Re: [Ocfs2-users] Suggestion about Heartbeat

2008-08-13 Thread Sunil Mushran
I am hoping we would not have this problem with the new cluster stacks that are in development, cman and pacemaker. But always good to hear about the issues being encountered by the users. Well, not good... but you know what I mean. Michael Moody wrote: > > I have a suggestion about the heartbeat

Re: [Ocfs2-users] Bug in OCFS2 1.3.3

2008-08-13 Thread Sunil Mushran
This could suggest an on disk problem. Have you run fsck.ocfs2 recently? fsck.ocfs2 -f /dev/sdX1 Paulo Rodrigues wrote: > Hello, > > I'm on 2.6.24 with OCFS2 1.3.3 and every couple days this comes up in > dmesg. I have to reboot the cluster machines, there's nothing else I > can do. Stopping t

Re: [Ocfs2-users] Version compatibity

2008-08-12 Thread Sunil Mushran
Yes. It is fully ondisk compatible. Read the bit about file system compatibility here. http://oss.oracle.com/pipermail/ocfs2-announce/2008-March/23.html We will be releasing a more formal user's guide soon. Paulo Rodrigues wrote: > Hello, > > is it safe to unmount FS under 1.3.3 and mount un

Re: [Ocfs2-users] dlm domain problem

2008-08-11 Thread Sunil Mushran
Do: $ debugfs.ocfs2 -l DLM ENTRY EXIT allow $ mkdir /dlm/test $ debugfs.ocfs2 -l DLM off ENTRY EXIT deny File a bugzilla and attach the /var/log/messages. Charlie Sharkey wrote: > > > > I'm having a problem creating a dlm domain. The libo2dlm library > returns a 'could not create domain' err

Re: [Ocfs2-users] OCFS2 troubleshooting tools

2008-08-11 Thread Sunil Mushran
ocfs2 home page on oss.oracle.com has support guides. Also, we have a wiki. http://oss.oracle.com/osswiki/OCFS2/Debugging The dlm debugging has been improved in sles10 sp2. We will be soon releasing the details in ocfs2 1.4's user's guide. Make sure you are running the latest sles10 sp1 kernel. h

Re: [Ocfs2-users] ocfs2 node reboot method

2008-08-06 Thread Sunil Mushran
http://oss.oracle.com/bugzilla/show_bug.cgi?id=838 Check out this bugzilla. Tao Ma wrote: > Hi, > > Masanari Iida wrote: >> Hello Tao and Sunil, >> My case, the symptom (ocfs2 failed to mount a volume using >> /etc/fstab) happened when I reboot the system. >> Even if it failed to mount (by /etc/f

Re: [Ocfs2-users] ocfs2 node reboot method

2008-08-05 Thread Sunil Mushran
I believe 1.2.5-SLES-r2997 is the version of the fs and not the tools. Meaning, an upgrade is required to the latest kernel that is shipping 1.2.9. As far as failure to mount goes, one reason could be that the default timeout (10 secs) could be low. See if increasing to the new default of 30 secs

Re: [Ocfs2-users] ocfs2 kernel BUG

2008-08-01 Thread Sunil Mushran
No, the kernel is old. A year+ old. Refer to this announcement below. http://oss.oracle.com/pipermail/ocfs2-announce/2008-July/26.html From the stack, it looks like you are encountering the rename/extend race that was fixed a long time ago. http://oss.oracle.com/projects/ocfs2/news/article_14.htm

Re: [Ocfs2-users] Filesystem usage after mkfs.ocfs2

2008-07-30 Thread Sunil Mushran
The default mkfs params make 4 slots each with a 256M journal. That's 1G. If you want them smaller, mkfs provides parameters to override the same. Secondly, we compute based on the full device. Most other filesystems deduct the blocks consumed by the fs on creation in their calculation. Arnold Ma
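
The journal arithmetic above can be written out as a one-liner. This is a minimal sketch; the function and parameter names are hypothetical and not part of mkfs.ocfs2, and the defaults are simply the figures quoted in the post.

```python
def journal_overhead_mb(slots=4, journal_mb=256):
    # One journal is created per node slot, so the space consumed
    # at format time scales with the number of slots.
    return slots * journal_mb

print(journal_overhead_mb())        # 4 slots x 256M each
print(journal_overhead_mb(2, 64))   # a smaller, hypothetical layout
```

With the defaults this gives 1024 MB, the 1G mentioned in the post; fewer slots or smaller journals shrink it proportionally.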

Re: [Ocfs2-users] Problems building ocfs2 rpm on Fedora 9

2008-07-29 Thread Sunil Mushran
Use an enterprise kernel. Tina Soles wrote: > OK, another snag. Fedora 9 does not support RAW devices, so I can't > configure the voting disk or OCR disk to be as such. Any suggestions? I > think I'm "up a creek" here... > > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PRO

Re: [Ocfs2-users] OCFS2 and VMware ESX

2008-07-28 Thread Sunil Mushran
SLES10 SP2 is shipping OCFS2 1.4. We will be releasing the same for (RH)EL in the coming weeks. -Original Message- From Haydn Cahir <[EMAIL PROTECTED]> Sent Mon 7/28/2008 8:07 PM To ocfs2-users@oss.oracle.com Subject Re: [Ocfs2-users] OCFS2 and VMware ESX Hi Mark, Thanks for your reply. Ho

Re: [Ocfs2-users] why does mkfs.ocfs2 take so long?

2008-07-28 Thread Sunil Mushran
Two inits take time. 1. Cluster group init. 2. Journal init. Considering this is a 16TB volume being formatted with 4K/4K block/cluster sizes, it has 127074 cluster groups to initialize. So 127074 4K blocks to initialize. But this bit should be somewhat similar to ext3. Journal initializati

Re: [Ocfs2-users] ocfs2 fencing issue on 1.2.9.1

2008-07-24 Thread Sunil Mushran
Hard for me to diagnose the issue with no logs. Maybe best if you logged a bugzilla with novell and provided them with all the logs. Kuang, Howard [WHQKT] wrote: > > Hi, Sunil, > > > > I upgrade ocfs2 to 1.2.9.1 with the new kernel from Novell. The > fencing problem is still existing. When one

Re: [Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]

2008-07-24 Thread Sunil Mushran
rious stages in the test > outlined below. Also, the -n option is not used on the mount. > > Regards > > Mark Schloss > > > Mark Schloss | Oracle DBA | Information Technology | x0013 > > -Original Message- > From: Sunil Mushran [mailto:[EMAIL PROTECTED]

Re: [Ocfs2-users] Recommended block size for a mail environment

2008-07-24 Thread Sunil Mushran
To start off, you are using a very old version of the fs/tools. 2+ years old. Upgrading will take care of the -R option. However, the basic problem you are experiencing will remain. As in, ocfs2 uses blocksized inodes. You can reduce the blocksize, but that will result in a loss of throughput as the i

Re: [Ocfs2-users] ORA-19870 and ORA-19502 During RMAN restore to OCFS2 filesystem

2008-07-23 Thread Sunil Mushran
Please file an SR with Oracle support. Database issues are best resolved in that forum. From Ed Gulakowski <[EMAIL PROTECTED]> Sent Wed 7/23/2008 7:48 PM To ocfs2-users@oss.oracle.com Subject [Ocfs2-users] ORA-19870 and ORA-19502 During RMAN

Re: [Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]

2008-07-22 Thread Sunil Mushran
Did you monitor /proc/mounts as I had suggested? -Original Message- From Mark Schloss <[EMAIL PROTECTED]> Sent Mon 7/21/2008 9:22 PM To Sunil Mushran <[EMAIL PROTECTED]> Cc ocfs2-users@oss.oracle.com Subject Re: [Ocfs2-users] OCFS processes active after a umount [SEC=UNO

Re: [Ocfs2-users] OCFS processes active after a umount [SEC=UNOFFICIAL]

2008-07-21 Thread Sunil Mushran
That is strange. Next time double check the mounts with: $ cat /proc/mounts The mount command prints the entries in /etc/mtab while the /proc/mounts dumps the information from the kernel. If those threads are there, it means the volume is still mounted. Two in this case. The entries in mtab are
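
Checking the kernel's view of the mounts, as suggested above, can also be scripted. The following is a rough sketch only: the function name and the sample text are invented for illustration, and it assumes the usual whitespace-separated /proc/mounts layout (device, mount point, fs type, options, ...).

```python
def ocfs2_mounts(mounts_text):
    """Return (device, mountpoint) pairs for ocfs2 entries in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # fields[2] is the file system type in /proc/mounts
        if len(fields) >= 3 and fields[2] == "ocfs2":
            found.append((fields[0], fields[1]))
    return found

# Hypothetical sample; on a live system pass open("/proc/mounts").read() instead.
sample = (
    "/dev/sda1 / ext3 rw 0 0\n"
    "/dev/sdb1 /u01 ocfs2 rw,_netdev,heartbeat=local 0 0\n"
)
print(ocfs2_mounts(sample))
```

An empty result from the live /proc/mounts while the ocfs2 threads are still running would be worth reporting; a non-empty one means the volume is still mounted as far as the kernel is concerned, whatever /etc/mtab says.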

Re: [Ocfs2-users] ocfs2 performance and scaling

2008-07-17 Thread Sunil Mushran
Sabuj Pattanayek wrote: > Hi, > > I'm using OCFS2 from 2.6.26 with some patches I made that allow for > the creation of a volume greater than 16TB: > > http://oss.oracle.com/pipermail/ocfs2-devel/2008-July/002568.html > http://oss.oracle.com/pipermail/ocfs2-tools-devel/2008-July/000857.html > > The

Re: [Ocfs2-users] Much higher disk usage in OCFS2 then in XFS

2008-07-15 Thread Sunil Mushran
That's 175 million files. I hope they are spread out across many directories. Our inodes are blocksized. 4k blocksize means 700G of metadata. 2K means 350G. 1K means 175G. AFAIK, XFS has 256 byte inodes. Maybe try 1K blocksize and 8K clustersize. You would be an ideal candidate for the inlined
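
The metadata figures above follow from one block-sized inode per file. A small sketch of that arithmetic, with a made-up function name and 1 GB taken as 10**6 KB to match the round numbers in the post:

```python
def inode_metadata_gb(num_files, inode_kb):
    # One inode per file; in ocfs2 the inode occupies a full block,
    # so inode_kb is the block size in KB.
    return num_files * inode_kb / 10**6

files = 175_000_000
for block_kb in (4, 2, 1):
    print(f"{block_kb}K blocksize: {inode_metadata_gb(files, block_kb):.0f}G")

# For comparison, a 256-byte inode (the XFS figure quoted above, unverified here)
print(f"256-byte inodes: {inode_metadata_gb(files, 0.25):.2f}G")
```

This reproduces the 700G, 350G and 175G numbers and shows why the block size dominates the space difference for very large file counts.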

Re: [Ocfs2-users] Node fence on RHEL4 machine running 1.2.8-2

2008-07-14 Thread SUNIL . MUSHRAN
File a bugzilla with the logs of all the machines. /var/log/messages. Meanwhile do schedule an upgrade to 1.2.9-1. We have one fix relating to o2net fencing that could have been in play here. But I'll need to read the full logs to be sure. Sunil --- Begin Message --- Hello, We have a four-node R

Re: [Ocfs2-users] Fence abnormal and with not apparent reason

2008-07-11 Thread Sunil Mushran
If you are still on 1.2.8-2, then it is a known issue fixed in 1.2.9-1. Gabriele Di Giambelardini wrote: > Hi to all, watching the log with more attention at the moment when a > node goes down, I have this information from the kernel about o2net : > > Jul 10 16:52:02 be1 kernel: BUG: soft lockup -

Re: [Ocfs2-users] Different size with du and ls

2008-07-10 Thread Sunil Mushran
Markus Meyer wrote: > Block Size Bits: 12 Cluster Size Bits: 16 > Links: 0 Clusters: 6707596 So you have a 4TB volume. Correct? Appears mkfs chose 64K as the cluster size. This means the smallest data allocation would be 64K. >File: `/mnt/user/small/11/11wa1.jpg' >Size
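
The "Bits" fields in the stats output are powers of two, and the cluster size sets the smallest data allocation. A simplified sketch of how that makes du and ls disagree for small files; the function name and the 5000-byte file are hypothetical, and it ignores metadata and any other allocation details.

```python
def allocated_bytes(file_size, cluster_size_bits):
    # Data is allocated in whole clusters, so round the size up
    # to the next multiple of the cluster size.
    cluster = 1 << cluster_size_bits
    clusters_needed = -(-file_size // cluster)  # ceiling division
    return clusters_needed * cluster

block_size = 1 << 12     # Block Size Bits: 12
cluster_size = 1 << 16   # Cluster Size Bits: 16
print(block_size, cluster_size)

# A hypothetical 5000-byte image still consumes one full cluster on disk.
print(allocated_bytes(5000, 16))
```

ls reports the 5000 bytes of file data, while du reports the space allocated, which on this volume cannot be less than one 64K cluster per non-empty file.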

Re: [Ocfs2-users] Different size with du and ls

2008-07-10 Thread Sunil Mushran
Email me the following info: $ debugfs.ocfs2 -R "stats" /dev/sdX <== replace with ocfs2 device $ stat /mnt/user/small/11/11wa1.jpg $ stat /data/user/small/11/11wa1.jpg Markus Meyer wrote: > Hi all, > > I stumbled over a curious thing. The Linux tools "df" and "du" aren't > working cor

Re: [Ocfs2-users] ocfs2 datavolume option and oracle

2008-07-08 Thread Sunil Mushran
[EMAIL PROTECTED] wrote: > By raw are you meaning raw device access without a filesystem like ocfs2 > on the volume for the voting disk? Or am I not following? > Raw means specifying the block device directly. So make two partitions, say, sdd1 and sdd2, and feed that (/dev/sdd1, etc) to the to

Re: [Ocfs2-users] ocfs2 datavolume option and oracle

2008-07-08 Thread SUNIL . MUSHRAN
RAC is only supported on (RH)EL and SLES. It may work with other distros, but support is a different beast. datavolume mount option is not in mainline kernel. Oracle 10g onwards, the database itself does not require it... as one can set filesystemio_options to directio (init.ora param). That way

Re: [Ocfs2-users] ocfs2 datavolume option and oracle

2008-07-08 Thread SUNIL . MUSHRAN
In mainline, the issue was addressed in 2.6.21. In enterprise kernels, the issue was addressed in 1.2.4-2. If you are on (RH)EL4 or (RH)EL5, install 1.2.9-1. If you are on SLES9 SP4 or SLES10 SP1, upgrade to the latest kernel. Sunil --- Begin Message --- From the User's Guide: > Oracle databas

[Ocfs2-users] OCFS2 1.2.9-1 for Novell's SLES9 SP4 and SLES10 SP1 released

2008-07-07 Thread Sunil Mushran
All, This is to inform all SLES users that the latest SLES9 SP4 and SLES10 SP1 kernel errata include OCFS2 1.2.9-1. Users running the older kernel with 1.2.8-1 are urged to upgrade to the current release. For more information on the changes in OCFS2, please refer to the email announcing 1.2.9-1

Re: [Ocfs2-users] Question regarding old memory leak

2008-07-07 Thread SUNIL . MUSHRAN
In mainline, that issue was resolved in 2.6.21. We have patches for 2.6.20 but not older than that. Sunil --- Begin Message --- Hi all, I just started with OCFS2 and set up a 2-node cluster where one node is writing and both read from the clustered volume. Currently I'm moving data to the volum

Re: [Ocfs2-users] Howto compile ocfs2-1.3.9-0.1 for CentOS5.2 2.6.18-92.1.6.el5xen +compile messages

2008-07-04 Thread SUNIL . MUSHRAN
./configure --with-src=/usr/src/kernels/2.6.18-92.1.6.el5-x86_64 make --- Begin Message --- > Post the exact command and the error message. This is a selection of the output, I have attached the complete output. [EMAIL PROTECTED] ocfs2-1.3.9]# ./configure checking build system type...

Re: [Ocfs2-users] Ooops in OCFS2

2008-07-04 Thread SUNIL . MUSHRAN
1.2.5 is a year+ old. Suggest you upgrade to 1.2.9. The oops is bizarre to say the least. I notice you are using xenU kernel. 4 nodes are VMs? Just trying to understand the layout. Is it reproducible? Definitely upgrade to 1.2.9. If the issue reproduces, file a bugzilla with all the details. This

Re: [Ocfs2-users] Howto compile ocfs2-1.3.9-0.1 for CentOS5.2 2.6.18-92.1.6.el5xen

2008-07-04 Thread SUNIL . MUSHRAN
Post the exact command and the error message. --- Begin Message --- Hello, I would like to install OCFS2 on CentOS5.2 with the 2.6.18-92.1.6.el5xen kernel. When I try to compile the OCFS2 source for this kernel I see a message nothing todo for rhel5 in the vendor section. What I did was the fol

Re: [Ocfs2-users] ocfs2 limits

2008-07-03 Thread Sunil Mushran
y have hundreds of thousands files > inside the concurrent output and log directories. > > Regards, > Luis > > --- On *Thu, 7/3/08, Sunil Mushran /<[EMAIL PROTECTED]>/* wrote: > > From: Sunil Mushran <[EMAIL PROTECTED]> > Subject: Re: [Ocfs2-users] o

Re: [Ocfs2-users] ocfs2 fencing problem

2008-07-03 Thread Sunil Mushran
Gabriele Di Giambelardini wrote: > Hi to all, some time ago I read that ocfs has a limit on the number of > subfolders. Is it possible that when this limit is exceeded, > ocfs2 has this problem? > > > Or does somebody know the limit number? http://oss.oracle.com/projects/ocfs2/dist/documentat

Re: [Ocfs2-users] "Propagate Configuration" missing

2008-07-02 Thread Sunil Mushran
What distro, kernel, package versions, etc? http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#DOWNLOAD Check the requirements for the console. You could be missing a package that enables propagate config. [EMAIL PROTECTED] wrote: > I've been attempting to follow the instruct

Re: [Ocfs2-users] Slow backups, slow rsync

2008-07-01 Thread Sunil Mushran
34359738367 kB > VmallocUsed:293184 kB > VmallocChunk: 34359444779 kB > HugePages_Total: 0 > HugePages_Free: 0 > HugePages_Rsvd: 0 > HugePages_Surp: 0 > Hugepagesize: 2048 kB > > 5 nodes in the cluster. > > Michael > > -Ori

Re: [Ocfs2-users] Slow backups, slow rsync

2008-07-01 Thread Sunil Mushran
Which kernel (uname -a)? What are the block/cluster sizes (debugfs.ocfs2 -R "stats" /dev/sdX)? How many nodes are in your cluster? How much memory (cat /proc/meminfo)? Michael Moody wrote: > > I use rsync to take backups of my ocfs2 filesystems (since nothing > else really supports it out of the box). Unfortunately, it’s ve
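The details requested in this reply can be gathered in one pass. A minimal sketch, assuming a Linux node: /dev/sdX is the placeholder device from the reply, and the DEVICE variable and report file name are made up for illustration.

```shell
# Sketch: collect the information asked for above into one report file.
# DEVICE defaults to the placeholder from the reply; point it at the real OCFS2 device.
DEVICE="${DEVICE:-/dev/sdX}"
report="${TMPDIR:-/tmp}/ocfs2-report.$$"   # hypothetical file name

{
    echo "== uname -a =="
    uname -a
    echo "== debugfs.ocfs2 stats for $DEVICE =="
    # Only run debugfs.ocfs2 if the tools are installed and the device exists.
    if command -v debugfs.ocfs2 >/dev/null 2>&1 && [ -b "$DEVICE" ]; then
        debugfs.ocfs2 -R "stats" "$DEVICE"
    else
        echo "skipped: debugfs.ocfs2 or $DEVICE not available"
    fi
    echo "== /proc/meminfo =="
    if [ -r /proc/meminfo ]; then
        cat /proc/meminfo
    else
        echo "skipped: /proc/meminfo not readable"
    fi
} > "$report" 2>&1

echo "report written to $report"
```

Run it once per node and attach the resulting files when replying to the list.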

Re: [Ocfs2-users] ocfs2 fencing problem

2008-07-01 Thread Sunil Mushran
Upgrade to OCFS2 1.2.9-1 shipping with the latest SLES9 SP4 kernel (2.6.5-7.312). http://download.novell.com/Download?buildid=27kCZ1qWwWo~ You are most likely hitting bug#6680001 as mentioned here. http://oss.oracle.com/projects/ocfs2/news/article_17.html Also, you might want to tone down the he

Re: [Ocfs2-users] Problems building ocfs2 rpm on Fedora 9

2008-07-01 Thread Sunil Mushran
My recommendation is for you to use an enterprise kernel... (rh)el or sles. For shared storage, use iscsi. sles10 ships with a good iscsi target. Firewire as a shared disk was useful when there was no inexpensive shared disk available. That is no longer the case. Tina Soles wrote: > Sunil, > > I a

Re: [Ocfs2-users] Problems building ocfs2 rpm on Fedora 9

2008-06-30 Thread Sunil Mushran
The mount(2) man page lists the following reasons for it to return EBUSY: source is already mounted. Or, it cannot be remounted read-only, because it still holds files open for writing. Or, it cannot be mounted on target because target is still busy (it is the working directory of some

Re: [Ocfs2-users] Fence abnormal and with not apparent reason

2008-06-30 Thread Sunil Mushran
Could be due to bugzilla#919 as explained in the list of fixes in 1.2.9-1. http://oss.oracle.com/projects/ocfs2/news/article_18.html Gabriele Di Giambelardini wrote: > Hi, this is my output on all the 5 servers > > Module "configfs": Loaded > Filesystem "configfs": Mounted > Module "ocfs2_nodemana

Re: [Ocfs2-users] Problems building ocfs2 rpm on Fedora 9

2008-06-29 Thread SUNIL . MUSHRAN
ROTECTED] Sent: Sunday, June 29, 2008 10:16 PM To: Tina Soles Cc: Sunil Mushran; ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] Problems building ocfs2 rpm on Fedora 9 Hi Tina, Sorry, I am not a RAC expert, so you may have to wait for Sunil's suggestion on it. If you

Re: [Ocfs2-users] Problems building ocfs2 rpm on Fedora 9

2008-06-27 Thread Sunil Mushran
ply. Can you be more specific and give me the exact > name of the native Fedora 9 rpm(s) that I need for ocfs2 and ocfs2-tools? > Thanks. > ---- > *From:* Sunil Mushran [mailto:[EMAIL PROTECTED] > *Sent:* Fri 6/27/

Re: [Ocfs2-users] Problems building ocfs2 rpm on Fedora 9

2008-06-27 Thread Sunil Mushran
Fedora ships ocfs2 fs modules natively. You don't have to do all this. What is missing is the tools rpm. But the good news is that it should be available any day now, literally speaking. Tina Soles wrote: > > Hello, > > I’m brand new to RAC and ocfs2. I need to install ocfs2, but there is >

Re: [Ocfs2-users] Invalid argument while mounting

2008-06-25 Thread Sunil Mushran
$ dd if=/dev/sdd1 of=/tmp/out count=100 bs=4K Can you file a bugzilla and attach /tmp/out? Looks like it is unable to read the inode because its signature is off. The above command will dump the first 400K of the device. I want to see what the extent of the corruption is. If it is localized
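The 400K figure follows from count times block size: 100 blocks of 4K each. A minimal sketch of that arithmetic, assuming GNU dd (where the K suffix means 1024 bytes) and run against a scratch file instead of the real device; the temporary file names are illustrative only.

```shell
# Sketch only: reproduce the dd size arithmetic on a throwaway file, not on /dev/sdd1.
src=$(mktemp)
out=$(mktemp)

# Create a 1 MiB source so there is more data available than will be copied.
dd if=/dev/zero of="$src" bs=1024 count=1024 2>/dev/null

# Same count and block size as the command above: 100 blocks of 4K (4096 bytes).
dd if="$src" of="$out" count=100 bs=4K 2>/dev/null

# Size of the dump in bytes: 100 * 4096 = 409600, i.e. 400K.
dump_bytes=$(wc -c < "$out" | tr -d ' ')
echo "$dump_bytes"

rm -f "$src" "$out"
```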

Re: [Ocfs2-users] mount readonly without lockmanager

2008-06-25 Thread SUNIL . MUSHRAN
Kruyt [EMAIL PROTECTED] Sent: Wednesday, June 25, 2008 8:41 AM To: Sunil Mushran Cc: ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] mount readonly without lockmanager Thanks, but the filesystem is on a SAN and shared across 3 nodes. On one of those nodes I don't want a lock manager and the system

Re: [Ocfs2-users] mount readonly without lockmanager

2008-06-24 Thread Sunil Mushran
Sure. You can mark the volume as local (man tunefs.ocfs2 or mkfs.ocfs2) and mount it without the cluster stack (like any local file system). You can use the ro mount option to mount it readonly. Combine the two and you get what you want. BTW, if the fs image is on physical ro media, the fs autom
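The two steps in this reply might look roughly like the sketch below. It is a dry run by default and only prints the commands. The device and mount point are placeholders, the run helper is hypothetical, and -M local is the mount-type option as I understand it from the tunefs.ocfs2 man page, so confirm it there, as the reply suggests, before running anything for real.

```shell
# Sketch of the two steps described above. Nothing is executed by default:
# with DRY_RUN=1 the commands are only printed. Device and mount point are placeholders.
DEVICE="${DEVICE:-/dev/sdX1}"
MOUNTPOINT="${MOUNTPOINT:-/mnt/ocfs2}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

plan=$(
    # 1. Mark the volume as local (assumed option; check man tunefs.ocfs2).
    run tunefs.ocfs2 -M local "$DEVICE"
    # 2. Mount it read-only, like any local file system, without the cluster stack.
    run mount -t ocfs2 -o ro "$DEVICE" "$MOUNTPOINT"
)
echo "$plan"
```

Note that the local flag is a property of the volume, not of a single node.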

Re: [Ocfs2-users] Invalid argument while mounting

2008-06-24 Thread SUNIL . MUSHRAN
Run fsck to repair that inode: fsck.ocfs2 -f /dev/sdd1 Also, it would be better to upgrade the fs to 1.2.9-1. --- Begin Message --- I get the following error when trying to mount: Nothing changed (in any case not that I know of). I get the same error from both nodes. Please assist. # mount /u02 moun

Re: [Ocfs2-users] crash during big file transfers

2008-06-23 Thread Sunil Mushran
g for help on compiling the kernel module, so that we > can have an updated one for the kernels 2.6.21.5 and 2.6.24.5 distributed with > Slackware 12.0 and 12.1. > > Thanks, > Carlos Xavier. > > - Original Message - > From: "Sunil Mushran" <[EMAIL PROT
