Re: [Ocfs2-users] ocfs2_delete_inode kernel bug
On Thu, Oct 28, 2010 at 02:16:03AM -0200, Andre Nathan wrote:
> Hello Sunil
>
> The errors happened again, but now I think it may be completely fixed.
> I only got the -17 error for a single inode this time:

Are you having more than one machine access the same disk without being in the same cluster? I would hope not, but something is weird here.

Joel

--
None of our men are experts. We have most unfortunately found it necessary to get rid of a man as soon as he thinks himself an expert -- because no one ever considers himself expert if he really knows his job. A man who knows a job sees so much more to be done than he has done, that he is always pressing forward and never gives up an instant of thought to how good and how efficient he is. Thinking always ahead, thinking always of trying to do more, brings a state of mind in which nothing is impossible. The moment one gets into the expert state of mind a great number of things become impossible.
- From Henry Ford Sr., My Life and Work

Joel Becker
Senior Development Manager
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127

___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
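A note on the "-17" mentioned in the quoted message: assuming it is a kernel-style status code (a negated errno value, which is an assumption since the log line itself is not shown in this thread), it can be decoded with a few lines of Python. The helper name `describe_kernel_status` is hypothetical and only for illustration; on Linux, errno 17 is EEXIST.

```python
import errno
import os


def describe_kernel_status(status):
    """Map a kernel-style negative status (e.g. -17) to (symbol, message).

    Assumes the status is a negated errno value, as kernel code
    conventionally returns; this is not taken from the OCFS2 sources.
    """
    code = abs(status)
    # errno.errorcode maps numeric codes to symbolic names such as "EEXIST".
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = describe_kernel_status(-17)
    print(f"-17 -> {name}: {message}")
```

This only translates the number into a symbolic name; it says nothing about why the filesystem reported it.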
Re: [Ocfs2-users] ocfs2_delete_inode kernel bug
On Tue, Oct 26, 2010 at 02:59:17PM -0200, Andre Nathan wrote:
> My setup is the following: I have two servers sharing OCFS2 filesystems
> through one dedicated 10Gbps interface.

Where are the disks? I'm guessing they're on an iSCSI server at the other end of the 10Gbps interface, but what is the technology there?

> There is also a backup server which exports OCFS2-formatted devices via
> ATA-over-ethernet. This machine is connected to each server through
> standard gigabit ethernet interfaces. The two servers mount their
> respective volumes and run the backup script.

Let me see if I understand this. The backup server has disks. Those disks are formatted for ocfs2. The backup server does NOT mount these disks, it merely exports them via AoE. Each server mounts its own AoE disk as ocfs2 and writes backup data to the ocfs2 filesystems there. Is this correct?

Are those filesystems on the AoE disks clustered or in local mode? Does the backup server ever mount those filesystems? Are the errors you see on the live disks (the 10Gbps iSCSI ones) or on the backup disks (AoE)? What machine sees the errors, the two servers or the backup server?

--
It is not the function of our government to keep the citizen from falling into error; it is the function of the citizen to keep the government from falling into error.
- Robert H. Jackson
Re: [Ocfs2-users] No space left on device error, older kernels?
On Thu, Oct 28, 2010 at 12:09:59AM +0300, Antonis Kopsaftis wrote:
> Even if 2.6.18 is too old a kernel, it is the kernel used by the
> current production versions (5.x) of the Red Hat Enterprise distros
> (and all their rebuilds: CentOS, SL, ...).

You can easily get the Unbreakable kernel on those distros. We understand your concern, as there are a lot of people running into this issue, but we feel that running the Unbreakable kernel is far less risky than backporting features of this size. If it was a simple fix, we wouldn't be having this conversation. ;-)

Joel

--
Where are my angels? Where's my golden one? And where is my hope Now that my heroes are gone?
Re: [Ocfs2-users] No space left on device error, older kernels?
Hello,

I searched for information about the Unbreakable kernel, and from what I found I came to these conclusions:

1. To use the Unbreakable kernel you first have to upgrade your distro to Oracle Linux. This upgrade is only available for Red Hat Linux and not for the free rebuilds (e.g. CentOS).
2. To upgrade to Oracle Linux you have to BUY a support contract.

So for me, as a Scientific Linux user who wants to keep using OCFS, the Unbreakable kernel is not a solution, as I cannot upgrade easily and would have to reinstall everything.

The only solutions I can come up with (without reinstalling my production servers) are:

1. Switch to another clustered filesystem... :-(
2. Wait for SL6 and see if there is an easy way to upgrade from SL5 to 6.

Finally, I would like to say that I do not fault your decision to drop support for Red Hat and Red Hat-like 5.x distros, as it is true that the kernel these distros run is old. But to be fair, there is no information about this decision on the official site http://oss.oracle.com/projects/ocfs2/, not even in the top reported section or the FAQ.

Regards,
Kopsaftis Antonis
Re: [Ocfs2-users] ocfs2_delete_inode kernel bug
On 28/10/2010, Joel Becker <joel.bec...@oracle.com> wrote:
> Where are the disks? I'm guessing they're on an iSCSI server at
> the other end of the 10Gbps interface, but what is the technology there?

There are two servers, each with 16 local SATA disks. The servers are connected to each other through the 10Gbps interface. The disks are arrayed in RAID1 pairs done via hardware, and therefore the OS sees 8 disks. These 8 disks are configured in an active-active DRBD setup between the two machines. The DRBD devices are formatted as OCFS2. The OCFS2 cluster configuration is done using the same 10Gbps interface used by DRBD.

> Let me see if I understand this. The backup server has disks.
> Those disks are formatted for ocfs2. The backup server does NOT mount
> these disks, it merely exports them via AoE. Each server mounts its own
> AoE disk as ocfs2 and writes backup data to the ocfs2 filesystems there.
> Is this correct?

Yes.

> Are those filesystems on the AoE disks clustered or in local
> mode? Does the backup server ever mount those filesystems? Are the
> errors you see on the live disks (the 10Gbps iSCSI ones) or on the
> backup disks (AoE)? What machine sees the errors, the two servers or
> the backup server?

The backup server never mounts its filesystems. The errors I reported always happen on the two servers, never on the backup server. I have also run fsck on all backup filesystems, but never found any errors to be corrected.

Thanks,
Andre
Re: [Ocfs2-users] No space left on device error, older kernels?
You don't need to buy a support license to use Oracle Linux. You can download and use it for free. You can configure your system to get the Unbreakable Enterprise Kernel from public-yum.oracle.com.

You don't need to upgrade to Oracle Linux to install the Unbreakable Kernel, as long as the distro you are using is binary compatible with Red Hat / Oracle Linux. If you added third-party modules to your system, you will have to get new ones for the Unbreakable Kernel. Otherwise you should be able to just drop it in place.

Oracle has not dropped support for OCFS2 on EL5. Yet. But some problems are too difficult to fix in the older version. The decision was made that fixing this particular problem in the old OCFS2 version was more risky than living with the problem.

Thanks,
Herbert.
Re: [Ocfs2-users] ocfs2_delete_inode kernel bug
On Thu, Oct 28, 2010 at 12:19:14PM -0200, an...@digirati.com.br wrote:
> On 28/10/2010, Joel Becker <joel.bec...@oracle.com> wrote:
> > Where are the disks? I'm guessing they're on an iSCSI server at
> > the other end of the 10Gbps interface, but what is the technology there?
>
> There are two servers, each with 16 local SATA disks. The servers are
> connected to each other through the 10Gbps interface. The disks are
> arrayed in RAID1 pairs done via hardware, and therefore the OS sees 8
> disks. These 8 disks are configured in an active-active DRBD setup
> between the two machines. The DRBD devices are formatted as OCFS2.
> The OCFS2 cluster configuration is done using the same 10Gbps
> interface used by DRBD.

[snip]

> The backup server never mounts its filesystems. The errors I reported
> always happen on the two servers, never on the backup server. I have
> also run fsck on all backup filesystems, but never found any errors to
> be corrected.

I'm starting to think that DRBD isn't keeping a consistent view of the devices between your servers.

Joel

--
Life's Little Instruction Book #157
Take time to smell the roses.
Re: [Ocfs2-users] No space left on device error, older kernels?
On Thu, Oct 28, 2010 at 04:02:47PM +0300, Antonis Kopsaftis wrote:
> I searched for information about the Unbreakable kernel, and from what
> I found I came to these conclusions:
> 1. To use the Unbreakable kernel you first have to upgrade your distro
> to Oracle Linux. This upgrade is only available for Red Hat Linux and
> not for the free rebuilds (e.g. CentOS).
> 2. To upgrade to Oracle Linux you have to BUY a support contract.

As Herbert pointed out, you can get the packages from public-yum.oracle.com right now. The Unbreakable kernel should install on your EL5-based system with perhaps a couple of dependent package upgrades. If you want us to answer phone calls about it, you need a support contract, but given that you are running SL and not RHEL/OL, I don't think you care about that.

Joel

--
Life's Little Instruction Book #267
Lie on your back and look at the stars.