Re: [Ocfs2-users] ocfs2_delete_inode kernel bug

2010-10-28 Thread Joel Becker
On Thu, Oct 28, 2010 at 02:16:03AM -0200, Andre Nathan wrote:
> Hello Sunil
>
> The errors happened again, but now I think it may be completely fixed. I
> only got the -17 error for a single inode this time:

Are you having more than one machine access the same disk
without being in the same cluster?  I would hope not, but something is
weird here.

Joel
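
For readers following the thread: the -17 in the quoted report is presumably a negated errno value, which is how kernel code conventionally reports failures. On Linux, errno 17 is EEXIST ("File exists"). A minimal Python sketch for decoding such codes follows; the helper name `describe_kernel_errno` is hypothetical and not part of any OCFS2 tooling.

```python
import errno
import os


def describe_kernel_errno(code):
    """Map a (possibly negative) kernel-style return code to (name, message).

    Hypothetical helper for illustration only. Kernel functions
    conventionally return -errno on failure, so the sign is dropped
    before the lookup.
    """
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


if __name__ == "__main__":
    name, message = describe_kernel_errno(-17)
    print(f"-17 -> {name}: {message}")
```

The same lookup works for other codes that show up in OCFS2 logs, such as -28 (ENOSPC), the "No space left on device" error discussed in the other thread.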

-- 

None of our men are experts.  We have most unfortunately found
it necessary to get rid of a man as soon as he thinks himself an
expert -- because no one ever considers himself expert if he really
knows his job.  A man who knows a job sees so much more to be done
than he has done, that he is always pressing forward and never
gives up an instant of thought to how good and how efficient he is.
Thinking always ahead, thinking always of trying to do more, brings
a state of mind in which nothing is impossible. The moment one gets
into the expert state of mind a great number of things become
impossible.
- From Henry Ford Sr., My Life and Work

Joel Becker
Senior Development Manager
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127

___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] ocfs2_delete_inode kernel bug

2010-10-28 Thread Joel Becker
On Tue, Oct 26, 2010 at 02:59:17PM -0200, Andre Nathan wrote:
> My setup is the following: I have two servers sharing OCFS2 filesystems
> through one dedicated 10Gbps interface.

Where are the disks?  I'm guessing they're on an iSCSI server at
the other end of the 10Gbps interface, but what is the technology there?

> There is also a backup server which exports OCFS2-formatted devices via
> ATA-over-ethernet. This machine is connected to each server through
> standard gigabit ethernet interfaces. The two servers mount their
> respective volumes and run the backup script.

Let me see if I understand this.  The backup server has disks.
Those disks are formatted for ocfs2.  The backup server does NOT mount
these disks, it merely exports them via AoE.  Each server mounts its own
AoE disk as ocfs2 and writes backup data to the ocfs2 filesystems there.
Is this correct?
Are those filesystems on the AoE disks clustered or in local
mode?  Does the backup server ever mount those filesystems?  Are the
errors you see on the live disks (the 10Gbps iSCSI ones) or on the
backup disks (AoE)?  What machine sees the errors, the two servers or
the backup server?


-- 

It is not the function of our government to keep the citizen from
 falling into error; it is the function of the citizen to keep the
 government from falling into error.
- Robert H. Jackson

Joel Becker


Re: [Ocfs2-users] No space left on device error, older kernels?

2010-10-28 Thread Joel Becker
On Thu, Oct 28, 2010 at 12:09:59AM +0300, Antonis Kopsaftis wrote:
> Even if 2.6.18 is too old a kernel, it is the kernel used by the
> current production versions (5.x) of the Red Hat Enterprise distros
> (and all their derivatives: CentOS, SL, ...).

You can easily get the Unbreakable kernel on those distros.  We
understand your concern, as there are a lot of people running into this
issue, but we feel that running the Unbreakable kernel is far less risky
than backporting features of this size.
If it was a simple fix, we wouldn't be having this conversation.
;-)

Joel

-- 

Where are my angels?
 Where's my golden one?
 And where is my hope
 Now that my heroes are gone?

Joel Becker


Re: [Ocfs2-users] No space left on device error, older kernels?

2010-10-28 Thread Antonis Kopsaftis
Hello,

I searched for information about the Unbreakable kernel, and from what
I found I came to these conclusions:
1. To use the Unbreakable kernel you have to upgrade your distro to
   Oracle Linux first. This upgrade is only available for Red Hat Linux
   and not for the free derivatives (e.g. CentOS).
2. To upgrade to Oracle Linux you have to BUY a support contract.

So for me, as a Scientific Linux user who wants to keep using OCFS, the
Unbreakable kernel is not a solution, since I cannot upgrade easily and
would have to reinstall everything.

The only solutions I can come up with (without reinstalling my
production servers) are:
1. Switch to another clustered filesystem... :-(
2. Wait for SL6 and see if there is an easy way to upgrade from SL5 to 6.

Finally, I would like to say that I don't judge your decision to drop
support for Red Hat and Red Hat-like 5.x distros, as it is true that the
kernel these distros run is old.
But to be fair, there is no information about this decision on the
official site, http://oss.oracle.com/projects/ocfs2/, not even in the
top reported section or the FAQ.

Regards,
Kopsaftis Antonis


Re: [Ocfs2-users] ocfs2_delete_inode kernel bug

2010-10-28 Thread andre

On 28/10/2010, Joel Becker <joel.bec...@oracle.com> wrote:
> Where are the disks?  I'm guessing they're on an iSCSI server at
> the other end of the 10Gbps interface, but what is the technology there?

There are two servers, each with 16 local SATA disks. The servers are
connected to each other through the 10Gbps interface. The disks are
arrayed in hardware RAID1 pairs, so the OS sees 8 disks. These 8 disks
are configured in an active-active DRBD setup between the two machines.
The DRBD devices are formatted as OCFS2. The OCFS2 cluster
configuration uses the same 10Gbps interface as DRBD.

> Let me see if I understand this.  The backup server has disks.
> Those disks are formatted for ocfs2.  The backup server does NOT mount
> these disks, it merely exports them via AoE.  Each server mounts its own
> AoE disk as ocfs2 and writes backup data to the ocfs2 filesystems there.
> Is this correct?

Yes.

> Are those filesystems on the AoE disks clustered or in local
> mode?  Does the backup server ever mount those filesystems?  Are the
> errors you see on the live disks (the 10Gbps iSCSI ones) or on the
> backup disks (AoE)?  What machine sees the errors, the two servers or
> the backup server?

The backup server never mounts its filesystems. The errors I reported
always happen on the two servers, never on the backup server. I have
also run fsck on all backup filesystems, but never found any errors to
be corrected.

Thanks,
Andre

Re: [Ocfs2-users] No space left on device error, older kernels?

2010-10-28 Thread Herbert van den Bergh

You don't need to buy a support license to use Oracle Linux.  You can 
download and use it for free.  You can configure your system to get the 
Unbreakable Enterprise Kernel from public-yum.oracle.com.

You don't need to upgrade to Oracle Linux to install the Unbreakable 
Kernel, as long as the distro you are using is binary compatible with 
Red Hat / Oracle Linux.  If you added third party modules to your 
system, you will have to get new ones for the Unbreakable Kernel.  
Otherwise you should be able to just drop it in place.

Oracle has not dropped support for OCFS2 on EL5.  Yet.  But some 
problems are too difficult to fix in the older version.  The decision 
was made that fixing this particular problem in the old OCFS2 version 
was more risky than living with the problem.

Thanks,
Herbert.



Re: [Ocfs2-users] ocfs2_delete_inode kernel bug

2010-10-28 Thread Joel Becker
On Thu, Oct 28, 2010 at 12:19:14PM -0200, an...@digirati.com.br wrote:
> On 28/10/2010, Joel Becker <joel.bec...@oracle.com> wrote:
> > Where are the disks?  I'm guessing they're on an iSCSI server at
> > the other end of the 10Gbps interface, but what is the technology there?
> There are two servers, each with 16 local SATA disks. The servers are
> connected to each other through the 10Gbps interface. The disks are
> arrayed in hardware RAID1 pairs, so the OS sees 8 disks. These 8 disks
> are configured in an active-active DRBD setup between the two machines.
> The DRBD devices are formatted as OCFS2. The OCFS2 cluster
> configuration uses the same 10Gbps interface as DRBD.
<snip>
> The backup server never mounts its filesystems. The errors I reported
> always happen on the two servers, never on the backup server. I have
> also run fsck on all backup filesystems, but never found any errors to
> be corrected.

I'm starting to think that DRBD isn't keeping a consistent view
of the devices between your servers.

Joel
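
One way to test this suspicion, sketched here under stated assumptions: with the OCFS2 volumes unmounted and replication idle on both nodes (otherwise in-flight writes produce false mismatches), compute per-chunk checksums of the same backing device on each server and compare the two lists. The function names and the 4 MiB chunk size below are illustrative choices, not anything from DRBD or OCFS2. DRBD also provides its own online verification, which is likely the better tool on a live system.

```python
import hashlib


def chunk_digests(path, chunk_size=4 * 1024 * 1024):
    """Return a list of SHA-256 hex digests, one per chunk_size-byte chunk.

    `path` may be a regular file or a block device node; it is read
    sequentially from start to end.
    """
    digests = []
    with open(path, "rb") as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:
                break
            digests.append(hashlib.sha256(chunk).hexdigest())
    return digests


def differing_chunks(digests_a, digests_b):
    """Return indices of chunks whose digests differ.

    A chunk that exists on only one side (size mismatch) also counts
    as a difference.
    """
    longest = max(len(digests_a), len(digests_b))
    diffs = []
    for i in range(longest):
        a = digests_a[i] if i < len(digests_a) else None
        b = digests_b[i] if i < len(digests_b) else None
        if a != b:
            diffs.append(i)
    return diffs
```

Each node would run `chunk_digests` locally and ship only the digest list to the other side, so the comparison does not need to move the device contents over the network. A non-empty result from `differing_chunks` gives the chunk offsets worth inspecting.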

-- 

Life's Little Instruction Book #157 

Take time to smell the roses.

Joel Becker


Re: [Ocfs2-users] No space left on device error, older kernels?

2010-10-28 Thread Joel Becker
On Thu, Oct 28, 2010 at 04:02:47PM +0300, Antonis Kopsaftis wrote:
> I searched for information about the Unbreakable kernel, and from what
> I found I came to these conclusions:
> 1. To use the Unbreakable kernel you have to upgrade your distro to
>    Oracle Linux first. This upgrade is only available for Red Hat Linux
>    and not for the free derivatives (e.g. CentOS).
> 2. To upgrade to Oracle Linux you have to BUY a support contract.

As Herbert pointed out, you can get the packages from
public-yum.oracle.com right now.  The Unbreakable kernel should install
on your EL5-based system with perhaps a couple dependent package
upgrades.
If you want us to answer phone calls about it, you need a
support contract, but given that you are running SL and not RHEL/OL, I
don't think you care about that.

Joel

-- 

Life's Little Instruction Book #267

Lie on your back and look at the stars.

Joel Becker