Re: "Trying to free nonexistent swap-page" error message.

2001-06-29 Thread Andreas Dilger

Johan Seland
> On one of our Linux Oracle servers the following messages have started
> to appear:
> 
> Jun 29 07:16:32 blanco kernel: swap_free: Trying to free nonexistent swap-page
> Jun 29 07:16:32 blanco kernel: swap_free: Trying to free nonexistent swap-page
> 
> I also find some of these:
> 
> Jun 29 06:25:01 blanco kernel: EXT2-fs error (device sd(8,10)): ext2_readdir: bad
> entry in directory #172258: rec_len %% 4 != 0 - offset=192, inode=812610409,
> rec_len=11833, name_len=115
> Jun 29 06:25:32 blanco kernel: EXT2-fs error (device sd(8,10)): ext2_readdir: bad
> entry in directory #172258: rec_len %% 4 != 0 - offset=192, inode=812610409,
> rec_len=11833, name_len=115
> 
> Machine is a 2x933MHz P3 with 2GB of memory. Kernel version is now
> 2.2.19, but the same problem appeared with 2.2.18 as well.

My first guess would be some sort of hardware/software problem with your
SCSI controller, cables, disk, etc.  I'm not sure about the swap problem,
but the ext2 problems are caused by corruption of the disk or memory.

It is not just a single-bit error either, because rec_len % 4 != 0 AND it
is larger than a page size, so the value is totally bogus, as is the inode
number.  Interestingly, converting the above ext2 numbers into ascii gives:

0x69 0x73 0x6f 0x30 0x39 0x2e 0x73 => iso09.s

(in the order they are laid out in ext2_dir_entry_2).  Coincidence or bug?
I would suggest a full fsck for the filesystem, as it is likely that there
are other problems.
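
The byte conversion above can be reproduced with a short sketch. It assumes
the little-endian x86 on-disk layout of ext2_dir_entry_2 (32-bit inode,
16-bit rec_len, 8-bit name_len, in that order) and a 4096-byte page; the
variable names and PAGE_SIZE constant are illustrative, not taken from the
kernel source.

```python
import struct

# Values copied from the EXT2-fs error message in the log.
inode = 812610409
rec_len = 11833
name_len = 115

# Assumption: 4096-byte pages, as on x86.
PAGE_SIZE = 4096

# Pack the three fields little-endian, with no padding, in the order they
# appear at the start of ext2_dir_entry_2: u32 inode, u16 rec_len, u8 name_len.
raw = struct.pack("<IHB", inode, rec_len, name_len)
decoded = raw.decode("ascii")

print(" ".join(f"0x{b:02x}" for b in raw))
print(decoded)

# The two sanity checks mentioned in the text: rec_len should be a
# multiple of 4 and should not exceed a page.
print("rec_len % 4 =", rec_len % 4)
print("rec_len > PAGE_SIZE:", rec_len > PAGE_SIZE)
```

If the seven bytes come out as printable text, the directory block was
most likely overwritten with file or name data rather than damaged by a
flipped bit.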

Now when you say "servers" do you mean you have the same problem on
multiple machines?  Are they identical, or different?

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/   -- Dogbert
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



"Trying to free nonexistent swap-page" error message.

2001-06-29 Thread Johan Simon Seland

Hello,

I have searched the archives for this error message before, but no one
seems to have given a good answer. (Though the question has been
posted before.) I am not sure if this is a kernel problem, a hardware
problem or an Oracle problem. (Or a combination of them.)

On one of our Linux Oracle servers the following messages have started
to appear:

Jun 29 07:16:32 blanco kernel: swap_free: Trying to free nonexistent swap-page
Jun 29 07:16:32 blanco kernel: swap_free: Trying to free nonexistent swap-page

They seem to always come in pairs, and usually with about three hours
between them. 

The database had to be restored from backup because of massive table
corruption recently, but these messages also appeared before we had to
restore it. (But we believe they might have caused the corruption.)

I also find some of these:

Jun 29 06:25:01 blanco kernel: EXT2-fs error (device sd(8,10)): ext2_readdir: bad 
entry in directory #172258: rec_len %% 4 != 0 - offset=192, inode=812610409, 
rec_len=11833, name_len=115
Jun 29 06:25:32 blanco kernel: EXT2-fs error (device sd(8,10)): ext2_readdir: bad 
entry in directory #172258: rec_len %% 4 != 0 - offset=192, inode=812610409, 
rec_len=11833, name_len=115

Machine is a 2x933MHz P3 with 2GB of memory. Kernel version is now
2.2.19, but the same problem appeared with 2.2.18 as well. The
database is in moderate to heavy use 24/7, with a lot of processes
(~500-3000) during business hours.

The machine has only 128MB of swap. Is this too little, given that it
has a full 2GB of memory?


 dmesg output from machine


The kernel is stock 2.2.19 with the following patch applied:

--- include/linux/tasks.h~  Wed Jan 17 14:45:54 2001
+++ include/linux/tasks.h   Wed Jan 17 14:46:39 2001
@@ -11,7 +11,7 @@
 #define NR_CPUS 1
 #endif
 
-#define NR_TASKS   512 /* On x86 Max about 4000 */
+#define NR_TASKS   4000/* On x86 Max about 4000 */
 
 #define MAX_TASKS_PER_USER (NR_TASKS/2)
 #define MIN_TASKS_LEFT_FOR_ROOT 4

--
Regards
Johan Seland
Programmer
Net Fonds ASA


