Re: Stuck: What to do with solid locks?

2001-04-04 Thread Colonel


In kernel.list, you wrote:
>
>Oh, I realize this.  I don't mind and even expect the occasional crash
>right now in the 2.4.x series, but the frequency of these crashes fall

Well, you say this, but
...more whiny post deleted...

>to begin to help fix this problem (or these problems).

Twice your tone plays different from your words.  A scan of lkml
should have shown you that your problem is not a major problem, it's
far more likely to be unique than general.

- ---

1) try a different distribution, RH is bleeding edge at times and the
problems may not be entirely within the kernel.  For example,
Slackware-current is a base (without any additions it runs a 2.4
kernel) I've used with 4 machines now, and the only problem was the
loopback fs hang of a month ago.

2) remove drivers & hardware goodies to see if stability improves.
change your typical application load to see what happens.  I actually
do this the other way, run a simple kernel and then add to it.

3) there are a lot of 2.4 kernels, over 40 variants, look thru the
ChangeLogs, maybe your hardware is mentioned someplace.  try them to
see if stability improves.


In short, try to reduce the possible areas for a bug, ideally getting
to a point where you can state : 2.4.X-Y with AAA locks while 2.4.X-Y
without AAA does not lock.  That will bring more attention.

oh, and

4) boot into your last working kernel when you want to accomplish
something.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



[CHECKER] __init functions called by non-__init

2001-04-04 Thread Dawson Engler

as per a suggestion from Jonathan ([EMAIL PROTECTED]) I wrote a simple
checker to warn when non-__init functions call __init functions or use
__initdata.  Before sending the entire list of "bugs" I want to make
sure they actually are errors rather than bugs in how I understand the
kernel.   So, as I understand it, there seem to be two error cases:

1. The best case: an init function calls a non-init, which in
turn calls an init:

void __init probe() { a(); }
void a() { b(); }
void __init b() { ... }

in this case, is the missing __init on 'a' only a performance
bug in that a's code won't be freed up?

2. The worst case: some random post-initialization routine
calls an __init routine which can cause the kernel to go into
hyperspace if the __init routine's code has been deleted.

Are these error cases correct?  Are there other interesting, related
__init problems to look for?  The docs I read were a little coy about
specifics.
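
The rule described above can be sketched as a toy check over a call graph. This is only an illustration, not the checker itself: the `struct fn` and `struct call` tables and the function `count_init_violations` are hypothetical names invented here, and the example data is the probe/a/b case from this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one function: its name and whether it is __init. */
struct fn {
        const char *name;
        int is_init;
};

/* One call edge, as indices into the function table. */
struct call {
        size_t caller;
        size_t callee;
};

/* Count the edges where a non-__init function calls an __init function. */
static int count_init_violations(const struct fn *fns,
                                 const struct call *calls, size_t ncalls)
{
        int bad = 0;

        assert(fns != NULL && calls != NULL);
        for (size_t i = 0; i < ncalls; i++) {
                const struct fn *from = &fns[calls[i].caller];
                const struct fn *to = &fns[calls[i].callee];

                if (!from->is_init && to->is_init)
                        bad++;
        }
        return bad;
}

/* The example from the message: probe (__init) -> a (plain) -> b (__init). */
static const struct fn example_fns[] = {
        { "probe", 1 }, { "a", 0 }, { "b", 1 },
};
static const struct call example_calls[] = {
        { 0, 1 },       /* probe calls a: allowed */
        { 1, 2 },       /* a calls b: flagged */
};
```

Only the a-to-b edge is flagged here. Telling case 1 from case 2 would additionally need to know whether `a` is reachable after initialization, which this sketch does not attempt.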

There were about 43 errors flagged.  Most seem to be case 1.  An example:

/u2/engler/mc/oses/linux/2.4.1/drivers/scsi/g_NCR5380.c:240:generic_NCR53C400A_setup: 
ERROR:INIT: non-init fn 'generic_NCR53C400A_setup' calling init fn 'internal_setup'
/u2/engler/mc/oses/linux/2.4.1/drivers/scsi/g_NCR5380.c:253:generic_DTC3181E_setup: 
ERROR:INIT: non-init fn 'generic_DTC3181E_setup' calling init fn 'internal_setup'


where if you look in the code, the flagged routine generic_NCR53C400A_setup 
does indeed not have __init:
void generic_NCR53C400A_setup (char *str, int *ints) {
        internal_setup (BOARD_NCR53C400A, str, ints);
}

but a nearby, almost identical routine does:

void __init generic_NCR53C400_setup (char *str, int *ints){
        internal_setup (BOARD_NCR53C400, str, ints);
}

This just seems to be a case where someone forgot to type the __init
in, thereby causing a (small) storage leak.

-

On the other hand, if I understood the rules right, this next one looks like
a more exciting error, since an __exit routine is calling an __init routine:

/u2/engler/mc/oses/linux/2.4.1/drivers/sound/aedsp16.c:1356:cleanup_aedsp16: 
ERROR:INIT: non-init fn 'cleanup_aedsp16' calling init fn 'uninit_aedsp16'

void __init uninit_aedsp16(void)
{
        if (ae_config.mss_base != -1)
                uninit_aedsp16_mss();
        else
                uninit_aedsp16_sb();
        if (ae_config.mpu_base != -1)
                uninit_aedsp16_mpu();
}


static void __exit cleanup_aedsp16(void) {
        uninit_aedsp16();
}
-

Any clarifications, etc would be appreciated. 

Thanks,
Dawson



2.4.3-ac2 and D state process

2001-04-04 Thread CaT

I have mozilla stuck in D state:

25 [16:44:06] bowman@europa:/home/bowman>> ps -eo pid,tt,user,fname,tmout,f,wchan | grep mozilla
  435 ?  bowman   mozilla- - 040 down_write_failed
 2646 ?  bowman   mozilla- - 040 down_write_failed

Would this be a mozilla issue or a kernel issue? I can't kill this sucker
nor can I attach an strace to it and have it return something.

System is a Debian 2.2r2 system, kernel 2.4.3-ac2, glibc 2.1.3
(dunno what else you folks may need - if you do need more info, 
holler)

-- 
CaT ([EMAIL PROTECTED])*** Jenna has joined the channel.
 speaking of mental giants..
 me, a giant, bullshit
 And i'm not mental
- An IRC session, 20/12/2000




[QUESTION] 2.4.3: hotplug_path unresolved in usbcore?

2001-04-04 Thread Ryan Mack

Sorry for such a stupid question, but I'm stumped (it doesn't take much).
modprobe reports that hotplug_path is unresolved when it processes
usbcore. CONFIG_HOTPLUG is defined, so it seems that hotplug_path is
defined and EXPORTed in kernel/kmod.c, so I'm unsure what the problem is.

Thanks, Ryan




Re: More about 2.4.3 timer problems

2001-04-04 Thread Eric Gillespie

On Wed, 4 Apr 2001, James Simmons wrote:

:
:>Err, tried the patch you recommended me to apply to the 2.4.3 source code
:>(not the 2.4.3-pre6), but everything else started complaining it couldn't
:>see printk() any more.  Any advice?  Thanks...
:
:Can you tell us your hardware configuration and your kernel configuration.

This has been mentioned in an earlier message of mine, but basically, I have a
Cyrix M-II machine, 32 meg of ram, 3.4 gig HD, PCI and ISA buses, SiS chipset,
etc etc.

:What do you mean it couldn't see printk any more?

What I meant was that while kernel/printk.c got compiled to printk.o, none of
the modules compiled could then see the function printk(); when I did a depmod
-ae, the unresolved function (and the ONLY function) they all complained about
was printk, even though it appeared in the System.map file, (c0 T printk)
and, presumably, in the kernel image as well.

I noted that the patch touched printk.c, so wondered what the patch had
done... and I really didn't want to have to compile everything into my kernel
just to get it to work... it's almost like it was a question of scope...

The patch did work to eliminate the clock skew, aside from this printk
problem, but fortunately I was bright enough to have kept an older kernel
tarball around, so I could go back to unpatched version. 

Thanks for keeping in touch... me not being much of a programmer, I can't just
get my hands dirty and fix the patch problem, otherwise, I'd do it.  But
knowing my luck, I'd end up with a kernel that worked six times faster than
anything previously, but wind up destroying the hard disk and CMOS in two
minutes fifteen point three seconds.  And set my monitor on fire too...

-- 
 /|   _,.:*^*:.,   |\   Cheers from the Viking family, 
| |_/'  viking@ `\_| |including Pippin, our cat
|flying-brick| $FunnyMail  Bilbo   : Now far ahead the Road has gone,
 \_.caverock.net.nz_/ 5.39in LOTR  : Let others follow it who can!




Re: which gcc version?

2001-04-04 Thread Alexander Viro



On Thu, 5 Apr 2001, Manoj Sontakke wrote:

> Hi
>   I am getting linker error "undefined reference to __divdi3".
> This is because c = a/b; where a,b,c are of type "long long"
> I understand this is gcc problem.
>   I am doing this on a pentium with gcc -v = egcs-2.91.66

Don't do it in the kernel. It has nothing to do with the gcc version.




[PATCH] SysV IPC, kernel 2.4.3

2001-04-04 Thread Scott Maxwell

This patch contains several small bug fixes and micro-optimizations
for SysV IPC code in the 2.4.3 kernel.  Summary:

* testmsg(): "return expr;" beats "if (expr) return 1; ... return 0;"
* pipelined_send() was setting q_lspid instead of q_lrpid.
* sys_msgrcv() had redundant found_msg assignments.
* freeundos() had unused parameter (sma).
* ipc_alloc(): Kernel oops if initial allocation failed.

I'm having unrelated problems getting 2.4.3 to work with my test
hardware, so I have tested these changes under 2.4.1 only.  They work
fine for me under 2.4.1, though.  Please email me if you see any
problems or if Netscape mail mangles this post (I haven't tried this
before).  Thanks.

-- 
-+
R H L U  Scott Maxwell:  | ``Life results from the non-random survival
E A I X maxwell@ |   of randomly varying replicators.''
D T N 6 ScottMaxwell.org | -- Richard Dawkins

diff -urN linux-2.4.3/ipc/msg.c linux/ipc/msg.c
--- linux-2.4.3/ipc/msg.c   Mon Feb 19 10:18:18 2001
+++ linux/ipc/msg.c Wed Apr  4 00:38:21 2001
@@ -582,17 +582,11 @@
case SEARCH_ANY:
return 1;
case SEARCH_LESSEQUAL:
-   if(msg->m_type <=type)
-   return 1;
-   break;
+   return msg->m_type <=type;
case SEARCH_EQUAL:
-   if(msg->m_type == type)
-   return 1;
-   break;
+   return msg->m_type == type;
case SEARCH_NOTEQUAL:
-   if(msg->m_type != type)
-   return 1;
-   break;
+   return msg->m_type != type;
}
return 0;
 }
@@ -613,7 +607,8 @@
wake_up_process(msr->r_tsk);
} else {
msr->r_msg = msg;
-   msq->q_lspid = msr->r_tsk->pid;
+   /* Was q_lspid; surely, this was the intent. */
+   msq->q_lrpid = msr->r_tsk->pid;
msq->q_rtime = CURRENT_TIME;
wake_up_process(msr->r_tsk);
return 1;
@@ -753,10 +748,8 @@
if(testmsg(msg,msgtyp,mode)) {
found_msg = msg;
if(mode == SEARCH_LESSEQUAL && msg->m_type != 1) {
-   found_msg=msg;
msgtyp=msg->m_type-1;
} else {
-   found_msg=msg;
break;
}
}
diff -urN linux-2.4.3/ipc/sem.c linux/ipc/sem.c
--- linux-2.4.3/ipc/sem.c   Mon Feb 19 10:18:18 2001
+++ linux/ipc/sem.c Mon Apr  2 20:51:37 2001
@@ -775,7 +775,7 @@
}
 }
 
-static struct sem_undo* freeundos(struct sem_array *sma, struct sem_undo* un)
+static struct sem_undo* freeundos(struct sem_undo* un)
 {
struct sem_undo* u;
struct sem_undo** up;
@@ -878,7 +878,7 @@
if(un->semid==semid)
break;
if(un->semid==-1)
-   un=freeundos(sma,un);
+   un=freeundos(un);
 else
un=un->proc_next;
}
diff -urN linux-2.4.3/ipc/util.c linux/ipc/util.c
--- linux-2.4.3/ipc/util.c  Mon Feb 19 10:18:18 2001
+++ linux/ipc/util.c        Mon Apr  2 20:39:55 2001
@@ -75,7 +75,8 @@
ids->size = 0;
}
ids->ary = SPIN_LOCK_UNLOCKED;
-   for(i=0;i<size;i++)
+   for(i=0;i<ids->size;i++)
ids->entries[i].p = NULL;
 }
 



which gcc version?

2001-04-04 Thread Manoj Sontakke

Hi
I am getting linker error "undefined reference to __divdi3".
This is because c = a/b; where a,b,c are of type "long long"
I understand this is gcc problem.
I am doing this on a pentium with gcc -v = egcs-2.91.66

Thanks for all the help.

Manoj



Race in fs/proc/generic.c:make_inode_number()

2001-04-04 Thread Tom Leete

Hello,

The proc_alloc_map bitfield is unprotected by any lock, and
find_first_zero_bit() is not atomic. Concurrent module loading can race
here.

static unsigned char proc_alloc_map[PROC_NDYNAMIC / 8];

static int make_inode_number(void)
{
        int i = find_first_zero_bit((void *) proc_alloc_map, PROC_NDYNAMIC);
        if (i<0 || i>=PROC_NDYNAMIC)
                return -1;
        set_bit(i, (void *) proc_alloc_map);
        return PROC_DYNAMIC_FIRST + i;
}
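
One way to close the window between finding a zero bit and setting it is to claim the bit with an atomic test-and-set and retry on failure (the kernel's `test_and_set_bit()` would play that role), or simply to take a lock around both steps. The user-space sketch below uses C11 atomics to show the first idea. It is not a patch: `NDYNAMIC`, `DYNAMIC_FIRST`, `alloc_map`, `claim_bit` and `make_number` are hypothetical stand-ins with made-up values.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define NDYNAMIC        64      /* hypothetical stand-in for PROC_NDYNAMIC */
#define DYNAMIC_FIRST   4096    /* hypothetical stand-in for PROC_DYNAMIC_FIRST */
#define WORD_BITS       (sizeof(unsigned long) * CHAR_BIT)
#define NWORDS          ((NDYNAMIC + WORD_BITS - 1) / WORD_BITS)

static atomic_ulong alloc_map[NWORDS];

/* Atomically set bit nr; return non-zero if it was already set. */
static int claim_bit(unsigned int nr)
{
        unsigned long mask = 1UL << (nr % WORD_BITS);
        unsigned long old = atomic_fetch_or(&alloc_map[nr / WORD_BITS], mask);

        return (old & mask) != 0;
}

/* Return a fresh number, or -1 when the map is full. */
static int make_number(void)
{
        for (unsigned int i = 0; i < NDYNAMIC; i++) {
                unsigned long word = atomic_load(&alloc_map[i / WORD_BITS]);

                if (word & (1UL << (i % WORD_BITS)))
                        continue;       /* looks taken, try the next bit */
                if (!claim_bit(i))
                        return DYNAMIC_FIRST + (int)i;  /* we own this bit */
                /* Someone else claimed it first; keep scanning. */
        }
        return -1;
}
```

Because the claim itself is atomic, two concurrent callers can never both see the same bit as newly set, so they cannot be handed the same number.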

Cheers,
Tom

-- 
The Daemons lurk and are dumb. -- Emerson



Re: /dev/loop0 over lvm... leading to d-state :-(

2001-04-04 Thread Jens Axboe

On Wed, Apr 04 2001, Herbert Valerio Riedel wrote:
> 
> fyi, loop devices over lvm LV's don't work for me...
> 
> I've tested with 2.4.3final (and some other 2.4.3 derivatives) and two
> lvm'ized partitions with a size of about 1gig each; mke2fs
> just goes into D-state and stays there when applying it to /dev/loop0,
> running it directly on the LV-device works...

this would appear to be an lvm bug, could you try this patch? it's
untested, let me know if it doesn't work and I'll try and reproduce
here.

-- 
Jens Axboe



--- /opt/kernel/linux-2.4.3/drivers/md/lvm.c        Mon Jan 29 01:11:20 2001
+++ drivers/md/lvm.c        Thu Apr  5 07:12:14 2001
@@ -1480,14 +1480,14 @@
  */
 static int lvm_map(struct buffer_head *bh, int rw)
 {
-   int minor = MINOR(bh->b_dev);
+   int minor = MINOR(bh->b_rdev);
int ret = 0;
ulong index;
ulong pe_start;
ulong size = bh->b_size >> 9;
ulong rsector_tmp = bh->b_blocknr * size;
ulong rsector_sav;
-   kdev_t rdev_tmp = bh->b_dev;
+   kdev_t rdev_tmp = bh->b_rdev;
kdev_t rdev_sav;
vg_t *vg_this = vg[VG_BLK(minor)];
lv_t *lv = vg_this->lv[LV_BLK(minor)];



Re: Underscore in rivafb

2001-04-04 Thread James Simmons


linux/Documentation/VGA-softcursor.txt

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




Re: "linux" terminal type

2001-04-04 Thread James Simmons


>> Is there any documentation on the linux console terminal type?  If
>> so, where?
>
>Maybe cryptic but the most complete documentation of the linux terminal
>and its relatives are probably /etc/termcap and the ncurses terminfo
>database.  Aside from the code itself.

Also take a look at http://www.vt100.net . Since Linux tries to emulate
the DEC VT100, you will find the VT100 manuals at this site. They are quite
good and the escape codes are well described in them.
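
As a small illustration of the kind of escape codes those manuals describe, here is a sketch that builds a few common VT100/ANSI control sequences. The macro names and the helper `cursor_to` are hypothetical, chosen for this example. The sequences themselves (clear screen, cursor home, bold, attribute reset, cursor position) are the standard ones, but check terminfo for what a given terminal actually supports.

```c
#include <assert.h>
#include <stdio.h>

#define ESC             "\033"
#define CLEAR_SCREEN    ESC "[2J"       /* erase the whole display */
#define CURSOR_HOME     ESC "[H"        /* move to row 1, column 1 */
#define BOLD_ON         ESC "[1m"       /* bold attribute */
#define ATTR_RESET      ESC "[0m"       /* all attributes off */

/*
 * Write the cursor-position sequence ESC [ row ; col H into buf.
 * Rows and columns are 1-based. Returns the sequence length,
 * or -1 if it does not fit in buf.
 */
static int cursor_to(char *buf, size_t len, int row, int col)
{
        int n;

        assert(buf != NULL && row >= 1 && col >= 1);
        n = snprintf(buf, len, ESC "[%d;%dH", row, col);
        return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

A program would typically write `CLEAR_SCREEN CURSOR_HOME` to the console with `fputs()`, then position and decorate text with the other sequences.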

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




Re: More about 2.4.3 timer problems

2001-04-04 Thread James Simmons


>Err, tried the patch you recommended me to apply to the 2.4.3 source code
>(not the 2.4.3-pre6), but everything else started complaining it couldn't
>see printk() any more.  Any advice?  Thanks...

Can you tell us your hardware configuration and your kernel configuration.
What do you mean it couldn't see printk any more?

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




Re: [Linux-fbdev-devel] Re: fbcon slowness [was NTP on 2.4.2?]

2001-04-04 Thread James Simmons


>>> As long as you are copying in real memory. So the PCI bus or the host bridge
>>> implementation may be the actual limit.
>>
>> The CyrixIII sits on the same host bridges as the intel processors
>
>I don't know if it applies to this case but one thing I have seen make
>a noticeable difference is whether or not write-combining is enabled.
>If we have only been enabling MTRRs for Intel, this could account
>for it.

I think what Geert was trying to point out is whether MTRRs perform as well
for transfers from normal memory over the bus to video memory as they do for
normal memory to normal memory transfers. MTRRs might not be optimized for
these kinds of transfers. I honestly can't say since I haven't tried it. I
brought the MMX book home from work, so I'm going to be experimenting
with it this weekend to find out. I'd really like to compare the MMX
performance to the word-aligned transfers over the bus I have going. I had
a bug in my soft accel code that prevented word alignment. Once I fixed
that bug I saw a 10-fold improvement in rendering on the framebuffer.
I'm not kidding about that improvement either :-)

Enabling MTRRs always makes a difference. I'd like to try it with and
without. I will do some benchmarking.

MS: (n) 1. A debilitating and surprisingly widespread affliction that
renders the sufferer barely able to perform the simplest task. 2. A disease.

James Simmons  [[EMAIL PROTECTED]]   /|
fbdev/console/gfx developer \ o.O|
http://www.linux-fbdev.org   =(_)=
http://linuxgfx.sourceforge.netU
http://linuxconsole.sourceforge.net




[PATCH] NeoMagic Framebuffer Driver

2001-04-04 Thread Denis Oliver Kropp

Hi,

I wrote a framebuffer driver for all NeoMagic PCI chips
ported from current XFree driver, not based on the AFAIK
unfinished driver of Ani Joshi or Helge Deller.
This is version 0.2 with some little TODOs like
module parameters or kernel parameter parsing;
it's tested with NM2200 and NM2380.
The patch is against 2.4.3, please test the driver,
merge into kernel tree would be appreciated.


I'm not on the list, so do a Cc to me, please.


Best regards,

-- 
Denis Oliver Kropp
( convergence   )
( integrated media gmbh )

 neofb-0.2-patch-2.4.3.bz2


2.4.3 video module build failure

2001-04-04 Thread Stephen L Moshier


This is for linux-2.4.3 configured with multimedia/video selected as a module.

gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes \
  -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe \
  -mpreferred-stack-boundary=2 -march=i586 -DMODULE   -c -o buz.o buz.c
buz.c: In function `v4l_fbuffer_alloc':
buz.c:188: `KMALLOC_MAXSIZE' undeclared (first use in this function)
buz.c:188: (Each undeclared identifier is reported only once
buz.c:188: for each function it appears in.)
buz.c: In function `jpg_fbuffer_alloc':
buz.c:262: `KMALLOC_MAXSIZE' undeclared (first use in this function)
buz.c:256: warning: `alloc_contig' might be used uninitialized in this function
buz.c: In function `jpg_fbuffer_free':
buz.c:322: `KMALLOC_MAXSIZE' undeclared (first use in this function)
buz.c:316: warning: `alloc_contig' might be used uninitialized in this function
buz.c: In function `zoran_ioctl':
buz.c:2837: `KMALLOC_MAXSIZE' undeclared (first use in this function)
make[3]: *** [buz.o] Error 1
make[3]: Leaving directory `/usr/src/linux/drivers/media/video'




Re: Signal Handling Performance?

2001-04-04 Thread Christopher Smith

--On Wednesday, April 04, 2001 21:30:51 -0400 "Carey B. Stortz" 
<[EMAIL PROTECTED]> wrote:
> either stayed the same or had a performance increase. A general decrease
> started around kernel 2.1.32, then performance drastically fell at kernel
> 2.3.20. There is an Excel graph which shows the trend at:
>
> http://euclid.nmu.edu/~benchmark/Carey/signalhandling.gif
>
> I was wondering if anybody had any ideas why this is happening, and what
> happened in kernel 2.3.20 to cause such a decrease in performance?

Lies, damn lies, and benchmarks. ;-) Seriously though, I'm not clear on 
what you are measuring or how you are measuring it. It looks like this is 
measuring signal latency, which is important, but what about throughput?

--Chris



Re: [QUESTION] MOD_INC/MOD_DEC: useful to check for correct usage?

2001-04-04 Thread Alexander Viro



On Wed, 4 Apr 2001, Dawson Engler wrote:

> Hi,
> 
> in the old days you couldn't call a sleeping function in a module
> before doing a MOD_INC or after doing a MOD_DEC.  Then some safety nets
> were added that made these obsolete (in some number of places).  I was
> told that people had decided to potentially get rid of all safety
> nets.  Is this true?  Is it worthwhile to have a checker for these two
> rules?

It's worth removing the MOD_{INC,DEC}_USE_COUNT. Which had been done
in quite a few places. Let the caller handle the refcount on callee -
_that_ is definitely safe.
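
A toy, single-threaded sketch of what "let the caller handle the refcount on the callee" could look like follows. Everything in it is hypothetical (`struct module_like`, `struct ops`, `try_get`, `put`, `call_open`, `call_release`, `demo_open`). Real kernel code would use the module's own reference-counting helpers and proper locking, which this illustration leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a loadable module's bookkeeping. */
struct module_like {
        int use_count;
        int unloading;
};

/* Hypothetical operations table that records which module owns it. */
struct ops {
        struct module_like *owner;
        int (*open)(void);
};

/* Pin the module unless it is already on its way out. */
static int try_get(struct module_like *m)
{
        if (m->unloading)
                return 0;
        m->use_count++;
        return 1;
}

static void put(struct module_like *m)
{
        assert(m->use_count > 0);
        m->use_count--;
}

/* Caller side: pin the owner *before* entering module code. */
static int call_open(struct ops *o)
{
        int ret;

        if (!try_get(o->owner))
                return -1;
        ret = o->open();        /* may sleep; the module cannot vanish */
        if (ret != 0)
                put(o->owner);  /* failed open keeps no reference */
        return ret;
}

/* Caller side: drop the reference once the module code is done. */
static void call_release(struct ops *o)
{
        put(o->owner);
}

/* Trivial callee used to exercise the sketch. */
static int demo_open(void)
{
        return 0;
}
```

The callee never touches its own count, so there is no window in which it runs unpinned before an increment or after a decrement.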




Signal Handling Performance?

2001-04-04 Thread Carey B. Stortz

I am doing a research project on Linux kernel performance starting with the
2.0.1 kernel through the 2.4.0 kernel. I ran across something very
interesting when running LMBench and reviewing the results. The performance
of Signal Handling has decreased while every other area has either stayed
the same or had a performance increase. A general decrease started around
kernel 2.1.32, then performance drastically fell at kernel 2.3.20. There is
an Excel graph which shows the trend at:

http://euclid.nmu.edu/~benchmark/Carey/signalhandling.gif

I was wondering if anybody had any ideas why this is happening, and what
happened in kernel 2.3.20 to cause such a decrease in performance?
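
To make concrete what a "signal handling" number can mean, here is a minimal sketch of one way to time handler delivery: install a handler, raise the signal in a loop, and average. This is not LMBench's code and may not match how it measures. The names `on_usr1` and `signal_cost_ns` are invented for this example, and it assumes a POSIX system with `clock_gettime()`.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t handled;

/* Do as little as possible so the measurement is mostly delivery cost. */
static void on_usr1(int sig)
{
        (void)sig;
        handled++;
}

/*
 * Install the handler, raise SIGUSR1 `iterations` times, and return the
 * average cost of one raise-plus-handler round trip in nanoseconds.
 * Returns -1.0 if the handler could not be installed.
 */
static double signal_cost_ns(int iterations)
{
        struct sigaction sa;
        struct timespec t0, t1;
        double elapsed;

        assert(iterations > 0);
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_usr1;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGUSR1, &sa, NULL) != 0)
                return -1.0;

        handled = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < iterations; i++)
                raise(SIGUSR1);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        elapsed = (double)(t1.tv_sec - t0.tv_sec) * 1e9
                + (double)(t1.tv_nsec - t0.tv_nsec);
        return elapsed / iterations;
}
```

A loop like this measures latency per delivery in one process. It says nothing about throughput under load or about delivery between processes, which would need a different setup.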

Thanks for your time
Carey Stortz




Re: [BUG] smbfs: caching problems

2001-04-04 Thread Xuan Baldauf



Urban Widmark wrote:

> On Sun, 1 Apr 2001, Xuan Baldauf wrote:
>
> > there is something wrong with smbfs caching which makes my
> > applications fail. The behaviour happens with
> > linux-2.4.3-pre4 and linux-2.4.3-final.
> >
> > Consider following shell script: (where /mnt/n is a
> > smbmounted smb share from a Win98SE box)
>
> Try the attached patch, as a workaround.
>

Works for me. :-)

It survived codified test case at the end of this message.

Xuân.


#!/bin/bash
#

if test -z "$1"; then
 LOCAL=0
fi

if test -n "$LOCAL"; then
 umount /mnt/n

 rmmod smbfs

 # mount
 ~/bin/lwc

 cd /mnt/n/temp
fi


rm -f /tmp/test.abc /tmp/test.xyz testfile

I=0
while test $I -lt 127; do
 echo "abc" >>/tmp/test.abc
 I=$((I+1))
done

I=0
while test $I -lt 129; do
 echo "xyz" >>/tmp/test.xyz
 I=$((I+1))
done

I=0

while test $I -lt 8; do
 cp /tmp/test.abc testfile
 tail -1 testfile
 cp /tmp/test.xyz testfile
 tail -1 testfile

 I=$((I+1))
done

rm -f /tmp/test.abc /tmp/test.xyz testfile


if test -n "$LOCAL"; then
 umount /mnt/n
fi





Re: linux 2.4.3 crashed my hard disk

2001-04-04 Thread Alan Cox

> ide_dmaproc: chipset supported ide_dma_timeout func only: 14
> is about the most ominous message one can receive from the IDE driver:
> 
> 1. it's not in English, so it doesn't tell you jack

It tells you the chipset doesn't support an IDE dma timeout handling function
(ie all it can do is reset and retry)

> 2. it's usually a sign of "mkfs + reinstall needed"

Not in my experience. It's just a drive throwing a fit.

> 3. I've had it happen on Intel and VIA chipsets alike, 100% guaranteed
>non-overclocked

I've only seen it on broken boards that also needed DMA off in 2.2 and
on the VIA stuff before the VIA fixups went in and the new via driver.

> 5. I have yet to see a coherent explanation from Andre as to what the
>message means, or what causes it.


We issued a DMA, the drive sat there and did nothing. The default handler 
asks the controller handling the request to retry it in PIO mode. Which is
reasonable. On the 440BX this uses disable_irq which may also trigger a bug
in the APIC on SMP machines and hang solid unless you have -ac. I don't think
that's statistically likely here.

The code looks correct, it's a bit convoluted but it does seem to correctly
reissue the request, although not as PIO. Perhaps Andre can explain why it's
ignoring the 'please use pio' hint on the return

Alan




[QUESTION] MOD_INC/MOD_DEC: useful to check for correct usage?

2001-04-04 Thread Dawson Engler

Hi,

in the old days you couldn't call a sleeping function in a module
before doing a MOD_INC or after doing a MOD_DEC.  Then some safety nets
were added that made these obsolete (in some number of places).  I was
told that people had decided to potentially get rid of all safety
nets.  Is this true?  Is it worthwhile to have a checker for these two
rules?



Re: ReiserFS? How reliable is it? Is this the future?

2001-04-04 Thread Alan Cox

> This is a reiserfs security issue, but only of theoretical nature (Even if
> triggered, it won't harm you). But the reason for this bug is in NFS (v2,

If the blocks contained my old /etc/shadow I'd be a bit upset.

> displacement instead of vertical displacement) is planned.
> 
> I can tell you more if you want.

I trust Chris to keep it in order. I've not yet had a broken patch from them
for -ac




Re: ReiserFS? How reliable is it? Is this the future?

2001-04-04 Thread Xuan Baldauf



Alan Cox wrote:

> > This is a reiserfs security issue, but only of theoretical nature (Even if
> > triggered, it won't harm you). But the reason for this bug is in NFS (v2,
>
> If the blocks contained my old /etc/shadow I'd be a bit upset.

The only bad consequence possible is that you possibly cannot create a file with
a given filename if someone else (remote user) could create at least 127 files
with a very special filename within the same directory. Usually, /etc/shadow and
all other important files either are created before any other user has access or
(if they are created later) belong to directories where only root may create
files in it.

>
>
> > displacement instead of vertical displacement) is planned.
> >
> > I can tell you more if you want.
>
> I trust Chris to keep it in order. I've not yet had a broken patch from them
> for -ac

:-)





Re: ReiserFS? How reliable is it? Is this the future?

2001-04-04 Thread Xuan Baldauf



Alan Cox wrote:

> > The bad (2.2 kernels)
> >
> > * Nothing I can think of
>
> Security exploit according to bugtraq, but I'm pretty sure it won't take Chris
> Mason and friends long to fix that.
>

This is a reiserfs security issue, but only of theoretical nature (Even if
triggered, it won't harm you). But the reason for this bug is in NFS (v2, v3,
hopefully not also v4) readdir braindamage.

I think, in Reiser(FS)4, a more sophisticated (NFS-)work-around (horizontal
displacement instead of vertical displacement) is planned.

I can tell you more if you want.

Xuân.





Re: linux 2.4.3 crashed my hard disk

2001-04-04 Thread Ion Badulescu

On Wed, 4 Apr 2001 20:00:29 +0100 (BST), Alan Cox <[EMAIL PROTECTED]> wrote:

>> Been running this configuration over more than 2 years now without such
>> major problems.
>> Could this be the cause?
> 
> Quite possibly. There are reasons we ignore bug reports from overclockers

Perhaps. But,

ide_dmaproc: chipset supported ide_dma_timeout func only: 14

is about the most ominous message one can receive from the IDE driver:

1. it's not in English, so it doesn't tell you jack
2. it's usually a sign of "mkfs + reinstall needed"
3. I've had it happen on Intel and VIA chipsets alike, 100% guaranteed
   non-overclocked
4. Andre has repeatedly claimed "he's fixed it", but experience in the
   field shows quite the contrary to be true
5. I have yet to see a coherent explanation from Andre as to what the
   message means, or what causes it.

So right now 2.4 + IDE (or 2.2 + IDE + Andre's patches) is not a combination 
I can trust my data to, unless everything is running in PIO mode. The latter
is usually way too slow for anything useful, other than maybe a pure router.

Ion

-- 
  It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.



[Problem] 3c90x on 2.4.3-ac3

2001-04-04 Thread Prasanna P Subash

hi lkml,
I just built 2.4.3-ac3 with my old 2.4.2 .config and somehow networking does 
not work. 
dhclient eventually froze the machine.

here is what dhclient complains.

[root@psubash linux]# cat /tmp/error.txt
skb: pf=2 (unowned) dev=lo len=328
PROTO=17 0.0.0.0:68 255.255.255.255:67 L=328 S=0x10 I=0 F=0x T=16
DHCPDISCOVER on lo to 255.255.255.255 port 67 interval 14
ip_local_deliver: bad loopback skb: PRE_ROUTING LOCAL_IN
skb: pf=2 (unowned) dev=lo len=328
PROTO=17 0.0.0.0:68 255.255.255.255:67 L=328 S=0x10 I=0 F=0x T=16
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 9
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
DHCPDISCOVER on lo to 255.255.255.255 port 67 interval 12
ip_local_deliver: bad loopback skb: PRE_ROUTING LOCAL_IN
skb: pf=2 (unowned) dev=lo len=328

Here is my ver_linux info

[root@psubash linux]# cat /tmp/ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux psubash 2.4.3-ac3 #3 Wed Apr 4 16:49:40 PDT 2001 i686 unknown

Gnu C                  2.95.2.1
Gnu make               3.79.1
binutils               2.10.1
util-linux             2.11a
mount                  2.11a
modutils               2.3.24
e2fsprogs              1.20-WIP
pcmcia-cs              3.1.24
PPP                    2.3.11
Linux C Library        2.1.3
Dynamic linker (ldd)   2.1.3
Linux C++ Library      ..
Procps                 2.0.6
Net-tools              1.55
Console-tools          0.3.3
Sh-utils               2.0
Modules Loaded         3c59x


Here is the output of lspci.

[root@psubash linux]# lspci  -v
00:00.0 Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 03)
Flags: bus master, medium devsel, latency 64
Memory at f800 (32-bit, prefetchable)
Capabilities: [a0] AGP version 1.0

00:01.0 PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 03) (prog-if 00 [Normal decode])
Flags: bus master, 66Mhz, medium devsel, latency 128
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
Memory behind bridge: f410-f4ff
Prefetchable memory behind bridge: fc00-fdff

00:07.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
Flags: bus master, medium devsel, latency 0

00:07.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) (prog-if 80 [Master])
Flags: bus master, medium devsel, latency 64
I/O ports at 10c0

00:07.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01) (prog-if 00 [UHCI])
Flags: bus master, medium devsel, latency 64, IRQ 9
I/O ports at 1080

00:07.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
Flags: medium devsel

00:0e.0 Multimedia audio controller: Creative Labs SB Live! EMU1 (rev 07)
Subsystem: Creative Labs CT4832 SBLive! Value
Flags: bus master, medium devsel, latency 64, IRQ 10
I/O ports at 10a0
Capabilities: [dc] Power Management version 1

00:0e.1 Input device controller: Creative Labs SB Live! (rev 07)
Subsystem: Creative Labs Gameport Joystick
Flags: bus master, medium devsel, latency 64
I/O ports at 10d0
Capabilities: [dc] Power Management version 1

00:0f.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100
Flags: bus master, medium devsel, latency 80, IRQ 5
I/O ports at 1000
Memory at f400 (32-bit, non-prefetchable)
Capabilities: [dc] Power Management version 1

01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 04) (prog-if 00 [VGA])
Subsystem: Matrox Graphics, Inc. Millennium G400 16Mb SDRAM
Flags: bus master, medium devsel, latency 128, IRQ 11
Memory at fc00 (32-bit, prefetchable)
Memory at f410 (32-bit, non-prefetchable)
Memory at f480 (32-bit, non-prefetchable)
Capabilities: [dc] Power Management version 2
Capabilities: [f0] AGP version 2.0




I have also attached my .config.


thanks,
Prasanna Subash
[EMAIL PROTECTED]


#
# Automatically generated by make menuconfig: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
# CONFIG_SBUS is not set
CONFIG_UID16=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y

#
# Processor type and features
#
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
CONFIG_MPENTIUMIII=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCH

RE: vmalloc on 2.4.x on ia64

2001-04-04 Thread hiren_mehta

dropping the io_request_lock around vmalloc worked great.
Thanks for all the help. I really appreciate it.

Thanks 
-hiren

> -Original Message-
> From: Alan Cox [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, April 04, 2001 5:29 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
> [EMAIL PROTECTED]
> Subject: Re: vmalloc on 2.4.x on ia64
> 
> 
> > I am calling during initialization only from detect() entry point.
> > But I guess, before the detect() is called, scsi layer acquires
> > the io_request_lock. So, you mean to say that I need to release it
> 
> That depends if your driver is doing old or new style initialization
> 
> > before calling vmalloc() ? I was doing the same thing on 2.2.x
> > and even on 2.4.0 and it was working fine and now suddenly
> > it stopped working on 2.4.2. So what are the guidelines for using
> > vmalloc() if we want to use it in scsi low-level (HBA) driver ?
> 
> You can use vmalloc in any situation where you are in task context
> and can sleep.
> 
> > I am currently using the new error handling code. 
> (use_new_eh_code = TRUE).
> 
> Then yes you would need to drop the lock if my memory serves 
> me rightly.
> 



Is jumbo ethernet MTU possible with Hamachi Gigabit ethernet driver?

2001-04-04 Thread John Weidman

I'm trying to get some Gigabit ethernet cards that use the Packet
Engines Hamachi GNIC-II chip to use a large mtu to attempt to get a
throughput of close to the 1Gb rating of the card.  This is on a Compact
PCI Alpha system.  I'm trying to use an MTU in the 8000 to 9000 range
and so far have not been able to get these MTUs to work.

I have changed the PKT_BUF_SZ and MAX_FRAME_SIZE constants in hamachi.c
and ETH_DATA_LEN and ETH_FRAME_LEN in if_ether.h.  I can use ifconfig to
change the MTU above 1500 on one side of a connection but as soon as I
raise the MTU on both sides to greater than 1500 the link dies.  I can
change the MTU with ifconfig back to 1500 and the link will resume
operation.  We are currently somewhat married to the 2.2.14 kernel.

I read that some ethernet drivers will support jumbo MTUs.  There
appears to be something in the hamachi driver or the kernel that I've
missed.  Perhaps this only works with a later version kernel or the
hamachi driver needs more changes?  Any help would be appreciated.
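
For reference, a rough sketch of the size arithmetic behind those constants, assuming untagged Ethernet II framing (14-byte header, 4-byte FCS). The SKETCH_* names and both helpers are made up for illustration and are not taken from hamachi.c; whether the FCS actually lands in the receive buffer depends on how the chip is configured.

```c
#include <stddef.h>

/* Untagged Ethernet II framing. Same values as ETH_HLEN and ETH_DATA_LEN
 * in if_ether.h, redefined here so the sketch builds in user space. */
#define SKETCH_ETH_HLEN      14    /* dst MAC + src MAC + ethertype */
#define SKETCH_ETH_FCS_LEN    4    /* trailing CRC */
#define SKETCH_ETH_DATA_LEN 1500   /* standard payload limit */

/* Header plus payload for a given MTU (FCS not included). */
static size_t frame_len_for_mtu(size_t mtu)
{
    return mtu + SKETCH_ETH_HLEN;
}

/* Hypothetical check: can a receive buffer hold a full frame plus FCS? */
static int rx_buf_fits_mtu(size_t buf_size, size_t mtu)
{
    return buf_size >= frame_len_for_mtu(mtu) + SKETCH_ETH_FCS_LEN;
}
```

With these numbers a 1536-byte buffer is fine for MTU 1500 but far too small for MTU 9000, which needs at least 9018 bytes per buffer under the stated assumptions.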

John Weidman
[EMAIL PROTECTED]





Re: 2.4.2-ac6 hangs on boot w/AMD Elan SC520 dev board

2001-04-04 Thread Brian Moyle

Changes are isolated to sanitize_e820_map().

(patch is against linux-2.4.3-ac3)

This change informs the user when overlapping memory regions have been
found in their e820 map.  If overlaps were found, it displays the 
original mapping and then creates an adjusted map (w/o overlaps).  If 
no overlaps were found, it leaves the original bios-provided map alone.  
Formatting changes were also made so source fits w/in 80 columns.
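
To illustrate what "overlap" means here, a minimal user-space sketch follows. It is not the change-point sweep used in the patch; it is a plain pairwise comparison, and struct mem_region is a simplified stand-in for the kernel's e820 entry, introduced only for this example.

```c
#include <stddef.h>

/* Simplified stand-in for an e820 entry (assumption, not the kernel's type). */
struct mem_region {
    unsigned long long addr;   /* start of region */
    unsigned long long size;   /* length in bytes */
    unsigned long type;        /* 1 = RAM, 2 = reserved, ... */
};

/* Returns -1 if any region wraps past the end of the address space,
 * otherwise the number of region pairs that overlap. */
static int count_overlaps(const struct mem_region *map, size_t n)
{
    int overlaps = 0;
    size_t i, j;

    /* same sanity check as the patch: reject entries that wrap around */
    for (i = 0; i < n; i++)
        if (map[i].addr + map[i].size < map[i].addr)
            return -1;

    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            unsigned long long end_i = map[i].addr + map[i].size;
            unsigned long long end_j = map[j].addr + map[j].size;

            /* empty regions cannot overlap anything */
            if (map[i].size == 0 || map[j].size == 0)
                continue;
            /* half-open ranges [addr, end) intersect */
            if (map[i].addr < end_j && map[j].addr < end_i)
                overlaps++;
        }
    }
    return overlaps;
}
```

A result of 0 corresponds to the "leave the BIOS-provided map alone" case described above; anything greater than 0 is the case where an adjusted map would be built.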

Hope it helps,

Brian

Pavel Machek wrote:
> > I suspect you need to add some code to take the E820 map and remove any
> > overlaps from it, favouring ROM over RAM if the types disagree (for safety),
> > and filter them before you register them with the bootmem in
> > arch/i386/kernel/setup.c
> 
> ...plus printing "?@#@&#&$ BIOS reports invalid mem map"
> seems like a good idea, so that BIOS bugs get fixed.

--- linux-2.4.3-ac3/arch/i386/kernel/setup.c  Wed Apr  4 16:30:35 2001
+++ linux/arch/i386/kernel/setup.c  Wed Apr  4 16:34:03 2001
@@ -447,8 +447,8 @@
 /*
  * Sanitize the BIOS e820 map.
  *
- * Some e820 responses include overlapping entries.  The following 
- * replaces the original e820 map with a new one, removing overlaps.
+ * Some e820 responses include overlapping memory regions.  If overlaps are
+ * found, we'll replace the original e820 map with a new one (w/o overlaps).
  *
  */
 static int __init sanitize_e820_map(struct e820entry * biosmap, char * pnr_map)
@@ -464,14 +464,14 @@
struct change_member *change_tmp;
unsigned long current_type, last_type;
unsigned long long last_addr;
+   int overlap_entries, overlaps_found;
int chgidx, still_changing;
-   int overlap_entries;
int new_bios_entry;
int old_nr, new_nr;
int i;
 
/*
-   Visually we're performing the following (1,2,3,4 = memory types)...
+   Visually, we're performing the following (1,2,3,4 = mem types):
 
Sample memory map (w/overlaps):
   22__
@@ -517,7 +517,7 @@
if (biosmap[i].addr + biosmap[i].size < biosmap[i].addr)
return -1;
 
-   /* create pointers for initial change-point information (for sorting) */
+   /* create pointers for initial change-point info (used for sorting) */
for (i=0; i < 2*old_nr; i++)
change_point[i] = &change_point_list[i];
 
@@ -535,13 +535,15 @@
while (still_changing)  {
still_changing = 0;
for (i=1; i < 2*old_nr; i++)  {
-   /* if  > , swap */
-   /* or, if current= & last=, swap */
-   if ((change_point[i]->addr < change_point[i-1]->addr) ||
-   ((change_point[i]->addr == change_point[i-1]->addr) &&
-(change_point[i]->addr == 
change_point[i]->pbios->addr) &&
-(change_point[i-1]->addr != 
change_point[i-1]->pbios->addr))
-  )
+   /* if  < , swap */
+   /* or, if curr= & last=, swap */
+   if ((change_point[i]->addr < change_point[i-1]->addr) ||
+   ((change_point[i]->addr ==
+   change_point[i-1]->addr) &&
+(change_point[i]->addr ==
+   change_point[i]->pbios->addr) &&
+(change_point[i-1]->addr !=
+   change_point[i-1]->pbios->addr)))
{
change_tmp = change_point[i];
change_point[i] = change_point[i-1];
@@ -552,47 +554,72 @@
}
 
/* create a new bios memory map, removing overlaps */
-   overlap_entries=0;   /* number of entries in the overlap table */
-   new_bios_entry=0;/* index for creating new bios map entries */
-   last_type = 0;   /* start with undefined memory type */
-   last_addr = 0;   /* start with 0 as last starting address */
-   /* loop through change-points, determining affect on the new bios map */
+   overlaps_found=0;   /* indicates whether or not an overlap was found */
+   overlap_entries=0;  /* number of entries in the overlap table */
+   new_bios_entry=0;   /* index for creating new bios map entries */
+   last_type = 0;  /* start with undefined memory type */
+   last_addr = 0;  /* start with 0 as last starting address */
+   /* loop through change-points, determining affect on new bios map */
for (chgidx=0; chgidx < 2*old_nr; chgidx++)
{
/* keep track of all overlapping bios entries */
-   if (change_point[chgidx]->addr == change_point[chgidx]->pbios->addr)
+   if (change_point[chgidx]->addr ==
+   change_point[chgidx]->pbios->addr)
  

Re: 2.2.19 borks am-utils building :-(

2001-04-04 Thread Ion Badulescu

On Wed, 4 Apr 2001 16:43:14 -0700 (PDT), Andre Hedrick <[EMAIL PROTECTED]> wrote:

> The subject says it all

Use the latest snapshot of am-utils (6.0.6s1), which fixes the problem.

Ion
am-utils co-maintainer

-- 
  It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.



Re: MPEG-2 decoding driver for Trident Cyberblade i7

2001-04-04 Thread Alan Cox

> Searching on the web, I see a few question but no answers
> to the question of whether a driver exists that can utilize
> the MPEG-2 hardware assist feature of the Trident
> Cyberblade i7.
> 
> Any pointers?

That would be part of the XFree 4 server if supported on that card yet. The
hardware scalers/YUV for several cards are supported by XFree 4.0 and the
xv extension. Players such as xine will use that if present.

So it's an XFree question.



Re: vmalloc on 2.4.x on ia64

2001-04-04 Thread Alan Cox

> I am calling during initialization only from detect() entry point.
> But I guess, before the detect() is called, scsi layer acquires
> the io_request_lock. So, you mean to say that I need to release it

That depends if your driver is doing old or new style initialization

> before calling vmalloc() ? I was doing the same thing on 2.2.x
> and even on 2.4.0 and it was working fine and now suddenly
> it stopped working on 2.4.2. So what are the guidelines for using
> vmalloc() if we want to use it in scsi low-level (HBA) driver ?

You can use vmalloc in any situation where you are in task context
and can sleep.

> I am currently using the new error handling code. (use_new_eh_code = TRUE).

Then yes you would need to drop the lock if my memory serves me rightly.
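
A toy user-space model of the pattern being described, for illustration only. Nothing here is kernel API: toy_lock/toy_unlock stand in for taking and releasing io_request_lock, and sleeping_alloc stands in for vmalloc(). Dropping a lock like this is only safe if nothing depends on it being held across that window.

```c
#include <stdlib.h>

/* Toy stand-in for io_request_lock: only records whether it is held. */
static int toy_lock_held;

static void toy_lock(void)   { toy_lock_held = 1; }
static void toy_unlock(void) { toy_lock_held = 0; }

/* Models an allocator that may sleep: it refuses to run under the lock.
 * (The real vmalloc() has no such check; the machine can simply hang.) */
static void *sleeping_alloc(size_t size)
{
    if (toy_lock_held)
        return NULL;
    return malloc(size);
}

/* Models a detect() routine that is entered with the lock held. */
static void *detect_alloc(size_t size)
{
    void *buf;

    toy_unlock();                /* drop the lock: sleeping is allowed now */
    buf = sleeping_alloc(size);
    toy_lock();                  /* retake it before returning to the caller */
    return buf;
}
```

In this model, calling sleeping_alloc() directly with the lock held fails, while detect_alloc() succeeds and still returns with the lock held, which is what the caller expects.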



Re: Contacts within AMD? AMD-756 USB host-controller blacklisted dueto

2001-04-04 Thread Alan Cox

> Compromise?
> 
> This patch makes it a config option to enable the AMD-756.
> It's marked DANGEROUS and EXPERIMENTAL, and is only
> available if CONFIG_EXPERIMENTAL is set.

Since we expect to get errata docs very soon I'm not that worried. As an
implementation I'd rather have a module option of 'ignore_blacklist' or
similar, so that it is a runtime choice.
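
A minimal sketch of the runtime variant, as a user-space model. The variable name ignore_blacklist comes from the suggestion above; probe_allowed() and the way the flag gets set are hypothetical. In a real module the flag would be a load-time module parameter rather than a plain global.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical runtime switch; in a real module this would be a
 * module parameter set at load time. */
static int ignore_blacklist;

/* errata is NULL for good hardware, or a string describing the problem. */
static int probe_allowed(const char *errata)
{
    if (errata != NULL && !ignore_blacklist)
        return -ENODEV;   /* blacklisted and no override: refuse */
    return 0;             /* ok to continue probing */
}
```

The difference from the config-option approach is that the same binary can be told to try the hardware anyway, without a rebuild.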




Config printk buffer size

2001-04-04 Thread Thomas Dodd


This should allow the printk buffer to be sized during config.
The default size is still 16K. It could use some checking to
ensure power-of-2, but CML1 doesn't have a way to
do it that I see. All the architectures should be
covered, and all the default files also have it.
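
For illustration, the missing power-of-2 check could look like the sketch below if it were done in C rather than in CML1. This assumes the buffer index is wrapped with a mask of (length - 1), which is why a non-power-of-2 size would misbehave; log_buf_len_from_kbytes() and wrap_index() are hypothetical helpers, not part of the patch.

```c
/* Nonzero if n is a power of two (exactly one bit set). */
static int is_power_of_two(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical helper: turn a size in K bytes into a byte count,
 * or 0 if the size would break mask-based index wrapping. */
static unsigned long log_buf_len_from_kbytes(unsigned long kbytes)
{
    if (!is_power_of_two(kbytes))
        return 0;
    return kbytes * 1024UL;
}

/* Wrap a running index into a buffer whose length is a power of two. */
static unsigned long wrap_index(unsigned long idx, unsigned long len)
{
    return idx & (len - 1);
}
```

With a setting of 16 this gives a 16384-byte buffer and index 16385 wraps to 1; a setting such as 12 is rejected because masking with (len - 1) would skip parts of the buffer.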

It gets added in the kernel hacking section.

With 8 software RAID volumes, most of the kernel hardware
detection messages get lost. Having to edit kernel/printk.c
before/after every kernel change is a mess and easy to forget.

I'm going to work on a resizing buffer, that drops to 4K or
8K after dmesg is called with the right switch.

Patch against 2.4.3-ac2 since it has more arch supported.

Embedded systems, like arm, might want to change
the default size to something smaller for their arch,
which is easier with this patch.

-Thomas

diff -u --new-file --recursive linux-2.4.3-ac2.orig/Documentation/Configure.help 
linux-2.4.3-ac2/Documentation/Configure.help
--- linux-2.4.3-ac2.orig/Documentation/Configure.help   Wed Apr  4 15:22:43 2001
+++ linux-2.4.3-ac2/Documentation/Configure.help  Wed Apr  4 16:06:09 2001
@@ -15191,7 +15191,13 @@
   send a BREAK and then within 5 seconds a command keypress. The
   keys are documented in Documentation/sysrq.txt. Don't say Y unless
   you really know what this hack does.
-
+Printk buffer size
+CONFIG_PRINTK_BUF_LEN
+  Printk buffer size in K bytes. This should be a power of 2.
+  The 2.2.x kernels used a default of 8. The 2.4.x kernels
+  use a default of 16. Systems with many Software-RAID volumes
+  should increase since the md.o drivers have a lot of printk
+  output during boot.
 ISDN subsystem
 CONFIG_ISDN
   ISDN ("Integrated Services Digital Networks", called RNIS in France)
diff -u --new-file --recursive linux-2.4.3-ac2.orig/arch/alpha/config.in 
linux-2.4.3-ac2/arch/alpha/config.in
--- linux-2.4.3-ac2.orig/arch/alpha/config.in   Wed Apr  4 15:22:44 2001
+++ linux-2.4.3-ac2/arch/alpha/config.in  Wed Apr  4 16:06:39 2001
@@ -361,6 +361,7 @@
 fi
 
 bool 'Magic SysRq key' CONFIG_MAGIC_SYSRQ
+int 'Printk buffer size (in K bytes)' CONFIG_PRINTK_BUF_LEN 16
 
 bool 'Legacy kernel start address' CONFIG_ALPHA_LEGACY_START_ADDRESS
 
diff -u --new-file --recursive linux-2.4.3-ac2.orig/arch/alpha/defconfig 
linux-2.4.3-ac2/arch/alpha/defconfig
--- linux-2.4.3-ac2.orig/arch/alpha/defconfig   Wed Apr  4 15:12:44 2001
+++ linux-2.4.3-ac2/arch/alpha/defconfig  Wed Apr  4 15:36:53 2001
@@ -634,4 +634,5 @@
 #
 CONFIG_MATHEMU=y
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_PRINTK_BUF_LEN=16
 CONFIG_ALPHA_LEGACY_START_ADDRESS=y
diff -u --new-file --recursive linux-2.4.3-ac2.orig/arch/arm/config.in 
linux-2.4.3-ac2/arch/arm/config.in
--- linux-2.4.3-ac2.orig/arch/arm/config.in Wed Apr  4 15:22:44 2001
+++ linux-2.4.3-ac2/arch/arm/config.in  Wed Apr  4 16:06:57 2001
@@ -414,6 +414,7 @@
 bool 'Verbose user fault messages' CONFIG_DEBUG_USER
 bool 'Include debugging information in kernel binary' CONFIG_DEBUG_INFO
 bool 'Magic SysRq key' CONFIG_MAGIC_SYSRQ
+int 'Printk buffer size (in K bytes)' CONFIG_PRINTK_BUF_LEN 16
 if [ "$CONFIG_CPU_26" = "y" ]; then
bool 'Disable pgtable cache' CONFIG_NO_PGT_CACHE
 fi
diff -u --new-file --recursive linux-2.4.3-ac2.orig/arch/arm/def-configs/a5k 
linux-2.4.3-ac2/arch/arm/def-configs/a5k
--- linux-2.4.3-ac2.orig/arch/arm/def-configs/a5k   Mon Nov 27 19:07:59 2000
+++ linux-2.4.3-ac2/arch/arm/def-configs/a5k  Wed Apr  4 15:40:00 2001
@@ -532,5 +532,6 @@
 CONFIG_DEBUG_USER=y
 # CONFIG_DEBUG_INFO is not set
 CONFIG_MAGIC_SYSRQ=y
+CONFIG_PRINTK_BUF_LEN=16
 CONFIG_NO_PGT_CACHE=y
 CONFIG_DEBUG_LL=y
diff -u --new-file --recursive linux-2.4.3-ac2.orig/arch/arm/def-configs/assabet 
linux-2.4.3-ac2/arch/arm/def-configs/assabet
--- linux-2.4.3-ac2.orig/arch/arm/def-configs/assabet   Mon Nov 27 19:07:59 2000
+++ linux-2.4.3-ac2/arch/arm/def-configs/assabet  Wed Apr  4 15:40:15 2001
@@ -566,4 +566,5 @@
 CONFIG_DEBUG_USER=y
 # CONFIG_DEBUG_INFO is not set
 # CONFIG_MAGIC_SYSRQ is not set
+CONFIG_PRINTK_BUF_LEN=16
 # CONFIG_DEBUG_LL is not set
diff -u --new-file --recursive linux-2.4.3-ac2.orig/arch/arm/def-configs/brutus 
linux-2.4.3-ac2/arch/arm/def-configs/brutus
--- linux-2.4.3-ac2.orig/arch/arm/def-configs/brutus  Mon Nov 27 19:07:59 2000
+++ linux-2.4.3-ac2/arch/arm/def-configs/brutus Wed Apr  4 15:40:25 2001
@@ -293,4 +293,5 @@
 CONFIG_DEBUG_USER=y
 CONFIG_DEBUG_INFO=y
 # CONFIG_MAGIC_SYSRQ is not set
+CONFIG_PRINTK_BUF_LEN=16
 # CONFIG_DEBUG_LL is not set
diff -u --new-file --recursive linux-2.4.3-ac2.orig/arch/arm/def-configs/cerf 
linux-2.4.3-ac2/arch/arm/def-configs/cerf
--- linux-2.4.3-ac2.orig/arch/arm/def-configs/cerf  Mon Nov 27 19:07:59 2000
+++ linux-2.4.3-ac2/arch/arm/def-configs/cerf   Wed Apr  4 15:40:35 2001
@@ -431,4 +431,5 @@
 CONFIG_DEBUG_USER=y
 # CONFIG_DEBUG_INFO is not set
 # CONFIG_MAGIC_SYSRQ is not set
+CONFIG_PRINTK_BUF_LEN=16
 # CONFIG_DEBUG_LL is not set
diff -u --new-file --recursive linux-2.4.3-ac2.orig/arch/arm/def

Re: vmalloc on 2.4.x on ia64

2001-04-04 Thread Andi Kleen

On Wed, Apr 04, 2001 at 06:11:32PM -0600, [EMAIL PROTECTED] wrote:
> I am calling during initialization only from detect() entry point.
> But I guess, before the detect() is called, scsi layer acquires
> the io_request_lock. So, you mean to say that I need to release it
> before calling vmalloc() ? I was doing the same thing on 2.2.x
> and even on 2.4.0 and it was working fine and now suddenly
> it stopped working on 2.4.2. So what are the guidelines for using
> vmalloc() if we want to use it in scsi low-level (HBA) driver ?
> I am currently using the new error handling code. (use_new_eh_code = TRUE).

It probably never worked correctly in all cases.

If you don't rely on the synchronization given by the io_request_lock you can
drop it around the vmalloc() call. 


-Andi



RE: vmalloc on 2.4.x on ia64

2001-04-04 Thread hiren_mehta

I am calling during initialization only from detect() entry point.
But I guess, before the detect() is called, scsi layer acquires
the io_request_lock. So, you mean to say that I need to release it
before calling vmalloc() ? I was doing the same thing on 2.2.x
and even on 2.4.0 and it was working fine and now suddenly
it stopped working on 2.4.2. So what are the guidelines for using
vmalloc() if we want to use it in scsi low-level (HBA) driver ?
I am currently using the new error handling code. (use_new_eh_code = TRUE).

Regards,
-hiren

> -Original Message-
> From: Alan Cox [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, April 04, 2001 5:03 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: Re: vmalloc on 2.4.x on ia64
> 
> 
> > Can we call vmalloc() or get_free_pages() from scsi 
> low-level driver 
> > (HBA driver) ? The reason why I am asking is because, I am calling
> 
> It depends where. You can call it during initialisation if 
> you arent holding
> the io_request_lock for example.
> 



Re: Contacts within AMD? AMD-756 USB host-controller blacklisted dueto

2001-04-04 Thread Thomas Dodd

David Brownell wrote:
> > please correct me if i'm wrong i only don't want to blacklist complete
> > chipset-series
> 
> Then feel free to develop and submit a better fix.  That'd
> be more practical if AMD's workaround were public.  As I
> understand it, the bulk of the production chips have this
> erratum.  More power to RedHat getting info from AMD.
> Meanwhile, this patch improves robustness.

Compromise?

This patch makes it a config option to enable the AMD-756.
It's marked DANGEROUS and EXPERIMENTAL, and is only
available if CONFIG_EXPERIMENTAL is set.

This makes the default to blacklist the AMD-756
but it can be used if one wants to try.

-Thomas

diff -u --new-file --recursive linux-2.4.3-ac2.orig/drivers/usb/Config.in 
linux-2.4.3-ac2/drivers/usb/Config.in
--- linux-2.4.3-ac2.orig/drivers/usb/Config.in  Wed Apr  4 15:23:13 2001
+++ linux-2.4.3-ac2/drivers/usb/Config.in   Wed Apr  4 16:13:52 2001
@@ -24,6 +24,9 @@
   dep_tristate '  UHCI Alternate Driver (JE) support' CONFIG_USB_UHCI_ALT 
$CONFIG_USB
fi
dep_tristate '  OHCI (Compaq, iMacs, OPTi, SiS, ALi, ...) support' CONFIG_USB_OHCI 
$CONFIG_USB
+   if [ "$CONFIG_EXPERIMENTAL" = "y" ]; then
+  bool '  AMD-756 OHCI support (DANGEROUS)(EXPERIMENTAL)' CONFIG_AMD_OHCI_OK
+   fi

comment 'USB Device Class drivers'
dep_tristate '  USB Audio support' CONFIG_USB_AUDIO $CONFIG_USB $CONFIG_SOUND
diff -u --new-file --recursive linux-2.4.3-ac2.orig/drivers/usb/usb-ohci.c 
linux-2.4.3-ac2/drivers/usb/usb-ohci.c
--- linux-2.4.3-ac2.orig/drivers/usb/usb-ohci.c Wed Apr  4 15:23:15 2001
+++ linux-2.4.3-ac2/drivers/usb/usb-ohci.c  Wed Apr  4 16:18:01 2001
@@ -2332,13 +2332,14 @@
unsigned long mem_resource, mem_len;
void *mem_base;

+#ifndef CONFIG_AMD_OHCI_OK
/* blacklisted hardware? */
if (id->driver_data) {
info ("%s (%s): %s", dev->slot_name,
dev->name, (char *) id->driver_data);
return -ENODEV;
}
-
+#endif
if (pci_enable_device(dev) < 0)
return -ENODEV;

@@ -2508,6 +2509,7 @@
 * AMD-756 [Viper] USB has a serious erratum when used with
 * lowspeed devices like mice; oopses have been seen.  The
 * vendor workaround needs an NDA ... for now, blacklist it.
+* Use CONFIG_AMD_OHCI_OK to try anyway.
 */
vendor: 0x1022,
device: 0x740c,




MPEG-2 decoding driver for Trident Cyberblade i7

2001-04-04 Thread Michael Shiloh

Hello,

Searching on the web, I see a few question but no answers
to the question of whether a driver exists that can utilize
the MPEG-2 hardware assist feature of the Trident
Cyberblade i7.

Any pointers?

Thanks
Michael




Re: vmalloc on 2.4.x on ia64

2001-04-04 Thread Alan Cox

> Can we call vmalloc() or get_free_pages() from scsi low-level driver 
> (HBA driver) ? The reason why I am asking is because, I am calling

It depends where. You can call it during initialisation if you aren't holding
the io_request_lock for example.




Re: Contacts within AMD? AMD-756 USB host-controller blacklisted due to

2001-04-04 Thread Ryan Butler

Miles Lane wrote:

> 
> 
> Personally, I agree with you, but I can also understand David's
> desire to avoid wasting time chasing phantom bugs that only
> show up due to this broken hardware.  If it turns out that
> there is actually a well-defined workaround that AMD will
> tell us about, it shouldn't take too long before we have a
> real fix and the AMD-756 can be taken off of the blacklist.
> 
> My guess is that there are specific drivers for which this
> hardware bug causes problems.  You probably just aren't
> using the *right* drivers.  :-)
> 
00:07.4 USB Controller: Advanced Micro Devices [AMD] AMD-756 [Viper] USB 
(rev 06) (prog-if 10 [OHCI])
   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- 
Stepping- SERR+ FastB2B-
   Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR-



RE: vmalloc on 2.4.x on ia64

2001-04-04 Thread hiren_mehta

Can we call vmalloc() or get_free_pages() from scsi low-level driver 
(HBA driver) ? The reason why I am asking is because, I am calling
vmalloc from scsi low-level driver and I tried this on 2.4.2 on
ia32 as well as ia64 and on both the systems, it is hanging.
On ia64 it happens every time, whereas on ia32 it happens intermittently.
In case of ia32, I had watchdog enabled. So, on ia32, it detects LOCKUP
and generates call trace. I am yet to try get_free_pages().

TIA,
-hiren

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> Sent: Monday, April 02, 2001 6:30 PM
> To: [EMAIL PROTECTED]
> Subject: RE: vmalloc on 2.4.x on ia64
> 
> 
> > That is what I said. I am using vmalloc only. But the call to
> > vmalloc is hanging.
> 
> Oops, my mistake.  a) that shouldn't happen.  b) if it does, try
> get_free_pages().
> 



2.2.19 borks am-utils building :-(

2001-04-04 Thread Andre Hedrick


The subject says it all

Andre Hedrick
Linux ATA Development




Re: pcnet32 (maybe more) hosed in 2.4.3

2001-04-04 Thread Petr Vandrovec

Thomas Bogendoerfer wrote:
> 
> On Wed, Apr 04, 2001 at 01:14:16PM -0700, Petr Vandrovec wrote:
> > VMware is working on implementation PCnet 32bit mode in emulation (there
> > is no such thing now because of no OS except FreeBSD needs it). But
> > my question is - is there some real benefit in running chip in
> > 32bit mode?
> 
> probably not.
> 
> > so is 32bit mode needed for bigendian ports, or what's reasoning
> > behind it?
> 
> I've added 32bit mode for some IBM PowerPC machines. The firmware
> on these machines sets up the chip for DWIO and I haven't found a way
> to switch it back to WIO.

Current Linux driver switches them to 16bit mode in pcnet_probe1:

pcnet_dwio_reset(); // reset to 16bit mode when in 32bit, ignore in 16bit mode
pcnet_wio_reset();  // device is for sure in 16bit mode, but reset it again to
                    // get it into known state if we were in 16bit mode already

So you should find hardware always in 16bit mode at this point. If it
does not work, maybe you need to xor PCNET32_WIO_* values with 2 on
PowerPC...
Best regards,
Petr Vandrovec
[EMAIL PROTECTED]



Re: Underscore in rivafb

2001-04-04 Thread Petr Vandrovec

Stuart McFadden wrote:
> 
> Hi,
>The flashing block in rivafb was annoying me, so here is a diff (against
> vanilla 2.4.3 ) of a quick hack in case anyone else was having the same problem.

Take a look at
drivers/video/matrox/matroxfb_misc:matroxfb_createcursorshape and
its callers, matroxfb_*_createcursor. If you use conp->vc_cursor_type,
the standard escape sequences for disabling the cursor and for shape
selection will work on riva then...
Petr Vandrovec
[EMAIL PROTECTED]



[PATCH] - USB/Vendor ID update, nothing earthshaking..

2001-04-04 Thread Nikhil Goel


but shall be helpful for a few thousand users of the device.

Thanks,
Nikhil

--  Nikhil Goel <[EMAIL PROTECTED]>
10, Anson Road, International Plaza, #22-14, Singapore 079903
Ph +65 3240210 Ext 20, Fx +65 3240607

A good reputation is more valuable than money.
-- Publilius Syrus






--- drivers/usb/pegasus.h.orig  Mon Jan  1 12:54:29 2001
+++ drivers/usb/pegasus.h   Mon Jan  1 13:06:38 2001
@@ -140,7 +140,7 @@
 #define VENDOR_MELCO            0x0411
 #define VENDOR_SMC  0x0707
 #define VENDOR_SOHOWARE 0x15e8
-
+#define VENDOR_SMARTBRIDGES     0x08d1
 
 #else  /* PEGASUS_DEV */
 
@@ -193,6 +193,7 @@
DEFAULT_GPIO_RESET )
 PEGASUS_DEV( "SOHOware NUB100 Ethernet", VENDOR_SOHOWARE, 0x9100,
DEFAULT_GPIO_RESET )
-
+PEGASUS_DEV( "smartNIC 2 PnP Adapter", VENDOR_SMARTBRIDGES, 0x0003,
+   DEFAULT_GPIO_RESET | PEGASUS_II )
 
 #endif /* PEGASUS_DEV */



Re: Contacts within AMD? AMD-756 USB host-controller blacklisted dueto

2001-04-04 Thread David Brownell

> Apr  4 14:47:15 campari kernel: usb-ohci.c: bogus NDP=204 for OHCI 
> usb-00:07.4
> Apr  4 14:47:15 campari kernel: usb-ohci.c: rereads as NDP=4

Means that your system would have oopsed if it hadn't
tested for the bogus register read (NDP).  That's only one
path; other bogus reads (which could also oops) on other
paths are undetected.  Slightly less-bogus reads on that
particular path may not be detected, and can still oops.
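
A rough user-space sketch of that kind of sanity check, under stated assumptions: the port count (NDP) is taken to be the low byte of the register, and 15 is assumed as the largest plausible value. read_ndp_checked(), the callback type, and flaky_read() are hypothetical names for illustration, not the code in usb-ohci.c.

```c
#define SKETCH_MAX_ROOT_PORTS 15   /* assumed upper bound for a sane NDP */

/* Reader callback standing in for a register read; hypothetical. */
typedef unsigned int (*reg_read_fn)(void *ctx);

/* Read the port count, retrying a bounded number of times while the
 * value is implausible. Returns the port count, or -1 if no read
 * was plausible. */
static int read_ndp_checked(reg_read_fn read_reg, void *ctx, int max_tries)
{
    int i;

    for (i = 0; i < max_tries; i++) {
        unsigned int ndp = read_reg(ctx) & 0xff;   /* NDP: low byte */

        if (ndp >= 1 && ndp <= SKETCH_MAX_ROOT_PORTS)
            return (int)ndp;
    }
    return -1;
}

/* Fake register for demonstration: first read is garbage, later reads are 4. */
static unsigned int flaky_read(void *ctx)
{
    int *calls = ctx;

    return (*calls)++ == 0 ? 204u : 4u;
}
```

As noted above, a check like this only covers one register on one path, and a wrong value that happens to fall inside the plausible range still gets through.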

> please correct me if i'm wrong i only don't want to blacklist complete
> chipset-series

Then feel free to develop and submit a better fix.  That'd
be more practical if AMD's workaround were public.  As I
understand it, the bulk of the production chips have this
erratum.  More power to RedHat getting info from AMD.
Meanwhile, this patch improves robustness.

- Dave





Re: [BUG] smbfs: caching problems

2001-04-04 Thread Urban Widmark

On Sun, 1 Apr 2001, Xuan Baldauf wrote:

> there is something wrong with smbfs caching which makes my
> applications fail. The behaviour happens with
> linux-2.4.3-pre4 and linux-2.4.3-final.
> 
> Consider following shell script: (where /mnt/n is a
> smbmounted smb share from a Win98SE box)

Try the attached patch, as a workaround.

Not really sure what is happening, but it seems like win98se isn't
updating the filesize immediately (?).

After truncating the file to 0 bytes the server still returns the old size
(516) when asked (smb_proc_getattr). Somewhere that causes something to
keep the pages for the file (smb_revalidate?) or simply be confused on the
length of the file (508).

I don't understand how as vmtruncate should have thrown out the old stuff
already ... maybe the same page is reused and the last bytes (that
shouldn't be in the file) remain from the last write.

It works with NT4 and Samba, they both return the expected 0 bytes after
truncating to 0. refresh = 0 will not ask and instead run with the 0 byte
length that vmtruncate has set.

/Urban


diff -urN -X exclude linux-2.4.3-orig/fs/smbfs/inode.c 
linux-2.4.3-smbfs/fs/smbfs/inode.c
--- linux-2.4.3-orig/fs/smbfs/inode.c   Sat Mar 31 19:11:53 2001
+++ linux-2.4.3-smbfs/fs/smbfs/inode.c  Thu Apr  5 00:32:07 2001
@@ -234,9 +234,10 @@
last_sz   = inode->i_size;
error = smb_refresh_inode(dentry);
if (error || inode->i_mtime != last_time || inode->i_size != last_sz) {
-   VERBOSE("%s/%s changed, old=%ld, new=%ld\n",
+   VERBOSE("%s/%s changed, old=%ld, new=%ld, osz=%ld, sz=%ld\n",
DENTRY_PATH(dentry),
-   (long) last_time, (long) inode->i_mtime);
+   (long) last_time, (long) inode->i_mtime,
+   (long) last_sz, (long) inode->i_size);
 
if (!S_ISDIR(inode->i_mode))
invalidate_inode_pages(inode);
@@ -550,7 +551,7 @@
if (error)
goto out;
vmtruncate(inode, attr->ia_size);
-   refresh = 1;
+   refresh = 0;
}
 
/*



Re: a quest for a better scheduler

2001-04-04 Thread Christopher Smith

--On Wednesday, April 04, 2001 15:16:32 -0700 Tim Wright <[EMAIL PROTECTED]> 
wrote:
> On Wed, Apr 04, 2001 at 03:23:34PM +0200, Ingo Molnar wrote:
>> nope. The goal is to satisfy runnable processes in the range of NR_CPUS.
>> You are playing word games by suggesting that the current behavior
>> prefers 'low end'. 'thousands of runnable processes' is not 'high end'
>> at all, it's 'broken end'. Thousands of runnable processes are the sign
>> of a broken application design, and 'fixing' the scheduler to perform
>> better in that case is just fixing the symptom. [changing the scheduler
>> to perform better in such situations is possible too, but all solutions
>> proposed so far had strings attached.]
>
> Ingo, you continue to assert this without giving much evidence to back it
> up. All the world is not a web server. If I'm running a large OLTP
> database with thousands of clients, it's not at all unreasonable to
> expect periods where several hundred (forget the thousands) want to be
> serviced by the database engine. That sounds like hundreds of schedulable
> entities be they processes or threads or whatever. This sort of load is
> regularly run on machines with 16-64 CPUs.

Actually, it's not just OLTP; any time you are doing time sharing between 
hundreds of users (something POSIX systems are supposed to be good at), this 
will happen.

> Now I will admit that it is conceivable that you can design an
> application that finds out how many CPUs are available, creates threads
> to match that number and tries to divvy up the work between them using
> some combination of polling and asynchronous I/O etc. There are, however,
> a number of problems with this approach:

Actually, one way to semi-support this approach is to implement 
many-to-many threads as per the Solaris approach. This also requires 
significant hacking of both the kernel and the runtime, and certainly is 
significantly more error prone than trying to write a flexible scheduler.

One problem you didn't highlight that even the above case does not happily 
identify is that for security reasons you may very well need each user's 
requests to take place in a different process. If you don't, then you have 
to implement a very well tested and secure user-level security mechanism to 
ensure things like privacy (above and beyond the time-sharing).

The world is filled with a wide variety of types of applications, and 
unless you know two programming approaches are functionally equivalent (and 
event driven/polling I/O vs. tons of running processes are NOT), you 
shouldn't say one approach is "broken". You could say it's a "broken" 
approach to building web servers. Unfortunately, things like kernels and 
standard libraries should work well in the general case.

--Chris



Re: [PATCH] Revised memory-management stuff (was: OOM killer)

2001-04-04 Thread Andreas Dilger

Jonathan Morton writes:
> MAJOR: OOM killer now only activates when truly out of memory, ie. when
> buffer and cache memory has already been eaten down to the bone.

Good.

> MEDIUM: IOW, if the allocating process is already 4x the size of the
> remaining free memory, reservation of more memory (by fork(), malloc()
> or related calls) will fail.

I'm not sure I follow this one.  Granted it punishes larger programs,
but is this really good?  If I read it correctly, it essentially means
that it is impossible for a single process to use > 80% of the VM.  For
some types of applications (e.g. Oracle and such) which are implemented
as a number of separate processes this is _probably_ OK (it would fail
if the first process does all of the allocation before forking), but what
about monolithic apps which want to use the whole VM space (e.g. the
numerical methods program that was mentioned at the start of the OOM thread)?

It also totally breaks applications which malloc huge amounts of memory
but only sparsely use this memory (again mostly scientific apps do this).
I guess it also depends on whether 4x is for memory "allocated" or for
memory "used".  It should probably be directly tied to vm-overcommit flag.

In most cases this will probably work OK (reserve part of VM for other
processes), but why introduce a new VM parameter that needs to be tuned?
I agree it is better to return NULL from malloc() rather than invoking
OOM, but I think the "4x" heuristic needs to be looked at more closely
or changed.  It may also cause problems if your box is at the edge of
OOM and malloc fails for bash (or other "small" program) because the
amount of remaining VM is very small (so 4x "small" is still not enough
for bash to run, and root to fix the system).
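
To make the 80% figure concrete: if P is the process size, F the free
memory and the process is the only user of a total T = P + F, then the
rule "fail when P >= 4F" becomes P >= 4(T - P), i.e. P >= 0.8T.  A small
sketch of the rule as quoted (not the code from the patch; the function
names and the exact ">=" boundary are assumptions):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical restatement of the "4x" rule: a reservation is refused
 * once the process is already at least 4 times the size of the remaining
 * free memory. Units are arbitrary (pages, kB) but must match. */
static bool reservation_allowed(unsigned long proc_size,
                                unsigned long free_mem)
{
    return proc_size < 4 * free_mem;
}

/* Largest size a single process can reach out of `total` units of VM if
 * it grows one unit at a time and nothing else uses memory. */
static unsigned long max_single_process(unsigned long total)
{
    unsigned long size = 0;
    while (size < total && reservation_allowed(size, total - size))
        size++;
    return size;
}
```

With total = 1000 the loop stops at 800.  The "small program" case is the
same comparison: a 2000 kB shell with 400 kB free is refused, since 2000
is not below 4 * 400.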

> MEDIUM: The OOM killer algorithm has been reworked to be a little more
> intelligent by default, and also now allows the sysadmin to specify PIDs
> and/or process names which should be left untouched.  Simply echo a
> space-delimited list of PIDs and/or process names into

If you allow process names into the picture, it opens an easy DoS attack.
A memory hog simply runs under one of the "protected" names and is
immune from being killed, but causes every other process on the box to
die.  I'm pretty sure this idea was suggested and previously shot down
at least once.

It should be easy enough to write a user tool (script even) which outputs
all PIDs of process X, and limits this list to the current (or specified)
UID.  The OOM-unkillable trait would be stored as a per-process flag, rather
than a list to be checked against.  Not only does this make for faster
checking (O(1) vs. O(n)), but it also means that we don't have stale
OOM-unkillable entries in the list.  The non-OOM trait would be inherited
across fork() (but cleared on set*uid(), or maybe it should be a capability)
so that processes (e.g. httpd) which spawn/kill helper tasks do not have
to keep updating a list.  This also prevents the situation where PID X is
protected from OOM, but is stopped and later another process takes its PID.
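
A minimal sketch of the per-process flag idea, assuming a made-up flag
bit and a toy task structure (PF_OOM_EXEMPT, struct task and the helper
functions are hypothetical, not existing kernel names):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit; not an existing kernel flag. */
#define PF_OOM_EXEMPT 0x1u

struct task {
    int pid;
    int uid;
    unsigned flags;
};

/* O(1) check: no list of PIDs or names to walk. */
static bool oom_may_kill(const struct task *t)
{
    return !(t->flags & PF_OOM_EXEMPT);
}

/* fork(): the child copies the parent's flags, so helpers spawned by a
 * protected daemon are protected without any bookkeeping. */
static struct task task_fork(const struct task *parent, int child_pid)
{
    struct task child = *parent;
    child.pid = child_pid;
    return child;
}

/* set*uid(): changing identity drops the exemption. */
static void task_setuid(struct task *t, int uid)
{
    t->uid = uid;
    t->flags &= ~PF_OOM_EXEMPT;
}

/* A new task starts with clear flags, so it cannot pick up protection
 * just by being handed a recycled PID. */
static struct task task_new(int pid, int uid)
{
    struct task t = { pid, uid, 0 };
    return t;
}
```

Because the trait lives in the task itself, it disappears when the task
exits; there is no separate list that can go stale.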


All in all, I think having such a huge patch basically guarantees it will
not make it into the kernel.  IMHO, it is better to split this into at
least 3 or 4 patches so that it is manageable to see what is being changed
for each part.

Cheers, Andreas

PS - I don't think the dentry/inode slab caches are currently shrunk even
 under VM pressure - there was a thread on this recently about these
 caches filling up the memory on a 1GB machine.  However, a patch to
 fix this was also posted to this list recently.

PPS - can you try and keep comments within 80 columns?
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/   -- Dogbert



Re: uninteruptable sleep (D state => load_avrg++)

2001-04-04 Thread Tim Wright

On Wed, Apr 04, 2001 at 02:13:49PM +0200, christophe barbe wrote:
> The sleep should certainly be interruptible and that's what I said to the GFS guy.
> But what is the reason to increment the load average for each D process ?
> 

OK, the Unix history goes something like this. Synchronization was achieved
using two primitives, sleep() and wakeup(). These guys rendezvous'd on a
wait channel, which was simply an 'int', and by convention was actually the
address of a data structure (yes I know int and pointers aren't the same, this
is a long time ago, OK ? :-).
Anyway, when you called sleep, you also had an associated priority. Priority
values less than PZERO were "high" priority, and >= PZERO were "low" priority.
Sleeping above PZERO was interruptible, and processes sleeping at this priority
did not count towards the load. The idea was to use this for events that
potentially might never happen. Sleeping at a priority < PZERO was intended
to be used for things that are absolutely 100% guaranteed to happen, preferably
sometime very soon. Disk I/O (real disks, not NFS) fell into this category,
and hence it counts towards the load since this could be deemed a "fast wait"
state, and the process is nominally runnable. All a bit hand-wavy I know, but
it worked well enough.
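
The accounting rule described above can be sketched like this. It is an
illustration only, not code from any Unix kernel: the PZERO value, the
struct and the function names are placeholders, and ">= PZERO" is taken
as the interruptible side, following the high/low split given above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder threshold; only the comparison against it matters here. */
#define PZERO 25

enum proc_state { PROC_RUNNABLE, PROC_SLEEPING };

struct proc {
    enum proc_state state;
    int sleep_pri;   /* priority passed to sleep(); unused when runnable */
};

/* A sleep below PZERO cannot be interrupted by a signal. */
static bool sleep_is_interruptible(const struct proc *p)
{
    return p->state == PROC_SLEEPING && p->sleep_pri >= PZERO;
}

/* Runnable processes and "fast wait" sleepers (priority < PZERO) count
 * towards the load; interruptible sleepers do not. */
static bool counts_toward_load(const struct proc *p)
{
    return p->state == PROC_RUNNABLE ||
           (p->state == PROC_SLEEPING && p->sleep_pri < PZERO);
}

/* One load sample: how many processes are nominally runnable right now. */
static size_t load_sample(const struct proc *procs, size_t n)
{
    size_t load = 0;
    for (size_t i = 0; i < n; i++)
        if (counts_toward_load(&procs[i]))
            load++;
    return load;
}
```

A process stuck in an uninterruptible sleep on an event that never
arrives keeps counting in every sample, which is how a hung D-state
process pushes the load average up.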

The really important part of all this is that you should never sleep
uninterruptibly for anything that you cannot absolutely guarantee will happen,
otherwise you wind up with a stuck process.

Regards,

Tim


-- 
Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
IBM Linux Technology Center, Beaverton, Oregon
Interested in Linux scalability ? Look at http://lse.sourceforge.net/
"Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI



Re: Contacts within AMD? AMD-756 USB host-controller blacklisted due to

2001-04-04 Thread Miles Lane

Joachim 'roh' Steiger wrote:

> i would like to help to track down this problem
> i'm using a gigabyte 7IXE revision 1.1
> kernel is 2.4.1
> 
> lspci output for usb:
> 00:07.4 USB Controller: Advanced Micro Devices [AMD] AMD-756 [Viper] USB (rev 06) (prog-if 10 [OHCI])
> Flags: bus master, medium devsel, latency 16, IRQ 11
> Memory at efffc000 (32-bit, non-prefetchable) [size=4K]
> 
> 
> On Wed, 4 Apr 2001, Miles Lane wrote:
> 
>> Thomas Dodd wrote:
>> 
>> 
>>> Alan Cox wrote:
>>> 
>>> 
>>>> because we dont know the full scope of the problem yet.
>>> 
>>> Exactly how many bug reports has this caused?
>>> What kind of problems?
>> 
> 
> here i only have this kernel message showing up in my logfiles about once
> a day:
> 
> Apr  4 14:47:15 campari kernel: usb-ohci.c: bogus NDP=204 for OHCI 
> usb-00:07.4
> Apr  4 14:47:15 campari kernel: usb-ohci.c: rereads as NDP=4
> 
> 
>> error."  Most of the time, when the error occurs, it seems
>> pretty benign.  That is, I haven't noticed it crashing USB
>> device connections, causing data corruption or OOPSen.
>> Some folks _have_ reported OOPSen, though, that seemed to
>> be triggered by the erratum #4 hardware bug.  I think I
>> may have had one of these a long time ago.
> 
> 
> as you see it's revision 6
> i've had no other problems with usb for now and use this
>  idVendor   0x046d Logitech Inc.
>  idProduct  0xc00c 
> usb-wheelmouse all the time
> 
> i've never had this kernel or previous kernel (2.4.0test8) oopsen 
> and it runs perfectly stable here
> 
> 
>> I believe David has found that there definitely are code
>> paths where this hardware bug can cause failures of various
>> sorts and that's why the AMD-756 has been blacklisted.
> 
> 
> since it didn't cause any trouble here i would not like to have the
> complete AMD-756 blacklisted in the ohci-driver
> possibly only some revisions are that bad
> 
> please correct me if i'm wrong i only don't want to blacklist complete
> chipset-series

Hi Joachim,

Personally, I agree with you, but I can also understand David's
desire to avoid wasting time chasing phantom bugs that only
show up due to this broken hardware.  If it turns out that
there is actually a well-defined workaround that AMD will
tell us about, it shouldn't take too long before we have a
real fix and the AMD-756 can be taken off of the blacklist.

My guess is that there are specific drivers for which this
hardware bug causes problems.  You probably just aren't
using the *right* drivers.  :-)

Luckily, USB add-on cards are pretty cheap, so I suppose you
could just put a new host-controller in your test machine
for a month or two until David and Alan get this sorted out
with AMD.  Think of it this way, you'll have more hardware
configurations to test with, so get a UHCI or EHCI card.
Woohoo!  (Only half kidding)

Miles




Re: pcnet32 (maybe more) hosed in 2.4.3

2001-04-04 Thread Thomas Bogendoerfer

On Wed, Apr 04, 2001 at 01:14:16PM -0700, Petr Vandrovec wrote:
> VMware is working on implementing PCnet 32bit mode in emulation (there
> is no such thing now because no OS except FreeBSD needs it). But
> my question is - is there some real benefit in running the chip in
> 32bit mode?

probably not.

> so is 32bit mode needed for bigendian ports, or what's reasoning
> behind it?

I've added 32bit mode for some IBM PowerPC machines. The firmware
on these machines sets the chip up in DWIO mode and I haven't found
a way to switch it back to WIO.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea. [ Alexander Viro on linux-kernel ]



Underscore in rivafb

2001-04-04 Thread Stuart McFadden

Hi,
   The flashing block in rivafb was annoying me, so here is a diff (against 
vanilla 2.4.3 ) of a quick hack in case anyone else was having the same problem.

Stuarty,

diff -urN linux.pure/drivers/video/riva/fbdev.c linux/drivers/video/riva/fbdev.c
--- linux.pure/drivers/video/riva/fbdev.c   Wed Apr  4 22:34:19 2001
+++ linux/drivers/video/riva/fbdev.cWed Apr  4 22:26:43 2001
@@ -534,7 +534,7 @@
struct riva_cursor *c = rinfo->cursor;
int i, j, idx;
 
-   if (c) {
+   if (c) {
if (width <= 0 || height <= 0) {
width = 8;
height = 16;
@@ -547,13 +547,16 @@

idx = 0;
 
-   for (i = 0; i < height; i++) {
-   for (j = 0; j < width; j++,idx++)
-   c->image[idx] = CURSOR_COLOR;
-   for (j = width; j < MAX_CURS; j++,idx++)
+   for (i = MAX_CURS; i > height + 2; i--) 
+   for (j = 0; j < MAX_CURS; j++,idx++)
c->image[idx] = TRANSPARENT_COLOR;
+   for (i = height + 2; i > height; i--) {
+   for (j = 0; j < width; j++,idx++)
+   c->image[idx] = CURSOR_COLOR;
+   for (j = width; j < MAX_CURS;j++,idx++)
+   c->image[idx] = TRANSPARENT_COLOR;
}
-   for (i = height; i < MAX_CURS; i++)
+   for (i = height; i > 0; i--) 
for (j = 0; j < MAX_CURS; j++,idx++)
c->image[idx] = TRANSPARENT_COLOR;
}



-- 
Start the day with a smile.  After that you can be your nasty old self again.
 - - - - - - - - - - - - - - - - - - - - -  
Stuarty McFadden [EMAIL PROTECTED] 



Re: Contacts within AMD? AMD-756 USB host-controller blacklisted due to

2001-04-04 Thread Joachim 'roh' Steiger

i would like to help to track down this problem
i'm using a gigabyte 7IXE revision 1.1
kernel is 2.4.1

lspci output for usb:
00:07.4 USB Controller: Advanced Micro Devices [AMD] AMD-756 [Viper] USB (rev 06) (prog-if 10 [OHCI])
Flags: bus master, medium devsel, latency 16, IRQ 11
Memory at efffc000 (32-bit, non-prefetchable) [size=4K]


On Wed, 4 Apr 2001, Miles Lane wrote:
> Thomas Dodd wrote:
> 
> > Alan Cox wrote:
> >
> >> because we dont know the full scope of the problem yet.
> > Exactly how many bug reports has this caused?
> > What kind of problems?

here i only have this kernel message showing up in my logfiles about once
a day:

Apr  4 14:47:15 campari kernel: usb-ohci.c: bogus NDP=204 for OHCI 
usb-00:07.4
Apr  4 14:47:15 campari kernel: usb-ohci.c: rereads as NDP=4

> error."  Most of the time, when the error occurs, it seems
> pretty benign.  That is, I haven't noticed it crashing USB
> device connections, causing data corruption or OOPSen.
> Some folks _have_ reported OOPSen, though, that seemed to
> be triggered by the erratum #4 hardware bug.  I think I
> may have had one of these a long time ago.

as you see it's revision 6
i've had no other problems with usb for now and use this
 idVendor   0x046d Logitech Inc.
 idProduct  0xc00c 
usb-wheelmouse all the time

i've never had this kernel or previous kernel (2.4.0test8) oopsen 
and it runs perfectly stable here

> I believe David has found that there definitely are code
> paths where this hardware bug can cause failures of various
> sorts and that's why the AMD-756 has been blacklisted.

since it didn't cause any trouble here i would not like to have the
complete AMD-756 blacklisted in the ohci-driver
possibly only some revisions are that bad

please correct me if i'm wrong i only don't want to blacklist complete
chipset-series

roh
-- 
Joachim 'roh' Steiger  mailto:[EMAIL PROTECTED]
Convergence Integrated Media GmbH  http://www.convergence.de
Rosenthaler Str. 51fon: +49(0)30-72 62 06 77
10178 Berlin, Germany  fax: +49(0)30-72 62 06 55




Re: a quest for a better scheduler

2001-04-04 Thread Tim Wright

On Wed, Apr 04, 2001 at 03:23:34PM +0200, Ingo Molnar wrote:
> 
> On Wed, 4 Apr 2001, Hubertus Franke wrote:
> 
> > I understand the dilemma that the Linux scheduler is in, namely
> > satisfy the low end at all cost. [...]
> 
> nope. The goal is to satisfy runnable processes in the range of NR_CPUS.
> You are playing word games by suggesting that the current behavior prefers
> 'low end'. 'thousands of runnable processes' is not 'high end' at all,
> it's 'broken end'. Thousands of runnable processes are the sign of a
> broken application design, and 'fixing' the scheduler to perform better in
> that case is just fixing the symptom. [changing the scheduler to perform
> better in such situations is possible too, but all solutions proposed so
> far had strings attached.]
> 


Ingo, you continue to assert this without giving much evidence to back it up.
All the world is not a web server. If I'm running a large OLTP database with
thousands of clients, it's not at all unreasonable to expect periods where
several hundred (forget the thousands) want to be serviced by the database
engine. That sounds like hundreds of schedulable entities be they processes
or threads or whatever. This sort of load is regularly run on machines with
16-64 CPUs.

Now I will admit that it is conceivable that you can design an application that
finds out how many CPUs are available, creates threads to match that number
and tries to divvy up the work between them using some combination of polling
and asynchronous I/O etc. There are, however, a number of problems with this
approach:
1) It assumes that this is the only workload on the machine. If not it quickly
becomes sub-optimal due to interactions between the workloads. This is a
problem that the kernel scheduler does not suffer from.
2) It requires *every* application designer to design an effective scheduler
into their application. I would submit that an effective scheduler is better
situated in the operating system.
3) It is not a familiar programming paradigm to many Unix/Linux/POSIX
programmers, so you have a sort of impedance mismatch going on.
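
For reference, the "one thread per CPU" design under discussion starts
out roughly like this. It is a sketch with hypothetical helper names
(worker_count, worker_slice); thread creation and the actual I/O are
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Number of workers to create: one per online CPU, at least one.
 * _SC_NPROCESSORS_ONLN is a common extension (glibc and others), not
 * something every POSIX system is required to provide. */
static size_t worker_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (size_t)n : 1;
}

struct slice {
    size_t first, count;
};

/* Divide `items` pending requests as evenly as possible among `workers`
 * threads; worker `idx` handles the returned slice. */
static struct slice worker_slice(size_t items, size_t workers, size_t idx)
{
    assert(workers > 0 && idx < workers);
    size_t base = items / workers, extra = items % workers;
    struct slice s;
    s.first = idx * base + (idx < extra ? idx : extra);
    s.count = base + (idx < extra ? 1 : 0);
    return s;
}
```

The split is fixed at the moment it is computed. It knows nothing about
what else is running on the machine, which is problem 1) above, and
every application has to carry code like this, which is problem 2).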

Since the proposed scheduler changes being talked about here have been shown
to not hurt the "low end" and to dramatically improve the "high end", I fail
to understand the hostility to the changes. I can understand that you do not
feel that this is the correct way to architect an application, but if the
changes don't hurt you, why sabotage changes that also allow a different
method to work? There isn't one true way to do anything in computing.

Tim

-- 
Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
IBM Linux Technology Center, Beaverton, Oregon
Interested in Linux scalability ? Look at http://lse.sourceforge.net/
"Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI



2.2.19 hangs just before "Uncompressing Linux"

2001-04-04 Thread José Luis Domingo López

Hi kernel gurus & testers:

When booting from my recently compiled Linux kernel version 2.2.19, about
90-95% of the time the system hangs just after printing:
Loading nameofimage.

And doesn't get to the point where the kernel uncompresses (that is, no
Uncompressing message). However, booting the same kernel from floppy is OK
and works 100% of the time (the same applies for other kernel versions).

With 2.2.18 I've been experiencing the same behavior, but with this
version instead of hanging, the machine reboots itself, and "only" about
50% of the time. Also I discovered that pressing some keyboard keys while the
kernel is being loaded makes it more probable to get a successful boot.

With kernel version 2.2.17-idepci as shipped on Debian Potato, I've never
experienced any kind of problems.

Searching through the mailing list archives I've seen comments about
possibly defective RAM causing this problem. However, the system has never
hung once it is up and running, and this machine is nearly always using
all of my RAM and working at high workloads.

Information about related important hardware and software follows:
Debian Woody with gcc 2.95.3-5

cat /proc/cpuinfo (166 MHz Pentium @ 166 MHz)
-
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 5
model           : 2
model name      : Pentium 75 - 200
stepping        : 12
cpu MHz         : 165.793
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : yes
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8
bogomips        : 330.95

cat /proc/pci
-
00:00.0 Host bridge: Intel Corporation 430HX - 82439HX TXC [Triton II] (rev 03)
Flags: bus master, medium devsel, latency 32

00:07.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] (rev 01)
Flags: bus master, medium devsel, latency 0

00:07.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] (prog-if 80 [Master])
Flags: bus master, medium devsel, latency 32
I/O ports at f000

dmesg
-
Linux version 2.2.19 (root@dardhal) (gcc version 2.95.3 20010219 (prerelease)) #1 sáb mar 31 23:19:47 UTC 2001
BIOS-provided physical RAM map:
 BIOS-e820: 0009fc00 @  (usable)
 BIOS-e820: 0400 @ 0009fc00 (usable)
 BIOS-e820: 03f0 @ 0010 (usable)
Detected 165793 kHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 330.95 BogoMIPS
Memory: 63380k/65536k available (832k kernel code, 416k reserved, 864k data, 44k init)
Dentry hash table entries: 8192 (order 4, 64k)
Buffer cache hash table entries: 65536 (order 6, 256k)
Page cache hash table entries: 16384 (order 4, 64k)
VFS: Diskquotas version dquot_6.4.0 initialized
CPU: Intel Pentium 75 - 200 stepping 0c
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Checking 'hlt' instruction... OK.
Intel Pentium with F0 0F bug - workaround enabled.
POSIX conformance testing by UNIFIX
PCI: PCI BIOS revision 2.10 entry at 0xfb230
PCI: Using configuration type 1
PCI: Probing PCI hardware
Linux NET4.0 for Linux 2.2
Based upon Swansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
TCP: Hash tables configured (ehash 65536 bhash 65536)
Initializing RT netlink socket
Starting kswapd v 1.5 
Serial driver version 4.27 with no serial options enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
pty: 256 Unix98 ptys configured
PIIX3: IDE controller on PCI bus 00 dev 39
PIIX3: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:pio, hdb:pio
hda: QUANTUM BIGFOOT2550A, ATA DISK drive
hdb: CDA46803I, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: QUANTUM BIGFOOT2550A, 2457MB w/87kB Cache, CHS=624/128/63, DMA
hdb: ATAPI 4X CD-ROM drive, 240kB Cache
Uniform CD-ROM driver Revision: 3.11
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
Partition check:
 hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 >

-- 
José Luis Domingo López
Linux Registered User #189436 Debian GNU/Linux Potato (P166 64 MB RAM)
 
jdomingo EN internautas PUNTO org  => ¿ Spam ? Atente a las consecuencias
jdomingo AT internautas DOT   org  => Spam at your own risk




Re: memory size detection problem on 2.3.16+ and 2.4.x

2001-04-04 Thread Michael Miller

Hi,

>> the bios will set the carry flag on the return from the call should 
>> there be an error.  However, the BIOS on my PC doesnt do this- infact 
>> it seems to simply return from the call without changing any registers. 
>
> Your BIOS is faulty. No new suprises. 

Hmm- not as wrong as I had at first thought...

>
>>  meme801: 
>> + xorl%edx, %edx  # Clear regs to work around 
>> + xorl%ecx, %ecx  # flakey BIOSes which don't 
>> + # use carry bit correctly 
>> + # This way we get 0MB ram on 
>> + # call failure 
>
>Wouldn't setting the carry flag be clearer ? 

Yup- so I gave it a try and I was surprised that my OOPs happened again. Turns
out (after doing loads of tests using ms debug!!!) that my BIOS does clear
the carry flag (correctly!) and sets the memory size in AX and BX as opposed
to CX and DX (correct? possibly) which is what the Linux kernel uses.  The
crunch which got my machine, was that CX and DX are returned unchanged.  Thus
my xorl above was making the problem go away.

I looked on various sites (including Grub and Ralf Brown(?) interrupt lists)
and it seems that AX and BX are for 'extended memory' and CX/DX are for
'configured' memory...  No one seemed clear on the difference.  Since CX/DX
are currently being used by the kernel, I thought I would make my patch
have minimal impact on users who are currently working fine.

As a result I have a second patch, which I would like to propose for
kernel addition, below.  This patch basically sets cx/dx to 0x0 before
the e801 call and then tests to see if they are still both 0, if so
ax/bx are used instead.  Obviously the carry test for success is still
in place.
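
In C terms, the decision the patch makes looks roughly like the sketch
below.  The struct and function are made up for illustration (the real
code is the assembly in the patch), and adding the 1k-unit register to
the scaled 64k-unit register is an assumption based on the usual
description of int 15h/e801h; the hunk below only shows DX being scaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register values as left by the BIOS after int 0x15 with ax=0xe801. */
struct e801_regs {
    bool carry;
    uint16_t ax, bx, cx, dx;
};

/* Returns extended memory in kB, or -1 if the call failed and the caller
 * should fall back to the older 0x88 method. cx and dx are assumed to
 * have been zeroed before the call, as in the patch. */
static long e801_extended_kb(struct e801_regs r)
{
    if (r.carry)
        return -1;
    if (r.cx == 0 && r.dx == 0) {   /* BIOS left CX/DX untouched */
        r.cx = r.ax;
        r.dx = r.bx;
    }
    /* dx counts 64k chunks, cx counts 1k chunks */
    return (long)r.dx * 64 + r.cx;
}
```

A BIOS that fills in only AX/BX and one that fills in only CX/DX give
the same answer this way, and a BIOS that sets carry still ends up on
the old path.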

Many thanks for suggesting an alternate solution; although it did not
solve the problem (due to my original wrong info), it did result in
what I consider to be a better solution...

Mike



--- linux-2.4.2-orig/arch/i386/boot/setup.S Sat Jan 27 18:51:35 2001
+++ linux/arch/i386/boot/setup.SWed Apr  4 22:30:31 2001
@@ -32,6 +32,16 @@
  *
  * Transcribed from Intel (as86) -> AT&T (gas) by Chris Noe, May 1999.
  * <[EMAIL PROTECTED]>
+ *
+ * Fix to work around buggy BIOSes which dont use carry bit correctly
+ * and/or report extended memory in CX/DX for e801h memory size detection 
+ * call.  As a result the kernel got wrong figures.  The int15/e801h docs
+ * from Ralf Brown interrupt list seem to indicate AX/BX should be used
+ * anyway.  So to avoid breaking many machines (presumably there was a reason
+ * to orginally use CX/DX instead of AX/BX), we do a kludge to see
+ * if CX/DX have been changed in the e801 call and if so use AX/BX .
+ * Michael Miller, April 2001 <[EMAIL PROTECTED]>
+ *
  */
 
 #define __ASSEMBLY__
@@ -341,10 +351,24 @@
 # to write everything into the same place.)
 
 meme801:
+   stc # fix to work around buggy
+   xorw%cx,%cx # BIOSes which dont clear/set
+   xorw%dx,%dx # carry on pass/error of
+   # e801h memory size call
+   # or merely pass cx,dx though
+   # without changing them.
movw$0xe801, %ax
int $0x15
jc  mem88
 
+   cmpw$0x0, %cx   # Kludge to handle BIOSes
+   jne e801usecxdx # which report their extended
+   cmpw$0x0, %dx   # memory in AX/BX rather than
+   jne e801usecxdx # CX/DX.  The spec I have read
+   movw%ax, %cx# seems to indicate AX/BX 
+   movw%bx, %dx# are more reasonable anyway...
+
+e801usecxdx:
andl$0x, %edx   # clear sign extend
shll$6, %edx# and go from 64k to 1k chunks
movl%edx, (0x1e0)   # store extended memory size



[PATCH] Revised memory-management stuff (was: OOM killer)

2001-04-04 Thread Jonathan Morton

The attached patch applies to 2.4.3 and should address the most serious
concerns surrounding OOM and low-memory situations for most people.  A
summary of the patch contents follows:

MAJOR: OOM killer now only activates when truly out of memory, ie. when
buffer and cache memory has already been eaten down to the bone.

MEDIUM: The allocation mechanism will now only allow processes to reserve
memory if there is sufficient memory remaining *and* the process is not
already hogging RAM.  IOW, if the allocating process is already 4x the
size of the remaining free memory, reservation of more memory (by fork(),
malloc() or related calls) will fail.

MEDIUM: The OOM killer algorithm has been reworked to be a little more
intelligent by default, and also now allows the sysadmin to specify PIDs
and/or process names which should be left untouched.  Simply echo a
space-delimited list of PIDs and/or process names into
/proc/sys/vm/oom-no-kill, and the OOM killer will ignore all processes
matching any entry in the list until only they and init remain.  Init (as
PID 1 or as a root process named "init") is now always
ignored.  TODO: make certain parameters of the OOM killer configurable.
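
As a rough illustration of how a space-delimited list of PIDs and process
names could be matched, here is a small userspace sketch. It is not the
patch's parser: is_exempt() and its arguments are hypothetical, and the rule
that an all-numeric entry is treated as a PID is an assumption. The special
handling of init is left out.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical matcher for a list such as "42 sshd".
 * Returns true if pid or name matches any entry in the list.
 */
bool is_exempt(const char *list, long pid, const char *name)
{
	const char *p = list;

	while (*p) {
		size_t len;
		char *end;
		long val;

		p += strspn(p, " \t\n");	/* skip separators */
		len = strcspn(p, " \t\n");	/* length of this entry */
		if (len == 0)
			break;

		val = strtol(p, &end, 10);
		if (end == p + len) {
			/* Entry is entirely numeric: compare as a PID. */
			if (val == pid)
				return true;
		} else if (strlen(name) == len && strncmp(p, name, len) == 0) {
			/* Otherwise compare as an exact process name. */
			return true;
		}
		p += len;
	}
	return false;
}
```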

W-I-P: The memory-accounting code from an old 2.3.99 patch has been
re-introduced, but is in sore need of debugging.  It can be activated by
echoing a negative number into /proc/sys/vm/overcommit_memory - but do
this at your own risk.  Interested kernel hackers should alter the
"#define VM_DEBUG 0" to 1 in include/linux/mm.h to view lots of debugging
and warning messages.  I have seen the memory-accounting code attempt to
"free" blocks of memory exceeding 2GB which had never been allocated,
while running gcc.  The sanity-check code detects these anomalies and
attempts to correct for them, but this isn't good...

SIDE EFFECT: All parts of the kernel which can change the total amount of
VM (eg. by adding/removing swap) should now call
vm_invalidate_totalmem() to notify the VM about this.  A new function
vm_total() now reports the total amount of VM available.  The total VM and
the amount of reserved memory are now available from /proc/meminfo.



diff -rBU 5 linux-2.4.3/fs/exec.c linux-oom/fs/exec.c
--- linux-2.4.3/fs/exec.c   Thu Mar 22 09:26:18 2001
+++ linux-oom/fs/exec.c Tue Apr  3 09:32:07 2001
@@ -386,23 +386,31 @@
 }
 
 static int exec_mmap(void)
 {
struct mm_struct * mm, * old_mm;
+   struct task_struct * tsk = current;
+   unsigned long reserved = 0;
 
-   old_mm = current->mm;
+   old_mm = tsk->mm;
if (old_mm && atomic_read(&old_mm->mm_users) == 1) {
+   /* Keep old stack reservation */
mm_release();
exit_mmap(old_mm);
return 0;
}
 
+   reserved = vm_enough_memory(tsk->rlim[RLIMIT_STACK].rlim_cur >> 
+   PAGE_SHIFT);
+   if(!reserved)
+   return -ENOMEM;
+
mm = mm_alloc();
if (mm) {
-   struct mm_struct *active_mm;
+   struct mm_struct *active_mm = tsk->active_mm;
 
-   if (init_new_context(current, mm)) {
+   if (init_new_context(tsk, mm)) {
mmdrop(mm);
return -ENOMEM;
}
 
/* Add it to the list of mm's */
@@ -424,10 +432,12 @@
return 0;
}
mmdrop(active_mm);
return 0;
}
+
+   vm_release_memory(reserved);
return -ENOMEM;
 }
 
 /*
  * This function makes sure the current process has its own signal table,
diff -rBU 5 linux-2.4.3/fs/proc/proc_misc.c linux-oom/fs/proc/proc_misc.c
--- linux-2.4.3/fs/proc/proc_misc.c Fri Mar 23 11:45:28 2001
+++ linux-oom/fs/proc/proc_misc.c   Tue Apr  3 09:32:27 2001
@@ -173,11 +173,13 @@
 "HighTotal:%8lu kB\n"
 "HighFree: %8lu kB\n"
 "LowTotal: %8lu kB\n"
 "LowFree:  %8lu kB\n"
 "SwapTotal:%8lu kB\n"
-"SwapFree: %8lu kB\n",
+"SwapFree:  %8lu kB\n"
+"VMTotal:   %8lu kB\n"
+"VMReserved:%8lu kB\n",
 K(i.totalram),
 K(i.freeram),
 K(i.sharedram),
 K(i.bufferram),
 K(atomic_read(&page_cache_size)),
@@ -188,11 +190,13 @@
 K(i.totalhigh),
 K(i.freehigh),
 K(i.totalram-i.totalhigh),
 K(i.freeram-i.freehigh),
 K(i.totalswap),
-K(i.freeswap));
+K(i.freeswap),
+K(vm_total()), 
+K(vm_reserved));
 
return proc_calc_metrics(page, start, off, count, eof, len);
 #undef B
 #undef K
 }
diff -rBU 5 linux-2.4.3/include/linux/mm.h linux-oom/include/linux/

Re: kernel/sched.c questions

2001-04-04 Thread Bjorn Wesen

On 4 Apr 2001, Andi Kleen wrote:
> > >>  Hello, I would like to know why you put these two functions:
> > >>  void scheduling_functions_start_here(void) { }
> > >>  ...
> > >>  void scheduling_functions_end_here(void) { }

> This is needed for a very bad hack to get the EIP information in ps -lax:
> most programs would be shown as hanging in schedule(), which would not be 
> very useful to show the user. To avoid this sched.c is always compiled with 
> frame pointers and if the EIP is inside these two functions the proc code 
> goes back one level in the stack frame.

That sure is a very bad hack :) (For the original poster: search for
get_wchan in the various ports)

There is no comment anywhere near it that says what it is MEANT to do. You
can guess from the code and the usage that it has to do with stack-frames
and special-casing the scheduler functions..  Thanks for the 
clarification.. now I can go and fix it in arch/cris :) (I had never seen
the WCHAN field in ps before actually)

Just as a reference (everyone should get their daily dose of headache)
here is the i386 version:

unsigned long get_wchan(struct task_struct *p)
{
	unsigned long ebp, esp, eip;
	unsigned long stack_page;
	int count = 0;
	if (!p || p == current || p->state == TASK_RUNNING)
		return 0;
	stack_page = (unsigned long)p;
	esp = p->thread.esp;
	if (!stack_page || esp < stack_page || esp > 8188+stack_page)
		return 0;
	/* include/asm-i386/system.h:switch_to() pushes ebp last. */
	ebp = *(unsigned long *) esp;
	do {
		if (ebp < stack_page || ebp > 8184+stack_page)
			return 0;
		eip = *(unsigned long *) (ebp+4);
		if (eip < first_sched || eip >= last_sched)
			return eip;
		ebp = *(unsigned long *) ebp;
	} while (count++ < 16);
	return 0;
}

-BW





Re: kernel/sched.c questions

2001-04-04 Thread Richard B. Johnson

On Wed, 4 Apr 2001, Tim Walberg wrote:

> On 04/04/2001 16:52 -0300, Sardañons, Eliel wrote:
> >>Hello, I would like to know why you put these two functions:
> >>void scheduling_functions_start_here(void) { }
> >>...
> >>void scheduling_functions_end_here(void) { }
> >>
> 

This is so 'ps' knows if somebody is sleeping in the scheduler,
which is most often the case unless you have 2 or more CPUs.
When these addresses are found, the observed stack is unwound
to find the return address, hence where the sleeping task
was really sleeping.
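
The unwinding described above can be modelled in plain userspace C. This is
a simplified sketch, not kernel code: struct frame and wait_channel() are
made-up names, and the stack is represented as a linked chain of saved
frames instead of raw memory.

```c
#include <stddef.h>

/* One saved stack frame: where it returns to, and the caller's frame. */
struct frame {
	unsigned long ret_addr;		/* return address stored in this frame */
	const struct frame *caller;	/* previous frame, NULL at the bottom */
};

/*
 * Walk up the frame chain and return the first return address that lies
 * outside [first_sched, last_sched), i.e. outside the scheduler code.
 * Returns 0 if no such address is found within max_depth frames.
 */
unsigned long wait_channel(const struct frame *fp,
			   unsigned long first_sched,
			   unsigned long last_sched,
			   int max_depth)
{
	while (fp && max_depth-- > 0) {
		if (fp->ret_addr < first_sched || fp->ret_addr >= last_sched)
			return fp->ret_addr;
		fp = fp->caller;
	}
	return 0;
}
```

The two marker functions supply first_sched and last_sched; anything that
returns into that range is skipped, so the reported address is the place
that called into the scheduler.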


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..." Actual explanation
obtained from the Micro$oft help desk.





Re: softirq buggy [Re: Serial port latency]

2001-04-04 Thread Manfred Spraul

From: "Pavel Machek" <[EMAIL PROTECTED]>
>
> > Ok, there are 2 bugs that are (afaics) impossible to fix without
> > checking for pending softirq's in cpu_idle():
> >
> > a)
> > queue_task(my_task1, tq_immediate);
> > mark_bh();
> > schedule();
> > ;within schedule: do_softirq()
> > ;within my_task1:
> > mark_bh();
> > ; bh returns, but do_softirq won't loop
> > ; do_softirq returns.
> > ; schedule() clears current->need_resched
> > ; idle thread scheduled.
> > --> idle can run although softirq's are pending
>
> Or anything else can run although softirqs are pending. If it is
> a computation job, softinterrupts are delayed quite a bit, right?
>
> So right fix seems to be "loop in do_softirq".
>
No, it's the wrong fix.
A network server under high load would loop forever within the softirq,
never returning to process level.

do_softirq cannot loop; the right fix is "check often for pending
softirq's".
It's checked before a process returns to user space, it's checked when a
process schedules. What's missing is that the idle functions must check
for pending softirqs, too.
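
A toy userspace model of that idea, for illustration only. None of these
names are kernel interfaces: pending_mask, do_softirq_once() and
idle_may_sleep() are invented for the sketch, and the "handler" merely
counts how often it ran and may mark itself pending again.

```c
#include <stdbool.h>

unsigned int pending_mask;	/* model: one bit per pending softirq */
int handled;			/* how many handler runs happened */
int reraise_budget;		/* how often a handler re-marks itself */

void raise_softirq_model(unsigned int nr)
{
	pending_mask |= 1u << nr;
}

static void run_handler(unsigned int nr)
{
	handled++;
	if (reraise_budget > 0) {
		/* Like a bh calling mark_bh() on itself. */
		reraise_budget--;
		raise_softirq_model(nr);
	}
}

/* A single pass: anything raised during the pass stays pending. */
void do_softirq_once(void)
{
	unsigned int todo = pending_mask;
	unsigned int nr;

	pending_mask = 0;
	for (nr = 0; todo != 0; nr++, todo >>= 1)
		if (todo & 1u)
			run_handler(nr);
}

/* The idle loop asks this before sleeping; false means "do not sleep yet". */
bool idle_may_sleep(void)
{
	if (pending_mask != 0) {
		do_softirq_once();
		return false;
	}
	return true;
}
```

Because do_softirq_once() never loops, a handler that keeps re-raising itself
cannot starve process context; the work is simply picked up at the next
check, and the idle loop is one of the places that has to make that check.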

--
Manfred




Re: how to let all others run

2001-04-04 Thread Richard B. Johnson

On 4 Apr 2001, John Fremlin wrote:

> 
> Hi Oliver!
> 
>  Oliver Neukum <[EMAIL PROTECTED]> writes:
> 
> > is there a way to let all other runnable tasks run until they block
> > or return to user space, before the task wishing to do so is run
> > again ?
> 
> Are you trying to do this in kernel or something? From userspace you
> can use nice(2) then sched_yield(2), though I don't know if the Linux
> implementations will guarantee anything.
> 

I recommend using usleep(0) instead of sched_yield(). Last time I
checked, sched_yield() seemed to spin and eat CPU cycles, whereas
usleep(0) always gives up the CPU.

Try:
for(;;) usleep(0);

and 
for(;;) sched_yield();

You'll see quiet behavior under `top` for usleep(0), and over 80% CPU
with sched_yield().


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..." Actual explanation
obtained from the Micro$oft help desk.





Re: kernel/sched.c questions

2001-04-04 Thread Stuart MacDonald

I had similar questions recently when I was doing some
hacking; these are my guesses:

From: "Eliel" <[EMAIL PROTECTED]>
> Hello, I would like to know why you put these two functions:
> void scheduling_functions_start_here(void) { }
> ...
> void scheduling_functions_end_here(void) { }

Just as markers for easy location in System.map.
The compiler should optimise those away.

> why you put 'case TASK_RUNNING'
>
> switch (prev->state) {
> case TASK_INTERRUPTIBLE:
> if (signal_pending(prev)) {
> prev->state = TASK_RUNNING;
> break;
> }
> default:
> del_from_runqueue(prev);
> case TASK_RUNNING:
> }

Prevent compiler warnings about unhandled conditions?
Not sure about that one.

> in the function schedule() you always use this syntax:
>
> -
> if (a_condition)
> goto bebe;
> bebe_back
>
>
> bebe:
> do_bebe();
> goto bebe_back;
> --
> why not just doing:
>
>if (a_condition)
>  do_bebe();

Probably because the compiler puts out

setup function parameter one
setup function parameter two
setup function parameter three
check condition
call function
setup function parameter one
setup function parameter two
setup function parameter three
check condition
call function

for your case and the above convolutions
puts out

check condition
jump to call if needed
check condition
jump to call if needed

instead.

Or even if the compiler puts out

check condition
If condition
  setup function parameter one
  setup function parameter two
  setup function parameter three
  call function
check condition
if condition
  setup function parameter one
  setup function parameter two
  setup function parameter three
  call function

I'm betting the smaller code above is better
for cache hits, right?

But these are my guesses. Anyone want to
clarify?

..Stu





Re: kernel/sched.c questions

2001-04-04 Thread Andi Kleen

Tim Walberg <[EMAIL PROTECTED]> writes:

> On 04/04/2001 16:52 -0300, Sardañons, Eliel wrote:
> >>Hello, I would like to know why you put these two functions:
> >>void scheduling_functions_start_here(void) { }
> >>...
> >>void scheduling_functions_end_here(void) { }
> >>
> 
> That one I have no idea about - maybe some perverse sort
> of comment? Or maybe something somewhere needs to know the
> address range that some functions lie within, and these functions
> would delimit that range. Of course, that presumes that the
> compiler in use doesn't reorder functions in the object code
> that it emits, but I think that's a fairly safe assumption for
> now...

This is needed for a very bad hack to get the EIP information in ps -lax:
most programs would be shown as hanging in schedule(), which would not be 
very useful to show the user. To avoid this sched.c is always compiled with 
frame pointers and if the EIP is inside these two functions the proc code 
goes back one level in the stack frame.


-Andi



Re: [WISHLIST] Addition of suspend patch into 2.5?

2001-04-04 Thread Pavel Machek

Hi!

> Not only for laptops :)
> 
> It's nice for PCs too also.

And it is called sw_susp. So what about trying it, *NOW*?
-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.




Re: [WISHLIST] Addition of suspend patch into 2.5?

2001-04-04 Thread Pavel Machek

Hi!

> Just a small addition to the 2.5 wishlist:
> Is "hibernation" on linux possible? Ideally it should write out on the /

I just said it is. However, you definitely do not want to write it onto a
filesystem -- that's much too hard. But you can write it to the swap space,
and that's exactly what the patch does.

> running on ext2fs and the new journaling fs's like reiserfs, xfs, ext3 etc
> and not some special filesystem or unpartitioned space etc. I mean that
> this should be working without the need to repartition/reinstall.
> This is something **very** useful for laptop owners, not having to shut 
> down (all) applications when you need to grab the laptop and travel.
> I'd like to see this working nicely in 2.6.
> 
> Best regards,
> r
> 
> On Thu, 22 Feb 2001, Pavel Machek wrote:
> 
> > Date: Thu, 22 Feb 2001 21:43:08 +0100
> > From: Pavel Machek <[EMAIL PROTECTED]>
> > To: Shawn Starr <[EMAIL PROTECTED]>, lkm <[EMAIL PROTECTED]>
> > Subject: Re: [WISHLIST] Addition of suspend patch into 2.5?
> > 
> > Hi!
> > 
> > > Any idea if suspend/hybernation will be in future kernels?
> > 
> > I'd like it included, too. Some toshiba laptops support sleep but not
> > suspend, and battery runs out within few hours if it was low before
> > suspend. That's bad.
> > 
> > And the patch was pretty clean last time I checked.
> > Pavel
> > -- 
> > I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
> > Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]

-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.




Re: Ethernet driver tweak for error correction codes

2001-04-04 Thread Pavel Machek

Hi!

> Is it possible to use up the src, dest MAC addresses (12B) and the CRC field (4B?)
> on a point-to-point full duplex Ethernet link for my own data?

That does not seem like a very clean solution.

> I would like to implement an error correction on this, because I'm gonna build
> a freespace laser link which would run just this way. And i want to use it on
> foggy days too when there will be a lot of bits fallen out.
> 
> Is it possible to do it in the kernel somehow cleanly? How should I try to do it?

Do it in userspace: hook SLIP onto a tty/pty pair. See scarabd for details.

-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.




softirq buggy [Re: Serial port latency]

2001-04-04 Thread Pavel Machek

Hi!

> > Seems floppy and console is buggy, then.
> >
> 
> No. The softirq implementation is buggy.
> I can trigger the problem with the TASKLET_HI (floppy), and both net rx
> and tx (ping -l)
> 
> > > What about creating a special cpu_is_idle() function that the idle
> > > functions must call before sleeping?
> > 
> > I'd say just fix all the bugs.
> >
> 
> Ok, there are 2 bugs that are (afaics) impossible to fix without
> checking for pending softirq's in cpu_idle():
> 
> a)
> queue_task(my_task1, tq_immediate);
> mark_bh();
> schedule();
> ;within schedule: do_softirq()
> ;within my_task1:
> mark_bh();
> ; bh returns, but do_softirq won't loop
> ; do_softirq returns.
> ; schedule() clears current->need_resched
> ; idle thread scheduled.
> --> idle can run although softirq's are pending

Or anything else can run although softirqs are pending. If it is
a computation job, softinterrupts are delayed quite a bit, right?

So right fix seems to be "loop in do_softirq".

Pavel

> I assume I trigger this race with the floppy driver.
> 
> b)
> hw interrupt
> do_softirq
> within the net_rx handler: another hw interrupt, additional packets are
> queued
> do_softirq won't loop.
> returns to idle thread. --> packets delayed unnecessary.
> 
> What about the attached patch? Obviously the other idle cpu must be
> converted to use the function as well.

-- 
I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]



Re: pcnet32 (maybe more) hosed in 2.4.3

2001-04-04 Thread Petr Vandrovec

Wade Hampton wrote:
> 
> Carsten Langgaard wrote:
> >
> > I'm not sure what the problem is, but the whole deal about checking whether the
> > controller runs in 16 bit or 32 bit mode, is a little bit tricky.
> >[snip]
> Without the changes listed in this thread, 2.4.3 crashed vmware 2.0.3
> Linux. It did not OOPS the kernel, it caused vmware to go down in
> flames  Changing the code per the previous mail fixed it and
> my VM now works fine.  THANKS!
> 
> Is this list non-causal?  The answer was posted to the list as I
> was getting ready to build 2.4.3 on my VM, before I found the
> problem or even had to ask the question!

VMware is working on implementing PCnet 32bit mode in the emulation (there
is no such thing now because no OS except FreeBSD needs it). But
my question is - is there some real benefit in running the chip in
32bit mode? All registers except CSR88 use only the low 16 bits anyway,
so is 32bit mode needed for bigendian ports, or what's the reasoning
behind it? AFAIK all chips support 16bit mode, and having only 16bit
mode in the code could save at least one indirect jump on each chip access.
Thanks,
Petr Vandrovec
[EMAIL PROTECTED]



Re: how to let all others run

2001-04-04 Thread John Fremlin


Hi Oliver!

 Oliver Neukum <[EMAIL PROTECTED]> writes:

> is there a way to let all other runnable tasks run until they block
> or return to user space, before the task wishing to do so is run
> again ?

Are you trying to do this in kernel or something? From userspace you
can use nice(2) then sched_yield(2), though I don't know if the Linux
implementations will guarantee anything.

-- 

http://www.penguinpowered.com/~vii



Re: kernel/sched.c questions

2001-04-04 Thread Tim Walberg

On 04/04/2001 16:52 -0300, Sardañons, Eliel wrote:
>>  Hello, I would like to know why you put these two functions:
>>  void scheduling_functions_start_here(void) { }
>>  ...
>>  void scheduling_functions_end_here(void) { }
>>  

That one I have no idea about - maybe some perverse sort
of comment? Or maybe something somewhere needs to know the
address range that some functions lie within, and these functions
would delimit that range. Of course, that presumes that the
compiler in use doesn't reorder functions in the object code
that it emits, but I think that's a fairly safe assumption for
now...

>>  why you put 'case TASK_RUNNING'
>>  
>>  switch (prev->state) {
>>  case TASK_INTERRUPTIBLE:
>>  if (signal_pending(prev)) {
>>  prev->state = TASK_RUNNING;
>>  break;
>>  }
>>  default:
>>  del_from_runqueue(prev);
>>  case TASK_RUNNING:
>>  }
>>  


This just indicates that there is nothing to be done for the
TASK_RUNNING case - if it were left out, the default case would
be taken. Of course, a 'case TASK_RUNNING: break;' placed earlier
in the switch construct would be semantically the same, but there
may be reasons related to code optimization that this was done the
way it was.
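
To make the control flow concrete, here is a small standalone sketch of the
same switch. It is a model, not the scheduler code: struct task, its fields
and leave_cpu() are invented stand-ins for prev, signal_pending() and
del_from_runqueue().

```c
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE };

struct task {
	enum task_state state;
	int signal_pending;	/* stand-in for signal_pending(prev) */
	int on_runqueue;	/* stand-in for run-queue membership */
};

void leave_cpu(struct task *prev)
{
	switch (prev->state) {
	case TASK_INTERRUPTIBLE:
		if (prev->signal_pending) {
			/* A signal arrived: stay runnable instead of sleeping. */
			prev->state = TASK_RUNNING;
			break;
		}
		/* fall through: no signal, so the task really sleeps */
	default:
		prev->on_runqueue = 0;	/* del_from_runqueue(prev) */
		/* fall through to the empty case below */
	case TASK_RUNNING:
		break;			/* nothing to do, stays on the queue */
	}
}
```

Only the tasks that reach the default label are taken off the run queue; a
task that is still TASK_RUNNING jumps straight to the empty case.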

>>  and the last one:
>>  
>>  in the function schedule() you always use this syntax:
>>  
>>  -
>>  if (a_condition)
>>  goto bebe;
>>  bebe_back
>>  
>>  
>>  bebe:
>>  do_bebe();
>>  goto bebe_back;
>>  --
>>  why not just doing:
>> 
>> if (a_condition)
>>   do_bebe();
>>  
>>  
>>  I know that goto's are better but finally you are jumping to a function and
>>  then calling the function. I think you can improve performance doing this.


This looks like a hand-optimization to avoid a branch in the most common
case. Chances are a_condition is supposed to be pretty rare, and the code
you suggest would usually include a branch for the usual code path, then.
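
A minimal sketch of the two styles, assuming a_condition is rare. The names
step_goto(), step_hinted() and slow_calls are invented for the example;
__builtin_expect is a GCC extension (also accepted by Clang), so the second
variant is not portable C, and whether either form produces better code
depends on the compiler and its version.

```c
int slow_calls;		/* counts how often the rare path ran */

static void do_bebe(void)
{
	slow_calls++;
}

/* Hand-placed layout: the common path falls straight through. */
int step_goto(int a_condition)
{
	int work = 0;

	if (a_condition)
		goto bebe;
bebe_back:
	work++;			/* the usual work, no taken branch before it */
	return work;

bebe:
	do_bebe();		/* rare path, kept out of line */
	goto bebe_back;
}

/* Same logic, leaving the layout to the compiler via a branch hint. */
int step_hinted(int a_condition)
{
	int work = 0;

	if (__builtin_expect(a_condition != 0, 0))
		do_bebe();
	work++;
	return work;
}
```

Both functions return the same result; the only intended difference is where
the rarely executed call ends up in the generated code.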


tw


-- 
+--+--+
| Tim Walberg  | [EMAIL PROTECTED]   |
| 828 Marshall Ct. | www.concentric.net/~twalberg |
| Palatine, IL 60074   |  |
+--+--+



/dev/loop0 over lvm... leading to d-state :-(

2001-04-04 Thread Herbert Valerio Riedel


fyi, loop devices over lvm LV's don't work for me...

I've tested with 2.4.3final (and some other 2.4.3 derivatives) and two
lvm'ized partitions with a size of about 1gig each; mke2fs
just goes into D-state and stays there when applying it to /dev/loop0,
while running it directly on the LV-device works...

greetings,
-- 




Re: mysqld [3.2.23] hangs when key_buffer ~256MB on [2.4.2-ac28+]

2001-04-04 Thread Andrew Morton

Alan Cox wrote:
> 
> > I initially upgraded my kernel from 2.4.2-ac5 to 2.4.3 and the first thing I
> > noticed was that mysqld was stuck.  Killing it left it hanging in a D state.
> > Then I tried 2.4.2-ac28 (which I am using now), and the got the same result.
> 
> I'd expect that bit. 2.4.2-ac28 basically has the same new rwlock VM and
> behaviour as 2.4.3pre8. What would be really useful to know is if anyone can
> duplicate the problem non x86
> 
> > Can anyone reproduce this problem?
> 
> Yes

Untested patch:


--- semaphore.c.origWed Apr  4 12:54:30 2001
+++ semaphore.c Wed Apr  4 12:54:58 2001
@@ -363,26 +363,26 @@
  */
 struct rw_semaphore *down_write_failed(struct rw_semaphore *sem)
 {
struct task_struct *tsk = current;
DECLARE_WAITQUEUE(wait, tsk);
 
 	__up_write(sem); /* this takes care of granting the lock */
 
-   add_wait_queue_exclusive(&sem->wait, &wait);
+   add_wait_queue_exclusive(&sem->write_bias_wait, &wait);
 
while (atomic_read(&sem->count) < 0) {
set_task_state(tsk, TASK_UNINTERRUPTIBLE);
if (atomic_read(&sem->count) >= 0)
 			break;  /* we must attempt to acquire or bias the lock */
schedule();
}
 
-   remove_wait_queue(&sem->wait, &wait);
+   remove_wait_queue(&sem->write_bias_wait, &wait);
tsk->state = TASK_RUNNING;
 
return sem;
 }
 
 asm(
 "
 .align 4



Re: nfs performance at high loads

2001-04-04 Thread Mark Hemment


  I believe David Miller's latest zero-copy patches might help here.
  In his patch, the pull-up buffer is now allocated near the top of stack
(in the sunrpc code), so it can be a blocking allocation.
  This doesn't fix the core VM problems, but does relieve the pressure
_slightly_ on the VM (I assume, haven't tried David's patch yet).

  One of the core problems is that the VM keeps no measure of
page fragmentation in the free page pool.  The system reaches the state of
having plenty of free single pages (so kswapd and friends aren't kicked
- or if they are, they do little or no work), and very few buddied pages
(which you need for some of the NFS requests).

  Unfortunately, even keeping a measure of fragmentation, and
ensuring work is done when it is reached, doesn't solve the next problem.

  When a large order request comes in, the inactive_clean page list is
reaped.  As reclaim_page() simply selects the "oldest" page it can, with
no regard as to whether it will buddy (now, or possibly in the near
future), this list is quickly shrunk by a large order request - far too
quickly for a well behaved system.

  An NFS write request, with an 8K block size, needs an order-2 (16K) pull
up buffer (we shouldn't really be pulling the header into the same buffer
as the data - perhaps we aren't any more?).  On a well used system, an
order-2 _blocking_ allocation ends up populating the order-0 and order-1
with quite a few pages from the inactive_clean.
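
  The arithmetic behind "order-2 (16K)" can be checked with a few lines of
C.  This is a sketch under assumptions: 4K pages as on i386, a made-up
helper name alloc_order(), and a 128-byte header chosen only to show that
any header on top of 8K of data pushes the request past two pages.

```c
#define PAGE_SIZE_BYTES 4096UL	/* assumption: 4K pages, as on i386 */

/* Smallest order such that (PAGE_SIZE_BYTES << order) holds 'bytes'. */
unsigned int alloc_order(unsigned long bytes)
{
	unsigned long pages = (bytes + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
	unsigned int order = 0;

	/* Buddy allocations come in power-of-two page counts. */
	while ((1UL << order) < pages)
		order++;
	return order;
}
```

  8192 bytes of data alone fit in an order-1 block (two pages), but 8192
bytes plus a header need three pages, which rounds up to order-2, i.e. four
physically contiguous pages.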

  This then triggers another problem. :(

  As large (non-zero) order requests are always from the NORMAL or DMA
zones, these zones tend to have a lot of free-pages (put there by the
blind reclaim_page() - well, once you can do a blocking allocation they
are, or when the fragmentation kicking is working).
  New allocations for pages for the page-cache often ignore the HIGHMEM
zone (it reaches a steady state, and so is passed over by the loop at the
head of __alloc_pages()).
  However, NORMAL and DMA zones tend to be above pages_low (due to the
reason above), and so new page-cache pages come from these zones.  On a
HIGHMEM system this leads to thrashing of the NORMAL zone, while the
HIGHMEM zone stays (relatively) quiet.
  Note: To make matters even worse under this condition, pulling pages out
of the NORMAL zone is exactly what you don't want to happen!  It would be
much better if they could be left alone for a (short) while to give them
a chance to buddy - Linux (at present) doesn't care about the buddying of
pages in the HIGHMEM zone (no non-zero allocations come from there).

  I was working on these problems (and some others) a few months back, and
will return to them shortly.  Unfortunately, the changes started to
look too large for 2.4.
  Also, for NFS, the best solution now might be to give the nfsd threads a
receive buffer.  With David's patches, the pull-up occurs in the context
of a thread, making this possible.
  This doesn't solve the problem for other subsystems which do non-zero
order page allocations, but (perhaps) they have a low enough frequency not
to be a real issue.


Kapish,

  Note: Ensure you put a "sync" in your /etc/exports - the default
behaviour was "async" (not legal for a valid SpecFS run).

Mark


On Wed, 4 Apr 2001, Alan Cox wrote:

> > We have been seeing some problems with running nfs benchmarks
> > at very high loads and were wondering if somebody could show
> > some pointers to where the problem lies.
> > The system is a 2.4.0 kernel on a 6.2 Red Hat distribution ( so
> 
> Use 2.2.19. The 2.4 VM is currently too broken to survive high I/O benchmark
> tests without going silly
> 








Re: IDE RAID Hardware Advice

2001-04-04 Thread Erik Mouw

On Wed, Apr 04, 2001 at 10:03:59AM -0400, John Kodis wrote:
> I'll be assembling a terabyte of IDE RAID network attached storage,
> and was looking for some advice on:
> 
>   - best supported and most reliable multi-channel IDE controller;

See http://www.linux-ide.org/chipsets.html . (3Ware controllers are the
only supported IDE RAID controllers)

>   - best supported and most reliable NFS implementation;
> 
>   - any other random advice about things to do or not do in setting up
> this type of system.

Don't use an NFS exported reiserfs filesystem. (see this mailing list
archive).


Erik

-- 
J.A.K. (Erik) Mouw, Information and Communication Theory Group, Department
of Electrical Engineering, Faculty of Information Technology and Systems,
Delft University of Technology, PO BOX 5031,  2600 GA Delft, The Netherlands
Phone: +31-15-2783635  Fax: +31-15-2781843  Email: [EMAIL PROTECTED]
WWW: http://www-ict.its.tudelft.nl/~erik/



Re: Contacts within AMD? AMD-756 USB host-controller blacklisted due to

2001-04-04 Thread Miles Lane

Thomas Dodd wrote:

> Alan Cox wrote:
> 
>>> David Brownell recently added this check to the usb-ohci driver
>>> since no one has gotten information from AMD for the workaround,
>>> which is rumored to exist, for this bug.
>>> 
>>> Do any of you have contacts within AMD who might be able to
>>> get an explanation of the workaround to David Brownell?
>> 
>> We are working on that currently via the Red Hat contact.
>> 
>> 
>>> value given varies.  Rereading NDP seems to give a valid value.
>>> I am not really clear why we don't simply read the value twice
>>> whenever the host-controller is detected to be an AMD-756.
>> 
>> because we dont know the full scope of the problem yet.
> 
> 
> Exactly how many bug reports has this caused?
> What kind of problems?
> 
> I know I had trouble once, but it was a CONFIG problem
> with the 2.4.2ac series and the extra DEBUG options.

I think probably everyone who has an AMD-756 has reported
this error.  At least, I've not seen any messages from
people saying, "I have an AMD-756 and have never seen this
error."  Most of the time, when the error occurs, it seems
pretty benign.  That is, I haven't noticed it crashing USB
device connections, causing data corruption or OOPSen.
Some folks _have_ reported OOPSen, though, that seemed to
be triggered by the erratum #4 hardware bug.  I think I
may have had one of these a long time ago.

I believe David has found that there definitely are code
paths where this hardware bug can cause failures of various
sorts and that's why the AMD-756 has been blacklisted.
I don't believe these failure code paths have anything to
do with specific debugging configurations.

David/Alan, please correct me if I've got this all wrong.

Thanks,
Miles
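
The "read the value twice" idea quoted above can be sketched in plain C.
This is only an illustration of re-reading until two consecutive reads
agree; it is not the AMD workaround (which nobody in this thread has seen)
and not the usb-ohci driver's code. The names read_stable, reg_read_fn,
fake_reg and fake_read are all hypothetical, and the flaky register is
simulated in memory.

```c
#include <stdint.h>

/* Hypothetical register accessor: returns the current raw value. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Re-read a register until two consecutive reads agree, or give up
 * after max_tries reads. Returns 0 and stores the agreed value in
 * *out on success, -1 if no two consecutive reads matched.
 */
static int read_stable(reg_read_fn rd, void *ctx, int max_tries, uint32_t *out)
{
    uint32_t prev = rd(ctx);

    for (int i = 1; i < max_tries; i++) {
        uint32_t cur = rd(ctx);
        if (cur == prev) {
            *out = cur;
            return 0;
        }
        prev = cur;
    }
    return -1;
}

/* Simulated flaky register (assumption): only the first read is bogus. */
struct fake_reg {
    int reads;        /* number of reads performed so far */
    uint32_t good;    /* value returned by every read after the first */
    uint32_t glitch;  /* value returned by the first read */
};

static uint32_t fake_read(void *ctx)
{
    struct fake_reg *r = ctx;
    return (r->reads++ == 0) ? r->glitch : r->good;
}
```

As Alan notes, a loop like this only helps if the bad reads are transient,
and that is exactly the part of the erratum that is not known yet.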




kernel/sched.c questions

2001-04-04 Thread Sardañons, Eliel

Hello, I would like to know why you put in these two functions:
void scheduling_functions_start_here(void) { }
...
void scheduling_functions_end_here(void) { }

Also, why do you put 'case TASK_RUNNING' in this switch?

switch (prev->state) {
        case TASK_INTERRUPTIBLE:
                if (signal_pending(prev)) {
                        prev->state = TASK_RUNNING;
                        break;
                }
        default:
                del_from_runqueue(prev);
        case TASK_RUNNING:;
}

and the last one:

in the function schedule() you always use this pattern:

-
if (a_condition)
        goto bebe;
bebe_back:


bebe:
        do_bebe();
        goto bebe_back;
--
Why not just do:

        if (a_condition)
                do_bebe();


I know that gotos are better, but in the end you are jumping to a label and
then calling the function anyway. I think you could improve performance by
doing this.
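
Both constructs asked about above can be tried outside the kernel. The
sketch below is a toy model, not sched.c: the struct, the state values and
the names settle_prev and hot_path are made up for illustration. It shows
that a task already in TASK_RUNNING jumps straight to the trailing case
label and skips the dequeue in 'default', and that the goto form calls
do_bebe() exactly once, just like the plain 'if' form. The usual
explanation for the goto layout is that it keeps the rarely taken code out
of the straight-line path; treat that as an assumption here, since the
thread itself does not answer the question.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's task states; the values are illustrative. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE };

struct task {
    enum task_state state;
    bool signal_pending;   /* stand-in for signal_pending(prev) */
    bool on_runqueue;      /* cleared where del_from_runqueue() would run */
};

/* Same control flow as the quoted switch, applied to the toy struct. */
static void settle_prev(struct task *prev)
{
    switch (prev->state) {
    case TASK_INTERRUPTIBLE:
        if (prev->signal_pending) {
            /* A signal arrived: keep the task runnable. */
            prev->state = TASK_RUNNING;
            break;
        }
        /* fall through: sleeping with no signal pending */
    default:
        prev->on_runqueue = false;
        /* fall through */
    case TASK_RUNNING:
        ;   /* already runnable: nothing to do */
    }
}

static int calls;   /* counts how often do_bebe() runs */

static void do_bebe(void)
{
    calls++;
}

/* The goto pattern from the question: the rare work sits out of line. */
static void hot_path(bool rare_condition)
{
    if (rare_condition)
        goto bebe;
bebe_back:
    return;

bebe:
    do_bebe();
    goto bebe_back;
}
```

Whether the out-of-line form is actually faster depends on the code the
compiler generates, so it is worth comparing the assembly for both forms
before drawing conclusions.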




Re: 2.4 kernel hangs on 486 machine at boot

2001-04-04 Thread Vik Heyndrickx

- Original Message -
From: "Alan Cox" <[EMAIL PROTECTED]>
To: "Vik Heyndrickx" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Wednesday, April 04, 2001 7:54 PM
Subject: Re: 2.4 kernel hangs on 486 machine at boot


> > Problem: Linux kernel 2.4 consistently hangs at boot on 486 machine
> >
> > Shortly after lilo starts the kernel it hangs at the following message:
> > Checking if this processor honours the WP bit even in supervisor mode...
> > 
>
> Does this happen on 2.4.3-ac kernel trees ? I thought i had it zapped

It doesn't even happen with 2.4.3. Now I feel silly.

--
Vik





Re: 2048 byte/sector problems with kernel 2.4

2001-04-04 Thread Giuliano Pochini


> >There WERE direct overwrite media for a while that would, in theory, be
> >able to write the data directly, but a combination of high cost, limited
> >sources, and strong questions about the permanence of the recorded data
> >severely limited the demand for these and I think that they have been
> >withdrawn.

I have 2 OW disks and they work just fine. According to specs their
reliability is the same as normal MO disks.

> No, direct overwrite disks are expensive, but they are still available. I do
> not know of any, and have not heard of any problems related to direct
> overwrite technology. For some reason M/O never really caught on in the US,
> and the high price of direct overwrite disks is what seems to be killing
> them off. I have a bunch I use for backup and have never had any problems.

RW CDs killed almost all removables.

As for the 2KB-sector problems, I confirm what I wrote in my previous
message: I have no problems. 640MB and 1.3GB both work fine here.

(kernel 2.4.3, old aic7xxx driver, adaptec 2930CU, Fujitsu GigaMO, PowerPC
750)

Bye.



Re: linux 2.4.3 crashed my hard disk

2001-04-04 Thread Richard B. Johnson

On Wed, 4 Apr 2001, Frank Cornelis wrote:

> Hey,
> 
> After I did put in /etc/sysconfig/harddisks 
>   USE_DMA=1
> my system did crash very badly, I guess after my hard disks did wake up
[SNIPPED...]


> 
> BTW: my motherboard runs at 112 Mhz, overclocked, was 100 Mhz.
> Been running this configuration over more than 2 years now without such
> major problems.
> Could this be the cause?
> 
> Frank.

Please don't ever report any errors to linux-kernel if you are running
your machine over-clocked. All you need is to fetch ONE bad instruction
and you can evaporate ALL the data on ALL your hard disks. Think what
happens if a pointer to a structure containing the not-yet-written-to-disk
blocks gets adjusted to point to some spent email buffer.


Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.





Re: uninteruptable sleep

2001-04-04 Thread andersg

On Wed, Apr 04, 2001 at 08:39:19AM +0930, Trevor Nichols wrote:

> > ps -eo pid,stat,pcpu,nwchan,wchan=WIDE-WCHAN-COLUMN -o args
> 
> 1230 D 0.0 105cc1 down_write_failed /home/data/mozilla/obj/dist/bin/mozilla-bin

My mysql-server got stuck in down_write_failed today too.
SMP dual PentiumIII system with no swap. I can provide more info on request
and am willing to do more bug-hunting if that is needed.

-- 

//anders/g




Re: Minor 2.4.3 Adaptec Driver Problems

2001-04-04 Thread Earle Nietzel

>
> That's what ext2 volume labels are for.
>

That is really not a great solution although it's half OK.

Here's why:

You can put labels on all your ext2 partitions, but what happens with:
cdroms
zip drives
and, probably most important:
swap
But let's just say anything other than an ext2 partition.

Also you can't put a LABEL in LILO either.

This is probably great for those people who use ide only, but then again
people who had ide only machines were not affected by v2.4.3!

Good try though!!!

Earle





Re: linux 2.4.3 crashed my hard disk

2001-04-04 Thread Brian Gerst

Frank Cornelis wrote:
> 
> Hey,
> 
> After I did put in /etc/sysconfig/harddisks
> USE_DMA=1
> my system did crash very badly, I guess after my hard disks did wake up
> again. For a while I thought I'd lose some sectors because of this, I had
> to re-install my RedHat 7.0, had a not so productive day :) But, hard
> disks are OK now.
> I thought I should report this.
> Below there is a copy of my dmesg log.
> 
> BTW: my motherboard runs at 112 Mhz, overclocked, was 100 Mhz.
> Been running this configuration over more than 2 years now without such
> major problems.
> Could this be the cause?
> 
> Frank.

http://www.tux.org/lkml/#s13-3

--

Brian Gerst



new aic7xxx driver problems

2001-04-04 Thread Giuliano Pochini


I have two Adaptec 2930CU (ultra narrow) cards. I modified the driver to
make them work in ultra mode. The card connected to my CDROM and MO drive,
operating at different bus clocks, does not behave well. Transfers stop
often for 10-20 seconds and it spits out warnings like these:

Apr  3 23:05:10 Jay kernel: scsi1:0:4:0: Attempting to queue an ABORT message 
Apr  3 23:05:10 Jay kernel: scsi1:0:4:0: Command found on device queue 
Apr  3 23:05:10 Jay kernel: aic7xxx_abort returns 8194 
Apr  3 23:05:10 Jay kernel: scsi1:0:5:0: Attempting to queue an ABORT message 
Apr  3 23:05:10 Jay kernel: scsi1:0:5:0: Command found on device queue 
Apr  3 23:05:10 Jay kernel: aic7xxx_abort returns 8194 
Apr  3 23:06:23 Jay kernel: scsi1:0:4:0: Attempting to queue an ABORT message 
Apr  3 23:06:23 Jay kernel: scsi1:0:4:0: Command found on device queue 
Apr  3 23:06:23 Jay kernel: aic7xxx_abort returns 8194

No probs with the old driver.

[Linux 2.4.3, aic7xxx v6.1.8, gcc 2.95.3, PowerPC 750]

Bye.





Another report of mozilla in D state, related to the 'uninterruptible sleep' thread

2001-04-04 Thread David Ford

Second time around, I didn't evoke any interest the first time.

I reported it back on Mar/27.  It is still an almost daily problem
requiring a reboot.  Mozilla gets stuck in down_write_failed.  This time
I'm sure it's not reiser's fault.

# uname -r
2.4.3-pre8

mozilla-bin  D C781849C 0 21055  1(NOTLB) 20611
 Call Trace: [] [] []
[leaf_copy_items+121/252]
   [leaf_paste_in_buffer+239/672] [leaf_cut_from_buffer+486/984]
   [leaf_cut_from_buffer+863/984] [balance_leaf+8645/9544]
   [balance_leaf+9225/9544] [leaf_item_bottle+916/1260]
   [balance_leaf+9505/9544] [balance_leaf+9225/9544]
   [leaf_item_bottle+916/1260] [balance_leaf+9505/9544] []
   [bin_search_in_dir_item+58/196] [leaf_copy_items+121/252]
   [leaf_paste_in_buffer+239/672] []
   [flush_commit_list+66/908] [flush_journal_list+531/944]
   [free_list_bitmaps+30/60] [reiserfs_unlink+167/460]
   [posix_lock_file+526/1232] [empty_bad_page+3213/4096]
   [empty_bad_page+2717/4096] [fib_flag_trans+35/60]
   [empty_bad_pte_table+3363/4096]

If someone is actually interested, it'd be neat to get this fixed.

-d

--
  There is a natural aristocracy among men. The grounds of this are virtue and 
talents. Thomas Jefferson
  The good thing about standards is that there are so many to choose from. Andrew S. 
Tanenbaum






Re: Contacts within AMD? AMD-756 USB host-controller blacklisted due to

2001-04-04 Thread Thomas Dodd

Alan Cox wrote:
> 
> > David Brownell recently added this check to the usb-ohci driver
> > since no one has gotten information from AMD for the workaround,
> > which is rumored to exist, for this bug.
> >
> > Do any of you have contacts within AMD who might be able to
> > get an explanation of the workaround to David Brownell?
> 
> We are working on that currently via the Red Hat contact.
> 
> > value given varies.  Rereading NDP seems to give a valid value.
> > I am not really clear why we don't simply read the value twice
> > whenever the host-controller is detected to be an AMD-756.
> 
> because we don't know the full scope of the problem yet.

Exactly how many bug reports has this caused?
What kind of problems?

I know I had trouble once, but it was a CONFIG problem
with the 2.4.2ac series and the extra DEBUG options.

-Thomas



Re: Compiling problem kernel 2.4.2

2001-04-04 Thread Boris Pisarcik

 
> gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -02
> -fomit-frame-pointer -fno-strict-aliasing -pipe -march=i486  -c -o init/main.o
> init/main.c
> gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -02
> fomit-frame-pointer -fno-strict-aliasing -pipe  -march=i486
> -DUTS_MACHINE='"i386"' -c -o init/version.o init/version.c
> cpp: /usr/src/linux/include/linux/compile.h: Input/output error
> init/version.c:20: `UTS_VERSION' undeclared here (not in a function)
> init/version.c:20: initializer element for `system_utsname.version' is not
> constant
> init/version.c:25: parse error before `LINUX_COMPILE_BY'
> make: *** [init/version.o] Error 1

Hi.

Does -02 mean -O2?

Do you compile over NFS? Did you try compiling locally, or on another
system version? It really may be NFS or some system bug if you compile on
some old system (what kernel version did you compile on? They may differ
on slack and redhat machines).

I recently had a bug with the GNU assembler, which could safely compile all
the files I tried except those whose names were any combination of a
3-character name plus a 1-character suffix.
A really interesting bug too.

Bye   B.




Re: Non keyboard trigger of Alt-SysRQ-S-U-B

2001-04-04 Thread Boris Pisarcik

> If you have a serial console on the server, you can get sysrq by
> sending a serial break followed by the character. 

Hi,

I've tried it with minicom and it worked: ctrl+a, F, and then the key for
the function, as in normal sysrq.

This approach will probably not help you a lot though, since you wouldn't
have access to your serial console either.

If you haven't got your problems solved by now, I could make a simple
single-purpose module which will export a syscall to a privileged process
to call the sysrq functions.

Going to have a look...

Bye B.




Re: Non keyboard trigger of Alt-SysRQ-S-U-B

2001-04-04 Thread Boris Pisarcik

Hi Nathan,

I've just made an experimental module which offers a syscall to a privileged
process; internally it translates into a call to the real sysrq handler
(handle_sysrq) defined in drivers/char/sysrq.c. It takes over one of the
unused Linux system calls (specifically stty, no. 31).
The Makefile and a patch for sysrq.c are included in the attached archive.
(I strongly believe I didn't make it reversed :).
The patch itself only exports 1 variable and 1 function from sysrq.c
that normally aren't exported.

You can make a daemon which listens on a socket and triggers commands
sent by clients. Don't call sysrq+boot until the time needed to sync and
unmount has passed. A check that sync and/or umount have finished before
boot really should be done, but it would require more changes in the kernel
source. And of course, security has to be taken into account in the
client/server.

Bye   B.
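
The ordering advice above (no boot before sync and unmount) could be
enforced in the daemon with a small guard like the one sketched here. It is
an assumption-laden illustration and not part of the attached archive:
srq_guard, srq_verdict and srq_check are hypothetical names, and the guard
only records that 's' and 'u' were requested. It cannot tell whether the
sync or remount has actually finished, which, as noted, would need more
kernel changes.

```c
#include <stdbool.h>

/* Per-daemon record of which sysrq keys have been requested so far. */
struct srq_guard {
    bool synced;      /* 's' (sync) was requested */
    bool unmounted;   /* 'u' (remount read-only) was requested */
};

enum srq_verdict { SRQ_ALLOW, SRQ_DENY, SRQ_UNKNOWN };

/*
 * Decide whether a key received from a client may be passed on to the
 * sysrq syscall. 'b' (boot) is refused until both 's' and 'u' were seen.
 */
static enum srq_verdict srq_check(struct srq_guard *g, char key)
{
    switch (key) {
    case 's':
        g->synced = true;
        return SRQ_ALLOW;
    case 'u':
        g->unmounted = true;
        return SRQ_ALLOW;
    case 'b':
        return (g->synced && g->unmounted) ? SRQ_ALLOW : SRQ_DENY;
    default:
        return SRQ_UNKNOWN;
    }
}
```

A daemon would call srq_check() for each byte read from an authenticated
client and invoke the syscall only on SRQ_ALLOW, ideally with a delay
between 'u' and 'b'.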

 srq.tar.gz


Re: a quest for a better scheduler

2001-04-04 Thread Hubertus Franke


I'll give you a concrete example:

Running DB2 on an SMP system.

In DB2 there is a process/thread pool that is sized based
on memory and numcpus. People tell me that the size of this pool
is on the order of 100s for an 8-way system with a reasonably
sized database. These determine the number of agents
that can simultaneously execute an SQL statement.

Requests are flying in for transactions (e.g. driven by TPC-W like
applications). The agents are grabbed from the pool and concurrently
fire the SQL transactions.
Assuming that there is enough concurrency in the database, there is
no reason to believe that the majority of those active agents are
not effectively running. TPC-W loads have been observed with 100s of
active transactions at a time.

Of course limiting the number of agents would reduce concurrently
running tasks, but would limit the responsiveness of the system.
Implementing a database in the kernel a la TUX doesn't seem to be
the right approach either (complexity, fault containment, ...)

Hope that is one example people accept.

I can dig up some information on WebSphere Applications.

I'd love to hear from some other applications that fall into
a similar category as the above and substantiate a bit the need
for 100s of running processes, without claiming that the
application is broken.

Hubertus Franke
Enterprise Linux Group (Mgr),  Linux Technology Center (Member Scalability)
, OS-PIC (Chair)
email: [EMAIL PROTECTED]
(w) 914-945-2003(fax) 914-945-4425   TL: 862-2003



Mark Hahn <[EMAIL PROTECTED]> on 04/04/2001 02:28:42 PM

To:   Hubertus Franke/Watson/IBM@IBMUS
cc:
Subject:  Re: a quest for a better scheduler



> ok if the runqueue length is limited to a very small multiple of the #cpus.
> But that is not what high end server systems encounter.

do you have some example of this in mind?  so far, no one has
actually produced an example of a "high end" server that has
long runqueues.







Re: mysqld [3.2.23] hangs when key_buffer ~256MB on [2.4.2-ac28+]

2001-04-04 Thread Alan Cox

> I initially upgraded my kernel from 2.4.2-ac5 to 2.4.3 and the first thing I
> noticed was that mysqld was stuck.  Killing it left it hanging in a D state.
> Then I tried 2.4.2-ac28 (which I am using now), and got the same result.

I'd expect that bit. 2.4.2-ac28 basically has the same new rwlock VM and
behaviour as 2.4.3pre8. What would be really useful to know is if anyone can
duplicate the problem on non-x86.

> Can anyone reproduce this problem?

Yes



Re: linux 2.4.3 crashed my hard disk

2001-04-04 Thread Alan Cox

> After I did put in /etc/sysconfig/harddisks 
>   USE_DMA=1
> my system did crash very badly, I guess after my hard disks did wake up

So you forced DMA on

> BTW: my motherboard runs at 112 Mhz, overclocked, was 100 Mhz.

and ran overclocked

> Been running this configuration over more than 2 years now without such
> major problems.
> Could this be the cause?

Quite possibly. There are reasons we ignore bug reports from overclockers




mysqld [3.2.23] hangs when key_buffer ~256MB on [2.4.2-ac28+]

2001-04-04 Thread Vibol Hou

I initially upgraded my kernel from 2.4.2-ac5 to 2.4.3 and the first thing I
noticed was that mysqld was stuck.  Killing it left it hanging in a D state.
Then I tried 2.4.2-ac28 (which I am using now), and got the same result.

My key_buffer was set to 256MB, so I figured maybe it was something to do
with memory usage so I lowered that figure to 128MB and restarted the
system to clear the D state procs.  Everything works fine now.  2.4.2-ac5
did not have issues with the larger key_buffer.

Can anyone reproduce this problem?

--
Vibol Hou
KhmerConnection, http://khmer.cc
"Connecting Cambodian Minds, Art, and Culture"




linux 2.4.3 crashed my hard disk

2001-04-04 Thread Frank Cornelis

Hey,

After I did put in /etc/sysconfig/harddisks 
USE_DMA=1
my system did crash very badly, I guess after my hard disks did wake up
again. For a while I thought I'd lose some sectors because of this, I had
to re-install my RedHat 7.0, had a not so productive day :) But, hard
disks are OK now.
I thought I should report this.
Below there is a copy of my dmesg log.

BTW: my motherboard runs at 112 Mhz, overclocked, was 100 Mhz.
Been running this configuration over more than 2 years now without such
major problems.
Could this be the cause?

Frank.

Linux version 2.4.3 (root@bluewall) (gcc version 2.96 2731 (Red Hat Linux 7.0)) #16 Sun Apr 1 18:24:33 CEST 2001
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820:  - 0001 (reserved)
 BIOS-e820: 0010 - 07ff (usable)
 BIOS-e820: 07ff3000 - 0800 (ACPI data)
 BIOS-e820: 07ff - 07ff3000 (ACPI NVS)
Scan SMP from c000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f for 65536 bytes.
Scan SMP from c009fc00 for 4096 bytes.
On node 0 totalpages: 32752
zone(0): 4096 pages.
zone(1): 28656 pages.
zone(2): 0 pages.
mapped APIC to e000 (01222000)
Kernel command line: auto BOOT_IMAGE=Linux ro root=303 BOOT_FILE=/boot/bzImage idebus=66
ide_setup: idebus=66
Initializing CPU#0
Detected 392.565 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 783.15 BogoMIPS
Memory: 126136k/131008k available (1327k kernel code, 4484k reserved, 479k data, 240k init, 0k highmem)
Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
Buffer-cache hash table entries: 4096 (order: 2, 16384 bytes)
Page-cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
VFS: Diskquotas version dquot_6.4.0 initialized
CPU: Before vendor init, caps: 0183f9ff  , vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0183f9ff   
CPU: After generic, caps: 0183f9ff   
CPU: Common caps: 0183f9ff   
CPU: Intel Pentium II (Deschutes) stepping 01
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.37 (20001109) Richard Gooch ([EMAIL PROTECTED])
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfb3b0, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
Unknown bridge resource 0: assuming transparent
PCI: Using IRQ router PIIX [8086/7110] at 00:07.0
PCI: Found IRQ 10 for device 00:07.2
PCI: The same IRQ used for device 00:09.0
Limiting direct PCI/PCI transfers.
isapnp: Scanning for Pnp cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
IA-32 Microcode Update Driver: v1.08 <[EMAIL PROTECTED]>
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.14)
Starting kswapd v1.8
parport0: PC-style at 0x378 [PCSPP(,...)]
parport0: irq 7 detected
rivafb: RIVA MTRR set to ON
Console: switching to colour frame buffer device 80x30
rivafb: PCI nVidia NV4 framebuffer ver 0.9.2a (RIVA-VTNT2, 32MB @ 0xD400)
pty: 256 Unix98 ptys configured
lp0: using parport0 (polling).
block: queued sectors max/low 83690kB/27896kB, 256 slots per queue
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 66MHz system bus speed for PIO modes
PIIX4: IDE controller on PCI bus 00 dev 39
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
hda: QUANTUM FIREBALL EX6.4A, ATA DISK drive
hdb: ST38421A, ATA DISK drive
hdc: QUANTUM FIREBALL ST2.1A, ATA DISK drive
hdd: IDE/ATAPI CD-ROM 36X, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 12594960 sectors (6449 MB) w/418KiB Cache, CHS=784/255/63, UDMA(33)
hdb: 16498944 sectors (8447 MB) w/256KiB Cache, CHS=16368/16/63, UDMA(33)
hdc: 4124736 sectors (2112 MB) w/81KiB Cache, CHS=4092/16/63, UDMA(33)
hdd: ATAPI 16X CD-ROM drive, 128kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12
Partition check:
 hda: hda1 hda2 hda3
 hdb: hdb1 hdb2 hdb3
 hdc: [PTBL] [1023/64/63] hdc1 hdc4
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
NTFS version 000607
Serial driver version 5.05 (2000-12-13) with MANY_PORTS SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
ne2k-pci.c:v1.02 10/19/2000 D. Becker/P. Gortmaker
  http://www.s

Re: nfs performance at high loads

2001-04-04 Thread Alan Cox

>   We have been seeing some problems with running nfs benchmarks
> at very high loads and were wondering if somebody could show
> some pointers to where the problem lies.
>   The system is a 2.4.0 kernel on a Red Hat 6.2 distribution ( so

Use 2.2.19. The 2.4 VM is currently too broken to survive high I/O benchmark
tests without going silly




nfs performance at high loads

2001-04-04 Thread Kapish K

Hello,
We have been seeing some problems with running nfs benchmarks
at very high loads and were wondering if somebody could give
some pointers to where the problem lies.
The system is a 2.4.0 kernel on a Red Hat 6.2 distribution (so
nfs utils from 6.2 and the nfsd of 2.4.0) - the benchmark run
is the SPECsfs97 benchmark, which runs through a series of
nfs operations. We have about 4 nfs clients, each invoking the
operations via 8 processes. Everything goes fine up to the
500-1000 IOPs mark - no errors, response time is good (0.8
sec/op) and throughput is as expected. But at the 1500 IOPs
mark, errors show up (nfs operation failures) and response
time degrades to 1.4 Msec/Op. Continuing to 2000 IOPs, there is a
drop in the error count and the response time improves to 1.0
Msec/Op. But from there on, it gets worse at 2500 IOPs and 3000
IOPs, with a huge number of nfs errors, and finally the nfs server
console scrolls on with an endless number of 'alloc-pages:
0-order allocation failed' messages and the clients shut down due to
too many rpc call failures, and all that can be done on the server is
to reboot the system as it becomes practically locked for all
purposes.
Any hints or directions to follow, or word on whether such
benchmark testing has been carried out by somebody else for nfs
performance, would be very much appreciated.
Thanks,
KK





[Lse-tech] HP Plug In Policies vs Multiqueue Scheduler (fwd)

2001-04-04 Thread Scott Rhine

There has been a little cross talk lately about the "HP" schedulers that
may be sowing some confusion.

1) Pluggable policies provides a minimally intrusive way to develop and
   test new scheduler policies such as Processor Sets, or the Fair Share
   Scheduler.  It provides a good way to test a theory without rebooting.
   (Linus said that this approach was useful for experiments but was too
   dangerous to allow in the main line kernel. sigh. We're still working on a
   way to get *some* flexibility via goodness, etc.)

2) The Multi-queue approach I put on our web site is not pluggable, because
   the prototype didn't generate enough interest or performance improvement.
   It was an academic exercise showing what I considered the minimum change 
   necessary.  It took about two days to code and measure the revised scaling.
   Consider it the opening volley of a group discussion, not a finished product.

3) the cpu stealing rules try to mimic those for an earlier kernel for an i386
   architecture.  Due to the ~20 penalty, stealing was not mathematically 
   possible between cpus until everything with preference for that CPU has 
   reached 0 count.  When one queue is empty, I try to emulate the single 
   queue behavior and pick the best job from all queues.  This was for 
   simplicity and compatibility, not fairness or speed.

What they say is true, there is no such thing as bad publicity.  I've had more
MQ downloads this week than I did when they were new.
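
The stealing rule in point 3 can be illustrated with a toy scoring function.
This is a sketch under stated assumptions, not the multi-queue patch: it
scans one flat array instead of per-CPU queues, the bonus of 20 is taken
from the "~20 penalty" above, counters are assumed never to exceed that
bonus, and toy_task, toy_goodness and pick_best are made-up names. A task
that last ran on the choosing CPU gets the bonus while it still has time
left, so a remote task can only win once every local task has reached 0.

```c
#include <stddef.h>

#define AFFINITY_BONUS 20   /* assumption: the "~20 penalty" from the text */

struct toy_task {
    int counter;    /* remaining time slice, assumed <= AFFINITY_BONUS */
    int last_cpu;   /* CPU the task prefers (where it last ran) */
};

/* Simplified score: an exhausted task scores 0 and gets no bonus. */
static int toy_goodness(const struct toy_task *t, int this_cpu)
{
    if (t->counter == 0)
        return 0;
    return t->counter + (t->last_cpu == this_cpu ? AFFINITY_BONUS : 0);
}

/* Index of the best candidate for this_cpu, or -1 if nothing is runnable. */
static int pick_best(const struct toy_task *tasks, size_t n, int this_cpu)
{
    int best = -1;
    int best_score = 0;

    for (size_t i = 0; i < n; i++) {
        int score = toy_goodness(&tasks[i], this_cpu);
        if (score > best_score) {
            best_score = score;
            best = (int)i;
        }
    }
    return best;
}
```

With a local task at counter 1 and a remote task at counter 20, CPU 0 still
picks the local task (21 against 20); once the local counter reaches 0 the
remote task is stolen.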



Re: 2.4 kernel hangs on 486 machine at boot

2001-04-04 Thread Brian Gerst

Alan Cox wrote:
> 
> > Problem: Linux kernel 2.4 consistently hangs at boot on 486 machine
> >
> > Shortly after lilo starts the kernel it hangs at the following message:
> > Checking if this processor honours the WP bit even in supervisor mode...
> > 
> 
> Does this happen on 2.4.3-ac kernel trees ? I thought i had it zapped
> 

Yes, that fix in -ac should take care of it.  As to why only the 486
showed the problem, most 386's will not fault on the write protected
page (the whole reason for this test) and pentiums and later don't run
the test at all (assumed good).

--

Brian Gerst



stick processes in 2.4.3 trace (alt-sysrq t)

2001-04-04 Thread Pau


Here is the trace of a nautilus process in D state.
I'm rebooting now in 2.4.3-ac2 to see if it still happens.

Pau

 trace.bz2

