date:20001020

Re: Patch to remove undefined C code

2000-10-20 Thread Albert D. Cahalan


> Yes.  In practice the usual question is whether the compiler will
> evaluate the operands from left to right or from right to left,
> but the compiler is within its rights to evaluate the operands in
> any order it wants.

For ia64, it would be good to evaluate the operands in parallel.
One could do the same with naked Transmeta hardware and other VLIW
processors.

Tera's machine has cheap hardware threads that could be used.
Compaq's 21364 or 21464 may allow this too.
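
A minimal C sketch of the point under discussion: the order in which the two operands of `+` are evaluated is unspecified, so a program must not depend on it. The helper names (left, right, sum_operands, evaluation_order) are made up for illustration.

```c
static char call_log[2];   /* records which operand ran first and second */
static int  calls;

static int left(void)  { call_log[calls++] = 'L'; return 1; }
static int right(void) { call_log[calls++] = 'R'; return 2; }

/* The sum is always 3, but whether left() or right() runs first is
 * unspecified: the compiler may pick either order. */
int sum_operands(void)
{
    calls = 0;
    return left() + right();
}

/* Returns the two-entry log filled in by the last sum_operands() call. */
const char *evaluation_order(void)
{
    return call_log;
}
```

Calling sum_operands() and then inspecting evaluation_order()[0] shows which operand a given compiler evaluated first; either 'L' or 'R' is a conforming answer.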

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

RE: bind() allowed to non-local addresses

2000-10-20 Thread David Schwartz


> [EMAIL PROTECTED] said:
> >   There is NOT a bug in the JVM code that handles
> > java.net.DatagramSocket.  Don't you find it a little compelling that
> > the nearly identical JVM code passes the Java Compatibility test suite
> > on Linux 2.2, Solaris, HPUX, SCO, and even Windows?
>
> If the JVM spec says that it 'MUST' fail when used on a non-local
> address,
> and the POSIX spec for bind does not say that it 'MUST' fail, then yes,
> there is a bug in the JVM if it assumes that the two are compatible.

"The bind() function will fail if:
...
[EADDRNOTAVAIL]
The specified address is not available from the local machine"
-- SuS 2

Now I suppose we can argue about what it means for an address to be
available, but I'd say the words "will fail" make that pretty clear.

DS
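
One way to observe the behaviour being argued about is a small probe like the following. It is only a sketch: try_bind is a made-up helper, and the address used in the note after it is assumed not to be configured on the local machine.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try to bind a UDP socket to the given IPv4 address (port chosen by the
 * kernel).  Returns 0 if bind() succeeded, the errno value if socket() or
 * bind() failed, or -1 if the string is not a valid IPv4 address. */
int try_bind(const char *ipv4)
{
    struct sockaddr_in sa;
    int fd, err = 0;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = 0;
    if (inet_pton(AF_INET, ipv4, &sa.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return errno;

    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0)
        err = errno;          /* EADDRNOTAVAIL is the case quoted above */

    close(fd);
    return err;
}
```

With an address from the 192.0.2.0/24 documentation range, try_bind("192.0.2.1") returning EADDRNOTAVAIL matches the quoted text; a return of 0 is the non-local bind this thread is about.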




Re: process header declaration?

2000-10-20 Thread John Kacur


Timur Tabi wrote:
> 
> ** Reply to message from "Andrew C. Dingman"
> <[EMAIL PROTECTED]> on Thu, 19 Oct 2000 12:30:51 -0500 (EST)
> 
> > I'm working on a project for my senior seminar for which I (and my
> > profs) think I need to modify the process descriptor
> > struct. Unfortunately, I don't seem to be good enough with 'grep' to
> > figure out where the type is declared.
> 
> Spend the money to get a real editor, like Visual SlickEdit for Linux.  It
> makes analyzing the Linux kernel ten times easier.  In this case, a single
> keystroke would have told you where that structure (or any field in any
> structure) is defined.
> 
> --
> Timur Tabi - [EMAIL PROTECTED]
> Interactive Silicon - http://www.interactivesi.com

Linux distributions come with an amazing amount of quality tools, you
don't need to invest any extra dollars. Learn to use ctags or etags and
you can quickly find a type with vi or emacs.

John Kacur

Re: Any dual AGP slot motherboards?

2000-10-20 Thread Timothy A. Seufert


>So they started with the PCI spec, but they changed the logical meaning
>of a lot of the bus signals, they added a lot of bus signals, they run it
>about 8 times faster, they changed the voltage and a bunch of other stuff.
>I say it is a different animal now.

Sorry to post yet another correction, but you've misinterpreted the AGP
spec a bit.  To really understand what's going on, you must read farther
into the spec.

AGP adds "sideband" signals to 100% standard 66 MHz 3.3V PCI.  The
sideband signals are for setup, control, and clocking of AGP mode
transactions.  AGP transactions re-use the PCI data pins for data
transfer, but at no time is the PCI logical or electrical protocol
violated.  During AGP transfers, the PCI state machines on both sides
are simply told the bus is busy.

At least as of AGP 2X, AGP was such a compatible superset of PCI that it
was possible to connect an AGP 2X device to a PCI bus and have it work. 
You can't do that if you want to use AGP transfers, since they require a
point-to-point bus, but if the AGP pins are left unconnected the device
acts as a normal PCI device.

As a matter of fact, most if not all current PCI video cards have the
same chips as the respective AGP versions.  I personally own an ATI PCI
video card which has "AGP 2X" silkscreened on top of its Rage Pro chip. 
This was likely a key element of Intel's plan for marketing AGP to the
industry; AGP would have been a much harder sell if it had forced
companies to develop different chips for the AGP and PCI versions of
their cards.

It does sound like they've finally changed signalling levels for AGP
4X.  If PCI-mode transfers have to use the new electrical spec, it's the
first time the PCI part of AGP has departed in any way from standard
PCI.

New feature:SYN cookies firewall.

2000-10-20 Thread Bordi Zhou


 A new feature for kernel 2.2.17.
SYN cookies firewalling. You can use it
to protect a whole net to avoid the SYN flooding
attack. The attachment is kernel patch and administration
tool.
Bordi

 ip_scfw-0.9.1.tar.gz

trouble with eepro100+catalyst

2000-10-20 Thread umbertogs


We're having lots of trouble with an eepro100 and a Cisco Catalyst switch;
my network is a VLAN. With Red Hat 6.2/7.0 I cannot ping the gateway, but
with Slackware 7.0 it works. What's the magic?

Regards,

Umberto 
Systems Analyst
.comDominio
Brazil  



Re: question wrt context switching during disk i/o

2000-10-20 Thread Mark Hahn


> This is something that has been bugging me for a while.  I notice
> on my system that during disk write we do much context switching,
> but not during disk read.  Why is that?

bdflush is broken in current kernels.  I posted to linux-mm about this,
but Rik et al haven't shown any interest.  I normally see bursts of 
up to around 40K cs/second when doing writes; I hacked a little 
preemption counter into the kernel and verified that they're practically
all bdflush...


Re: Patch to remove undefined C code

2000-10-20 Thread Rick Hohensee


([EMAIL PROTECTED]) nobly encouraged empiricism thusly...
>   Just to encourage empiricism, I usually check stuff like that with
>
>   % cat > foo.c
>   main(){
>   int i;
>   i = 1 , 2;
>   printf("%d\n",i);
>   }
>   % gcc foo.c
>   % ./a.out
>   1
>
>   or similar if I'm confused.

Indeed. Further curiosity is sometimes promptly slaked in a manner
such as this...


:; cLIeNUX0 /dev/tty4  17:43:31   /
:;cc1
main(){
int i;
i = 1 , 2;
printf("%d\n",i);
}
:; cLIeNUX0 /dev/tty4  17:43:31   /
:;

The output of cc1 is not shown in the above. You'll have to
play that little ode to the joy of empiricism yourself.
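
For readers who do not want to run the experiment, here is a short sketch of why the quoted program prints 1. The function names are made up for illustration, and a compiler may warn that one operand of the comma has no effect.

```c
/* "i = 1 , 2;" parses as "(i = 1), 2;" because assignment binds tighter
 * than the comma operator, so i ends up as 1 and the 2 is discarded. */
int assign_then_comma(void)
{
    int i;
    i = 1 , 2;
    return i;
}

/* With parentheses the comma expression is evaluated first; its value is
 * that of the right-hand operand, so i ends up as 2. */
int comma_in_parens(void)
{
    int i;
    i = (1 , 2);
    return i;
}
```

The first function reproduces the statement from the quoted foo.c; the second shows the parenthesized form that people sometimes expect it to mean.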

Rick Hohensee

Re: problem with filesystem (fat ?)

2000-10-20 Thread Arkadiusz Miskiewicz


On Wed, Oct 18, 2000 at 07:59:23AM +0200, Arkadiusz Miskiewicz wrote:
> Oct 18 09:51:09 dark kernel: invalid operand: 
> Oct 18 09:51:09 dark kernel: CPU:0
> Oct 18 09:51:09 dark kernel: EIP:
>0010:[ne2k-pci:__insmod_ne2k-pci_O/lib/modules/2.4.0-test9/kernel/drivers/+-789199/96]
...

IMHO that was caused by some corruption on the fat32 partition. Now, for example,
I get errors such as:
Oct 21 01:02:48 arm kernel: attempt to access beyond end of device
Oct 21 01:02:48 arm kernel: 03:01: rw=0, want=11047331, limit=9912136
Oct 21 01:02:48 arm kernel: Directory sread (sector 0x1512345) failed
Oct 21 01:02:48 arm kernel: attempt to access beyond end of device
Oct 21 01:02:48 arm kernel: 03:01: rw=0, want=11047331, limit=9912136
Oct 21 01:02:48 arm kernel: Directory sread (sector 0x1512345) failed
Oct 21 01:02:48 arm kernel: attempt to access beyond end of device
Oct 21 01:02:48 arm kernel: 03:01: rw=0, want=11047331, limit=9912136

I think this is a problem with Linux fat32 support, because usually
when I write something to fat32 (under Linux) using long file names,
Norton Disk Doctor (v2001) almost always complains about the long
names (and truncates them), while under windoze everything seems to
be OK.

Any ideas ? (kernel 2.4.0-test9)

btw. who is linux fat32 fs maintainer ?

-- 
Arkadiusz Miskiewicz http://www.misiek.eu.org/ipv6/
PLD GNU/Linux [IPv6 enabled]http://www.pld.org.pl/

2.2.18pre17: usb-uhci verbosity

2000-10-20 Thread Martin Hicks



I moved from 2.2.18pre15 to 2.2.18pre17 and now I get the following
usb-uhci stuff being spewed to my terminal:

Oct 20 20:54:19 plato kernel: usb-uhci.c: interrupt, status 3, frame# 704
Oct 20 20:54:20 plato kernel: usb-uhci.c: interrupt, status 3, frame# 1088
Oct 20 20:54:20 plato kernel: usb-uhci.c: interrupt, status 3, frame# 1472
Oct 20 20:54:20 plato kernel: usb-uhci.c: interrupt, status 3, frame# 1856

Everything seems to be functioning correctly though.  Reason?

thanks
mh

-- 
Martin Hicks   || [EMAIL PROTECTED]
Use PGP/GnuPG  || DSS PGP Key: 0x4C7F2BEE  
Beer: So much more than just a breakfast drink.


Re: TRACED] Re: "Tux" is the wrong logo for Linux

2000-10-20 Thread James Lewis Nance


On Fri, Oct 20, 2000 at 03:45:29PM -0400, Ricky Beam wrote:
> On Thu, 19 Oct 2000, Richard B. Johnson wrote:
> >Cary, NC. can't be very large. There are, probably, three persons in

> If that were really true, then the world is in trouble... one of Cisco's
> largest offices is here.  Nortel has a large footprint as well.

It's also home to the corporate headquarters of the world's largest privately
held software company (SAS).

Jim

about /proc/meminfo and mmap

2000-10-20 Thread Zhixu Liu


Hi, all:

My PC has 128M of RAM, but /proc/meminfo displays 122424K, not
128*1024K = 131072K. What does this mean?

My program needs a 32M buffer, so I added append="mem=96M" to lilo.conf.
The kernel then only knows about 96M of memory, and I can use the remaining
32M. The following is a simple example:

//
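 int fd = open("/dev/mem", O_RDWR);
 if (fd < 0) {
     printf("failed to open /dev/mem\n");
     return -1;
 }
 /* Map the reserved physical range; BASE_ADDRESS must be page-aligned. */
 start = (DATA *) mmap(0, length*sizeof(DATA), PROT_READ|PROT_WRITE,
                       MAP_SHARED, fd, BASE_ADDRESS);
 if (start == (DATA *) MAP_FAILED) {
     printf("failed to map /dev/mem\n");
     close(fd);
     return -1;
 }

 // do ...

 /* Unmap the same number of bytes that was mapped. */
 munmap(start, length*sizeof(DATA));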
 
//

Is there some problem? Is all of the DATA in real RAM? Any
suggestions are welcome.
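
The arithmetic behind the mem=96M setup can be sketched as follows. This assumes RAM is contiguous from physical address 0 and that nothing else claims the top of memory; struct reserved_region and reserved_above are made-up names, and whether the result equals the BASE_ADDRESS used above is not stated in the message.

```c
#define MB (1024UL * 1024UL)

/* Physical range left over when the kernel is booted with mem=<kernel_mb>M
 * on a machine with total_mb of RAM.  Both values are multiples of 1 MB
 * and therefore page-aligned, as an mmap() offset has to be. */
struct reserved_region {
    unsigned long base;    /* first byte not managed by the kernel */
    unsigned long length;  /* size of the reserved range in bytes  */
};

struct reserved_region reserved_above(unsigned long total_mb,
                                      unsigned long kernel_mb)
{
    struct reserved_region r;

    r.base = kernel_mb * MB;
    r.length = (total_mb - kernel_mb) * MB;
    return r;
}
```

For 128M total and mem=96M this gives a base of 0x6000000 and a length of 0x2000000 (32M).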

Thanks a lot.

Regards.

Zhixu


Re: bind() allowed to non-local addresses

2000-10-20 Thread Alexander Viro




On Fri, 20 Oct 2000, Matt Peterson wrote:

> cvs-1.10.8/vms/rcmd.c:64:rs = bind(s, (struct sockaddr *)&local_isa,
             ^^^
> sizeof(local_isa)); 
> cvs-1.10.8/vms/rcmd.c:79:rs = bind(s, (struct sockaddr *)&local_isa,
             ^^^
> sizeof(local_isa));
> 
> The cvs code does call bind, but you are right, it does not check rs for

Unless something really terrible happened quite recently, Linux is _not_
VMS.


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Alexander Viro

On Thu, 19 Oct 2000, Linus Torvalds wrote:

> You'd have to do something like
> 
>   LockPage(page); /* Nobody gets to write to this page (except 
>through mmaps, ugh) */
>   gather_all_mmap_users(page);/* THIS is the nasty one */

Wait a second. invalidate_inode_pages() has no idea of range, right? Finding
all VMAs of shared mappings for given inode is not a big deal. Sure,
repeating it for each bloody page would be painful at extreme, but
"make sure that every access to address within _that_ VMA will result in
pagefault" looks like a reasonable operation. Basically, we have
two sides - pagecache and many VMAs. And loop over VMAs with per-VMA
operations sounds more reasonable than loop over pages. We _can_ block
pageins without messing with pages themselves (it starts with finding VMA,
after all) and we can block new shared mappings (not a big deal).

Comments? Basically, I propose per-VMA rw-semaphore taken on page-in for
read and on that "flush and make sure we'll reread" operation for write +
rw-sem on i_mmap_shared. We could even make invalidation work for less
than whole file, all we need for that is skipping the VMAs out of the
range. Ingo, Linus?

>   nfs_wb_page(page);  /* force write-back on this page */
>   ClearPageUptodate(page);/* mark it not up-to-date to force a read-in 
>next time */
>   UnlockPage(page);   /* Ok, now the client can go wild */
> 
> where everything but the "gather_all_mmap_users()" part is fairly
> straightforward.  The "gather" phase is nasty - it would need to figure
> out every place the page is mapped, make sure those are synchronized (ie
> something like marking the page table entry write-protected and causing a
> TLB invalidate SMP cross-call - at which point the resulting page fault
> and the page lock will catch anybody who tries to write to the page)..

> In no case could you do something like what the current
> invalidate_inode_pages() does, which is to just try to drop the page from
> the cache - that really only works if we're the only user of that page,
> which the "page_count != 1" test now enforces.


Re: bind() allowed to non-local addresses

2000-10-20 Thread Matt Peterson

Eric Lammerts wrote:
> 
> On Fri, 20 Oct 2000, Matt Peterson wrote:
> > Are you also suggesting that every other program that expects bind() to
> > fail with EADDRNOTAVAIL is broken too?  Just for fun, I grepped all
> > sources of software shipped in Caldera's distributions for instances of
> > where a check is made for EADDRNOTAVAIL after a call to bind().  Guess
> > what else besides Java is probably "broken" ...
> >
> > - lpng
> > - bind 8.2
> > - automount
> > - cvs
> > - dhcpd
> > - KDE
> > - UCL mbone
> > - ncftp
> > - netatalk
> > - nfsd
> > - rexec
> > - pppd
> > - sendmail
> > - xchat
> 
> Just for fun I looked at the sources of cvs, ncftp, netatalk, rexec
> and pppd. Guess what? None of them check for EADDRNOTAVAIL after a
> call to bind(). 

I stand corrected.  I double checked and not all of the above check
EADDRNOTAVAIL after a bind().  My grep script was only smart enough to
check for calls to bind() and EADDRNOTAVAIL .  It turns out that
EADDRNOTAVAIL is also a commonly checked return code to the
ioctl(SIOCDIFADDR) which is not an issue because it probably does not
follow the bind() code path through the kernel.  

> Cvs and pppd don't even call bind()!
> 
> Get your facts straight, please.

cvs-1.10.8/vms/rcmd.c:64:rs = bind(s, (struct sockaddr *)&local_isa,
sizeof(local_isa)); 
cvs-1.10.8/vms/rcmd.c:79:rs = bind(s, (struct sockaddr *)&local_isa,
sizeof(local_isa));

The cvs code does call bind, but you are right, it does not check rs for
EADDRNOTAVAIL.  pppd uses the ioctl() mentioned above.  My apologies.

I do not have time to go through and analyze the code to see if the success
of bind when the interface is not known would cause any problems.  My
guess is that it would not because before binding the interface is
looked up via ioctl() or gethostbyname().  Also as mentioned earlier in
this thread, INADDR_ANY is also commonly used.  

The point I probably failed in making is that (right or wrong) many
developers (because of tradition, documentation and various specs)
expect bind() on a non-local address to fail.  This is certainly the
case with Sun and many authors of Sockets interface documentation.  

Anyway, I am through discussing the issue.  We will probably use the
sysctl solution posted by David Miller earlier in the thread with
default bind() behavior reverted.

-- 
Matthew Peterson
Sr. Software Engineer
Caldera Systems, Inc
[EMAIL PROTECTED]


Re: Any dual AGP slot motherboards?

2000-10-20 Thread Gary E. Miller

Yo All!

On Fri, 20 Oct 2000, J . A . Magallon wrote:

> AFAIK, AGP is just a preferred PCI slot on the PCI bus; that is why you can
> only have ONE AGP port on a PCI bus.
No.  The AGP bus is never ON a PCI bus.

To quote the AGP 2.0 spec section 1.2:

"A.G.P. neither replaces nor diminishes the necessity of PCI in the system. 
This high speed port (A.G.P.) is physically, logically, and electrically 
independent of the PCI bus."

"The A.G.P. interface specification uses the 66 MHz PCI (PCI Local Bus 
Specification) specification as an operational baseline, and provides four 
significant performance extensions or enhancements to the PCI specification
which are intended to optimize the A.G.P. for high performance 3D graphics 
applications. These A.G.P. extensions are not described in, or required by, 
the PCI Local Bus Specification. These extensions are:
· Deeply pipelined memory read and write operations, fully hiding memory 
  access latency.
· Demultiplexing of address and data on the bus, allowing almost 100% bus 
  efficiency.
· New AC timing in the 3.3 V electrical specification that provides for 
  one or two data transfers per 66-MHz clock cycle, allowing for real data 
  throughput in excess of 500 MB/s.
· A new low voltage electrical specification that allows four data transfers 
  per 66-MHz clock cycle, providing real data throughput of up to 1 GB/s."

So they started with the PCI spec, but they changed the logical meaning
of a lot of the bus signals, they added a lot of bus signals, they run it 
about 8 times faster, they changed the voltage and a bunch of other stuff.  
I say it is a different animal now.
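
The throughput figures quoted from the spec, and the "about 8 times faster" estimate, can be checked with simple arithmetic. This sketch assumes a 32-bit (4-byte) data path for both buses, a 33 MHz clock for plain PCI, and rounds the AGP clock to 66 MHz; peak_mb_per_s is a made-up helper.

```c
/* Peak transfer rate in MB/s (1 MB = 10^6 bytes) for a parallel bus:
 * clock in MHz x bus width in bytes x data transfers per clock. */
unsigned peak_mb_per_s(unsigned clock_mhz, unsigned width_bytes,
                       unsigned xfers_per_clock)
{
    return clock_mhz * width_bytes * xfers_per_clock;
}
```

Under those assumptions, two transfers per clock give 528 MB/s ("in excess of 500 MB/s"), four give 1056 MB/s ("up to 1 GB/s"), and 33 MHz PCI gives 132 MB/s, which is one eighth of the four-transfer figure.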

> Please, could an expert point to the AGP standard defs ?
http://developer.intel.com/technology/agp/

The spec itself is at:
ftp://download.intel.com/technology/agp/downloads/agp20.pdf

In any case, I do not see how this topic belongs on the l-k list.
Contact me off-list if you need more info.

RGDS
GARY
---
Gary E. Miller Rellim 20340 Empire Ave, Suite E-3, Bend, OR 97701
[EMAIL PROTECTED]  Tel:+1(541)382-8588 Fax: +1(541)382-8676


Re: DMA and my Maxtor drive

2000-10-20 Thread Jeff V. Merkey



Linux 2.X trees are tested with memory sent from the buffer cache to the
disk I/O subsystem aligned on 512-byte boundaries.  This error is
typically seen if someone is passing memory via calls to ll_rw_blk()
that is not at a minimum 512-byte aligned, which can result in a wrap
case that will hang the IDE DMA controller.  I do not believe Linux has
been tested with any other alignment.  When I first saw these errors,
while I was writing my own LRU for the NWFS file system, alignment was
the cause.
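
A user-space sketch of the alignment check described above. Kernel code would use kernel allocators instead; is_sector_aligned and alloc_sector_buffer are made-up names, and posix_memalign stands in only to show the 512-byte requirement.

```c
#include <stdint.h>
#include <stdlib.h>

#define SECTOR_SIZE 512u

/* Nonzero if the buffer address is a multiple of 512 bytes. */
int is_sector_aligned(const void *p)
{
    return ((uintptr_t)p % SECTOR_SIZE) == 0;
}

/* Allocate nsectors * 512 bytes on a 512-byte boundary.
 * Returns NULL on failure; release the buffer with free(). */
void *alloc_sector_buffer(size_t nsectors)
{
    void *p = NULL;

    if (posix_memalign(&p, SECTOR_SIZE, nsectors * SECTOR_SIZE) != 0)
        return NULL;
    return p;
}
```

A buffer from alloc_sector_buffer() passes is_sector_aligned(), while the same pointer offset by one byte does not.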

:-)

Jeff

[EMAIL PROTECTED] wrote:
> 
> I get this when DMA is enabled:
> 
> Oct 20 15:39:07 cr753963-a kernel: hdb: timeout waiting for DMA
> Oct 20 15:39:07 cr753963-a kernel: hdb: irq timeout: status=0x6e {
> DriveReady DeviceFault DataRequest CorrectedError Index }
> ide0: reset: success
> Oct 20 15:39:07 cr753963-a kernel: hdb: DMA disabled
> Oct 20 15:39:07 cr753963-a kernel: ide0: reset: success
> 
> It only happens when lots of data is being transferred, or compiled
> on the drive.. The drive status is this:
> 
> /dev/hdb:
> 
>  Model=Maxtor 82560A4, FwRev=AA8Z2726, SerialNo=C40LTQGA
>  Config={ Fixed }
>  RawCHS=4962/16/63, TrkSize=0, SectSize=0, ECCbytes=20
>  BuffType=DualPortCache, BuffSize=256kB, MaxMultSect=16, MultSect=off
>  CurCHS=4962/16/63, CurSects=5001696, LBA=yes, LBAsects=5001728
>  IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
>  PIO modes: pio0 pio1 pio2 pio3 pio4
>  DMA modes: mdma0 mdma1 *mdma2
> 

DMA and my Maxtor drive

2000-10-20 Thread linux



I get this when DMA is enabled:

Oct 20 15:39:07 cr753963-a kernel: hdb: timeout waiting for DMA
Oct 20 15:39:07 cr753963-a kernel: hdb: irq timeout: status=0x6e {
DriveReady DeviceFault DataRequest CorrectedError Index }
ide0: reset: success
Oct 20 15:39:07 cr753963-a kernel: hdb: DMA disabled
Oct 20 15:39:07 cr753963-a kernel: ide0: reset: success

It only happens when lots of data is being transferred, or compiled
on the drive.. The drive status is this:

/dev/hdb:

 Model=Maxtor 82560A4, FwRev=AA8Z2726, SerialNo=C40LTQGA
 Config={ Fixed }
 RawCHS=4962/16/63, TrkSize=0, SectSize=0, ECCbytes=20
 BuffType=DualPortCache, BuffSize=256kB, MaxMultSect=16, MultSect=off
 CurCHS=4962/16/63, CurSects=5001696, LBA=yes, LBAsects=5001728
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 *mdma2
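
The status=0x6e value in the log can be decoded by hand from the ATA status register bits. The sketch below uses the flag names printed in the log for the bits that are set; the names chosen for the remaining bits (Busy, SeekComplete, Error) and the helper decode_ata_status are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

static const struct {
    unsigned char mask;
    const char *name;
} ata_status_bits[] = {
    { 0x80, "Busy" },         { 0x40, "DriveReady" },
    { 0x20, "DeviceFault" },  { 0x10, "SeekComplete" },
    { 0x08, "DataRequest" },  { 0x04, "CorrectedError" },
    { 0x02, "Index" },        { 0x01, "Error" },
};

/* Write the space-separated names of all set bits into buf and return buf.
 * Output is truncated if buf is too small. */
const char *decode_ata_status(unsigned char status, char *buf, size_t len)
{
    size_t i, used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    for (i = 0; i < sizeof ata_status_bits / sizeof ata_status_bits[0]; i++) {
        int n;

        if (!(status & ata_status_bits[i].mask))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", ata_status_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;                 /* no room left for further names */
        used += (size_t)n;
    }
    return buf;
}
```

0x6e is 0x40 + 0x20 + 0x08 + 0x04 + 0x02, which yields the same five flags the kernel printed: DriveReady DeviceFault DataRequest CorrectedError Index.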



Re: Any dual AGP slot motherboards?

2000-10-20 Thread J . A . Magallon

On Fri, 20 Oct 2000 19:56:26 Gary E. Miller wrote:
> Yo James!
> 
> On Fri, 20 Oct 2000, James Simmons wrote:
> 
> > After much searching I couldn't find one. It was one of those mac rumors
> > people spread around. I still like to get more than one AGP going. If I
> > have multiple PCI bus in theory I should be able to have one AGP port on
> > each PCI bus. Right? 
> 
> AGP is much faster than PCI bus and has nothing to do with the 
> PCI bus.  So the number of multiple PCI buses has nothing to
> do with the number of AGP buses.
> 

AFAIK, AGP is just a preferred PCI slot on the PCI bus; that is why you can
only have ONE AGP port on a PCI bus. If all were AGP ports, you would have
a new-reinvented-ultra-fast-pci-bus. It is fast because it is special, just
for that. In linux, lspci -v also lists your AGP card, doesn't it ?

Please, could an expert point to the AGP standard defs ?

-- 
Juan Antonio Magallon Lacarta  mailto:[EMAIL PROTECTED]


NFS oops with 2.3.99-pre3

2000-10-20 Thread Albert D. Cahalan



While copying a file to an NFS filesystem, cp got stuck and I later
found an oops on the console.

These are mounted:

pc-sw8:(pid282) on /net type auto 
(intr,rw,port=1023,timeo=8,retrans=110,indirect,map=/etc/amd/amd.net)
inxs4:/export/home on /amd/inxs4/root/export/home type nfs (rw)

This seems to be the hung process. There is no PID 6569, and there
were no messages about PID 6571. Anyway, cp is stuck in __down().

  F STAT   PID  PPID %CPU PRI WCHAN   WCHAN COMMAND
000 D 6571 1  0.0  39 down   107ab0 cp

Running "uname -a" on the server called inxs4 reports:

SunOS inxs4 5.6 Generic_105181-11 sun4u sparc

I'm using an "amd" that comes from Debian, and it reports:

Unofficial patch level 102.
amd 5.2.2.2 of 1992/05/31 16:53:21 bsd44-beta #0: Wed Aug  9 14:16:02 PDT 2000

Here is everything I could get off the console. Since gpm is still
working, this ought to be accurate.

nfs warning: mount version newer than kernel
nfs_read_super: get root fattr failed
INIT: version 2.78 reloading
Unable to handle kernel NULL pointer dereference at virtual address 
 printing eip:
c015cc28
*pde = 
Oops: 0002
CPU:0
EIP:0010:[]
EFLAGS: 00010286
eax:    ebx: c0a8e064   ecx: c0b0e818   edx: c1f03dd0
esi: c1b36780   edi: c015cc18   ebp: c0a8e064   esp: c1f03c9c
ds: 0018   es: 0018   ss: 0018
Process cp (pid: 6569, stackpage=c1f03000)
Stack: c1f03d1c c01f5be6 c0a8e064 c0b0e818 c1f03dd0 0246 c1f03cec c1f03d1c
   0286 c01f86c3 c1f03d1c c1f03d14  c1f03d1c c1b36780 c1f02000
   c01f5232 c1f02000 c1f03d84 c1f03d84 c1f03cec c01f84a4 c01f8a02 c1f03d1c
Call Trace: [] [] [] [] [] 
[] []
   [] [] [] [] [] [] 
[] []
   [] [] [] []
Code: c7 00 02 00 00 00 8b 02 50 51 53 e8 fc fc ff ff 83 c4 0c 5b

First column is the address from above, second is the function:

c01094bc system_call
c0122aca generic_file_write
c012a2fc sys_write
c013bb28 update_atime
c0157775 nfs_commit_write
c0157829 nfs_file_write
c0157f79 nfs_writepage_sync
c0159270 nfs_updatepage
c015a8c1 nfs_instantiate
c015a9a5 nfs_create
c015bb1a nfs_proc_write
c015cc18 nfs_xdr_writeres
c015cc28 nfs_xdr_writeres
c01f5232 rpc_call_sync
c01f5242 rpc_call_sync
c01f5be6 call_decode
c01f6e78 xprt_timer
c01f84a4 __rpc_wake_up
c01f86c3 __rpc_execute
c01f8a02 rpc_execute
c01f9a3d rpc_init_task

These didn't map to anything. (many are stack addresses)

c0a8e064 ?
c0b0e818 ?
c1b36780 ?
c1f02000 ?
c1f03000 ?
c1f03c9c ?
c1f03cec ?
c1f03d14 ?
c1f03d1c ?
c1f03d84 ?
c1f03dd0 ?
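
The address-to-function mapping above is the kind of lookup usually done against a sorted symbol table such as System.map: find the last symbol whose start address is not above the address in question. A minimal sketch follows; struct ksym, ksym_lookup and the demo table are made up for illustration and contain no real kernel addresses.

```c
#include <stddef.h>

struct ksym {
    unsigned long start;   /* address where the symbol begins */
    const char   *name;
};

/* Return the name of the last symbol whose start address is <= addr, or
 * NULL if addr lies below the first symbol.  The table must be sorted by
 * ascending start address. */
const char *ksym_lookup(const struct ksym *tab, size_t n, unsigned long addr)
{
    size_t lo = 0, hi = n;          /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].start <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo ? tab[lo - 1].name : NULL;
}

/* Made-up table for illustration only; not real kernel addresses. */
const struct ksym demo_syms[] = {
    { 0x1000, "demo_alpha" },
    { 0x1400, "demo_beta"  },
    { 0x2000, "demo_gamma" },
};
```

A real tool would also check an upper bound (the end of kernel text), which is why stack and data addresses like those listed above should come back as unmapped rather than as the last symbol in the table.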


Equal cost multipath support

2000-10-20 Thread Balasubramanian Ramachandran


Hi,

Is equal cost multipath forwarding supported in linux 2.2
or 2.3 versions?

Thanks in advance,
Bala




Re: [PATCH] cpqarray: several fixes/cleanups

2000-10-20 Thread Jeff Garzik


Rasmus Andersen wrote:
> Now we are looking at this driver, could we include the following patch?
> It makes gcc stop complaining about unused functions and variables when
> compiling cpqarray.c.
> 
> --- linux-240-test10-pre4-clean/drivers/block/cpqarray.c  Thu Oct 19 21:20:31 2000
> +++ linux/drivers/block/cpqarray.c  Fri Oct 20 21:37:03 2000
> @@ -103,7 +103,9 @@
>  static int * ida_hardsizes;
>  static struct gendisk ida_gendisk[MAX_CTLR];
> 
> +#ifdef CONFIG_PROC_FS
>  static struct proc_dir_entry *proc_array;
> +#endif
> 
>  /* Debug... */
>  #define DBG(s) do { s } while(0)
> @@ -173,10 +175,6 @@
>  #ifdef CONFIG_PROC_FS
>  static void ida_procinit(int i);
>  static int ida_proc_get_info(char *buffer, char **start, off_t offset, int length, int *eof, void *data);
> -#else
> -static void ida_procinit(int i) {}
> -static int ida_proc_get_info(char *buffer, char **start, off_t offset,
> -int length, int *eof, void *data) { return 0;}
>  #endif
> 
>  static void ida_geninit(int ctlr)
> @@ -495,8 +493,9 @@
> 
> hba[i]->access.set_intr_mask(hba[i], FIFO_NOT_EMPTY);
> 
> -
> +#ifdef CONFIG_PROC_FS
> ida_procinit(i);
> +#endif
> 
> blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR + i),
> request_fns[i]);

Look at include/linux/proc_fs.h...  Like pci.h, it is designed to
eliminate the need for ifdef's in the code.  Is there another way you
could work up this patch, with that in mind?
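
The general pattern being pointed at can be sketched like this: the header supplies an empty inline stub when the feature is configured out, so the driver calls the function unconditionally. All names here (CONFIG_DEMO_PROC, demo_proc_register, demo_controller_init) are hypothetical and are not the actual contents of proc_fs.h.

```c
/* --- what a header in this style might provide (hypothetical names) --- */
#ifdef CONFIG_DEMO_PROC
int demo_proc_register(const char *name);   /* real implementation elsewhere */
#else
static inline int demo_proc_register(const char *name)
{
    (void)name;
    return 0;            /* compiles away when the feature is off */
}
#endif

/* --- driver code: no #ifdef needed at the call site --- */
int demo_controller_init(int ctlr)
{
    if (demo_proc_register("demo/array") != 0)
        return -1;
    return ctlr;
}
```

With the option undefined, the stub makes the call a no-op and the compiler has no unused function to complain about, which is the effect the patch was trying to get with #ifdef blocks.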

Jeff



-- 
Jeff Garzik| The difference between laziness and
Building 1024  | prioritization is the end result.
MandrakeSoft   |

Re: IDE disk slow? There's help...

2000-10-20 Thread Andre Hedrick


On Fri, 20 Oct 2000 [EMAIL PROTECTED] wrote:

> [EMAIL PROTECTED] wrote..
> 
> > I reliably get 30MB/s with my IBM 30G 7200rpm ATA66 drive, using a
> > Via VT82C586 controller.  2.4.0-test9.  Modern drives are really fast.
> 
> Hmm, I'm confused here.
> VIA 586 can only do up to UDMA 2, which should return speeds less than
> that. My system has an identical configuration, and I get ~12MB/s

No, they have the same PCI device ID but different guts.  This is the ugliness
that most never see.

> Something doesn't add up here.
> What mode do you have the drive in?
> 
> Regards,
> 
> Dave.
> 
> -- 
> | Dave Jones <[EMAIL PROTECTED]>  http://www.suse.de/~davej
> | SuSE Labs
> 

Andre Hedrick
The Linux ATA/IDE guy


Re: [PATCH] cpqarray: several fixes/cleanups

2000-10-20 Thread Rasmus Andersen


On Tue, Oct 17, 2000 at 05:59:22PM -0200, Arnaldo Carvalho de Melo wrote:
> Jeff,
> 
>   Here it is, resubmitting after rediffing wrt 2.4.0-test10-pre3.
> 
> - Arnaldo 
> 
> --- linux-2.4.0-test10-3/drivers/block/cpqarray.c Fri Oct 13 18:40:39 2000

(snipped Linus from the to-list since this is a not-so-severe request
for comments.)

Now we are looking at this driver, could we include the following patch?
It makes gcc stop complaining about unused functions and variables when
compiling cpqarray.c.


--- linux-240-test10-pre4-clean/drivers/block/cpqarray.c  Thu Oct 19 21:20:31 2000
+++ linux/drivers/block/cpqarray.c  Fri Oct 20 21:37:03 2000
@@ -103,7 +103,9 @@
 static int * ida_hardsizes;
 static struct gendisk ida_gendisk[MAX_CTLR];
 
+#ifdef CONFIG_PROC_FS
 static struct proc_dir_entry *proc_array;
+#endif
 
 /* Debug... */
 #define DBG(s) do { s } while(0)
@@ -173,10 +175,6 @@
 #ifdef CONFIG_PROC_FS
 static void ida_procinit(int i);
 static int ida_proc_get_info(char *buffer, char **start, off_t offset, int length, int *eof, void *data);
-#else
-static void ida_procinit(int i) {}
-static int ida_proc_get_info(char *buffer, char **start, off_t offset,
-int length, int *eof, void *data) { return 0;}
 #endif
 
 static void ida_geninit(int ctlr)
@@ -495,8 +493,9 @@
 
hba[i]->access.set_intr_mask(hba[i], FIFO_NOT_EMPTY);
 
-
+#ifdef CONFIG_PROC_FS
ida_procinit(i);
+#endif
 
blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR + i), 
request_fns[i]);


-- 
Regards,
Rasmus([EMAIL PROTECTED])

You don't become a failure until you're satisfied with being one. 
  -- Anonymous

Re: IDE disk slow? There's help...

2000-10-20 Thread davej


[EMAIL PROTECTED] wrote..

> I reliably get 30MB/s with my IBM 30G 7200rpm ATA66 drive, using a
> Via VT82C586 controller.  2.4.0-test9.  Modern drives are really fast.

Hmm, I'm confused here.
VIA 586 can only do up to UDMA 2, which should return speeds less than
that. My system has an identical configuration, and I get ~12MB/s

Something doesn't add up here.
What mode do you have the drive in?
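
For reference, the nominal interface ceilings of the Ultra DMA modes can be put in a small lookup. This is a sketch with a made-up helper name; the figures are bus limits only, and sustained disk throughput is lower.

```c
/* Nominal interface ceiling of Ultra DMA modes 0-4, in units of 0.1 MB/s.
 * Returns -1 for a mode outside that range. */
int udma_limit_tenths(int mode)
{
    static const int limit[] = { 167, 250, 333, 444, 667 };

    if (mode < 0 || mode > 4)
        return -1;
    return limit[mode];
}
```

Mode 2 gives 33.3 MB/s (the "UDMA 2" ceiling mentioned above) and mode 4 gives 66.7 MB/s (ATA66), so a sustained 30MB/s on a mode 2 interface would be close to the bus limit.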

Regards,

Dave.

-- 
| Dave Jones <[EMAIL PROTECTED]>  http://www.suse.de/~davej
| SuSE Labs


Re: IDE disk slow? There's help...

2000-10-20 Thread Andre Hedrick


On Fri, 20 Oct 2000, safemode wrote:

> 
> 
> That's what i was thinking, but 30MB/s seems to be quite an exaggeration.
> On my
> Intel Corporation 82371AB PIIX4 IDE (rev 01), ide chipset my master (10.2GB

Yes because that chipset is limited to Ultra33 rates

> maxtor 7200rpm UDMA66) drive i get ~15-16MB/s and on my slave (same
> interface, 20.1GB maxtor 7200rpm UDMA66), i get ~13MB/s.  This goes against
> logic as the bigger the drive the faster the transferrate should be, and
> it's about half of your estimate of Michael's 40GB.  Is this due to the
> slow disk access of 2.4.0-test10-preX ? Or am i experiencing a bug here?

There is something goofy in the block layer.
 
> Both drives are operating at UDMA33 mode (according to hdparm)  and both
> drives are set to using 32bit, dma, 16 sector read ahead and 16 sector
> multi-access mode.  I've posted results i've gotten from bonnie and
> bonnie++ before, in all cases, the performance seems to be lacking for the
> kind of hardware i have.   

You go through a buffered OS.

> 
> On Fri, 20 Oct 2000 14:58:41 Andre Hedrick wrote:
> > 
> > Michael,
> > 
> > Whatever card you are using, if you are getting that low I need to know
> > more info.  That drive should cook at 30MB/sec.
> 
> 

Andre Hedrick
The Linux ATA/IDE guy


Re: bind() allowed to non-local addresses

2000-10-20 Thread Eric Lammerts



On Fri, 20 Oct 2000, Matt Peterson wrote:
> Are you also suggesting that every other program that expects bind() to
> fail with EADDRNOTAVAIL is broken too?  Just for fun, I grepped all
> sources of software shipped in Caldera's distributions for instances
> where a check is made for EADDRNOTAVAIL after a call to bind().  Guess
> what else besides Java is probably "broken" ...
> 
> - lpng
> - bind 8.2
> - automount
> - cvs 
> - dhcpd
> - KDE
> - UCL mbone
> - ncftp
> - netatalk
> - nfsd
> - rexec
> - pppd
> - sendmail
> - xchat

Just for fun I looked at the sources of cvs, ncftp, netatalk, rexec
and pppd. Guess what? None of them check for EADDRNOTAVAIL after a
call to bind(). Cvs and pppd don't even call bind()!

Get your facts straight, please.

Eric
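
For readers following the argument, this is roughly what "checking for
EADDRNOTAVAIL after a call to bind()" looks like in a program. It is a
sketch only: try_bind() is a hypothetical helper, and 192.0.2.1 (an
address from a range reserved for documentation) is used in the usage
note on the assumption that it is not configured on the local machine.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Try to bind a UDP socket to the given IPv4 address on an ephemeral port.
 * Returns 0 on success, the errno value set by bind() on failure, or -1
 * if the address did not parse or the socket could not be created.
 */
int try_bind(const char *ipv4)
{
    struct sockaddr_in sa;
    int fd, rc;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(0);             /* let the kernel pick a port */
    if (inet_pton(AF_INET, ipv4, &sa.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Capture errno before close() has a chance to change it. */
    rc = bind(fd, (struct sockaddr *)&sa, sizeof(sa)) == 0 ? 0 : errno;
    close(fd);
    return rc;
}
```

A caller would compare the result of try_bind("192.0.2.1") with
EADDRNOTAVAIL. Whether that bind succeeds or fails on a given kernel is
exactly what this thread is arguing about, so the sketch takes no
position on the outcome.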



Re: TRACED] Re: "Tux" is the wrong logo for Linux

2000-10-20 Thread Ricky Beam

On Thu, 19 Oct 2000, Richard B. Johnson wrote:
>Cary, NC. can't be very large. There are, probably, three persons in

Why "can't" it?  Just because it's in NC and not CA?  Even CA has its
sparse areas (ok, maybe that's "a sparse area" now-a-days.)

FYI, most of Cary is a townhouse/strip mall mecca.  It has always been my
opinion that the Cary city planners have never seen, much less played, SimCity.

>the whole county that have computers. Two haven't been booted since

If that were really true, then the world is in trouble... one of Cisco's
largest offices is here.  Nortel has a large footprint as well.

(You should know better anyway as RedHat's offices are near Cary.)

"We ain't all stewpid."

--Ricky


Re: IDE disk slow? There's help...

2000-10-20 Thread safemode





On Fri, 20 Oct 2000 15:34:04 Dmitry Pogosyan wrote:
> safemode wrote:
> 
> > That's what i was thinking, but 30MB/s seems to be quite an exaggeration.
> > On my
> > Intel Corporation 82371AB PIIX4 IDE (rev 01), ide chipset my master (10.2GB
> > maxtor 7200rpm UDMA66) drive i get ~15-16MB/s and on my slave (same
> > interface, 20.1GB maxtor 7200rpm UDMA66), i get ~13MB/s.  This goes against
> > logic as the bigger the drive the faster the transferrate should be, and
> > it's about half of your estimate of Michael's 40GB.  Is this due to the
> > slow disk access of 2.4.0-test10-preX ? Or am i experiencing a bug here?
> > Both drives are operating at UDMA33 mode
> 
> Isn't this the reason? You are not using UDMA66

Actually, the difference between UDMA33 and UDMA66 mode shows up mostly in
the cache.  I should be getting upwards of 20MB/s with UDMA33.  The
first drive is getting the speeds i would expect, but the second is
drastically slower even though the logic of ide drives dictates it should be
going faster since it is bigger.  At least this is the general pattern you
see with ide drives.

 
> 
> > (according to hdparm)  and both
> > drives are set to using 32bit, dma, 16 sector read ahead and 16 sector
> > multi-access mode.  I've posted results i've gotten from bonnie and
> > bonnie++ before, in all cases, the performance seems to be lacking for the
> > kind of hardware i have.
> 
>  I have 17-18 MB/sec  on my Quantum Fireball 6.4GB (5400 rpm)
> drive attached to UDMA33 and 14-15 MB/sec on another, also 5400 rpm
> (guess Samsung) drive. Both in  -c1 -d1 -m16   mode.
> 
> Ah, sorry, this is with ancient 2.2.5 kernel
> 
> 
> 
> 



Re: IDE disk slow? There's help...

2000-10-20 Thread Jeremy Fitzhardinge


On Fri, Oct 20, 2000 at 03:16:14PM -0400, safemode wrote:
> That's what i was thinking, but 30MB/s seems to be quite an exaggeration.

I reliably get 30MB/s with my IBM 30G 7200rpm ATA66 drive, using a
Via VT82C586 controller.  2.4.0-test9.  Modern drives are really fast.

J

Re: IDE disk slow? There's help...

2000-10-20 Thread Dmitry Pogosyan


safemode wrote:

> That's what i was thinking, but 30MB/s seems to be quite an exaggeration.
> On my
> Intel Corporation 82371AB PIIX4 IDE (rev 01), ide chipset my master (10.2GB
> maxtor 7200rpm UDMA66) drive i get ~15-16MB/s and on my slave (same
> interface, 20.1GB maxtor 7200rpm UDMA66), i get ~13MB/s.  This goes against
> logic as the bigger the drive the faster the transferrate should be, and
> it's about half of your estimate of Michael's 40GB.  Is this due to the
> slow disk access of 2.4.0-test10-preX ? Or am i experiencing a bug here?
> Both drives are operating at UDMA33 mode

Isn't this the reason? You are not using UDMA66


> (according to hdparm)  and both
> drives are set to using 32bit, dma, 16 sector read ahead and 16 sector
> multi-access mode.  I've posted results i've gotten from bonnie and
> bonnie++ before, in all cases, the performance seems to be lacking for the
> kind of hardware i have.

 I have 17-18 MB/sec  on my Quantum Fireball 6.4GB (5400 rpm)
drive attached to UDMA33 and 14-15 MB/sec on another, also 5400 rpm
(guess Samsung) drive. Both in  -c1 -d1 -m16   mode.

Ah, sorry, this is with ancient 2.2.5 kernel



Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Trond Myklebust

> " " == Alexander Viro <[EMAIL PROTECTED]> writes:

 > Again, consider the case when two processes share the
 > mapping. Process A has page faulted in. Page is
 > invalidated. Process B tries to access the same page. If you
 > leave it in page tables of A you _MUST_ leave it in
 > cache. Period. Otherwise A and B will have different instances
 > of the page.

Even so, you want to reread it. When I said automatically msync(), I
meant 'schedule a write of what has changed and then do whatever you
need to do to get the page reread'. (You explicitly asked for
semantics, not an implementation)

 > And rereading the thing might be tolerable _only_ if there is
 > another client that had changed the file.  Even if you msync()
 > everything, you have to deal with plain and boring memory
 > modifications done by a process that did that bloody mmap(). If
 > they happen while you are reading the data from server - too
 > fscking bad, you'd better have a good excuse for destroying
 > the data. write() from another client _is_ a good excuse. But
 > from my reading of fs/nfs/* it looks like we do that (cache
 > invalidation) left, right and center in cases that have nothing
 > to do with other clients.

That's because under NFS you don't have a cache consistency
protocol. Nothing tells you that the file/directory has changed and
that you have to resync your cache. Instead, you have to infer it from
the fact that some operation has returned file attributes that are
screwy.
In addition you may want to force a reread, because some operation
just changed a directory on the server, and you don't know what else
changed beforehand.

 > IOW, I think that invalidate_inode_pages() is bogus. There is
 > only one situation when we have a right to remove page from
 > pagecache - when it is not mapped anywhere.

The issue is not about removing pages. It's about forcing a reread of
the cached data from the server. Removing the actual pages from the
cache has so far been the only race-free method for doing this (since
pre-2.2.x at least) while ensuring that at least generic 'read',
'readdir' and 'write' work as expected.
Yes, it screws up mmap() and should be fixed, but without breaking what
little works, please.

As for simply settling for a self-consistent mmap() rather than
tackling the problem of rereading; the main crime is that you're
rendering file locking unusable.
Locking is the case in which you have to issue a guarantee that the
cache is consistent between client and server within the area covered
by the lock. In all other cases you *could* get away with the partial
cache invalidation implementation by arguing that there are no consistency
guarantees inherent in the protocol.

Cheers,
  Trond

[PATCH] x86 PCI detection and documentation

2000-10-20 Thread Rasmus Andersen

(This is a repost of a mail I sent just before vger went
offline. My apologies to Linus and Martin Mares for mailing
this twice to them, but any comments should be received
by all.)

Hi.

(This has mostly been discussed in an earlier thread. This is a
follow-up with some extra added.)

There currently is a mismatch between the documentation for the
config option CONFIG_PCI_GOBIOS and the code it describes. The
help text states that if 'any' (CONFIG_PCI_GOANY) is picked,
linux will first probe for PCI devices directly, falling back to
asking the BIOS if direct access fails. In reality, the code
tries both (first BIOS then directly) and keeps the direct access
results (if valid) or else uses the BIOS results (if valid).

I discussed this mismatch with Martin Mares at a time when I
thought that the code should be changed to do as the help text
stated (because that made my machine boot :) ), and he wrote:

> Some years ago, the PCI routines have really used this strategy
> (and the obsolete help text reflects this situation), but unfortunately,
> there exist machines where the direct access detection gives bogus
> results, so it's much better to ask the BIOS first. Also, it's conceptually
> cleaner to use a well-defined BIOS interface than to probe random
> ports (well, they are random on all non-PCI machines).
> 
> These are the reasons why I'd prefer keeping the current code and
> just fixing the documentation.

AFAICS the current code does not follow this line of thought
completely, as it still probes the hw directly after asking the
BIOS, even though the BIOS might have returned valid data.

So I propose the following patch (patch 1) that changes the code
to check the BIOS results before probing directly and changes the
documentation to reflect this.

If this is rejected for some reason, I include a patch 2 that
merely changes the Documentation/Configure.help to reflect how
the code works currently.

Please comment.

Patch 1:

--- linux-240-test10-pre4-clean/arch/i386/kernel/pci-pc.c   Thu Oct 19 21:16:44 2000
+++ linux/arch/i386/kernel/pci-pc.c Thu Oct 19 22:44:55 2000
@@ -969,7 +969,7 @@
 	}
 #endif
 #ifdef CONFIG_PCI_DIRECT
-	if (pci_probe & (PCI_PROBE_CONF1 | PCI_PROBE_CONF2))
+	if (!bios && pci_probe & (PCI_PROBE_CONF1 | PCI_PROBE_CONF2))
 		dir = pci_check_direct();
 #endif
 	if (dir)

--- linux-240-test10-pre4-clean/Documentation/Configure.help  Thu Oct 19 21:20:29 2000
+++ linux/Documentation/Configure.help  Thu Oct 19 21:53:27 2000
@@ -2437,8 +2437,8 @@
   With this option, you can specify how Linux should detect the PCI
   devices. If you choose "BIOS", the BIOS will be used, if you choose
   "Direct", the BIOS won't be used, and if you choose "Any", the
-  kernel will try the direct access method and falls back to the BIOS
-  if that doesn't work. If unsure, go with the default, which is
+  kernel will try through the BIOS and fall back to the direct access
+  method if that doesn't work. If unsure, go with the default, which is
   "Any".

 PCI device name database

patch 2:

--- linux-240-test10-pre4-clean/Documentation/Configure.help  Thu Oct 19 21:20:29 2000
+++ linux/Documentation/Configure.help  Thu Oct 19 22:49:44 2000
@@ -2437,9 +2437,9 @@
   With this option, you can specify how Linux should detect the PCI
   devices. If you choose "BIOS", the BIOS will be used, if you choose
   "Direct", the BIOS won't be used, and if you choose "Any", the
-  kernel will try the direct access method and falls back to the BIOS
-  if that doesn't work. If unsure, go with the default, which is
-  "Any".
+  kernel will try asking the BIOS first and then use the direct access 
+  method regardless of the BIOS scan. If unsure, go with the default,
+  which is "Any".

 PCI device name database
 CONFIG_PCI_NAMES

-- 
Regards,
Rasmus([EMAIL PROTECTED])

Outside of the killings, Washington has one of the lowest crime rates in
the country. -Mayor Marion Barry, Washington, DC

- End forwarded message -

-- 
Rasmus([EMAIL PROTECTED])

When C++ is your hammer, everything looks like a thumb.  Steven M. Haflich
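
The difference between the current behaviour and patch 1, as described
in the mail above, can be modelled in a few lines. This is a simplified
sketch: bios_ok and direct_ok stand for "that probe returned valid
results", and the enum and function names are invented for illustration
rather than taken from pci-pc.c.

```c
/* Which access method ends up being used. */
enum pci_method { PCI_NONE, PCI_BIOS, PCI_DIRECT };

/*
 * Behaviour described for the existing code: both probes run, and the
 * direct-access results win whenever they are valid.
 */
enum pci_method pick_current(int bios_ok, int direct_ok)
{
    if (direct_ok)
        return PCI_DIRECT;
    if (bios_ok)
        return PCI_BIOS;
    return PCI_NONE;
}

/*
 * Behaviour proposed by patch 1: the direct probe is only attempted
 * when the BIOS did not return valid results.
 */
enum pci_method pick_patched(int bios_ok, int direct_ok)
{
    if (bios_ok)
        return PCI_BIOS;
    if (direct_ok)
        return PCI_DIRECT;
    return PCI_NONE;
}
```

The two functions only disagree when both probes would succeed, which is
the case the two Configure.help wordings describe differently.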

RE: [ADMIN] some list related topics ..

2000-10-20 Thread Alexander Viro

On Thu, 19 Oct 2000, Marty Fouts wrote:

> > -Original Message-
> > From: Matti Aarnio [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, October 19, 2000 1:26 PM
> > To: [EMAIL PROTECTED]
> > Subject: [ADMIN] some list related topics ..
> 
> [snip]
> 
> > 
> >   3) some ISP systems yield 500 series errors with text:
> > "system is temporarily busy"
> >  or something of that effect.  Now THAT is really offensive
> >  stupidity by the ISP software folks...
> > 
> 
>   There is nothing in the SMTP RFCs that require any to be able to
> accept all email at all times.  SMTP is *not* designed to be a reliable
> delivery mechanism, let alone a first-time reliable delivery mechanism.
> Refusal to accept email because the receiving system is under high load is
> well understood, commonly accepted, and even codified in implementation
> practice.

RTFRFC. 821, Appendix E, that is. Quote:

 There are five values for the first digit of the reply code:
...
4yz   Transient Negative Completion reply

   The command was not accepted and the requested action did
   not occur.  However, the error condition is temporary and
   the action may be requested again.  ...
...
5yz   Permanent Negative Completion reply

   The command was not accepted and the requested action did
   not occur.  The sender-SMTP is discouraged from repeating
   the exact request (in the same sequence).  ...

> In my opinion, you are doing a GoodThing(tm) by trying to weed broken
> addresses from the mailing list. But please don't demand from the internet
> behavior it wasn't designed to provide.

Like following the RFCs with Status: STANDARD? Last time I checked,
RFC821 was STD0010 and other parts of said beast (1869, 1870) follow the
rules set in 821,App.E when they describe the possible error codes and
their interpretation.
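
The first-digit rule quoted above is easy to express in code. The sketch
below shows how a sender such as a list manager might act on it:
smtp_classify() and smtp_should_retry() are hypothetical names, and the
retry policy is an assumption drawn from the quoted text, not from any
particular mailer.

```c
/* Broad meaning of an SMTP reply code, taken from its first digit. */
enum smtp_class {
    SMTP_POSITIVE,           /* 1yz, 2yz, 3yz */
    SMTP_TRANSIENT_FAILURE,  /* 4yz: temporary, may be requested again */
    SMTP_PERMANENT_FAILURE,  /* 5yz: do not repeat the same request */
    SMTP_INVALID             /* not a three-digit reply code in 100..599 */
};

enum smtp_class smtp_classify(int code)
{
    if (code < 100 || code > 599)
        return SMTP_INVALID;
    switch (code / 100) {
    case 4:
        return SMTP_TRANSIENT_FAILURE;
    case 5:
        return SMTP_PERMANENT_FAILURE;
    default:
        return SMTP_POSITIVE;
    }
}

/* A sender should queue and retry only on a transient failure. */
int smtp_should_retry(int code)
{
    return smtp_classify(code) == SMTP_TRANSIENT_FAILURE;
}
```

Under this reading, a server that answers "system is temporarily busy"
with a 5yz code is telling the sender not to retry, which is the
complaint in the original [ADMIN] mail.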


Re: IDE disk slow? There's help...

2000-10-20 Thread safemode

That's what i was thinking, but 30MB/s seems to be quite an exaggeration.
On my
Intel Corporation 82371AB PIIX4 IDE (rev 01), ide chipset my master (10.2GB
maxtor 7200rpm UDMA66) drive i get ~15-16MB/s and on my slave (same
interface, 20.1GB maxtor 7200rpm UDMA66), i get ~13MB/s.  This goes against
logic as the bigger the drive the faster the transferrate should be, and
it's about half of your estimate of Michael's 40GB.  Is this due to the
slow disk access of 2.4.0-test10-preX ? Or am i experiencing a bug here? 
Both drives are operating at UDMA33 mode (according to hdparm)  and both
drives are set to using 32bit, dma, 16 sector read ahead and 16 sector
multi-access mode.  I've posted results i've gotten from bonnie and
bonnie++ before, in all cases, the performance seems to be lacking for the
kind of hardware i have.   

On Fri, 20 Oct 2000 14:58:41 Andre Hedrick wrote:
> 
> Michael,
> 
> Whatever card you are using, if you are getting that low I need to know
> more info.  That drive should cook at 30MB/sec.


Re: IDE disk slow? There's help...

2000-10-20 Thread Andre Hedrick



Michael,

Whatever card you are using, if you are getting that low I need to know
more info.  That drive should cook at 30MB/sec.

On Fri, 20 Oct 2000, Michael Kwasigroch wrote:

> Hi,
> 
> I recently bought a 40Gig IBM ATA100 disk as a replacement for a dying 4G
> SCSI disk. I knew I was risking some trouble because I have an about 4 year
> old triton 2 board (Intel 430HX) and I didn't want to risk more trouble
> (and spend more money) by using a proprietary PCI IDE controller board. But
> the disk was dead cheap and really big so I bought it and connected it as
> the primary master on the onboard controller and...
> 
> Linux (stock 2.2.17) could only push about 2.6 MB/s "through" it (hdparm -Tt
> /dev/hda)... :-(
> 
> The scsi disks can do about 5.5 - 6.1 MB/s (8Bit fast SCSI, no ultra,
> adaptec 2940 PCI).
> 
> So I tried to enable IDE DMA, 16 bit data transfers, no use. That was quite
> disappointing but I gave up until yesterday when I (again) searched
> 
>     http://www.linux-ide.org
> 
> I got the latest 2.2.17 ide-patch, made a new kernel and voila:
> 
> My new IDE disk now "flies" at about 9.2 MB/s and really outperforms the
> scsi disks!!!
> 
> ABOUT A 3.5x PERFORMANCE GAIN! FOR FREE!!! Unbelievable, but the truth with
> free software...
> 
> 
> One thing I don't understand: Why is this patch not in the stock kernel? It
> should (positively) affect lots of people, or am I missing something?
> 
> 
> P.S.: Please email me directly, I'm not subscribed to any Linux list.
> 
> PPS: Beware 33+ Gig IDE disks if you have an Award 4.51 BIOS and want to
> boot from it.
>  You will **NOT** be able to boot from disks >33G due to a BIOS bug.
>  See
>  http://www.storage.ibm.com/techsup/hddtech/bios338gb.htm
>  and
>  http://www.storage.ibm.com/techsup/hddtech/hddfaqs.htm
>  for details.
> 
> 
> Enjoy.
> 
> 
> Mit freundlichen Gruessen / best regards
> 
> Michael Kwasigroch
> FaxPlus/Open Development
> 
> 
> eMail: [EMAIL PROTECTED]
> 
> INTERCOPE GmbH
> 

Andre Hedrick
The Linux ATA/IDE guy


Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Jeff Garzik

Andrea Arcangeli wrote:
> 
> On Fri, Oct 20, 2000 at 02:19:59PM -0400, Jeff Garzik wrote:
> > Why? [..]
> 
> vma information isn't passed from v4l layer to lowlevel layer.

so I see :(

The Matrox Meteor II driver I'm developing uses DMA memory, PCI shared
memory, -or- reserve_bootmem memory in mmap(2), depending on card and
system capabilities.  It sure would be nice to have the vma there,
especially now that I know about using nopage() for mmap, but it can be
worked around...

-- 
Jeff Garzik| The difference between laziness and
Building 1024  | prioritization is the end result.
MandrakeSoft   |

Re: netlink_dev not compile in test9 due to bug in net/Makefile

2000-10-20 Thread David S. Miller



Your patch cannot be correct, it is impossible for CONFIG_NETLINK_DEV
to be set without CONFIG_NETLINK also being set.  Therefore your patch
should make no difference when using a correct kernel configuration.

I think you have a bogus kernel configuration or a mis-patched tree.

Later,
David S. Miller
[EMAIL PROTECTED]

Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Andrea Arcangeli


On Fri, Oct 20, 2000 at 02:19:59PM -0400, Jeff Garzik wrote:
> Why? [..]

vma information isn't passed from v4l layer to lowlevel layer.

Andrea

Re: Any dual AGP slot motherboards?

2000-10-20 Thread H. Peter Anvin


Followup to:  <[EMAIL PROTECTED]>
By author:    "Gary E. Miller" <[EMAIL PROTECTED]>
In newsgroup: linux.dev.kernel
>
> Yo James!
> 
> On Fri, 20 Oct 2000, James Simmons wrote:
> 
> > After much searching I couldn't find one. It was one of those mac rumors
> > people spread around. I'd still like to get more than one AGP going. If I
> > have multiple PCI buses, in theory I should be able to have one AGP port on
> > each PCI bus. Right? 
> 
> AGP is much faster than PCI bus and has nothing to do with the 
> PCI bus.

Well, it borrows a *lot* from the PCI bus in its design.

> So the number of multiple PCI buses has nothing to
> do with the number of AGP buses.
> 
> The way to get multiple PCI buses is to bridge one PCI bus on
> to another.  There are no changes required to the core chipset.
> There is no way (yet) to bridge one AGP bus on another.
> 

This isn't necessarily true.  It's quite common to have multiple PCI
busses connected to the *HOST* bus.

AGP isn't a bus, it's a port.  You won't be able to bridge them, but
it's perfectly feasible for the chipset to provide more than one AGP
port.
-- 
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt

Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Alexander Viro

On Fri, 20 Oct 2000, Trond Myklebust wrote:

> For the general case of the page cache I think we can keep them quite
> simple:
> 
> + We do in any case want to drop all pages that are unreferenced. (The
> reason for flushing may be that the file size has changed.)
> 
> + For pages that are referenced (and unlocked) we would like to force
> them to get read in anew ASAP. How this is done in practice is
> irrelevant as far as NFS is concerned provided that we don't sleep on
> any I/O while in nfs_zap_caches()/invalidate_inode_pages().
> 
> The lower level stuff can and will sort out the business of flushing
> out pending writebacks that conflict with the read, so that isn't a
> problem for the VFS/VM.
> 
> The problem lies with writes that haven't yet been msync()ed (and
> hence do not have writebacks). For shared mappings, one should perhaps
> schedule an automatic msync() of the dirty pages (???). For private
> mappings, perhaps the best thing would be to defer the read?

Again, consider the case when two processes share the mapping. Process A
has page faulted in. Page is invalidated. Process B tries to access the
same page. If you leave it in page tables of A you _MUST_ leave it in
cache. Period. Otherwise A and B will have different instances of the
page.

It's not about writebacks. If you map something with MAP_SHARED and
fork() afterwards you _MUST_ have the same data at the address returned by
mmap() until one of the processes unmaps the thing.

And rereading the thing might be tolerable _only_ if there is another
client that had changed the file.  Even if you msync() everything, you
have to deal with plain and boring memory modifications done by a process
that did that bloody mmap(). If they happen while you are reading the data
from server - too fscking bad, you'd better have a good excuse for
destroying the data. write() from another client _is_ a good
excuse. But from my reading of fs/nfs/* it looks like we do that (cache
invalidation) left, right and center in cases that have nothing to do with
other clients.

IOW, I think that invalidate_inode_pages() is bogus. There is only one
situation when we have a right to remove page from pagecache - when it is
not mapped anywhere.
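
The MAP_SHARED-then-fork() guarantee described above can be seen from
userspace with a short test. This is a sketch on a local temporary file,
not on NFS, so it shows the semantics being defended rather than the
bug: shared_after_fork() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one page of a temporary file MAP_SHARED, fork, let the child store
 * a byte through the mapping, and return what the parent then reads at
 * the same address. Returns -1 if any setup step fails.
 */
int shared_after_fork(void)
{
    FILE *f = tmpfile();
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    pid_t pid;
    int value = -1;

    if (f == NULL)
        return -1;
    if (page <= 0 || ftruncate(fileno(f), (off_t)page) != 0) {
        fclose(f);
        return -1;
    }
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED,
             fileno(f), 0);
    if (p == MAP_FAILED) {
        fclose(f);
        return -1;
    }

    p[0] = 'A';
    pid = fork();
    if (pid == 0) {
        p[0] = 'B';         /* child stores through the inherited mapping */
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid)
        value = p[0];       /* parent reads the same page */

    munmap(p, (size_t)page);
    fclose(f);
    return value;
}
```

On a working system the parent reads back the byte the child stored. If
the page were dropped from the cache while still sitting in one process's
page tables, parent and child could end up with different instances of
it, which is the failure mode the mail warns about.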


Re: [ADMIN] some list related topics ..

2000-10-20 Thread David S. Miller

   From: Marty Fouts <[EMAIL PROTECTED]>
   Date:Thu, 19 Oct 2000 19:30:33 -0700

   > -Original Message-
   > From: Matti Aarnio [mailto:[EMAIL PROTECTED]]
   > Sent: Thursday, October 19, 2000 1:26 PM
   > To: [EMAIL PROTECTED]
   > Subject: [ADMIN] some list related topics ..

   [snip]

   >   3) some ISP systems yield 500 series errors with text:
   >"system is temporarily busy"
   >  or something of that effect.  Now THAT is really offensive
   >  stupidity by the ISP software folks...

 There is nothing in the SMTP RFCs that require any to be able to
   accept all email at all times.

The system's inability to handle the email at this moment is
not what Matti is complaining about.

You will find in RFC0821, section 4.2.1, what return code "500" is
designated to mean.

Later,
David S. Miller
[EMAIL PROTECTED]


Re: [ADMIN] some list related topics ..

2000-10-20 Thread Matti Aarnio


On Thu, Oct 19, 2000 at 07:30:33PM -0700, Marty Fouts wrote:
> [snip]
> > 
> >   3) some ISP systems yield 500 series errors with text:
> > "system is temporarily busy"
> >  or something of that effect.  Now THAT is really offensive
> >  stupidity by the ISP software folks...
> 
>   There is nothing in the SMTP RFCs that require any to be able to
> accept all email at all times.  SMTP is *not* designed to be a reliable
> delivery mechanism, let alone a first-time reliable delivery mechanism.
> Refusal to accept email because the receiving system is under high load is
> well understood, commonly accepted, and even codified in implementation
> practice.

Yes, but the Specified Way to handle temporary problems is to
yield 400-series codes, or not to answer the SMTP connection
at all.

Yielding 500 series replies means "this address is now and forever
invalid, don't try this again."

> In my opinion, you are doing a GoodThing(tm) by trying to weed broken
> addresses from the mailing list. But please don't demand from the internet
> behavior it wasn't designed to provide.

Seeing bad behaviour at 2 ISPs doesn't count to me as something
which the majority of implementations are supposed to do.

> Marty

/Matti Aarnio

Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Alexander Viro

On 20 Oct 2000, Trond Myklebust wrote:

> > " " == Russell King <[EMAIL PROTECTED]> writes:
> 
>  > invalidate_inode_pages nfs_zap_caches nfs_lock fcntl_setlk
>  > do_fcntl sys_fcntl
> 
>  > So I guess that NFS locking is really bad if the region is
>  > mmapped!
> 
> Yep, but that's a symptom, not a cause. We want to be able to run
> invalidate_inode_pages() safely at any moment, since the need can be
> triggered externally (because the server and client page caches
> disagree).

So what exactly do you want it to do when page is mapped by user process?
Should it remain visible or not? What should happen if process writes to
that page?

Trond, I'm not asking about implementation - the question being what
semantics do you want for nfs_zap_caches() wrt user-mapped pages.


Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Linus Torvalds

On Fri, 20 Oct 2000, Jeff Garzik wrote:
> 
> If I understand your patch, I should call vma_reserve(), and then
> completely remove my no-op swapout().  Correct?

Note that I dislike "wrapper.h", and I just removed that part.

I don't think it's any clearer to write "vma_reserve(vma)" than it is to
just say "vma->vm_flags |= VM_RESERVED". 

But yes, add that line and remove the swapout, and you should be golden -
no unnecessary faults (well, it won't pre-fault, of course) and no trouble
with calculating locked pages.

Linus


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Alexander Viro

On 20 Oct 2000, Trond Myklebust wrote:

> Under NFS the problem is that pages can (and *should*) be invalidated
> despite there being pending write backs. The server can trigger the
> need for a cache invalidation at any time.

OK, so what should happen if user does mmap() on NFS file, dirties the
page and server tells that page should be invalidated? Should we still see
that page in process' address space? Silently making it anonymous is the
worst thing possible - it's still accessible through process page tables
and you (a) still see the (allegedly) invalidated data and (b) have no way
in hell to get the cache coherency between processes. I.e. you are in for
situations when we do mmap(), fork(), write data in child and attempt to
read from that address in parent and child yield different
results. Moreover, future attempts to modify the data by either of them
will be invisible in another.

IOW, what do you really want from the invalidation? It looks like the
right thing _might_ be "remove from all page tables", but that can become
really interesting with mlock() and friends.


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Alexander Viro

On Fri, 20 Oct 2000, Roger Larsson wrote:

> Is it legal/good practice to unmap the file after closing it?
Yes.

> (Since the sharing needs the fd to mmap it)

It doesn't. Mapping needs struct file * and it doesn't care about
fd. mmap() takes a reference to the struct file behind the fd you've
passed and after that we can forget about descriptors - the
vm_area_struct holds a reference to the file and that's it.
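
A small user-space check of that claim, assuming a writable /tmp; the
function name and the temp-file template are made up for the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the mapping is still readable after close(fd), 0 if the
 * contents differ, -1 if setup fails. */
static int mapping_survives_close(void)
{
    char path[] = "/tmp/mmap-close-XXXXXX";
    const char msg[] = "still here";
    char *map;
    int fd, ok;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, msg, sizeof msg) != (ssize_t)sizeof msg) {
        close(fd);
        unlink(path);
        return -1;
    }
    map = mmap(NULL, sizeof msg, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* descriptor is gone from here on */
    if (map == MAP_FAILED) {
        unlink(path);
        return -1;
    }
    ok = memcmp(map, msg, sizeof msg) == 0;   /* mapping still works */
    munmap(map, sizeof msg);
    unlink(path);
    return ok;
}
```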

> Successful unlinking a file should probably free pages directly to
> free list - might be worth optimizing for.

_Definitely_ no. Hell, unlink() doesn't mean that anything happens with
data - unlink an opened file and if that affects the read/write/lseek
you've found a bug. And mapping is equivalent to having opened descriptors
in that respect.
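
The same thing as a user-space sketch for the descriptor case, assuming
a writable /tmp (function name and file template are invented here):
after unlink() the name is gone, but lseek() and read() on the open
descriptor behave as before.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if the data can still be read back through the open
 * descriptor after unlink(), 0 if not, -1 if setup fails. */
static int data_survives_unlink(void)
{
    char path[] = "/tmp/unlink-demo-XXXXXX";
    const char msg[] = "unlinked but open";
    char buf[sizeof msg];
    ssize_t n;
    int fd;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, msg, sizeof msg) != (ssize_t)sizeof msg) {
        close(fd);
        unlink(path);
        return -1;
    }
    if (unlink(path) != 0) {    /* the name goes away, the data does not */
        close(fd);
        return -1;
    }
    if (lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }
    n = read(fd, buf, sizeof buf);
    close(fd);
    return n == (ssize_t)sizeof buf && memcmp(buf, msg, sizeof msg) == 0;
}
```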


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Alexander Viro




On Fri, 20 Oct 2000, Trond Myklebust wrote:

> > " " == David S Miller <[EMAIL PROTECTED]> writes:
> 
>  > Actually, judging by the trace you provided Russell, I'd say
>  > this is some peculiarity with NFS silly rename handling, and
>  > it'd be best to look for the bug in that code (early inode
>  > reference loss, for example?)
> 
> Russell's trace indicates that the unlink actually has completed and
> has become a negative dentry since the file is labelled '(deleted)'.

No, it doesn't. Proof:

exec 3>/tmp/foo
rm /tmp/foo
ls -l /proc/$$/fd/3

and dentry is _not_ negative.

> That means that the dentry count must have been zero, so that
> dentry_iput() was called.

It doesn't and it wasn't.

> I don't see how dentry_iput() can be called on an open file. In
> principle the dentry count should always be >= 1, so unless there is
> some place where we're calling d_delete() without get()ing the dentry
> first, there should be no path for early inode loss.

d_delete() will not make dentry negative if there are other
users. However, it will make it unhashed in such case, ergo
(deleted) thing.
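
A toy model of that rule, in plain C. This is not the dcache code: the
struct, its fields and toy_d_delete() are simplified stand-ins that only
encode the two cases described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
    int ino;
};

struct toy_dentry {
    int d_count;                /* users holding this dentry */
    bool hashed;                /* still reachable by name lookup? */
    struct toy_inode *d_inode;  /* NULL means the dentry is negative */
};

static void toy_d_delete(struct toy_dentry *d)
{
    if (d->d_count == 1) {
        /* Caller is the only user: drop the inode, dentry goes negative. */
        d->d_inode = NULL;
    } else {
        /* Other users (e.g. an open file): keep the inode and merely
         * unhash the dentry, which is what shows up as "(deleted)". */
        d->hashed = false;
    }
}
```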


Re: oopses in test10-pre4 (was Re: [RFC] atomic pte updates and pae changes, take 3)

2000-10-20 Thread Ben LaHaise


On Thu, 19 Oct 2000, Linus Torvalds wrote:


> I think you overlooked the fact that SHM mappings use the page cache, and
> it's ok if such pages are dirty and writable - they will get written out
> by the shm_swap() logic once there are no mappings active any more.
> 
> I like the test per se, because I think it's correct for the "normal"
> case of a private page, but I really think those two BUG()'s are not bugs
> at all in general, and we should just remove the two tests.
> 
> Comments? Anything I've overlooked?

The primary reason I added the BUG was that if this is valid, it means
that the pte has to be removed from the page tables first with
pte_get_and_clear since it can be modified by the other CPU.  Although
this may be safe for shm, I think it's very ugly and inconsistent.  I'd
rather make the code transfer the dirty bit to the page struct so that we
*know* there is no information loss.

If the above is correct, then the following patch should do (untested).
Oh, I think I missed adding pte_same in the generic pgtable.h macros, too.
I'm willing to take a closer look if you think it's needed.

-ben

diff -urN v2.4.0-test10-pre4/include/asm-generic/pgtable.h work-foo/include/asm-generic/pgtable.h
--- v2.4.0-test10-pre4/include/asm-generic/pgtable.h    Fri Oct 20 00:58:03 2000
+++ work-foo/include/asm-generic/pgtable.h      Fri Oct 20 01:42:24 2000
@@ -38,4 +38,6 @@
        set_pte(ptep, pte_mkdirty(old_pte));
 }
 
+#define pte_same(left,right)   (pte_val(left) == pte_val(right))
+
 #endif /* _ASM_GENERIC_PGTABLE_H */
diff -urN v2.4.0-test10-pre4/mm/vmscan.c work-foo/mm/vmscan.c
--- v2.4.0-test10-pre4/mm/vmscan.c      Fri Oct 20 00:58:04 2000
+++ work-foo/mm/vmscan.c        Fri Oct 20 01:43:54 2000
@@ -87,6 +87,13 @@
        if (TryLockPage(page))
                goto out_failed;
 
+       /* From this point on, the odds are that we're going to
+        * nuke this pte, so read and clear the pte.  This hook
+        * is needed on CPUs which update the accessed and dirty
+        * bits in hardware.
+        */
+       pte = ptep_get_and_clear(page_table);
+
        /*
         * Is the page already in the swap cache? If so, then
         * we can just drop our reference to it without doing
@@ -98,10 +105,6 @@
        if (PageSwapCache(page)) {
                entry.val = page->index;
                swap_duplicate(entry);
-               if (pte_dirty(pte))
-                       BUG();
-               if (pte_write(pte))
-                       BUG();
                set_pte(page_table, swp_entry_to_pte(entry));
 drop_pte:
                UnlockPage(page);
@@ -111,13 +114,6 @@
                page_cache_release(page);
                goto out_failed;
        }
-
-       /* From this point on, the odds are that we're going to
-        * nuke this pte, so read and clear the pte.  This hook
-        * is needed on CPUs which update the accessed and dirty
-        * bits in hardware.
-        */
-       pte = ptep_get_and_clear(page_table);
 
        /*
         * Is it a clean page? Then it must be recoverable
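
The idea in the message above (fetch and clear the entry in one atomic
step, then carry the dirty bit over to the page so nothing is lost) can
be modelled in user space with C11 atomics. This is only a sketch: the
bit layout, struct toy_page and the toy_* functions are invented, and
atomic_exchange() stands in for whatever the architecture's
ptep_get_and_clear() really does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_PTE_PRESENT 0x1UL   /* hypothetical bit layout */
#define TOY_PTE_DIRTY   0x2UL

struct toy_page {
    bool dirty;
};

/* Fetch the old entry and leave the slot empty in one atomic step, so a
 * dirty bit set by "hardware" in between cannot be missed. */
static unsigned long toy_ptep_get_and_clear(_Atomic unsigned long *ptep)
{
    return atomic_exchange(ptep, 0UL);
}

static void toy_unmap(_Atomic unsigned long *ptep, struct toy_page *page)
{
    unsigned long pte = toy_ptep_get_and_clear(ptep);

    if (pte & TOY_PTE_DIRTY)
        page->dirty = true;     /* transfer the dirty bit to the page */
}
```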


2.4.0-test10-pre4 oops

2000-10-20 Thread Henrik Størner


My 2.4.0-test10-pre4 box died overnight, apparently around 4 am when the
nightly cron-jobs run. The last thing logged looks interesting:

kernel BUG at vmscan.c:102!
invalid operand: 
CPU:0
EIP:0010:[try_to_swap_out+252/796]
EFLAGS: 00010286
eax: 001c   ebx: 0100   ecx:    edx: 
esi: c11c20e4   edi: 0001   ebp: 069e5045   esp: c12edebc
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 2, stackpage=c12ed000)
Stack: c024abca c024ad89 0066 080ac000 c645b140 080ec000 080a9000 0200
   c012719e c645b140 c6a16820 080ab000 c4ce02ac 0004 080a9000 c6a16820
   c645b140 0004 c4ce02ac 084a9000 c4ce1080 080ec000 080ec000 c4ce1080
Call Trace: [tvecs+6622/95392] [tvecs+7069/95392] [swap_out_vma+322/440] 
[swap_out_mm+56/100] [swap_out+283/368] [refill_inactive+213/376] 
[do_try_to_free_pages+98/128]
   [tvecs+7429/95392] [kswapd+139/348] [empty_bad_page+0/4096] 
[kernel_thread+35/48]
Code: 0f 0b 83 c4 0c f7 c5 02 00 00 00 74 17 6a 68 68 89 ad 24 c0

System is using reiserfs, but otherwise it's a pretty stock Red Hat
6.2.  Hardware is a PII/350, ncr53c875 scsi controller, 128 MB RAM.
Two network cards: RealTek 8139 and PCI NE2000.

-- 
Henrik Storner  | "Crackers thrive on code secrecy. Cockroaches breed 
<[EMAIL PROTECTED]> |  in the dark. It's time to let the sunlight in."
|  
|  Eric S. Raymond, re. the Frontpage backdoor

question wrt context switching during disk i/o

2000-10-20 Thread Mike Galbraith


Hi,

This is something that has been bugging me for a while.  I notice
on my system that during disk write we do much context switching,
but not during disk read.  Why is that?

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  1  1      0   1940    588 113672   0   0     0  3526  216  3707   0  14  86
 0  1  1      0   1940    588 113672   0   0     0  3567  218  3722   0  15  85
 0  1  1      0   1940    588 113672   0   0     0  3567  216  3711   0  18  82
 0  1  1      0   1940    588 113672   0   0     0  3412  213  3712   0  15  85
 1  1  1      0   1940    588 113672   0   0     0  3628  218  3669   0  13  87
 0  1  1      0   1460    588 114152   0   0     0  3505  215  3744   0  17  83
 1  0  1      0   1460    588 114148   0   0  2417   310  262   680   0   6  94
 1  0  1      0   1460    588 114148   0   0  3520    31  324   620   0  16  84
 2  0  1      0   1460    588 114148   0   0  3488    31  321   623   0  10  90
 1  0  1      0   1460    588 114148   0   0  3168   310  307   590   0  18  82
 1  0  1      0   1460    588 114148   0   0  3584     0  327   626   0  12  88
 1  0  1      0   1460    588 114148   0   0  3520    62  322   612   0  12  88
 1  0  1      0   1460    588 114148   0   0  3488    31  320   675   0  15  85

-Mike


Re: oops with dd with test10-pre4

2000-10-20 Thread davej


[EMAIL PROTECTED] wrote...

> kernel BUG at vmscan.c:102!

I hit the same bug() an hour ago.
It was preceded by a complete system lock up (not even sysrq worked)
and then fsck oopsed after triggering the above bug.

The lockup happened whilst I was diffing two kernel trees.
Seems large amounts of disk IO trigger this bug() easily.

regards,

d.

-- 
| Dave Jones <[EMAIL PROTECTED]>  http://www.suse.de/~davej
| SuSE Labs


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Linus Torvalds

[ Final comment, and then I'll shut up ]

On Thu, 19 Oct 2000, Linus Torvalds wrote:
> 
> You'd have to do something like
> 
>   LockPage(page);               /* Nobody gets to write to this page (except through mmaps, ugh) */
>   gather_all_mmap_users(page);  /* THIS is the nasty one */
>   nfs_wb_page(page);            /* force write-back on this page */
>   ClearPageUptodate(page);      /* mark it not up-to-date to force a read-in next time */
>   UnlockPage(page);             /* Ok, now the client can go wild */

Note that one approach would be to just make invalidate_inode_pages() do
exactly the above.

Now, the gather_all_mmap_users() part is definitely a post-2.4.x thing,
and we can't do nfs_wb_page() in invalidate_inode_pages() either, but what
we CAN do is to clear the Uptodate flag (and we do need the page lock to
do so).

The advantage of clearing the uptodate flag (as opposed to doing what we
do now - dropping the page altogether) is that there would be no cache
aliasing issues, and there would be no issues with a page and its
associated data just "disappearing" from under somebody. It would cause
the page to be read in again the next time it is faulted in or somebody
does a read() on it, and that's exactly what we want.

However, we _do_ need to WB the page data some way - but the decision on
whether to invalidate write-backs or finish them would have to be up to
the low-level filesystem.

(We should _not_ allow people to read in the page, and leave some stale
write-back information there, so that we'd write back stuff that we just
read because somebody else noticed that the page was not up-to-date. That
would be just an endless source of (a) confusion and (b) unnecessary
network traffic.)

We could, of course, have some combination of the two: if there is only
one user (us), we can just drop the page, otherwise we can mark it
non-up-to-date to force future readers to re-validate the dang thing.
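
That combined policy, as a user-space toy model. Nothing here is real
page-cache code: struct toy_page, its fields and toy_invalidate() are
invented for the sketch, and locking and write-back are left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_page {
    int  count;     /* references held, including the cache's own */
    bool cached;    /* still present in the page cache? */
    bool uptodate;  /* contents known to match the server? */
};

static void toy_invalidate(struct toy_page *page)
{
    if (page->count == 1) {
        /* Only the cache holds it: nobody can notice, so just drop it. */
        page->cached = false;
    } else {
        /* Somebody has it mapped or otherwise in use: keep the one and
         * only copy, but force the next reader to re-validate it. */
        page->uptodate = false;
    }
}
```

Either way there is never a second, stale copy of the page left behind
in somebody's address space.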

Linus


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Linus Torvalds

On Thu, 19 Oct 2000, Linus Torvalds wrote:
> 
> I'm saying that we're much much better off guaranteeing local consistency
> over knowingly breaking local consistency over an uncertain global
> consistency issue. Especially as NFS has never guaranteed global
> consistency in the first place, and was not designed to do that.

Btw, if you really worry about NFS v3 and want to add consistency
guarantees to NFS v3, you can probably do that. It's just that you cannot
do it with a simple cache invalidation - you'd need to be much much more
careful (the same way a simple CPU cache invalidate does not guarantee SMP
cache consistency - it only guarantees data corruption due to missed
writes).

I don't know what the best way to do a true cache consistency protocol on
a page level would be, especially as you will have a _really_ hard time
getting hold of an exclusive pointer to a page with multiple mmap's, but
you could probably get something that guarantees cache consistency as long
as people do _not_ expect mmap to always be 100% consistent.

(The reason you cannot get mmap consistency is just that mmap doesn't have
any kernel synchronization point unless you're willing to shoot down every
single mapping - which is expensive as hell, but doable).

You'd have to do something like

	LockPage(page);               /* Nobody gets to write to this page (except through mmaps, ugh) */
	gather_all_mmap_users(page);  /* THIS is the nasty one */
	nfs_wb_page(page);            /* force write-back on this page */
	ClearPageUptodate(page);      /* mark it not up-to-date to force a read-in next time */
	UnlockPage(page);             /* Ok, now the client can go wild */

where everything but the "gather_all_mmap_users()" part is fairly
straightforward.  The "gather" phase is nasty - it would need to figure
out every place the page is mapped, make sure those are synchronized (ie
something like marking the page table entry write-protected and causing a
TLB invalidate SMP cross-call - at which point the resulting page fault
and the page lock will catch anybody who tries to write to the page).

In no case could you do something like what the current
invalidate_inode_pages() does, which is to just try to drop the page from
the cache - that really only works if we're the only user of that page,
which the "page_count != 1" test now enforces.

Linus


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Trond Myklebust

> " " == Linus Torvalds <[EMAIL PROTECTED]> writes:

 > which is really really bad, because now you have the case that
 > you have 'n' copies of the same page in memory, with 'n' users,
 > out of which 'n-1' users have the wrong page. And those 'n-1'
 > users don't even have any way of _knowing_ that they have the
 > wrong page.

 > Which is why we MUST NOT drop a page that has users. Really.

 > I'm telling you that cases #4 and #5 are _much_ worse than your
 > "solution" to case #2. And you argue that your solution is good
 > only because you're completely ignoring cases #4 and #5.

No. I'm arguing (at 4:40am and while trying to keep one eye on our
detector's data acquisition) on the basis that whoever holds the file
lock has to have a guarantee of obtaining 100% accuracy on the locked
region.

I agree that dropping pages is ugly and that it will always give
problems with shared mmap(), so if it can be shown that clearing
PG_uptodate and rereading the same page will give the required
guarantee on locking, then I'm not going to complain.

Cheers,
  Trond

Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Linus Torvalds

On 20 Oct 2000, Trond Myklebust wrote:
> 
> The problem here is that NFS pages have 3 rather than 2 states:
>   1) mmapped & correct.
>   2) mmapped & incorrect. (but possibly dirty)
>   3) Unmapped
> 
> For case 1), we clearly want to have the page in inode->i_mapping.
> For cases 2) & 3) we don't.

I think you're WRONG, WRONG, WRONG.

Your case #2 is not at all a state.

It's "mapped, possibly dirty, and locally correct. But _maybe_ globally
incorrect".

If you remove the page in case #2 from the hashing, you will have case #4,
which you're ignoring:

   4) Totally, and utterly incorrect. Without any way for the application
  to even _know_ that it's incorrect.

Note that case 4 is accompanied by case #5, later on:

   5) two separate pages both mapped, one hashed and up-to-date, the other
  mapped in other processes but incorrect.

which is really really bad, because now you have the case that you have
'n' copies of the same page in memory, with 'n' users, out of which 'n-1'
users have the wrong page. And those 'n-1' users don't even have any way
of _knowing_ that they have the wrong page.

Which is why we MUST NOT drop a page that has users. Really.

I'm telling you that cases #4 and #5 are _much_ worse than your "solution"
to case #2. And you argue that your solution is good only because you're
completely ignoring cases #4 and #5.

Linus


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Linus Torvalds

On 20 Oct 2000, Trond Myklebust wrote:
> 
> Under NFS the problem is that pages can (and *should*) be invalidated
> despite there being pending write backs. The server can trigger the
> need for a cache invalidation at any time.
> The existence of file locks that aren't page aligned, as well as
> partial page writeback ensure that we cannot make the equivalence
>     page has pending write == page is correct.

Note that this is not what my patch really says at all.

My patch says "ok, we should invalidate this page, because somebody thinks
it may be bad. HOWEVER, we cannot reasonably do that, because we have
users of the page (which may be completely unrelated to pending writes),
and invalidating this page is _certain_ to cause corruption".

Basically think of the case of somebody having a shared mmap over NFS. He
has dirty data in his pages, and he expects those to be written out to the
server. There is NO QUESTION about this fact.

Imagine somebody (possibly even the same person) then releasing or getting
a file lock, possibly in some other part of the file. 

Without that added test, all those dirty local pages just got completely
thrown away. 

Now, you have a choice: you can KNOW that you're doing something horribly
and utterly wrong (throwing away data), or you can SUSPECT that you might
cause non-coherency between the client and the server. You cannot get
both.

I'm saying that we're much much better off guaranteeing local consistency
over knowingly breaking local consistency over an uncertain global
consistency issue. Especially as NFS has never guaranteed global
consistency in the first place, and was not designed to do that.

Linus


Re: about time-slice

2000-10-20 Thread George Anzinger

Dan Maas wrote:
> 
> > I have a question about the time-slice of linux, how do I know it, or how
> > can I test it?
> 
> First look for the (platform-specific) definition of HZ in
> include/asm/param.h. This is how many timer interrupts you get per second (eg
> on i386 it's 100). Then look at include/linux/sched.h for the definition of
> DEF_COUNTER. This is the number of timer interrupts between mandatory
> schedules. By default it's HZ/10, meaning that the time-slice is 100ms (10
> schedules/sec). (of course the interval could be longer if kernel code is
> hogging the CPU; the scheduler won't run until the process leaves the kernel
> or sleeps explicitly...)
> 
> Experts, please correct me if I'm wrong.

Not really an expert, but...  

In the 2.4.0... version the slice time is derived from the task's NICE
value.  Also, in the new system the call sys_sched_rr_get_interval()
returns the value.  (In older systems NICE was also involved in a "not
so clear" way and the call returned nonsense.)  
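
From user space that call is reached through the POSIX function
sched_rr_get_interval(). A minimal sketch; the wrapper name is made up,
and what the kernel reports for a task that is not SCHED_RR depends on
the kernel version, so treat the number as informational only.

```c
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Returns the reported interval for the calling process in
 * milliseconds, or -1 if the call fails. */
static long rr_interval_ms(void)
{
    struct timespec ts;

    if (sched_rr_get_interval(0, &ts) != 0)     /* pid 0 = caller */
        return -1;
    return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}
```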

On the other hand, the system manages these slices in an interesting and
non-intuitive way.  First, the task that has the longest remaining slice
gets (usually) the processor.
Second, when the slice is consumed a new one is not given to the task
until all tasks in the run queue have consumed all of their slices. 
When this happens all tasks on the system are given new slices.  Tasks
that have some value left will get 1/2 of that value plus the new
slice.  (So a task that is waiting for something, say a key stroke, will
accumulate more slice time as it waits.)
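
The arithmetic in this thread, written out as a sketch. The function
names are invented; the numbers (HZ of 100, a default slice of HZ/10
ticks, "half of what is left plus the new slice") are the ones quoted
above.

```c
#include <assert.h>

/* Length in milliseconds of a slice of `ticks` timer ticks at `hz`
 * ticks per second. */
static int slice_ms(int ticks, int hz)
{
    return ticks * 1000 / hz;
}

/* Counter after one recalculation: half of what is left, plus a fresh
 * slice. */
static int recalc_counter(int remaining, int fresh_slice)
{
    return remaining / 2 + fresh_slice;
}

/* Counter of a task that sleeps through `rounds` recalculations and so
 * never spends any of it. */
static int counter_after_sleeping(int rounds, int fresh_slice)
{
    int counter = 0;

    while (rounds-- > 0)
        counter = recalc_counter(counter, fresh_slice);
    return counter;
}
```

With a fresh slice of 10 ticks a sleeper goes 10, 15, 17, 18, 19 and
then stays at 19: the bonus accumulates but never reaches twice the
fresh slice.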

Hope this helps.

George

Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Trond Myklebust

> " " == Linus Torvalds <[EMAIL PROTECTED]> writes:

 > The advantage of clearing the uptodate flag (as opposed to
 > doing what we do now - dropping the page altogether) is that
 > there would be no cache aliasing issues, and there would be no
 > issues with a page and its associated data just "disappearing"
 > from under somebody. It would cause the page to be read in
 > again the next time it is faulted in or somebody does a read()
 > on it, and that's exactly what we want.

I've proposed this in the past, but there you refused on the grounds
that it breaks assumptions in the VFS. I'm otherwise happy with this
sort of thing. It ensures both shared mmap() and cache consistency.

The only problem I can think of would be for dirty pages that haven't
yet been msync()ed, however I'm far from being competent to evaluate
any side-effects on the VM.

 > However, we _do_ need to WB the page data some way - but the
 > decision on whether to invalidate write-backs or finish them
 > would have to be up to the low-level filesystem.

This is already in place.

Cheers,
  Trond

RE: [ADMIN] some list related topics ..

2000-10-20 Thread Marty Fouts




> -----Original Message-----
> From: Matti Aarnio [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, October 19, 2000 1:26 PM
> To: [EMAIL PROTECTED]
> Subject: [ADMIN] some list related topics ..

[snip]

> 
>   3) some ISP systems yield 500 series errors with text:
>   "system is temporarily busy"
>  or something of that effect.  Now THAT is really offensive
>  stupidity by the ISP software folks...
> 

  There is nothing in the SMTP RFCs that requires any system to be able to
accept all email at all times.  SMTP is *not* designed to be a reliable
delivery mechanism, let alone a first-time reliable delivery mechanism.
Refusal to accept email because the receiving system is under high load is
well understood, commonly accepted, and even codified in implementation
practice.
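
For readers unfamiliar with the distinction being argued over: the first
digit of an SMTP reply code tells the sender whether retrying makes
sense (2yz success, 4yz transient failure, 5yz permanent failure). A
small sketch, with an enum and function name made up for illustration:

```c
#include <assert.h>

enum smtp_class {
    SMTP_SUCCESS,
    SMTP_TRANSIENT,     /* 4yz: try again later */
    SMTP_PERMANENT,     /* 5yz: do not retry unchanged */
    SMTP_OTHER
};

static enum smtp_class smtp_classify(int code)
{
    switch (code / 100) {
    case 2:
        return SMTP_SUCCESS;
    case 4:
        return SMTP_TRANSIENT;
    case 5:
        return SMTP_PERMANENT;
    default:
        return SMTP_OTHER;
    }
}
```

A list server that sees a 5yz reply has no way to tell "busy right now"
from "this address is dead", which is the case item 3 of the quoted
message describes.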

In my opinion, you are doing a GoodThing(tm) by trying to weed broken
addresses from the mailing list. But please don't demand from the internet
behavior it wasn't designed to provide.

Marty

Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Trond Myklebust

> " " == Linus Torvalds <[EMAIL PROTECTED]> writes:

 > Btw, that "invalidate_inode_pages()" thing is just wrong - we
 > can't just remove pages that are mapped etc, because that would
 > result in no end of fun aliasing problems etc.

 > How about adding a test in invalidate_inode_pages() like

 >  	/* We cannot invalidate a locked page */
 >  	if (TryLockPage(page))
 >  		continue;
 >
 > +	/* We cannot invalidate a page that is in use */
 > +	if (page_count(page) != 1) {
 > +		UnlockPage(page);
 > +		continue;
 > +	}
 > +
 >  	__lru_cache_del(page);
 >  	__remove_inode_page(page);

The problem here is that NFS pages have 3 rather than 2 states:
  1) mmapped & correct.
  2) mmapped & incorrect. (but possibly dirty)
  3) Unmapped

For case 1), we clearly want to have the page in inode->i_mapping.
For cases 2) & 3) we don't.

However for case 2) we still have a weak association to the inode
itself, and we want to be able to reference inode metadata etc.  Would
it make sense then to remove these pages from i_mapping, but to hang
them onto a new struct address_space (call it i_unmapped for want of a
better name)?

That would allow you to keep a consistent state for the page, while
still allowing you to 'invalidate' it (by removing it from the
i_mapping) and hence maintain a consistent cache.

invalidate_inode_pages() would then reduce to

   remove_page_from_inode_queue(page);
   remove_page_from_hash_queue(page);
   if (page_count(page))
 add_page_to_inode_unmapped(page);

Cheers,
  Trond

Re: MAP_NR

2000-10-20 Thread Mike Galbraith


On Thu, 19 Oct 2000 [EMAIL PROTECTED] wrote:

> can anyone tell the substitute for MAP_NR in version 2.4?
> or is MAP_NR still there?

Hi,

MAP_NR() became virt_to_page() as of test6-pre8.

-Mike
 


Re: process header declaration?

2000-10-20 Thread Andrew C. Dingman


Thanks, both for the regexp and for the ctags/etags recommendation. I'd
just been looking for 'task_struct' with less, find and egrep, and
apparently not noticing the correct needle in the haystack of matches.

-Andrew

On Thu, 19 Oct 2000, Erik Mouw wrote:

>On Thu, Oct 19, 2000 at 12:30:51PM -0500, Andrew C. Dingman wrote:
>> I'm working on a project for my senior seminar for which I (and my
>> profs) think I need to modify the process descriptor
>> struct. Unfortunately, I don't seem to be good enough with 'grep' to
>> figure out where the type is declared. Could someone give me a pointer
>> to the right file in the 2.4.0-testX source code, or a good expression
>> to grep for, please? I am subscribed to the list, but a cc wouldn't
>> hurt, either. Thanks in advance for any help you feel inclined to
>> offer.
>
>find include/ -name "*.h" -exec grep '^struct task_struct' {} /dev/null \;
>
>vi+ctags or emacs+etags are also good combinations to find identifiers
>in the kernel source.
>
>Or use the cross referencing tool at lxr.linux.no:
>
>  http://lxr.linux.no/ident
>
>
>Erik
>
>-- 
>J.A.K. (Erik) Mouw, Information and Communication Theory Group, Department
>of Electrical Engineering, Faculty of Information Technology and Systems,
>Delft University of Technology, PO BOX 5031,  2600 GA Delft, The Netherlands
>Phone: +31-15-2783635  Fax: +31-15-2781843  Email: [EMAIL PROTECTED]
>WWW: http://www-ict.its.tudelft.nl/~erik/
>


Re: [TRACED] Re: "Tux" is the wrong logo for Linux

2000-10-20 Thread Gregory Maxwell

On Thu, Oct 19, 2000 at 05:06:33PM +0100, Alex Buell wrote:
> With regards to this thread, looking at the headers of this post, he
> appears to be posting from 216.27.3.45. Running a traceroute produces
> the following:
[snip]
> Feel free to send complaints to [EMAIL PROTECTED] and get his account
> yanked for abuse of mailing lists. 
[snip]

While the poster was obviously being an irrational jerk, his opinion wasn't
totally inappropriately placed on this list. Getting people kicked off of
systems because they have strong (and potentially stupid) ideas is not a
good thing. Fear can be a more effective restraint on open discussion than
laws.

Any developer thick-skinned enough to deal with the strong personalities on
this list should have no problem ignoring the poster's inarticulate and
childish complaints.

[PATCH] Boot Logo configuration and generation

2000-10-20 Thread Rick Miller


I've replaced my former attempt at a patch with The Real Thing this
time... including the patches to include/linux/console.h and
kernel/printk.c that should have been in there last time.

BRIEFLY:
--------
This patch comes with a "make_bootlogo.pl" script for installing your
own custom boot graphics and adds two options to the frame buffer kernel
configuration:

CONFIG_QUIET_PRINTK: suppresses printk()'s console output
and
CONFIG_FBLOGO_{TOPLEFT,CENTER}: chooses screen position for the boot
graphic

The patch is at:

http://execpc.com/~rdmiller/Linux/kernel-bootlogo.patch.gz

and is roughly 120k bytes due to the movement of the raw logo data.

DETAILS:

Documentation for the kernel configuration options has been added to the
Documentation/Configure.help file.  There is also a
Documentation/fb/make_bootlogo.txt to explain the use of the new script
for installing your own graphic.

Previous graphics have been retained, but moved to linux_logo_data.h
files where applicable.  The non-architecture-specific default logo is
in include/linux/linux_logo_default.h which gets linked to
include/linux/linux_logo.h if there isn't an architecture-specific logo.

ONE FINAL CAUTION:
------------------
Choose your boot logo with some measure of prudence.  A certain non-ANSI
logo suggested recently on this mailing list would probably not be
appropriate.

Rick Miller
[EMAIL PROTECTED]

Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Russell King

Alexander Viro writes:
> Trond, I'm not asking about implementation - the question being what
> semantics do you want for nfs_zap_caches() wrt user-mapped pages.

Ok, looking through sendmail, and then db2, the situation is
created by the db2 library.  If the process does the following:

1. creates NFS file
2. sets file size by lseek + write
3. maps it
4. fcntl locks the file (writes above data back to file)
5. write to the mapping (makes pages dirty)
6. fcntl unlocks it (dirty page data is NOT written back)
7. closes it
8. unlinks it
9. some time later unmaps it (causes dirty data to be written back)

Note that (6) doesn't act as a barrier to synchronise writes in this case,
but it does for any normal write()s.  Surely NFS should cause any dirty
data associated with the file to be written back to the server no matter
what?
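
The nine steps above can be sketched as a small userspace C function. This is only an illustration of the call sequence, not the db2 library code: the name run_lock_mmap_sequence and the MAP_LEN size are made up for the example, and running it on a local filesystem will not show the NFS behaviour described here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define MAP_LEN 4096	/* hypothetical size, one page on i386 */

/*
 * Walk through steps 1-9 on `path`.  Returns 0 on success, or the
 * number of the step that failed.
 */
int run_lock_mmap_sequence(const char *path)
{
	struct flock fl;
	char *map = MAP_FAILED;
	int fd, step;

	step = 1;			/* create the file */
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return step;

	step = 2;			/* set the size by lseek + write */
	if (lseek(fd, MAP_LEN - 1, SEEK_SET) < 0 || write(fd, "", 1) != 1)
		goto fail;

	step = 3;			/* map it shared and writable */
	map = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto fail;

	step = 4;			/* fcntl lock over the whole file */
	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;		/* l_start = 0, l_len = 0: to EOF */
	if (fcntl(fd, F_SETLKW, &fl) < 0)
		goto fail;

	memset(map, 'x', MAP_LEN);	/* step 5: dirty the mapped page */

	step = 6;			/* fcntl unlock */
	fl.l_type = F_UNLCK;
	if (fcntl(fd, F_SETLK, &fl) < 0)
		goto fail;

	step = 7;			/* close; the mapping stays valid */
	if (close(fd) < 0) {
		fd = -1;
		goto fail;
	}
	fd = -1;

	step = 8;			/* unlink while still mapped */
	if (unlink(path) < 0)
		goto fail;

	step = 9;			/* unmap, the last point for writeback */
	if (munmap(map, MAP_LEN) < 0)
		return step;
	return 0;

fail:
	if (map != MAP_FAILED)
		munmap(map, MAP_LEN);
	if (fd >= 0)
		close(fd);
	unlink(path);
	return step;
}
```

Calling run_lock_mmap_sequence() with a path on an NFS mount would exercise the ordering in question; an msync() before step 6 would be the explicit way for an application to force the dirty pages out before unlocking.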

Although Linus' fix seems to prevent the problem, I get the feeling that it
is a sticky plaster over a much bigger problem.
   _
  |_| - ---+---+-
  |   | Russell King[EMAIL PROTECTED]  --- ---
  | | | | http://www.arm.linux.org.uk/personal/aboutme.html   /  /  |
  | +-+-+ --- -+-
  /   |   THE developer of ARM Linux  |+| /|\
 /  | | | ---  |
+-+-+ -  /\\\  |

IDE disk slow? There's help...

2000-10-20 Thread Michael Kwasigroch


Hi,

I recently bought a 40Gig IBM ATA100 disk as a replacement for a dying 4G
SCSI disk. I knew I was risking some trouble because I have a roughly 4-year-old
Triton 2 board (Intel 430HX) and I didn't want to risk more trouble
(and spend more money) by using a proprietary PCI IDE controller board. But
the disk was dead cheap and really big so I bought it and connected it as
the primary master on the onboard controller and...

Linux (stock 2.2.17) could only push about 2.6 MB/s "through" it (hdparm -Tt
/dev/hda)... :-(

The scsi disks can do about 5.5 - 6.1 MB/s (8Bit fast SCSI, no ultra,
adaptec 2940 PCI).

So I tried to enable IDE DMA, 16 bit data transfers, no use. That was quite
disappointing but I gave up until yesterday when I (again) searched

   http://www.linux-ide.org

I got the latest 2.2.17 ide-patch, made a new kernel and voila:

My new IDE disk now "flies" at about 9.2 MB/s and really outperforms the
scsi disks!!!

ABOUT A 3.5x PERFORMANCE GAIN! FOR FREE!!! Unbelievable, but that's the truth
with free software...


One thing I don't understand: Why is this patch not in the stock kernel? It
should (positively) affect lots of people, or am I missing something?


P.S.: Please email me directly, I'm not subscribed to any Linux list.

PPS: Beware 33+ Gig IDE disks if you have an Award 4.51 BIOS and want to
boot from it.
 You will **NOT** be able to boot from disks >33G due to a BIOS bug.
 See
 http://www.storage.ibm.com/techsup/hddtech/bios338gb.htm
 and
 http://www.storage.ibm.com/techsup/hddtech/hddfaqs.htm
 for details.


Enjoy.


Mit freundlichen Gruessen / best regards

Michael Kwasigroch
FaxPlus/Open Development


eMail: [EMAIL PROTECTED]

INTERCOPE GmbH


Re: problems with Tulip driver in 2.2 and 2.4 (true 21143 in 2.2.x, too)

2000-10-20 Thread Clayton Weaver


It is not only the "almost standard" tulip clones that have problems in
2.2.1x. Stock Debian potato (2.2.17-pre6, IIRC) i386 kernel, Kingston
KNE100TX w/i21143: works fine in 2.0.38 (.90 driver), hung the kernel
solid during an ftp running the potato kernel (100/half-duplex).

It was not quite a standard ftp exchange (win98 client sending a
typo, i.e. invalid ftp command, rather than a file transfer, when it
hung), but regardless, deadlocking the kernel without so much as a squeak
of an error message probably is not the intended result even in the
presence of invalid input to a userspace server.

(Assuming that the deadlock was not the product of some entirely unrelated
problem that happened just at that moment. ext2fs is a little squirrelly
with that 2.2.16+ kernel, too, fixable reference count anomalies turning
up more-or-less randomly about every 10th e2fsck, different partitions and
drives that never show such symptoms running 2.0.38. It's not as if the
ext2fs metadata write code in 2.2.17 has big revisions from 2.0.38. pci
bus in question is 2.0, no apic.)

Regards,

Clayton Weaver

(Seattle)

"Everybody's ignorant, just in different subjects."  Will Rogers




Re: bind() - Old/Current behaviour - Change?

2000-10-20 Thread Andrey Savochkin


[cc list trimmed]

On Thu, Oct 19, 2000 at 09:52:30PM +1000, Cefiar wrote:
[snip]
> ... what is really necessary, 
> which is to simply not allow the programs to bind to the addresses in the 
> first place. Unfortunately to implement this sort of thing in god knows how 
> many user space programs looked like too much re-inventing of the wheel.
> 
> What I'm sort of envisioning is a small API (and user space interface 
> program) that can maintain lists like this for 2 sorts of instances:
>   - Global conditions
>   - Per-process conditions
[snip]

I think that it's a good idea.
The only question is whether such lists and conditions, and such a big degree
of flexibility, belong in kernel space.
Isn't it better just to pass almost all bind() calls through a special daemon
for systems which want non-trivial bind policies?

Best regards
Andrey

Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Petr Vandrovec


On 19 Oct 00 at 16:32, Linus Torvalds wrote:

> How about adding a test in invalidate_inode_pages() like
> 
> /* We cannot invalidate a locked page */
> if (TryLockPage(page))
> continue;
> 
> +   /* We cannot invalidate a page that is in use */
> +   if (page_count(page) != 1) {
> +   UnlockPage(page);
> +   continue;
> +   }
> +
> __lru_cache_del(page);
> __remove_inode_page(page);
 
Hi Linus,
  this did not fix problem with my testcase - 
mmap(shared),fork,dirty_it,ftruncate,finish - does NOT go through
this code (invalidate_inode_pages) at all (should it?).

  I instrumented both swapout in page_io, and __remove_inode_page,
and pages are going through __remove_inode_page with strange use
count. I do not know whether that is good or not... During normal life
of the system, pages with count==2 && index==0 are passed to __remove_inode_page.
As soon as I start my test, non-zero index pages with count > 2 are
passed here. Later these pages appear in filemap_sync...
Thanks,
Petr Vandrovec
[EMAIL PROTECTED]

vana:~# free
             total       used       free     shared    buffers     cached
Mem:        255768     203232      52536          0       4084      35496
-/+ buffers/cache:     163652      92116
Swap:       530136          0     530136
vana:~# free
             total       used       free     shared    buffers     cached
Mem:        255768     203376      52392          0       4084      35636
-/+ buffers/cache:     163656      92112
Swap:       530136          0     530136

vana:~# free
             total       used       free     shared    buffers     cached
Mem:        255768      72172     183596          0       4084      32888
-/+ buffers/cache:      35200     220568
Swap:       530136          0     530136
vana:~# free
             total       used       free     shared    buffers     cached
Mem:        255768      72180     183588          0       4084      32888
-/+ buffers/cache:      35208     220560
Swap:       530136          0     530136


11:57:44 Bad page count in __remove_inode_page!
11:57:44   page: c1349ef8
11:57:44 mapping:  cc65c6dc
11:57:44 index:0
11:57:44 nexthash: 
11:57:44 count:2
11:57:44 flags:0x0009
11:57:44 age:  0
11:57:44 pprevhash: 
11:57:44 buffers:  
11:57:44 virtual:  cc61a000
11:57:44 zone: c021f578
11:57:44 pagedump done
11:57:44 Bad page count in __remove_inode_page!
11:57:44   page: c1120394
11:57:44 mapping:  cc6b2edc
11:57:44 index:32767
11:57:44 nexthash: 
11:57:44 count:3
11:57:44 flags:0x0009
11:57:44 age:  5
11:57:44 pprevhash: 
11:57:44 buffers:  
11:57:44 virtual:  c43d1000
11:57:44 zone: c021f578
11:57:44 pagedump done

11:57:46 Bad page count in __remove_inode_page!
11:57:46   page: c1266298
11:57:46 mapping:  cc6b2edc
11:57:46 index:13635
11:57:46 nexthash: 
11:57:46 count:3
11:57:46 flags:0x0009
11:57:46 age:  5
11:57:46 pprevhash: 
11:57:46 buffers:  
11:57:46 virtual:  c9082000
11:57:46 zone: c021f578
11:57:46 pagedump done

11:57:46 page->mapping == NULL
11:57:46 error:  -22
11:57:46 ptep:   c92f6a2c
11:57:46 pteval: 09062027
11:57:46 vma:ce926420
11:57:46   vm_mm:ccd67d60
11:57:46   vm_start: 40128000
11:57:46   vm_end:   48128000
11:57:46   vm_next:  c1490b20
11:57:46   vm_avl_height:  1
11:57:46   vm_avl_left:
11:57:46   vm_avl_right:   
11:57:46   vm_next_share:  
11:57:46   vm_pprev_share: ccdee684
11:57:46   vm_operations_struct: c021f1a0
11:57:46   vm_pgoff:  
11:57:46   vm_file:   cd4270e0
11:57:46   vm_raend:  
11:57:46   vm_private_data:  
11:57:46 address:4368B000
11:57:46 flags:  0001
11:57:46   file: cd4270e0
11:57:46 dentry:   cc8ae440
11:57:46   inode: cc6b2e40
11:57:46   num: 848677
11:57:46   dev: 0x0302
11:57:46   path: /usr/src/tst/ram0 (deleted)
11:57:46 vfsmount: c14ca8c0
11:57:46 op:   c0221bc0
11:57:46 count:4
11:57:46 flags:0x0002
11:57:46 mode: 03
11:57:46 pos:  0
11:57:46 reada:0
11:57:46 ramax:0
11:57:46 raend:0
11:57:46 ralen:0
11:57:46 rawin:0
11:57:46 owner.pid: 0
11:57:46 owner.uid: 0
11:57:46 owner.euid: 0
11:57:46 owner.signum: 0
11:57:46 uid:  0
11:57:46 gid:  0
11:57:46 error:0
11:57:46 version:  11641
11:57:46 private_data: 
11:57:46   page: c1265a18
11:57:46 mapping:  
11:57:46 index:

Re: [ANNOUNCE] DProbes 1.1

2000-10-20 Thread richardj_moore

Andi,

Thanks for your feedback.  We are looking at this now. Hopefully we will be
able to give you a response on Monday. If we don't then it's because most
of us are on holiday next week.

I'm interested in getting information on who is using DProbes and how it's
being used.

Yes, and also note that we haven't yet done the SMP port of DProbes - that's
next.

Richard Moore -  RAS Project Lead - Linux Technology Centre (PISC).

http://oss.software.ibm.com/developerworks/opensource/linux
Office: (+44) (0)1962-817072, Mobile: (+44) (0)7768-298183
IBM UK Ltd,  MP135 Galileo Centre, Hursley Park, Winchester, SO21 2JN, UK

Andi Kleen <[EMAIL PROTECTED]> on 18/10/2000 18:38:13

Please respond to Andi Kleen <[EMAIL PROTECTED]>

To:   Richard J Moore/UK/IBM@IBMGB
cc:   [EMAIL PROTECTED]

Subject:  Re: [ANNOUNCE] DProbes 1.1

Hallo Richard,

On Wed, Oct 18, 2000 at 10:44:11AM +0100, [EMAIL PROTECTED] wrote:
>
>
> We've released v1.1 of DProbes - details and code are on the DProbes web
> page.
>
> the enhancements include:
>
> - DProbes for kernel version 2.4.0-test7 is now available.

First thanks for this nice work.

I ported the older 1.0 dprobes to 2.4 a few weeks ago for my own use.
It is very useful for kernel work. Unfortunately the user space support
had still one ugly race which I didn't fix because it required too
extensive changes for my simple port (and it didn't concern me because
I only use kernel level breakpoints)

I see the problems are still in 1.1.

The problem is the vma loop in process_recs_in_cow_pages over the vmas of an
address_space. In 2.4 the only way to do that safely is to hold the
address_space spinlock. Unfortunately you cannot take the semaphore
or execute handle_mm_fault while holding the spinlock, because they could
sleep. The only way I think to do it relatively race free without adding locks
to the core VM is to do it two pass (first collect all the mms with mmget()
and their addresses in a separate list with the spinlock and then process it
with the spinlock released)
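
The two-pass idea can be sketched in plain userspace C. This is a hedged illustration only, not DProbes or kernel code: struct item stands in for a vma, the refcount field stands in for the reference mmget() would take, and the spinlock is a toy one built on C11 atomic_flag. All names here are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_ITEMS 16		/* hypothetical bound on the snapshot list */

struct item {			/* stand-in for a vma on the shared list */
	int id;
	int refcount;		/* stand-in for the reference mmget() takes */
	int processed;
};

static atomic_flag list_lock = ATOMIC_FLAG_INIT;

static void spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;		/* busy-wait, never sleeps */
}

static void spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Stand-in for work that may sleep; must not run under the spinlock. */
static void slow_process(struct item *it)
{
	it->processed = 1;
}

/* Returns the number of items collected and processed. */
int process_two_pass(struct item *list, size_t n)
{
	struct item *snapshot[MAX_ITEMS];
	size_t i, count = 0;

	/* Pass 1: under the lock, pin each item and remember it. */
	spin_lock(&list_lock);
	for (i = 0; i < n && count < MAX_ITEMS; i++) {
		list[i].refcount++;
		snapshot[count++] = &list[i];
	}
	spin_unlock(&list_lock);

	/* Pass 2: lock released, so the slow work is allowed to sleep. */
	for (i = 0; i < count; i++) {
		slow_process(snapshot[i]);
		spin_lock(&list_lock);
		snapshot[i]->refcount--;	/* drop the pin again */
		spin_unlock(&list_lock);
	}
	return (int)count;
}
```

The pin taken in pass 1 is what keeps each collected object alive once the lock is dropped; without it the snapshot could hold stale pointers.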

Then dp_vaddr_to_page has another race. It cannot hold the mm semaphore
because that would deadlock with handle_mm_struct. Not holding it means
though that the page could be swapped out again after you faulted it in
before you have a chance to access it. It probably can be done with a
loop that checks and locks the page atomically (e.g. using cmpexchg)
and retries the handle_mm_fault as needed.
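
A minimal sketch of such a check-and-lock retry loop, using C11 compare-and-exchange in userspace. Everything here is hypothetical (struct fake_page, the three states, fake_fault_in standing in for handle_mm_fault); it only shows the shape of the loop, not the real page locking.

```c
#include <stdatomic.h>

enum page_state { PAGE_ABSENT, PAGE_PRESENT, PAGE_LOCKED };

struct fake_page {		/* hypothetical stand-in for a page */
	_Atomic int state;
	int faults;		/* how often we had to fault it in */
};

/* Stand-in for handle_mm_fault(): bring an absent page back in. */
static void fake_fault_in(struct fake_page *p)
{
	int expected = PAGE_ABSENT;

	p->faults++;
	atomic_compare_exchange_strong(&p->state, &expected, PAGE_PRESENT);
}

/*
 * Try to move the page from PRESENT to LOCKED atomically.  If the page
 * is gone (or went away between the fault and the lock attempt), fault
 * it in again and retry.  Returns the number of attempts used, or -1
 * if max_tries attempts were not enough.
 */
int lock_page_with_retry(struct fake_page *p, int max_tries)
{
	int tries;

	for (tries = 1; tries <= max_tries; tries++) {
		int expected = PAGE_PRESENT;

		if (atomic_load(&p->state) == PAGE_ABSENT)
			fake_fault_in(p);
		if (atomic_compare_exchange_strong(&p->state, &expected,
						   PAGE_LOCKED))
			return tries;
		/* Lost the race: the state changed under us, go around. */
	}
	return -1;
}
```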

There may be more races I missed, the 2.4 SMP MM locking hierarchy is
unfortunately not very flexible and makes things like what dprobes wants
to do relatively hard.

Another change I added and which I found useful is a printk to show
the opcode of mismatched probes (this way wrong offsets in the probe
definitions are easier to fix)

-Andi


RE: MAP_NR for 2.4

2000-10-20 Thread aprasad


for using MAP_NR with 2.4, i think you can use
macro like

#define MAP_NR(addr) (((unsigned long)(addr)-PAGE_OFFSET) >>PAGE_SHIFT)
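
To see what the macro computes, here is a small userspace sketch. The PAGE_OFFSET and PAGE_SHIFT values are assumptions (the usual i386 values, 0xC0000000 and 12); in the kernel they come from the architecture headers. map_nr_of is a hypothetical wrapper added only so the arithmetic can be checked.

```c
/* Assumed i386 values; the real ones come from the kernel headers. */
#define PAGE_SHIFT	12
#define PAGE_OFFSET	0xC0000000UL

#define MAP_NR(addr) (((unsigned long)(addr) - PAGE_OFFSET) >> PAGE_SHIFT)

/* Page frame index of a kernel virtual address in the direct mapping. */
unsigned long map_nr_of(unsigned long kernel_vaddr)
{
	return MAP_NR(kernel_vaddr);
}
```

With these values, 0xC0000000 maps to page 0 and 0xC0001000 to page 1. The macro is only meaningful for addresses in the kernel's direct mapping.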

regards
anil



Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Trond Myklebust

> " " == Alexander Viro <[EMAIL PROTECTED]> writes:

 > So what exactly do you want it to do when page is mapped by
 > user process?  Should it remain visible or not? What should
 > happen if process writes to that page?

 > Trond, I'm not asking about implementation - the question being
 > what semantics do you want for nfs_zap_caches() wrt user-mapped
 > pages.

For the general case of the page cache I think we can keep them quite
simple:

+ We do in any case want to drop all pages that are unreferenced. (The
reason for flushing may be that the file size has changed.)

+ For pages that are referenced (and unlocked) we would like to force
them to get read in anew ASAP. How this is done in practice is
irrelevant as far as NFS is concerned provided that we don't sleep on
any I/O while in nfs_zap_caches()/invalidate_inode_pages().

The lower level stuff can and will sort out the business of flushing
out pending writebacks that conflict with the read, so that isn't a
problem for the VFS/VM.

The problem lies with writes that haven't yet been msync()ed (and
hence do not have writebacks). For shared mappings, one should perhaps
schedule an automatic msync() of the dirty pages (???). For private
mappings, perhaps the best thing would be to defer the read?

Cheers,
  Trond

2.4.0-test9: loop deadlocked on me

2000-10-20 Thread Pavel Machek


Hi!

Trying to copy 1Gig of files onto a loop device was not a good idea: it
deadlocked on me.

ps -auxl:
   100 08765   9   0872   464  c0134e95   D2  0:01 cp -a 22007.pdf 
and
 0 09071   9   0772   356  c01253ad   D8  0:00 sync

System.map:
c0134e10 T wakeup_bdflush
c0134eb0 t flush_dirty_buffers
c0134f90 t sync_old_buffers

c0125150 T lock_page
c0125180 T __find_get_page
c01252b0 T __find_lock_page
c01253e0 t drop_behind
c01254f0 t generic_file_readahead

Pavel
-- 
I'm [EMAIL PROTECTED] "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [EMAIL PROTECTED]

Re: bind() allowed to non-local addresses

2000-10-20 Thread David Woodhouse



[EMAIL PROTECTED] said:
>   There is NOT a bug in the JVM code that handles java.net.DatagramSock
> et.  Don't you find it a little compelling that the nearly identical
> JVM code passes the Java Compatibility test suite on Linux 2.2,
> Solaris, HPUX, SCO, and even Windows?  

If the JVM spec says that it 'MUST' fail when used on a non-local address, 
and the POSIX spec for bind does not say that it 'MUST' fail, then yes, 
there is a bug in the JVM if it assumes that the two are compatible.

The fact that they just happen to behave the same in certain phases of the 
moon and on other operating systems is not relevant.

We may decide that we want to pander to this brokenness, especially given 
the widespread nature of the false assumption that bind() will fail when 
given a non-local address. But that doesn't make the JVM non-broken.
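
For anyone who wants to check what a given kernel actually does, a small probe along these lines may help. It is a sketch under assumptions: try_bind_udp is a hypothetical helper, and the address passed to it must be one that is not configured on the local machine (192.0.2.1 from the documentation range is a reasonable guess). A system following the SuS wording would be expected to return EADDRNOTAVAIL; a kernel with the behaviour discussed in this thread may return 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Try to bind a UDP socket to `ip` on an ephemeral port.
 * Returns 0 if bind() succeeded, EINVAL if `ip` does not parse,
 * otherwise the errno value left by socket() or bind().
 */
int try_bind_udp(const char *ip)
{
	struct sockaddr_in sa;
	int fd, err = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(0);		/* let the kernel pick a port */
	if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
		return EINVAL;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return errno;

	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
		err = errno;		/* e.g. EADDRNOTAVAIL */

	close(fd);
	return err;
}
```

An application that needs the "must fail" semantics regardless of the kernel could compare the requested address against the local interface list itself rather than relying on the bind() result.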

--
dwmw2



netlink_dev not compile in test9 due to bug in net/Makefile

2000-10-20 Thread Vladimir V. Klenov


Hello!

Found a small bug in net/Makefile.
netlink_dev is not included in the compile list because of a wrong variable
definition. Here is a patch:

root@valinor:/usr/src/linux/net# diff -u Makefile.old Makefile
--- Makefile.old        Fri Oct 20 16:05:14 2000
+++ Makefile    Fri Oct 20 16:06:55 2000
@@ -27,7 +27,7 @@
 endif
 
 subdir-$(CONFIG_KHTTPD)        += khttpd
-subdir-$(CONFIG_NETLINK)       += netlink
+subdir-$(CONFIG_NETLINK_DEV)   += netlink
 subdir-$(CONFIG_PACKET)        += packet
 subdir-$(CONFIG_NET_SCHED)     += sched
 subdir-$(CONFIG_BRIDGE)        += bridge


SY, Vladimir Klenov

Re: real_root_dev

2000-10-20 Thread Andries Brouwer

On Thu, Oct 19, 2000 at 09:50:48PM +0200, Geert Uytterhoeven wrote:
> 
> `real_root_dev' must be `int', not `kdev_t'.
> 
> - if (MAJOR(real_root_dev) != RAMDISK_MAJOR
> + if (MAJOR((kdev_t)real_root_dev) != RAMDISK_MAJOR

Ach, Geert, how painful to behold!

Never forget: a kdev_t is a pointer to a structure,
and MAJOR takes a field of this structure.
Casting an integer to a structure is ridiculous.
There are functions to_kdev_t etc to do the conversion
(and these may involve lookup in a hash table).

Please keep the source as much as possible kdev_t clean.
At some point in time, I hope 2.5.1, we must change,
and all such cruft would have to be fixed again.

Andries

Re: mapping user space buffer to kernel address space

2000-10-20 Thread Stephen Tweedie

Hi,

On Tue, Oct 17, 2000 at 09:42:36PM -0700, Linus Torvalds wrote:

> Now, the way I've always envisioned this to work is that the VM scanning
> function basically always does the equivalent of just
> 
>  - get PTE entry, clear it out.
>  - if PTE was dirty, add the page to the swap cache, and mark it dirty,
>but DON'T ACTUALLY START THE IO!
>  - free the page.
> 
> Then, we'd move the "writeout" part into the LRU queue side, and at that
> point I agree with you 100% that we probably should just delay it until
> there are no mappings available

I've just been talking about this with Ben LaHaise and Rik van Riel,
and Ben brought up a nasty problem --- NFS, which because of its
credentials requirements needs to have the struct file available in
its writepage function.  Of course, if we defer the write then we
don't necessarily have the file available when we come to flush the
page from cache.

One answer is to say "well then NFS is broken, fix it".  It's not too
hard --- NFS mmaps need a wp_page function which registers the
caller's credentials against the page when we dirty it so that we can
use those credentials on flush.  That means that writes to a
multiply-mapped file essentially get random credentials, but I don't
think we care --- the credentials eventually used will be enough to
avoid the root_squash problems and the permissions at open make sure
we're not doing anything illegal.  

(Changing permissions on an already-mmaped file and causing the NFS
server to refuse the write raises problems which are ... interesting,
but I'm not convinced that that is a new problem; I suspect we can
fabricate such a failure today.)

--Stephen

Re: Quota fixes and a few questions

2000-10-20 Thread Stephen C. Tweedie

Hi,

On Thu, Oct 19, 2000 at 07:03:54PM +0200, Jan Kara wrote:
> 
> > I stumbled into another problem:
> > When using ext3 with quotas the kjournald process stops responding and
> > stays in DW state when the filesystem gets under heavy load. It is easy
> > to reproduce:
> > Just extract two or three larger tar.gz files at the same time to a ext3
> > filesystem with activated quotas...

Which ext3 version, exactly?  0.0.2f had quota problems because ext3
wasn't doing quota writethrough, so that inode cleaning could force
out random dirty quotas at any point.  0.0.3b should fix that.  If it
doesn't, I'll try to reproduce it here.

>   My suspicion is that quota is enabled on the journal and write_dquot()
> deadlocks, as to write a dquot we need to journal and to journal we need
> to update quota which is locked for writing :(.

Possible in 0.0.2f, but in 0.0.3b all of the quota updates should be
flushed in the context of the calling transaction.  Such updates are
always accounted for in the worst-case calculations of how many
journal blocks a transaction might need.

--Stephen

Re: mapping user space buffer to kernel address space

2000-10-20 Thread faith

In article <[EMAIL PROTECTED]>,
 you write:
>
>
>On Thu, 19 Oct 2000, Jeff Garzik wrote:
>>
>> I stole the last two lines from drivers/char/drm/vm.c, which might need
>> to be fixed up also..  He uses the vm_flags above and never calls
>> get_page, at the very least.
>
>The DRM code does
>
>   atomic_inc(&virt_to_page(physical)->count);
>
>which is basically what get_page() expands into. The DRM code looks ugly,
>but correct, at least as far as this thing is concerned.
>
>But you're right about the mmap vm_flags/vm_file things.

We'll look at this and submit the changes with the next patch set.

missing mxcsr initialization

2000-10-20 Thread Andrea Arcangeli


mxcsr is a per-process thing: it's saved and restored with the FPU, and it
should be initialized to its default value at reset the first time a
task uses the FPU, as we do with the other parts of the FPU (default value
at reset means all SIMD exceptions are masked).

--- 2.4.0-test10-pre3/arch/i386/kernel/i387.c.~1~   Sat Oct 14 19:00:15 2000
+++ 2.4.0-test10-pre3/arch/i386/kernel/i387.c   Thu Oct 19 04:02:18 2000
@@ -33,6 +33,21 @@
 #endif
 
 /*
+ * The _current_ task is using the FPU for the first time
+ * so initialize it and set the mxcsr to its default
+ * value at reset if we support FXSR and then
+ * remember the current task has used the FPU.
+ */
+void init_fpu(void)
+{
+       __asm__("fninit");
+       if ( HAVE_FXSR )
+               load_mxcsr(0x1f80);
+
+       current->used_math = 1;
+}
+
+/*
  * FPU lazy state save handling.
  */
 
--- 2.4.0-test10-pre3/arch/i386/kernel/traps.c.~1~  Thu Oct 12 03:04:39 2000
+++ 2.4.0-test10-pre3/arch/i386/kernel/traps.c  Thu Oct 19 04:02:56 2000
@@ -741,11 +741,7 @@
        if (current->used_math) {
                restore_fpu(current);
        } else {
-               /*
-                *  Our first FPU usage, clean the chip.
-                */
-               __asm__("fninit");
-               current->used_math = 1;
+               init_fpu();
        }
        current->flags |= PF_USEDFPU;   /* So we fnsave on switch_to() */
 }
--- 2.4.0-test10-pre3/include/asm-i386/bugs.h.~1~   Sun Oct 15 17:51:59 2000
+++ 2.4.0-test10-pre3/include/asm-i386/bugs.h   Thu Oct 19 04:01:58 2000
@@ -94,7 +94,6 @@
                printk(KERN_INFO "Enabling unmasked SIMD FPU exception support... ");
                set_in_cr4(X86_CR4_OSXMMEXCPT);
                printk("done.\n");
-               load_mxcsr(0x1f80);
        }
 #endif
 
--- 2.4.0-test10-pre3/include/asm-i386/i387.h.~1~   Sun Oct 15 17:51:59 2000
+++ 2.4.0-test10-pre3/include/asm-i386/i387.h   Thu Oct 19 04:03:19 2000
@@ -16,6 +16,7 @@
 #include 
 #include 
 
+extern void init_fpu(void);
 /*
  * FPU lazy state save handling...
  */
--- 2.4.0-test10-pre3/include/asm-i386/user.h.~1~   Thu Oct 19 04:11:32 2000
+++ 2.4.0-test10-pre3/include/asm-i386/user.h   Thu Oct 19 04:37:44 2000
@@ -48,8 +48,8 @@
        long    twd;
        long    fip;
        long    fcs;
-       long    fdp;
-       long    fds;
+       long    foo;
+       long    fos;
        long    st_space[20];   /* 8*10 bytes for each FP-reg = 80 bytes */
 };
 

The last part (user.h) shouldn't matter, that structure seems significant only
for its size but it's a typo so I corrected it.

(btw such bug was not present in Doug's 2.2.x alternative version of the PIII
FPU support)
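
For readers wondering why 0x1f80 means "all SIMD exceptions masked": as far as I read the Intel documentation, bits 7 to 12 of MXCSR are the six exception mask bits, bits 0 to 5 are the sticky exception flags, and bits 13 to 14 select the rounding mode. The sketch below just decodes the value; the macro and function names are made up for the illustration and are not part of the patch.

```c
/* Assumed MXCSR layout, per the Intel documentation as I understand it. */
#define MXCSR_DEFAULT		0x1f80u
#define MXCSR_EXC_FLAGS		0x003fu	/* bits 0-5: sticky exception flags */
#define MXCSR_EXC_MASKS		0x1f80u	/* bits 7-12: IM DM ZM OM UM PM */
#define MXCSR_ROUND_CTRL	0x6000u	/* bits 13-14: rounding control */

/* Non-zero if every SIMD floating point exception is masked. */
int mxcsr_all_masked(unsigned int mxcsr)
{
	return (mxcsr & MXCSR_EXC_MASKS) == MXCSR_EXC_MASKS;
}

/* Non-zero if no exception flag is pending. */
int mxcsr_flags_clear(unsigned int mxcsr)
{
	return (mxcsr & MXCSR_EXC_FLAGS) == 0;
}

/* Rounding mode field; 0 means round to nearest. */
unsigned int mxcsr_rounding(unsigned int mxcsr)
{
	return (mxcsr & MXCSR_ROUND_CTRL) >> 13;
}
```

So the value loaded by init_fpu() masks all six exceptions, leaves no flags pending, and selects round to nearest.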

Patch is downloadable also from here:


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.0-test10-pre3/PIII-fxsr-1

2.2.x backport of the PIII support with the above bugfix included against
2.2.18pre15aa1 is here (I will include in the next aa patchkit):


ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.2/2.2.18pre15aa1/PIII-3.bz2

That patch was generated from a mix of the PIII 2.2.x patch and the PIII 2.4.x
support plus some additional changes. (In the end it's very similar to 2.4.x,
but the 2.2.x version has been very useful too for doing comparisons.)

This 2.2.x version is completely dynamic (an i386 compile will use fxsr if the
CPU running the kernel supports it). No compile-time configuration option is
added.

Two kernel parameters are added: "serialnumber" and "nofxsr".

The former tells the kernel not to disable the serial number (so if the BIOS
doesn't disable it it will remain enabled). The latter forces the kernel not to
use fxsr even if the feature is present in the CPU.

The patch only provides new instructions to userspace and it __never__ takes
advantage of the FPU in kernel space and it also doesn't change at all the
logic for the lazy FPU handling. It also provides SIMD exceptions via signal
and the fxsr state via two new ptrace operations with the same interface as
2.4.0-test10-pre3.

It also handles the case of an asymmetric MP system with some CPU supporting
FXSR and some without FXSR by detecting the condition and then doing:

panic("To boot use the `nofxsr' kernel parameter");

Asymmetric MP systems almost always exist because people plug in a new CPU by
hand, so they're expected to also know what a kernel parameter is :).
A panic at boot with a hint at the solution is much better than breaking
later at runtime, as I got a bug report for that. I didn't handle that condition
automatically via IPI because that was tricky and it wasn't worth the effort
IMHO.

Many thanks to Doug and Gabriel for very useful explanations about this FPU
stuff. I suggest that Gabriel submit his much faster and more correct tag word
conversion function to Linus for 2.4.x.

Andrea

Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Andrea Arcangeli


On Fri, Oct 20, 2000 at 01:46:53PM -0400, Jeff Garzik wrote:
> In any case, we shouldn't modify videodev.c to call vma_reserve()... 
> Let the driver's mmap operation do that or not do that, as it chooses.

It can't with the current mmap video4linux kernel API.

In practice it doesn't matter because no driver (before your VIA soundcard :)
implements the swapout callback. And vma_reserve fits perfectly in your
->nopage driver too.

Andrea

Re: [PATCH] 0-byte read()/write() behaviour

2000-10-20 Thread Philipp Rumpf

On Fri, Oct 20, 2000 at 10:47:45AM -0700, Linus Torvalds wrote:
> On Fri, 20 Oct 2000, Philipp Rumpf wrote:
> >
> > Single Unix specifies that 0-byte reads, as well as 0-byte writes, should
> > "return 0 and have no other results".  Our current implementation violates
> > the first requirement and makes it very easy to violate the second one.
> 
> Note that there _are_ cases where 0-byte reads and writes have specific
> meaning, notably there are some networking things where a 0-byte sendto()
> does something special if I remember correctly. And I seem to remember
> that this also _did_ translate into write().

sock_write and sock_read contain:
if(size==0) /* Match SYS5 behaviour */
return 0;

I'm not sure which other read()/write() functions are used by
"networking things", but I don't see any others sendto would map to.

So I suspect if there are any applications which would need the behaviour
you described they've already been broken, and should be recompiled to use
sys_send(|to|msg) (via sys_socketcall on x86, of course).

> I remember that Linux used to do exactly this, and we had to pass the
> 0-byte writes into the low-level cases exactly because some low-level
> cases do care.

I would suspect most of those have been eliminated by now.

> I suspect SUS only talks about regular files.

As I'm reading it, they're talking about every read() call, even those with
an invalid fd.
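
For the regular-file case at least, the "return 0 and have no other results" behaviour is easy to check from userspace. A minimal sketch, assuming a writable /tmp; check_zero_byte_io is a hypothetical helper and says nothing about sockets or other special files:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Do a 0-byte write() and a 0-byte read() on a fresh temporary file.
 * Returns 0 if both return 0 and the file offset did not move,
 * 1 if any of that does not hold, -1 if the temp file cannot be made.
 */
int check_zero_byte_io(void)
{
	char path[] = "/tmp/zero_io_XXXXXX";
	char buf[1] = { 0 };
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);			/* file goes away on close */

	ok = write(fd, buf, 0) == 0 &&
	     read(fd, buf, 0) == 0 &&
	     lseek(fd, 0, SEEK_CUR) == 0;

	close(fd);
	return ok ? 0 : 1;
}
```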

Re: Any dual AGP slot motherboards?

2000-10-20 Thread Gary E. Miller

Yo James!

On Fri, 20 Oct 2000, James Simmons wrote:

> After much searching I couldn't find one. It was one of those mac rumors
> people spread around. I still like to get more than one AGP going. If I
> have multiple PCI bus in theory I should be able to have one AGP port on
> each PCI bus. Right? 

AGP is much faster than PCI bus and has nothing to do with the 
PCI bus.  So the number of multiple PCI buses has nothing to
do with the number of AGP buses.

The way to get multiple PCI buses is to bridge one PCI bus on
to another.  There are no changes required to the core chipset.
There is no way (yet) to bridge one AGP bus on another.

AGP is very tightly coupled with the main memory controller chip
so it is unlikely that there will be any dual AGP motherboard
until one of the big semi manufacturers puts that feature in a 
core chipset.

RGDS
GARY
---
Gary E. Miller Rellim 20340 Empire Ave, Suite E-3, Bend, OR 97701
[EMAIL PROTECTED]  Tel:+1(541)382-8588 Fax: +1(541)382-8676


Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Andrea Arcangeli


On Fri, Oct 20, 2000 at 10:44:40AM -0700, Linus Torvalds wrote:
> agree with your change, but I just suspect it will break drivers that have

you're right, it would break it, the driver should really somehow increase the
pagecount for each mapping with the PG_reserved removed (in the future that can
be easily done in the nopage callback).

Andrea

quota 2.2.x

2000-10-20 Thread octave klaba


Hi,
Running with 2.2.17
VFS: Diskquotas version dquot_6.4.0 initialized

We have a problem with quotas that affects only SOME users.
Is there a newer version of quota that fixes this kind of
bug?

thanks
Octave

#cat /etc/fstab | grep home
/dev/rd/c0d0p7  /home  ext2  defaults,usrquota  1 2

#/usr/sbin/repquota -a | grep cyberpc
cyberpc   --   8  24  24242 0 0   

#du -s /home/cyberpc
13676   /home/cyberpc

#ls -l /home/cyberpc
[...]
-rw---1 cyberpc  users  185465 oct 19 12:35 Mailbox
[...]


Best regards,
oCtAvE 

"It doesn't matter what is on the other side.
Everything we leave here is only a story
that will be remembered or not."

RE: cpia_usb cameras

2000-10-20 Thread Dunlap, Randy


> From: John M. Flinchbaugh [mailto:[EMAIL PROTECTED]]
> 
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> did something change in the past 2 months to break cpia_usb cameras?

2.4.0-test9 (and maybe test8) had some significant changes that broke
several USB drivers, including cpia USB.

Should be fixed in 2.4.0-test10-pre3.
If not, please report it.
Also, if there's a problem, please test with both of the
uhci host controller drivers.

~Randy

> the last i remember using my webcam was toward the end of august.
> it hasn't been working with test9 or test8 i don't think.
> 
> the device shows up just fine, the /proc/cpia/video0 is functional,
> and i can set the camera parameters.
> 
> none of my video apps (xawtv, v4lctl, motion 2.0) work.  motion 2.0
> blocks at what seems to be the ioctl(...VIDIOCMCAPTURE...).
> the record light on the camera even comes on just fine.
> i never do get a capture.
> 
> i'm using the uhci.o driver, as usb-uhci.o loads, but trying to
> capture sends the machine into a recurring oops of some sort.
> scrolling way too quickly to read.  power button is only recourse.
> 
> this is of course linux 2.4.0-test9.  i've been staying pretty
> up-to-date.
> 
> thanks.
> - -- 
> }John Flinchbaugh{__
> | [EMAIL PROTECTED] http://www.hjsoft.com/~glynis/ |



Re: [PATCH] 0-byte read()/write() behaviour

2000-10-20 Thread Linus Torvalds

On Fri, 20 Oct 2000, Philipp Rumpf wrote:
>
> Single Unix specifies that 0-byte reads, as well as 0-byte writes, should
> "return 0 and have no other results".  Our current implementation violates
> the first requirement and makes it very easy to violate the second one.

Note that there _are_ cases where 0-byte reads and writes have specific
meaning, notably there are some networking things where a 0-byte sendto()
does something special if I remember correctly. And I seem to remember
that this also _did_ translate into write().

I remember that Linux used to do exactly this, and we had to pass the
0-byte writes into the low-level cases exactly because some low-level
cases do care. I suspect SUS only talks about regular files.

Linus
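
The point that a 0-byte write is not always a no-op can be seen from user
space with a datagram socket. The following is a minimal sketch, assuming a
Linux-style AF_UNIX datagram socketpair rather than the specific sendto()
case mentioned above; the function name zero_byte_datagram is made up for
this illustration. It queues an empty datagram followed by a one-byte
datagram and reports what each recv() returns.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustration only: send a zero-length datagram, then a one-byte
 * datagram, over a local AF_UNIX socketpair and record what the two
 * recv() calls return. If the empty send were silently dropped, the
 * first recv() would already see the one-byte message.
 *
 * Returns 0 if both sends reported the expected lengths, -1 otherwise.
 */
int zero_byte_datagram(ssize_t *first, ssize_t *second)
{
    int sv[2];
    char buf[8];
    ssize_t s0, s1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    s0 = send(sv[0], "", 0, 0);   /* empty, but still a message */
    s1 = send(sv[0], "x", 1, 0);  /* ordinary one-byte message */

    /* MSG_DONTWAIT: never block if a datagram is unexpectedly missing */
    *first = recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT);
    *second = recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT);

    close(sv[0]);
    close(sv[1]);
    return (s0 == 0 && s1 == 1) ? 0 : -1;
}
```

On a system that permits zero-length datagrams, the first recv() reports a
length of 0 and the second reports 1, so the empty write was delivered as a
message of its own. A generic "return 0 early" check in the write path would
remove that message before the socket layer ever saw it.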


Re: bind() allowed to non-local addresses

2000-10-20 Thread Matt Peterson

David Woodhouse wrote:
> 
> [EMAIL PROTECTED] said:
> >   There is NOT a bug in the JVM code that handles java.net.DatagramSock
> > et.  Don't you find it a little compelling that the nearly identical
> > JVM code passes the Java Compatibility test suite on Linux 2.2,
> > Solaris, HPUX, SCO, and even Windows?
> 
> If the JVM spec says that it 'MUST' fail when used on a non-local address,
> and the POSIX spec for bind does not say that it 'MUST' fail, then yes,
> there is a bug in the JVM if it assumes that the two are compatible.

Does someone have a copy of the POSIX 1003.1g draft so this can be
verified?  This is the kind of ammunition I was talking about earlier
that I would need to convince Sun to change the compatibility test
suite.  However, if the 1003.1g draft even mentions failure with errno
set to EADDRNOTAVAIL in a "SHOULD" context, or if EADDRNOTAVAIL is
mentioned at all as an error code for non-local bind, then I am afraid
(given the widespread acceptance of this bind() behavior) Sun will not
change the test suite.  

> The fact that they just happen to behave the same in certain phases of the
> moon and on other operating systems is not relevant.

Huh?  Please give me one example of a sockets implementation (besides
Linux 2.4) where bind() does not fail if an attempt is made to bind
to a non-local address.  You're telling me that developers who are used to
seeing consistent behavior across OSes will think that the difference
in Linux 2.4 is irrelevant?  I don't think so. 

> We may decide that we want to pander to this brokenness, especially given
> the widespread nature of the false assumption that bind() will fail when
> given a non-local address. But that doesn't make the JVM non-broken.
> 

Are you also suggesting that every other program that expects bind() to
fail with EADDRNOTAVAIL is broken too?  Just for fun, I grepped all
sources of software shipped in Caldera's distributions for instances
where a check is made for EADDRNOTAVAIL after a call to bind().  Guess
what else besides Java is probably "broken" ...

- lpng
- bind 8.2
- automount
- cvs 
- dhcpd
- KDE
- UCL mbone
- ncftp
- netatalk
- nfsd
- rexec
- pppd
- sendmail
- xchat

... but the Linux kernel... Nope, it's not broken.  Let's email the
maintainers of all these projects and tell them that they have been
mistaken all this time in their understanding of how bind() should work and
see what kind of a response we get.

Matt
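
The kind of check being grepped for looks roughly like the following. This
is a minimal sketch, not code from any of the listed projects; the helper
name try_bind is invented for the illustration, and 192.0.2.1 in the usage
note is a documentation-range address assumed not to be configured locally.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Illustration only: try to bind a UDP socket to the given IPv4
 * address (port chosen by the kernel).
 *
 * Returns 0 if bind() succeeded, the errno value if socket() or
 * bind() failed, or -1 if the address string could not be parsed.
 */
int try_bind(const char *ip)
{
    struct sockaddr_in addr;
    int err = 0;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = 0;
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return errno;

    /* save errno before close() has a chance to overwrite it */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        err = errno;

    close(fd);
    return err;
}
```

A caller that relies on the behavior under discussion would test
try_bind("192.0.2.1") == EADDRNOTAVAIL. On a kernel that allows binding to
non-local addresses the same call returns 0, which is exactly the difference
the programs above would trip over.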

Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Jeff Garzik


Andrea Arcangeli wrote:
> > As to your rvmalloc()/rvfree() changes, I don't think they are safe as-is:
> > I think it's the right thing to do, but I don't trust the drivers to
> > maintain the right page counts. The code used to mark the pages as
> > reserved, which probably means that it hides bad drivers that do not do
> > proper reference counting - and I'm not willing to make that kind of
> > change at this point.
> 
> The page count of the mapped pages should be ok, it seems those mapped pages
> have a reference count of 1 just from the vmalloc allocation and they use
> PG_reserved just to skip swap_out, but I feel safer too if the bttv maintainers
> will check it and send it to you themselves after checking it's correct. (I only
> verified that it was compiling correctly)

In any case, we shouldn't modify videodev.c to call vma_reserve()... 
Let the driver's mmap operation do that or not do that, as it chooses.

Jeff



-- 
Jeff Garzik| The difference between laziness and
Building 1024  | prioritization is the end result.
MandrakeSoft   |

Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Linus Torvalds

On Fri, 20 Oct 2000, Andrea Arcangeli wrote:
> 
> The page count of the mapped pages should be ok, it seems those mapped pages
> have a reference count of 1 just from the vmalloc allocation and they use
> PG_reserved just to skip swap_out, but I feel safer too if the bttv maintainers
> will check it and send it to you themselves after checking it's correct. (I only
> verified that it was compiling correctly)

Note that the page count should _not_ be one: the page count should be
1+nr_of_mappings.

Basically, the "nopage()" function has to do a "get_page(page)". But if
the page is marked PG_reserved, that would hide a bug in a driver that
doesn't do that part.

Also, I suspect some drivers do the "remap_page_range() one page at a
time", and again exactly due to page count issues remap_page_range() will
refuse to touch pages that aren't marked PG_reserved. So I wholeheartedly
agree with your change, but I just suspect it will break drivers that have
depended on the fact that PG_reserved means that they can be lazy and not
bother about getting all the details right.

Not that I mind breaking drivers in general, but not right now.

Linus
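
The 1+nr_of_mappings rule can be illustrated with a small user-space toy
model. This is only a sketch: toy_page, toy_get_page and toy_put_page are
invented stand-ins and not the kernel's struct page or its helpers. The
model assumes the driver holds one reference from its allocation and that
tearing down each mapping drops one reference.

```c
/* Toy stand-in for a page with a reference count (not kernel code). */
struct toy_page {
    int count;  /* current reference count */
    int freed;  /* set once the count reaches zero */
};

static void toy_get_page(struct toy_page *p)
{
    p->count++;
}

static void toy_put_page(struct toy_page *p)
{
    if (--p->count == 0)
        p->freed = 1;
}

/*
 * Simulate a driver buffer page that is mapped nr_mappings times and
 * then unmapped again. The driver's own allocation accounts for the
 * initial count of 1. nopage_takes_ref says whether the (toy) fault
 * handler takes a reference for each mapping it hands out.
 *
 * Returns 1 if the page was freed while the driver still owned it.
 */
int freed_under_driver(int nr_mappings, int nopage_takes_ref)
{
    struct toy_page page = { 1, 0 };
    int i;

    for (i = 0; i < nr_mappings; i++)
        if (nopage_takes_ref)
            toy_get_page(&page);   /* count becomes 1 + mappings so far */

    for (i = 0; i < nr_mappings; i++)
        toy_put_page(&page);       /* each unmap drops one reference */

    return page.freed;
}
```

With the reference taken per mapping, the count returns to 1 after all
unmaps and the page survives. Without it, the first unmap already drops the
count to zero, which is the kind of driver bug a reserved page would mask in
this model.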


Re: Any dual AGP slot motherboards?

2000-10-20 Thread James Simmons



> > ** Reply to message from James Simmons <[EMAIL PROTECTED]> on Thu, 19 Oct 2000
> > 18:34:51 -0700 (PDT)
> > > Apple sells a computer with dual AGP slots.
> > I've never heard this. Could you tell me exactly which model this is?
> 
> I think he's confusing dualhead cards with dual agp slots.
> 
> I dont think *anyone* makes dual agp slots.

After much searching I couldn't find one. It was one of those mac rumors
people spread around. I'd still like to get more than one AGP going. If I
have multiple PCI buses, in theory I should be able to have one AGP port on
each PCI bus. Right? 


Re: mapping user space buffer to kernel address space

2000-10-20 Thread Linus Torvalds

On Thu, 19 Oct 2000, Stephen Tweedie wrote:
> > 
> > Then, we'd move the "writeout" part into the LRU queue side, and at that
> > point I agree with you 100% that we probably should just delay it until
> > there are no mappings available
> 
> I've just been talking about this with Ben LaHaise and Rik van Riel,
> and Ben brought up a nasty problem --- NFS, which because of its
> credentials requirements needs to have the struct file available in
> its writepage function.  Of course, if we defer the write then we
> don't necessarily have the file available when we come to flush the
> page from cache.

Yes. But that doesn't mean that swapping couldn't do it (swapping
fundamentally doesn't have credentials).

And note that this is not about "NFS is broken" - any remote filesystem
will have some issues like this, and shared mappings will always have to
handle this case.

So basically I agree that shared mappings cannot be converted to this
setup, I was only talking about the specific case of the swapping (and
anonymous shared memory, which along with SysV IPC shm is basically the
same thing and already uses the swap cache).

So what I was thinking of was the very end of try_to_swap_out(), where we
have noticed that we do not have a "swapout()" function, and we need to
add the page to the swap cache. I would suggest moving _that_ code to the
LRU queue, and handling it conceptually together with the stuff that
handles the buffer cache writeout.

--

And no, I haven't forgotten about the case of direct IO into a shared
mapping. That _is_ going to be different in many ways, and I suspect that
a solution to that particular issue may be to move the "vm_file"
information from when we do the virtual kiobuf lookup into the kiobuf's,
because otherwise we'd basically lose that information.

(We _already_ lose that information, in fact. Keeping the page in the
virtual mapping doesn't really even fix it - because the page can be in
multiple virtual mappings with different vm_file's and thus different
credentials. And the kiobuf's do not really contain any information of
_which_ of the credentials we looked up. It happens to work, but it's
conceptually not very correct).

Linus


Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Andrea Arcangeli

On Fri, Oct 20, 2000 at 10:10:05AM -0700, Linus Torvalds wrote:
> Sure. I have no problem at all with this suggestion: it's basically just a
> hint to the VM layer that trying to page something out in this vma is
> useless, as its backing store is in memory anyway.

Yes, that is _exactly_ the point.

> As to your rvmalloc()/rvfree() changes, I don't think they are safe as-is:
> I think it's the right thing to do, but I don't trust the drivers to
> maintain the right page counts. The code used to mark the pages as
> reserved, which probably means that it hides bad drivers that do not do
> proper reference counting - and I'm not willing to make that kind of
> change at this point.

The page count of the mapped pages should be ok, it seems those mapped pages
have a reference count of 1 just from the vmalloc allocation and they use
PG_reserved just to skip swap_out, but I feel safer too if the bttv maintainers
will check it and send it to you themselves after checking it's correct. (I only
verified that it was compiling correctly)

Thanks.

Andrea

Re: missing mxcsr initialization

2000-10-20 Thread Gabriel Paubert

On Fri, 20 Oct 2000, Andrea Arcangeli wrote:

> Many thanks to Doug and Gabriel for very useful explanations about this FPU
> stuff. I suggest Gabriel to submit his way faster and more correct tag word
> conversion function to Linus for 2.4.x.

Here is a first shot, twd_i387_to_fxsr is guaranteed to not add any
new bug (since there are only 65536 cases, I wrote a short test program
to check that the results are the same). 

For the other way around, I only corrected the most glaring error: that
the sign never affects the tag. Some obscure corner cases may still be
wrong, but at least now negative zero and positive infinity and NaNs are
handled correctly.  Performance could also be improved by handling
separately the common case of an empty FP stack.

Now the question: is it worth bloating the code to exactly return the tag
values of a 387 (I actually find that the Intel doc is confusing, what 
is the state of the tag word after an fnsave when MMX is in use, 0x
or 0x, not counting the bugs in this area) ? 

My feeling is that the only important information in the tag field is
empty/not empty. Does any code (debuggers, etc...) actually rely on the
tags for anything else ?

Patch relative to 2.4.0-test9.

Gabriel

= arch/i386/kernel/i387.c 1.1 vs edited =
--- 1.1/arch/i386/kernel/i387.c Tue Jun 27 20:14:20 2000
+++ edited/arch/i386/kernel/i387.c  Fri Oct 20 17:56:49 2000
@@ -79,16 +79,16 @@

 static inline unsigned short twd_i387_to_fxsr( unsigned short twd )
 {
-   unsigned short ret = 0;
-   int i;
-
-   for ( i = 0 ; i < 8 ; i++ ) {
-   if ( (twd & 0x3) != 0x3 ) {
-   ret |= (1 << i);
-   }
-   twd = twd >> 2;
-   }
-   return ret;
+   unsigned int tmp; /* to avoid 16 bit prefixes in the code */
+ 
+   /* Transform each pair of bits into 01 (valid) or 00 (empty) */
+tmp = ~twd;
+tmp = (tmp | (tmp>>1)) & 0x5555; /* 0V0V0V0V0V0V0V0V */
+/* and move the valid bits to the lower byte. */
+tmp = (tmp | (tmp >> 1)) & 0x3333; /* 00VV00VV00VV00VV */
+tmp = (tmp | (tmp >> 2)) & 0x0f0f; /* 0000VVVV0000VVVV */
+tmp = (tmp | (tmp >> 4)) & 0x00ff; /* 00000000VVVVVVVV */
+return tmp;
 }

 static inline unsigned long twd_fxsr_to_i387( struct i387_fxsave_struct *fxsave )
@@ -105,8 +105,8 @@
if ( twd & 0x1 ) {
st = (struct _fpxreg *) FPREG_ADDR( fxsave, i );

-   switch ( st->exponent ) {
-   case 0x:
+   switch ( st->exponent & 0x7fff ) {
+   case 0x7fff:
tag = 2;/* Special */
break;
case 0x:
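
The exhaustive check mentioned above (all 65536 tag words) can be sketched
in user space as follows. This is not the original test program. The
function names twd_loop, twd_bits and count_mismatches are made up, and the
masks 0x5555 and 0x3333 are assumptions inferred from the bit-pattern
comments in the patch.

```c
/* Loop version, as removed by the patch: one bit per non-empty tag. */
unsigned short twd_loop(unsigned short twd)
{
    unsigned short ret = 0;
    int i;

    for (i = 0; i < 8; i++) {
        if ((twd & 0x3) != 0x3)  /* tag 11 means empty */
            ret |= (1 << i);
        twd = twd >> 2;
    }
    return ret;
}

/* Bit-parallel version, following the steps of the patch. */
unsigned short twd_bits(unsigned short twd)
{
    unsigned int tmp;

    /* Transform each pair of bits into 01 (valid) or 00 (empty) */
    tmp = ~(unsigned int)twd;
    tmp = (tmp | (tmp >> 1)) & 0x5555;  /* 0V0V0V0V0V0V0V0V */
    /* and move the valid bits to the lower byte. */
    tmp = (tmp | (tmp >> 1)) & 0x3333;  /* 00VV00VV00VV00VV */
    tmp = (tmp | (tmp >> 2)) & 0x0f0f;  /* 0000VVVV0000VVVV */
    tmp = (tmp | (tmp >> 4)) & 0x00ff;  /* 00000000VVVVVVVV */
    return (unsigned short)tmp;
}

/* Compare both versions on every 16-bit input; return mismatch count. */
unsigned int count_mismatches(void)
{
    unsigned int twd, bad = 0;

    for (twd = 0; twd <= 0xffff; twd++)
        if (twd_loop((unsigned short)twd) != twd_bits((unsigned short)twd))
            bad++;
    return bad;
}
```

A result of 0 from count_mismatches() means the two versions agree on every
possible tag word, which is the property claimed for twd_i387_to_fxsr.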


Bootable RAID

2000-10-20 Thread Anil kumar


Hi,
 I want to make a bootable RAID.
 Can I use a partition on the disk that I want to
 boot from, or must I use the whole disk?

 What I mean is this:
  In the lilo.conf file I have:
   disk = /dev/md0
   boot = /dev/hdc

  Is this all I can do? Or can I make a partition
  on /dev/hdc and specify that instead?
  In that case the lilo.conf file would look like this:
  disk = /dev/md0
  boot = /dev/hdc1
  
  Can I use a partition of the disk as boot?

 Looking forward to your reply.

with regards,
  Anil
  


Re: VM_RESERVED [was Re: mapping user space buffer to kernel address space]

2000-10-20 Thread Linus Torvalds

On Fri, 20 Oct 2000, Andrea Arcangeli wrote:
> 
> I'm fine to drop remap_page_range and the PG_reserved bit, but to do that I'd
> suggest to add a new per-VMA VM_RESERVED bitflags.

Sure. I have no problem at all with this suggestion: it's basically just a
hint to the VM layer that trying to page something out in this vma is
useless, as its backing store is in memory anyway.

Applied.

As to your rvmalloc()/rvfree() changes, I don't think they are safe as-is:
I think it's the right thing to do, but I don't trust the drivers to
maintain the right page counts. The code used to mark the pages as
reserved, which probably means that it hides bad drivers that do not do
proper reference counting - and I'm not willing to make that kind of
change at this point.

Linus


Re: 2.4.0-test10-pre3:Oops in mm/filemap.c:filemap_write_pa

2000-10-20 Thread Linus Torvalds

On Fri, 20 Oct 2000, Trond Myklebust wrote:
> 
> The problem lies with writes that haven't yet been msync()ed (and
> hence do not have writebacks). For shared mappings, one should perhaps
> schedule an automatic msync() of the dirty pages (???). For private
> mappings, perhaps the best thing would be to defer the read?

Note that NONE of this is going to happen for 2.4.x.

We've never _ever_ done this before, there's no point in even suggesting
that this is suddenly a "critical" bug. It's not.

I want to know what the suggestion for 2.4.x is. Right now that's the "if
the count is elevated, we don't invalidate". 

Quite frankly, I don't see any other option. Doing the !Uptodate version
will lose local data as it stands now - in fact right now you'd lose data
that way even if you are the only client accessing the file, which is
obviously complete crap and _completely_ unacceptable.

I'm open to suggestions, but I haven't heard anything realistic.

Linus

