Re: sysfs: write returns ENOMEM?

2005-08-23 Thread Pekka Enberg
On 8/23/05, Nathan Scott [EMAIL PROTECTED] wrote:
 FWIW, all filesystems using the generic page cache routines are able
 to return this - see mm/filemap.c - generic_file_buffered_write...

I don't think it makes much sense to fix this in individual
filesystems as many functions returning -ENOMEM can be used in other
paths as well where they're ok.

Andrew, please consider picking this up for -mm. (I've included it as
an attachment as well as gmail will surely mess up the patch. Sorry.)

   Pekka

[PATCH] VFS: return ENOBUFS instead of ENOMEM for vfs_write()

As noticed by Dmitry Torokhov, write() can not return ENOMEM:

http://www.opengroup.org/onlinepubs/95399/functions/write.html

Currently almost all filesystems can return -ENOMEM due to
generic_file_buffered_write() in mm/filemap.c so filter out the invalid
error code in vfs_write().

Signed-off-by: Pekka Enberg [EMAIL PROTECTED]
---

 read_write.c |2 ++
 1 files changed, 2 insertions(+)

Index: 2.6-mm/fs/read_write.c
===
--- 2.6-mm.orig/fs/read_write.c
+++ 2.6-mm/fs/read_write.c
@@ -310,6 +310,8 @@ ssize_t vfs_write(struct file *file, con
 		}
 	}
 
+	if (ret == -ENOMEM)
+		ret = -ENOBUFS;
 	return ret;
 }


Re: sysfs: write returns ENOMEM?

2005-08-23 Thread Andrew Morton
Pekka Enberg [EMAIL PROTECTED] wrote:

 @@ -310,6 +310,8 @@ ssize_t vfs_write(struct file *file, con
  		}
  	}
 
 +	if (ret == -ENOMEM)
 +		ret = -ENOBUFS;
  	return ret;
  }
 

That's lame.  It'd be better to hunt down all the -ENOMEMs and fix them up.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: sysfs: write returns ENOMEM?

2005-08-23 Thread Pekka J Enberg

Andrew Morton writes:

That's lame.  It'd be better to hunt down all the -ENOMEMs and fix them up.


So there's our verdict. Thanks, Andrew :-) 

  Pekka 
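
Seen from user space, the errno question above comes down to which value a failed write() leaves behind. Whichever value the kernel ends up returning, a caller can treat ENOMEM and ENOBUFS alike. A minimal sketch, assuming a hypothetical helper named write_error_is_resource_shortage() (an illustration, not part of any kernel or libc API):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only: report whether an errno
 * value left behind by a failed write() means "the system ran short of
 * memory or buffer space", as opposed to a problem with the file or
 * the arguments.
 */
static bool write_error_is_resource_shortage(int err)
{
	assert(err >= 0);	/* errno values are positive; a kernel-style -E* is a caller bug */

	switch (err) {
	case ENOMEM:	/* what generic_file_buffered_write() could return before the patch */
	case ENOBUFS:	/* what the proposed patch returns instead */
		return true;
	default:
		return false;
	}
}
```

A caller would pass errno right after write() returns -1, and back off or retry when the helper returns true.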


Re: 2.6.12 Performance problems

2005-08-23 Thread Helge Hafting

Danial Thom wrote:

 --- Jesper Juhl [EMAIL PROTECTED] wrote:

  On 8/21/05, Danial Thom [EMAIL PROTECTED] wrote:

   I just started fiddling with 2.6.12, and there seems to be a big
   drop-off in performance from 2.4.x in terms of networking on a
   uniprocessor system. Just bridging packets through the machine,
   2.6.12 starts dropping packets at ~100Kpps, whereas 2.4.x doesn't
   start dropping until over 350Kpps on the same hardware (2.0Ghz
   Opteron with e1000 driver). This is pitiful performance for this
   hardware. I've increased the rx ring in the e1000 driver to 512
   with little change (interrupt moderation is set to 8000
   Ints/second). Has tuning for MP destroyed UP performance
   altogether, or is there some tuning parameter that could make a
   4-fold difference? All debugging is off and there are no messages
   on the console or in the error logs. The kernel is the standard
   kernel.org download config with SMP turned off and the intel
   ethernet card drivers as modules without any other changes, which
   is exactly the config for my 2.4 kernels.

  If you have preemption enabled you could disable it. Low latency
  comes at the cost of decreased throughput - can't have both. Also
  try using a HZ of 100 if you are currently using 1000, that should
  also improve throughput a little at the cost of slightly higher
  latencies.

  I doubt that it'll do any huge difference, but if it does, then
  that would probably be valuable info.

 Ok, well you'll have to explain this one:

 "Low latency comes at the cost of decreased throughput - can't have
 both"

Configuring preempt gives lower latency, because then
almost anything can be interrupted (preempted).  You can then
get very quick responses to some things, e.g. interrupts and such.

The cost comes because _something_ was interrupted, something
that would otherwise have run to completion first in a kernel built
without preempt.  So that other thing, whatever it is, got slower.

And the problem is bigger than things merely happening in a different
order.  Switching the cpu from one job to another has a big overhead.
In particular, the cpu caches have to be refilled more often, which
takes time.  Running one big job to completion fills the cache with
that job's data _once_.  If the job is preempted a couple of times you
have to bring it into cache three times instead, and that will cost
you, performance-wise.

This is not _necessarily_ your problem, but trying a 2.6 kernel without
preempt and with hz=100 (both things configurable through normal kernel
configuration) will clearly show if this is the problem in your case.
If you're lucky, this is all you need to get your performance back.  If
not, then at least it is an important datapoint for those trying to
figure it out.  Nobody here wants 2.6 to have 1/4 of the performance
of 2.4!

Helge Hafting


[PATCH] mm: return ENOBUFS instead of ENOMEM in generic_file_buffered_write

2005-08-23 Thread Pekka J Enberg
As noticed by Dmitry Torokhov, write() can not return ENOMEM:

http://www.opengroup.org/onlinepubs/95399/functions/write.html

Therefore fixup generic_file_buffered_write() in mm/filemap.c (pointed out by
Nathan Scott).

Signed-off-by: Pekka Enberg [EMAIL PROTECTED]
---

 filemap.c |2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: 2.6-mm/mm/filemap.c
===
--- 2.6-mm.orig/mm/filemap.c
+++ 2.6-mm/mm/filemap.c
@@ -1942,7 +1942,7 @@ generic_file_buffered_write(struct kiocb
 
 		page = __grab_cache_page(mapping,index,cached_page,lru_pvec);
 		if (!page) {
-			status = -ENOMEM;
+			status = -ENOBUFS;
 			break;
 		}
 


Re: 2.6.13-rc6-mm2

2005-08-23 Thread Grant . Coady
On Mon, 22 Aug 2005 21:30:21 -0700, Andrew Morton [EMAIL PROTECTED] wrote:


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/

- Various updates.  Nothing terribly noteworthy.

adm9240 i2c still broken, spamming debug with:
Aug 23 18:48:40 peetoo kernel: [ 1591.151460] i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.151834] i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.170515] i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.170881] i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.189837] i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.190217] i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.208927] i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00
Aug 23 18:48:40 peetoo kernel: [ 1591.209296] i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=2c, ADD=5a, DAT0=00, DAT1=00

This starts as soon as I write to sysfs.  I don't know where to start
looking: this is the adm9240 driver, which works in 2.6.13-rc6-git12 but
not in -mm1 or -mm2.  The terminal is lost, but I am able to log in on
another terminal.  -mm2 was okay until I wrote to sysfs.  With -mm1 it
failed on reading the sysfs area as well, so there's a little progress.

top:
top - 18:52:07 up 29 min,  2 users,  load average: 0.99, 0.62, 0.26
Tasks:  50 total,   3 running,  47 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3% us,  0.0% sy,  0.0% ni, 99.7% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:    515360k total,   146504k used,   368856k free,    15932k buffers
Swap:   514000k total,        0k used,   514000k free,   109296k cached

Grant.



Re: use of uninitialized pointer in jffs_create()

2005-08-23 Thread Jörn Engel
On Tue, 23 August 2005 01:07:58 +0200, Adrian Bunk wrote:
 On Mon, Aug 22, 2005 at 12:45:59PM +0200, Jörn Engel wrote:
  On Sun, 21 August 2005 00:28:08 +0200, Jesper Juhl wrote:
   
   gcc kindly pointed me at jffs_create() with this warning : 
   
   fs/jffs/inode-v23.c:1279: warning: `inode' might be used uninitialized
   in this function
  
  Real fix would be to finally remove that code.  Except for the usual
  "change this function in the whole kernel" stuff, no one has touched it
  for ages.
 
 That's wrong, this -mm specific bug comes from git-ocfs2.patch.

Ack.  If I wasn't this lazy, I'd still propose to completely remove
jffs - it's been old and deprecated for a few years already.

Jörn

-- 
Public Domain  - Free as in Beer
General Public - Free as in Speech
BSD License    - Free as in Enterprise
Shared Source  - Free as in Work will make you...


mass tulip_stop_rxtx() failed, network stops

2005-08-23 Thread Tomasz Chmielewski
We are running almost 20 Fujitsu-Siemens Scenic machines with the 2.6.8.1
kernel, equipped with an onboard card that uses the tulip module:

02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)

No problem with those.

We are running four more machines like that; the only difference is the
kernel they are running (2.6.11.4).

On some of them, there are serious problems with the network, and they
usually happen when the traffic is heavier than usual (e.g., some big
software deployment to several workstations, remote backup, etc.).

The syslog is then full of entries like this:

Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

and they fill the logs for hours; the network doesn't work anymore, and
someone has to restart the network or the machine itself.

It doesn't always happen under heavy traffic - sometimes you can fill the
100 Mbit link and do lots of reads from the disk, but nothing bad
happens for hours.

I saw some posts on this issue ("2.6.10-rc3: tulip-driver:
tulip_stop_rxtx() failed"), but it seemed to me that they weren't similar
to my problem; I looked into the 2.6.10 kernel changelog, but there was
no description of that problem, either.

Any help appreciated, because rebooting machines which are 500 km away
and are not responding is no fun :)



--
Tomek
http://wpkg.org



Re: VIA Rhine ethernet driver bug (reprise...)

2005-08-23 Thread Denis Vlasenko
[CCing maintainer]

On Monday 22 August 2005 20:29, Udo van den Heuvel wrote:
 Hello,
 
 It appears that the VIA Rhine chipset has some sort of bug which shows
 up in both the standard Linux VIA-Rhine driver and the Rhinefet driver
 that VIA itself provides.
 
 The difference is that the connection is dropped in case of the standard
 Linux driver for VIA Rhine but that the connection remains OK with the
 Rhinefet driver provided by VIA
 (http://www.viaarena.com/downloads/Source/rhinefet.tgz and other places
 on viaarena.com...).
 So VIA Rhinefet driver consumes more CPU but is also more stable.
 
 I wrote about this issue before: http://lkml.org/lkml/2005/8/7/82 
 http://lkml.org/lkml/2005/1/15/47 etc.
 I opened a bugzilla case: http://bugzilla.kernel.org/show_bug.cgi?id=5030
 
 Who could find out why the standard Linux driver chokes and the Rhinefet
 driver doesn't? Who could fix this bug?

My suggestion was, and still is:

Since it happens less than once a day, why not just add code
to reset the NIC completely in this case, as is typically done
in the tx_timeout handlers of many NICs, and forget about it?

Do you see any problems in this approach?
--
vda



Re: Problem with MMC card reader

2005-08-23 Thread Vid Strpic
On Mon, Aug 22, 2005 at 11:16:46PM -0600, ScytheBlade1 wrote:
 I've enabled everything needed...the CF port works flawlessly. However,
 the SD slot does *not*. I've got about 5+ pages worth of dmesg output
 related to this (MMC is NOT debug enabled, and I still get a disturbing
 amount of output). Would here be the best place to post this, or would a
 different list be better? (Recommendations as to which list are welcomed
 :)).

CONFIG_SCSI_MULTI_LUN=y

This is probably the answer you needed... because the reader is a
single device with multiple slots, and the usb_storage driver uses the
SCSI infrastructure, so multiple slots map to multiple SCSI LUNs.

At least, it has worked for all my readers over the years :))

-- 
   [EMAIL PROTECTED], IRC:[EMAIL PROTECTED], /bin/zsh. C|NK
Linux moria 2.6.11 #1 Wed Mar 9 19:08:59 CET 2005 i686
 11:23:27 up 27 days,  3:56,  4 users,  load average: 0.11, 0.23, 0.29
We are Microsoft. First we'll reboot, and then asimilate you.


Re: IRQ problem with PCMCIA

2005-08-23 Thread Andre Hedrick

Alan,

The old code can be fixed; I just don't have the time or any desire to
look at it again, still.  The burnout from the last round of issues in
2001-2003 cost me some health problems from the stress.

If I encounter these problems and become annoyed enough, I will fix it.
However, if it is cheaper to buy working hardware, that is the route I
will take.

You (Alan), if anyone, know that anything can be done in Linux; otherwise
none of us would have ever put this much effort into its success.

Cheers,

Andre


On Mon, 22 Aug 2005, Alan Cox wrote:

 On Llu, 2005-08-22 at 11:25 +0200, Bartlomiej Zolnierkiewicz wrote:
  CardBus IDE devices work just fine but there are still issues with
  hotplug support (work in progress).
 
 work in progress. Yes because I submitted working IDE cardbus hotplug
 support, and Mark Lord submitted a Delkin driver both of which worked
 months ago rather nicely and neither of which hit the Bartlomiej stone
 wall and never got in and are now stale patches.
 
   up ever getting those into the kernel. Please wait instead for the new
   SATA/ATA layer to develop hotplug support.
  
  This is just FUD to discourage people from working on IDE drivers.
  Alan is doing this on purpose and doesn't really want to improve things.
 
 It's a realistic assessment based upon over ten years working on the
 Linux kernel. I do not believe you are capable of fixing the old IDE
 code. But don't take that personally; I am sceptical that anyone can fix
 the old IDE code.
 
 Alan
 


RE: [2.6 patch] cris: extern inline -> static inline

2005-08-23 Thread Mikael Starvik
Ok, I've made a test compile and the resulting image size is similar so
the patch is good.

Acked-by: Mikael Starvik [EMAIL PROTECTED]

/Mikael

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Adrian Bunk
Sent: Monday, August 22, 2005 1:55 AM
To: Mikael Starvik
Cc: dev-etrax; linux-kernel@vger.kernel.org
Subject: [2.6 patch] cris: extern inline -> static inline


extern inline doesn't make much sense.


Signed-off-by: Adrian Bunk [EMAIL PROTECTED]

---

 arch/cris/arch-v10/README.mm|6 +--
 arch/cris/arch-v10/kernel/signal.c  |2 -
 arch/cris/arch-v32/kernel/signal.c  |2 -
 arch/cris/mm/ioremap.c  |2 -
 include/asm-cris/arch-v10/byteorder.h   |4 +-
 include/asm-cris/arch-v10/checksum.h|2 -
 include/asm-cris/arch-v10/delay.h   |2 -
 include/asm-cris/arch-v10/ide.h |8 ++--
 include/asm-cris/arch-v10/system.h  |8 ++--
 include/asm-cris/arch-v10/thread_info.h |2 -
 include/asm-cris/arch-v10/timex.h   |2 -
 include/asm-cris/arch-v10/uaccess.h |4 +-
 include/asm-cris/arch-v32/bitops.h  |   10 ++---
 include/asm-cris/arch-v32/byteorder.h   |4 +-
 include/asm-cris/arch-v32/checksum.h|2 -
 include/asm-cris/arch-v32/delay.h   |2 -
 include/asm-cris/arch-v32/ide.h |4 +-
 include/asm-cris/arch-v32/io.h  |6 +--
 include/asm-cris/arch-v32/system.h  |6 +--
 include/asm-cris/arch-v32/thread_info.h |2 -
 include/asm-cris/arch-v32/timex.h   |2 -
 include/asm-cris/arch-v32/uaccess.h |4 +-
 include/asm-cris/atomic.h   |   22 ++--
 include/asm-cris/bitops.h   |   18 -
 include/asm-cris/checksum.h |8 ++--
 include/asm-cris/current.h  |2 -
 include/asm-cris/delay.h|2 -
 include/asm-cris/io.h   |6 +--
 include/asm-cris/irq.h  |2 -
 include/asm-cris/pgalloc.h  |   12 +++---
 include/asm-cris/pgtable.h  |   44 
 include/asm-cris/processor.h|4 +-
 include/asm-cris/semaphore-helper.h |8 ++--
 include/asm-cris/semaphore.h|   14 +++
 include/asm-cris/system.h   |2 -
 include/asm-cris/timex.h|2 -
 include/asm-cris/tlbflush.h |4 +-
 include/asm-cris/uaccess.h  |   24 ++---
 include/asm-cris/unistd.h   |   20 +-
 39 files changed, 140 insertions(+), 140 deletions(-)

--- linux-2.6.13-rc6-mm1-full/arch/cris/arch-v10/README.mm.old	2005-08-22 01:38:14.0 +0200
+++ linux-2.6.13-rc6-mm1-full/arch/cris/arch-v10/README.mm	2005-08-22 01:38:36.0 +0200
@@ -177,7 +177,7 @@
 Given the top-level Page Directory, the offset in that directory is calculated
 using the upper 8 bits:
 
-extern inline pgd_t * pgd_offset(struct mm_struct * mm, unsigned long address)
+static inline pgd_t * pgd_offset(struct mm_struct * mm, unsigned long address)
 {
 	return mm->pgd + (address >> PGDIR_SHIFT);
 }
@@ -190,14 +190,14 @@
 
 Since the Middle Directory does not exist, it is a unity mapping:
 
-extern inline pmd_t * pmd_offset(pgd_t * dir, unsigned long address)
+static inline pmd_t * pmd_offset(pgd_t * dir, unsigned long address)
 {
 	return (pmd_t *) dir;
 }
 
 The Page Table provides the final lookup by using bits 13 to 23 as index:
 
-extern inline pte_t * pte_offset(pmd_t * dir, unsigned long address)
+static inline pte_t * pte_offset(pmd_t * dir, unsigned long address)
 {
 	return (pte_t *) pmd_page(*dir) + ((address >> PAGE_SHIFT) &
 					   (PTRS_PER_PTE - 1));
--- linux-2.6.13-rc6-mm1-full/arch/cris/arch-v10/kernel/signal.c.old	2005-08-22 01:38:45.0 +0200
+++ linux-2.6.13-rc6-mm1-full/arch/cris/arch-v10/kernel/signal.c	2005-08-22 01:38:54.0 +0200
@@ -476,7 +476,7 @@
  * OK, we're invoking a handler
  */
 
-extern inline void
+static inline void
 handle_signal(int canrestart, unsigned long sig,
 	      siginfo_t *info, struct k_sigaction *ka,
 	      sigset_t *oldset, struct pt_regs * regs)
--- linux-2.6.13-rc6-mm1-full/arch/cris/arch-v32/kernel/signal.c.old	2005-08-22 01:39:03.0 +0200
+++ linux-2.6.13-rc6-mm1-full/arch/cris/arch-v32/kernel/signal.c	2005-08-22 01:39:09.0 +0200
@@ -513,7 +513,7 @@
 }
 
 /* Invoke a singal handler to, well, handle the signal. */
-extern inline void
+static inline void
 handle_signal(int canrestart, unsigned long sig,
 	      siginfo_t *info, struct k_sigaction *ka,
 	      sigset_t *oldset, struct pt_regs * regs)
--- linux-2.6.13-rc6-mm1-full/arch/cris/mm/ioremap.c.old	2005-08-22 01:39:18.0 +0200
+++ linux-2.6.13-rc6-mm1-full/arch/cris/mm/ioremap.c	2005-08-22 01:39:23.0 +0200
@@ -16,7 +16,7 @@
 #include <asm/tlbflush.h>
 #include <asm/arch/memmap.h>
 
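
As background for the conversion: under GCC's traditional (gnu89) rules, an extern inline definition is used only for inlining and no out-of-line copy is emitted, so a call that does not get inlined needs a separate definition elsewhere. A static inline helper falls back to a private copy in each translation unit instead. A minimal sketch in plain C, assuming hypothetical stand-in types (demo_pgd_t, demo_pmd_t) and a hypothetical helper demo_pmd_offset() modelled on the README.mm hunk, not the real CRIS definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real CRIS definitions. */
typedef struct { unsigned long val; } demo_pgd_t;
typedef struct { unsigned long val; } demo_pmd_t;

/*
 * Header-style helper declared static inline: the compiler may inline it,
 * and if it does not, it emits a private copy in this translation unit, so
 * no separate out-of-line definition is needed.
 */
static inline demo_pmd_t *demo_pmd_offset(demo_pgd_t *dir, unsigned long address)
{
	(void)address;		/* unused: the middle level is folded into the top level */
	assert(dir != NULL);
	return (demo_pmd_t *)dir;
}
```

Because the helper is static, including the same header from several .c files cannot produce duplicate-symbol link errors.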

Re: mass tulip_stop_rxtx() failed, network stops

2005-08-23 Thread jerome lacoste
On 8/23/05, Tomasz Chmielewski [EMAIL PROTECTED] wrote:
 We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1
 kernel, equipped with a onboard card that uses a tulip module:
 
 02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
 Ethernet 10/100 (rev 11)
 
 No problem with those.
 
 
 We are running four more machines like that, the only difference is the
 kernel they are running (2.6.11.4).
 
 On some of them, there are serious problems with a network, and they
 usually happen when the traffic is bigger than usual (i.e., some big
 software deployment to several workstations, remote backup, etc.).
 
 The syslog is then full of entries like that:
 
 Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
 timed out
 Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

I am seeing thousands of tulip_stop_rxtx() failed messages as well
with 2.6.11. No regular network failure though.

See http://kerneltrap.org/mailarchive/1/message/110291/flat

Cheers,

Jerome


kernel module seg fault

2005-08-23 Thread manomugdha biswas
Hi,
I have written a kernel module and I can load it (insmod)
without any error. But when I run my module it segfaults
at interruptible_sleep_on_timeout().

I have used this function in the following way:

DECLARE_WAIT_QUEUE_HEAD(wq);
init_waitqueue_head(&wq);
interruptible_sleep_on_timeout(&wq, 2);

I am using Red Hat 9.0 with kernel version 2.4.20-8.
Could you please shed some light on this issue?

Manomugdha Biswas


Re: mass tulip_stop_rxtx() failed, network stops

2005-08-23 Thread Tomasz Chmielewski

jerome lacoste wrote:

 On 8/23/05, Tomasz Chmielewski [EMAIL PROTECTED] wrote:

  We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1
  kernel, equipped with a onboard card that uses a tulip module:

  02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
  Ethernet 10/100 (rev 11)

  No problem with those.

  We are running four more machines like that, the only difference is the
  kernel they are running (2.6.11.4).

  On some of them, there are serious problems with a network, and they
  usually happen when the traffic is bigger than usual (i.e., some big
  software deployment to several workstations, remote backup, etc.).

  The syslog is then full of entries like that:

  Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
  timed out
  Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

 I am seeing thousands of tulip_stop_rxtx() failed messages as well
 with 2.6.11. No regular network failure though.

 See http://kerneltrap.org/mailarchive/1/message/110291/flat

Lucky you.
Really no network problems, no increased ping response times?
For me lots of pings are lost, and when this "tulip_stop_rxtx() failed"
happens, the round-trip time for a ping can be as high as 14 seconds on
a 100 Mbit LAN.




--
Tomek
http://wpkg.org


Re: [RFC: 2.6 patch] fs/super.c: unexport user_get_super

2005-08-23 Thread Christoph Hellwig
On Mon, Aug 22, 2005 at 06:20:56PM +0200, Adrian Bunk wrote:
 I didn't find any modular usage in the kernel.

And there shouldn't be one either.  This is really just for some syscalls,
everything else should use get_super based on a struct block_device. If
there's any caller using this wrongly in out of tree modules they can
be switched to bdget + get_super trivially (fixing their code would be
even better).

 Signed-off-by: Adrian Bunk [EMAIL PROTECTED]
 
 ---
 
 This patch was already sent on:
 - 30 May 2005
 - 13 May 2005
 - 1 May 2005
 - 23 Apr 2005
 
 --- linux-2.6.12-rc2-mm3-full/fs/super.c.old  2005-04-23 02:45:59.0 +0200
 +++ linux-2.6.12-rc2-mm3-full/fs/super.c  2005-04-23 02:46:07.0 +0200
 @@ -467,8 +467,6 @@
   return NULL;
  }
  
 -EXPORT_SYMBOL(user_get_super);
 -
  asmlinkage long sys_ustat(unsigned dev, struct ustat __user * ubuf)
  {
  struct super_block *s;
 


Re: [PATCH 0/2] external interrupts

2005-08-23 Thread Christoph Hellwig
On Mon, Aug 22, 2005 at 02:43:30PM -0700, Andrew Morton wrote:
  Laughter was not wholly unexpected, though I wasn't joking.  I'm trying
  to be realistic about the lifetime of any given hardware, and IOC4 is
  several years old at this point.  Couple that with a sincere desire to
  preserve application source compatibility when (not if) new hardware
  appears, and an abstraction layer seemed to be a logical choice.  I'm
  more than happy to discuss problems in the abstraction layer's interface
  and make appropriate changes -- I'm nothing if not obliging.
 
 Having an abstraction layer for a single client driver does seem a bit
 pointless.  It would become more pointful if other client drivers were to
 pop up.

The Octane port will hopefully soon support external interrupts on the
ioc3, so this does make sense.



Re: Linux AIO status todo

2005-08-23 Thread Jakub Jelinek
On Tue, Aug 23, 2005 at 01:14:38PM +0530, Suparna Bhattacharya wrote:

   2. No support for propagating IO completion events to user space
  threads using RT signals. User threads need to poll the completion
  queue using io_getevents. POSIX specifies that when an AIO
  request completes, a signal can be delivered to the application
  to indicate the completion of the IO.

POSIX AIO needs to handle SIGEV_NONE, SIGEV_SIGNAL and SIGEV_THREAD
notification.  Obviously kernel shouldn't create threads for SIGEV_THREAD
itself, as kernel shouldn't hardcode all the implementation details how a
thread can be created.  But it would be good if AIO signalling e.g. handled
both SIGEV_SIGNAL and SIGEV_SIGNAL | SIGEV_THREAD_ID, with the same usage as
e.g. timer_* syscalls.  If kernel makes sure SI_ASYNCIO si_code is set in
the notification signal siginfos, glibc could even use just one helper
thread for timer_*/[al]io_* and maybe in the future other SIGEV_THREAD 
notification.

Jakub
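
For reference, the SIGEV_SIGNAL case discussed above is requested from user space through a struct sigevent. A minimal sketch, assuming a hypothetical helper named request_completion_signal() (an illustration, not a libc or kernel function); it only fills in the structure and does not submit any I/O:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <string.h>

/*
 * Illustration only: fill in a struct sigevent that requests a signal
 * (SIGEV_SIGNAL) carrying a caller-supplied cookie when the operation
 * it is attached to completes.
 */
static void request_completion_signal(struct sigevent *sev, int signo, void *cookie)
{
	assert(sev != NULL);

	memset(sev, 0, sizeof(*sev));
	sev->sigev_notify = SIGEV_SIGNAL;	/* deliver a signal on completion */
	sev->sigev_signo = signo;		/* which signal to deliver */
	sev->sigev_value.sival_ptr = cookie;	/* handed back to the handler in si_value */
}
```

With POSIX AIO, the filled-in structure would be stored in the aio_sigevent member of a struct aiocb before the request is submitted; a handler installed with SA_SIGINFO can then inspect si_code and si_value.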


Re: IRQ problem with PCMCIA

2005-08-23 Thread Alan Cox
On Maw, 2005-08-23 at 09:49 +0200, Erik Mouw wrote:
 Is there any place where we can get your current patches?

Which ones - the PATA IDE ones are in 2.6.11-ac, a subset in Fedora
(other changes in the core IDE code make forward porting stuff for
hotplug really tricky past 2.6.11).

The SATA ones I can certainly put up if there is interest. I don't want
to put them somewhere too available yet because this right now is stuff
you only want to use under controlled circumstances for development
until both they and the core SATA layer have some improvements.

Alan



Re: mass tulip_stop_rxtx() failed, network stops

2005-08-23 Thread Tomasz Chmielewski

jerome lacoste wrote:

 On 8/23/05, Tomasz Chmielewski [EMAIL PROTECTED] wrote:

  (...)

  We are running four more machines like that, the only difference is the
  kernel they are running (2.6.11.4).

  On some of them, there are serious problems with a network, and they
  usually happen when the traffic is bigger than usual (i.e., some big
  software deployment to several workstations, remote backup, etc.).

  The syslog is then full of entries like that:

  Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
  timed out
  Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

 I am seeing thousands of tulip_stop_rxtx() failed messages as well
 with 2.6.11. No regular network failure though.

 See http://kerneltrap.org/mailarchive/1/message/110291/flat

This may have something to do with this patch, introduced with 2.6.10
(see the ChangeLog-2.6.10).
It would explain why I had no problems on ~20 machines with the 2.6.8.1
kernel, and I have this issue on the machines with the 2.6.11.5 kernel.




[PATCH] tulip: make tulip_stop_rxtx() wait for DMA to fully stop

From: John W. Linville [EMAIL PROTECTED]

tulip_stop_rxtx() doesn't wait for DMA to fully stop like the function
call name implies.

This was submitted through my employer -- I am not the original author
of this patch.  However, I passed it by Jeff Garzik and he expressed
interest in having it upstream.



--
Tomek
http://wpkg.org


Re: [Alsa-devel] [2.6 patch] sound/core/memalloc.c: fix PROC_FS=n compilation

2005-08-23 Thread Takashi Iwai
At Tue, 23 Aug 2005 03:24:25 +0200,
Adrian Bunk wrote:
 
 On Mon, Aug 22, 2005 at 02:41:07PM +0200, Takashi Iwai wrote:
 ...
  I think the below is simpler.
 
 Looks good.

OK, it's now on ALSA tree.

Thanks.


Takashi


Re: [PATCH] Add MCE resume under ia32

2005-08-23 Thread Pavel Machek
Hi!

 It's widely seen a MCE non-fatal error reported after resume. It seems
 MCE resume is lacked under ia32. This patch tries to fix the gap.

Well, your patch seems like the missing piece of the puzzle, but:

a) we probably want to do it for x86-64, too, and 

b)

 diff -puN arch/i386/power/cpu.c~mcheck_resume arch/i386/power/cpu.c
 --- linux-2.6.13-rc6/arch/i386/power/cpu.c~mcheck_resume  2005-08-23 09:32:13.054008584 +0800
 +++ linux-2.6.13-rc6-root/arch/i386/power/cpu.c   2005-08-23 09:41:54.992540480 +0800
 @@ -104,6 +104,8 @@ static void fix_processor_context(void)
  
  }
  
 +extern void mcheck_init(struct cpuinfo_x86 *c);
 +
  void __restore_processor_state(struct saved_context *ctxt)
  {
   /*


this should go to some header file and most importantly

 @@ -138,6 +140,9 @@ void __restore_processor_state(struct sa
   fix_processor_context();
   do_fpu_end();
   mtrr_ap_init();
 +#ifdef CONFIG_X86_MCE
 + mcheck_init(&boot_cpu_data);
 +#endif
  }

c) can't we register MCEs like some kind of system device so that these
kinds of hooks are not necessary?
Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address


Re: APIC version and 8-bit APIC IDs

2005-08-23 Thread Maciej W. Rozycki
On Mon, 22 Aug 2005, Martin Wilck wrote:

 It's a scalable system where multiple boards may be combined. Anyway, I see
 nothing in the specs that says you must start counting CPUs from zero.

 Well, Intel's Multiprocessor Specification mandates that (see section 
3.6.1 and also the compliance list in Appendix C).  It does not mandate 
local APIC IDs to be consecutive though.

  Maciej


Re: kernel module seg fault

2005-08-23 Thread linux-os \(Dick Johnson\)

On Tue, 23 Aug 2005, manomugdha biswas wrote:

 Hi,
 I have written a kernel module and I can load (insmod)
 it without any error. But when i run my module it gets
 seg fault at interruptible_sleep_on_timeout();

 I have used this function in the following way:

 DECLARE_WAIT_QUEUE_HEAD(wq);
 init_waitqueue_head(&wq);
 interruptible_sleep_on_timeout(&wq, 2);

 I am using redhat version 9.0 and kernel version
 2.4.20-8.
 Could you please give some light on this issue?

 Manomugdha Biswas

seg fault??  You mean you get a kernel panic? Please
show us what it says. Note you can't sleep with a spin-lock
held.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.12.5 on an i686 machine (5537.79 BogoMips).
Warning : 98.36% of all statistics are fiction.
.
I apologize for the following. I tried to kill it with the above dot :


The information transmitted in this message is confidential and may be 
privileged.  Any review, retransmission, dissemination, or other use of this 
information by persons or entities other than the intended recipient is 
prohibited.  If you are not the intended recipient, please notify Analogic 
Corporation immediately - by replying to this message or by sending an email to 
[EMAIL PROTECTED] - and destroy all copies of this information, including any 
attachments, without reading or disclosing them.

Thank you.


patch for compiling ppc without pmu

2005-08-23 Thread Johannes Berg
Hi,

This patch seems to be required to compile 2.6.13-rc6 for ppc configured
without PMU.

Apologies if it is already known, I haven't found anything like this
quickly.

Signed-Off-By: Johannes Berg [EMAIL PROTECTED]

--- linux-2.6.13-rc6.orig/arch/ppc/platforms/pmac_time.c	2005-08-23 12:14:37.689485664 +0200
+++ linux-2.6.13-rc6/arch/ppc/platforms/pmac_time.c	2005-08-23 12:14:37.689485664 +0200
@@ -251,7 +251,7 @@
 	struct device_node *cpu;
 	unsigned int freq, *fp;
 
-#ifdef CONFIG_PM && CONFIG_ADB_PMU
+#if defined(CONFIG_PM) && defined(CONFIG_ADB_PMU)
 	pmu_register_sleep_notifier(&time_sleep_notifier);
 #endif /* CONFIG_PM */
 



signature.asc
Description: This is a digitally signed message part


Re: skge missing ifdefs.

2005-08-23 Thread Roman Zippel
Hi,

On Tue, 23 Aug 2005, Al Viro wrote:

 As for your s/thread_info/stack/ - I don't believe it's doable in mainline
 right now.  It's definitely separate from m68k merge and should not be
 mixed into it.  Moreover, mandatory changes to every platform arch-specific
 code over basically cosmetic issue (renaming a field of task_struct) at
 this point are going to be a gratuitous PITA for every architecture with
 out-of-tree development.  And m68k folks, of all people, should know what
 fun it is.

No, I don't know it. Sometimes merging can be tricky, but then I check the 
original diff and apply it manually. What I'm planning involves no logical 
changes, so it would be an absolute no-brainer to merge. It's the logical 
changes that may even compile normally that can be a real PITA.

 When folks start using task_thread_info() in arch/* (i.e. by 2.6.1[45]) the
 size of that delta will go down big way and it will be less painful.  Until
 then...  Not a good idea.

I already did the complete conversion (and I did it forward and backward 
to be sure the result is the same), so I don't see a problem with merging it 
in 2.6.13. The final removal of the thread_info field can happen in 2.6.14 
and any missed changes in external trees are trivially fixable.

bye, Roman


Re: sched_yield() makes OpenLDAP slow

2005-08-23 Thread linux-os \(Dick Johnson\)

On Mon, 22 Aug 2005, Robert Hancock wrote:

 linux-os (Dick Johnson) wrote:
 I reported that sched_yield() wasn't working (at least as expected)
 back in March of 2004.

  for(;;)
      sched_yield();

 ... takes 100% CPU time as reported by `top`. It should take
 practically 0. Somebody said that this was because `top` was
 broken, others said that it was because I didn't know how to
 code. Nevertheless, the problem was not fixed, even after
 scheduler changes were made for the current version.

 This is what I would expect if run on an otherwise idle machine.
 sched_yield just puts you at the back of the line for runnable
 processes, it doesn't magically cause you to go to sleep somehow.


When a kernel build is occurring??? Plus `top` itself??? It damn
well should sleep while giving up the CPU. If it doesn't, it's broken.

 --

 Robert Hancock  Saskatoon, SK, Canada
 To email, remove nospam from [EMAIL PROTECTED]
 Home Page: http://www.roberthancock.com/



Cheers,
Dick Johnson
Penguin : Linux version 2.6.12.5 on an i686 machine (5537.79 BogoMips).
Warning : 98.36% of all statistics are fiction.


Re: [PATCH 1/2] New system call, unshare

2005-08-23 Thread Al Viro
On Mon, Aug 08, 2005 at 03:46:06PM +0100, Alan Cox wrote:
 On Llu, 2005-08-08 at 09:33 -0400, Janak Desai wrote:
  
  [PATCH 1/2] unshare system call: System Call handler function sys_unshare
 
 
 Given the complexity of the kernel code involved and the obscurity of
 the functionality why not just do another clone() in userspace to
 unshare the things you want to unshare and then _exit the parent ?

Because you want to keep children?  Because you don't want to deal with
the implications for sessions/groups/etc.?

FWIW, syscall makes sense.  It is a valid primitive and the only reason
to keep it out of clone() (i.e. not making it just another flag to clone())
is that clone() is already cluttered _and_ uses bad calling conventions
for that stuff (I want to retain list rather than I want private list).


Re: [PATCH 0/3] New system call, unshare

2005-08-23 Thread Al Viro
On Wed, Aug 10, 2005 at 04:08:31PM +0200, Florian Weimer wrote:
 * Janak Desai:
 
  With unshare, namespace setup can be done using PAM session
  management functions without patching individual commands.
 
 I don't think it's a good idea to use security-critical code well
 without its original specification.  Clearly the current situation
 sucks, but this is mainly a lack of PAM functionality, IMHO.

Eh?  We are talking about a primitive that has far more uses than
PAM.  This is a missing piece of the stuff done by clone() and fork():
each task is a virtual machine with sharable components.  We can
get a copy of machine  with arbitrary set of components replaced with
private copies.  That's what clone() and fork() do.  The thing missing
from that set is taking a component (VM, descriptors, etc.) of process
itself and making it private.  The same thing we do on fork(), but
without creating a new process.

FWIW, I'm OK with that.  IIRC, Linus ACKed the concept some time ago.
PAM is one obvious use, but there are other situations where the lack
of that primitive is inconvenient...


Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote:
 Thomas Gleixner wrote:
 Ok, exit_itimers()->itimer_delete() called when the last thread exits
 or does exec.
 
 kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after
 that nobody can access this timer, so we don't need to lock timer->it_lock
 at all in this case. No lock - no deadlock.

It still deadlocks:

CPU 0                           CPU 1
write_lock(&tasklist_lock);
__exit_signal()
                                timer expires
                                base->running_timer = timer
                                  send_group_sigqueue()
                                    read_lock(&tasklist_lock);
exit_itimers()
  del_timer_sync(timer)
    waits for ever because      waits for ever on tasklist_lock
    base->running_timer == timer


I still think the last patch I sent is still necessary.

 But I know nothing about kernel/posix-cpu-timers.c, I doubt it will work
 for posix_cpu_timer_del(). I don't have time to study posix-cpu-timers now.
 However, I see that __exit_signal() calls posix_cpu_timers_exit_xxx(), so
 may be it can work?
 
 int posix_cpu_timer_del(struct k_itimer *timer)
 {
 	struct task_struct *p = timer->it.cpu.task;
 
 	if (timer->it.cpu.firing)
 		return TIMER_RETRY;
 
 	if (unlikely(p == NULL))
 		return 0;
 
 	if (!list_empty(&timer->it.cpu.entry)) {
 		read_lock(&tasklist_lock);
 
 Surely, it should be impossible to happen when process exists, otherwise
 it would deadlock immediately, we did write_lock(tasklist).
 
 Thomas, do you know something about posix-cpu-timers.c?

Not much. I'll look into this.

tglx




Re: [PATCH 2.6.13-rc6] cpu_exclusive sched domains on partial nodes temp fix

2005-08-23 Thread Ingo Molnar

* Paul Jackson [EMAIL PROTECTED] wrote:

   /*
 +  * Hack to avoid 2.6.13 partial node dynamic sched domain bug.
 +  * Require the 'cpu_exclusive' cpuset to include all (or none)
 +  * of the CPUs on each node, or return w/o changing sched domains.
 +  * Remove this hack when dynamic sched domains fixed.
 +  */
 +	{
 +		int i, j;
 +
 +		for_each_cpu_mask(i, cur->cpus_allowed) {
 +			for_each_cpu_mask(j, node_to_cpumask(cpu_to_node(i))) {
 +				if (!cpu_isset(j, cur->cpus_allowed))
 +					return;
 +			}
 +		}
 +	}
 +

certainly looks acceptable from a scheduler POV.

Acked-by: Ingo Molnar [EMAIL PROTECTED]

Ingo


[PATCH 2.6.13-rc6] cpu_exclusive sched domains on partial nodes temp fix

2005-08-23 Thread Paul Jackson
If Dinakar, Hawkes and Nick concur (and no one else complains too
loud) then the following should go into 2.6.13, to avoid the potential
kernel oops that Hawkes reported in Dinakar's feature to allow user
control of dynamic sched domain placement using cpu_exclusive cpusets.

This patch keeps the kernel/cpuset.c routine update_cpu_domains()
from invoking the sched.c routine partition_sched_domains() if the
cpuset in question doesn't fall on node boundaries.

I have boot tested this on an SN2, and with the help of a couple of
ad hoc printk's, determined that it does indeed avoid calling the
partition_sched_domains() routine on partial nodes.

I did not directly verify that this avoids setting up bogus sched
domains or avoids the oops that Hawkes saw.

Obviously, if the above named parties decide to take some other path,
then this patch should be discarded.  I submit this patch under the
expectation that Hawkes' and others' fixes to support sched domains not
on node boundaries will go into *-mm and 2.6.14.  Do not include the
following patch in *-mm or 2.6.14 versions which have the real sched
domain fixes.

This patch imposes a silent artificial constraint on which cpusets
can be used to define dynamic sched domains.

This patch should allow proceeding with this new feature in 2.6.13 for
the configurations in which it is useful (node aligned sched domains)
while avoiding trying to set up sched domains in the less useful cases
that can cause the kernel corruption and oops.

Signed-off-by: Paul Jackson [EMAIL PROTECTED]

Index: linux-2.6.13-cpuset-mempolicy-migrate/kernel/cpuset.c
===
--- linux-2.6.13-cpuset-mempolicy-migrate.orig/kernel/cpuset.c
+++ linux-2.6.13-cpuset-mempolicy-migrate/kernel/cpuset.c
@@ -636,6 +636,23 @@ static void update_cpu_domains(struct cp
 		return;
 
 	/*
+	 * Hack to avoid 2.6.13 partial node dynamic sched domain bug.
+	 * Require the 'cpu_exclusive' cpuset to include all (or none)
+	 * of the CPUs on each node, or return w/o changing sched domains.
+	 * Remove this hack when dynamic sched domains fixed.
+	 */
+	{
+		int i, j;
+
+		for_each_cpu_mask(i, cur->cpus_allowed) {
+			for_each_cpu_mask(j, node_to_cpumask(cpu_to_node(i))) {
+				if (!cpu_isset(j, cur->cpus_allowed))
+					return;
+			}
+		}
+	}
+
+	/*
 	 * Get all cpus from parent's cpus_allowed not part of exclusive
 	 * children
 	 */

-- 
  I won't rest till it's the best ...
  Programmer, Linux Scalability
  Paul Jackson [EMAIL PROTECTED] 1.650.933.1373


Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote:
 But I know nothing about kernel/posix-cpu-timers.c, I doubt it will work
 for posix_cpu_timer_del(). I don't have time to study posix-cpu-timers now.
 However, I see that __exit_signal() calls posix_cpu_timers_exit_xxx(), so
 may be it can work?

timer->it.cpu.task is set to NULL by posix_cpu_timers_exit(), so the
code in posix_cpu_timer_del returns before accessing tasklist_lock.


The exit functions do not take any locks, but it is not necessary
there. 

posix_run_cpu_timers(p) is called with p=current() and we have
interrupts disabled, so the timer interrupt can not run on this CPU. The
current exiting process can not run at the same time on a different CPU,
so no race and lockup possible here.

tglx





usb oops in 2.6.13-rc6-mm2

2005-08-23 Thread Jens Axboe
Hi,

usbcore: deregistering driver usb-storage
usb 1-1: USB disconnect, address 3
Unable to handle kernel NULL pointer dereference at 
RIP: 
803cf140{_spin_lock+0}
PGD 1c303067 PUD 1c304067 PMD 0 
Oops: 0002 [1] SMP 
CPU 0 
Modules linked in: nls_iso8859_1 nls_cp437 vfat fat nls_base ide_cd
cdrom
Pid: 80, comm: khubd Not tainted 2.6.13-rc6-mm2
RIP: 0010:[803cf140] 803cf140{_spin_lock+0}
RSP: 0018:81001fc75d80  EFLAGS: 00010296
RAX: 81001c08cdb0 RBX: 810019f5f8f8 RCX: 81001c4b14e8
RDX: 0070 RSI: 8040cfcc RDI: 
RBP: 810019f5f8a0 R08:  R09: 
R10: 0001 R11: 8018ad27 R12: 
R13: 810001a23c20 R14: 810001a23c00 R15: 0100
FS:  2ade8b00() GS:80612880()
knlGS:61ad4bb0
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2:  CR3: 1c302000 CR4: 06e0
Process khubd (pid: 80, threadinfo 81001fc74000, task
8100019f4e80)
Stack: 803cd130 80500aa0 810019f5f980
80500aa0 
   802a3263 810019f5f9e8 810019f5f8a0
810019f5f8a0 
   802a34f2 80500880 
Call Trace:803cd130{klist_remove+21}
802a3263{__device_release_driver+75}
   802a34f2{device_release_driver+39}
802a2db7{bus_remove_device+146}
   802a1f75{device_del+55}
802a1fbc{device_unregister+9}
   802ff51c{hub_thread+900}
80145e70{autoremove_wake_function+0}
   802ff198{hub_thread+0}
80145a70{keventd_create_kthread+0}
   80145c9e{kthread+203}
8012e3ae{schedule_tail+57}
   8010e6ce{child_rip+8}
80145a70{keventd_create_kthread+0}
   80145bd3{kthread+0} 8010e6c6{child_rip+0}
   

Code: f0 fe 0f 79 09 f3 90 80 3f 00 7e f9 eb f2 c3 f0 ff 0f 8b 07 
RIP 803cf140{_spin_lock+0} RSP 81001fc75d80
CR2: 

Just got this oops removing a usb-storage managed usb device.
usb-storage had been manually removed (as you can see from the kernel
message), a few seconds later I removed power from the device and the
oops happened right then.

-- 
Jens Axboe



new qla2xxx driver breaks SAN setup with 2 controllers

2005-08-23 Thread Frederik Schueler
hello,

we are experiencing problems with the new qlogic driver in 2.6.12 on
a set of servers with qla2310 HBAs.

The problem is as follows:

The Infotrend storage array we are using has two controllers, each
of them has two virtual discs with a couple of partitions exported
as shared storage.

The controllers are linked inside of the storage box, each controller
has one qlogic fabric switch attached, and half of the servers are
connected to the lefthand switch, the other half is connected to the
righthand switch.

Now, with the qlogic driver in 2.6.11.12, we can access all shares
on both controllers from every server, while the new driver allows
only access to the respective controller where the switch is attached
to directly, thus depriving the servers of half of their shared
storage devices.

Example: on server s05, we have a boot device (lun 3 on primary
controller), and 2 shared storages (lun 9 on primary, lun 10 on
secondary controller).

With 2.6.11.12, this looks as follows:

s05:~# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 03
  Vendor: IFT  Model: A16F-R1211   Rev: 334B
  Type:   Direct-AccessANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 00 Lun: 09
  Vendor: IFT  Model: A16F-R1211   Rev: 334B
  Type:   Direct-AccessANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 10
  Vendor: IFT  Model: A16F-R1211   Rev: 334B
  Type:   Direct-AccessANSI SCSI revision: 03


and the driver sees everything:

s05:~# cat /proc/scsi/qla2xxx/0
QLogic PCI to Fibre Channel Host Adapter for QLA2310:
Firmware version 3.03.08 IPX, Driver version 8.00.02b4-k
ISP: ISP2300, Serial# R74545
Request Queue = 0xcf94, Response Queue = 0xcf98
Request Queue count = 2048, Response Queue count = 512
Total number of active commands = 0
Total number of interrupts = 1117762
Device queue depth = 0x20
Number of free request entries = 964
Number of mailbox timeouts = 0
Number of ISP aborts = 0
Number of loop resyncs = 0
Number of retries for empty slots = 0
Number of reqs in pending_q= 0, retry_q= 0, done_q= 0, scsi_retry_q= 0
Host adapter:loop state = READY, flags = 0x1a03
Dpc flags = 0x0
MBX flags = 0x0
Link down Timeout = 030
Port down retry = 030
Login retry count = 030
Commands retried with dropped frame(s) = 0
Product ID = 4953 5020 2020 0001


SCSI Device Information:
scsi-qla0-adapter-node=20e08b1bd113;
scsi-qla0-adapter-port=21e08b1bd113;
scsi-qla0-target-0=21d02382;
scsi-qla0-target-1=21d02362;

SCSI LUN Information:
(Id:Lun)  * - indicates lun is not registered with the OS.
( 0: 0): Total reqs 2, Pending reqs 0, flags 0x0*, 0:0:81 00
( 0: 3): Total reqs 470693, Pending reqs 0, flags 0x0, 0:0:81 00
( 0: 9): Total reqs 227717, Pending reqs 0, flags 0x0, 0:0:81 00
( 0:11): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81 00
( 0:13): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:81 00
( 1: 0): Total reqs 2, Pending reqs 0, flags 0x0*, 0:0:82 00
( 1:10): Total reqs 12, Pending reqs 0, flags 0x0, 0:0:82 00
( 1:12): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:82 00
( 1:14): Total reqs 0, Pending reqs 0, flags 0x0*, 0:0:82 00


while on 2.6.12.5 and 2.6.13-rc6 it looks like this:

sm05:~# scsiadd -a 0 0 0 9
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 03
  Vendor: IFT  Model: A16F-R1211   Rev: 334B
  Type:   Direct-AccessANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 00 Lun: 09
  Vendor: IFT  Model: A16F-R1211   Rev: 334B
  Type:   Direct-AccessANSI SCSI revision: 03


sm05:~# scsiadd -a 0 0 1 10
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 03
  Vendor: IFT  Model: A16F-R1211   Rev: 334B
  Type:   Direct-AccessANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 00 Lun: 09
  Vendor: IFT  Model: A16F-R1211   Rev: 334B
  Type:   Direct-AccessANSI SCSI revision: 03


unfortunately, the proc interface was removed:

s05:/sys/devices/pci:00/:00:02.0/:01:00.0/:02:02.0/host0#
find .
.
./rport-0:0-1
./rport-0:0-1/power
./rport-0:0-1/power/state
./rport-0:0-0
./rport-0:0-0/target0:0:0
./rport-0:0-0/target0:0:0/0:0:0:9
./rport-0:0-0/target0:0:0/0:0:0:9/ioerr_cnt
./rport-0:0-0/target0:0:0/0:0:0:9/iodone_cnt
./rport-0:0-0/target0:0:0/0:0:0:9/iorequest_cnt
./rport-0:0-0/target0:0:0/0:0:0:9/iocounterbits
./rport-0:0-0/target0:0:0/0:0:0:9/timeout
./rport-0:0-0/target0:0:0/0:0:0:9/state
./rport-0:0-0/target0:0:0/0:0:0:9/delete
./rport-0:0-0/target0:0:0/0:0:0:9/rescan
./rport-0:0-0/target0:0:0/0:0:0:9/rev
./rport-0:0-0/target0:0:0/0:0:0:9/model
./rport-0:0-0/target0:0:0/0:0:0:9/vendor
./rport-0:0-0/target0:0:0/0:0:0:9/scsi_level
./rport-0:0-0/target0:0:0/0:0:0:9/type
./rport-0:0-0/target0:0:0/0:0:0:9/queue_type
./rport-0:0-0/target0:0:0/0:0:0:9/queue_depth
./rport-0:0-0/target0:0:0/0:0:0:9/device_blocked

Re: 2.6.13-rc6-mm2

2005-08-23 Thread Reuben Farrelly

Hi,

On 23/08/2005 4:30 p.m., Andrew Morton wrote:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/

- Various updates.  Nothing terribly noteworthy.


Yup, seems to be generally good...

Noticed this in the log earlier tonight:

Aug 23 19:44:51 tornado kernel: hub 5-0:1.0: port 1 disabled by hub (EMI?), 
re-enabling...

Aug 23 19:44:51 tornado kernel: usb 5-1: USB disconnect, address 2
Aug 23 19:44:51 tornado kernel: drivers/usb/class/usblp.c: usblp0: removed
Aug 23 19:44:51 tornado kernel: Unable to handle kernel NULL pointer 
dereference at virtual address 0004

Aug 23 19:44:51 tornado kernel:  printing eip:
Aug 23 19:44:51 tornado kernel: c01ccef2
Aug 23 19:44:51 tornado kernel: *pde = 
Aug 23 19:44:51 tornado kernel: Oops:  [#1]
Aug 23 19:44:51 tornado kernel: SMP
Aug 23 19:44:51 tornado kernel: last sysfs file: 
/devices/pci:00/:00:1f.3/i2c-0/name
Aug 23 19:44:51 tornado kernel: Modules linked in: nfsd exportfs lockd eeprom 
sunrpc ipv6 iptable_filter binfmt_misc reiser4 zlib_de
flate zlib_inflate dm_mod video thermal processor fan button ac tpm_nsc 
i2c_i801 sky2 e100 sr_mod

Aug 23 19:44:51 tornado kernel: CPU:1
Aug 23 19:44:51 tornado kernel: EIP:0060:[c01ccef2]Not tainted VLI
Aug 23 19:44:51 tornado kernel: EFLAGS: 00010286   (2.6.13-rc6-mm2)
Aug 23 19:44:51 tornado kernel: EIP is at _raw_spin_lock+0x7/0x73
Aug 23 19:44:51 tornado kernel: eax:    ebx:    ecx: c1a60658 
  edx: c1a63e24
Aug 23 19:44:51 tornado kernel: esi:    edi: c0382400   ebp: f7c55e98 
  esp: f7c55e90

Aug 23 19:44:51 tornado kernel: ds: 007b   es: 007b   ss: 0068
Aug 23 19:44:51 tornado kernel: Process khubd (pid: 109, threadinfo=f7c54000 
task=c192b030)
Aug 23 19:44:51 tornado kernel: Stack: f7c58a8c  f7c55ea0 c0312219 
f7c55eb0 c030feb7 f7c58ae8 f7c58a48
Aug 23 19:44:51 tornado kernel:f7c55ec4 c0217e73 f7c58a48 f7d134ec 
0040 f7c55ed0 c0217ec0 f7c58a48
Aug 23 19:44:51 tornado kernel:f7c55edc c0217814 f7c58a48 f7c55eec 
c0216ad2 f7c58a48 f7c58a14 f7c55ef8

Aug 23 19:44:51 tornado kernel: Call Trace:
Aug 23 19:44:51 tornado kernel:  [c01039c3] show_stack+0x94/0xca
Aug 23 19:44:51 tornado kernel:  [c0103b6c] show_registers+0x15a/0x1ea
Aug 23 19:44:51 tornado kernel:  [c0103d8a] die+0x108/0x183
Aug 23 19:44:51 tornado kernel:  [c031295a] do_page_fault+0x1ea/0x63d
Aug 23 19:44:51 tornado kernel:  [c0103693] error_code+0x4f/0x54
Aug 23 19:44:51 tornado kernel:  [c0312219] _spin_lock+0x8/0xa
Aug 23 19:44:51 tornado kernel:  [c030feb7] klist_remove+0x10/0x2c
Aug 23 19:44:51 tornado kernel:  [c0217e73] __device_release_driver+0x41/0x65
Aug 23 19:44:51 tornado kernel:  [c0217ec0] device_release_driver+0x29/0x39
Aug 23 19:44:51 tornado kernel:  [c0217814] bus_remove_device+0x52/0x60
Aug 23 19:44:51 tornado kernel:  [c0216ad2] device_del+0x2e/0x5d
Aug 23 19:44:51 tornado kernel:  [c0216b0c] device_unregister+0xb/0x15
Aug 23 19:44:51 tornado kernel:  [c0275d67] usb_disconnect+0x115/0x15c
Aug 23 19:44:51 tornado kernel:  [c0276b85] hub_port_connect_change+0x54/0x399
Aug 23 19:44:51 tornado kernel:  [c027713e] hub_events+0x274/0x3b2
Aug 23 19:44:51 tornado kernel:  [c0277296] hub_thread+0x1a/0xdf
Aug 23 19:44:51 tornado kernel:  [c012fba7] kthread+0x99/0x9d
Aug 23 19:44:51 tornado kernel:  [c01010b5] kernel_thread_helper+0x5/0xb
Aug 23 19:44:51 tornado kernel: Code: 00 00 00 8b 0d a8 62 36 c0 e9 61 ff ff 
ff f3 90 31 c0 86 07 84 c0 0f 8e 79 ff ff ff 83 c4 18 5
b 5e 5f 5d c3 55 89 e5 56 53 89 c3 81 78 04 ad 4e ad de 75 2d be 00 e0 ff ff 
21 e6 8b 06 39 43 0c


reuben



Re: [PATCH] mm: return ENOBUFS instead of ENOMEM in generic_file_buffered_write

2005-08-23 Thread Christoph Hellwig
On Tue, Aug 23, 2005 at 11:46:33AM +0300, Pekka J Enberg wrote:
 As noticed by Dmitry Torokhov, write() can not return ENOMEM:
 
 http://www.opengroup.org/onlinepubs/95399/functions/write.html
 
 Therefore fixup generic_file_buffered_write() in mm/filemap.c (pointed out by
 Nathan Scott).

We had this discussion before, for EACCES then.  We've always been returning
more errnos than SuS mentions and Linus declared it's fine.



Re: sched_yield() makes OpenLDAP slow

2005-08-23 Thread Denis Vlasenko
On Tuesday 23 August 2005 14:17, linux-os \(Dick Johnson\) wrote:
 
 On Mon, 22 Aug 2005, Robert Hancock wrote:
 
  linux-os (Dick Johnson) wrote:
  I reported that sched_yield() wasn't working (at least as expected)
  back in March of 2004.
 
 for(;;)
   sched_yield();
 
  ... takes 100% CPU time as reported by `top`. It should take
  practically 0. Somebody said that this was because `top` was
  broken, others said that it was because I didn't know how to
  code. Nevertheless, the problem was not fixed, even after
  scheduler changes were made for the current version.
 
  This is what I would expect if run on an otherwise idle machine.
  sched_yield just puts you at the back of the line for runnable
  processes, it doesn't magically cause you to go to sleep somehow.
 
 
 When a kernel build is occurring??? Plus `top` itself??? It damn
 well should sleep while giving up the CPU. If it doesn't, it's broken.

top doesn't run all the time:

# strace -o top.strace -tt top

14:52:19.407958 write(1,   758 root  16   0   104   2..., 79) = 79
14:52:19.408318 write(1,   759 root  16   0   100   1..., 79) = 79
14:52:19.408659 write(1,   760 root  16   0   100   1..., 79) = 79
14:52:19.409001 write(1,   761 root  18   0  2604  39..., 74) = 74
14:52:19.409342 write(1,   763 daemon17   0   108   1..., 78) = 78
14:52:19.409672 write(1,   773 root  16   0   104   2..., 79) = 79
14:52:19.410010 write(1,   774 root  16   0   104   2..., 79) = 79
14:52:19.410362 write(1,   775 root  16   0   100   1..., 79) = 79
14:52:19.410692 write(1,   776 root  16   0   104   2..., 79) = 79
14:52:19.411136 write(1,   777 daemon17   0   108   1..., 86) = 86
14:52:19.411505 select(1, [0], NULL, NULL, {5, 0}) = 0 (Timeout)
hrrr. ps...
14:52:24.411744 time([1124797944])  = 1124797944
14:52:24.411883 lseek(4, 0, SEEK_SET)   = 0
14:52:24.411957 read(4, 24822.01 18801.28\n, 1023) = 18
14:52:24.412082 access(/var/run/utmpx, F_OK) = -1 ENOENT (No such file or 
directory)
14:52:24.412224 open(/var/run/utmp, O_RDWR) = 8
14:52:24.412328 fcntl64(8, F_GETFD) = 0
14:52:24.412399 fcntl64(8, F_SETFD, FD_CLOEXEC) = 0
14:52:24.412467 _llseek(8, 0, [0], SEEK_SET) = 0
14:52:24.412556 alarm(0)= 0
14:52:24.412643 rt_sigaction(SIGALRM, {0x4015a57c, [], SA_RESTORER, 
0x40094ae8}, {SIG_DFL}, 8) = 0
14:52:24.412747 alarm(1)= 0

However, kernel compile shouldn't.

I suggest stracing with -tt for(;;) yield(); test proggy with and without
kernel compile in parallel, and comparing the output...

Hmm... actually, knowing that you will argue to death instead...

# cat t.c
#include <sched.h>

int main() {
	for(;;) sched_yield();
	return 0;
}
# gcc t.c
# strace -tt ./a.out
...
15:03:41.211324 sched_yield()   = 0
15:03:41.211673 sched_yield()   = 0
15:03:41.212034 sched_yield()   = 0
15:03:41.212400 sched_yield()   = 0
15:03:41.212749 sched_yield()   = 0
15:03:41.213126 sched_yield()   = 0
15:03:41.213486 sched_yield()   = 0
15:03:41.213835 sched_yield()   = 0
15:03:41.214220 sched_yield()   = 0
15:03:41.214577 sched_yield()   = 0
15:03:41.214939 sched_yield()   = 0
I start "while true; do true; done" on another console...
15:03:43.314645 sched_yield()   = 0
15:03:43.847644 sched_yield()   = 0
15:03:43.954635 sched_yield()   = 0
15:03:44.063798 sched_yield()   = 0
15:03:44.171596 sched_yield()   = 0
15:03:44.282624 sched_yield()   = 0
15:03:44.391632 sched_yield()   = 0
15:03:44.498609 sched_yield()   = 0
15:03:44.605584 sched_yield()   = 0
15:03:44.712538 sched_yield()   = 0
15:03:44.819557 sched_yield()   = 0
15:03:44.928594 sched_yield()   = 0
15:03:45.040603 sched_yield()   = 0
15:03:45.148545 sched_yield()   = 0
15:03:45.259311 sched_yield()   = 0
15:03:45.368563 sched_yield()   = 0
15:03:45.476482 sched_yield()   = 0
15:03:45.583568 sched_yield()   = 0
15:03:45.690491 sched_yield()   = 0
15:03:45.797512 sched_yield()   = 0
15:03:45.906534 sched_yield()   = 0
15:03:46.013545 sched_yield()   = 0
15:03:46.120505 sched_yield()   = 0
Ctrl-C

# uname -a
Linux firebird 2.6.12-r4 #1 SMP Sun Jul 17 13:51:47 EEST 2005 i686 unknown unknown GNU/Linux
--
vda

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC - 0/9] Generic timekeeping subsystem (v. B5)

2005-08-23 Thread Roman Zippel
Hi,

On Mon, 22 Aug 2005, john stultz wrote:

 The reason why we calculate the interval_length in the continuous
 timesource case is because we are not assuming anything about the
 frequency that the timekeeping_periodic_hook() is called.

The problem with your patch is that it doesn't allow making such 
assumptions.
Anyway, it's rather simple, if you want to update the time asynchronously:

	cycle_offset = get_cycles() - last_update;

	while (cycle_offset >= update_cycles) {
		cycle_offset -= update_cycles;
		last_update += update_cycles;
		// at init: system_update = update_cycles * mult;
		system_time += system_update;
		xtime += [tick_nsec, time_adj];
	}

	error = system_time - (xtime.tv_nsec << shift);

	if (abs(error) > update_cycles/2) {
		mult_adj = (error +- update_cycles/2) / update_cycles;
		mult += mult_adj;
		system_update += mult_adj * update_cycles;
		system_time -= mult_adj * cycle_offset;
		error -= mult_adj * cycle_offset;
	}

	if (xtime.tv_nsec + (error >> shift) > NSEC_PER_SEC) {
		system_time -= NSEC_PER_SEC << shift;
		second_overflow();
	}

Since we usually don't have to adjust for the error all at once, it should 
be possible to precalculate some of it in adjtimex/second_overflow and 
turn mult_adj into a mult_adj_shift.
I didn't really check the math here in detail, so there should be enough 
errors left :), but I hope it's enough to show the idea (especially how to 
do it without mult/divide).

Several variations of this are now possible. The initial cycle_offset can be 
constant, which happens if the update is regularly called from an interrupt 
(and that is sufficient for UP systems). We could also completely ignore the 
error, so that the core calculation of the above results in the familiar:

	xtime += [tick_nsec, time_adj];
	if (xtime.tv_nsec > NSEC_PER_SEC)
		second_overflow();

Another variation would be useful for ppc64 (or maybe any 64bit arch, but 
ppc64 already has the matching gettimeofday). In this case we don't use a 
timespec based xtime and don't scale it to ns, but use 64bit values 
scaled to seconds instead.
The last one may become a bit of a challenge: keeping as much code common 
as possible without abusing the preprocessor too much. In any case some 
functions will differ completely anyway; gettimeofday especially will be 
optimized differently depending on the arch/clock requirements. OTOH 
introducing a common gettimeofday (that would even require a 64bit 
divide) would be a huge mistake.

bye, Roman


RE: kernel module seg fault

2005-08-23 Thread bunnans
Hi Biswas,

You need to post the complete kernel dump message and body of your
source code.

-Bunnan
 
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of manomugdha
biswas
Sent: Tuesday, August 23, 2005 3:13 PM
To: linux-kernel@vger.kernel.org
Subject: kernel module seg fault

Hi,
I have written a kernel module and I can load (insmod)
it without any error. But when i run my module it gets
seg fault at interruptible_sleep_on_timeout();

I have used this function in the following way:

DECLARE_WAIT_QUEUE_HEAD(wq);
init_waitqueue_head(&wq);
interruptible_sleep_on_timeout(&wq, 2);

I am using redhat version 9.0 and kernel version
2.4.20-8.
Could you please give some light on this issue?

Manomugdha Biswas









[PATCH] blk queue io tracing support

2005-08-23 Thread Jens Axboe
Hi,

This is a little something I have played with. It allows you to see
exactly what is going on in the block layer for a given queue. Currently
it can log request queueing and building, dispatches, requeues, and
completions. I've uploaded a little silly app to do dumps here:

http://www.kernel.org/pub/linux/kernel/people/axboe/tools/blktrace.c

Sample output looks like this:

wiggum:~ # ./blktrace /dev/sda
relay name: /relay/sda0
   0  3765 Q R 192-200
   5  3765 G R
  13  3765 M R [200-208]
  15  3765 M R [208-216]
  17  3765 M R [216-224]
  18  3765 M R [224-232]
  19  3765 M R [232-240]
  20  3765 M R [240-248]
  21  3765 M R [248-256]
 154  3765 M R [256-264]
 156  3765 M R [264-272]
 157  3765 M R [272-280]
 159  3765 M R [280-288]
 160  3765 M R [288-296]
 161  3765 M R [296-304]
 162  3765 M R [304-312]
 163  3765 M R [312-320]
 164  3765 M R [320-328]
 170  3765 M R [328-336]
 171  3765 M R [336-344]
 172  3765 M R [344-352]
 173  3765 M R [352-360]
 174  3765 M R [360-368]
 175  3765 M R [368-376]
 177  3765 M R [376-384]
 178  3765 M R [384-392]
 179  3765 Q R 392-400
 180  3765 G R
 181  3765 M R [400-408]
 182  3765 M R [408-416]
 183  3765 M R [416-424]
 184  3765 M R [424-432]
 185  3765 M R [432-440]
 186  3765 M R [440-448]
 187  3765 M R [448-456]
 189  3765 M R [456-464]
 190  3765 M R [464-472]
 191  3765 M R [472-480]
 193  3765 M R [480-488]
 194  3765 M R [488-496]
 196  3765 M R [496-504]
 197  3765 M R [504-512]
 228  3765 D R 192-392
 245  3765 D R 392-512
14049     0 C R 192-392 [0]
14067     0 D R 392-512
14807     0 C R 392-512 [0]
Reads:  Queued:           2,      160KiB
        Completed:        2,      160KiB
        Merges:          38
Writes: Queued:           0,        0KiB
        Completed:        0,        0KiB
        Merges:           0
Events: 47
Missed events: 0

This is a log of a dd if=/dev/sda of=/dev/null bs=64k count=2 and it
shows queueing (Q) and allocation (G) of two requests, along with the
merges (M) that happen there. Finally you see dispatch (D) and
completion (C) of them as well. When sigint is received, blktrace dumps
stats of the current run.

It will work for scsi commands as well, so you can see what is going on
when cdrecord is talking to the device (the cdb is dumped, not the
data). The final integer printed in [] after a completion is the error,
0 for correct completion.

You can register interest in various events, see blktrace.c (grep for
buts and BLKSTARTTRACE).

Patch is against 2.6.13-rc6-mm2. I'm attaching a relayfs update from Tom
Zanussi as well, which is required to handle sub-buffer wrapping
correctly. You need to apply both patches to play with this - and make
sure to enable CONFIG_BLK_DEV_IO_TRACE in your .config, of course. And
blktrace.c relies on relayfs being mounted on /relay, add something ala

none	/relay	relayfs	defaults	0 0

to your /etc/fstab to accomplish that (or do it manually, only
mentioning it for completeness).

-- 
Jens Axboe

diff -urpN -X /home/axboe/cdrom/exclude /opt/kernel/linux-2.6.13-rc6-mm2/drivers/block/blktrace.c linux-2.6.13-rc6-mm2/drivers/block/blktrace.c
--- /opt/kernel/linux-2.6.13-rc6-mm2/drivers/block/blktrace.c	1970-01-01 01:00:00.0 +0100
+++ linux-2.6.13-rc6-mm2/drivers/block/blktrace.c	2005-08-23 13:34:17.0 +0200
@@ -0,0 +1,119 @@
+#include <linux/config.h>
+#include <linux/kernel.h>
+#include <linux/blkdev.h>
+#include <linux/blktrace.h>
+#include <asm/uaccess.h>
+
+void __blk_add_trace(struct blk_trace *bt, sector_t sector, int bytes,
+		     int rw, u32 what, int error, int pdu_len, char *pdu_data)
+{
+	struct blk_io_trace t;
+	unsigned long flags;
+
+	if (rw == WRITE)
+		what |= BLK_TC_ACT(BLK_TC_WRITE);
+	else
+		what |= BLK_TC_ACT(BLK_TC_READ);
+
+	if (((bt->act_mask << BLK_TC_SHIFT) & what) == 0)
+		return;
+
+	t.magic		= BLK_IO_TRACE_MAGIC | BLK_IO_TRACE_VERSION;
+	t.sequence	= atomic_add_return(1, &bt->sequence);
+	t.time		= sched_clock() / 1000;
+	t.sector	= sector;
+	t.bytes		= bytes;
+	t.action	= what;
+	t.pid		= current->pid;
+	t.error		= error;
+	t.pdu_len	= pdu_len;
+
+	local_irq_save(flags);
+	__relay_write(bt->rchan, &t, sizeof(t));
+	if (pdu_len)
+		__relay_write(bt->rchan, pdu_data, pdu_len);
+	local_irq_restore(flags);
+}
+
+int blk_stop_trace(struct block_device *bdev)
+{
+   request_queue_t *q = 

Re: 2.6.13-rc6-rt9

2005-08-23 Thread Ingo Molnar

* Steven Rostedt [EMAIL PROTECTED] wrote:

 Ingo, can't you get rt.c to be more confusing. I mean it is too 
 simple. We need to add a few more underscores here and there :-) 
 Seriously, that rt.c is mind boggling. It was nice before, now it is 
 just screaming for a cleanup (come now, do we really need the four 
 underscores?). Same with latency.c.

i agree that it's ugly, but some of that ugliness is to achieve the 
7-instructions fail-through codepath for the common acquire (and 
release) codepath:

 c03a5320 <__down_mutex>:
 c03a5320:   89 c1                   mov    %eax,%ecx
 c03a5322:   8b 15 08 76 3a c0       mov    0xc03a7608,%edx
 c03a5328:   31 c0                   xor    %eax,%eax
 c03a532a:   0f b1 51 14             cmpxchg %edx,0x14(%ecx)
 c03a532e:   85 c0                   test   %eax,%eax
 c03a5330:   75 01                   jne    c03a5333 <__down_mutex+0x13>
 c03a5332:   c3                      ret

that's how much it takes to acquire an RT lock, and i worked hard to get 
there. As long as the fastpath is kept this tight, feel free to do 
cleanups. But i really want to avoid having to write mutex_down/up in 
assembly for 24 architectures ...

Ingo


Re: 2.6.13-rc6-rt9

2005-08-23 Thread Steven Rostedt
On Tue, 2005-08-23 at 14:36 +0200, Ingo Molnar wrote:
 * Steven Rostedt [EMAIL PROTECTED] wrote:
 
  Ingo, can't you get rt.c to be more confusing. I mean it is too 
  simple. We need to add a few more underscores here and there :-) 
  Seriously, that rt.c is mind boggling. It was nice before, now it is 
  just screaming for a cleanup (come now, do we really need the four 
  underscores?). Same with latency.c.
 
 i agree that it's ugly, but some of that ugliness is to achieve the 
 7-instructions fail-through codepath for the common acquire (and 
 release) codepath:
 
  c03a5320 <__down_mutex>:
  c03a5320:   89 c1                   mov    %eax,%ecx
  c03a5322:   8b 15 08 76 3a c0       mov    0xc03a7608,%edx
  c03a5328:   31 c0                   xor    %eax,%eax
  c03a532a:   0f b1 51 14             cmpxchg %edx,0x14(%ecx)
  c03a532e:   85 c0                   test   %eax,%eax
  c03a5330:   75 01                   jne    c03a5333 <__down_mutex+0x13>
  c03a5332:   c3                      ret
 

Impressive!

 that's how much it takes to acquire an RT lock, and i worked hard to get 
 there. As long as the fastpath is kept this tight, feel free to do 
 cleanups. But i really want to avoid having to write mutex_down/up in 
 assembly for 24 architectures ...

Warning! I'm hacking hard to get rid of the global pi_lock, and I'm not
worrying now about efficiency.  I figure that if I can get it to work,
then we can speed it up afterwards.  Since it's complex enough keeping
all the locks straight, I just want it to work without deadlocking. 

Once I get it to work, I'll let you figure out how to get it back down to
7-instructions :-)

-- Steve




Re: [patch] suspend: update warnings

2005-08-23 Thread Pavel Machek
Hi!

  + * If you have unsupported (*) devices using DMA, you may have some
  + * problems. If your disk driver does not support suspend... (IDE does),
  + * it may cause some problems, too. If you change kernel command line 
  + * between suspend and resume, it may do something wrong. If you change 
  + * your hardware while system is suspended... well, it was not good idea;
  + * but it wil probably only crash.
 
 The most common driver issues I see involve:
 - USB being built in or as modules that are still loaded while
 suspending (getting better, but not there yet)
 - DRI being used in X where the drivers don't properly support
 suspend/resume (NVidia esp)
 - Firewire
 - CPU Freq  (improving too)
 
 It might be good to mention these areas too.

Well, right; but those 'only' cause system to crash during suspend. I
was talking about really dangerous stuff.

Both usb and cpufreq seem to work okay here.

I've added FAQ entry at the end:

Q: What information is useful for debugging suspend-to-disk problems?

A: Well, the last messages on the screen are always useful. If something
is broken, it is usually some kernel driver, so trying with as few
modules loaded as possible helps a lot. I also prefer people to
suspend from the console, preferably without X running. Booting with
init=/bin/bash, then running swapon and starting the suspend sequence
manually usually does the trick. Then it is a good idea to try with the
latest vanilla kernel.

The following are known to be problematic; be sure to unload them
before suspend:
- DRI being used in X where the drivers don't properly support
suspend/resume (NVidia esp)
- Firewire
- SCSI


 Perhaps the 'changing your hardware' could mention that replacing faulty
 hardware may be safe.

I do not want to encourage people to do that. Yep, it's probably safe;
no, I do not want them to know.

Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address


Re: 2.6.13-rc6-mm2 (hangs on non-SMP x86-64 and oopses)

2005-08-23 Thread Rafael J. Wysocki
On Tuesday, 23 of August 2005 06:30, Andrew Morton wrote:
 
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/
 
 - Various updates.  Nothing terribly noteworthy.

It hangs solid during boot (after starting kjournald) on Asus L5D (non-SMP x86-64),
which is caused by this patch:

8250-serial-console-locking-bug-spelling-fix.patch

(from binary search).

If this patch is reverted, it oopses like in the following trace.

At the same time it works fine on an SMP box (dual-core Athlon 64).

Greetings,
Rafael


ACPI: PCI Interrupt Link [LUS2] enabled at IRQ 5
PCI: setting IRQ 5 as level-triggered
ACPI: PCI Interrupt :00:02.2[C] -> Link [LUS2] -> GSI 5 (level, low) -> IRQ 5
PCI: Setting latency timer of device :00:02.2 to 64
ehci_hcd :00:02.2: EHCI Host Controller
ehci_hcd :00:02.2: debug port 1
ehci_hcd :00:02.2: new USB bus registered, assigned bus number 3
ehci_hcd :00:02.2: irq 5, io mem 0xfebfdc00
PCI: cache line size of 64 is not supported by device :00:02.2
ehci_hcd :00:02.2: park 0
ehci_hcd :00:02.2: USB 2.0 initialized, EHCI 1.00, driver 10 Dec 2004
hub 3-0:1.0: USB hub found
usb 2-2: string descriptor 0 read error: -110
hub 3-0:1.0: 6 ports detected
usb 2-2: string descriptor 0 read error: -110
usb 2-2: can't set config #1, error -110
Unable to handle kernel NULL pointer dereference at 0004 RIP:
8024373b{_raw_spin_lock+27}
PGD 2ca73067 PUD 2ca46067 PMD 0
Oops:  [1] PREEMPT
CPU 0
Modules linked in: ehci_hcd ohci_hcd sk98lin evdev joydev sg st sr_mod sd_mod 
scsi_mod ide_cd cdrom dm_mod parport_pc lp parport
Pid: 108, comm: khubd Not tainted 2.6.13-rc6-mm2
RIP: 0010:[8024373b] 8024373b{_raw_spin_lock+27}
RSP: :81002fc7dcc8  EFLAGS: 00010282
RAX: 810001ce20d0 RBX:  RCX: 81002d586530
RDX:  RSI: 81002d586540 RDI: 
RBP: 81002fc7dce8 R08:  R09: 81002d586410
R10:  R11:  R12: 
R13: 803f06a0 R14: 81002d5557f8 R15: 0002
FS:  2b28fe80() GS:804f8840() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0004 CR3: 2ca61000 CR4: 06e0
Process khubd (pid: 108, threadinfo 81002fc7c000, task 810001ce20d0)
Stack:   803f06a0 81002d5557f8
   81002fc7dd08 8035612e 81002d555918 81002d555870
   81002fc7dd28 80353b2f
Call Trace:8035612e{_spin_lock+30} 80353b2f{klist_remove+31}
   802ad11d{__device_release_driver+93} 802ad254{device_release_driver+52}
   802ac994{bus_remove_device+180} 802ab7f8{device_del+56}
   802d657f{usb_new_device+495} 802d7419{hub_thread+1961}
   80354b6f{thread_return+187} 8014a710{autoremove_wake_function+0}
   8014a710{autoremove_wake_function+0} 802d6c70{hub_thread+0}
   8014a583{kthread+211} 8010f5e6{child_rip+8}
   8014a4b0{kthread+0} 8010f5de{child_rip+0}

BUG: spinlock trylock failure on UP on CPU#0, khubd/108
 lock: 803bf020, .magic: dead4ead, .owner: khubd/108, .owner_cpu: 0

Call Trace:802439f9{add_preempt_count+105} 80243623{spin_bug+211}
   8011004b{show_trace+571} 8024370e{_raw_spin_trylock+62}
   80355e4e{_spin_trylock+30} 8010fc81{oops_begin+17}
   8035702a{do_page_fault+1722} 8013452e{vprintk+830}
   8013452e{vprintk+830} 80152296{kallsyms_lookup+246}
   8010f431{error_exit+0} 8011004b{show_trace+571}
   80110047{show_trace+567} 80110168{show_stack+216}
   80110207{show_registers+135} 8011050e{__die+142}
   80357098{do_page_fault+1832} 80355fa4{_spin_unlock_irq+20}
   80354b6f{thread_return+187} 8010f431{error_exit+0}
   8024373b{_raw_spin_lock+27} 802439f9{add_preempt_count+105}
   8035612e{_spin_lock+30} 80353b2f{klist_remove+31}
   802ad11d{__device_release_driver+93} 802ad254{device_release_driver+52}
   802ac994{bus_remove_device+180} 802ab7f8{device_del+56}
   802d657f{usb_new_device+495} 802d7419{hub_thread+1961}
   80354b6f{thread_return+187} 8014a710{autoremove_wake_function+0}
   8014a710{autoremove_wake_function+0} 802d6c70{hub_thread+0}
   8014a583{kthread+211} 8010f5e6{child_rip+8}
   8014a4b0{kthread+0} 8010f5de{child_rip+0}

---
| preempt count: 0003 ]
| 3 level deep critical section nesting:

.. [80356126]  

Re: [patch] suspend: update warnings

2005-08-23 Thread Nigel Cunningham
Hi.

On Tue, 2005-08-23 at 22:50, Pavel Machek wrote:
 Hi!
 
   + * If you have unsupported (*) devices using DMA, you may have some
   + * problems. If your disk driver does not support suspend... (IDE does),
   + * it may cause some problems, too. If you change kernel command line 
   + * between suspend and resume, it may do something wrong. If you change 
   + * your hardware while system is suspended... well, it was not good idea;
   + * but it wil probably only crash.
  
  The most common driver issues I see involve:
  - USB being built in or as modules that are still loaded while
  suspending (getting better, but not there yet)
  - DRI being used in X where the drivers don't properly support
  suspend/resume (NVidia esp)
  - Firewire
  - CPU Freq  (improving too)
  
  It might be good to mention these areas too.
 
 Well, right; but those 'only' cause system to crash during suspend. I
 was talking about really dangerous stuff.
 
 Both usb and cpufreq seems to work okay here.

It depends on what you're using. I believe one of the usb root hub
drivers is okay, the others aren't. Similar for cpufreq. USB certainly
accounts for a high percentage of the failures I see.

 I've added FAQ entry at the end:
 
 Q: What information is usefull for debugging suspend-to-disk problems?
 
 A: Well, last messages on the screen are always useful. If something
 is broken, it is usually some kernel driver, therefore trying with as
 little as possible modules loaded helps a lot. I also prefer people to
 suspend from console, preferably without X running. Booting with
 init=/bin/bash, then swapon and starting suspend sequence manually
 usually does the trick. Then it is good idea to try with latest
 vanilla kernel.
 
 Known problematic modules are; be sure to unload them before
 suspend:
 - DRI being used in X where the drivers don't properly support
 suspend/resume (NVidia esp)
 - Firewire
 - SCSI
 
 
  Perhaps the 'changing your hardware' could mention that replacing faulty
  hardware may be safe.
 
 I do not want to encourage people to do that. Yep, its probably safe,
 no, I do not want them to know.

:

Thanks

Nigel
-- 
Evolution.
Enumerate the requirements.
Consider the interdependencies.
Calculate the probabilities.



Re: 2.6.13-rc6-rt9

2005-08-23 Thread Ingo Molnar

* Steven Rostedt [EMAIL PROTECTED] wrote:

 On Tue, 2005-08-23 at 14:36 +0200, Ingo Molnar wrote:
  * Steven Rostedt [EMAIL PROTECTED] wrote:
  
   Ingo, can't you get rt.c to be more confusing. I mean it is too 
   simple. We need to add a few more underscores here and there :-) 
   Seriously, that rt.c is mind boggling. It was nice before, now it is 
   just screaming for a cleanup (come now, do we really need the four 
   underscores?). Same with latency.c.
  
  i agree that it's ugly, but some of that ugliness is to achieve the 
  7-instructions fail-through codepath for the common acquire (and 
  release) codepath:
  
   c03a5320 <__down_mutex>:
   c03a5320:   89 c1                   mov    %eax,%ecx
   c03a5322:   8b 15 08 76 3a c0       mov    0xc03a7608,%edx
   c03a5328:   31 c0                   xor    %eax,%eax
   c03a532a:   0f b1 51 14             cmpxchg %edx,0x14(%ecx)
   c03a532e:   85 c0                   test   %eax,%eax
   c03a5330:   75 01                   jne    c03a5333 <__down_mutex+0x13>
   c03a5332:   c3                      ret
  
 
 Impressive!
 
  that's how much it takes to acquire an RT lock, and i worked hard to get 
  there. As long as the fastpath is kept this tight, feel free to do 
  cleanups. But i really want to avoid having to write mutex_down/up in 
  assembly for 24 architectures ...
 
 Warning! I'm hacking hard to get rid of the global pi_lock, and I'm not
 worrying now about efficiency.  I figure that if I can get it to work,
 then we can speed it up afterwards.  Since it's complex enough keeping
 all the locks straight, I just want it to work without deadlocking. 
 
 Once I get it to work, I'll let you figure out how get it back down to 
 7-instructions :-)

yeah. It can always be done after the fact - the basics wont change.  
(Note that the above disassembly is for UP, on SMP the fastpath is 
longer and around 10-15 instructions.)

Ingo


Re: [patch] suspend: update warnings

2005-08-23 Thread Pavel Machek
Hi!

    + * If you have unsupported (*) devices using DMA, you may have some
    + * problems. If your disk driver does not support suspend... (IDE does),
    + * it may cause some problems, too. If you change kernel command line 
    + * between suspend and resume, it may do something wrong. If you change 
    + * your hardware while system is suspended... well, it was not good idea;
    + * but it wil probably only crash.
   
   The most common driver issues I see involve:
   - USB being built in or as modules that are still loaded while
   suspending (getting better, but not there yet)
   - DRI being used in X where the drivers don't properly support
   suspend/resume (NVidia esp)
   - Firewire
   - CPU Freq  (improving too)
   
   It might be good to mention these areas too.
  
  Well, right; but those 'only' cause system to crash during suspend. I
  was talking about really dangerous stuff.
  
  Both usb and cpufreq seems to work okay here.
 
 It depends on what you're using. I believe one of the usb root hub
 drivers is okay, the others aren't. Similar for cpufreq. USB certainly
 accounts for a high percentage of the failures I see.

Do you remember which one it is? I have UHCI here, and it seems to
work okay. powernow-k8 and cpufreq-centrino also seem to behave ok.

Pavel
-- 
if you have sharp zaurus hardware you don't need... you know my address


Re: [patch] suspend: update warnings

2005-08-23 Thread Christoph Hellwig
On Tue, Aug 23, 2005 at 02:50:17PM +0200, Pavel Machek wrote:
 - DRI being used in X where the drivers don't properly support
 suspend/resume (NVidia esp)

NVidia's driver is not supported and is a violation of the
copyrights of many of us.  It's never supported, so please don't
mention it.



Re: [patch] suspend: update warnings

2005-08-23 Thread Pavel Machek
Hi!

  - DRI being used in X where the drivers don't properly support
  suspend/resume (NVidia esp)
 
 NVidias driver is not support and a copyright violation of the
 copyrights of many of use.  It's never supported so please don't
 mention it.

Unfortunately, it is quite common out there. I need to somehow keep
those bug reports off my mailbox.

Okay, this should be enough:

Q: What information is useful for debugging suspend-to-disk problems?

A: Well, the last messages on the screen are always useful. If something
is broken, it is usually some kernel driver, so trying with as few
modules loaded as possible helps a lot. I also prefer people to
suspend from the console, preferably without X running. Booting with
init=/bin/bash, then running swapon and starting the suspend sequence
manually usually does the trick. Then it is a good idea to try with the
latest vanilla kernel.

The following are known to be problematic; be sure to unload them
before suspend:
- DRI being used (3D acceleration)
- Firewire
- SCSI



-- 
if you have sharp zaurus hardware you don't need... you know my address


Re: [patch] suspend: update warnings

2005-08-23 Thread Christoph Hellwig
On Tue, Aug 23, 2005 at 03:00:50PM +0200, Pavel Machek wrote:
 Hi!
 
   - DRI being used in X where the drivers don't properly support
   suspend/resume (NVidia esp)
  
  NVidias driver is not support and a copyright violation of the
  copyrights of many of use.  It's never supported so please don't
  mention it.
 
 Unfortunately, it is quite common out there. I need to somehow keep
 those bug reports off my mailbox.

I think we made it pretty clear that people with binary modules should
sod off.  Feel free to use banner for a big "sod off" as the usual warning
for all binary module user idiots.



Ext3 Errors on Dell RAID

2005-08-23 Thread Jess Balint
Problem:
I get massive ext3 errors once every few days. See the "errors on console"
section below. Almost all commands return "I/O error". I have to power
cycle the machine to get it running again. Upon reboot, there are
usually 3 orphan inodes deleted and everything is fine. See "messages
on reboot" below.

Configuration:
System: Dell PowerEdge 6300/500, 4 CPU SMP w/2GB memory
Discs: 3 SCSI discs in a controller-managed striped configuration
Controller: Dell PERC-2
Kernel messages are in "kernel boot messages" below

Other:
I had this problem before. I upgraded the card firmware to 2.8/build
6089, but still have the same issue. I tried the 2.4.29 kernel
(aacraid driver v 1.1-3) from the Slackware (10?) distribution and
then I upgraded to 2.4.31. It has the same driver version and the same
problem. Running fsck always shows everything is fine (rc=0).

Does anybody have experience with this machine working well? If so,
what combination of kernel and firmware version?

Or does anybody know the root cause of the occasional massive ext3
errors or what I can do to test and/or fix it?

Please cc me jbalint-at-gmail as I am not on the list.
Thanks.
Jess

--
--
errors on console
--
--
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_orphan_add: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read inode block - inode=1015869, block=1015811
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read inode block - inode=1015869, block=1015811
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_orphan_add: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read inode block - inode=1213811, block=1212461
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_new_inode: IO failure

--
--
messages on reboot
--
--
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: sd(8,2): orphan cleanup on readonly fs
EXT3-fs: sd(8,2): 3 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.

--
--
kernel boot messages
--
--
SCSI subsystem driver Revision: 1.00
Red Hat/Adaptec aacraid driver (1.1-3 Aug 16 2005 17:25:05)
AAC0: kernel 2.8.4 build 6089
AAC0: monitor 2.8.4 build 6089
AAC0: bios 2.8.0 build 6089
AAC0: serial 4c72e2fafaf001
scsi0 : percraid
  Vendor: DELL      Model: rootvg            Rev: V1.0
  Type:   Direct-Access  ANSI SCSI revision: 02
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
Adaptec aic7890/91 Ultra2 SCSI adapter
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi2 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
Adaptec aic7890/91 Ultra2 SCSI adapter
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi3 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
Adaptec aic7860 Ultra SCSI adapter
aic7860: Ultra Single Channel A, SCSI Id=7, 3/253 SCBs

blk: queue f7aaca18, I/O limit 4095Mb (mask 0x)
(scsi3:A:5): 20.000MB/s transfers (20.000MHz, offset 15)
  Vendor: NEC   Model: CD-ROM DRIVE:465  Rev: 1.03
  Type:   CD-ROM ANSI SCSI revision: 02
blk: queue f7aac818, I/O limit 4095Mb (mask 0x)
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 213274368 512-byte hdwr sectors (109196 MB)
Partition check:
 sda: sda1 sda2
Attached scsi CD-ROM sr0 at scsi3, channel 0, id 5, lun 0
sr0: scsi3-mmc drive: 14x/32x cd/rw xa/form2 cdda tray
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.13-rc6-mm2 (hangs on non-SMP x86-64 and oopses)

2005-08-23 Thread Ralf Baechle
Andrew,

On Tue, Aug 23, 2005 at 02:51:51PM +0200, Rafael J. Wysocki wrote:

  
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/
  
  - Various updates.  Nothing terribly noteworthy.
 
 It hangs solid during boot (after starting kjournald) on Asus L5D (non-SMP 
 x86-64),
 which is caused by this patch:
 
 8250-serial-console-locking-bug-spelling-fix.patch
 
 (from binary search).
 
 If this patch is reverted, it oopses like in the following trace.

I thought this one was already pulled?

  Ralf


[2.4.31] - USB device numbering in /proc/bus/usb

2005-08-23 Thread Paul Rolland
Hello,

I've just rebooted a machine, and the eagle ADSL modem I was using,
previously presented as /proc/bus/usb/002/005, is now presented as
/proc/bus/usb/002/003 (same bus, but device ID changed from 5 to 3).

Is this expected behavior when running a 2.4.31 kernel?
I would have expected more stability in the numbering across
reboots, the same way IDE disk numbers are stable.

Paul



[PATCH] asus_acpi M6000A model support

2005-08-23 Thread Lukas Hejtmanek
Hello,

here is a patch adding Asus M6A laptop support. It works fine for me.

-- 
Lukáš Hejtmánek
--- asus_acpi.c.old	2005-04-21 02:03:13.0 +0200
+++ asus_acpi.c	2005-05-08 18:22:49.0 +0200
@@ -128,6 +128,7 @@
 		L8L,		//L8400L
 		M1A,		//M1300A
 		M2E,		//M2400E, L4400L
+		M6A,		//M6000A
 		M6N,		//M6800N
 		M6R,		//M6700R
 		P30,		//Samsung P30
@@ -304,7 +305,20 @@
 		.display_set       = "SDSP",
 		.display_get       = "\\INFB"
 	},
-
+	{
+		.name              = "M6A",
+		/* M6A does not have MLED */
+		.mt_wled           = "WLED",
+		.mt_lcd_switch     = xxN_PREFIX "_Q10",
+		.lcd_status        = "\\RGPL",
+		.brightness_set    = "SPLV",
+		.brightness_get    = "GPLV",
+		.display_set       = "SDSP",
+		/* FIXME: this is not correct display_get.
+		 * It always returns 1
+		 * */
+		.display_get       = "\\ADVG"
+	},
 	{
 		.name              = "M6N",
 		.mt_mled           = "MLED",
@@ -622,7 +636,7 @@
 {
 	int lcd = 0;
 
-	if (hotk->model != L3H) {
+	if (hotk->model != L3H && hotk->model != M6A) {
 		/* We don't have to check anything if we are here */
 		if (!read_acpi_int(NULL, hotk->methods->lcd_status, &lcd))
 			printk(KERN_WARNING "Asus ACPI: Error reading LCD status\n");
@@ -638,22 +652,33 @@
 
 		input.count = 2;
 		input.pointer = mt_params;
-		/* Note: the following values are partly guessed up, but
-		   otherwise they seem to work */
 		mt_params[0].type = ACPI_TYPE_INTEGER;
-		mt_params[0].integer.value = 0x02;
 		mt_params[1].type = ACPI_TYPE_INTEGER;
-		mt_params[1].integer.value = 0x02;
+		if(hotk->model == L3H) {
+			/* Note: the following values are partly guessed up,
+			 * but otherwise they seem to work */
+			mt_params[0].integer.value = 0x02;
+			mt_params[1].integer.value = 0x02;
+		} else if(hotk->model == M6A) {
+			mt_params[0].integer.value = 0x15;
+			mt_params[1].integer.value = 0x01;
+		}
 
 		output.length = sizeof(out_obj);
 		output.pointer = &out_obj;
 
-		status = acpi_evaluate_object(NULL, hotk->methods->lcd_status, &input, &output);
+		status = acpi_evaluate_object(NULL, hotk->methods->lcd_status,
+				&input, &output);
 		if (status != AE_OK)
 			return -1;
-		if (out_obj.type == ACPI_TYPE_INTEGER)
-			/* That's what the AML code does */
-			lcd = out_obj.integer.value >> 8;
+		if (out_obj.type == ACPI_TYPE_INTEGER) {
+			if(hotk->model== L3H) {
+				/* That's what the AML code does */
+				lcd = out_obj.integer.value >> 8;
+			} else if(hotk->model == M6A) {
+				lcd = out_obj.integer.value;
+			}
+		}
 	}
 
 	return (lcd & 1);
@@ -1029,6 +1054,8 @@
 		hotk->model = M6N;
 	else if (strncmp(model->string.pointer, "M6R", 3) == 0)
 		hotk->model = M6R;
+	else if (strncmp(model->string.pointer, "M6A", 3) == 0)
+		hotk->model = M6A;
 	else if (strncmp(model->string.pointer, "M2N", 3) == 0 ||
 		 strncmp(model->string.pointer, "M3N", 3) == 0 ||
 		 strncmp(model->string.pointer, "M5N", 3) == 0 ||
@@ -1058,8 +1085,9 @@
 		hotk->model = L5x;
 
 	if (hotk->model == END_MODEL) {
-		printk("unsupported, trying default values, supply the "
-		       "developers with your DSDT\n");
+		printk("unsupported model %s, trying default values, supply "
+		       "the developers with your DSDT\n",
+		       model->string.pointer);
 		hotk->model = M2E;
 	} else {
 		printk("supported\n");


Re: Linux AIO status todo

2005-08-23 Thread Laurent Vivier
On Tue 23/08/2005 at 11:56, Jakub Jelinek wrote:
 On Tue, Aug 23, 2005 at 01:14:38PM +0530, Suparna Bhattacharya wrote:
 
  2. No support for propagating IO completion events to user space
 threads using RT signals. User threads need to poll the completion
 queue using io_getevents. POSIX specifies that when an AIO
 request completes, a signal can be delivered to the application
 to indicate the completion of the IO.
 
 POSIX AIO needs to handle SIGEV_NONE, SIGEV_SIGNAL and SIGEV_THREAD
 notification.  Obviously kernel shouldn't create threads for SIGEV_THREAD
 itself, as kernel shouldn't hardcode all the implementation details how a
 thread can be created.  But it would be good if AIO signalling e.g. handled
 both SIGEV_SIGNAL and SIGEV_SIGNAL | SIGEV_THREAD_ID, with the same usage as
 e.g. timer_* syscalls.  If kernel makes sure SI_ASYNCIO si_code is set in
 the notification signal siginfos, glibc could even use just one helper
 thread for timer_*/[al]io_* and maybe in the future other SIGEV_THREAD 
 notification.
 

See chapter 2.2. AIO completion event.

The libposix-aio library written by Sébastien and me handles all these cases:

http://www.bullopensource.org/posix/

There is a patch allowing the kernel to send a signal to a given process on
AIO event completion:

http://cvs.sourceforge.net/viewcvs.py/paiol/kernel-patches/2.6.12/aioevent.patch?rev=1.1.1.1&view=auto

With the help of a helper thread in user space, libposix-aio is
able to handle SIGEV_THREAD and create the new thread using user space
code (and thus implementation-dependent calls):

http://cvs.sourceforge.net/viewcvs.py/paiol/libposix-aio/src/aio_read.c?view=markup
http://cvs.sourceforge.net/viewcvs.py/paiol/libposix-aio/src/aio_thread_create.c?view=markup

Sébastien wrote this part of libposix-aio (so I'm not an expert on this
part :-P ), but I think his helper thread is modelled on the glibc timer
helper thread. Thus, if we want to merge libposix-aio into
glibc, we should use the existing mechanism, and it should be easy to put
the POSIX AIO helper thread portions inside the timer helper thread.

But only the glibc maintainer can answer this question:

should we mix timer and AIO code?

Laurent
-- 
-- Laurent Vivier ---
  mailto:[EMAIL PROTECTED] BULL/FREC:B1-226
phone: (+33) 476 29 7213  Bullcom: 229-7213
--[ DT/OSwRD/AIX ]--
http://www.bullopensource.org/ext4




dnotify/inotify and vfs questions

2005-08-23 Thread Asser Femø
Hi,

I'm currently implementing change notification support for the linux
cifs client as part of Google's Summer of Code program.

In cifs, change notification works pretty much the same as dnotify does
in the kernel, and you cancel the notification by sending a NT_CANCEL
request. 

According to the fcntl manual you can cancel a notification by doing
fcntl(fd, F_NOTIFY, 0) (i.e. sending 0 as the notification mask), but
looking at the kernel code, fcntl_dirnotify() immediately calls
dnotify_flush() without ever telling the vfs module about it. Is there a
reason for this?  Otherwise I'd propose calling
filp->f_op->dir_notify(filp, 0) at some point in this scenario.

Regarding inotify, inotify_add_watch doesn't seem to pass on the request
either, which works fine for local filesystem operations as they call
fsnotify_* functions every time, but that isn't really feasible for
filesystems like cifs because we'd have to request change notification
on everything. Are there plans for implementing a mechanism to let vfs
modules get watch requests too?

cheers,
Asser





[PATCH] FUTEX_WAKE_OP (pthread_cond_signal speedup)

2005-08-23 Thread Jakub Jelinek
Hi!

ATM pthread_cond_signal is unnecessarily slow, because it wakes one
waiter (which at least on UP usually means an immediate context switch
to one of the waiter threads).  This waiter wakes up and after a few
instructions it attempts to acquire the cv internal lock, but that lock
is still held by the thread calling pthread_cond_signal.  So it goes
to sleep and eventually the signalling thread is scheduled in, unlocks
the internal lock and wakes the waiter again.

Now, before 2003-09-21 NPTL was using FUTEX_REQUEUE in pthread_cond_signal
to avoid this performance issue, but it was removed when locks were
redesigned to the 3 state scheme (unlocked, locked uncontended, locked
contended).

The following scenario shows why simply using FUTEX_REQUEUE in
pthread_cond_signal together with using lll_mutex_unlock_force
in place of lll_mutex_unlock is not enough and probably why it
has been disabled at that time:

The number is value in cv->__data.__lock.
        thr1            thr2            thr3
0       pthread_cond_wait
1       lll_mutex_lock (cv->__data.__lock)
0       lll_mutex_unlock (cv->__data.__lock)
0       lll_futex_wait (&cv->__data.__futex, futexval)
0                       pthread_cond_signal
1                       lll_mutex_lock (cv->__data.__lock)
1                                       pthread_cond_signal
2                                       lll_mutex_lock (cv->__data.__lock)
2                                         lll_futex_wait (&cv->__data.__lock, 2)
2                       lll_futex_requeue (&cv->__data.__futex, 0, 1, &cv->__data.__lock)
                          # FUTEX_REQUEUE, not FUTEX_CMP_REQUEUE
2                       lll_mutex_unlock_force (cv->__data.__lock)
0                         cv->__data.__lock = 0
0                         lll_futex_wake (&cv->__data.__lock, 1)
1       lll_mutex_lock (cv->__data.__lock)
0       lll_mutex_unlock (cv->__data.__lock)
          # Here, lll_mutex_unlock doesn't know there are threads waiting
          # on the internal cv's lock

Now, I believe it is possible to use FUTEX_REQUEUE in pthread_cond_signal,
but it will cost us not one, but 2 extra syscalls and, what's worse, one
of these extra syscalls will be done for every single waiting loop in
pthread_cond_*wait.
We would need to use lll_mutex_unlock_force in pthread_cond_signal
after requeue and lll_mutex_cond_lock in pthread_cond_*wait after
lll_futex_wait.

Another alternative is to do the unlocking pthread_cond_signal needs
to do (the lock can't be unlocked before lll_futex_wake, as that is racy)
in the kernel.

I have implemented both variants, futex-requeue-glibc.patch is the
first one and futex-wake_op{,-glibc}.patch is the unlocking
inside of the kernel.  The kernel interface allows userland to specify
what exactly an unlocking operation should look like (some atomic
arithmetic operation with optional constant argument and comparison
of the previous futex value with another constant).
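
A rough userspace model of such an "operation plus comparison" argument
may help when reading the description above. This is only a sketch: the
futex.h hunk in this message is cut off, so the 4+4+12+12 bit layout, the
operation and comparison numbering, and all names below (wake_op_encode,
wake_op_apply, WOP_*, WCMP_*) are assumptions made for illustration, and
the model is not atomic the way an in-kernel implementation has to be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the patch. */
enum wake_op  { WOP_SET, WOP_ADD, WOP_OR, WOP_ANDN, WOP_XOR };
enum wake_cmp { WCMP_EQ, WCMP_NE, WCMP_LT, WCMP_LE, WCMP_GT, WCMP_GE };

/* Pack operation, operand, comparison and comparison operand into 32 bits
 * (assumed layout: 4 bits op, 4 bits cmp, 12 bits oparg, 12 bits cmparg). */
uint32_t wake_op_encode(unsigned op, unsigned oparg,
			unsigned cmp, unsigned cmparg)
{
	return ((op & 0xfu) << 28) | ((cmp & 0xfu) << 24) |
	       ((oparg & 0xfffu) << 12) | (cmparg & 0xfffu);
}

/*
 * Apply the encoded operation to *word and report whether the OLD value
 * satisfied the encoded comparison (1 = yes, 0 = no, -1 = bad encoding).
 * Simplifications: operands are treated as unsigned 12-bit values and
 * nothing here is atomic.
 */
int wake_op_apply(int *word, uint32_t enc)
{
	int oparg = (int)((enc >> 12) & 0xfffu);
	int cmparg = (int)(enc & 0xfffu);
	int old, next, hit;

	assert(word != NULL);
	old = *word;

	switch ((enc >> 28) & 0xfu) {
	case WOP_SET:  next = oparg;        break;
	case WOP_ADD:  next = old + oparg;  break;
	case WOP_OR:   next = old | oparg;  break;
	case WOP_ANDN: next = old & ~oparg; break;
	case WOP_XOR:  next = old ^ oparg;  break;
	default:       return -1;
	}

	switch ((enc >> 24) & 0xfu) {
	case WCMP_EQ: hit = (old == cmparg); break;
	case WCMP_NE: hit = (old != cmparg); break;
	case WCMP_LT: hit = (old <  cmparg); break;
	case WCMP_LE: hit = (old <= cmparg); break;
	case WCMP_GT: hit = (old >  cmparg); break;
	case WCMP_GE: hit = (old >= cmparg); break;
	default:      return -1;
	}

	*word = next;	/* commit only once both fields were valid */
	return hit;
}
```

For the condvar case, "set the lock word to 0 and wake a lock waiter only
if the old value was greater than 1" would be
wake_op_encode(WOP_SET, 0, WCMP_GT, 1) in this model.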

It has been implemented just for ppc*, x86_64 and i?86, for other
architectures I'm including just a stub header which can be used as
a starting point by maintainers to write support for their arches
and ATM will just return -ENOSYS for FUTEX_WAKE_OP.  The requeue
patch has been (lightly) tested just on x86_64, the wake_op patch
on ppc64 kernel running 32-bit and 64-bit NPTL and x86_64 kernel running
32-bit and 64-bit NPTL.

With the following benchmark on UP x86-64 I get:

for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
time elf/ld.so --library-path .:nptl-orig /tmp/bench
real 0m0.655s user 0m0.253s sys 0m0.403s
real 0m0.657s user 0m0.269s sys 0m0.388s
time elf/ld.so --library-path .:nptl-requeue /tmp/bench
real 0m0.496s user 0m0.225s sys 0m0.271s
real 0m0.531s user 0m0.242s sys 0m0.288s
time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
real 0m0.380s user 0m0.176s sys 0m0.204s
real 0m0.382s user 0m0.175s sys 0m0.207s

The benchmark is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt1.txt
Older futex-requeue-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt2.txt
Older futex-wake_op-glibc.patch version is at:
http://sourceware.org/ml/libc-alpha/2005-03/txt3.txt
Will post a new version (just x86-64 fixes so that the patch
applies against pthread_cond_signal.S) to libc-hacker ml soon.

Attached is the kernel FUTEX_WAKE_OP patch as well as a simple-minded
testcase that will not test the atomicity of the operation, but at least
check if the threads that should have been woken up are woken up and
whether the arithmetic operation in the kernel gave the expected results.

Jakub
--- linux-2.6.12/include/linux/futex.h.jj   2005-06-17 21:48:29.0 
+0200
+++ linux-2.6.12/include/linux/futex.h  2005-08-23 11:11:41.0 +0200
@@ -4,14 +4,40 @@
 /* Second argument to futex syscall */
 
 
-#define FUTEX_WAIT (0)
-#define 

irq 11: nobody cared

2005-08-23 Thread Nigel Rantor


Hail,

I posted a report a while back, no answer.

Who should I be talking to wrt the "irq 11: nobody cared" issue?

I'm happy to provide as much info as possible but need to know what info 
is required.


I'm happily running 2.6.7, tried the latest and greatest (2.6.12) and 
found the problem, then started by looking at 2.6.8 and found the 
problem there too.


It happens on boot, is a showstopper and I'm wondering what, if anything
useful, I can provide you guys.


Throw me a bone...

  Nige



Re: debug a high load average

2005-08-23 Thread Erik Mouw
On Tue, Aug 23, 2005 at 04:38:36PM +0530, Rajesh wrote:
 I have a case occasionally when I copy data from a usb storage (ipod) to 
 my hard drive the load average goes up from 0.4 to about 15.0, and the 
 system becomes very unusable till I kill the cp command. I have checked 
 the CPU usage, bytes read from usb device, byte written to hard drive 
 etc, and all these values are low like CPU usage is at a maximum of 30%, 
 disk read bytes is at an average of 1.5 MiB/s, disk write bytes is at 
 1.5 MiB/s, number of processes is at 110, etc, during this high load.

1.5 MB/s suggests you're using an IDE drive in PIO mode. Switch to DMA
mode (hdparm -d 1 /dev/hda) and see if it gets any better.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


Re: IRQ problem with PCMCIA

2005-08-23 Thread Erik Mouw
On Tue, Aug 23, 2005 at 11:31:58AM +0100, Alan Cox wrote:
 On Maw, 2005-08-23 at 09:49 +0200, Erik Mouw wrote:
  Is there any place where we can get your current patches?
 
 Which ones - the PATA IDE ones are in 2.6.11-ac, a subset in Fedora
 (other changes in the core IDE code make forward porting stuff for
 hotplug really tricky past 2.6.11).

I know about those and have been using them on my laptop.

 The SATA ones I can certainly put up if there is interest. I don't want
 to put them somewhere too available yet because this right now is stuff
 you only want to use under controlled circumstances for development
 until both they and the core SATA layer have some improvements.

That's the one I'm interested in. Yes, I do understand it can erase all
my partitions, etc.


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


Re: Ext3 Errors on Dell RAID

2005-08-23 Thread Matt Domsch
On Tue, Aug 23, 2005 at 09:05:27AM -0400, Jess Balint wrote:
 Problem:
 I get massive ext3 errors once every few days. See errors on console
 section below. Almost all commands return I/O error. I have to power
 cycle the machine to get it running again. Upon reboot, there are
 usually 3 orphan inodes deleted and everything is fine. See messages
 on reboot below.
 
 Configuration:
 System: Dell PowerEdge 6300/500, 4 CPU SMP w/2GB memory
 Discs: 3 SCSI discs in a controller-managed striped configuration
 Controller: Dell PERC-2
 kernel messages in kernel boot messages below

This looks very familiar, and given the firmware versions you mention,
is probably a known issue.  The controller firmware goes to do a cache
flush, but that doesn't complete in a sane amount of time, and
eventually the SCSI midlayer starts aborting commands and taking the
file system offline.

I don't believe a firmware update was released for your add-in PERC2
quad-channel card.  Firmware 6091 was released for the PERC3/Di ROMBs
which addresses this exact case, though other failures have been
reported on [EMAIL PROTECTED] (subscribe and read archives at
http://lists.us.dell.com) even with newer firmware.

The workarounds include:
1) disable the read and write cache using afacli.
2) mount file systems using 'noatime'.
3) backup your data, replace the controller with something newer
(disks on the onboard aic7xxx controller combined with Linux Software
RAID works quite well), recreate your RAID array on the new
controller, and restore your data from backups.

Thanks,
Matt

-- 
Matt Domsch
Software Architect
Dell Linux Solutions linux.dell.com  www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com


Re: what does scsi sense means?

2005-08-23 Thread Erik Mouw
On Tue, Aug 23, 2005 at 05:07:12PM +0800, jeff shia wrote:
 in the file of aic7.c ,what is the function of the structure of
 scsi_sense?here what is the meaning of  sense?just like probe?

Sense data is the detailed status a device returns for a failed command.
Normally commands just succeed, but if one fails, you can get sense
information which tells you more about why that particular command failed.
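
As an illustration of what "sense information" looks like in practice,
here is a small standalone sketch that decodes fixed-format SCSI sense
data (response code 0x70 or 0x71) into its sense key and additional sense
code. It is not taken from the aic7xxx driver; the struct and function
names (sense_info, decode_fixed_sense, sense_key_name) are made up for
this example, and descriptor-format sense data is deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields most tools print. */
struct sense_info {
	uint8_t key;	/* sense key, e.g. 0x3 = MEDIUM ERROR */
	uint8_t asc;	/* additional sense code */
	uint8_t ascq;	/* additional sense code qualifier */
};

/*
 * Decode fixed-format sense data.  Returns 0 on success, -1 if the buffer
 * is too short or uses a different response code.
 */
int decode_fixed_sense(const uint8_t *buf, size_t len, struct sense_info *out)
{
	uint8_t code;

	assert(buf != NULL && out != NULL);
	if (len < 14)
		return -1;

	code = buf[0] & 0x7f;		/* bit 7 is the VALID flag */
	if (code != 0x70 && code != 0x71)
		return -1;

	out->key  = buf[2] & 0x0f;	/* low nibble of byte 2 */
	out->asc  = buf[12];
	out->ascq = buf[13];
	return 0;
}

/* Map a sense key to its conventional name. */
const char *sense_key_name(uint8_t key)
{
	static const char *const names[16] = {
		"NO SENSE", "RECOVERED ERROR", "NOT READY", "MEDIUM ERROR",
		"HARDWARE ERROR", "ILLEGAL REQUEST", "UNIT ATTENTION",
		"DATA PROTECT", "BLANK CHECK", "VENDOR SPECIFIC",
		"COPY ABORTED", "ABORTED COMMAND", "(obsolete)",
		"VOLUME OVERFLOW", "MISCOMPARE", "(reserved)"
	};
	return names[key & 0x0f];
}
```

A failed read from a bad sector would typically come back with sense key
MEDIUM ERROR and an additional sense code describing the read error.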


Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands


[PATCH] fix whitespace handling on sysfs attributes

2005-08-23 Thread Jon Smirl
The first version of this patch didn't allow for the request firmware
case which does multiple parsing passes on the parameter. This was
discussed in the thread '2.6.13-rc6-mm1'

gregkh-driver-sysfs-strip_leading_trailing_whitespace-3.patch
  should replace in 2.6.13-rc6-mm1
gregkh-driver-sysfs-strip_leading_trailing_whitespace.patch

Signed-off-by: Jon Smirl [EMAIL PROTECTED]

-- 
Jon Smirl
[EMAIL PROTECTED]
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -6,6 +6,7 @@
 #include <linux/fsnotify.h>
 #include <linux/kobject.h>
 #include <linux/namei.h>
+#include <linux/ctype.h>
 #include <asm/uaccess.h>
 #include <asm/semaphore.h>
 
@@ -207,8 +208,41 @@ flush_write_buffer(struct dentry * dentr
 	struct attribute * attr = to_attr(dentry);
 	struct kobject * kobj = to_kobj(dentry->d_parent);
 	struct sysfs_ops * ops = buffer->ops;
+	size_t ws_count = count, leading = 0;
+	int ret = 0;
+	char *x;
 
-	return ops->store(kobj,attr,buffer->page,count);
+	/* locate trailing white space */
+	while ((ws_count > 0) && isspace(buffer->page[ws_count - 1]))
+		ws_count--;
+	if (ws_count == 0)
+		return count;
+
+	/* locate leading white space */
+	x = buffer->page;
+	while (isspace(*x))
+		x++;
+	leading = x - buffer->page;
+	ws_count -= leading;
+
+	/* interface is still ambiguous about this */
+	/* string is both passed by length and terminated */
+	if (ws_count != PAGE_SIZE)
+		x[ws_count] = '\0';
+
+	ret = ops->store(kobj, attr, x, ws_count);
+
+	/* is it an error? */
+	if (ret < 0)
+		return ret;
+
+	/* the whole string was consumed */
+	if (ret == ws_count)
+		return count;
+
+	/* only part of the string was consumed */
+	/* return count can not include trailing space */
+	return leading + ret;
 }
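
To make the return-value bookkeeping of the patch easier to follow, here
is a userspace sketch of the same idea. It is only a model under stated
assumptions: store_fn, trimmed_store, accept_all and accept_digits are
hypothetical names, the store callback takes just a buffer and a length
instead of the kobject and attribute, and the buffer is assumed to have
room for a terminating NUL (the patch special-cases PAGE_SIZE instead).

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a sysfs store() callback. */
typedef long (*store_fn)(const char *buf, size_t len);

/* Sample callback: consumes everything it is given. */
long accept_all(const char *buf, size_t len)
{
	(void)buf;
	return (long)len;
}

/* Sample callback: consumes leading digits only, -EINVAL if there are none. */
long accept_digits(const char *buf, size_t len)
{
	size_t n = 0;

	while (n < len && isdigit((unsigned char)buf[n]))
		n++;
	return n ? (long)n : -EINVAL;
}

/*
 * Strip trailing and leading white space from buf[0..count), terminate the
 * trimmed string, pass it to store(), and translate the result back into a
 * byte count relative to the original buffer.
 * buf must have room for at least count + 1 bytes.
 */
long trimmed_store(char *buf, size_t count, store_fn store)
{
	size_t len = count, leading = 0;
	long ret;

	assert(buf != NULL && store != NULL);

	/* locate trailing white space */
	while (len > 0 && isspace((unsigned char)buf[len - 1]))
		len--;
	if (len == 0)
		return (long)count;	/* only white space: report it consumed */

	/* locate leading white space */
	while (leading < len && isspace((unsigned char)buf[leading]))
		leading++;
	len -= leading;

	buf[leading + len] = '\0';
	ret = store(buf + leading, len);

	if (ret < 0)
		return ret;		/* pass errors through unchanged */
	if ((size_t)ret == len)
		return (long)count;	/* whole string consumed */
	return (long)leading + ret;	/* partial: exclude trailing space */
}
```

With the input "  42\n" and a callback that accepts everything, the caller
sees all 5 bytes consumed even though the callback only saw "42".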
 
 


Re: [PATCH] mm: return ENOBUFS instead of ENOMEM in generic_file_buffered_write

2005-08-23 Thread Dmitry Torokhov
On 8/23/05, Christoph Hellwig [EMAIL PROTECTED] wrote:
 On Tue, Aug 23, 2005 at 11:46:33AM +0300, Pekka J Enberg wrote:
  As noticed by Dmitry Torokhov, write() can not return ENOMEM:
 
  http://www.opengroup.org/onlinepubs/95399/functions/write.html
 
  Therefore fixup generic_file_buffered_write() in mm/filemap.c (pointed out
  by Nathan Scott).
 
 We had this discussion before, for EACCES then.  We've always been returning
 more errnos than SuS mentioned and Linus declared it's fine.
 

So does that mean that any error code is allowed? I would love to be
able to return ENODEV from a sysfs attribute if its device happens to
be removed in the process. Is there a list of valid errnos for Linux that
supersedes SuS?

-- 
Dmitry


RE: CONFIG_PRINTK_TIME woes

2005-08-23 Thread Luck, Tony
 I'd hate to have to test for something for CONFIG_PRINTK_TIME
 every time sched_clock() is being called.

Me too.

 The quick fix would seem to be to only allow CONFIG_PRINTK_TIME
 from kernel cmdline to make it happen a bit later. So basically
 make int printk_time = 0 until command line is evaluated.

Good thought, but this won't work for ia64 in the hot-plug cpu case.
There are a couple of printk() calls by new cpus as they boot before
they have set-up their per-cpu areas.  So there is no global state
that can be checked to decide whether it is safe for printk() to
call sched_clock().
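
For readers following the thread, the "flag stays 0 until the command line
is evaluated" idea could be modelled in userspace roughly as below. This
is a sketch, not kernel code: printk_time_enabled, parse_cmdline and
format_line are hypothetical names, the "time" token is an assumption,
and, as explained above, a single global flag like this does not cover
the ia64 hot-plug cpu case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model: stays 0 until parse_cmdline() has run. */
int printk_time_enabled = 0;

/* Enable timestamps if the space-separated command line has a "time" token. */
void parse_cmdline(const char *cmdline)
{
	const char *p = cmdline;

	assert(cmdline != NULL);
	while (*p) {
		const char *start;

		while (*p == ' ')
			p++;
		start = p;
		while (*p && *p != ' ')
			p++;
		if (p - start == 4 && strncmp(start, "time", 4) == 0)
			printk_time_enabled = 1;
	}
}

/*
 * Format one log line.  The clock value (ns) is only consulted once the
 * flag has been set, so early callers never depend on it.
 */
void format_line(char *out, size_t n, unsigned long long ns, const char *msg)
{
	if (printk_time_enabled)
		snprintf(out, n, "[%5llu.%06llu] %s",
			 ns / 1000000000ULL, (ns % 1000000000ULL) / 1000ULL,
			 msg);
	else
		snprintf(out, n, "%s", msg);
}
```

Messages formatted before parse_cmdline() runs come out without a
timestamp; later ones get a seconds.microseconds prefix.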

-Tony


Re: [PATCH] Posix file attribute support on VFAT (take #2)

2005-08-23 Thread Lennart Sorensen
On Mon, Aug 22, 2005 at 01:46:29PM +0200, Pavel Machek wrote:
 Unfortunately, it makes sense. If you have compact flash card, you
 really want to have VFAT there, so that it is a) compatible with
 windows and b) so that you don't kill the hardware.

VFAT is plenty good at killing hardware.  It's a terrible filesystem for
flash cards (if they don't do their own wear leveling properly).  Most
of the linux filesystems may not be any better but they are also no
worse.  Windows compatibility is completely irrelevant if the card is
being used as your root filesystem since any extensions you make to vfat
wouldn't be understood by windows anyhow, so at best it makes a mess of
it.

 I guess being able to use CF card for root filesystem is useful,
 too

I run ext3 on CF and so far, no problems.  I run with noatime and try to
avoid writing in general as much as possible.  VFAT would be crap since,
well, I run linux on the system.

Len Sorensen


another Followup on 2.6.13-rc3 ACPI processor C-state regression

2005-08-23 Thread Daniel Nofftz
(It looks like my first try to send this message as a reply to the Followup
... didn't work. If it worked: sorry for double-post)

I use 2.6.13-rc6-mm1, which includes the patch as far as I can see, but
the C2 idle state (which my processor definitely supports) isn't
detected. It also isn't detected with 2.6.13-rc6 or 2.6.12.5, but it
definitely worked with some older 2.6.x kernel.

Is there any way to enforce using C2, so that the ACPI system uses C2
even if it is unable to detect that it is supported?

daniel
(please CC me, cause i am not on the list at the moment)

-- 
# Daniel Nofftz .. #





Re: PROBLEM: Incorrect RAM Detected at kernel init

2005-08-23 Thread Lennart Sorensen
On Sun, Aug 21, 2005 at 11:27:51PM -0400, Terry wrote:
 Not sure if I have provided enough info, or too much info, but here it goes:
 
 [1.] One line summary of the problem:
 Not Detecting all the memory installed in the system.
 
 [2.] Full description of the problem/report:
 I have Linux Kernel 2.4.31 running on a Compaq 5000R server with 2 PPro 200
 processors, 768M RAM, Realtek 8139 Network Card, and Compaq Smart 2 Raid
 controller with 5 9.1G drives in Raid 5 configuration.
 The kernel appears to compile perfectly, installs fine, but after reboot it
 is only reporting 16M of RAM. I have tried with and without the mem=768M
 boot up option in the lilo.conf script. All other modules and boot up
 includes appear to run perfectly fine. I had a 2.4.18 kernel running on this
 box just fine, detected all 768M of RAM and ran perfectly. The 2.4.31 Kernel
 runs almost perfectly, the only hold back is the false detection of memory.

Compaq machines of that era are known to use non-standard BIOS methods
for identifying RAM.  Do a Google search for how to pass memory maps to
2.6 kernels on a Compaq.

i.e. something like:

mem=exactmap [EMAIL PROTECTED] [EMAIL PROTECTED]

Add that to the kernel command line when booting and see what happens.

Len Sorensen


Re: [PATCH] FUTEX_WAKE_OP (pthread_cond_signal speedup)

2005-08-23 Thread Ingo Molnar

On Tue, 23 Aug 2005, Jakub Jelinek wrote:

 Hi!
 
 ATM pthread_cond_signal is unnecessarily slow, because it wakes one
 waiter (which at least on UP usually means an immediate context switch
 to one of the waiter threads).  This waiter wakes up and after a few
 instructions it attempts to acquire the cv internal lock, but that lock
 is still held by the thread calling pthread_cond_signal.  So it goes
 to sleep and eventually the signalling thread is scheduled in, unlocks
 the internal lock and wakes the waiter again.


 With the following benchmark on UP x86-64 I get:
 
 for i in nptl-orig nptl-requeue nptl-wake_op; do echo time elf/ld.so --library-path .:$i /tmp/bench; \
 for j in 1 2; do echo ( time elf/ld.so --library-path .:$i /tmp/bench ) 2>&1; done; done
 time elf/ld.so --library-path .:nptl-orig /tmp/bench
 real 0m0.655s user 0m0.253s sys 0m0.403s
 real 0m0.657s user 0m0.269s sys 0m0.388s
 time elf/ld.so --library-path .:nptl-requeue /tmp/bench
 real 0m0.496s user 0m0.225s sys 0m0.271s
 real 0m0.531s user 0m0.242s sys 0m0.288s
 time elf/ld.so --library-path .:nptl-wake_op /tmp/bench
 real 0m0.380s user 0m0.176s sys 0m0.204s
 real 0m0.382s user 0m0.175s sys 0m0.207s

translation: effective thread switching is now almost twice as fast with
the WAKE_OP extension of the futex interface. Cool!

a detail: many of the futex_atomic_op_inuser() implementations seem to be
duplicated across architectures. Might be worth putting into asm-generic,
to avoid the duplication?

Ingo


Re: [PATCH] FUTEX_WAKE_OP (pthread_cond_signal speedup)

2005-08-23 Thread Jakub Jelinek
On Tue, Aug 23, 2005 at 10:36:08AM -0400, Ingo Molnar wrote:
 a detail: many of the futex_atomic_op_inuser() seem to be duplicated
 across architectures. Might be worth putting into asm-generic, to avoid
 the duplication?

Those are stub files waiting for arch maintainers to actually implement
them, so they will eventually be different, but for the time being they
just return -ENOSYS, so that things compile.

Jakub


Re: 3com 3c59x stopped working with 2.6.13-rc[56]

2005-08-23 Thread Maciej Soltysiak
Hello

 I assume it worked OK in 2.6.12.
Yes, sorry, forgot to mention that.

 18:27:47: eth1: Setting full-duplex based on MII #24 link partner capability 
 of 05e1.
 18:32:02: NETDEV WATCHDOG: eth1: transmit timed out
 18:32:02: eth1: transmit timed out, tx_status 00 status e601.
 18:32:02:   diagnostics: net 0cfa media 8880 dma 003a fifo 8800
 18:32:02: eth1: Interrupt posted but not delivered -- IRQ blocked by another 
 device?

 gargh, I have acpi feelings.  Could you please
It seems you had a good hunch.

 a) Compare /proc/interrupts for 2.6.12 and 2.6.13-rc6
/proc/interrupts for 2.6.12:
   CPU0
  0:   76133896  XT-PIC  timer
  1:   1170  XT-PIC  i8042
  2:  0  XT-PIC  cascade
  9:  0  XT-PIC  acpi
 11:2483056  XT-PIC  eth1
 14: 603767  XT-PIC  ide0
 15: 13  XT-PIC  ide1
NMI:  0
ERR:  0

/proc/interrupts for 2.6.13:
   CPU0
  0: 851172  XT-PIC  timer
  1:802  XT-PIC  i8042
  2:  0  XT-PIC  cascade
  5:  0  XT-PIC  eth1
 14:  30180  XT-PIC  ide0
 15: 13  XT-PIC  ide1
NMI:  0
ERR:  0

What is missing is acpi on irq 9; eth1 has also moved from irq 11 to irq 5.

 b) Generate the boot-time dmesg output for 2.6.12 and 2.6.13-rc6
    (dmesg -s 100 > foo), then do
    diff -u dmesg-2.6.12 dmesg-2.6.13-rc6 > foo
    and send foo?
Here is foo:

--- dmesg-2.6.122005-08-23 12:53:43.0 +0200
+++ dmesg-2.6.132005-08-23 14:26:54.0 +0200
@@ -1,4 +1,4 @@
-Linux version 2.6.12 ([EMAIL PROTECTED]) (gcc version 4.0.1 (Debian 4.0.1-2)) 
#1 Mon Aug 22 14:49:40 CEST 2005
+Linux version 2.6.13-rc6-git13 ([EMAIL PROTECTED]) (gcc version 4.0.1 (Debian 
4.0.1-2)) #2 Mon Aug 22 15:22:10 CEST 2005
 BIOS-provided physical RAM map:
  BIOS-e820:  - 000a (usable)
  BIOS-e820: 000f - 0010 (reserved)
@@ -13,25 +13,20 @@
   Normal zone: 126956 pages, LIFO batch:31
   HighMem zone: 0 pages, LIFO batch:1
 DMI 2.3 present.
-ACPI: RSDP (v000 ASUS  ) @ 0x000f6c20
-ACPI: RSDT (v001 ASUS   A7V266-C 0x30303031 MSFT 0x31313031) @ 0x1ffec000
-ACPI: FADT (v001 ASUS   A7V266-C 0x30303031 MSFT 0x31313031) @ 0x1ffec080
-ACPI: BOOT (v001 ASUS   A7V266-C 0x30303031 MSFT 0x31313031) @ 0x1ffec040
-ACPI: DSDT (v001   ASUS A7V266-C 0x1000 MSFT 0x010b) @ 0x
 Allocating PCI resources starting at 2000 (gap: 2000:dfff)
 Built 1 zonelists
-Kernel command line: BOOT_IMAGE=Linux.old ro root=301 lapic pci=usepirqmask
+Kernel command line: auto BOOT_IMAGE=Linux ro root=301 lapic pci=usepirqmask
 Initializing CPU#0
-CPU 0 irqstacks, hard=c0442000 soft=c0441000
+CPU 0 irqstacks, hard=c041b000 soft=c041a000
 PID hash table entries: 2048 (order: 11, 32768 bytes)
 Detected 1210.984 MHz processor.
 Using tsc for high-res timesource
 Console: colour VGA+ 80x25
 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
 Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
-Memory: 515260k/524208k available (2114k kernel code, 8412k reserved, 801k data, 392k init, 0k highmem)
+Memory: 515424k/524208k available (2010k kernel code, 8252k reserved, 753k data, 388k init, 0k highmem)
 Checking if this processor honours the WP bit even in supervisor mode... Ok.
-Calibrating delay loop... 2383.87 BogoMIPS (lpj=1191936)
+Calibrating delay using timer specific routine.. 2424.59 BogoMIPS (lpj=1212295)
 Mount-cache hash table entries: 512
 CPU: After generic identify, caps: 0383f9ff c1cbf9ff   
  
 CPU: After vendor identify, caps: 0383f9ff c1cbf9ff    
 
@@ -40,65 +35,49 @@
 CPU: After all inits, caps: 0383f9ff c1cbf9ff  0020  
 
 Intel machine check architecture supported.
 Intel machine check reporting enabled on CPU#0.
+mtrr: v2.0 (20020519)
 CPU: AMD Duron(TM)Processor stepping 01
 Enabling fast FPU save and restore... done.
 Enabling unmasked SIMD FPU exception support... done.
 Checking 'hlt' instruction... OK.
-ACPI: setting ELCR to 0200 (from 0400)
 NET: Registered protocol family 16
 PCI: PCI BIOS revision 2.10 entry at 0xf0f00, last bus=1
 PCI: Using configuration type 1
-mtrr: v2.0 (20020519)
-ACPI: Subsystem revision 20050309
-ACPI: Interpreter enabled
-ACPI: Using PIC for interrupt routing
-ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
-ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
-ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
-ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
-ACPI: PCI Root Bridge [PCI0] (:00)
+PCI: Probing PCI hardware
 PCI: Probing PCI hardware (bus 00)
 Boot video device is :01:00.0
-ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
-ACPI: PCI Interrupt 

Re: irq 11: nobody cared

2005-08-23 Thread Daniel Drake

Nigel Rantor wrote:

Who should I be talking to wrt to the irq 11: nobody cared issue?

I'm happy to provide as much info as possible but need to know what info 
is required.


I'm happily running 2.6.7, tried the latest and greatest (2.6.12) and 
found the problem, then started by looking at 2.6.8 and found the 
problem there too.


Try 2.6.13-rc6 and if it still appears, try the new irqpoll boot option.

Daniel
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] pci: Block config access during BIST (resend)

2005-08-23 Thread Brian King
Brian King wrote:
 Greg KH wrote:
Here is an updated patch which will now fail writes to config space 
while the device is blocked. I have also fixed up the caching to return 
the correct data and tested it on both little endian and big endian 
machines.


Applied, thanks.

greg k-h

Greg,

This patch appears to have been dropped. Please apply.

Thanks


-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center

Some PCI adapters (eg. ipr scsi adapters) have an exposure today in that 
they issue BIST to the adapter to reset the card. If, during the time
it takes to complete BIST, userspace attempts to access PCI config space, 
the host bus bridge will master abort the access since the ipr adapter 
does not respond on the PCI bus for a brief period of time when running BIST. 
On PPC64 hardware, this master abort results in the host PCI bridge
isolating that PCI device from the rest of the system, making the device
unusable until Linux is rebooted. This patch is an attempt to close that
exposure by introducing some blocking code in the PCI code. When blocked,
writes will be humored and reads will return the cached value. Ben
Herrenschmidt has also mentioned that he plans to use this in PPC power
management.
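
As a rough illustration of the cached-read path described above, the sketch below shows in plain userspace C how a byte or word could be extracted from an array of saved 32-bit config dwords. The names cached_config, cached_read_byte, cached_read_word and SAVED_DWORDS are made up for this example and are not part of the patch; the shift arithmetic assumes the low-order byte of each saved dword corresponds to the lowest config offset.

```c
#include <assert.h>
#include <stdint.h>

#define SAVED_DWORDS 16  /* hypothetical: 64 bytes of saved config space */

/* Return the saved dword containing offset pos, shifted so that the
 * byte at pos ends up in the low-order bits. */
static uint32_t cached_config(const uint32_t saved[SAVED_DWORDS], int pos)
{
    uint32_t data;

    assert(pos >= 0 && pos < (int)(SAVED_DWORDS * sizeof(uint32_t)));
    data = saved[pos / sizeof(uint32_t)];
    data >>= (pos % sizeof(uint32_t)) * 8;
    return data;
}

/* Narrower reads simply truncate the shifted value. */
static uint8_t cached_read_byte(const uint32_t *saved, int pos)
{
    return (uint8_t)cached_config(saved, pos);
}

static uint16_t cached_read_word(const uint32_t *saved, int pos)
{
    return (uint16_t)cached_config(saved, pos);
}
```

The cast to the narrower type plays the same role as the (type) cast in the read macro of the patch: once the shift has moved the wanted byte to bit 0, truncation discards the rest of the dword.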

Signed-off-by: Brian King [EMAIL PROTECTED]
---

 linux-2.6-bjking1/drivers/pci/access.c|   86 ++
 linux-2.6-bjking1/drivers/pci/pci-sysfs.c |   20 +++---
 linux-2.6-bjking1/drivers/pci/pci.h   |7 ++
 linux-2.6-bjking1/drivers/pci/proc.c  |   28 -
 linux-2.6-bjking1/drivers/pci/syscall.c   |   14 ++--
 linux-2.6-bjking1/include/linux/pci.h |5 +
 6 files changed, 129 insertions(+), 31 deletions(-)

diff -puN drivers/pci/access.c~pci_block_user_config_io_during_bist_again drivers/pci/access.c
--- linux-2.6/drivers/pci/access.c~pci_block_user_config_io_during_bist_again	2005-08-22 17:00:21.0 -0500
+++ linux-2.6-bjking1/drivers/pci/access.c	2005-08-22 17:00:21.0 -0500
@@ -60,3 +60,89 @@ EXPORT_SYMBOL(pci_bus_read_config_dword)
 EXPORT_SYMBOL(pci_bus_write_config_byte);
 EXPORT_SYMBOL(pci_bus_write_config_word);
 EXPORT_SYMBOL(pci_bus_write_config_dword);
+
+static u32 pci_user_cached_config(struct pci_dev *dev, int pos)
+{
+	u32 data;
+
+	data = dev->saved_config_space[pos/sizeof(dev->saved_config_space[0])];
+	data >>= (pos % sizeof(dev->saved_config_space[0])) * 8;
+	return data;
+}
+
+#define PCI_USER_READ_CONFIG(size,type)					\
+int pci_user_read_config_##size						\
+	(struct pci_dev *dev, int pos, type *val)			\
+{									\
+	unsigned long flags;						\
+	int ret = 0;							\
+	u32 data = -1;							\
+	if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER;	\
+	spin_lock_irqsave(&pci_lock, flags);				\
+	if (likely(!dev->block_ucfg_access))				\
+		ret = dev->bus->ops->read(dev->bus, dev->devfn,		\
+					pos, sizeof(type), &data);	\
+	else if (pos < sizeof(dev->saved_config_space))			\
+		data = pci_user_cached_config(dev, pos);		\
+	spin_unlock_irqrestore(&pci_lock, flags);			\
+	*val = (type)data;						\
+	return ret;							\
+}
+
+#define PCI_USER_WRITE_CONFIG(size,type)				\
+int pci_user_write_config_##size					\
+	(struct pci_dev *dev, int pos, type val)			\
+{									\
+	unsigned long flags;						\
+	int ret = -EIO;							\
+	if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER;	\
+	spin_lock_irqsave(&pci_lock, flags);				\
+	if (likely(!dev->block_ucfg_access))				\
+		ret = dev->bus->ops->write(dev->bus, dev->devfn,	\
+					pos, sizeof(type), val);	\
+	spin_unlock_irqrestore(&pci_lock, flags);			\
+	return ret;							\
+}
+
+PCI_USER_READ_CONFIG(byte, u8)
+PCI_USER_READ_CONFIG(word, u16)
+PCI_USER_READ_CONFIG(dword, u32)
+PCI_USER_WRITE_CONFIG(byte, u8)
+PCI_USER_WRITE_CONFIG(word, u16)
+PCI_USER_WRITE_CONFIG(dword, u32)
+
+/**
+ * pci_block_user_cfg_access - Block userspace PCI config reads/writes
+ * @dev:   pci device struct
+ *
+ * This 

Re: [Samba] Re: New maintainer needed for the Linux smb filesystem

2005-08-23 Thread Gerald (Jerry) Carter

Ian Kent wrote:
 On Sun, 21 Aug 2005, Gerald (Jerry) Carter wrote:
 

Steven French wrote:
|
| We are close, but not quite ready to disable smbfs.

Steve,

I have been itching to work on some kernel code.
If you need someone just to keep things afloat,
I'd be happy to look into it.  There would be some
start up time of course.  If you would be willing to
help me navigate the things other than code, it
shouldn't be that big of a deal.
 
 I wouldn't mind helping out here either.  Perhaps a joint 
 effort Jerry?

That's fine by me.

Steve, I'll touch base with you on #samba-technical to work out
what to do first.  I know we have had a lot of reports
on https://bugzilla.samba.org/ that were originally closed
as invalid since we weren't supporting the kernel smbfs code
at that time.






cheers, jerry


[PATCH][RESEND] don't allow sys_readahead() on files opened with O_DIRECT

2005-08-23 Thread Jan Blunck
IMO sys_readahead() doesn't make sense if the file is opened with
O_DIRECT, because the page cache is stuffed but never used. Therefore
this patch changes that by letting the call return with -EINVAL.

Signed-off-by: Jan Blunck [EMAIL PROTECTED]

 mm/filemap.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

Index: experimental-jb/mm/filemap.c
===
--- experimental-jb.orig/mm/filemap.c
+++ experimental-jb/mm/filemap.c
@@ -,7 +,8 @@ static ssize_t
 do_readahead(struct address_space *mapping, struct file *filp,
 	 unsigned long index, unsigned long nr)
 {
-	if (!mapping || !mapping->a_ops || !mapping->a_ops->readpage)
+	if (!mapping || !mapping->a_ops || !mapping->a_ops->readpage
+	|| (filp->f_flags & O_DIRECT))
 		return -EINVAL;
 
 	force_page_cache_readahead(mapping, filp, index,


Re: [PATCH 2.6.13-rc6] cpu_exclusive sched domains on partial nodes temp fix

2005-08-23 Thread Dinakar Guniguntala
On Tue, Aug 23, 2005 at 01:04:27AM -0700, Paul Jackson wrote:
 If Dinakar, Hawkes and Nick concur (and no one else complains too
 loud) then the following should go into 2.6.13, to avoid the potential
 kernel oops that Hawkes reported in Dinakar's feature to allow user
 control of dynamic sched domain placement using cpu_exclusive cpusets.

I agree this is the way to go for 2.6.13 before we fix things the
right way for 2.6.14. Thanks for the patch Paul.

 This patch should allow proceeding with this new feature in 2.6.13 for
 the configurations in which it is useful (node aligned sched domains)
 while avoiding trying to setup sched domains in the less useful cases
 that can cause the kernel corruption and oops.
 

Dunno if it is something in my setup (4 CPU Power5 box with NUMA enabled)
but this patch causes some hard hangs when I run the attached script.
The same script runs for much longer with Ingo's changes but panics
as I had described earlier. I am still debugging what causes this.

-Dinakar




sd-stress.tar.gz
Description: GNU Zip compressed data


Re: [PATCH 2/2] ipr: Block config access during BIST (resend)

2005-08-23 Thread Brian King
Greg,

Please apply along with the previous pci patch.

Thanks

-- 
Brian King
eServer Storage I/O
IBM Linux Technology Center

IPR SCSI adapters have an exposure today in that they issue BIST to
the adapter to reset the card. If, during the time it takes to complete
BIST, userspace attempts to access PCI config space, the host bus bridge
will master abort the access since the ipr adapter does not respond on
the PCI bus for a brief period of time when running BIST. On PPC64
hardware, this master abort results in the host PCI bridge isolating
that PCI device from the rest of the system, making the device unusable
until Linux is rebooted. This patch makes use of some newly added PCI layer
APIs that allow for protection from userspace accessing config space
of a device in scenarios such as this.

Signed-off-by: Brian King [EMAIL PROTECTED]
---

 linux-2.6-bjking1/drivers/scsi/ipr.c |2 ++
 1 files changed, 2 insertions(+)

diff -puN drivers/scsi/ipr.c~ipr_block_user_config_io_during_bist drivers/scsi/ipr.c
--- linux-2.6/drivers/scsi/ipr.c~ipr_block_user_config_io_during_bist	2005-08-22 17:03:57.0 -0500
+++ linux-2.6-bjking1/drivers/scsi/ipr.c	2005-08-22 17:03:57.0 -0500
@@ -4944,6 +4944,7 @@ static int ipr_reset_restore_cfg_space(s
 	int rc;
 
 	ENTER;
+	pci_unblock_user_cfg_access(ioa_cfg->pdev);
 	rc = pci_restore_state(ioa_cfg->pdev);
 
 	if (rc != PCIBIOS_SUCCESSFUL) {
@@ -4998,6 +4999,7 @@ static int ipr_reset_start_bist(struct i
 	int rc;
 
 	ENTER;
+	pci_block_user_cfg_access(ioa_cfg->pdev);
 	rc = pci_write_config_byte(ioa_cfg->pdev, PCI_BIST, PCI_BIST_START);
 
 	if (rc != PCIBIOS_SUCCESSFUL) {
_


Re: [2.4.31] - USB device numbering in /proc/bus/usb

2005-08-23 Thread Sergey Vlasov
On Tue, 23 Aug 2005 15:14:38 +0200 Paul Rolland wrote:

 I've just rebooted a machine, and the eagle ADSL modem I was using,
 presented as /proc/bus/usb/002/005 in now presented as 
 /proc/bus/usb/002/003 (same bus, but device ID changed from 5 to 3).
 
 Is this an expected behavior, when running a 2.4.31 kernel ?

Yes.  Addresses for USB devices are assigned dynamically.  If you
disconnect the modem from USB and connect it again, its address will
change.

 I would have been expecting some more stability in the numbering across
 reboot, the same way IDE disks numbers are stable.

Use some other identifier which is stable - e.g., serial number of the
USB device (unfortunately, many devices don't have it).
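
One way to act on this advice is to look the device up by an identifier that does not depend on enumeration order, and only then read off whatever bus/device numbers it currently has. The sketch below is a minimal illustration, assuming the text follows the /proc/bus/usb/devices layout of "T:" topology lines followed by "P:" vendor/product lines; the function name find_usb_device and the IDs used with it are hypothetical, and a serial-number match on the "S:" lines could be added the same way for devices that report one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan text in the assumed /proc/bus/usb/devices layout and report the
 * current bus and device number of the first device whose vendor and
 * product IDs match.  Returns 0 on success, -1 if no device matches. */
static int find_usb_device(const char *text, unsigned vendor, unsigned product,
                           int *bus, int *dev)
{
    int cur_bus = -1, cur_dev = -1;
    const char *line = text;

    assert(text && bus && dev);
    while (*line) {
        const char *end = strchr(line, '\n');
        size_t full = end ? (size_t)(end - line) : strlen(line);
        size_t len = full < 255 ? full : 255;
        char buf[256];
        unsigned v, p;
        int b, d;

        /* Work on a NUL-terminated copy of the current line. */
        memcpy(buf, line, len);
        buf[len] = '\0';

        if (sscanf(buf, "T: Bus=%d Lev=%*d Prnt=%*d Port=%*d Cnt=%*d Dev#=%d",
                   &b, &d) == 2) {
            /* Topology line: remember where the next P: line lives. */
            cur_bus = b;
            cur_dev = d;
        } else if (sscanf(buf, "P: Vendor=%x ProdID=%x", &v, &p) == 2 &&
                   v == vendor && p == product && cur_bus >= 0) {
            *bus = cur_bus;
            *dev = cur_dev;
            return 0;
        }
        line += full + (end ? 1 : 0);
    }
    return -1;
}
```

A caller would read the whole file into a buffer at startup, call find_usb_device() with the IDs of the modem, and build the /proc/bus/usb/BBB/DDD path from the result instead of hard-coding it.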




Re: dnotify/inotify and vfs questions

2005-08-23 Thread Jamie Lokier
Asser Femø wrote:
 According to the fcntl manual you can cancel a notification by doing
 fcntl(fd, F_NOTIFY, 0) (ie. sending 0 as the notification mask), but
 looking in the kernel code fcntl_dirnotify() immediately calls
 dnotify_flush() without telling the vfs module about it. Is there a
 reason for this?  Otherwise I'd propose calling
 filp->f_op->dir_notify(filp, 0) at some point in this scenario.
 
 Regarding inotify, inotify_add_watch doesn't seem to pass on the request
 either, which works fine for local filesystem operations as they call
 fsnotify_* functions every time, but that isn't really feasible for
 filesystems like cifs because we'd have to request change notification
 on everything. Are there plans for implementing a mechanism to let vfs
 modules get watch requests too?

On a related note:

dnotify and inotify on local filesystems appear to be synchronous, in
the following rather useful sense:

If you have previously registered for inotify/dnotify events that will
catch a change to a file, and called stat() on the file, then the
following operation:

receive some request...
stat_info = stat(file)

may be replaced in userspace code with:

receive some request...
if (any_dnotify_or_inotify_events_pending) {
    read_dnotify_or_inotify_events();
    if (any_events_related_to(file)) {
        store_in_userspace_stat_cache(file, stat(file));
    }
}
stat_info = lookup_userspace_stat_cache(file);

Now that's a silly way to save one system call in the fast path by itself.

But when the stat_info is a prerequisite for validating cached data --
such as the contents of a file parsed into a data structure -- it can
save a lot of system calls and logical work.
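
For concreteness, here is a minimal single-file sketch of that pattern using inotify on a local filesystem. The struct and function names (stat_cache, stat_cache_init, cached_stat) are invented for this example, and it deliberately invalidates on any pending event rather than matching events to files, which is only adequate because one watch covers one file. The watch is registered before the first stat(), as the ordering above requires.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical one-entry userspace stat cache guarded by an inotify watch. */
struct stat_cache {
    int ifd;            /* non-blocking inotify descriptor */
    int wd;             /* watch descriptor for path */
    const char *path;   /* file being cached (caller keeps it alive) */
    struct stat st;     /* last stat() result */
    int valid;          /* non-zero while st can be trusted */
    int stat_calls;     /* how many real stat() calls were made */
};

static int stat_cache_init(struct stat_cache *c, const char *path)
{
    c->ifd = inotify_init1(IN_NONBLOCK);
    if (c->ifd < 0) {
        perror("inotify_init1");
        return -1;
    }
    c->wd = inotify_add_watch(c->ifd, path,
                              IN_MODIFY | IN_ATTRIB | IN_MOVE_SELF | IN_DELETE_SELF);
    if (c->wd < 0) {
        perror("inotify_add_watch");
        close(c->ifd);
        return -1;
    }
    c->path = path;
    c->valid = 0;
    c->stat_calls = 0;
    return 0;
}

/* Drain queued events; returns 1 if at least one event was pending. */
static int drain_events(int ifd)
{
    char buf[4096];
    int seen = 0;

    while (read(ifd, buf, sizeof(buf)) > 0)
        seen = 1;       /* loop ends on EAGAIN when the queue is empty */
    return seen;
}

/* Fast path: no pending events means the cached result is still good. */
static const struct stat *cached_stat(struct stat_cache *c)
{
    assert(c->ifd >= 0);
    if (drain_events(c->ifd))
        c->valid = 0;
    if (!c->valid) {
        if (stat(c->path, &c->st) < 0)
            return NULL;
        c->stat_calls++;
        c->valid = 1;
    }
    return &c->st;
}
```

Note that this toy version still makes one read() call per lookup to poll the queue, so it only pays off when one poll validates many cached entries at once.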

For example, an Apache-style path walk which checks for .htaccess, or
a Samba-style path walk which is checking for unsafe symbolic links,
can be reduced from say 20 system calls to zero using this method.

Pre-compiled or pre-parsed programs/scripts/templates/config-files
where all the source files used are prerequisites for invalidating a
cached compiled form, reduce from say 40 system calls to stat() all
the source files, to zero.  That's quite a saving.

It's not just reducing system calls.  The logical tests in userspace
are also skipped, if coded properly, facilitating very quick decisions
about things that depend on files which mostly don't change.
(Cascading structured cache prerequisites...mmm).

Remote dnotify/inotify doesn't _necessarily_ have this synchronous
property.  It may do in some cases, depending on the implementation
(this is subtle...).

So, it would be nice if there was a way to query this... rather than
the tedious method of testing the filesystem type and having a table
of known local filesystem types where it's safe to depend on this
property.  Alternatively, a way to specify at dnotify/inotify creation
time that synchronous notifications are required, and have the request
rejected if those can't be provided.

-- Jamie




kernel BUG at kernel/workqueue.c:104!

2005-08-23 Thread Karl Hiramoto

Hi, I get this a lot now when doing:  rmmod cp2101 io_edgeport

I try to do the rmmod because I lose communications on the USB to
RS-232 adapters.



Not sure if i did the ksymoops correctly but here it is:

# ./ksymoops
ksymoops 2.4.9 on i686 2.6.12-gentoo-r9.  Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.6.12-gentoo-r9/ (default)
-m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Error (regular_file): read_ksyms stat /proc/ksyms failed
./ksymoops: No such file or directory
No modules in ksyms, skipping objects
No ksyms, skipping lsmod
Reading Oops report from the terminal
[ cut here ]
kernel BUG at kernel/workqueue.c:104!
invalid operand:  [#1]
PREEMPT
Modules linked in: cp2101 io_edgeport ipv6 nfs lockd sunrpc ohci_hcd 
analog ns558 parport_pc parport pcspkr rtc nvidia via_rhine mii 
snd_via82xx gameport snd_ac97_codec snd_mpu401_uart snd_rawmidi 
i2c_viapro i2c_core ehci_hcd usbserial uhci_hcd lpclinux via_agp agpgart 
usbcore

CPU:0
EIP:0060:[c0125213]Tainted: P  VLI
EFLAGS: 00210213   (2.6.12-gentoo-r9)
EIP is at queue_work+0x73/0x80
eax: cc5e7948   ebx: cc5e7944   ecx: 0001   edx: d6de1000
esi: dffe7960   edi:    ebp: d6de1000   esp: d6de1eac
ds: 007b   es: 007b   ss: 0068
Process rmmod (pid: 11256, threadinfo=d6de1000 task=cc84e510)
Stack: 0002 cdf78000  ccfa1f20 d2f29df4 e086c1fe cc5e7000 
cdf78000
  0083 d2f29de0 e0979020 e0979040 e0917134 d2f29de0 d2f29de0 
d2f29df4
  d2f29e18 d2f29df4 c0308667 d2f29df4 c041d290 e0979040 e0979088 


Call Trace:
[e086c1fe] usb_serial_disconnect+0x8e/0xc0 [usbserial]
[e0917134] usb_unbind_interface+0x84/0x90 [usbcore]
[c0308667] device_release_driver+0x77/0x80
[c03086a0] driver_detach+0x30/0x40
[c0308b3c] bus_remove_driver+0x4c/0x90
[c03090f3] driver_unregister+0x13/0x30
[e0917227] usb_deregister+0x37/0x50 [usbcore]
[e097786f] cp2101_exit+0xf/0x1f [cp2101]
[c012ef67] sys_delete_module+0x167/0x1a0
[c0153941] sys_write+0x51/0x80
[c0102dc1] syscall_call+0x7/0xb
Code: d4 c1 fe ff b8 00 f0 ff ff 21 e0 8b 40 08 a8 08 75 12 89 f8 8b 5c 
24 08 8b 74 24 0c 8b 7c 24 10 83 c4 14 c3 e8 6f 5f 2c 00 eb e7 0f 0b 
68 00 2d 37 40 c0 eb b4 8d 76 00 83 ec 08 8b 44 24 0c 8b

6note: rmmod[11256] exited with preempt_count 1
scheduling while atomic: rmmod/0x1001/11256
[c03eb176] schedule+0x5f6/0x600
[c0142a9a] unmap_page_range+0x8a/0xb0
[c03eb9fc] cond_resched+0x2c/0x50
[c0142c68] unmap_vmas+0x1a8/0x200
[c01476b3] exit_mmap+0x83/0x170
[c0103ba0] do_invalid_op+0x0/0xd0
[c0112a87] mmput+0x37/0xb0
[c0117630] do_exit+0xb0/0x3d0
[c0103ba0] do_invalid_op+0x0/0xd0
[c01037db] die+0x18b/0x190
[c0103c4e] do_invalid_op+0xae/0xd0
[c030b3c6] pool_find_page+0x46/0x70
[c0125213] queue_work+0x73/0x80
[c030b46b] dma_pool_free+0x7b/0x112
[e091d580] urb_destroy+0x0/0x10 [usbcore]
[e091dbed] usb_start_wait_urb+0xcd/0xf0 [usbcore]
[e0966bf4] qh_destroy+0x54/0x80 [ehci_hcd]
[e0966ba0] qh_destroy+0x0/0x80 [ehci_hcd]
[c0299e1d] kref_put+0x3d/0xa0
[e096b444] ehci_endpoint_disable+0x124/0x172 [ehci_hcd]
[e0966ba0] qh_destroy+0x0/0x80 [ehci_hcd]
[c0102fdb] error_code+0x4f/0x54
[c0125213] queue_work+0x73/0x80
[e086c1fe] usb_serial_disconnect+0x8e/0xc0 [usbserial]
[e0917134] usb_unbind_interface+0x84/0x90 [usbcore]
[c0308667] device_release_driver+0x77/0x80
[c03086a0] driver_detach+0x30/0x40
[c0308b3c] bus_remove_driver+0x4c/0x90
[c03090f3] driver_unregister+0x13/0x30
[e0917227] usb_deregister+0x37/0x50 [usbcore]
[e097786f] cp2101_exit+0xf/0x1f [cp2101]
[c012ef67] sys_delete_module+0x167/0x1a0
[c0153941] sys_write+0x51/0x80
[c0102dc1] syscall_call+0x7/0xb
kernel BUG at kernel/workqueue.c:104!
invalid operand:  [#1]
CPU:0
EIP:0060:[c0125213]Tainted: P  VLI
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00210213   (2.6.12-gentoo-r9)
eax: cc5e7948   ebx: cc5e7944   ecx: 0001   edx: d6de1000
esi: dffe7960   edi:    ebp: d6de1000   esp: d6de1eac
ds: 007b   es: 007b   ss: 0068
Stack: 0002 cdf78000  ccfa1f20 d2f29df4 e086c1fe cc5e7000 
cdf78000
  0083 d2f29de0 e0979020 e0979040 e0917134 d2f29de0 d2f29de0 
d2f29df4
  d2f29e18 d2f29df4 c0308667 d2f29df4 c041d290 e0979040 e0979088 


Call Trace:
[e086c1fe] usb_serial_disconnect+0x8e/0xc0 [usbserial]
[e0917134] usb_unbind_interface+0x84/0x90 [usbcore]
[c0308667] device_release_driver+0x77/0x80
[c03086a0] driver_detach+0x30/0x40
[c0308b3c] bus_remove_driver+0x4c/0x90
[c03090f3] driver_unregister+0x13/0x30
[e0917227] 

Re: [2.4.31] - USB device numbering in /proc/bus/usb

2005-08-23 Thread Paul Rolland
Hello Sergey,

 Yes.  Addresses for USB devices are assigned dynamically.  If you
 disconnect the modem from USB and connect it again, its address will
 change.

The problem I have is that nothing changed on the machine except that 
I did a reboot. Nothing (USB device) added, nothing removed, so with
a stable hardware config, USB numbering should have stayed stable, IMHO.
 
  I would have been expecting some more stability in the 
 numbering across
  reboot, the same way IDE disks numbers are stable.
 
 Use some other identifier which is stable - e.g., serial number of the
 USB device (unfortunately, many devices don't have it).

Well yes, I'm going to try to convert to some other identifiers space
as this seems to be the only way to go.

Thanks for the confirmation,
Regards,
Paul
 



Re: 2.6.13-rc6-mm2

2005-08-23 Thread Daniel Ritz

 Yup, seems to be generally good...
 
 Noticed this in the log earlier tonight:
 
 Aug 23 19:44:51 tornado kernel: hub 5-0:1.0: port 1 disabled by hub (EMI?), 
 re-enabling...
 Aug 23 19:44:51 tornado kernel: usb 5-1: USB disconnect, address 2
 Aug 23 19:44:51 tornado kernel: drivers/usb/class/usblp.c: usblp0: removed
 Aug 23 19:44:51 tornado kernel: Unable to handle kernel NULL pointer 
 dereference at virtual address 0004
 Aug 23 19:44:51 tornado kernel:  printing eip:
 Aug 23 19:44:51 tornado kernel: c01ccef2
 Aug 23 19:44:51 tornado kernel: *pde = 
 Aug 23 19:44:51 tornado kernel: Oops:  [#1]
 Aug 23 19:44:51 tornado kernel: SMP
 Aug 23 19:44:51 tornado kernel: last sysfs file: 
 /devices/pci:00/:00:1f.3/i2c-0/name
 Aug 23 19:44:51 tornado kernel: Modules linked in: nfsd exportfs lockd eeprom 
 sunrpc ipv6 iptable_filter binfmt_misc reiser4 zlib_de
 flate zlib_inflate dm_mod video thermal processor fan button ac tpm_nsc 
 i2c_i801 sky2 e100 sr_mod
 Aug 23 19:44:51 tornado kernel: CPU:1
 Aug 23 19:44:51 tornado kernel: EIP:0060:[c01ccef2]Not tainted VLI
 Aug 23 19:44:51 tornado kernel: EFLAGS: 00010286   (2.6.13-rc6-mm2)
 Aug 23 19:44:51 tornado kernel: EIP is at _raw_spin_lock+0x7/0x73
 Aug 23 19:44:51 tornado kernel: eax:    ebx:    ecx: c1a60658 
edx: c1a63e24
 Aug 23 19:44:51 tornado kernel: esi:    edi: c0382400   ebp: f7c55e98 
esp: f7c55e90
 Aug 23 19:44:51 tornado kernel: ds: 007b   es: 007b   ss: 0068
 Aug 23 19:44:51 tornado kernel: Process khubd (pid: 109, threadinfo=f7c54000 
 task=c192b030)
 Aug 23 19:44:51 tornado kernel: Stack: f7c58a8c  f7c55ea0 c0312219 
 f7c55eb0 c030feb7 f7c58ae8 f7c58a48
 Aug 23 19:44:51 tornado kernel:f7c55ec4 c0217e73 f7c58a48 f7d134ec 
 0040 f7c55ed0 c0217ec0 f7c58a48
 Aug 23 19:44:51 tornado kernel:f7c55edc c0217814 f7c58a48 f7c55eec 
 c0216ad2 f7c58a48 f7c58a14 f7c55ef8
 Aug 23 19:44:51 tornado kernel: Call Trace:
 Aug 23 19:44:51 tornado kernel:  [c01039c3] show_stack+0x94/0xca
 Aug 23 19:44:51 tornado kernel:  [c0103b6c] show_registers+0x15a/0x1ea
 Aug 23 19:44:51 tornado kernel:  [c0103d8a] die+0x108/0x183
 Aug 23 19:44:51 tornado kernel:  [c031295a] do_page_fault+0x1ea/0x63d
 Aug 23 19:44:51 tornado kernel:  [c0103693] error_code+0x4f/0x54
 Aug 23 19:44:51 tornado kernel:  [c0312219] _spin_lock+0x8/0xa
 Aug 23 19:44:51 tornado kernel:  [c030feb7] klist_remove+0x10/0x2c
 Aug 23 19:44:51 tornado kernel:  [c0217e73] 
 __device_release_driver+0x41/0x65
 Aug 23 19:44:51 tornado kernel:  [c0217ec0] device_release_driver+0x29/0x39
 Aug 23 19:44:51 tornado kernel:  [c0217814] bus_remove_device+0x52/0x60
 Aug 23 19:44:51 tornado kernel:  [c0216ad2] device_del+0x2e/0x5d
 Aug 23 19:44:51 tornado kernel:  [c0216b0c] device_unregister+0xb/0x15
 Aug 23 19:44:51 tornado kernel:  [c0275d67] usb_disconnect+0x115/0x15c
 Aug 23 19:44:51 tornado kernel:  [c0276b85] 
 hub_port_connect_change+0x54/0x399
 Aug 23 19:44:51 tornado kernel:  [c027713e] hub_events+0x274/0x3b2
 Aug 23 19:44:51 tornado kernel:  [c0277296] hub_thread+0x1a/0xdf
 Aug 23 19:44:51 tornado kernel:  [c012fba7] kthread+0x99/0x9d
 Aug 23 19:44:51 tornado kernel:  [c01010b5] kernel_thread_helper+0x5/0xb
 Aug 23 19:44:51 tornado kernel: Code: 00 00 00 8b 0d a8 62 36 c0 e9 61 ff ff 
 ff f3 90 31 c0 86 07 84 c0 0f 8e 79 ff ff ff 83 c4 18 5
 b 5e 5f 5d c3 55 89 e5 56 53 89 c3 81 78 04 ad 4e ad de 75 2d be 00 e0 ff 
 ff 
 21 e6 8b 06 39 43 0c
 
this one is my fault, caused by driver-core-fix-bus_rescan_devices-race.patch
problem is that USB is directly messing with dev->driver and then calling
device_bind_driver() if the device is not already bound...
i think the correct solution would be a sane API here and disallow direct
messing with dev->driver...meanwhile the attached patch will do.

messing directly with dev->driver is especially bad if it's already set
to another driver. this leads to problems later in device_release_driver().

akpm: please replace driver-core-fix-bus_rescan_devices-race.patch with
the attached one.

rgds
-daniel

---
[PATCH] driver core: fix bus_rescan_devices() race.

bus_rescan_devices_helper() does not hold the dev->sem when it checks for
!dev->driver. device_attach() holds the sem, but calls device_bind_driver()
again even when dev->driver is set. what happens is that a first device_attach()
call (module insertion time) is on the way binding the device to a driver.
another thread calls bus_rescan_devices().  now when
bus_rescan_devices_helper() checks for dev->driver it is still NULL 'cos the
prior device_attach() is not yet finished. but as soon as the first one
releases the dev->sem the second device_attach() tries to rebind the already
bound device again. device_bind_driver() does this blindly which leads to a
corrupt driver->klist_devices list (the device links itself, the head points
to the device). later a call to device_release_driver() sets dev->driver to
NULL and breaks the link it has to 

RE: kernel module seg fault

2005-08-23 Thread manomugdha biswas
Hi,
This is the code where i am getting this problem. 

static byte4
VNICClientStart(unsigned long arg)
{
  VNICClientCfgCreateInfo_t          clientConfig;
  struct socket                     *sock        = NULL;
  ubyte4                             status      = 0;
  ubyte4                             retryCnt    = VNIC_CLIENT_MAX_CONN_RETRY_CNT;
  ubyte4                             ret         = 0;
  byte4                              len         = 0;
  struct net_device                 *dev         = NULL;
  VNICConnMap_t                     *connMap     = NULL;
  byte4                              error       = 0;
  VNICHdrForm_t                      vnicHdr;
  VNICVirtMirrIfaceAndServIPList_t  *ifaceIPNode = NULL;

  DECLARE_WAIT_QUEUE_HEAD(wq);
  init_waitqueue_head(&wq);

  EnterFunction(VNICClientStart);

  memset(&vnicHdr, 0, sizeof(vnicHdr));
  while (retryCnt) {
    --retryCnt;

    if (!retryCnt) {
      return VNIC_CLIENT_SERVER_RESPONSE_TIMEOUT;
    }

    /* wait for a short time */
    interruptible_sleep_on_timeout(&wq, 2);
  } /* end while (retryCnt) */

  LeaveFunction(VNICClientStart);
  return VNIC_CLIENT_SERVER_SUCCESS; /* for success */
} /* end VNICClientStart() */

I commented out all the other functionality of this
function to simplify it, but it still causes a
kernel panic.

This function is called when I invoke ioctl() from
my user application, and that is when the kernel panics.

Regards,
Manomugdha



--- [EMAIL PROTECTED] wrote:

 Hi Biswas,
 
 You need to post the complete kernel dump message
 and body of your
 source code.
 
 -Bunnan
  
 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On
 Behalf Of manomugdha
 biswas
 Sent: Tuesday, August 23, 2005 3:13 PM
 To: linux-kernel@vger.kernel.org
 Subject: kernel module seg fault
 
 Hi,
 I have written a kernel module and I can load
 (insmod)
 it without any error. But when i run my module it
 gets
 seg fault at interruptible_sleep_on_timeout();
 
 I have used this function in the following way:
 
 DECLARE_WAIT_QUEUE_HEAD(wq);
 init_waitqueue_head(&wq);
 interruptible_sleep_on_timeout(&wq, 2);
 
 I am using redhat version 9.0 and kernel version
 2.4.20-8.
 Could you please give some light on this issue?
 
 Manomugdha Biswas
 
 
   
 
   
   
 
 


Manomugdha Biswas









Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Oleg Nesterov
Thomas Gleixner wrote:

 On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote:
 
  kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after
  that nobody can access this timer, so we don't need to lock timer->it_lock
  at all in this case. No lock - no deadlock.

 It still deadlocks:

 CPU 0                           CPU 1
 write_lock(&tasklist_lock);
 __exit_signal()
                                 timer expires
                                 base->running_timer = timer
                                   send_group_sigqueue()
                                     read_lock(&tasklist_lock);
 exit_itimers()
   del_timer_sync(timer)
     waits for ever because      waits for ever on tasklist_lock
     base->running_timer == timer

Silly me.

 I still think the last patch I sent is still necessary.

Thomas, you know that I like this change in __exit_{signal,sighand},
but i think this change is dangerous, should go in a separate patch,
and needs a lot of testing. But the decision is up to Ingo and Roland.

I am looking at your previous patch:

 -       read_lock(&tasklist_lock);
 +retry:
 +       if (unlikely(p->flags & PF_EXITING))
 +               return -1;
 +
 +       if (unlikely(!read_trylock(&tasklist_lock))) {
 +               cpu_relax();
 +               goto retry;
 +       }
 +       if (unlikely(p->flags & PF_EXITING)) {
 +               ret = -1;
 +               goto out_err;

What do you think about this:

int try_to_lock_this_beep_tasklist_lock(struct task_struct *group_leader)
{
	while (unlikely(!read_trylock(&tasklist_lock))) {
		if (group_leader->flags & PF_EXITING) {
			smp_rmb();
			if (thread_group_empty(group_leader))
				return 0;
		}
		cpu_relax();
	}

	return 1;
}

No need to re-check after we got tasklist, the signal will be flushed.
I think it's better to move the locking into the posix_timer_event, btw.
In that case we can drop my patch.

What is your opinion, can it work?
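
For readers who want to try the control flow of the loop above outside the kernel, here is a minimal userspace sketch. It assumes a toy test-and-set lock built on C11 atomic_flag in place of tasklist_lock, and every name in it (toy_lock, toy_task, try_lock_unless_exiting) is hypothetical. The kernel's memory-ordering requirements (the smp_rmb() above) are not modelled faithfully; only the "spin on trylock, give up once the task is gone" shape is.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for tasklist_lock: a test-and-set lock. */
struct toy_lock {
	atomic_flag held;
};

/* Hypothetical stand-in for the two task properties the loop inspects. */
struct toy_task {
	atomic_bool exiting;      /* models PF_EXITING */
	atomic_bool group_empty;  /* models thread_group_empty() */
};

bool toy_trylock(struct toy_lock *l)
{
	/* test_and_set returns the previous value: false means we took it. */
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void toy_unlock(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Returns true with the lock held, or false (lock NOT held) once the
 * task is exiting and its thread group is empty.
 */
bool try_lock_unless_exiting(struct toy_lock *l, struct toy_task *t)
{
	assert(l != NULL && t != NULL);

	while (!toy_trylock(l)) {
		if (atomic_load(&t->exiting) && atomic_load(&t->group_empty))
			return false;
		/* The kernel loop calls cpu_relax() at this point. */
	}
	return true;
}
```

A caller that gets false must not touch the protected data and must not unlock; a caller that gets true unlocks with toy_unlock() when done.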

P.S.
 Thomas, thanks for explanation about posix-cpu-timers.

Oleg.


Re: irq 11: nobody cared

2005-08-23 Thread Jeff Garzik

Nigel Rantor wrote:


 Hail,

 I posted a report a while back, no answer.

 Who should I be talking to wrt the "irq 11: nobody cared" issue?

 I'm happy to provide as much info as possible but need to know what
 info is required.

 I'm happily running 2.6.7, tried the latest and greatest (2.6.12) and
 found the problem, then started by looking at 2.6.8 and found the
 problem there too.

 It happens on boot, is a showstopper and I'm wondering what, if
 anything useful, I can provide you guys.

 Throw me a bone...


Read REPORTING-BUGS.  We can't do much of anything with this report. 
Tell us what's on irq 11, for starters


Jeff





Re: some missing spin_unlocks

2005-08-23 Thread David S. Miller
From: Ted Unangst [EMAIL PROTECTED]
Date: Mon, 22 Aug 2005 15:26:47 -0700

 net/rose/rose_route.c rose_route_frame, line 998
 returns without unlocking rose_node_list_lock, rose_neigh_list_lock, or 
 rose_route_list_lock

I fixed this one with the patch below.

 net/rose/rose_timer.c rose_heartbeat_expiry, line 141
 rose_destroy_socket does not unlock sk as far as i can see

This one needs more care.  We can't drop the lock, because
the destroy actions need to be protected by that lock, but
we can't release the lock after rose_destroy_socket() because
the object may not even exist any longer.

The problem there, at the core, is that the timer doesn't
grab a reference to the socket, which would make the solution
to this bug very straight forward.

Someone should work on that :-)
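
One way to read the suggestion: if the timer held its own reference, the expiry handler could run the destroy actions under the lock, release the lock, and only then drop its reference, so the object cannot vanish before the unlock. The sketch below illustrates that ordering in plain userspace C. All names (toy_sock, toy_sock_hold, toy_sock_put, toy_timer_expiry) are hypothetical and are not the kernel's socket API; the counter is a plain int, whereas real code would need an atomic count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_sock {
	int  refcnt;   /* number of holders, including the timer */
	bool locked;   /* stands in for the socket lock */
	bool dead;     /* set once the destroy actions have run */
};

struct toy_sock *toy_sock_alloc(void)
{
	struct toy_sock *sk = calloc(1, sizeof(*sk));

	if (sk)
		sk->refcnt = 1;   /* the creator's reference */
	return sk;
}

/* Take an extra reference, e.g. just before arming the timer. */
void toy_sock_hold(struct toy_sock *sk)
{
	assert(sk != NULL && sk->refcnt > 0);
	sk->refcnt++;
}

/* Drop one reference; returns true if this call freed the object. */
bool toy_sock_put(struct toy_sock *sk)
{
	assert(sk != NULL && sk->refcnt > 0);
	if (--sk->refcnt > 0)
		return false;
	free(sk);
	return true;
}

/*
 * Timer expiry path. The timer was armed after toy_sock_hold(), so the
 * object is guaranteed to outlive the unlock below.
 */
bool toy_timer_expiry(struct toy_sock *sk)
{
	sk->locked = true;          /* lock */
	sk->dead = true;            /* destroy actions, protected by the lock */
	sk->locked = false;         /* safe: the timer's reference keeps sk alive */
	return toy_sock_put(sk);    /* drop the timer's reference last */
}
```

The design point is only the ordering inside toy_timer_expiry(): destroy, unlock, put.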

diff-tree 61ef36aa6cf356649863a24a850c2183cb762c61 (from daf53344fadaa8c47c6b0864e7f34efcbb66e391)
Author: David S. Miller [EMAIL PROTECTED]
Date:   Tue Aug 23 09:42:38 2005 -0700

[ROSE]: Fix missing unlocks in rose_route_frame()

Noticed by Coverity checker.

Signed-off-by: David S. Miller [EMAIL PROTECTED]

diff --git a/net/rose/rose_route.c b/net/rose/rose_route.c
--- a/net/rose/rose_route.c
+++ b/net/rose/rose_route.c
@@ -994,8 +994,10 @@ int rose_route_frame(struct sk_buff *skb
 *  1. The frame isn't for us,
 *  2. It isn't owned by any existing route.
 */
-   if (frametype != ROSE_CALL_REQUEST) /* XXX */
-   return 0;
+   if (frametype != ROSE_CALL_REQUEST) {   /* XXX */
+   ret = 0;
+   goto out;
+   }
 
 	len  = (((skb->data[3] >> 4) & 0x0F) + 1) / 2;
 	len += (((skb->data[3] >> 0) & 0x0F) + 1) / 2;
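
The patch above replaces an early return with a jump to a common exit label so that the locks are released on every path. The same shape, as a self-contained userspace sketch: toy_lock, toy_route_frame and TOY_CALL_REQUEST are hypothetical names chosen for illustration, and the "lock" is just a depth counter so the balance can be checked.

```c
#include <assert.h>

/* Hypothetical lock: depth 0 means unlocked, 1 means held. */
struct toy_lock {
	int depth;
};

void toy_lock_acquire(struct toy_lock *l)
{
	assert(l->depth == 0);
	l->depth++;
}

void toy_lock_release(struct toy_lock *l)
{
	assert(l->depth == 1);
	l->depth--;
}

#define TOY_CALL_REQUEST 0x0B   /* hypothetical frame type value */

/* Returns 1 if the frame was routed, 0 otherwise; never leaks the lock. */
int toy_route_frame(struct toy_lock *l, int frametype)
{
	int ret = 0;

	toy_lock_acquire(l);

	if (frametype != TOY_CALL_REQUEST)
		goto out;   /* early exit still passes through the unlock */

	ret = 1;            /* pretend the call request was routed */
out:
	toy_lock_release(l);
	return ret;
}
```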


Re: Linux AIO status todo

2005-08-23 Thread Benjamin LaHaise
On Tue, Aug 23, 2005 at 05:56:09AM -0400, Jakub Jelinek wrote:
 POSIX AIO needs to handle SIGEV_NONE, SIGEV_SIGNAL and SIGEV_THREAD
 notification.  Obviously kernel shouldn't create threads for SIGEV_THREAD
 itself, as kernel shouldn't hardcode all the implementation details how a
 thread can be created.  But it would be good if AIO signalling e.g. handled
 both SIGEV_SIGNAL and SIGEV_SIGNAL | SIGEV_THREAD_ID, with the same usage as
 e.g. timer_* syscalls.  If kernel makes sure SI_ASYNCIO si_code is set in
 the notification signal siginfos, glibc could even use just one helper
 thread for timer_*/[al]io_* and maybe in the future other SIGEV_THREAD 
 notification.

The signal patch from Sebastien should handle the SIGEV_foo.  The patch 
at http://www.kvack.org/~bcrl/patches/aio-2.6.13-rc6-B1/817_sigevent.diff 
has the latest changes from me and should do what is needed.

-ben


Re: some missing spin_unlocks

2005-08-23 Thread Arjan van de Ven

 This one needs more care.  We can't drop the lock, because
 the destroy actions need to be protected by that lock, but
 we can't release the lock after rose_destroy_socket() because
 the object may not even exist any longer.


does it matter? can ANYTHING be spinning on the lock? if not .. can we
just let the lock go poof and not unlock it... 



Re: 2.6.13-rc6-mm2 - fs/xfs/xfs*.c warnings

2005-08-23 Thread Damir Perisa
i'm compiling 2.6.13-rc6-mm2 atm and noticed that xfs is having lots of 
warnings while compiling. recently i switched to gcc 4.0.1 - maybe it's 
because of this.

details:

fs/xfs/xfs_acl.c: In function 'xfs_acl_access':
fs/xfs/xfs_acl.c:445: warning: 'matched.ae_perm' may be used uninitialized in this function

fs/xfs/xfs_alloc_btree.c: In function 'xfs_alloc_insrec':
fs/xfs/xfs_alloc_btree.c:622: warning: 'nrec.ar_startblock' may be used uninitialized in this function
fs/xfs/xfs_alloc_btree.c:622: warning: 'nrec.ar_blockcount' may be used uninitialized in this function

fs/xfs/xfs_bmap.c: In function 'xfs_bmap_alloc':
fs/xfs/xfs_bmap.c:2335: warning: 'rtx' is used uninitialized in this function

fs/xfs/xfs_dir2_sf.c: In function 'xfs_dir2_block_sfsize':
fs/xfs/xfs_dir2_sf.c:110: warning: 'parent' may be used uninitialized in this function

fs/xfs/xfs_dir_leaf.c: In function 'xfs_dir_leaf_to_shortform':
fs/xfs/xfs_dir_leaf.c:653: warning: 'parent' may be used uninitialized in this function

fs/xfs/xfs_ialloc_btree.c: In function 'xfs_inobt_insrec':
fs/xfs/xfs_ialloc_btree.c:750: warning: 'nrec.ir_free' is used uninitialized in this function
fs/xfs/xfs_ialloc_btree.c:750: warning: 'nrec.ir_freecount' is used uninitialized in this function
fs/xfs/xfs_ialloc_btree.c:567: warning: 'nrec.ir_startino' may be used uninitialized in this function

and the following warning appears a lot of times:

fs/xfs/xfs_bmap_btree.h:508:21: warning: __BIG_ENDIAN is not defined
fs/xfs/xfs_bmap_btree.h:626:21: warning: __BIG_ENDIAN is not defined

just giving a heads-up if somebody wants to clean this code. 

thanx + greetings,
Damir

On Tuesday 23 August 2005 06:30, Andrew Morton wrote:
| ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc6/2.6.13-rc6-mm2/
|
| - Various updates.  Nothing terribly noteworthy.
|
| - This kernel still spits a bunch of scheduling-while-atomic warnings
| from the scsi code.  Please ignore.
|

-- 
It is impossible for an optimist to be pleasantly surprised.




Re: 2.6.12 Performance problems

2005-08-23 Thread Danial Thom


--- Helge Hafting [EMAIL PROTECTED] wrote:

 Danial Thom wrote:

  --- Jesper Juhl [EMAIL PROTECTED] wrote:

   On 8/21/05, Danial Thom [EMAIL PROTECTED] wrote:

    I just started fiddling with 2.6.12, and there seems to be a big
    drop-off in performance from 2.4.x in terms of networking on a
    uniprocessor system. Just bridging packets through the machine,
    2.6.12 starts dropping packets at ~100Kpps, whereas 2.4.x doesn't
    start dropping until over 350Kpps on the same hardware (2.0Ghz
    Opteron with e1000 driver). This is pitiful performance for this
    hardware. I've increased the rx ring in the e1000 driver to 512
    with little change (interrupt moderation is set to 8000
    Ints/second). Has tuning for MP destroyed UP performance
    altogether, or is there some tuning parameter that could make a
    4-fold difference? All debugging is off and there are no messages
    on the console or in the error logs. The kernel is the standard
    kernel.org download config with SMP turned off and the intel
    ethernet card drivers as modules without any other changes, which
    is exactly the config for my 2.4 kernels.

   If you have preemption enabled you could disable it. Low latency
   comes at the cost of decreased throughput - can't have both. Also
   try using a HZ of 100 if you are currently using 1000, that should
   also improve throughput a little at the cost of slightly higher
   latencies.

   I doubt that it'll do any huge difference, but if it does, then
   that would probably be valuable info.

  Ok, well you'll have to explain this one:

  "Low latency comes at the cost of decreased throughput - can't have
  both"

 Configuring preempt gives lower latency, because then almost anything
 can be interrupted (preempted). You can then get very quick responses
 to some things, i.e. interrupts and such.

I think part of the problem is the continued
misuse of the word "latency". Latency, in
language terms, means "unexplained delay". It's
wrong here because, for one, it's explainable. But
it also depends on your perspective. The
"latency" is increased for kernel tasks, while it
may be reduced for something that is getting the
benefit of preempting the kernel. So you really
can't say "the price of reduced latency is lower
throughput", because that's simply backwards.
You've increased the kernel task's latency by
allowing it to be pre-empted. Reduced latency
implies higher efficiency. All you've done here
is shift the latency from one task to another, so
there is no reduction overall; in fact there is
probably a marginal increase due to the overhead
of pre-emption vs doing nothing.

DT







Re: 2.6.12 Performance problems

2005-08-23 Thread Patrick McHardy
Danial Thom wrote:
 I think part of the problem is the continued
 misuse of the word "latency". Latency, in
 language terms, means "unexplained delay". It's
 wrong here because, for one, it's explainable. But
 it also depends on your perspective. The
 "latency" is increased for kernel tasks, while it
 may be reduced for something that is getting the
 benefit of preempting the kernel. So you really
 can't say "the price of reduced latency is lower
 throughput", because that's simply backwards.
 You've increased the kernel task's latency by
 allowing it to be pre-empted. Reduced latency
 implies higher efficiency. All you've done here
 is shift the latency from one task to another, so
 there is no reduction overall; in fact there is
 probably a marginal increase due to the overhead
 of pre-emption vs doing nothing.

If, instead of complaining, you provided the information
I asked for two days ago, someone might actually be able
to help you.


Re: 2.6.13-rc6-mm2 - drivers/net/s2io.o failed building

2005-08-23 Thread Damir Perisa
2.6.13-rc6-mm2  failed building with this problem (gcc 4.0.1):

  CC [M]  drivers/net/s2io.o
In file included from drivers/net/s2io.c:65:
drivers/net/s2io.h: In function 'readq':
drivers/net/s2io.h:765: error: invalid lvalue in assignment
drivers/net/s2io.h:766: error: invalid lvalue in assignment
make[2]: *** [drivers/net/s2io.o] Error 1
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2
== ERROR: Build Failed.  Aborting...
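
The log does not show lines 765-766 of s2io.h. One construct that draws exactly this message from gcc 4.0 is an assignment whose left-hand side is a cast: earlier gcc releases accepted that as an extension, and gcc 4.0 removed it. Assuming that is the cause here (an assumption, the log does not confirm it), a readq-style helper can be written so that casts only ever appear on the value side. The sketch below is plain userspace C with hypothetical names (toy_readl, toy_readq) and an ordinary array standing in for device registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit register read (readl in the kernel). */
uint32_t toy_readl(const uint32_t *addr)
{
	assert(addr != NULL);
	return *addr;
}

/*
 * Build a 64-bit value from two 32-bit reads, low word first in memory.
 * Every assignment targets the plain variable `ret`; the cast is applied
 * to the value being read, never to the left-hand side.
 */
uint64_t toy_readq(const uint32_t *addr)
{
	uint64_t ret;

	ret = (uint64_t)toy_readl(addr + 1) << 32;   /* high half */
	ret |= toy_readl(addr);                      /* low half */
	return ret;
}
```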

greetings,
Damir

On Tuesday 23 August 2005 06:30, you wrote:
| - Various updates.  Nothing terribly noteworthy.
|
| - This kernel still spits a bunch of scheduling-while-atomic warnings
| from the scsi code.  Please ignore.

-- 
Never give in.  Never give in.  Never. Never. Never.
-- Winston Churchill




Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

So, I've been trying to use epoll.. on linux-2.6.11-6mdk


However, I'm getting segfaults because some pointers in places are 
getting set to low integer values (which didn't used to have those values).


The deal is that my application is multi-threaded, and I was wondering 
if epoll had issues if you use epoll_ctl while an epoll_wait is waiting 
or something like that.  I'm also compiling with -D_MULTI_THREADED.  I'm 
not new to threading, but am stumped at this point.


I'm not ruling out it being my code, but wanted to ask about epoll since 
it's so new.


Any ideas?

Thanks,
 Davy
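
On the threading question: as the epoll_wait(2) man page is currently worded, another thread may add a descriptor with epoll_ctl() while a thread is blocked in epoll_wait() on the same epoll instance, so that pattern is supported by itself. A separate thing worth ruling out, and this is purely a guess since the post shows no code, is that the data member of struct epoll_event is a union: writing data.fd after data.ptr overwrites the pointer, and on a 32-bit build the pointer then reads back as the small integer that was stored as the fd. A minimal sketch that stores only data.ptr and keeps the fd inside the pointed-to context follows; toy_conn, toy_watch and toy_wait_one are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection context; the fd travels inside it. */
struct toy_conn {
	int fd;
	int id;
};

/*
 * Register conn->fd for readability with conn as the user data.
 * Only data.ptr is written, because data.fd shares storage with it.
 * Returns 0 on success, -1 on error (as epoll_ctl does).
 */
int toy_watch(int epfd, struct toy_conn *conn)
{
	struct epoll_event ev = { 0 };

	assert(conn != NULL);
	ev.events = EPOLLIN;
	ev.data.ptr = conn;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, conn->fd, &ev);
}

/* Wait up to timeout_ms for one event; returns its context or NULL. */
struct toy_conn *toy_wait_one(int epfd, int timeout_ms)
{
	struct epoll_event ev;

	if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
		return NULL;
	return ev.data.ptr;
}
```

The context must stay allocated for as long as its descriptor is registered, since the kernel hands the same pointer back on every event.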


bert hubert wrote:

 On Fri, Jul 22, 2005 at 04:18:46PM -0500, Davy Durham wrote:

  Please forgive and redirect me if this is not the right place to ask
  this question:

  I'm looking to write a sort of messaging system that would take input
  from any number of entities that register with it.. it would then
  route the messages to outputs and so forth..

 Look at epoll, or libevent, which uses epoll to be quick in this
 scenario.




Console Resolution Support Request - 1024x480

2005-08-23 Thread ttye0

Hey,
After seeing many many posts and no solutions anywhere regarding having a 
full screen console on a Sony Vaio Picturebook with an ATI Rage Mobility 
video chip on a kernel anywhere near the current version~inhale~I 
finally made an attempt with a kernel-2.4.17 diff patch to manually change 
the necessary source code in order to make 1024x480 supported. If there is 
any info that you might need I might be able to get it for you but my 
experience is limited unfortunately. My attempt to fix this manually was a 
failure. I managed to find all the code and made all of the proper 
adjustments, but when I made the adjustments to 
../drivers/video/aty/mach64_ct.c there were problems with pll and mpostdiv 
being undefined. I attempted anxiously to work around this by leaving the 
code as it was originally or even replacing the necessary code to make it 
similar (by theory), but there was no success. It did compile, however after 
booting into the kernel with vga=0x301 the screen was terribly unreadable, 
mis-sized (large), and flashing. I wish I could manage this on my own, but 
I'm not capable and have made no progress. I'm surprised that no one has 
taken the older patch and implemented it into the kernel so this would not 
have been an issue except in 2.4.17 and earlier. I do know that if I use 
Windows XP on this machine, in order to get fullscreen usage I needed to use 
NeoMagic drivers (if you need these, contact me for the exact drivers that 
made mine work) when my video card is in fact an ATI Rage Mobility chip. 
Please, even if you can't help with this, give me some information that may 
lead to a positive outcome for any/all picturebook users. Thank you in 
advance!


Note: I attempted this fix on linux-2.4.28-r9 kernel.

If you would like to see my attempted patch I will show you that as well, it 
works fantastic on terms of patching but the code however does not work.

Here is the patch for the 2.4.17 kernel:


 Code:

 diff -Nur linux-2.4.17/drivers/video/Config.in linux/drivers/video/Config.in
 --- linux-2.4.17/drivers/video/Config.in   Thu Nov 15 10:16:31 2001
 +++ linux/drivers/video/Config.in   Fri Jan 11 16:13:37 2002
 @@ -135,6 +135,9 @@
  if [ "$CONFIG_FB_ATY" != "n" ]; then
     bool 'Mach64 GX support (EXPERIMENTAL)' CONFIG_FB_ATY_GX
     bool 'Mach64 CT/VT/GT/LT (incl. 3D RAGE) support' CONFIG_FB_ATY_CT
 +   if [ "$CONFIG_FB_ATY_CT" = "y" ]; then
 +      bool '  Sony Vaio C1VE 1024x480 LCD support' CONFIG_FB_ATY_CT_VAIO_LCD
 +   fi
  fi
  tristate '  ATI Radeon display support (EXPERIMENTAL)' CONFIG_FB_RADEON
  tristate '  ATI Rage128 display support (EXPERIMENTAL)' CONFIG_FB_ATY128
 diff -Nur linux-2.4.17/drivers/video/aty/atyfb_base.c linux/drivers/video/aty/atyfb_base.c
 --- linux-2.4.17/drivers/video/aty/atyfb_base.c   Fri Dec 21 21:37:11 2001
 +++ linux/drivers/video/aty/atyfb_base.c   Sat Dec 22 02:39:12 2001
 @@ -353,6 +353,7 @@

  /* 3D RAGE Mobility */
  { 0x4c4d, 0x4c4d, 0x00, 0x00, m64n_mob_p,   230,  50, M64F_GT | M64F_INTEGRATED | M64F_RESET_3D | M64F_GTB_DSP | M64F_MOBIL_BUS },
 +{ 0x4c52, 0x4c52, 0x00, 0x00, m64n_mob_p,   230,  40, M64F_GT | M64F_INTEGRATED | M64F_RESET_3D | M64F_GTB_DSP | M64F_MOBIL_BUS | M64F_MAGIC_POSTDIV | M64F_SDRAM_MAGIC_PLL | M64F_XL_DLL },
  { 0x4c4e, 0x4c4e, 0x00, 0x00, m64n_mob_a,   230,  50, M64F_GT | M64F_INTEGRATED | M64F_RESET_3D | M64F_GTB_DSP | M64F_MOBIL_BUS },

  #endif /* CONFIG_FB_ATY_CT */
  };
 @@ -423,7 +424,7 @@

  #endif /* defined(CONFIG_PPC) */

 -#if defined(CONFIG_PMAC_PBOOK) || defined(CONFIG_PMAC_BACKLIGHT)
 +#if defined(CONFIG_PMAC_PBOOK) || defined(CONFIG_PMAC_BACKLIGHT) || defined(CONFIG_FB_ATY_CT_VAIO_LCD)
  static void aty_st_lcd(int index, u32 val, const struct fb_info_aty *info)
  {
  unsigned long temp;
 @@ -445,7 +446,7 @@
  /* read the register value */
  return aty_ld_le32(LCD_DATA, info);
  }
 -#endif /* CONFIG_PMAC_PBOOK || CONFIG_PMAC_BACKLIGHT */
 +#endif /* CONFIG_PMAC_PBOOK || CONFIG_PMAC_BACKLIGHT || CONFIG_FB_ATY_CT_VAIO_LCD */


  /* - */


 @@ -1744,6 +1745,9 @@
  #if defined(CONFIG_PPC)
  int sense;
  #endif
 +#if defined(CONFIG_FB_ATY_CT_VAIO_LCD)
 +u32 pm, hs;
 +#endif
  u8 pll_ref_div;

  info->aty_cmap_regs = (struct aty_cmap_regs *)(info->ati_regbase+0xc0);

 @@ -2068,6 +2072,35 @@
 var = default_var;
  #endif /* !__sparc__ */
  #endif /* !CONFIG_PPC */
 +#if defined(CONFIG_FB_ATY_CT_VAIO_LCD)
 +   /* Power Management */
 +   pm=aty_ld_lcd(POWER_MANAGEMENT, info);
 +   pm=(pm & ~PWR_MGT_MODE_MASK) | PWR_MGT_MODE_PCI;
 +   pm|=PWR_MGT_ON;
 +   

Re: some missing spin_unlocks

2005-08-23 Thread David S. Miller
From: Arjan van de Ven [EMAIL PROTECTED]
Subject: Re: some missing spin_unlocks
Date: Tue, 23 Aug 2005 19:40:06 +0200

 On Tue, 2005-08-23 at 10:30 -0700, David S. Miller wrote:
  From: Arjan van de Ven [EMAIL PROTECTED]
  Date: Tue, 23 Aug 2005 18:54:03 +0200
  
   does it matter? can ANYTHING be spinning on the lock? if not .. can we
   just let the lock go poof and not unlock it... 
  
  I believe socket lookup can, otherwise the code is OK as-is.
 
 lookup while the object is in progress of being destroyed sounds really
 bad though

This happens all the time with TCP sockets, for example.
When we're trying to kill off a socket which is in time
wait state, the receive path can find it, grab a reference,
and process a packet against it right as we're trying to
kill it off.

This is completely normal.


Re: some missing spin_unlocks

2005-08-23 Thread Arjan van de Ven
On Tue, 2005-08-23 at 10:30 -0700, David S. Miller wrote:
 From: Arjan van de Ven [EMAIL PROTECTED]
 Date: Tue, 23 Aug 2005 18:54:03 +0200
 
  does it matter? can ANYTHING be spinning on the lock? if not .. can we
  just let the lock go poof and not unlock it... 
 
 I believe socket lookup can, otherwise the code is OK as-is.

lookup while the object is in progress of being destroyed sounds really
bad though




Re: CONFIG_PRINTK_TIME woes

2005-08-23 Thread Nick Piggin

David S. Miller wrote:

 This is a useful feature, please do not lobotomize it just because
 it's difficult to implement on ia64.  Just make a
 printk_get_timestamp_because_ia64_sucks() interface or something
 like that :-)


I was a bit unclear when I raised this issue. It is not just an
ia64 problem.

The sched_clock() interface is allowed to return wildly different
values depending on which CPU it is called from, and currently
has fundamental problems at least on i386 where it can go forwards
and backwards arbitrary amounts of time (due to frequency scaling,
if I understand correctly), and also needn't be exactly nanoseconds
at the best of times.

The interface is like this so it can be per-cpu and lockless and
as fast as possible for the scheduler heuristics (which aren't too
picky).

I just don't want its usage spreading outside kernel/sched.c if we
can help it. Pragmatically it sounds like the best thing we have
for printk at this time, however I hope we can come up with
something slightly more appropriate even if it ends up being slower.
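
As an illustration of the "more appropriate even if slower" trade-off, one option is to clamp each raw per-CPU reading against the newest value handed out so far, so that printed timestamps never run backwards. This is a sketch under stated assumptions, not what the kernel does: it uses C11 atomics in userspace, toy_stamp and last_stamp are hypothetical names, and the raw reading is passed in by the caller. The shared variable is exactly the cross-CPU cost that sched_clock() avoids.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Newest timestamp handed out so far, shared by all callers. */
static _Atomic uint64_t last_stamp;

/*
 * Return a timestamp that never decreases across calls, given a raw
 * clock reading that may jump backwards or differ between CPUs.
 */
uint64_t toy_stamp(uint64_t raw)
{
	uint64_t prev = atomic_load(&last_stamp);

	while (raw > prev) {
		/* On failure, prev is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&last_stamp, &prev, raw))
			return raw;
	}
	return prev;   /* raw was stale: reuse the newest value seen */
}
```

Stale readings collapse onto the previous value, so consecutive messages can carry identical timestamps; ordering is preserved but resolution is not.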



Re: [patch 2.6.13-rc6] i386: fix incorrect FP signal delivery

2005-08-23 Thread Chuck Ebbert
On Tue, 23 Aug 2005 02:20:07 +0200, Andi Kleen wrote:

 every reviewer has to look up all the bits in the manual?

 I fixed the test program too:

 Before patch:

$ ./fpsig
handler: signum = 8, errno = 0, code = 0 [unknown]
handler: fpu cwd = 0xb40, fpu swd = 0xbaa0
handler: i387 unmasked precision exception, rounded up

 After:

$ ./fpsig
handler: signum = 8, errno = 0, code = 6 [inexact result]
handler: fpu cwd = 0xb40, fpu swd = 0xbaa0
handler: i387 unmasked precision exception, rounded up

/* i387 fp signal test */

#define _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <signal.h>
#include <errno.h>

__attribute__ ((aligned(4096))) unsigned char altstack[4096];
unsigned short cw = 0x0b40; /* unmask all exceptions, round up */
struct sigaction sa;
stack_t ss = {
	.ss_sp   = &altstack[2047],
	.ss_size = sizeof(altstack)/2,
};

static void handler(int nr, siginfo_t *si, void *uc)
{
	char *decode;
	int code = si->si_code;
	unsigned short cwd = *(unsigned short *)&altstack[0xd84];
	unsigned short swd = *(unsigned short *)&altstack[0xd88];

	switch (code) {
		case FPE_INTDIV:
			decode = "divide by zero";
			break;
		case FPE_FLTRES:
			decode = "inexact result";
			break;
		case FPE_FLTINV:
			decode = "invalid operation";
			break;
		default:
			decode = "unknown";
			break;
	}
	printf("handler: signum = %d, errno = %d, code = %d [%s]\n",
		si->si_signo, si->si_errno, code, decode);
	printf("handler: fpu cwd = 0x%hx, fpu swd = 0x%hx\n", cwd, swd);
	if (swd & 0x20 & ~cwd)
		printf("handler: i387 unmasked precision exception, rounded %s\n",
			swd & 0x200 ? "up" : "down");
	exit(1);
}

int main(int argc, char * const argv[])
{
	sa.sa_sigaction = handler;
	sa.sa_flags = SA_ONSTACK | SA_SIGINFO;

	if (sigaltstack(&ss, 0))
		perror("sigaltstack");
	if (sigaction(SIGFPE, &sa, NULL))
		perror("sigaction");

	asm volatile ("fnclex ; fldcw %0" : : "m" (cw));
	asm volatile ( /*  st(1) = 3.0, st = 1.0  */
	    "fld1 ; fld1 ; faddp ; fld1 ; faddp ; fld1");
	asm volatile (
	    "fdivp ; fwait");  /*  1.0 / 3.0  */

	return 0;
}
__
Chuck


Re: 2.6.12 Performance problems

2005-08-23 Thread Sven-Thorsten Dietrich
On Tue, 2005-08-23 at 10:10 -0700, Danial Thom wrote:
 

  Ok, well you'll have to explain this one:
  
  Low latency comes at the cost of decreased
  throughput - can't have both

  
  Configuring preempt gives lower latency,
  because then
  almost anything can be interrupted (preempted).
   You can then
  get very quick responses to some things, i.e.
  interrupts and such.
 
 I think part of the problem is the continued
 misuse of the word latency. Latency, in
 language terms, means unexplained delay.

latency

n 
1: (computer science) the time it takes for a specific block of data on
a data track to rotate around to the read/write head [syn: rotational
latency] 
2: the time that elapses between a stimulus and the response to it [syn:
reaction time, response time, latent period] 
3: the state of being not yet evident or active

No apparent references to "unexplained" in association with the word
"latency".





