date:20080112

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Matthew Wilcox

On Sun, Jan 13, 2008 at 12:24:15AM -0700, Matthew Wilcox wrote:
> Here's a patch (on top of Ivan's) to improve things further.

Oops.  I forgot to check the ordering of mmconfig vs direct probing, so
that patch would end up just using mmconfig for everything.  Not what we
want.  Also, there are three places in mmconfig-shared that probe using
conf1 even though conf1 might have failed.  And if we're going to use
raw_pci_read() when conf1 might have failed and mmconf isn't set up yet,
we need to check raw_pci_ops in raw_pci_read().  Add the check to
raw_pci_write() too, just for symmetry.

I don't like it that mmconfig_32 prints a message and mmconfig_64
doesn't, but fixing that is not part of this patch.

Interdiff:

diff -u b/arch/x86/pci/common.c b/arch/x86/pci/common.c
--- b/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -31,7 +31,7 @@
 int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
int reg, int len, u32 *val)
 {
-   if (reg < 256)
+   if (reg < 256 && raw_pci_ops)
return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
if (raw_pci_ext_ops)
return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
@@ -41,7 +41,7 @@
 int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn,
int reg, int len, u32 val)
 {
-   if (reg < 256)
+   if (reg < 256 && raw_pci_ops)
return raw_pci_ops->write(domain, bus, devfn, reg, len, val);
if (raw_pci_ext_ops)
return raw_pci_ext_ops->write(domain, bus, devfn, reg, len, val);
diff -u b/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
--- b/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -28,7 +28,7 @@
 static const char __init *pci_mmcfg_e7520(void)
 {
u32 win;
-   pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
+   raw_pci_read(0, 0, PCI_DEVFN(0,0), 0xce, 2, &win);
 
win = win & 0xf000;
if(win == 0x0000 || win == 0xf000)
@@ -53,7 +53,7 @@
 
pci_mmcfg_config_num = 1;
 
-   pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
+   raw_pci_read(0, 0, PCI_DEVFN(0,0), 0x48, 4, &pciexbar);
 
/* Enable bit */
if (!(pciexbar & 1))
@@ -118,7 +118,7 @@
int i;
const char *name;
 
-   pci_direct_conf1.read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
+   raw_pci_read(0, 0, PCI_DEVFN(0,0), 0, 4, &l);
vendor = l & 0xffff;
device = (l >> 16) & 0xffff;
 
diff -u b/arch/x86/pci/mmconfig_32.c b/arch/x86/pci/mmconfig_32.c
--- b/arch/x86/pci/mmconfig_32.c
+++ b/arch/x86/pci/mmconfig_32.c
@@ -132,8 +132,10 @@
 
 int __init pci_mmcfg_arch_init(void)
 {
-   printk(KERN_INFO "PCI: Using MMCONFIG\n");
-   raw_pci_ops = &pci_mmcfg;
+   printk(KERN_INFO "PCI: Using MMCONFIG for %s config space\n",
+   raw_pci_ops ? "extended" : "all");
+   if (!raw_pci_ops)
+   raw_pci_ops = &pci_mmcfg;
raw_pci_ext_ops = &pci_mmcfg;
return 1;
 }
diff -u b/arch/x86/pci/mmconfig_64.c b/arch/x86/pci/mmconfig_64.c
--- b/arch/x86/pci/mmconfig_64.c
+++ b/arch/x86/pci/mmconfig_64.c
@@ -144,7 +144,8 @@
return 0;
}
}
-   raw_pci_ops = &pci_mmcfg;
+   if (!raw_pci_ops)
+   raw_pci_ops = &pci_mmcfg;
raw_pci_ext_ops = &pci_mmcfg;
return 1;
 }
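
A minimal user-space sketch of the dispatch after this interdiff, for
readers who want to see the fallthrough in isolation.  The types are
stand-ins (only the read hook is modelled), the fake accessors are
hypothetical, and the -EINVAL return for the "nothing set up yet" case is
an assumption, since the diff context above does not show that line:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct pci_raw_ops; only read is modelled. */
struct pci_raw_ops {
	int (*read)(unsigned int domain, unsigned int bus, unsigned int devfn,
		    int reg, int len, uint32_t *val);
};

/* Either pointer may still be NULL early in boot. */
struct pci_raw_ops *raw_pci_ops;	/* basic accessor for the first 256 bytes */
struct pci_raw_ops *raw_pci_ext_ops;	/* mmconfig, covers all 4096 bytes */

int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn,
		 int reg, int len, uint32_t *val)
{
	/* The first 256 bytes go to the basic accessor, if one was found. */
	if (reg < 256 && raw_pci_ops)
		return raw_pci_ops->read(domain, bus, devfn, reg, len, val);
	/* Otherwise fall through to the extended accessor. */
	if (raw_pci_ext_ops)
		return raw_pci_ext_ops->read(domain, bus, devfn, reg, len, val);
	return -EINVAL;	/* assumed error code when nothing is set up */
}

/* Hypothetical accessors; the value written shows which path was taken. */
int fake_conf1_read(unsigned int domain, unsigned int bus, unsigned int devfn,
		    int reg, int len, uint32_t *val)
{
	*val = 1;
	return 0;
}

int fake_mmcfg_read(unsigned int domain, unsigned int bus, unsigned int devfn,
		    int reg, int len, uint32_t *val)
{
	*val = 2;
	return 0;
}

struct pci_raw_ops fake_conf1 = { .read = fake_conf1_read };
struct pci_raw_ops fake_mmcfg = { .read = fake_mmcfg_read };

/* Models the arch init hunks: mmconfig only takes over the basic slot
 * when no other accessor was installed before it. */
void mmcfg_arch_init_model(void)
{
	if (!raw_pci_ops)
		raw_pci_ops = &fake_mmcfg;
	raw_pci_ext_ops = &fake_mmcfg;
}
```

With conf1 installed first, registers below 256 keep going through conf1
and only the extended range uses mmconfig, which is the ordering the
first version of the patch got wrong.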

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

2.6.24-rc6-mm1 - oddness with IPv4/v6 mapped sockets hanging...

2008-01-12 Thread Valdis . Kletnieks

I'm seeing problems with Sendmail on 24-rc6-mm1, where the main Sendmail is
listening on ::1/25, and Fetchmail connects to 127.0.0.1:25 to inject mail it
has just fetched from an outside server via IMAP - it will often just hang and
not make any further progress. Looking at netstat shows something interesting:

% netstat -n -a -A inet | grep 25
tcp        0   5108 127.0.0.1:59355         127.0.0.1:25            ESTABLISHED
% netstat -n -a -A inet6 | grep 25
tcp        0      0 :::25                   :::*                    LISTEN
tcp        0      0 ::ffff:127.0.0.1:25     ::ffff:127.0.0.1:59355  ESTABLISHED
% netstat -n -a -A inet | grep 25
tcp        0   5108 127.0.0.1:59355         127.0.0.1:25            ESTABLISHED
% netstat -n -a -A inet6 | grep 25
tcp        0      0 :::25                   :::*                    LISTEN
tcp        0      0 ::ffff:127.0.0.1:25     ::ffff:127.0.0.1:59355  ESTABLISHED
% netstat -n -a -A inet | grep 25
tcp        0   5108 127.0.0.1:59355         127.0.0.1:25            ESTABLISHED
% netstat -n -a -A inet6 | grep 25
tcp        0      0 :::25                   :::*                    LISTEN
tcp        0      0 ::ffff:127.0.0.1:25     ::ffff:127.0.0.1:59355  ESTABLISHED

On the IPv4 side, it thinks it's got 5108 bytes in the send queue - but on
the IPv6 side of that same connection, it's showing 0 in the receive queue,
and we're stuck there.
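
For anyone trying to reproduce this: the inet6 end of a 127.0.0.1
connection is an IPv4-mapped address of the form ::ffff:a.b.c.d.  A small
sketch, using only standard socket library calls, for checking whether an
address string is mapped and pulling out the embedded IPv4 address (the
function names are made up for the example):

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Returns 1 if text is an IPv4-mapped IPv6 address, 0 if it is some
 * other IPv6 address, -1 if it does not parse as IPv6 at all. */
int is_v4_mapped(const char *text)
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return -1;
	return IN6_IS_ADDR_V4MAPPED(&a) ? 1 : 0;
}

/* Writes the embedded IPv4 address in dotted-quad form into out.
 * Returns 0 on success, -1 if text is not an IPv4-mapped address. */
int mapped_to_v4(const char *text, char *out, socklen_t outlen)
{
	struct in6_addr a;
	struct in_addr v4;

	if (inet_pton(AF_INET6, text, &a) != 1 || !IN6_IS_ADDR_V4MAPPED(&a))
		return -1;
	/* The last four bytes of a mapped address hold the IPv4 address. */
	memcpy(&v4, &a.s6_addr[12], sizeof(v4));
	return inet_ntop(AF_INET, &v4, out, outlen) ? 0 : -1;
}
```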

It's not consistent - sometimes Fetchmail will wedge on the very first mail,
and do so several times in a row.  Other times, it will do well for a while -
at the moment, it's gone through 471 of the 1,470 currently queued mails just
fine, only to get wedged again on number 472.

For what it's worth, here's what 'echo w > /proc/sysrq-trigger' got, although I
don't see anything that looks odd to me given the netstat output above -
fetchmail has sent data and is waiting for a response back, and sendmail is
waiting for data to arrive:

fetchmail S 8053c520  5360 17612   9902
 81007d37bb08 0086  000200d0
 81006bf826c0 80687360 81006bf82918 0001
 0003 81007d37bb88  
Call Trace:
 [] schedule_timeout+0x22/0xb4
 [] _spin_lock_bh+0x11/0x38
 [] _spin_unlock_bh+0x1e/0x20
 [] release_sock+0xa3/0xac
 [] sk_wait_data+0x8a/0xcf
 [] autoremove_wake_function+0x0/0x38
 [] tcp_recvmsg+0x35a/0x86b
 [] sock_common_recvmsg+0x32/0x47
 [] selinux_socket_recvmsg+0x1d/0x1f
 [] sock_recvmsg+0x10e/0x12f
 [] autoremove_wake_function+0x0/0x38
 [] avc_has_perm+0x4c/0x5e
 [] pty_write+0x3a/0x44
 [] remove_wait_queue+0x2f/0x3b
 [] sys_recvfrom+0xa4/0xf5
 [] hrtimer_start+0x11f/0x131
 [] do_setitimer+0x184/0x326
 [] system_call_after_swapgs+0x7b/0x80

sendmail  S 81007d30a400  5360 17613  16992
 81006bc419e8 0086 81006bc41998 8023f6a5
 81007d30a400 81007d24f200 81007d30a658 00010286
 81006bc419e8 8023f851 4789b768 81000100eb20
Call Trace:
 [] lock_timer_base+0x26/0x4a
 [] __mod_timer+0xc4/0xd6
 [] schedule_timeout+0x8d/0xb4
 [] process_timeout+0x0/0xb
 [] schedule_timeout+0x88/0xb4
 [] do_select+0x4a9/0x50b
 [] __pollwait+0x0/0xdf
 [] default_wake_function+0x0/0xf
 [] _spin_lock_bh+0x11/0x38
 [] lock_sock_nested+0xa5/0xb2
 [] _spin_lock_bh+0x11/0x38
 [] _spin_unlock_bh+0x1e/0x20
 [] release_sock+0xa3/0xac
 [] tcp_recvmsg+0x759/0x86b
 [] sock_common_recvmsg+0x32/0x47
 [] selinux_socket_recvmsg+0x1d/0x1f
 [] sock_aio_read+0x121/0x139
 [] avc_has_perm+0x4c/0x5e
 [] core_sys_select+0x1f2/0x2a0
 [] page_add_new_anon_rmap+0x20/0x22
 [] file_has_perm+0xa5/0xb4
 [] autoremove_wake_function+0x0/0x38
 [] sys_select+0x150/0x17b
 [] system_call_after_swapgs+0x7b/0x80

Any ideas?



Re: [PATCH 0/4] cpuinitconst and devinitconst

2008-01-12 Thread Sam Ravnborg

>  
> +#ifdef CONFIG_HOTPLUG
> +#define DEV_KEEP(sec)
> +#define DEV_DISCARD(sec) *(.dev##sec)
> +#else
> +#define DEV_KEEP(sec)*(.dev##sec)
> +#define DEV_DISCARD(sec)
> +#endif
> +
> +#ifdef CONFIG_HOTPLUG_CPU
> +#define CPU_KEEP(sec)
> +#define CPU_DISCARD(sec) *(.cpu##sec)
> +#else
> +#define CPU_KEEP(sec)*(.cpu##sec)
> +#define CPU_DISCARD(sec)
> +#endif
> +
> +#if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_ACPI_HOTPLUG_MEMORY) \
> +|| defined(CONFIG_ACPI_HOTPLUG_MEMORY_MODULE)
> +#define MEM_KEEP(sec)
> +#define MEM_DISCARD(sec) *(.mem##sec)
> +#else
> +#define MEM_KEEP(sec)*(.mem##sec)
> +#define MEM_DISCARD(sec)
> +#endif

I had the sense inverted in the #ifdefs above.
And I found another small buglet too. I hope to post a complete
solution later today.
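
With the sense corrected, the hotplug case keeps the section and the
non-hotplug case discards it.  A user-space sketch of just the DEV_ pair:
the macros here expand to strings instead of linker-script input so the
result can be inspected, CONFIG_HOTPLUG is defined by hand, and the two
helper functions are made up for the example:

```c
/* Assumption for the example: pretend the kernel has hotplug enabled. */
#define CONFIG_HOTPLUG 1

#ifdef CONFIG_HOTPLUG
/* Hotplug: device init code may run again after boot, so keep it. */
#define DEV_KEEP(sec)    "*(.dev" #sec ")"
#define DEV_DISCARD(sec) ""
#else
/* No hotplug: device init code is only needed once, so discard it. */
#define DEV_KEEP(sec)    ""
#define DEV_DISCARD(sec) "*(.dev" #sec ")"
#endif

/* What each macro contributes for the init.text section. */
const char *dev_keep_init_text(void)
{
	return DEV_KEEP(init.text);
}

const char *dev_discard_init_text(void)
{
	return DEV_DISCARD(init.text);
}
```

The CPU_ and MEM_ pairs would follow the same pattern with their own
config symbols.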

Sam

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Matthew Wilcox

On Sun, Jan 13, 2008 at 06:08:05PM +1100, Benjamin Herrenschmidt wrote:
> On Sat, 2008-01-12 at 17:40 +0300, Ivan Kokshaysky wrote:
> > Actually I'm strongly against Arjan's patch. First, it's based on
> > assumption that the MMCONFIG thing is sort of fundamentally broken
> > on some systems, but none of the facts we have so far does confirm
> > that.
> > And second, I really don't like the implementation as it breaks all
> > non-x86 arches (or forces them to add a set of totally meaningless
> > PCI functions).
> 
> I agree, I quite dislike it too. Even if the breakage on x86 makes us
> want to totally disable it there, it can be done within the existing PCI
> ops I believe.
> 
> I think Arjan's problem is to try to do it per-device since the
> "standard" PCI ops don't get a pci_dev structure (for obvious reasons).

Here's a patch (on top of Ivan's) to improve things further.

One of Arjan's big problems with Ivan's patch is the hardcoding of conf1
as the fallback.  So I took an idea from Arjan's patch, crossed it
with an idea of my own and came up with this.  It gets rid of the
raw_pci_ops as a generic idea, and makes it private to the x86 arch.
It also makes the whole select-which-ops private to the x86 arch without
touching the pci layer at all.

Only compile-tested on x86-64.

diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 488e48a..ffaf02b 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -43,8 +43,7 @@
 #define PCI_SAL_EXT_ADDRESS(seg, bus, devfn, reg)  \
(((u64) seg << 28) | (bus << 20) | (devfn << 12) | (reg))
 
-static int
-pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_read(unsigned int seg, unsigned int bus, unsigned int devfn,
  int reg, int len, u32 *value)
 {
u64 addr, data = 0;
@@ -68,8 +67,7 @@ pci_sal_read (unsigned int seg, unsigned int bus, unsigned int devfn,
return 0;
 }
 
-static int
-pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
+int raw_pci_write(unsigned int seg, unsigned int bus, unsigned int devfn,
   int reg, int len, u32 value)
 {
u64 addr;
@@ -91,24 +89,17 @@ pci_sal_write (unsigned int seg, unsigned int bus, unsigned int devfn,
return 0;
 }
 
-static struct pci_raw_ops pci_sal_ops = {
-   .read = pci_sal_read,
-   .write =pci_sal_write
-};
-
-struct pci_raw_ops *raw_pci_ops = &pci_sal_ops;
-
-static int
-pci_read (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *value)
+static int pci_read(struct pci_bus *bus, unsigned int devfn, int where,
+   int size, u32 *value)
 {
-   return raw_pci_ops->read(pci_domain_nr(bus), bus->number,
+   return raw_pci_read(pci_domain_nr(bus), bus->number,
 devfn, where, size, value);
 }
 
-static int
-pci_write (struct pci_bus *bus, unsigned int devfn, int where, int size, u32 value)
+static int pci_write(struct pci_bus *bus, unsigned int devfn, int where,
+   int size, u32 value)
 {
-   return raw_pci_ops->write(pci_domain_nr(bus), bus->number,
+   return raw_pci_write(pci_domain_nr(bus), bus->number,
  devfn, where, size, value);
 }
 
diff --git a/arch/ia64/sn/pci/tioce_provider.c b/arch/ia64/sn/pci/tioce_provider.c
index e1a3e19..f6df212 100644
--- a/arch/ia64/sn/pci/tioce_provider.c
+++ b/arch/ia64/sn/pci/tioce_provider.c
@@ -752,13 +752,13 @@ tioce_kern_init(struct tioce_common *tioce_common)
 * Determine the secondary bus number of the port2 logical PPB.
 * This is used to decide whether a given pci device resides on
 * port1 or port2.  Note:  We don't have enough plumbing set up
-* here to use pci_read_config_xxx() so use the raw_pci_ops vector.
+* here to use pci_read_config_xxx() so use raw_pci_read().
 */
 
seg = tioce_common->ce_pcibus.bs_persist_segment;
bus = tioce_common->ce_pcibus.bs_persist_busnum;
 
-   raw_pci_ops->read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
+   raw_pci_read(seg, bus, PCI_DEVFN(2, 0), PCI_SECONDARY_BUS, 1,&tmp);
tioce_kern->ce_port1_secondary = (u8) tmp;
 
/*
@@ -799,11 +799,11 @@ tioce_kern_init(struct tioce_common *tioce_common)
 
/* mem base/limit */
 
-   raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+   raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
  PCI_MEMORY_BASE, 2, &tmp);
base = (u64)tmp << 16;
 
-   raw_pci_ops->read(seg, bus, PCI_DEVFN(dev, 0),
+   raw_pci_read(seg, bus, PCI_DEVFN(dev, 0),
  PCI_MEMORY_LIMIT, 2, &tmp);
limit = (u64)tmp << 16;
limit |= 0xfffffUL;
@@ -817,21 +817,21 @@ tioce_kern_init(struct tioce_common *tioce_common)

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Loic Prylli




On 1/13/2008 1:01 AM, Matthew Wilcox wrote:
> On Sat, Dec 29, 2007 at 12:12:19AM +0300, Ivan Kokshaysky wrote:
> > On Fri, Dec 28, 2007 at 12:40:53PM -0500, Loic Prylli wrote:
> > > One thing that could be changed in pci_cfg_space_size() is to avoid
> > > making a special case for PCI-X 266MHz/533MHz (assume cfg_size == 256
> > > for such devices too, reserve extended cfg-space for pci-express
> > > devices).
> >
> > I agree, we should remove it. IIRC, this PCI-X check was written
> > long ago with some draft (not a final spec) in hands. Matthew?
>
> I have what I believe to be the released version of PCI-X 2.0a (July
> 22, 2003).  It is quite clear that Mode 2 devices (ie those running at
> 266MHz or 533MHz) are required to support all 4096 bytes of extended
> config space.
>
> More to the point, I don't think we have any bug reports suggesting that
> PCI-X Mode 2 devices/bridges have any problems.

As for PCI-X2 bridges/chipsets, I only know about the AMD-8132 (from what I
understand it does PCI-X Mode 2) and some obscure IBM enterprise
chipset (I am sure there are a few more).

Too bad for the spec, but we know for sure that the AMD-8132
doesn't do ext-space (and makes it unusable for any device behind it).

> There are relatively
> few of them in existence, and my impression is that PCI-X2 is only being
> implemented on server-class machines.

True.

> 'Consumer grade' equipment is
> where all the problems lie anyway.

mmconfig has been a pain on the servers too (there are a lot of server-class
AMD machines using one pcie/mmconfig chipset + amd-8131/2).

> While the PCI-X 2.0a spec does not define any Extended Capability IDs,
> it simply states that "This field is a PCI-SIG defined ID number that
> indicates the nature and format of the Extended Capabilities List item".
> The PCIe spec does define Extended Capability IDs, and I would think
> it's entirely appropriate to use the same IDs for PCI-X Mode 2 devices.

Sure, it might be needed on PCI-X2. But contrary to pcie (where the
drivers/pci/pcie/aer subsystem already uses ext-conf-space, and other
usages are bound to increase), needing ext-conf-space in the future on
pci-x2 is quite unlikely (pcie is long-lived, whereas PCI-X2 was
short-lived, obsoleted by PCI-E, and nobody has yet mentioned an example
of using ext-registers with a PCI-X2 device).

I was only mentioning that because of the very small trade-off: if you
don't exclude PCI-X2, on platforms with the amd-8132+bad-MCFG, you might
trigger a cfg-read==0xffffffff/master-abort in pci_cfg_space_size() for
such devices with Ivan's patch. This is harmless, because a lot of similar
master-aborts happen during PCI probing anyway, so one more won't change
anything.
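
To make the master-abort case concrete, here is a rough user-space sketch
of the probe step only.  The read hook type and the three fake readers
are hypothetical; the two size constants match the 256 and 4096 byte
figures discussed in this thread:

```c
#include <stdint.h>

#define PCI_CFG_SPACE_SIZE	256
#define PCI_CFG_SPACE_EXP_SIZE	4096

/* Hypothetical config-read hook: returns 0 on success and fills *val. */
typedef int (*cfg_read32_fn)(void *dev, int where, uint32_t *val);

/* Probe the first dword of extended config space (offset 256). */
int probe_ext_cfg_size(void *dev, cfg_read32_fn read32)
{
	uint32_t v;

	if (read32(dev, PCI_CFG_SPACE_SIZE, &v) != 0)
		return PCI_CFG_SPACE_SIZE;
	/* A master-aborted read comes back as all ones. */
	if (v == 0xffffffffu)
		return PCI_CFG_SPACE_SIZE;
	return PCI_CFG_SPACE_EXP_SIZE;
}

/* Fake reader: device behind a bridge that drops extended accesses. */
int read_all_ones(void *dev, int where, uint32_t *val)
{
	*val = 0xffffffffu;
	return 0;
}

/* Fake reader: returns something shaped like an extended capability
 * header (ID 0x0001, version 1, no next pointer). */
int read_ext_cap(void *dev, int where, uint32_t *val)
{
	*val = 0x00010001u;
	return 0;
}

/* Fake reader: the access itself fails. */
int read_fails(void *dev, int where, uint32_t *val)
{
	return -1;
}
```

The all-ones case just falls back to 256 bytes, which is why one more
master-abort at probe time costs nothing.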



Anyway, I am equally happy with keeping pci_cfg_space_size() as it is.


Loic


Re: backlight module for nvidia cards -- control backlight even with offb

2008-01-12 Thread Benjamin Herrenschmidt

No time to look at this right now, please ping me if you have no news in
a week or so.

Cheers,
Ben.



Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Benjamin Herrenschmidt

On Sat, 2008-01-12 at 17:40 +0300, Ivan Kokshaysky wrote:
> 
> Actually I'm strongly against Arjan's patch. First, it's based on
> assumption that the MMCONFIG thing is sort of fundamentally broken
> on some systems, but none of the facts we have so far does confirm
> that.
> And second, I really don't like the implementation as it breaks all
> non-x86 arches (or forces them to add a set of totally meaningless
> PCI functions).

I agree, I quite dislike it too. Even if the breakage on x86 makes us
want to totally disable it there, it can be done within the existing PCI
ops I believe.

I think Arjan's problem is to try to do it per-device since the
"standard" PCI ops don't get a pci_dev structure (for obvious reasons).

But from what I read in this thread, this per-device enabling/disabling
doesn't seem very useful at all.

Cheers,
Ben.


Re: regression: 100% io-wait with 2.6.24-rcX

2008-01-12 Thread Fengguang Wu

On Sun, Jan 13, 2008 at 12:32:30AM +0100, Joerg Platte wrote:
> Am Freitag, 11. Januar 2008 schrieb Fengguang Wu:
> > On Thu, Jan 10, 2008 at 11:03:05AM +0100, Joerg Platte wrote:
> > > Am Donnerstag, 10. Januar 2008 schrieb Fengguang Wu:
> > > > > problem, because the iowait problem disappeared today after the
> > > > > regular Debian update. I'll try to install the old package versions
> > > > > to make it show up again. Maybe that helps to debug it.
> > > >
> > > > Thank you. I'm running sid, ext2 as rootfs now ;-)
> > >
> > > The error is back and I'm getting thousands of messages like this with
> > > the patched kernel:
> > >
> > > mm/page-writeback.c 668 wb_kupdate: pdflush(146) 21115 global 3936 0 0 wc
> > > _M > tw 1024 sk 0 requeue_io 301: inode 81441 size 0 at 08:07(sda7)
> > > mm/page-writeback.c 668 wb_kupdate: pdflush(147) 17451 global 3936 0 0 wc
> > > _M > tw 1024 sk 2 requeue_io 301: inode 81441 size 0 at 08:07(sda7)
> > > mm/page-writeback.c 668 wb_kupdate: pdflush(147) 17451 global 3936 0 0 wc
> > > _M > tw 1024 sk 2 requeue_io 301: inode 81441 size 0 at 08:07(sda7)
> >
> > Joerg, what's the output of `dumpe2fs /dev/sda7` and `lsof|grep /tmp`?
> 
> After another reboot I tried to get more information about the konqueror 
> process possibly causing the iowait load by using strace -p. Here is the 
> output:
> 
> gettimeofday({1200180588, 878508}, NULL) = 0
> setitimer(ITIMER_VIRTUAL, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
> rt_sigaction(SIGVTALRM, {SIG_DFL}, {0xb5cffed0, [VTALRM], SA_RESTART}, 8) = 0
> gettimeofday({1200180588, 879942}, NULL) = 0
> time(NULL)  = 1200180588
> gettimeofday({1200180588, 880838}, NULL) = 0
> gettimeofday({1200180588, 881284}, NULL) = 0

No idea yet :-/ I'm afraid I have to trouble you again - the bug just
refuses to appear on my system. I prepared a kernel module for you to
gather more information:

make && insmod ext2-writeback-debug.ko && sleep 1s && rmmod ext2-writeback-debug
dmesg > ext2-writeback-debug.dmesg

Please do it when 100% iowait appears, and send the dmesg file.

Thank you,
Fengguang
obj-m := ext2-writeback-debug.o
KDIR  := /lib/modules/$(shell uname -r)/build
PWD   := $(shell pwd)

default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
clean:
	rm -f *.mod.c *.ko *.o

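/* Header names were lost in the archive; these are the ones the code below needs. */
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/pagevec.h>
#include <linux/writeback.h>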

void print_page(struct page *page)
{
	printk(KERN_DEBUG "%lu\t%u\t%u\t%c%c%c%c%c\n",
			page->index,
			page_count(page),
			page_mapcount(page),
			PageUptodate(page)  ? 'U' : '_',
			PageDirty(page) ? 'D' : '_',
			PageWriteback(page) ? 'W' : '_',
			PagePrivate(page)   ? 'P' : '_',
			PageLocked(page)? 'L' : '_');
}

void print_writeback_control(struct writeback_control *wbc)
{
	printk(KERN_DEBUG
			"global dirty %lu writeback %lu nfs %lu\n"
			"wbc flags %c%c towrite %ld skipped %ld\n",
			global_page_state(NR_FILE_DIRTY),
			global_page_state(NR_WRITEBACK),
			global_page_state(NR_UNSTABLE_NFS),
			wbc->encountered_congestion ? 'C':'_',
			wbc->more_io ? 'M':'_',
			wbc->nr_to_write,
			wbc->pages_skipped);
}

void print_inode_pages(struct inode *inode)
{
	struct address_space *mapping = inode->i_mapping;
	struct pagevec pvec;
	int nr_pages;
	int i;
	pgoff_t index = 0;
	struct dentry *dentry;
	int dcount;
	char *dname;

	nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
			PAGECACHE_TAG_DIRTY,
			(pgoff_t)PAGEVEC_SIZE);

	if (list_empty(&inode->i_dentry)) {
		dname = "";
		dcount = 0;
	} else {
		dentry = list_entry(inode->i_dentry.next,
				struct dentry, d_alias);
		dname = dentry->d_iname;
		dcount = atomic_read(&dentry->d_count);
	}

	printk(KERN_DEBUG "inode %lu(%s/%s) count %d,%d size %llu pages %lu\n",
			inode->i_ino,
			inode->i_sb->s_id,
			dname,
			atomic_read(&inode->i_count),
			dcount,
			i_size_read(inode),
			mapping->nrpages
	  );

	for (i = 0; i < nr_pages; i++)
		print_page(pvec.pages[i]);
}

int handler_pre(struct kprobe *p, struct pt_regs *regs)
{
	static int count = 0;

	if (count++ < 10)
		dump_stack();

	return 0;
}

void handler_post(struct kprobe *p, struct pt_regs *regs, unsigned long flags)
{
}

static struct kprobe my_kprobe = {
	.pre_handler = handler_pre,
	.post_handler = handler_post,
	.symbol_name = "submit_bio"
};


static int jdo_ext2_writepage(struct page *page, struct writeback_control *wbc)
{
	struct inode * const inode = page->mapping->host;

	if (!i_size_read(inode)) {
		printk(KERN_DEBUG "ext2_writepage:\n");
		print_page(page);
		print_writeback_control(wbc);
	}

	jprobe_return();
	return 0;
}
static struct jprobe jprobe_ext2_writepage = {
	.entry = jdo_ext2_writepage,
	.kp.symbol_name = "ext2_writepage"
};


static void jdo_requeue_io(struct inode *inode)
{
	if (!i_size_read(inode)) {
		printk(KERN_DEBUG "requeue_io:\n");
		print_inode_pages(inode);
	}

	jprobe_return();
}
static struct jprobe jprobe_requeue_io = {
	.entry = jdo_requeue_io,
	.kp.symbol_name = "requeue_io"
};

st

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Jeff Garzik


Matthew Wilcox wrote:
> On Sat, Jan 12, 2008 at 08:42:48PM -0800, Arjan van de Ven wrote:
> > Wanna bet there'll be devices that screw this up? There's devices that
> > even screwed up the 64-256 region after all.
>
> I don't know if they 'screwed it up'.  There are devices that misbehave
> when registers are read from pci config space.  But this was never
> guaranteed to be a safe thing to do; it gradually became clear that
> people expected to be able to read random registers and manufacturers
> responded accordingly, but I don't think you were ever guaranteed to be
> able to peek at bits of config space arbitrarily.

Quite correct...  Reading registers can have all sorts of side effects,
for example clearing chip conditions.
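
A toy model of such a read-to-clear register, purely illustrative; the
struct and function are hypothetical and do not correspond to any
particular chip:

```c
#include <stdint.h>

/* Hypothetical device with a latched status register. */
struct fake_dev {
	uint32_t status;
};

/* Reading the register returns the latched bits and clears them, so a
 * second reader (or a tool dumping config space) sees nothing. */
uint32_t read_status(struct fake_dev *d)
{
	uint32_t v = d->status;

	d->status = 0;	/* the read itself clears the condition */
	return v;
}
```

A blind dump of every register would consume the status before the driver
ever saw it.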


Jeff




Re: -mm: pnp-do-not-stop-start-devices-in-suspend-resume-path.patch breaks resuming isapnp cards

2008-01-12 Thread Rene Herman


On 13-01-08 06:50, Bjorn Helgaas wrote:
> On Saturday 12 January 2008 1:08:01 pm Rene Herman wrote:
> > pnp-do-not-stop-start-devices-in-suspend-resume-path.patch in current -mm
> > breaks resuming isapnp cards from hibernation. They need the pnp_start_dev
> > to enable the device again after hibernation.
> >
> > They don't really need the pnp_stop_dev() which the above mentioned patch
> > also removes but with the pnp_start_dev() restored it seems pnp_stop_dev()
> > should also stay. Bjorn Helgaas should decide  -- currently the patch as
> > you have it breaks drivers though. Could you drop it?
>
> Yes, please drop pnp-do-not-stop-start-devices-in-suspend-resume-path.patch
> for now.

Okay, thanks for the reply. And, now that I have your attention: while it
no longer matters for this issue, since the submitted patch removed the
tests, do you have an opinion on (include/linux/pnp.h):

/* pnp driver flags */
#define PNP_DRIVER_RES_DO_NOT_CHANGE	0x0001	/* do not change the state of the device */
#define PNP_DRIVER_RES_DISABLE		0x0003	/* ensure the device is disabled */

I find DISABLE including DO_NOT_CHANGE rather unexpected...
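
To spell out the overlap: 0x0003 has bit 0 set, so any test for
DO_NOT_CHANGE also fires for DISABLE.  A tiny sketch, with the two values
copied from the header and a helper name invented for the example:

```c
#define PNP_DRIVER_RES_DO_NOT_CHANGE	0x0001
#define PNP_DRIVER_RES_DISABLE		0x0003

/* Returns 1 if every bit of flag is set in flags, 0 otherwise. */
int pnp_flag_is_set(unsigned int flags, unsigned int flag)
{
	return (flags & flag) == flag;
}
```

A driver that sets only DISABLE therefore also reads as DO_NOT_CHANGE,
while the reverse does not hold.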

By the way, I also still have this next one outstanding for you... :-/

http://lkml.org/lkml/2008/1/9/168

Rene.

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Matthew Wilcox

On Sat, Dec 29, 2007 at 12:12:19AM +0300, Ivan Kokshaysky wrote:
> On Fri, Dec 28, 2007 at 12:40:53PM -0500, Loic Prylli wrote:
> > One thing that could be changed in pci_cfg_space_size() is to avoid
> > making a special case for PCI-X 266MHz/533Mhz (assume cfg_size == 256
> > for such devices too, reserve extended cfg-space for pci-express
> > devices). There are good reasons to think no such PCI-X 266MHz/533MHz device
> > will ever have an extended space (no capability IDs were ever defined in
> > the PCI-X 2.0 spec, no new revision is planned). Such a check would
> > avoid the possibility of trying extended-conf-space access for PCI-X 2.0
> > devices behind an amd-8132 or similar (such accesses would just return
> > -1, but there were some objections raised about doing anything like that
> > other than at initialization time, even if there are ample reasons to
> > argue it would be harmless).
> 
> I agree, we should remove it. IIRC, this PCI-X check was written
> long ago with some draft (not a final spec) in hands. Matthew?

I have what I believe to be the released version of PCI-X 2.0a (July
22, 2003).  It is quite clear that Mode 2 devices (ie those running at
266MHz or 533MHz) are required to support all 4096 bytes of extended
config space.

More to the point, I don't think we have any bug reports suggesting that
PCI-X Mode 2 devices/bridges have any problems.  There are relatively
few of them in existence, and my impression is that PCI-X2 is only being
implemented on server-class machines.  'Consumer grade' equipment is
where all the problems lie anyway.

While the PCI-X 2.0a spec does not define any Extended Capability IDs,
it simply states that "This field is a PCI-SIG defined ID number that
indicates the nature and format of the Extended Capabilities List item".
The PCIe spec does define Extended Capability IDs, and I would think
it's entirely appropriate to use the same IDs for PCI-X Mode 2 devices.

So I don't believe any change in this area is appropriate.
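
In other words, the rule for which devices get probed for extended config
space stays as it is.  Roughly, as a user-space sketch (the struct is a
hypothetical summary standing in for the capability lookups and the PCI-X
status bits; it is not the kernel's data structure):

```c
#include <stdbool.h>

/* Hypothetical per-device summary used only for this example. */
struct dev_caps {
	bool has_pcie_cap;	/* PCI Express capability present */
	bool has_pcix_cap;	/* PCI-X capability present */
	bool pcix_266mhz;	/* PCI-X status reports 266MHz capable */
	bool pcix_533mhz;	/* PCI-X status reports 533MHz capable */
};

/* True if the device is allowed to have 4096 bytes of config space:
 * any PCIe device, or a PCI-X device running in Mode 2. */
bool may_have_ext_cfg(const struct dev_caps *c)
{
	if (c->has_pcie_cap)
		return true;
	if (c->has_pcix_cap && (c->pcix_266mhz || c->pcix_533mhz))
		return true;
	return false;
}
```

Conventional PCI and PCI-X Mode 1 devices stay at 256 bytes and are never
probed beyond that.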

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

Re: -mm: pnp-do-not-stop-start-devices-in-suspend-resume-path.patch breaks resuming isapnp cards

2008-01-12 Thread Bjorn Helgaas

On Saturday 12 January 2008 1:08:01 pm Rene Herman wrote:
> pnp-do-not-stop-start-devices-in-suspend-resume-path.patch in current -mm
> breaks resuming isapnp cards from hibernation. They need the pnp_start_dev
> to enable the device again after hibernation.
>
> They don't really need the pnp_stop_dev() which the above mentioned patch
> also removes but with the pnp_start_dev() restored it seems pnp_stop_dev()
> should also stay. Bjorn Helgaas should decide  -- currently the patch as
> you have it breaks drivers though. Could you drop it?

Yes, please drop pnp-do-not-stop-start-devices-in-suspend-resume-path.patch
for now.

When the PNP core requested resources for active devices, the
pnp_stop_dev() in the suspend path released the PNP core resources,
leaving the underlying driver resources orphaned.  But the patch
that made the PNP core request those resources is also gone, so
we can leave the start/stop alone.

> On 12-01-08 20:08, Rafael J. Wysocki wrote:
> > On Saturday, 12 of January 2008, Rene Herman wrote:
> >> It seems all PnP drivers would need to stick a pnp_start_dev in their
> >> resume method
> >
> > Yes.
> >
> >> then which means it really belongs in core.
> >
> > Yes, if practical.
> >
> >> One important point where PnP and PCI differ is that PnP allows to
> >> change the resources on a protocol level and I don't see how it could
> >> ever not be necessary to restore the state a user may have set if power
> >> has been removed. Hibernate is just that, isn't it?

That's a good point, thanks for pointing that out.

Bjorn

Re: [patch 18/19] account mlocked pages

2008-01-12 Thread Rik van Riel

On Fri, 11 Jan 2008 18:21:09 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:

> * Rik van Riel <[EMAIL PROTECTED]> [2008-01-08 15:59:57]:
> 
> The following patch is required to compile the code with
> CONFIG_NORECLAIM enabled and CONFIG_NORECLAIM_MLOCK disabled.

I have untangled the #ifdefs to make things compile with
all combinations of config settings.  Thanks for pointing
out this problem.

-- 
All rights reversed.

Re: [PATCH] Clustering indirect blocks in Ext3

2008-01-12 Thread Abhishek Rai

Thanks for the great feedback Daniel!

Following this email, I'll be sending out two separate emails with the
actual patches, one against the latest stable kernel and one against
the latest mm patch, using the format suggested by you. Sorry about
the tabs and spaces thing, I've fixed my email client now.

Thanks for pointing out problems with the (free_blocks <= 0)
expression, I totally agree with you and have fixed it in the new
patch. Regarding some of the other points that you raised:

1. Worse performance with random read on large files using metaclustering:
This is a genuine drawback of this kind of metaclustering. In short,
the choice is between a slightly slower random read or a much faster
fsck. However, I believe that going forward, especially if and when we
port metaclustering to Ext4, slower random read will probably be less
of an issue, since there'll be fewer indirect blocks (due to the use
of extents) and so we'll be able to do more aggressive prefetching of
indirect blocks to help random reads.

That said, it would be great to see what random read performance other
users report since in my own experiments the degradation has been
somewhat smaller than I'd expect (I've also tried more complex random
read non-standard benchmarks that I haven't reported numbers for and
they did "reasonably ok" with metaclustering, but of course standard
reproducible results are always better).

2. Porting to Ext4:
It seems that popular opinion is that some form of metaclustering
could be useful for Ext4 as some other Ext4 hackers have also
suggested the same on LKML and I'd be glad to work on it. However, I
think metaclustering provides genuine value to current users of Ext3
and Ext2 and most people will agree that these two file systems are
very likely to remain popular for quite some time now (the backport of
metaclustering to Ext2 is quite trivial, so if metaclustering gets
accepted in Ext3, I'll probably release a "use-at-your-own-risk" patch
for Ext2 users).

Thanks!
Abhishek

On Jan 12, 2008 1:05 AM, Daniel Phillips <[EMAIL PROTECTED]> wrote:
> On Friday 11 January 2008 16:04, Andrew Morton wrote:
> > It needs to be reviewed.  In exhaustive detail.  Few people can do
> > that and fewer are inclined to do so.
>
> Agreed, there just have to be a few bugs in this many lines of code.
> I spent a couple of hours going through it, not really looking at the
> algorithms but just the superficial details.  I only found minor nits,
> and not many of those.
>
> For example, I do not like to see "if (free_blocks == 0)" written as "if
> (free_blocks <= 0)" in an attempt to increase robustness.  What it
> actually does is make the effect of an error more subtle, or
> even "corrects" it.  Firmly in the niggle category.
>
> I checked the locking of sbi->bginfo and didn't see a flaw, good.
>
> I see a missing KERN_INFO added to a printk, it technically counts as an
> unrelated change but oh well.
>
> Stylistically this new code is hard to tell apart from the incumbent
> code, except for being more heavily commented.  I wish all kernel code
> was written this clearly.
>
> At this point I will run away in favor of for-real Ext3 hackers (you
> know who you are:-)
>
> > I went to merge it so it could get some testing while we await review
> > but the patch has all its tabs replaced with spaces, is seriously
> > wordwrapped and has random newlines added to it.  Please fix email
> > client and resend (offlist is OK if it is unaltered).
>
> Odd, the original post has tabs and the updated one does not, though the
> client seems to be kmail in both cases.
>
> > We should have a think about which workloads are most likely to be
> > adversely affected by this change.
>
> I was just rolling up my sleeves to construct the nasty sequential case
> where the head keeps seeking back to the center of the group after
> picking up each 4 MB of doubly indexed data when I realized that even
> the most simple minded disk cache makes this case a non-issue.  The
> drive will most likely suck a full track (roughly .5 MB) or big chunk
> thereof into cache the first time it seeks to the index cluster, thus
> having a whole group of double index blocks in cache and then will
> proceed to chew happily and linearly through the data blocks.
> It seems like placing those second level index blocks all together
> really helps this case.  Hmm, how to break it.
>
> How about having a disk full of 100 MB files and skipping all over the
> disk randomly reading one block each time.  That will fill the disk
> cache, and each random read then requires seeking to two places that
> were hopefully close together without index node clustering, and now
> will be an average of 32 MB apart.  Each of these "extra" seeks costs a
> couple of ms worth of head travel plus average rotational latency of 4
> ms or so, for a total 6 ms.  However, even with a perfect non-clustered
> layout, the index node will still be an average of 2 MB away from the
> data block, so the rotation

Re: [PATCH 2/2] updating ctime and mtime at syncing

2008-01-12 Thread Rik van Riel

On Sun, 13 Jan 2008 07:39:59 +0300
Anton Salikhmetov <[EMAIL PROTECTED]> wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=2645
> 
> Changes for updating the ctime and mtime fields for memory-mapped files:
> 
> 1) new flag triggering update of the inode data;
> 2) new function to update ctime and mtime for block device files;
> 3) new helper function to update ctime and mtime when needed;
> 4) updating time stamps for mapped files in sys_msync() and do_fsync();
> 5) implementing the feature of auto-updating ctime and mtime.
> 
> Signed-off-by: Anton Salikhmetov <[EMAIL PROTECTED]>

Acked-by: Rik van Riel <[EMAIL PROTECTED]>

-- 
All rights reversed.
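
For readers following the thread, the flag mechanism described in points
1), 3) and 4) of the quoted changelog can be modelled in user space. This
is only a sketch under stated assumptions: struct mapping_model and the
model_* functions are hypothetical names, C11 atomics stand in for the
kernel's set_bit() and test_and_clear_bit(), and timestamps are plain
integers rather than struct timespec.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the AS_MCTIME flag added by the patch. */
#define AS_MCTIME_MODEL 0x1u

/* Hypothetical user-space model of an address_space plus its inode times. */
struct mapping_model {
	atomic_uint flags;
	long mtime;
	long ctime_;
};

static void model_init(struct mapping_model *m)
{
	assert(m != NULL);
	atomic_init(&m->flags, 0u);
	m->mtime = 0;
	m->ctime_ = 0;
}

/* Dirtying a page only records that the times are stale. */
static void model_set_page_dirty(struct mapping_model *m)
{
	atomic_fetch_or(&m->flags, AS_MCTIME_MODEL);
}

/*
 * Called from the sync paths: atomically test and clear the flag, and
 * update the times only if it was set. Returns true if an update was done.
 */
static bool model_update_time(struct mapping_model *m, long now)
{
	unsigned int old = atomic_fetch_and(&m->flags, ~AS_MCTIME_MODEL);

	if (!(old & AS_MCTIME_MODEL))
		return false;
	m->mtime = now;
	m->ctime_ = now;
	return true;
}
```

Because the flag is tested and cleared in one atomic step, several dirtying
events between two syncs collapse into a single timestamp update, and a
sync with nothing dirtied leaves the times alone.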

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Matthew Wilcox

On Sat, Jan 12, 2008 at 08:42:48PM -0800, Arjan van de Ven wrote:
> Wanna bet there'll be devices that screw this up? There's devices that even
> screwed up the 64-256 region after all.

I don't know if they 'screwed it up'.  There are devices that misbehave
when registers are read from pci config space.  But this was never
guaranteed to be a safe thing to do; it gradually became clear that
people expected to be able to read random registers and manufacturers
responded accordingly, but I don't think you were ever guaranteed to be
able to peek at bits of config space arbitrarily.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

Re: [PATCH 1/2] massive code cleanup of sys_msync()

2008-01-12 Thread Rik van Riel

On Sun, 13 Jan 2008 07:39:58 +0300
Anton Salikhmetov <[EMAIL PROTECTED]> wrote:

> Substantial code cleanup of the sys_msync() function:
> 
> 1) using the PAGE_ALIGN() macro instead of "manual" alignment;
> 2) improved readability of the loop traversing the process memory regions.
> 
> Signed-off-by: Anton Salikhmetov <[EMAIL PROTECTED]>

Acked-by: Rik van Riel <[EMAIL PROTECTED]>


-- 
All rights reversed.

Re: [RFC][PATCH] per-task I/O throttling

2008-01-12 Thread Balbir Singh

* Andrea Righi <[EMAIL PROTECTED]> [2008-01-12 19:01:14]:

> Peter Zijlstra wrote:
> > On Sat, 2008-01-12 at 16:27 +0530, Balbir Singh wrote:
> >> * Peter Zijlstra <[EMAIL PROTECTED]> [2008-01-12 10:46:37]:
> >>
> >>> On Fri, 2008-01-11 at 23:57 -0500, [EMAIL PROTECTED] wrote:
>  On Fri, 11 Jan 2008 17:32:49 +0100, Andrea Righi said:
> 
> > The interesting feature is that it allows to set a priority for each
> > process container, but AFAIK it doesn't allow to "partition" the
> > bandwidth between different containers (that would be a nice feature
> > IMHO). For example it would be great to be able to define per-container
> > limits, like assign 10MB/s for processes in container A, 30MB/s to
> > container B, 20MB/s to container C, etc.
>  Has anybody considered allocating based on *seeks* rather than bytes moved,
>  or counting seeks as "virtual bytes" for the purposes of accounting (if the
>  disk can do 50mbytes/sec, and a seek takes 5millisecs, then count it as 100K
>  of data)?
> >>> I was considering a time scheduler, you can fill your time slot with
> >>> seeks or data, it might be what CFQ does, but I've never even read the
> >>> code.
> >>>
> >> So far the definition of I/O bandwidth has been w.r.t time. Not all IO
> >> devices have sectors; I'd prefer bytes over a period of time.
> > 
> > Doing a time based one would only require knowing the (avg) delay of
> > seeks, whereas doing a bytes based one would also require knowing the
> > (avg) speed of the device.
> > 
> > That is, if you're also interested in providing a latency guarantee.
> > Because that'd force you to convert bytes to time again.
> 
> So, what about considering both bytes/sec and io-operations/sec? In this
> way we should be able to limit huge streams of data and seek storms (or
> any mix of them).
> 
> Regarding CFQ, AFAIK it's only possible to configure an I/O priority for
> a process, but there's no way for example to limit the bandwidth (or I/O
> operations/sec) for a particular user or group.
> 

Limiting usage is also a very useful feature. Andrea, could you please
port your patches over to control groups?

-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
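
To make the "both bytes/sec and io-operations/sec" idea from the quoted
text concrete, here is a minimal user-space sketch of a limiter that
admits a request only if it fits both budgets. It is not taken from
Andrea's patches; struct io_limit and the io_limit_* functions are
hypothetical, time is passed in as microseconds by the caller, and each
bucket is assumed to hold at most one second's worth of credit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Credits are kept in (units * microseconds) so small refills are not lost. */
struct io_limit {
	uint64_t bytes_per_sec;
	uint64_t ops_per_sec;
	uint64_t byte_credit;
	uint64_t op_credit;
	uint64_t last_usec;
};

static void io_limit_init(struct io_limit *l, uint64_t bytes_per_sec,
			  uint64_t ops_per_sec, uint64_t now_usec)
{
	assert(l != NULL && bytes_per_sec > 0 && ops_per_sec > 0);
	l->bytes_per_sec = bytes_per_sec;
	l->ops_per_sec = ops_per_sec;
	l->byte_credit = bytes_per_sec * USEC_PER_SEC;	/* start full */
	l->op_credit = ops_per_sec * USEC_PER_SEC;
	l->last_usec = now_usec;
}

static void io_limit_refill(struct io_limit *l, uint64_t now_usec)
{
	uint64_t elapsed = now_usec - l->last_usec;
	uint64_t byte_cap = l->bytes_per_sec * USEC_PER_SEC;
	uint64_t op_cap = l->ops_per_sec * USEC_PER_SEC;

	if (elapsed > USEC_PER_SEC)
		elapsed = USEC_PER_SEC;	/* the bucket cannot hold more anyway */
	l->byte_credit += l->bytes_per_sec * elapsed;
	if (l->byte_credit > byte_cap)
		l->byte_credit = byte_cap;
	l->op_credit += l->ops_per_sec * elapsed;
	if (l->op_credit > op_cap)
		l->op_credit = op_cap;
	l->last_usec = now_usec;
}

/* Admit one request of `bytes` bytes only if both budgets allow it. */
static bool io_limit_admit(struct io_limit *l, uint64_t bytes,
			   uint64_t now_usec)
{
	uint64_t need;

	io_limit_refill(l, now_usec);
	if (bytes > l->bytes_per_sec)
		return false;	/* larger than the whole bucket */
	need = bytes * USEC_PER_SEC;
	if (l->byte_credit < need || l->op_credit < USEC_PER_SEC)
		return false;
	l->byte_credit -= need;
	l->op_credit -= USEC_PER_SEC;
	return true;
}
```

With a limit of 1000 bytes/sec and 2 ops/sec, two 400-byte requests at
time 0 are admitted and a third small request is refused because the
operation budget is spent, which is the "seek storm" case; a large request
is refused by the byte budget instead.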

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Arjan van de Ven

On Sat, 12 Jan 2008 20:36:59 -0500
Tony Camuso <[EMAIL PROTECTED]> wrote:

> Thanks, Arjan.
> 
> The problem we have been experiencing has to do with Northbridges,
> not with devices.

correct for now.
HOWEVER, and this is the point Linus has made several times:
Just about NOBODY has devices that need the extended config space. At all.
So making this opt-in for devices allows our users to boot and use
their system if they are in the majority that has no need for even getting
close to this mess.

> 
> As far as the device is concerned, after the Northbridge translates
> the config access into PCI bus cycles, the device has no idea what
> mechanism drove the Northbridge to the translation.

Wanna bet there'll be devices that screw this up? There's devices that even
screwed up the 64-256 region after all.

> The patch I devised concerned itself with Northbridges and separated
> MMCONFIG-compliant buses from those that could not handle MMCONFIG.

This kind of patchup has been going on for the better part of a year (well 2
years) by now and it's STILL NOT ENOUGH, as you can see by the further
patchups that have been proposed as "alternative" to my approach.

> 
> In my humble opinion, Port IO config access is here to stay, having
> been defined as an architected mechanism in the PCI 2.1 spec.
> 
> This is most especially true for x86.
> 
> In other words, for x86, I don't think we need to worry about Port
> IO config access ever going away at all.

You're wrong there. Sad to say, but you're wrong there.

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
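
The opt-in idea argued for in this message can be illustrated with a small
user-space model. This is a sketch only and not the interface from the
patch under discussion: struct dev_model, its ext_cfg_opt_in field and
model_cfg_read8() are hypothetical names. The only facts assumed are that
conventional config space is 256 bytes and PCI Express extended config
space is 4096 bytes per function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_LEGACY_SIZE 256	/* conventional PCI config space */
#define CFG_EXT_SIZE    4096	/* PCI Express extended config space */

/* Hypothetical device model: config space plus a per-device opt-in bit. */
struct dev_model {
	bool ext_cfg_opt_in;
	uint8_t cfg[CFG_EXT_SIZE];
};

/*
 * Registers below 256 are always readable. Registers at 256 and above are
 * refused unless the driver has opted in, so devices whose drivers never
 * ask for extended config space never touch that access path.
 */
static int model_cfg_read8(const struct dev_model *dev, int reg, uint8_t *val)
{
	assert(dev != NULL && val != NULL);
	if (reg < 0 || reg >= CFG_EXT_SIZE)
		return -EINVAL;
	if (reg >= CFG_LEGACY_SIZE && !dev->ext_cfg_opt_in)
		return -EINVAL;
	*val = dev->cfg[reg];
	return 0;
}
```

In this model a read of register 0x100 fails until ext_cfg_opt_in is set,
while a read of register 0x10 succeeds either way.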

[PATCH 2/2] updating ctime and mtime at syncing

2008-01-12 Thread Anton Salikhmetov

http://bugzilla.kernel.org/show_bug.cgi?id=2645

Changes for updating the ctime and mtime fields for memory-mapped files:

1) new flag triggering update of the inode data;
2) new function to update ctime and mtime for block device files;
3) new helper function to update ctime and mtime when needed;
4) updating time stamps for mapped files in sys_msync() and do_fsync();
5) implementing the feature of auto-updating ctime and mtime.

Signed-off-by: Anton Salikhmetov <[EMAIL PROTECTED]>
---
 fs/buffer.c |1 +
 fs/fs-writeback.c   |2 ++
 fs/inode.c  |   42 +++---
 fs/sync.c   |2 ++
 include/linux/fs.h  |9 -
 include/linux/pagemap.h |3 ++-
 mm/msync.c  |   24 
 mm/page-writeback.c |1 +
 8 files changed, 67 insertions(+), 17 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 7249e01..09adf7e 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -719,6 +719,7 @@ static int __set_page_dirty(struct page *page,
}
write_unlock_irq(&mapping->tree_lock);
__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
+   set_bit(AS_MCTIME, &mapping->flags);
 
return 1;
 }
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 0fca820..c25ebd5 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -243,6 +243,8 @@ __sync_single_inode(struct inode *inode, struct writeback_control *wbc)
 
spin_unlock(&inode_lock);
 
+   mapping_update_time(mapping);
+
ret = do_writepages(mapping, wbc);
 
/* Don't write the inode if only I_DIRTY_PAGES was set */
diff --git a/fs/inode.c b/fs/inode.c
index ed35383..c02bfab 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1243,8 +1243,8 @@ void touch_atime(struct vfsmount *mnt, struct dentry *dentry)
 EXPORT_SYMBOL(touch_atime);
 
 /**
- * file_update_time-   update mtime and ctime time
- * @file: file accessed
+ * inode_update_time   -   update mtime and ctime time
+ * @inode: inode accessed
  *
  * Update the mtime and ctime members of an inode and mark the inode
  * for writeback.  Note that this function is meant exclusively for
@@ -1253,10 +1253,8 @@ EXPORT_SYMBOL(touch_atime);
  * S_NOCTIME inode flag, e.g. for network filesystem where these
  * timestamps are handled by the server.
  */
-
-void file_update_time(struct file *file)
+void inode_update_time(struct inode *inode)
 {
-   struct inode *inode = file->f_path.dentry->d_inode;
struct timespec now;
int sync_it = 0;
 
@@ -1279,8 +1277,39 @@ void file_update_time(struct file *file)
if (sync_it)
mark_inode_dirty_sync(inode);
 }
+EXPORT_SYMBOL(inode_update_time);
+
+/*
+ * Update the ctime and mtime stamps for memory-mapped block device files.
+ */
+static void bd_inode_update_time(struct inode *inode)
+{
+   struct block_device *bdev = inode->i_bdev;
+   struct list_head *p;
+
+   if (bdev == NULL)
+   return;
+
+   mutex_lock(&bdev->bd_mutex);
+   list_for_each(p, &bdev->bd_inodes) {
+   inode = list_entry(p, struct inode, i_devices);
+   inode_update_time(inode);
+   }
+   mutex_unlock(&bdev->bd_mutex);
+}
 
-EXPORT_SYMBOL(file_update_time);
+/*
+ * Update the ctime and mtime stamps after checking if they are to be updated.
+ */
+void mapping_update_time(struct address_space *mapping)
+{
+   if (test_and_clear_bit(AS_MCTIME, &mapping->flags)) {
+   if (S_ISBLK(mapping->host->i_mode))
+   bd_inode_update_time(mapping->host);
+   else
+   inode_update_time(mapping->host);
+   }
+}
 
 int inode_needs_sync(struct inode *inode)
 {
@@ -1290,7 +1319,6 @@ int inode_needs_sync(struct inode *inode)
return 1;
return 0;
 }
-
 EXPORT_SYMBOL(inode_needs_sync);
 
 int inode_wait(void *word)
diff --git a/fs/sync.c b/fs/sync.c
index 7cd005e..5561464 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -87,6 +87,8 @@ long do_fsync(struct file *file, int datasync)
goto out;
}
 
+   mapping_update_time(mapping);
+
ret = filemap_fdatawrite(mapping);
 
/*
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b3ec4a4..1dccd4b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1977,7 +1977,14 @@ extern int buffer_migrate_page(struct address_space *,
 extern int inode_change_ok(struct inode *, struct iattr *);
 extern int __must_check inode_setattr(struct inode *, struct iattr *);
 
-extern void file_update_time(struct file *file);
+extern void inode_update_time(struct inode *);
+
+static inline void file_update_time(struct file *file)
+{
+   inode_update_time(file->f_path.dentry->d_inode);
+}
+
+extern void mapping_update_time(struct address_space *);
 
 static inline ino_t parent_ino(struct dentry *dentry)
 {
diff --git a/include/linux/pagemap.h b/includ

[PATCH 0/2] yet another attempt to fix the ctime and mtime issue

2008-01-12 Thread Anton Salikhmetov

The POSIX standard requires that the ctime and mtime fields
for memory-mapped files should be updated after a write
reference to the memory region where the file data is mapped.
At least FreeBSD 6.2 and HP-UX 11i implement this properly.
Linux does not, which leads to data loss problems in database
backup applications.

Kernel Bug Tracker contains more information about the problem:

http://bugzilla.kernel.org/show_bug.cgi?id=2645

There have been several attempts in the past to address this
issue. Following are a few links to LKML discussions related
to this bug:

http://lkml.org/lkml/2006/5/17/138
http://lkml.org/lkml/2007/2/21/242
http://lkml.org/lkml/2008/1/7/234

All earlier solutions were criticized. Some solutions did not
handle memory-mapped block devices properly. Some led to forcing
applications to explicitly call msync() to update file metadata.
Some contained errors in using kernel synchronization primitives.

In the two patches that follow, I would like to propose a new
solution.

This is the third version of my changes. This version takes
into account all feedback I received for the two previous versions.
The overall design remains basically the same as the one that
was acked by Rik van Riel:

http://lkml.org/lkml/2008/1/11/208

To the best of my knowledge, these patches are free of all the
drawbacks found during previous attempts by Peter Staubach,
Miklos Szeredi and myself.

New since the previous version:

1) no need to explicitly call msync() to update file times;
2) changing block device data is visible to all device files
   associated with the block device;
3) in the cleanup part, the error checks are separated out as
   suggested by Rik van Riel;
4) some small refinements according to the LKML comments.

This is how I tested the patches.

1. To test the features mentioned above, I wrote a unit test
   available from

   http://bugzilla.kernel.org/attachment.cgi?id=14430

   I verified that the unit test passed successfully for both
   regular files and block device files. For the unit test I
   used the following architectures: 32-bit x86, x86_64 and
   MIPS32 (cross-compiled from x86_64).

2. I did build tests with allmodconfig and allyesconfig on x86_64.

3. I ran the following test cases from the LTP test suite:

   msync01
   msync02
   msync03
   msync04
   msync05
   mmapstress01
   mmapstress09
   mmapstress10

   No regressions were found by these test cases.

I think that bug #2645 is resolved by these patches.

Please apply.
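
For anyone who wants to observe the behaviour described at the top of this
message, the following stand-alone program is one way to check it from
user space. It is a minimal sketch and is not the unit test attached to
the bugzilla entry; the function name and the 50 ms pause are assumptions
made for illustration. It maps a temporary file, writes through the
mapping, calls msync(MS_SYNC), and compares st_mtim before and after.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Returns 1 if mtime changed after a write through a shared mapping
 * followed by msync(MS_SYNC), 0 if it did not, and -1 if the test could
 * not be set up.
 */
static int mmap_write_updates_mtime(size_t len)
{
	char path[] = "/tmp/mmap-mtime-XXXXXX";
	struct timespec pause = { 0, 50 * 1000 * 1000 };	/* 50 ms */
	struct stat before, after;
	int result = -1;
	char *map;
	int fd;

	assert(len > 0);
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file lives on until fd is closed */

	if (ftruncate(fd, (off_t)len) != 0 || fstat(fd, &before) != 0)
		goto out;

	/* Let the clock move so that a timestamp update is visible. */
	nanosleep(&pause, NULL);

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;
	map[0] = 'x';	/* the write reference to the mapped region */

	if (msync(map, len, MS_SYNC) == 0 && fstat(fd, &after) == 0)
		result = after.st_mtim.tv_sec != before.st_mtim.tv_sec ||
			 after.st_mtim.tv_nsec != before.st_mtim.tv_nsec;
	munmap(map, len);
out:
	close(fd);
	return result;
}
```

Calling mmap_write_updates_mtime(4096) on a kernel with the bug described
above would be expected to return 0, and 1 on a kernel that updates the
times; the result also depends on the file system backing /tmp.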

[PATCH 1/2] massive code cleanup of sys_msync()

2008-01-12 Thread Anton Salikhmetov

Substantial code cleanup of the sys_msync() function:

1) using the PAGE_ALIGN() macro instead of "manual" alignment;
2) improved readability of the loop traversing the process memory regions.

Signed-off-by: Anton Salikhmetov <[EMAIL PROTECTED]>
---
 mm/msync.c |   78 +++-
 1 files changed, 35 insertions(+), 43 deletions(-)

diff --git a/mm/msync.c b/mm/msync.c
index 144a757..ff654c9 100644
--- a/mm/msync.c
+++ b/mm/msync.c
@@ -1,24 +1,25 @@
 /*
  * linux/mm/msync.c
  *
+ * The msync() system call.
  * Copyright (C) 1994-1999  Linus Torvalds
+ *
+ * Substantial code cleanup.
+ * Copyright (C) 2008 Anton Salikhmetov <[EMAIL PROTECTED]>
  */
 
-/*
- * The msync() system call.
- */
+#include 
 #include 
 #include 
 #include 
-#include 
-#include 
 #include 
+#include 
 
 /*
  * MS_SYNC syncs the entire file - including mappings.
  *
  * MS_ASYNC does not start I/O (it used to, up to 2.5.67).
- * Nor does it marks the relevant pages dirty (it used to up to 2.6.17).
+ * Nor does it mark the relevant pages dirty (it used to up to 2.6.17).
  * Now it doesn't do anything, since dirty pages are properly tracked.
  *
  * The application may now run fsync() to
@@ -33,71 +34,62 @@ asmlinkage long sys_msync(unsigned long start, size_t len, int flags)
unsigned long end;
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
-   int unmapped_error = 0;
-   int error = -EINVAL;
+   int error = 0, unmapped_error = 0;
 
if (flags & ~(MS_ASYNC | MS_INVALIDATE | MS_SYNC))
-   goto out;
+   return -EINVAL;
if (start & ~PAGE_MASK)
-   goto out;
+   return -EINVAL;
if ((flags & MS_ASYNC) && (flags & MS_SYNC))
-   goto out;
-   error = -ENOMEM;
-   len = (len + ~PAGE_MASK) & PAGE_MASK;
+   return -EINVAL;
+
+   len = PAGE_ALIGN(len);
end = start + len;
if (end < start)
-   goto out;
-   error = 0;
+   return -ENOMEM;
if (end == start)
-   goto out;
+   return 0;
+
/*
 * If the interval [start,end) covers some unmapped address ranges,
 * just ignore them, but return -ENOMEM at the end.
 */
down_read(&mm->mmap_sem);
vma = find_vma(mm, start);
-   for (;;) {
+   do {
struct file *file;
 
-   /* Still start < end. */
-   error = -ENOMEM;
-   if (!vma)
-   goto out_unlock;
-   /* Here start < vma->vm_end. */
+   if (!vma) {
+   error = -ENOMEM;
+   break;
+   }
if (start < vma->vm_start) {
start = vma->vm_start;
-   if (start >= end)
-   goto out_unlock;
+   if (start >= end) {
+   error = -ENOMEM;
+   break;
+   }
unmapped_error = -ENOMEM;
}
-   /* Here vma->vm_start <= start < vma->vm_end. */
-   if ((flags & MS_INVALIDATE) &&
-   (vma->vm_flags & VM_LOCKED)) {
+   if ((flags & MS_INVALIDATE) && (vma->vm_flags & VM_LOCKED)) {
error = -EBUSY;
-   goto out_unlock;
+   break;
}
file = vma->vm_file;
-   start = vma->vm_end;
-   if ((flags & MS_SYNC) && file &&
-   (vma->vm_flags & VM_SHARED)) {
+   if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) {
get_file(file);
up_read(&mm->mmap_sem);
error = do_fsync(file, 0);
fput(file);
-   if (error || start >= end)
-   goto out;
+   if (error)
+   return error;
down_read(&mm->mmap_sem);
-   vma = find_vma(mm, start);
-   } else {
-   if (start >= end) {
-   error = 0;
-   goto out_unlock;
-   }
-   vma = vma->vm_next;
}
-   }
-out_unlock:
+
+   start = vma->vm_end;
+   vma = vma->vm_next;
+   } while (start < end);
up_read(&mm->mmap_sem);
-out:
+
return error ? : unmapped_error;
 }
-- 
1.4.4.4
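
The first item of the changelog, replacing the "manual" alignment with
PAGE_ALIGN(), can be checked in user space. The sketch below assumes a
4096-byte page and uses SKETCH_-prefixed macros and hypothetical function
names so as not to clash with system headers; check_msync_range() mirrors
only the argument checks at the top of the patched sys_msync(), not the
VMA walk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel takes these from its own headers. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_PAGE_ALIGN(x) \
	(((x) + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK)

/* The expression the patch removes: round len up to a page boundary. */
static unsigned long manual_align(unsigned long len)
{
	return (len + ~SKETCH_PAGE_MASK) & SKETCH_PAGE_MASK;
}

/*
 * Mirrors the early checks of the cleaned-up function: start must be
 * page aligned, len is rounded up, and a wrapped range is an error.
 * Stores the end address in *end and returns 0 when the range is valid.
 */
static int check_msync_range(unsigned long start, unsigned long len,
			     unsigned long *end)
{
	assert(end != NULL);
	if (start & ~SKETCH_PAGE_MASK)
		return -EINVAL;
	len = SKETCH_PAGE_ALIGN(len);
	*end = start + len;
	if (*end < start)
		return -ENOMEM;
	return 0;
}
```

Since ~SKETCH_PAGE_MASK equals SKETCH_PAGE_SIZE - 1, the two forms compute
the same value; the macro only makes the intent easier to read.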


Re: Huawei EC321 CDMA PCCARD support broken

2008-01-12 Thread Zhang Weiwu

Daolong Wang wrote:
> Sigh! I was just going to buy one……
>   
If you find any CDMA PCCARD that works with Linux, please let me know too;
I am looking for a replacement myself. If you can wait a few weeks, I will
get an Ubuntu 6.04 live CD from my Beijing office and then I'll let you
know whether this product really works on an older kernel.

For your information, BORA 5188 USB is what my friend is using. It works
fine on the latest Ubuntu kernel:
http://auction1.taobao.com/auction/0/item_detail-0db2-146885a140b439a1aa469c2c9b3562dc.jhtml
Unfortunately she took the last one available on taobao.com, thus I
cannot buy one myself.

Meanwhile, I would like to draw the kernel list's attention to support for
the EC321 card because, as you probably know if you are in China, it is a
very popular device.

Re: PROBLEM REMAINS: [sata_nv ADMA breaks ATAPI] Crash on accessing DVD-RAM

2008-01-12 Thread James Bottomley

On Sat, 2008-01-12 at 19:38 -0600, Robert Hancock wrote:
> James Bottomley wrote:
> > On Sat, 2008-01-12 at 17:04 -0600, Robert Hancock wrote:
> >> I don't think the problem is that there's some buffer which is getting 
> >> allocated above 4GB and never bounced, since the problem goes away if 
> >> ADMA is disabled entirely and the DMA mask remains 32-bit always. My 
> >> guess is something is basing its decision on whether to bounce or not on 
> >> the device DMA mask. That can't possibly work properly for sata_nv since 
> >> the same PCI device has 2 ports, one of which can be in ADMA mode and 
> >> 64-bit capable and the other can be in legacy mode and only 32-bit capable.
> > 
> > Erm, well, you can't decouple them.  Having a differing blk queue bounce
> > mask and device mask is going to cause huge trouble.  The reason is this
> > insidious nasty called swiotlb.  Basically, with it enabled (and again,
> > it can be on ia64 or x86_64), the kernel can bypass the bounce limit
> > safe in the knowledge that swiotlb will fix up behind in the dma_map_
> > Unfortunately, if the device mask doesn't match the queue mask then
> > swiotlb will never kick in and you'll end up with mapped pages beyond
> > the 4GB limit.
> 
> Yuck.. All the IOMMU DMA mapping code checks against the device DMA 
> mask, so it looks like if we get to the point of doing the DMA mapping 
> on >4GB addresses in libata we're screwed with this approach.
> 
> The key problem is that both ata_ports share the same struct device with 
> one DMA mask which really doesn't match what this controller wants. I 
> wonder if we could do a different struct device for each port?
> 
> Other than that, I guess the solutions would be to just set a 32-bit 
> mask on the device if either port has an ATAPI device connected (which 
> is fairly ugly, considering that you could do things like hotplug an 
> ATAPI device when the other port was in use, for example), or do 
> something to prevent requests from reaching this point with >4GB 
> addresses in the first place..

Well ... assuming this is the problem (and perhaps we'd better get the
traces to confirm) there are at least three possible solutions:

 1. As you say, just take the pci device mask down to 32 bits.
 2. Find the problematic allocations and add GFP_DMA32
 3. set the mask on the actual SCSI device rather than the PCI
    device and pass that into dma_map_ (this approach would have to
    get signoff from the arch people; I know it will work on parisc
    and x86 but I'm not sure about any other arch).
 
> >> Tejun, I believe you had a patch that was printing warnings when libata 
> >> tried to program a legacy PRD with an address over 4GB. Could we change 
> >> that to WARN_ON and get someone experiencing this to try it and
> >> see what the stack trace points to?
> > 
> > Unfortunately, the stack trace probably won't help, since the command
> > likely gets issued from the block request function, so the trace won't
> > go back to the culpable initiator; that's why the command would be
> > helpful.
> 
> Well, dumping the ATA command surely isn't helpful, as I'm sure it will 
> be PACKET. I guess we'd have to dump out the actual CDB..

Sorry, when a SCSI person says dump the command, they mean the CDB.

James
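
The failure mode being guessed at in this thread can be written down as a
small user-space model. It is a sketch of the hypothesis only, not of
libata, the block layer or swiotlb: the function names are hypothetical,
and a bounced buffer is simply assumed to land in memory the port can
reach.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_32 0xffffffffULL
#define MASK_64 0xffffffffffffffffULL

/* True if the whole buffer [addr, addr + len) is addressable under mask. */
static bool fits_mask(uint64_t addr, uint64_t len, uint64_t mask)
{
	assert(len > 0);
	return addr <= mask && len - 1 <= mask - addr;
}

/*
 * Hypothesis from the thread: the bounce decision is made against the
 * shared device mask, while the port that receives the buffer may have a
 * narrower limit. Returns true when an unbounced buffer reaches a port
 * that cannot address it.
 */
static bool bounce_decision_is_unsafe(uint64_t device_mask,
				      uint64_t port_limit,
				      uint64_t addr, uint64_t len)
{
	bool bounced = !fits_mask(addr, len, device_mask);

	if (bounced)
		return false;	/* assumed: bounce buffers are reachable */
	return !fits_mask(addr, len, port_limit);
}
```

With a 64-bit device mask and a 32-bit port, a buffer at the 4GB boundary
is passed through unbounced and is unsafe; with the device mask taken down
to 32 bits, as in solution 1 above, the same buffer is bounced instead.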



Re: [kvm-devel] boot stops after console handover?

2008-01-12 Thread Carlo Marcelo Arenas Belon

On Sun, Jan 13, 2008 at 01:19:13AM +0000, Antoine Martin wrote:
> Carlo Marcelo Arenas Belon wrote:
> > On Sat, Jan 12, 2008 at 11:01:28PM +0200, Avi Kivity wrote:
> >> Antoine Martin wrote:
> >>
> >>> FYI, just tried building 2.6.24-rc7-git4 and got this warning:
> >>> (...)
> >> Probably harmless, but worth reporting to lkml.
> > 
> > couldn't replicate it here, but I've usually seen those kinds of errors
> > when building a kernel that hasn't been "mrproper" between re-configurations
> > 
> cp .config ../config-2.6.24-rc7-git4
> make mrproper
> cp ../config-2.6.24-rc7-git4 .config

make oldconfig  # presumed

> make -j 4

replicated with your configuration, but I still can't reproduce it with mine,
even when forcing, as you do,

  CONFIG_KVM=y
  CONFIG_KVM_INTEL=y

available from :

  http://tapir.sajinet.com.pe/kvm/config-2.6.24-rc7-git4

> (...)
>   MODPOST vmlinux.o
> WARNING: vmlinux.o(.text+0x2cb657): Section mismatch: reference to
> .init.text:register_cpu_notifier (between 'kvm_init_x86' and 'kvm_sched_in')
> WARNING: vmlinux.o(.text.head+0xe4): Section mismatch: reference to
> .init.data.2:trampoline_level4_pgt (between 'ident_complete' and
> 'secondary_startup_64')
> WARNING: vmlinux.o(.text.head+0xeb): Section mismatch: reference to
> .init.data.2:trampoline_level4_pgt (between 'ident_complete' and
> 'secondary_startup_64')
>   LD  vmlinux
>   SYSMAP  System.map
>   SYSMAP  .tmp_System.map
> 
> You can grab the .config here:
> http://194.145.196.85/kvm/config-2.6.24-rc7-git4

is this the configuration for the host kernel?

if you want to use the latest code for kvm you should have CONFIG_KVM disabled
and instead use the external module as instructed in (HINT: you don't have a
patched kernel) :

  http://kvm.qumranet.com/kvmwiki/HOWTO1

not sure which version will be part of 2.6.24 but I suspect it might not be
kvm 59.

Carlo

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Robert Hancock


Linus Torvalds wrote:
> On Fri, 11 Jan 2008, Matthew Wilcox wrote:
>> Did I miss a bug report?  The only problems I'm currently aware of are
>> the ones where using MMCONFIG during BAR probing causes a hard lockup on
>> some Intel machines, and the ones where we get bad config data on some
>> AMD machines due to the configuration retry status being mishandled.
>
> Hmm. Were all those reports root-caused to just that BAR probing? If so,
> we may be in better shape than I worried.

As far as I know, the known MMCONFIG-related issues are or have been:

- Some devices built into the AMD K8 integrated northbridge can't be
  reached by MMCONFIG - already handled

- Overlap of device BAR and MMCONFIG aperture during BAR sizing causing
  lockup - can be avoided by disabling device decode during BAR sizing.

- PCI Express CRS-related issues - already handled by disabling CRS by
  default

- Devices behind certain host bridges (some AMD HT to PCI-X bridges,
  others?) can't be reached by MMCONFIG - can be handled by Tony Camuso's
  patch or something similar (note that this is really a BIOS bug, it
  should not list those buses in the MCFG table if MMCONFIG cannot access
  them, and if it didn't I think we could already handle that)

- Some issue with some AMD CPUs needing MMCONFIG accesses to use a
  certain register, I believe? Already handled?

Of these, I think the PCI BAR/MMCONFIG overlap problem is responsible
for by far the most cases of machines thought to have "broken MMCONFIG",
when in fact they were nothing of the sort. I don't recall hearing of a
single machine where MMCONFIG really just didn't work at all.

As I've mentioned before, all of these issues (well, I suppose not the
BAR overlap one) need to be resolved whether we have Arjan's patch or
not, otherwise if a driver does opt in and tries to use extended config
space it will still break. And if they are resolved, the patch seems
quite pointless.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


[PATCH] Documentation: Add 00-INDEX file for AoE

2008-01-12 Thread Jesper Juhl

Documentation/aoe/ is missing a 00-INDEX file. Add one.

Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>
---

 00-INDEX |   18 ++
 1 file changed, 18 insertions(+)

--- /dev/null   2005-11-21 04:22:37.0 +0100
+++ linux-2.6/Documentation/aoe/00-INDEX   2008-01-13 02:37:29.0 +0100
@@ -0,0 +1,18 @@
+00-INDEX
+   - this file.
+aoe.txt
+   - ATA over Ethernet (AoE) driver user guide and pointer to tools.
+autoload.sh
+   - script for making the AoE driver autoload via /etc/modprobe.conf.
+mkdevs.sh
+   - script for creating required AoE device nodes.
+mkshelf.sh
+   - script for making one shelf's worth of block device nodes.
+status.sh
+   - script to collate and present sysfs information about AoE storage.
+todo.txt
+   - list of things yet to be done for the AoE driver.
+udev-install.sh
+   - script to install the aoe-specific udev rules from udev.txt.
+udev.txt
+   - list of udev rules for AoE device node creation.




Re: PROBLEM REMAINS: [sata_nv ADMA breaks ATAPI] Crash on accessing DVD-RAM

2008-01-12 Thread Robert Hancock


James Bottomley wrote:
> On Sat, 2008-01-12 at 17:04 -0600, Robert Hancock wrote:
>> I don't think the problem is that there's some buffer which is getting
>> allocated above 4GB and never bounced, since the problem goes away if
>> ADMA is disabled entirely and the DMA mask remains 32-bit always. My
>> guess is something is basing its decision on whether to bounce or not on
>> the device DMA mask. That can't possibly work properly for sata_nv since
>> the same PCI device has 2 ports, one of which can be in ADMA mode and
>> 64-bit capable and the other can be in legacy mode and only 32-bit capable.
>
> Erm, well, you can't decouple them.  Having a differing blk queue bounce
> mask and device mask is going to cause huge trouble.  The reason is this
> insidious nasty called swiotlb.  Basically, with it enabled (and again,
> it can be on ia64 or x86_64), the kernel can bypass the bounce limit
> safe in the knowledge that swiotlb will fix up behind in the dma_map_
> Unfortunately, if the device mask doesn't match the queue mask then
> swiotlb will never kick in and you'll end up with mapped pages beyond
> the 4GB limit.


Yuck.. All the IOMMU DMA mapping code checks against the device DMA 
mask, so it looks like if we get to the point of doing the DMA mapping 
on >4GB addresses in libata we're screwed with this approach.


The key problem is that both ata_ports share the same struct device with 
one DMA mask which really doesn't match what this controller wants. I 
wonder if we could do a different struct device for each port?


Other than that, I guess the solutions would be to just set a 32-bit 
mask on the device if either port has an ATAPI device connected (which 
is fairly ugly, considering that you could do things like hotplug an 
ATAPI device when the other port was in use, for example), or do 
something to prevent requests from reaching this point with >4GB 
addresses in the first place..
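
A minimal sketch of the first option above, just to make the constraint concrete. The names (port_state, shared_dma_mask) are hypothetical and are not sata_nv or libata code: since both ports hang off one struct device, the single DMA mask has to be the intersection of what each port can currently handle, so one legacy or ATAPI port pulls the whole device down to 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

#define DMA_MASK_32 0xffffffffULL
#define DMA_MASK_64 0xffffffffffffffffULL

/* Hypothetical per-port state; real drivers track this differently. */
struct port_state {
    int adma_enabled;    /* port is running in ADMA (64-bit capable) mode */
    int atapi_attached;  /* an ATAPI device forces legacy (32-bit) mode */
};

/* Mask a single port could accept if it had a device of its own. */
uint64_t port_dma_mask(const struct port_state *p)
{
    return (p->adma_enabled && !p->atapi_attached) ? DMA_MASK_64 : DMA_MASK_32;
}

/* Mask for the shared device: the intersection of all per-port masks. */
uint64_t shared_dma_mask(const struct port_state *ports, size_t n)
{
    uint64_t mask = DMA_MASK_64;

    for (size_t i = 0; i < n; i++)
        mask &= port_dma_mask(&ports[i]);
    return mask;
}
```

The hotplug case mentioned above is what makes this ugly: the result can change while the other port already has 64-bit mappings in flight.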




>> Tejun, I believe you had a patch that was printing warnings when libata
>> tried to program a legacy PRD with an address over 4GB. Could we change
>> that to WARN_ON and get someone experiencing this to try it and
>> see what the stack trace points to?
>
> Unfortunately, the stack trace probably won't help, since the command
> likely gets issued from the block request function, so the trace won't
> go back to the culpable initiator; that's why the command would be
> helpful.


Well, dumping the ATA command surely isn't helpful, as I'm sure it will 
be PACKET. I guess we'd have to dump out the actual CDB..


Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Tony Camuso


Thanks, Arjan.

The problem we have been experiencing has to do with Northbridges,
not with devices.

As far as the device is concerned, after the Northbridge translates
the config access into PCI bus cycles, the device has no idea what
mechanism drove the Northbridge to the translation.

That is to say, the device does not know whether the config cycle
on the bus was caused by an MMCONFIG cycle or a legacy Port IO
cycle delivered to the Northbridge.

In systems that had Northbridges that did not respond correctly to
MMCONFIG cycles, like the AMD 8132, we (HP & RH) were blacklisting
whole platforms to limit them to Port IO PCI config.

However, when platforms emerged using both legacy PCI and PCI express,
the platforms that were limited to Port IO config cycles were not
express compliant, since the express spec requires the platform to
be able to address the full 4096 byte region of config space to
be considered express-compliant.

The patch I devised concerned itself with Northbridges and separated
MMCONFIG-compliant buses from those that could not handle MMCONFIG.

Therefore, the express bus in the platform could happily employ
MMCONFIG to access the entire 4K region, while the legacy bus
with the non-compliant Northbridge could be restricted to Port IO
config.
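
As a rough illustration of that per-bus split (a sketch only, with hypothetical names; it is not the code from my patch), the access method can be chosen from a per-bus table, with extended registers simply unreachable on buses that are restricted to Port IO:

```c
#include <stdbool.h>

#define MAX_BUS 256

enum cfg_method { CFG_NONE, CFG_PORT_IO, CFG_MMCONFIG };

/* Hypothetical result of probing each bus for working MMCONFIG. */
struct bus_caps {
    bool mmconfig_ok[MAX_BUS];
};

/* Pick the config access mechanism for one register on one bus. */
enum cfg_method pick_cfg_method(const struct bus_caps *caps,
                                unsigned bus, unsigned reg)
{
    if (bus >= MAX_BUS || reg >= 4096)
        return CFG_NONE;
    if (caps->mmconfig_ok[bus])
        return CFG_MMCONFIG;            /* full 4K region reachable */
    /* Port IO only reaches the first 256 bytes of config space. */
    return reg < 256 ? CFG_PORT_IO : CFG_NONE;
}
```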

However, even with my patch, the problem remained where devices
requiring large displacements could overlap the BIOS-mapped
MMCONFIG region. In such a situation, where the bus has passed
the MMCONFIG test, the MMCONFIG region can get doubly mapped by
bus-sizing code, causing the system to hang.
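
To make the overlap concrete, here is a small sketch under stated assumptions: the window is taken to start at bus 0, each bus occupies 1 MiB of it (32 devices x 8 functions x 4 KiB), and the struct and function names are made up for illustration rather than taken from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Each bus owns 1 MiB of the window: 32 devices x 8 functions x 4 KiB. */
#define MMCFG_BYTES_PER_BUS (1ULL << 20)

/* Hypothetical description of one MMCONFIG window, starting at bus 0. */
struct mmcfg_window {
    uint64_t base;    /* physical base address reported by the BIOS */
    unsigned nbuses;  /* number of buses covered (0 .. nbuses - 1) */
};

uint64_t mmcfg_window_size(const struct mmcfg_window *w)
{
    return (uint64_t)w->nbuses * MMCFG_BYTES_PER_BUS;
}

/* Address of one config register inside the window. */
uint64_t mmcfg_addr(const struct mmcfg_window *w,
                    unsigned bus, unsigned devfn, unsigned reg)
{
    return w->base + ((uint64_t)bus << 20) + ((uint64_t)devfn << 12) + reg;
}

/* True if [bar_start, bar_start + bar_size) intersects the window. */
bool bar_overlaps_mmcfg(const struct mmcfg_window *w,
                        uint64_t bar_start, uint64_t bar_size)
{
    uint64_t win_end = w->base + mmcfg_window_size(w);

    return bar_size != 0 && bar_start < win_end &&
           w->base < bar_start + bar_size;
}
```

A BAR that is large enough, or that is temporarily moved while it is being sized, can satisfy this test even on a bus where MMCONFIG itself works fine.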

The remedy proposed by Loic and implemented by Ivan is actually
quite elegant, in that it addresses all these problems quite
effectively while eliminating a ration of specialized and somewhat
obscure code.

In my humble opinion, Port IO config access is here to stay, having
been defined as an architected mechanism in the PCI 2.1 spec.

This is most especially true for x86.

In other words, for x86, I don't think we need to worry about Port
IO config access ever going away at all.

Re: [kvm-devel] boot stops after console handover?

2008-01-12 Thread Antoine Martin

Carlo Marcelo Arenas Belon wrote:
> On Sat, Jan 12, 2008 at 11:01:28PM +0200, Avi Kivity wrote:
>> Antoine Martin wrote:
>>
>>> FYI, just tried building 2.6.24-rc7-git4 and got this warning:
>>> (...)
>> Probably harmless, but worth reporting to lkml.
> 
> couldn't replicate it here, but I'd seen usually those kind of errors
> when building a kernel that hasn't been "mrproper" between re-configurations
> 
cp .config ../config-2.6.24-rc7-git4
make mrproper
cp ../config-2.6.24-rc7-git4 .config
make -j 4
(...)
  MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x2cb657): Section mismatch: reference to
.init.text:register_cpu_notifier (between 'kvm_init_x86' and 'kvm_sched_in')
WARNING: vmlinux.o(.text.head+0xe4): Section mismatch: reference to
.init.data.2:trampoline_level4_pgt (between 'ident_complete' and
'secondary_startup_64')
WARNING: vmlinux.o(.text.head+0xeb): Section mismatch: reference to
.init.data.2:trampoline_level4_pgt (between 'ident_complete' and
'secondary_startup_64')
  LD  vmlinux
  SYSMAP  System.map
  SYSMAP  .tmp_System.map

You can grab the .config here:
http://194.145.196.85/kvm/config-2.6.24-rc7-git4

Antoine

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Arjan van de Ven

On Sat, 12 Jan 2008 19:12:23 -0500
Tony Camuso <[EMAIL PROTECTED]> wrote:

> Arjan,
> 
> I have not seen your MMCONFIG patch.
> 
> Would you mind sending me a copy?
> 

sure

On PCs, PCI extended configuration space (4Kb) is riddled with problems
associated with the memory mapped access method (MMCONFIG). At the same
time, there are very few machines that actually need or use this
extended configuration space.

At this point in time, the only sensible action is to make access to the
extended configuration space an opt-in operation for those device
drivers that need/want access to this space, as well as for those
userland diagnostics utilities that (on admin request) want to access
this space.

It's inevitable that this is done per device rather than per bus; we'll
be needing per device PCI quirks to turn this extended config space off
over time no matter what; in addition, it gives the least amount of
surprise: loading a driver for a device only impacts that one device,
not a whole bus worth of devices (although it'll be common to have one
physical device per bus on PCI-E).

The (desirable) side-effect of this patch is that all enumeration is
done using normal configuration cycles.

The patch below splits the lower level PCI config space operations (which
operate on a bus) in two: one that normally only operates on traditional
space, and one that gets used after the driver has opted in to using the
extended configuration space. This has led to a little code
duplication, but it's not all that bad (most of it is prototypes in
headers and such).
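
As a rough user-space model of that split (a sketch only; the struct, the stub accessors and the function names here are hypothetical and do not appear in the patch), the dispatch looks roughly like this: traditional ops by default, extended ops only after an explicit opt-in, and an error for anything beyond 256 bytes otherwise.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor table; stands in for a set of bus ops. */
struct cfg_ops {
    int (*read)(int reg, int len, uint32_t *val);
};

/* Hypothetical per-device state for the sketch. */
struct sketch_dev {
    const struct cfg_ops *ops;      /* traditional (first 256 bytes) */
    const struct cfg_ops *ext_ops;  /* extended space, may be NULL */
    int ext_cfg_enabled;            /* set once the driver opts in */
};

/* Stub accessors so the dispatch can be exercised without hardware. */
int stub_basic_read(int reg, int len, uint32_t *val)
{
    (void)reg; (void)len;
    *val = 1;
    return 0;
}

int stub_ext_read(int reg, int len, uint32_t *val)
{
    (void)reg; (void)len;
    *val = 2;
    return 0;
}

/* The opt-in itself: a driver (or root, via sysfs) flips the flag. */
void sketch_enable_ext_cfg(struct sketch_dev *dev)
{
    dev->ext_cfg_enabled = 1;
}

int sketch_cfg_read(const struct sketch_dev *dev, int reg, int len,
                    uint32_t *val)
{
    const struct cfg_ops *ops = dev->ops;

    if (reg < 0 || reg >= 4096)
        return -EINVAL;
    if (dev->ext_cfg_enabled && dev->ext_ops)
        ops = dev->ext_ops;         /* opted in: use the extended ops */
    else if (reg >= 256)
        return -EINVAL;             /* extended space needs the opt-in */
    if (!ops)
        return -EINVAL;
    return ops->read(reg, len, val);
}
```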

Architectures that have a solid reliable way to get to extended
configuration space can just keep doing what they do now and allow
extended space access from the "traditional" bus ops, and just not fill
in the new bus ops.  (This could include x86 for, say, BIOS year 2009
and later, but doesn't right now)

This patch also adds a sysfs property for each device into which root
can write a '1' to enable extended configuration space. The kernel will
print a notice into dmesg when this happens (including the name of the
app) so that if the system crashes as a result of this action, the user
can know what action/tool caused it.

Signed-off-by: Arjan van de Ven <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 Documentation/ABI/testing/sysfs-pci-extended-config |   39 
 arch/x86/pci/common.c   |   23 +
 arch/x86/pci/init.c |   10 
 arch/x86/pci/mmconfig_32.c  |2 
 arch/x86/pci/mmconfig_64.c  |2 
 arch/x86/pci/pci.h  |2 
 drivers/pci/access.c|   46 +++
 drivers/pci/pci-sysfs.c |   31 +
 drivers/pci/pci.c   |   28 +++
 include/linux/pci.h |   47 +---
 10 files changed, 222 insertions(+), 8 deletions(-)

--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-pci-extended-config
@@ -0,0 +1,39 @@
+What:  /sys/devices/pci//extended_config_space
+Date:  January 11, 2008
+Contact:   Arjan van de Ven <[EMAIL PROTECTED]>
+Description:
+   This attribute is for use for system-diagnostic software
+   only.
+
+   The kernel may decide to restrict PCI configuration space
+   access for userspace to the first 64 or 256 bytes by
+   default, for stability reasons. This attribute, when
+   present, can be used to request access to the full
+   4Kb from the kernel.
+
+   Request to get access to the full 4Kb can be done by
+   writing a '1' into this attribute file. All other values
+   are reserved for future use and should not be used by
+   software at this point.
+
+   The kernel may log the request to the various kernel
+   logging services. The kernel may decide to ignore the
+   request if the kernel deems extended configuration space
+   access not reliable enough for the system or the device.
+   The kernel may decide to not present this attribute
+   if the kernel decides extended config space is reliable
+   and made available by default, or if the kernel decides
+   that extended configuration space will never be
+   accessible.
+
+   Software needs to gracefully deal with getting the
+   access not granted. Software also needs to gracefully deal
+   with this attribute not being present.
+
+   Due to the fragility of extended configuration space,
+   system diagnostic software should only set this attribute
+   on explicit user request, or in the case of GUI like tool

[PATCH 2/2] irda: avoid potential memory leak in irda_setsockopt()

2008-01-12 Thread Jesper Juhl


There are paths through the irda_setsockopt() function where we return and 
may or may not have allocated a new ias_obj but in any case have not used 
it for anything yet and we end up leaking memory.

As far as I can tell, in the case where we didn't allocate a new ias_obj 
but simply were planning to use one already available then we should not 
free it before returning. But when we have allocated a brand new ias_obj 
but have not yet used it or put it on any lists etc and then return, 
that's a memory leak.

There are two cases:

1)
   switch (optname) {
   case IRLMP_IAS_SET:
 ...
 if(ias_obj == (struct ias_object *) NULL) {
 /* Create a new object */
--[alloc]--> ias_obj = irias_new_object(ias_opt->irda_class_name,
jiffies);
 ...
 switch(ias_opt->irda_attrib_type) {
 case IAS_OCT_SEQ:
   /* Check length */
   if(ias_opt->attribute.irda_attrib_octet_seq.len >
  IAS_MAX_OCTET_STRING) {
   kfree(ias_opt);
--[leak]-->return -EINVAL;
 ...
 }
 irias_insert_object(ias_obj);

The allocated object isn't referenced at all until we get outside the 
inner switch, so clearly we leak it (if we took the path that allocated it 
that is).


2)
The second case is the same as the above, except it's the default: case in 
the inner switch instead of case IAS_OCT_SEQ:

   default :
 kfree(ias_opt);
 return -EINVAL;


The way I propose to fix this is with a new variable that keeps track of 
whether or not we found an existing ias_obj to use or if we took the 
allocation path, then use that variable to determine if we should free 
ias_obj before returning from the function.
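
Reduced to a standalone user-space sketch (hypothetical names, with a live-object counter added purely so the leak can be observed; this is not the irda code itself), the idea is:

```c
#include <errno.h>
#include <stdlib.h>

struct obj {
    int len;
};

/* Counts live allocations so the error paths can be checked for leaks. */
int live_objs;

struct obj *obj_new(void)
{
    struct obj *o = calloc(1, sizeof(*o));

    if (o)
        live_objs++;
    return o;
}

void obj_free(struct obj *o)
{
    if (o) {
        free(o);
        live_objs--;
    }
}

/*
 * Use 'existing' if the caller found one, otherwise allocate.
 * On failure, free the object only if this call allocated it.
 */
int set_attr(struct obj *existing, int len, int max_len, struct obj **out)
{
    struct obj *o = existing;
    int alloc_new_obj = 0;

    if (!o) {
        o = obj_new();
        if (!o)
            return -ENOMEM;
        alloc_new_obj = 1;
    }

    if (len > max_len) {
        if (alloc_new_obj)
            obj_free(o);    /* ours, and nobody else has seen it yet */
        return -EINVAL;
    }

    o->len = len;
    *out = o;
    return 0;
}
```

An object that was found rather than allocated is left alone on the error path, since something else still owns it.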

I'm not very intimate with this code, so if there's a better solution I'd 
very much like to hear it. It's also entirely possible that someone with 
more knowledge of this code can prove that these cases can't actually ever 
happen - if that's the case then please let me know.

This patch is meant to be applied on top of 
  [PATCH 1/2] irda: return -ENOMEM upon failure to allocate new ias_obj

The Coverity checker gets credit for pointing its finger towards this.


Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>
---

 af_irda.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index e33f0a5..352e8a7 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1824,7 +1824,8 @@ static int irda_setsockopt(struct socket *sock, int level, int optname,
 	struct irda_sock *self = irda_sk(sk);
 	struct irda_ias_set *ias_opt;
 	struct ias_object  *ias_obj;
-	struct ias_attrib * ias_attr;   /* Attribute in IAS object */
+	struct ias_attrib  *ias_attr;   /* Attribute in IAS object */
+	int alloc_new_obj = 0;
 	int opt;
 
 	IRDA_DEBUG(2, "%s(%p)\n", __FUNCTION__, self);
@@ -1885,6 +1886,7 @@ static int irda_setsockopt(struct socket *sock, int level, int optname,
 				kfree(ias_opt);
 				return -ENOMEM;
 			}
+			alloc_new_obj = 1;
 		}
 
 		/* Do we have the attribute already ? */
@@ -1908,6 +1910,8 @@ static int irda_setsockopt(struct socket *sock, int level, int optname,
 			if(ias_opt->attribute.irda_attrib_octet_seq.len >
 			   IAS_MAX_OCTET_STRING) {
 				kfree(ias_opt);
+				if (alloc_new_obj)
+					kfree(ias_obj);
 				return -EINVAL;
 			}
 			/* Add an octet sequence attribute */
@@ -1936,6 +1940,8 @@ static int irda_setsockopt(struct socket *sock, int level, int optname,
 			break;
 		default :
 			kfree(ias_opt);
+			if (alloc_new_obj)
+				kfree(ias_obj);
 			return -EINVAL;
 		}
 		irias_insert_object(ias_obj);




[PATCH 1/2] irda: return -ENOMEM upon failure to allocate new ias_obj

2008-01-12 Thread Jesper Juhl


irias_new_object() can fail its memory allocation and will return NULL in 
that case. I believe the proper thing to do is to catch this, free the 
ias_opt that was allocated earlier but won't be used and then return 
-ENOMEM.
There are assertions further on that check for a NULL ias_obj, but I think 
it's a lot nicer to simply return -ENOMEM to the caller here where we know 
a memory allocation failed, rather than hitting an assertion later.

note: I don't have any means of actually testing this, so it has been 
compile tested only.


Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>
---

 af_irda.c |4 
 1 file changed, 4 insertions(+)

diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index d5e4dd7..e33f0a5 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1881,6 +1881,10 @@ static int irda_setsockopt(struct socket *sock, int level, int optname,
 			/* Create a new object */
 			ias_obj = irias_new_object(ias_opt->irda_class_name,
 						   jiffies);
+			if (!ias_obj) {
+				kfree(ias_opt);
+				return -ENOMEM;
+			}
 		}
 
 		/* Do we have the attribute already ? */




Re: [PATCH] pda_power: add device_init_wakeup

2008-01-12 Thread Dmitry

Hi,

2008/1/13, Anton Vorontsov <[EMAIL PROTECTED]>:
> On Sun, Jan 13, 2008 at 01:50:16AM +0300, Dmitry Baryshkov wrote:
> > Hi,
> >
> > Please apply this patch. Otherwise pda_power can't generate wakeup
> > events. I forgot this in the initial pda_power suspend/resume patch.
> >
> > Add device_init_wakeup to init wakeup.
>
> Thanks, folded into "pda_power: add suspend/resume support" patch.
>
> FYI, I got it first time, just didn't have time to process. ;-)


Thank you :) Sorry for pinging you too fast.

-- 
With best wishes
Dmitry

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Tony Camuso


Arjan,

I have not seen your MMCONFIG patch.

Would you mind sending me a copy?

Thanks.

Tony


Re: 2.6.24-rc7-git4: Reported regressions from 2.6.23

2008-01-12 Thread Rafael J. Wysocki

On Sunday, 13 of January 2008, Frans Pop wrote:
> Rafael J. Wysocki wrote:
> > If you know of any other unresolved regressions from 2.6.23, please let me
> > know either and I'll add them to the list.
> 
> Does this qualify as a regression?

I guess so.

> I have not seen a fix posted for it yet. 
> 
> http://lkml.org/lkml/2007/12/16/121
> Follow-ups:
> http://lkml.org/lkml/2008/1/6/228

Thanks for the pointer, I'll add it to the list.

Rafael

Re: [PATCH] pda_power: add device_init_wakeup

2008-01-12 Thread Anton Vorontsov

On Sun, Jan 13, 2008 at 01:50:16AM +0300, Dmitry Baryshkov wrote:
> Hi,
> 
> Please apply this patch. Otherwise pda_power can't generate wakeup
> events. I forgot this in the initial pda_power suspend/resume patch.
> 
> Add device_init_wakeup to init wakeup.

Thanks, folded into "pda_power: add suspend/resume support" patch.

FYI, I got it first time, just didn't have time to process. ;-)

> Signed-off-by: Dmitry Baryshkov <[EMAIL PROTECTED]>
> 
> diff --git a/drivers/power/pda_power.c b/drivers/power/pda_power.c
> index d98622f..28360e8 100644
> --- a/drivers/power/pda_power.c
> +++ b/drivers/power/pda_power.c
> @@ -207,6 +207,8 @@ static int pda_power_probe(struct platform_device *pdev)
>   }
>   }
>  
> + device_init_wakeup(&pdev->dev, 1);
> +
>   return 0;
>  
>  usb_irq_failed:

-- 
Anton Vorontsov
email: [EMAIL PROTECTED]
backup email: [EMAIL PROTECTED]
irc://irc.freenode.net/bd2

Re: [RFD] Incremental fsck

2008-01-12 Thread Daniel Phillips

On Wednesday 09 January 2008 01:16, Andreas Dilger wrote:
> While an _incremental_ fsck isn't so easy for existing filesystem
> types, what is pretty easy to automate is making a read-only snapshot
> of a filesystem via LVM/DM and then running e2fsck against that.  The
> kernel and filesystem have hooks to flush the changes from cache and
> make the on-disk state consistent.
>
> You can then set the the ext[234] superblock mount count and last
> check time via tune2fs if all is well, or schedule an outage if there
> are inconsistencies found.
>
> There is a copy of this script at:
> http://osdir.com/ml/linux.lvm.devel/2003-04/msg1.html
>
> Note that it might need some tweaks to run with DM/LVM2
> commands/output, but is mostly what is needed.

You can do this now with ddsnap (an out-of-tree device mapper target) 
either by checking a local snapshot or a replicated snapshot on a 
different machine, see:

http://zumastor.org/

Doing the check on a remote machine seems attractive because the fsck 
does not create a load on the server.

Regards,

Daniel

Re: Top 10 kernel oopses for the week ending January 12th, 2008

2008-01-12 Thread Arjan van de Ven


Adrian Bunk wrote:
> On Sat, Jan 12, 2008 at 03:13:29PM -0800, Arjan van de Ven wrote:
>> Adrian Bunk wrote:
>>> All the other reports only contain the plain trace. Is there any way to
>>> get more information whether the former is a pattern or not, and to
>>> get this information somehow displayed on the webpage?
>>
>> IF the kernel prints that its tainted or whatever it'll be shown, as well
>> as the exact versions etc etc if they are there.
>> Sadly none of this information is there prior to 2.6.24-rc4.
>> ...
>
> OK, the problem might actually not be the omission of displaying the
> tainted information but the omission of considering any relevant
> context.
>
> Looking deeper:
>
> Number #2424 is WARN_ON-after-tainted-oops.
>
> Is your rank 1 just a symptom that the system is in a bad state after
> running in what is your rank 8?
>
> In this case the information when following e.g. #2827 is quite useless
> since wherever you got this trace from all related context information
> like e.g. whether it's like #2424 just the symptom of a previous Oops is
> not displayed.

the tainted flags have a flag for "there was a previous oops", and if that's
set, the kerneloops.org website ignores the report. Simple as that.

> In the worst case, an entry might only contain WARN_ON traces without
> any information where the traces came from and whether it's worth
> looking at them or whether the system always already was in a known-bad
> state when they occurred?

again as of 2.6.24-rc4 or so, this is just no longer the case. The problem is
with older kernels which had a WARN_ON() that didn't print ANY information
other than a plain backtrace.
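
For the "previous oops" check mentioned above, here is a minimal sketch of the kind of filter involved. It is not the kerneloops.org code, the function name is made up, and it assumes the flag shows up as a 'D' among the letters after "Tainted:" in the report text:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the report's "Tainted:" flags contain 'D', i.e. the
 * kernel had already oopsed before this trace was produced.
 */
bool report_follows_oops(const char *report)
{
    const char *p = strstr(report, "Tainted:");

    if (!p)
        return false;   /* "Not tainted", or no taint line at all */

    /* Flags are uppercase letters and spaces; stop at anything else. */
    for (p += strlen("Tainted:"); *p == ' ' || isupper((unsigned char)*p); p++)
        if (*p == 'D')
            return true;
    return false;
}
```

Reports from kernels that print no taint line at all fall through as "not after an oops", which is exactly the older-kernel problem described above.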

Re: Top 10 kernel oopses for the week ending January 12th, 2008

2008-01-12 Thread Adrian Bunk

On Sat, Jan 12, 2008 at 03:13:29PM -0800, Arjan van de Ven wrote:
> Adrian Bunk wrote:
>>
>> All the other reports only contain the plain trace. Is there any way to 
>> get more information whether the former is a pattern or not, and to
>> get this information somehow displayed on the webpage?
>
> IF the kernel prints that its tainted or whatever it'll be shown, as well
> as the exact versions etc etc if they are there.
> Sadly none of this information is there prior to 2.6.24-rc4.
>...

OK, the problem might actually not be the omission of displaying the 
tainted information but the omission of considering any relevant 
context.

Looking deeper:

Number #2424 is WARN_ON-after-tainted-oops.

Is your rank 1 just a symptom that the system is in a bad state after 
running in what is your rank 8?

In this case the information when following e.g. #2827 is quite useless 
since wherever you got this trace from all related context information 
like e.g. whether it's like #2424 just the symptom of a previous Oops is 
not displayed.

In the worst case, an entry might only contain WARN_ON traces without 
any information where the traces came from and whether it's worth 
looking at them or whether the system always already was in a known-bad 
state when they occurred?

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


[git patches] net driver fixes

2008-01-12 Thread Jeff Garzik


Please pull from 'upstream-linus' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream-linus

to receive the following updates:

 MAINTAINERS |   10 ++-
 drivers/net/3c509.c |4 +
 drivers/net/Kconfig |   20 +++---
 drivers/net/fs_enet/fs_enet-main.c  |   11 ++-
 drivers/net/loopback.c  |2 +-
 drivers/net/netxen/netxen_nic.h |   69 +
 drivers/net/netxen/netxen_nic_init.c|   20 ++---
 drivers/net/netxen/netxen_nic_main.c|   70 ++---
 drivers/net/netxen/netxen_nic_niu.c |8 +-
 drivers/net/r8169.c |2 +-
 drivers/net/sky2.c  |   48 +++-
 drivers/net/sky2.h  |4 +-
 drivers/net/tulip/de4x5.c   |  127 +++
 drivers/net/tulip/tulip_core.c  |3 +-
 drivers/net/tulip/xircom_cb.c   |   54 ++---
 drivers/net/usb/asix.c  |6 +-
 drivers/net/wireless/rt2x00/rt2500usb.c |2 +-
 drivers/net/wireless/rt2x00/rt2x00pci.c |   20 -
 drivers/net/wireless/rt2x00/rt2x00usb.c |   17 -
 drivers/net/wireless/rt2x00/rt61pci.c   |   12 +++
 20 files changed, 238 insertions(+), 271 deletions(-)

Al Viro (3):
  xircom_cb endianness fixes
  de4x5 fixes
  endianness noise in tulip_core

Anton Vorontsov (1):
  fs_enet: check for phydev existence in the ethtool handlers

Dhananjay Phadke (1):
  netxen: fix byte-swapping in tx and rx

Emil Medve (1):
  Fixed a small typo in the loopback driver

Francois Romieu (1):
  r8169: fix missing loop variable increment

Ivo van Doorn (2):
  rt2x00: Corectly initialize rt2500usb MAC
  rt2x00: Put 802.11 data on 4 byte boundary

Jens Osterkamp (1):
  spidernet MAINTAINERship update

Krzysztof Helt (1):
  3c509: PnP resource management fix

Mattias Nissler (1):
  rt2x00: Allow rt61 to catch up after a missing tx report

Russ Dill (1):
  [usb netdev] asix: fix regression

Stephen Hemminger (3):
  ip1000: menu location change
  sky2: large memory workaround.
  sky2: remove check for PCI wakeup setting from BIOS

[EMAIL PROTECTED] (4):
  netxen: update MAINTAINERS
  netxen: update driver version
  netxen: stop second phy correctly
  netxen: optimize tx handling

diff --git a/MAINTAINERS b/MAINTAINERS
index b4f611c..92aa0a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2739,8 +2739,8 @@ T: git kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6.git
 S: Maintained
 
 NETXEN (1/10) GbE SUPPORT
-P: Amit S. Kale
-M: [EMAIL PROTECTED]
+P: Dhananjay Phadke
+M: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 W: http://www.netxen.com
 S: Supported
@@ -3611,8 +3611,10 @@ L:   linux-kernel@vger.kernel.org ?
 S: Supported
 
 SPIDERNET NETWORK DRIVER for CELL
-P: Linas Vepstas
-M: [EMAIL PROTECTED]
+P: Ishizaki Kou
+M: [EMAIL PROTECTED]
+P: Jens Osterkamp
+M: [EMAIL PROTECTED]
 L: [EMAIL PROTECTED]
 S: Supported
 
diff --git a/drivers/net/3c509.c b/drivers/net/3c509.c
index edda6e1..8fafac9 100644
--- a/drivers/net/3c509.c
+++ b/drivers/net/3c509.c
@@ -385,6 +385,7 @@ static int __init el3_probe(int card_idx)
 #if defined(__ISAPNP__)
static int pnp_cards;
struct pnp_dev *idev = NULL;
+   int pnp_found = 0;
 
if (nopnp == 1)
goto no_pnp;
@@ -430,6 +431,7 @@ __again:
pnp_cards++;
 
netdev_boot_setup_check(dev);
+   pnp_found = 1;
goto found;
}
}
@@ -560,6 +562,8 @@ no_pnp:
lp = netdev_priv(dev);
 #if defined(__ISAPNP__)
lp->dev = &idev->dev;
+   if (pnp_found)
+   lp->type = EL3_PNP;
 #endif
err = el3_common_init(dev);
 
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index d9107e5..114771a 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -166,16 +166,6 @@ config NET_SB1000
 
  If you don't have this card, of course say N.
 
-config IP1000
-   tristate "IP1000 Gigabit Ethernet support"
-   depends on PCI && EXPERIMENTAL
-   select MII
-   ---help---
- This driver supports IP1000 gigabit Ethernet cards.
-
- To compile this driver as a module, choose M here: the module
- will be called ipg.  This is recommended.
-
 source "drivers/net/arcnet/Kconfig"
 
 source "drivers/net/phy/Kconfig"
@@ -1992,6 +1982,16 @@ config E1000E
  To compile this driver as a module, choose M here. The module
  will be called e1000e.
 
+config IP1000
+   tristate "IP1000 Gigabit Ethernet support"
+   depends on PCI && EXPERIMENTAL
+   select MII
+   ---help---
+ This driver supports IP1000 gigabit Ethernet cards.
+
+ To compile this

Re: PROBLEM REMAINS: [sata_nv ADMA breaks ATAPI] Crash on accessing DVD-RAM

2008-01-12 Thread James Bottomley

On Sat, 2008-01-12 at 17:04 -0600, Robert Hancock wrote:
> I don't think the problem is that there's some buffer which is getting 
> allocated above 4GB and never bounced, since the problem goes away if 
> ADMA is disabled entirely and the DMA mask remains 32-bit always. My 
> guess is something is basing its decision on whether to bounce or not on 
> the device DMA mask. That can't possibly work properly for sata_nv since 
> the same PCI device has 2 ports, one of which can be in ADMA mode and 
> 64-bit capable and the other can be in legacy mode and only 32-bit capable.

Erm, well, you can't decouple them.  Having a differing blk queue bounce
mask and device mask is going to cause huge trouble.  The reason is this
insidious nasty called swiotlb.  Basically, with it enabled (and again,
it can be on ia64 or x86_64), the kernel can bypass the bounce limit
safe in the knowledge that swiotlb will fix up behind in the dma_map_
Unfortunately, if the device mask doesn't match the queue mask then
swiotlb will never kick in and you'll end up with mapped pages beyond
the 4GB limit.
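
A deliberately simplified model of that interaction, as a sketch: the struct and function names are hypothetical, and the real block layer and swiotlb logic is considerably more involved. It only encodes the two rules stated above (with swiotlb active the block layer does not bounce, and swiotlb decides from the device mask alone).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one port's DMA configuration. */
struct dma_setup {
    uint64_t hw_limit;            /* what the port can really address */
    uint64_t queue_bounce_limit;  /* block queue bounce limit */
    uint64_t device_dma_mask;     /* mask on the shared struct device */
    bool swiotlb_active;
};

/* With swiotlb active, the block layer leaves bouncing to dma_map time. */
bool block_layer_bounces(const struct dma_setup *s, uint64_t addr)
{
    if (s->swiotlb_active)
        return false;
    return addr > s->queue_bounce_limit;
}

/* swiotlb only looks at the device mask, never at the queue limit. */
bool mapping_layer_bounces(const struct dma_setup *s, uint64_t addr)
{
    return s->swiotlb_active && addr > s->device_dma_mask;
}

/* True if an address the hardware cannot reach gets through unbounced. */
bool bad_address_reaches_hw(const struct dma_setup *s, uint64_t addr)
{
    if (addr <= s->hw_limit)
        return false;
    return !block_layer_bounces(s, addr) && !mapping_layer_bounces(s, addr);
}
```

With a 32-bit queue limit but a 64-bit device mask, a page just above 4GB gets past both checks in this model, which is the mismatch described above.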

> Tejun, I believe you had a patch that was printing warnings when libata 
> tried to program a legacy PRD with an address over 4GB. Could we change 
> that to WARN_ON and get someone experiencing this to try it and
> see what the stack trace points to?

Unfortunately, the stack trace probably won't help, since the command
likely gets issued from the block request function, so the trace won't
go back to the culpable initiator; that's why the command would be
helpful.

James


Re: Top 10 kernel oopses for the week ending January 12th, 2008

2008-01-12 Thread Arjan van de Ven


Adrian Bunk wrote:
> All the other reports only contain the plain trace. Is there any way to
> get more information whether the former is a pattern or not, and to
> get this information somehow displayed on the webpage?

IF the kernel prints that its tainted or whatever it'll be shown, as well
as the exact versions etc etc if they are there.
Sadly none of this information is there prior to 2.6.24-rc4.
(I wonder if the patch to print this should be put in -stable ;-)

Re: PROBLEM REMAINS: [sata_nv ADMA breaks ATAPI] Crash on accessing DVD-RAM

2008-01-12 Thread Robert Hancock


James Bottomley wrote:

With mem<=4098M or sata_nv.adma=0 it still mounts and works ok.
As I wrote, it would appear that somehow the blk_queue_bounce_limit 
setting that the driver has made is not being respected and the block 
layer is still trying to feed it addresses over 4GB. Any ideas anyone?


Actually, I'd be very sceptical that the blk_queue_bounce_limit isn't
working as advertised; there'd be a large number of failures if it were
not.

However, the "as advertised" part doesn't seem to be generally well
understood.  The point being that block commands are only bounced if
they come from the filesystem or the user, not if they're generated
directly inside the kernel.  Since the fault occurs before mount, it's
suggestive of the latter.

A long time ago, GFP_KERNEL allocations meant that the memory allocated
was physically under 4GB.  Then x86_64 (and before it ia64) wanted to
break this.  So they introduced a new flag:  GFP_DMA32 that behaved like
the old GFP_KERNEL and then changed GFP_KERNEL on their architectures to
return memory from anywhere.  I'd strongly suggest some piece of kernel
memory was allocated for a transfer buffer without GFP_DMA32 and then
passed in to the driver.  Unfortunately, that could be anywhere inside
cdrom or sr.  Knowing what the actual command is might help ... some of
the distinctive MMC media ones only have a single source.


Just to give some background on what sata_nv is trying to do:

There are two sets of static DMA memory allocations that sata_nv does, 
the legacy ATA PRD and padding buffer which need to be in the lower 4GB, 
and the ADMA APRD and CPB areas which can be anywhere in 64-bit memory. 
With this patch, this is done by setting a 32-bit DMA mask before 
allocating the legacy areas and setting a 64-bit DMA mask before 
allocating the ADMA areas. Previously the driver just set a 64-bit mask 
before the legacy PRD got allocated so it could end up above 4GB, in 
which case legacy DMA couldn't possibly work. That part of the problem 
appears to be successfully fixed by the patch in question.


There's a further problem with runtime DMA mapping, however. Normally 
when ADMA is enabled the controller can reach anywhere in 64-bit memory. 
However, if an ATAPI device is connected, since ADMA doesn't work with 
ATAPI commands we have to switch it off on that port and use legacy DMA, 
which is limited to 32-bit. This is where the blk_queue_bounce_limit 
call comes in, it's trying to make the block layer bounce requests above 
4GB when legacy DMA is in use.


I don't think the problem is that there's some buffer which is getting 
allocated above 4GB and never bounced, since the problem goes away if 
ADMA is disabled entirely and the DMA mask remains 32-bit always. My 
guess is something is basing its decision on whether to bounce or not on 
the device DMA mask. That can't possibly work properly for sata_nv since 
the same PCI device has 2 ports, one of which can be in ADMA mode and 
64-bit capable and the other can be in legacy mode and only 32-bit capable.
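
To make the two-port situation concrete, here is a small user-space model. It is a sketch only: `struct model_port`, `port_dma_limit()` and `device_mask_matches_all_ports()` are hypothetical names, and the only facts taken from the text above are that ADMA is 64-bit capable and legacy DMA is limited to 32 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_32BIT 0xFFFFFFFFULL
#define DMA_64BIT 0xFFFFFFFFFFFFFFFFULL

/* Hypothetical per-port state: ADMA on, or legacy DMA (e.g. ATAPI attached). */
struct model_port {
	bool adma_enabled;
};

/* ADMA can address 64 bits; legacy DMA only the low 4GB. */
static uint64_t port_dma_limit(const struct model_port *p)
{
	return p->adma_enabled ? DMA_64BIT : DMA_32BIT;
}

/*
 * A single device-wide mask describes every port correctly only if all
 * ports share the same limit. Any code keying its bounce decision off
 * the device mask is otherwise wrong for at least one port.
 */
static bool device_mask_matches_all_ports(uint64_t device_mask,
					  const struct model_port *ports,
					  int nports)
{
	assert(nports > 0);
	for (int i = 0; i < nports; i++)
		if (port_dma_limit(&ports[i]) != device_mask)
			return false;
	return true;
}
```

With one ADMA port and one legacy port, neither a 32-bit nor a 64-bit device mask matches both ports in this model.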


Tejun, I believe you had a patch that was printing warnings when libata
tried to program a legacy PRD with an address over 4GB. Could we change
that to WARN_ON and get someone experiencing this to try it and see what
the stack trace points to?

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Arjan van de Ven

On Sun, 13 Jan 2008 00:49:11 +0300
Ivan Kokshaysky <[EMAIL PROTECTED]> wrote:

> On Sat, Jan 12, 2008 at 09:45:57AM -0800, Arjan van de Ven wrote:
> > btw this is my main objection to your patch; it intertwines the
> > conf1 and mmconfig code even more.
> 
> There is nothing wrong with it; please realize that mmconf and conf1
> are just different cpu-side interfaces. Both produce precisely the
> *same* bus cycles as far as the lower 256-byte space is concerned.
> 
> > When (and I'm saying "when" not "if") systems arrive that only have
> > MMCONFIG for some of the devices, we'll have to detangle this again,
> > and I'm really not looking forward to that.
> 
> MMCONFIG for *some* of the devices? This doesn't sound realistic
> from technical point of view.

you're wrong. 

> MMCONFIG-only systems? Sure. I really hope to see these. But it won't
> be PC-AT architecture anymore. It has to be something like alpha,
> for instance, fully utilizing the 64-bit address space, and we'll have
> to have the whole low-level PCI infrastructure completely different
> for these future platforms anyway.
> Right now, each and every x86 chipset *does* require working
> conf1 just in order to set up the mmconf aperture. It's the very
> fundamental thing, sort of design philosophy.

s/x86/pc/

and not even that.

Really this is a huge design mistake in your patch, the hard coding of conf1,
and for that reason I really don't think it should go in.

We have 4 or so methods on PC today to access config space, probably going to 6
in the next year or two. One of those methods *HARD PICKING* another one as
"second best" for cases where it doesn't want to deal with is WRONG. It really
needs to be up to the architecture/platform to decide which ops vector is the
fallback. And yes on your current PC that might well be conf1.
But hardcoding that is not the right thing. We have the vectors, we have the
ranking code, just make a "second rank" thing.
Oh wait, my patch did that ;)
Then let either the mmconfig code or the wrapper above it (doesn't matter, in
fact, I can see value of making this decision in the wrapper and keep mmconfig
code simple and clean, because maybe mmconfig IS the thing that the architecture
says needs to deal with the lower 256 bytes)..

Oh wait my patch also did that pretty much ;)

The rest of my patch was defaulting to off. Is it that bit that you really hate?
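
The "ranked ops vectors with a wrapper on top" idea argued for above can be sketched in user space. This is an illustration under stated assumptions, not the patch itself: `struct cfg_ops`, `primary_ops`, `ext_ops`, `model_cfg_read()` and the two fake back ends are hypothetical names. The dispatch rule mirrors the `reg < 256 && raw_pci_ops` check from the interdiff earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops vector: one config-space access method. */
struct cfg_ops {
	const char *name;
	int (*read)(unsigned bus, unsigned devfn, int reg, uint32_t *val);
};

/* Slots that platform ranking code would fill in; either may stay NULL. */
static const struct cfg_ops *primary_ops;	/* preferred for reg < 256 */
static const struct cfg_ops *ext_ops;		/* can reach reg >= 256 */

/* The wrapper decides the fallback; the back ends know nothing of each other. */
static int model_cfg_read(unsigned bus, unsigned devfn, int reg, uint32_t *val)
{
	assert(val);
	if (reg < 0 || reg >= 4096)
		return -1;
	if (reg < 256 && primary_ops)
		return primary_ops->read(bus, devfn, reg, val);
	if (ext_ops)
		return ext_ops->read(bus, devfn, reg, val);
	return -1;
}

/* Fake back ends that return a tag so the chosen path is visible. */
static int fake_conf1_read(unsigned bus, unsigned devfn, int reg, uint32_t *val)
{
	(void)bus; (void)devfn; (void)reg;
	*val = 1;
	return 0;
}

static int fake_mmcfg_read(unsigned bus, unsigned devfn, int reg, uint32_t *val)
{
	(void)bus; (void)devfn; (void)reg;
	*val = 2;
	return 0;
}

static const struct cfg_ops fake_conf1 = { "conf1", fake_conf1_read };
static const struct cfg_ops fake_mmcfg = { "mmconfig", fake_mmcfg_read };
```

In this model the platform picks what goes into each slot, so no access method hardcodes another one as its fallback.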


Re: [PATCH] pda_power: add device_init_wakeup

2008-01-12 Thread Dmitry Baryshkov

Hi,

Please apply this patch. Otherwise pda_power can't generate wakeup
events. I forgot this in the initial pda_power suspend/resume patch.

Add device_init_wakeup to init wakeup.

Signed-off-by: Dmitry Baryshkov <[EMAIL PROTECTED]>

diff --git a/drivers/power/pda_power.c b/drivers/power/pda_power.c
index d98622f..28360e8 100644
--- a/drivers/power/pda_power.c
+++ b/drivers/power/pda_power.c
@@ -207,6 +207,8 @@ static int pda_power_probe(struct platform_device *pdev)
}
}
 
+   device_init_wakeup(&pdev->dev, 1);
+
return 0;
 
 usb_irq_failed:


-- 
With best wishes
Dmitry


Re: 2.6.24-rc7-git4: Reported regressions from 2.6.23

2008-01-12 Thread Rafael J. Wysocki

On Saturday, 12 of January 2008, Harvey Harrison wrote:
> On Sat, 2008-01-12 at 22:05 +0100, Rafael J. Wysocki wrote:
> > [RFC: Would that be useful if I sent regression-fixing patches, CCed to the
> > appropriate maintainers/lists, along with the reports?]
> 
> Perhaps keep the regression report as-is, but for each regression with
> a patch follow up with an appropriately CC'd reply to the regression
> report.  Similar in concept to the regression report being patch 0/X
> or how the -stable series patches are announced?

That's exactly what I'm thinking about. :-)

Thanks,
Rafael

backlight module for nvidia cards -- control backlight even with offb

2008-01-12 Thread Andy Wingo

Hi all,

I have a 12" powerbook, one of the last G4's, and have long been
irritated that I couldn't use offb because it has no backlight control.
This is more irritating now that the nouveau project's X drivers are
starting to work for me on PPC, but are incompatible with the nvidiafb
frame buffer.

I decided to rip out the backlight code from the nvidia frame buffer
into a separate module that can be loaded even when using offb as the
frame buffer. I am attaching the source, but you may find a tarball with
a makefile here:

  http://wingolog.org/pub/nvbacklight-0.1.tar.bz2

I do not know what the correct solution is. Ideally offb would export a
backlight device. I tried getting open firmware to give me the needed
information, but the "reg" entry for the backlight seems short, given
that the [EMAIL PROTECTED] #address-cells == 1 and #size-cells == 1:

  $ hd /proc/device-tree/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL 
PROTECTED]/reg 
    00 00 f3 00   ||
  0004

For that reason I'm copying Ben Herrenschmidt to see if he knows
something about a proper solution. For now I'll just add nvbacklight to
my /etc/modules. Ideas about a "proper solution" are appreciated.

Regards,

Andy

/*
 * nvbacklight.c: Backlight driver for nVidia graphics cards
 *
 * Copyright 2008 Andy Wingo <[EMAIL PROTECTED]>
 * Copyright 2004 Antonino Daplas <[EMAIL PROTECTED]>
 *
 * This file is subject to the terms and conditions of the GNU General Public
 * License.  See the file COPYING in the main directory of this archive
 * for more details.
 *
 */


#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#ifdef CONFIG_PMAC_BACKLIGHT
#include 
#endif


struct nvbacklight_par {
	struct pci_dev *pci_dev;
	struct fb_info *info;
	struct backlight_device *bd;

	volatile u32 __iomem *REGS;
	volatile u32 __iomem *PCRTC0;
	volatile u32 __iomem *PCRTC;
	volatile u32 __iomem *PRAMDAC0;
	volatile u32 __iomem *PFB;
	volatile u32 __iomem *PFIFO;
	volatile u32 __iomem *PGRAPH;
	volatile u32 __iomem *PEXTDEV;
	volatile u32 __iomem *PTIMER;
	volatile u32 __iomem *PMC;
	volatile u32 __iomem *PRAMIN;
	volatile u32 __iomem *FIFO;
	volatile u32 __iomem *CURSOR;
	volatile u8 __iomem *PCIO0;
	volatile u8 __iomem *PCIO;
	volatile u8 __iomem *PVIO;
	volatile u8 __iomem *PDIO0;
	volatile u8 __iomem *PDIO;
	volatile u32 __iomem *PRAMDAC;

	u32 fpSyncs;
};


/* We do not have any information about which values are allowed, thus
 * we used safe values.
 */
#define MIN_LEVEL 0x158
#define MAX_LEVEL 0x534
#define LEVEL_STEP ((MAX_LEVEL - MIN_LEVEL) / FB_BACKLIGHT_MAX)

#define NV_WR32(p,i,d)	(__raw_writel((d), (void __iomem *)(p) + (i)))
#define NV_RD32(p,i)	(__raw_readl((void __iomem *)(p) + (i)))


static int nvidia_bl_get_level_brightness(struct fb_info *info, int level)
{
	int nlevel;

	/* Get and convert the value */
	/* No locking of bl_curve since we read a single value */
	nlevel = MIN_LEVEL + info->bl_curve[level] * LEVEL_STEP;

	if (nlevel < 0)
		nlevel = 0;
	else if (nlevel < MIN_LEVEL)
		nlevel = MIN_LEVEL;
	else if (nlevel > MAX_LEVEL)
		nlevel = MAX_LEVEL;

	return nlevel;
}

static int nvidia_bl_update_status(struct backlight_device *bd)
{
	struct nvbacklight_par *par = bl_get_data(bd);
	u32 tmp_pcrt, tmp_pmc, fpcontrol;
	int level;

	if (bd->props.power != FB_BLANK_UNBLANK ||
	    bd->props.fb_blank != FB_BLANK_UNBLANK)
		level = 0;
	else
		level = bd->props.brightness;

	tmp_pmc = NV_RD32(par->PMC, 0x10F0) & 0x;
	tmp_pcrt = NV_RD32(par->PCRTC0, 0x081C) & 0xFFFC;
	fpcontrol = NV_RD32(par->PRAMDAC, 0x0848) & 0xCFCC;

	if (level > 0) {
		tmp_pcrt |= 0x1;
		tmp_pmc |= (1 << 31); /* backlight bit */
		tmp_pmc |= nvidia_bl_get_level_brightness(par->info, level) << 16;
		fpcontrol |= par->fpSyncs;
	} else
		fpcontrol |= 0x2022;

	NV_WR32(par->PCRTC0, 0x081C, tmp_pcrt);
	NV_WR32(par->PMC, 0x10F0, tmp_pmc);
	NV_WR32(par->PRAMDAC, 0x848, fpcontrol);

	return 0;
}

static int nvidia_bl_get_brightness(struct backlight_device *bd)
{
	return bd->props.brightness;
}

static struct backlight_ops nvidia_bl_ops = {
	.get_brightness = nvidia_bl_get_brightness,
	.update_status  = nvidia_bl_update_status,
};

static struct fb_info *nvbacklight_attach(struct pci_dev *pd)
{
	struct nvbacklight_par *par;
	struct fb_info *info;
	struct backlight_device *bd;

	info = framebuffer_alloc(sizeof(struct nvbacklight

Re: regression: 100% io-wait with 2.6.24-rcX

2008-01-12 Thread Joerg Platte

Am Freitag, 11. Januar 2008 schrieb Fengguang Wu:
> On Thu, Jan 10, 2008 at 11:03:05AM +0100, Joerg Platte wrote:
> > Am Donnerstag, 10. Januar 2008 schrieb Fengguang Wu:
> > > > problem, because the iowait problem disappeared today after the
> > > > regular Debian update. I'll try to install the old package versions
> > > > to make it show up again. Maybe that helps to debug it.
> > >
> > > Thank you. I'm running sid, ext2 as rootfs now ;-)
> >
> > The error is back and I'm getting thousands of messages like this with
> > the patched kernel:
> >
> > mm/page-writeback.c 668 wb_kupdate: pdflush(146) 21115 global 3936 0 0 wc
> > _M > tw 1024 sk 0 requeue_io 301: inode 81441 size 0 at 08:07(sda7)
> > mm/page-writeback.c 668 wb_kupdate: pdflush(147) 17451 global 3936 0 0 wc
> > _M > tw 1024 sk 2 requeue_io 301: inode 81441 size 0 at 08:07(sda7)
> > mm/page-writeback.c 668 wb_kupdate: pdflush(147) 17451 global 3936 0 0 wc
> > _M > tw 1024 sk 2 requeue_io 301: inode 81441 size 0 at 08:07(sda7)
>
> Joerg, what's the output of `dumpe2fs /dev/sda7` and `lsof|grep /tmp`?

After another reboot I tried to get more information about the konqueror 
process possibly causing the iowait load by using strace -p. Here is the 
output:

gettimeofday({1200180588, 878508}, NULL) = 0
setitimer(ITIMER_VIRTUAL, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
rt_sigaction(SIGVTALRM, {SIG_DFL}, {0xb5cffed0, [VTALRM], SA_RESTART}, 8) = 0
gettimeofday({1200180588, 879942}, NULL) = 0
time(NULL)  = 1200180588
gettimeofday({1200180588, 880838}, NULL) = 0
gettimeofday({1200180588, 881284}, NULL) = 0
time(NULL)  = 1200180588
gettimeofday({1200180588, 882131}, NULL) = 0
gettimeofday({1200180588, 882572}, NULL) = 0
ioctl(5, FIONREAD, [0]) = 0
gettimeofday({1200180588, 883477}, NULL) = 0
select(16, [5 6 7 9 11 14 15], [], [], {0, 150095}) = 0 (Timeout)
gettimeofday({1200180589, 34269}, NULL) = 0
gettimeofday({1200180589, 34885}, NULL) = 0
time(NULL)  = 1200180589
gettimeofday({1200180589, 36672}, NULL) = 0
rt_sigaction(SIGVTALRM, {0xb5cffed0, [VTALRM], SA_RESTART}, {SIG_DFL}, 8) = 0
setitimer(ITIMER_VIRTUAL, {it_interval={10, 0}, it_value={5, 0}}, 
{it_interval={0, 0}, it_value={0, 0}}) = 0
gettimeofday({1200180589, 38555}, NULL) = 0
time(NULL)  = 1200180589
gettimeofday({1200180589, 39802}, NULL) = 0
setitimer(ITIMER_VIRTUAL, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
rt_sigaction(SIGVTALRM, {SIG_DFL}, {0xb5cffed0, [VTALRM], SA_RESTART}, 8) = 0
gettimeofday({1200180589, 40912}, NULL) = 0
time(NULL)  = 1200180589
gettimeofday({1200180589, 42019}, NULL) = 0
gettimeofday({1200180589, 42458}, NULL) = 0
time(NULL)  = 1200180589
gettimeofday({1200180589, 43303}, NULL) = 0
gettimeofday({1200180589, 43747}, NULL) = 0
ioctl(5, FIONREAD, [0]) = 0
gettimeofday({1200180589, 45834}, NULL) = 0
select(16, [5 6 7 9 11 14 15], [], [], {0, 149913}) = 0 (Timeout)
gettimeofday({1200180589, 194815}, NULL) = 0
ioctl(5, FIONREAD, [0]) = 0
gettimeofday({1200180589, 195730}, NULL) = 0
select(16, [5 6 7 9 11 14 15], [], [], {0, 17}) = 0 (Timeout)
gettimeofday({1200180589, 197555}, NULL) = 0
gettimeofday({1200180589, 198020}, NULL) = 0
time(NULL)  = 1200180589
gettimeofday({1200180589, 198884}, NULL) = 0
rt_sigaction(SIGVTALRM, {0xb5cffed0, [VTALRM], SA_RESTART}, {SIG_DFL}, 8) = 0
setitimer(ITIMER_VIRTUAL, {it_interval={10, 0}, it_value={5, 0}}, 
{it_interval={0, 0}, it_value={0, 0}}) = 0
gettimeofday({1200180589, 200702}, NULL) = 0
time(NULL)  = 1200180589
gettimeofday({1200180589, 200806}, NULL) = 0
setitimer(ITIMER_VIRTUAL, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
rt_sigaction(SIGVTALRM, {SIG_DFL}, {0xb5cffed0, [VTALRM], SA_RESTART}, 8) = 0
gettimeofday({1200180589, 202975}, NULL) = 0
time(NULL)  = 1200180589
gettimeofday({1200180589, 203837}, NULL) = 0
gettimeofday({1200180589, 204319}, NULL) = 0
time(NULL)  = 1200180589
gettimeofday({1200180589, 205169}, NULL) = 0
gettimeofday({1200180589, 205613}, NULL) = 0
ioctl(5, FIONREAD, [0]) = 0
gettimeofday({1200180589, 206515}, NULL) = 0
select(16, [5 6 7 9 11 14 15], [], [], {0, 149098} 

Fengguang, do you have any idea what's going wrong here?

regards,
Jörg

-- 
PGP Key: send mail with subject 'SEND PGP-KEY' PGP Key-ID: FD 4E 21 1D
PGP Fingerprint: 388A872AFC5649D3 BCEC65778BE0C605


Re: Top 10 kernel oopses for the week ending January 12th, 2008

2008-01-12 Thread Adrian Bunk

On Sat, Jan 12, 2008 at 10:48:05AM -0800, Arjan van de Ven wrote:
> The http://www.kerneloops.org website collects kernel oops and
> warning reports from various mailing lists and bugzillas as well as
> with a client users can install to auto-submit oopses.
> Below is a top 10 list of the oopses collected in the last 7 days.
> (Reports prior to 2.6.23 have been omitted in collecting the top 10)
>
> This week, a total of 136 oopses and warnings have been reported,
> compared to 46 reports in the previous 7 days.
>
> kerneloops.org news:
>   * Based on feedback from last weeks report, the website now tries
> to also present a disassembled Code: line
>   * the kerneloops collection client is now part of Fedora (rawhide)
> (yum install kerneloops)
>   * the kerneloops collection client is now included in Debian testing
> (apt-get install kerneloops)
>   * gentoo has received an updated version of the client
>
>
> Rank 1: implement (hid code)
>   WARN_ON at drivers/hid/hid-core.c:784
>   Reported 23 times (39 total reports)
>   This appears to be the kernel doing a WARN_ON based on unexpected 
> ioctl() arguments
>   More info: http://www.kerneloops.org/search.php?search=implement
>...

The only complete bug report seems to be from one user who loaded a 
module whose distribution might be considered a criminal act in some 
countries.

All the other reports only contain the plain trace. Is there any way to 
get more information whether the former is a pattern or not, and to
get this information somehow displayed on the webpage?

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: 2.6.24-rc7-git4: Reported regressions from 2.6.23

2008-01-12 Thread Harvey Harrison

On Sat, 2008-01-12 at 22:05 +0100, Rafael J. Wysocki wrote:
> [RFC: Would that be useful if I sent regression-fixing patches, CCed to the
> appropriate maintainers/lists, along with the reports?]

Perhaps keep the regression report as-is, but for each regression with
a patch follow up with an appropriately CC'd reply to the regression
report.  Similar in concept to the regression report being patch 0/X
or how the -stable series patches are announced?

Harvey


[PATCH 2/2] x86: Function ifdefs in fault_32|64.c

2008-01-12 Thread Harvey Harrison

Add caller of is_errata93() to X86_32, ifdef'd to do
nothing.

Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]>
---
 arch/x86/mm/fault_32.c |   59 +--
 arch/x86/mm/fault_64.c |   30 ++--
 2 files changed, 83 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/fault_32.c b/arch/x86/mm/fault_32.c
index 5a52489..7361f96 100644
--- a/arch/x86/mm/fault_32.c
+++ b/arch/x86/mm/fault_32.c
@@ -228,6 +228,7 @@ KERN_ERR "*** Your BIOS seems to not contain a fix for 
K8 errata #93\n"
 KERN_ERR "*** Working around it, but it may cause SEGVs or burn power.\n"
 KERN_ERR "*** Please consider a BIOS update.\n"
 KERN_ERR "*** Disabling USB legacy in the BIOS may also help.\n";
+#endif
 
 /* Workaround for K8 erratum #93 & buggy BIOS.
BIOS SMM functions are required to use a specific workaround
@@ -235,10 +236,12 @@ KERN_ERR "*** Disabling USB legacy in the BIOS may 
also help.\n";
A lot of BIOS that didn't get tested properly miss this.
The OS sees this as a page fault with the upper 32bits of RIP cleared.
Try to work around it here.
-   Note we only handle faults in kernel here. */
-
+   Note we only handle faults in kernel here.
+   Does nothing for X86_32
+ */
 static int is_errata93(struct pt_regs *regs, unsigned long address)
 {
+#ifdef CONFIG_X86_64
static int warned;
if (address != regs->ip)
return 0;
@@ -254,9 +257,10 @@ static int is_errata93(struct pt_regs *regs, unsigned long 
address)
regs->ip = address;
return 1;
}
+#endif
return 0;
 }
-#endif
+
 
 /*
  * Handle a fault on the vmalloc or module mapping area
@@ -265,6 +269,7 @@ static int is_errata93(struct pt_regs *regs, unsigned long 
address)
  */
 static inline int vmalloc_fault(unsigned long address)
 {
+#ifdef CONFIG_X86_32
unsigned long pgd_paddr;
pmd_t *pmd_k;
pte_t *pte_k;
@@ -283,6 +288,51 @@ static inline int vmalloc_fault(unsigned long address)
if (!pte_present(*pte_k))
return -1;
return 0;
+#else
+   pgd_t *pgd, *pgd_ref;
+   pud_t *pud, *pud_ref;
+   pmd_t *pmd, *pmd_ref;
+   pte_t *pte, *pte_ref;
+
+   /* Copy kernel mappings over when needed. This can also
+  happen within a race in page table update. In the later
+  case just flush. */
+
+   pgd = pgd_offset(current->mm ?: &init_mm, address);
+   pgd_ref = pgd_offset_k(address);
+   if (pgd_none(*pgd_ref))
+   return -1;
+   if (pgd_none(*pgd))
+   set_pgd(pgd, *pgd_ref);
+   else
+   BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
+
+   /* Below here mismatches are bugs because these lower tables
+  are shared */
+
+   pud = pud_offset(pgd, address);
+   pud_ref = pud_offset(pgd_ref, address);
+   if (pud_none(*pud_ref))
+   return -1;
+   if (pud_none(*pud) || pud_page_vaddr(*pud) != pud_page_vaddr(*pud_ref))
+   BUG();
+   pmd = pmd_offset(pud, address);
+   pmd_ref = pmd_offset(pud_ref, address);
+   if (pmd_none(*pmd_ref))
+   return -1;
+   if (pmd_none(*pmd) || pmd_page(*pmd) != pmd_page(*pmd_ref))
+   BUG();
+   pte_ref = pte_offset_kernel(pmd_ref, address);
+   if (!pte_present(*pte_ref))
+   return -1;
+   pte = pte_offset_kernel(pmd, address);
+   /* Don't use pte_page here, because the mappings can point
+  outside mem_map, and the NUMA hash lookup cannot handle
+  that. */
+   if (!pte_present(*pte) || pte_pfn(*pte) != pte_pfn(*pte_ref))
+   BUG();
+   return 0;
+#endif
 }
 
 int show_unhandled_signals = 1;
@@ -524,6 +574,9 @@ no_context:
if (is_prefetch(regs, address, error_code))
return;
 
+   if (is_errata93(regs, address))
+   return;
+
 /*
  * Oops. The kernel tried to access some bad page. We'll have to
  * terminate things with extreme prejudice.
diff --git a/arch/x86/mm/fault_64.c b/arch/x86/mm/fault_64.c
index befe9da..481e045 100644
--- a/arch/x86/mm/fault_64.c
+++ b/arch/x86/mm/fault_64.c
@@ -234,6 +234,7 @@ KERN_ERR "*** Your BIOS seems to not contain a fix for 
K8 errata #93\n"
 KERN_ERR "*** Working around it, but it may cause SEGVs or burn power.\n"
 KERN_ERR "*** Please consider a BIOS update.\n"
 KERN_ERR "*** Disabling USB legacy in the BIOS may also help.\n";
+#endif
 
 /* Workaround for K8 erratum #93 & buggy BIOS.
BIOS SMM functions are required to use a specific workaround
@@ -241,10 +242,12 @@ KERN_ERR "*** Disabling USB legacy in the BIOS may 
also help.\n";
A lot of BIOS that didn't get tested properly miss this.
The OS sees this as a page fault with the upper 32bits of RIP cleared.
Try to work around it here.
-   Note we only handle faults in kernel here. */
-
+   Note we only handle faults i

[PATCH 1/2] x86: Last of trivial fault_32|64.c unification

2008-01-12 Thread Harvey Harrison

Comments, indentation, printk format.

Uses task_pid_nr() on X86_64 now, but this is always defined
to task->pid.

Signed-off-by: Harvey Harrison <[EMAIL PROTECTED]>
---
 arch/x86/mm/fault_32.c |   20 
 arch/x86/mm/fault_64.c |   29 +++--
 2 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/arch/x86/mm/fault_32.c b/arch/x86/mm/fault_32.c
index f8fc240..5a52489 100644
--- a/arch/x86/mm/fault_32.c
+++ b/arch/x86/mm/fault_32.c
@@ -36,10 +36,10 @@
  * bit 3 == 1 means use of reserved bit detected
  * bit 4 == 1 means fault was an instruction fetch
  */
-#define PF_PROT(1<<0)
+#define PF_PROT(1<<0)
 #define PF_WRITE   (1<<1)
-#define PF_USER(1<<2)
-#define PF_RSVD(1<<3)
+#define PF_USER(1<<2)
+#define PF_RSVD(1<<3)
 #define PF_INSTR   (1<<4)
 
 static inline int notify_page_fault(struct pt_regs *regs)
@@ -477,11 +477,15 @@ bad_area_nosemaphore:
 
if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&
printk_ratelimit()) {
-   printk("%s%s[%d]: segfault at %08lx ip %08lx "
-   "sp %08lx error %lx\n",
-   task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
-   tsk->comm, task_pid_nr(tsk), address, regs->ip,
-   regs->sp, error_code);
+   printk(
+#ifdef CONFIG_X86_32
+   "%s%s[%d]: segfault at %08lx ip %08lx sp %08lx error 
%lx\n",
+#else
+   "%s%s[%d]: segfault at %lx ip %lx sp %lx error %lx\n",
+#endif
+   task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
+   tsk->comm, task_pid_nr(tsk), address, regs->ip,
+   regs->sp, error_code);
}
tsk->thread.cr2 = address;
/* Kernel addresses are always protection faults */
diff --git a/arch/x86/mm/fault_64.c b/arch/x86/mm/fault_64.c
index 11e9398..befe9da 100644
--- a/arch/x86/mm/fault_64.c
+++ b/arch/x86/mm/fault_64.c
@@ -454,8 +454,11 @@ asmlinkage void __kprobes do_page_fault(struct pt_regs 
*regs,
if (!(vma->vm_flags & VM_GROWSDOWN))
goto bad_area;
if (error_code & PF_USER) {
-   /* Allow userspace just enough access below the stack pointer
-* to let the 'enter' instruction work.
+   /*
+* Accessing the stack below %sp is always a bug.
+* The large cushion allows instructions like enter
+* and pusha to work.  ("enter $65535,$31" pushes
+* 32 pointers and then decrements %sp by 65535.)
 */
if (address + 65536 + 32 * sizeof(unsigned long) < regs->sp)
goto bad_area;
@@ -540,10 +543,14 @@ bad_area_nosemaphore:
if (show_unhandled_signals && unhandled_signal(tsk, SIGSEGV) &&
printk_ratelimit()) {
printk(
-  "%s%s[%d]: segfault at %lx ip %lx sp %lx error %lx\n",
-   tsk->pid > 1 ? KERN_INFO : KERN_EMERG,
-   tsk->comm, tsk->pid, address, regs->ip,
-   regs->sp, error_code);
+#ifdef CONFIG_X86_32
+   "%s%s[%d]: segfault at %08lx ip %08lx sp %08lx error 
%lx\n",
+#else
+   "%s%s[%d]: segfault at %lx ip %lx sp %lx error %lx\n",
+#endif
+   task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
+   tsk->comm, task_pid_nr(tsk), address, regs->ip,
+   regs->sp, error_code);
}
 
tsk->thread.cr2 = address;
@@ -627,10 +634,12 @@ LIST_HEAD(pgd_list);
 
 void vmalloc_sync_all(void)
 {
-   /* Note that races in the updates of insync and start aren't
-  problematic:
-  insync can only get set bits added, and updates to start are only
-  improving performance (without affecting correctness if undone). */
+   /*
+* Note that races in the updates of insync and start aren't
+* problematic: insync can only get set bits added, and updates to
+* start are only improving performance (without affecting correctness
+* if undone).
+*/
static DECLARE_BITMAP(insync, PTRS_PER_PGD);
static unsigned long start = VMALLOC_START & PGDIR_MASK;
unsigned long address;
-- 
1.5.4.rc2.1164.g6451
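
As a worked example of the stack-cushion check touched by the fault_64.c hunk above (`address + 65536 + 32 * sizeof(unsigned long) < regs->sp`), the following user-space sketch reproduces the arithmetic. The function name `below_stack_cushion()` and the explicit `word_size` parameter are assumptions made for illustration; in the kernel the word size is simply `sizeof(unsigned long)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when a user-mode access lies further below the stack
 * pointer than the cushion allows, i.e. the case that ends in bad_area.
 * The cushion is 65536 bytes plus 32 machine words.
 */
static bool below_stack_cushion(uint64_t address, uint64_t sp,
				unsigned word_size)
{
	assert(word_size == 4 || word_size == 8);
	uint64_t cushion = 65536 + 32ULL * word_size;

	return address + cushion < sp;
}
```

With 8-byte words the cushion is 65536 + 256 = 65792 bytes, so an access exactly 65792 bytes below `sp` is still accepted and one byte lower is rejected.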



Re: 2.6.24-rc7-git4: Reported regressions from 2.6.23

2008-01-12 Thread Adrian Bunk

On Sat, Jan 12, 2008 at 10:05:34PM +0100, Rafael J. Wysocki wrote:
> [RFC: Would that be useful if I sent regression-fixing patches, CCed to the
> appropriate maintainers/lists, along with the reports?]
>...

I don't think sending the patches around brings much advantage.

But I always also Cc'ed all submitters, maintainers and all people 
somehow involved with one or more of the regressions.

This sometimes required splitting the regression lists into 5-7 emails 
due to some vger rule of dropping emails with IIRC >= 30 recipients, but 
the required grouping of regressions also had the advantage that you 
often have regressions that might or might not be related grouped and 
discussed together.

And if you want to push people to work on regressions or get fixes 
into Linus' tree spamming their INBOX'es with personal copies of 
regression reports sometimes gets regressions fixed faster...

> Thanks,
> Rafael

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


Re: [PATCH] serial: remove "too much work for irq" printk

2008-01-12 Thread H. Peter Anvin


Alan Cox wrote:
> > This is going to put trouble in other places unless, at least, it can be
> > turned off.  The UART serial interface is heavily emulated as a console
> > interface without an actual serial port behind it, both in hardware and
> > software.  Rates will typically vastly exceed real serial port rates,
>
> Yes Dell do it, HP do it, IBM do it. Have done for years, never caused us
> a problem. It's only KVM tripping it.
>
> As to bit rate we support up to about 1 to 1.5 Mbit.



FWIW, the highest rate I've seen in hardware is a FIFO about 4K deep 
that can be polled at about 25 MHz when all the overhead is accounted for.


-hpa

Re: [Patch v2] Make PCI extended config space (MMCONFIG) a driver opt-in

2008-01-12 Thread Ivan Kokshaysky

On Sat, Jan 12, 2008 at 09:45:57AM -0800, Arjan van de Ven wrote:
> btw this is my main objection to your patch; it intertwines the conf1
> and mmconfig code even more.

There is nothing wrong with it; please realize that mmconf and conf1 are
just different cpu-side interfaces. Both produce precisely the *same* bus
cycles as far as the lower 256-byte space is concerned.

> When (and I'm saying "when" not "if") systems arrive that only have
> MMCONFIG for some of the devices, we'll have to detangle this again,
> and I'm really not looking forward to that.

MMCONFIG for *some* of the devices? This doesn't sound realistic
from technical point of view.
MMCONFIG-only systems? Sure. I really hope to see these. But it won't
be PC-AT architecture anymore. It has to be something like alpha,
for instance, fully utilizing the 64-bit address space, and we'll have
to have the whole low-level PCI infrastructure completely different
for these future platforms anyway.
Right now, each and every x86 chipset *does* require working
conf1 just in order to set up the mmconf aperture. It's the very
fundamental thing, sort of design philosophy.

Ivan.
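
A minimal sketch of the two CPU-side addressing schemes contrasted above, assuming the standard conf1 encoding (the value written to port 0xCF8) and the standard memory-mapped layout of 4096 bytes per function. The helper names are hypothetical and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value written to port 0xCF8 for a conf1 access.
 * Bit 31 enables the cycle; only registers 0..255 can be encoded. */
static uint32_t conf1_address(unsigned bus, unsigned devfn, unsigned reg)
{
	assert(reg < 256);
	return 0x80000000u | (bus << 16) | (devfn << 8) | (reg & 0xfcu);
}

/* Hypothetical helper: byte offset into the MMCONFIG aperture.
 * Each function owns 4096 bytes, so registers 0..4095 are reachable. */
static uint32_t mmconfig_offset(unsigned bus, unsigned devfn, unsigned reg)
{
	assert(reg < 4096);
	return (bus << 20) | (devfn << 12) | reg;
}
```

Both helpers name the same bus/device/function; only the second one can express a register number of 256 or above, which is why the lower 256 bytes can be reached through either interface.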

Re: [PATCH] serial: remove "too much work for irq" printk

2008-01-12 Thread Alan Cox

> This is going to put trouble in other places unless, at least, it can be 
> turned off.  The UART serial interface is heavily emulated as a console 
> interface without an actual serial port behind it, both in hardware and 
> software.  Rates will typically vastly exceed real serial port rates, 

Yes Dell do it, HP do it, IBM do it. Have done for years, never caused us
a problem. It's only KVM tripping it.

As to bit rate we support up to about 1 to 1.5 Mbit.

Alan

Re: [PATCH] serial: remove "too much work for irq" printk

2008-01-12 Thread H. Peter Anvin


Alan Cox wrote:
> On Sat, 12 Jan 2008 15:15:43 -0500
> Benjamin LaHaise <[EMAIL PROTECTED]> wrote:
>
> > When using kvm with a serial console, the serial driver will print out
> > "too much work for irq4" on any heavy activity (ie vi on a file repainting
> > the terminal).  This message is entirely spurious, as output continues to
> > work fine.  Remove the message as it corrupts screen output and is far too
> > easy to trigger.
>
> NAK. This is a qemu/kvm emulation bug. The real check is there to catch
> jammed IRQs and combined with the IRQ bug handling nowadays does actually
> do the intended job.
>
> Our serial port code (correctly) interprets a continuous stream of bytes
> at an impossible bit rate as an error. KVM should be emulating to some
> extent at least the timing on serial interfaces or using a virtualised
> interface.

This is going to put trouble in other places unless, at least, it can be 
turned off.  The UART serial interface is heavily emulated as a console 
interface without an actual serial port behind it, both in hardware and 
software.  Rates will typically vastly exceed real serial port rates, 
especially emulating 16450 or 16550 (which is typical) which are limited 
to 115200 bps.


-hpa

Re: [PATCH] serial: remove "too much work for irq" printk

2008-01-12 Thread Alan Cox

On Sat, 12 Jan 2008 15:15:43 -0500
Benjamin LaHaise <[EMAIL PROTECTED]> wrote:

> When using kvm with a serial console, the serial driver will print out 
> "too much work for irq4" on any heavy activity (ie vi on a file repainting 
> the terminal).  This message is entirely spurious, as output continues to 
> work fine.  Remove the message as it corrupts screen output and is far too 
> easy to trigger.

NAK. This is a qemu/kvm emulation bug. The real check is there to catch
jammed IRQs and combined with the IRQ bug handling nowadays does actually
do the intended job.

Our serial port code (correctly) interprets a continuous stream of bytes
at an impossible bit rate as an error. KVM should be emulating to some
extent at least the timing on serial interfaces or using a virtualised
interface.

Alan
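
A minimal sketch of the kind of guard being discussed, assuming a handler that keeps servicing the device until it goes idle and gives up after a fixed number of passes. This is an illustration only, not the 8250 driver's code; `fake_uart`, `service_irq` and the `PASS_LIMIT` value are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define PASS_LIMIT 256	/* assumed bound, chosen for illustration */

struct fake_uart {
	int pending;	/* bytes still waiting to be serviced */
};

static bool uart_has_work(const struct fake_uart *u)
{
	return u->pending > 0;
}

static void uart_handle_one(struct fake_uart *u)
{
	u->pending--;
}

/* Returns true if the loop gave up because the device never went idle. */
static bool service_irq(struct fake_uart *u)
{
	int passes = 0;

	while (uart_has_work(u)) {
		uart_handle_one(u);
		if (++passes > PASS_LIMIT)
			return true;	/* "too much work": possibly a jammed IRQ */
	}
	return false;
}
```

A device that drains within the bound never trips the guard; an emulated port that can refill faster than any real UART keeps the loop busy until the bound is hit, which is the situation described for KVM.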

Re: [PATCH 06/21] ide-floppy: remove struct idefloppy_flexible_disk_page

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Saturday 12 January 2008, Borislav Petkov wrote:

[...]

> > >   set_disk_ro(floppy->disk, floppy->wp);
> > > - page = (idefloppy_flexible_disk_page_t *) (header + 1);
> > > -
> > > - page->transfer_rate = be16_to_cpu(page->transfer_rate);
> > > - page->sector_size = be16_to_cpu(page->sector_size);
> > > - page->cyls = be16_to_cpu(page->cyls);
> > > - page->rpm = be16_to_cpu(page->rpm);
> > > - capacity = page->cyls * page->heads * page->sectors * page->sector_size;
> > > - if (memcmp (page, &floppy->flexible_disk_page, sizeof (idefloppy_flexible_disk_page_t)))
> > > +
> > > + transfer_rate = be16_to_cpu(*(u16 *)&pc.buffer[8 + 2]);
> > > + sector_size   = be16_to_cpu(*(u16 *)&pc.buffer[8 + 6]);
> > > + cyls  = be16_to_cpu(*(u16 *)&pc.buffer[8 + 8]);
> > > + rpm   = be16_to_cpu(*(u16 *)&pc.buffer[8 + 28]);
> > > + heads = pc.buffer[8 + 4];
> > > + sectors   = pc.buffer[8 + 5];
> > > +
> > > + capacity = cyls * heads * sectors * sector_size;
> > > +
> > > + if ((1UL << IDEFLOPPY_MEDIA_CHANGED) & floppy->flags)
> > 
> > IDEFLOPPY_MEDIA_CHANGED is set when block device is opened for the first
> > time (please check idefloppy_open() for details) so I don't think it is
> > the right change.  'Flexible Disk Page' is only 32 bytes so we are better
> > off with leaving 'u8 flexible_disk_page[32]' in idefloppy_floppy_t and
> > doing things the old way.
> > 
> > Besides please do not intermix real changes like the above one with purely
> > cleanup ones like idefloppy_flexible_disk_page_t removal.  This is bad from
> > maintainability point of view.  If some patch causes problems it is easier
> > to narrow it down by having purely cleanup changes separated out + if we
> > would need to revert the real change we would have to make a separate patch
> > doing it instead of just reverting the guilty commit (given that we don't
> > want cleanup changes to be reverted as well).
> 
> How about we get rid of that chunk altogether? floppy->flexible_disk_page is
> used only here for the purpose of printk-ing to the syslog and has no "real"
> purpose otherwise. Do we need that info spewed into the syslog at all?

Well, it has some debugging value since drive's capabilities are given in
'Flexible Disk Page' but fine with me given that this change is separated
out from idefloppy_flexible_disk_page_t removal and pushed at the end of
patch series.

Thanks,
Bart
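
A minimal userspace sketch of reading the fields from the quoted hunk out of a raw byte buffer, assuming the offsets shown there (an 8-byte header followed by the page). It assembles the big-endian 16-bit values byte by byte instead of casting to `u16 *`; `get_be16`, `struct geometry` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HDR 8	/* header length, taken from the "8 + n" offsets in the hunk */

struct geometry {
	uint16_t transfer_rate, sector_size, cyls, rpm;
	uint8_t heads, sectors;
};

/* Read a big-endian 16-bit value without alignment or aliasing concerns. */
static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static struct geometry parse_flexible_disk_page(const uint8_t *buf)
{
	struct geometry g;

	g.transfer_rate = get_be16(&buf[HDR + 2]);
	g.heads         = buf[HDR + 4];
	g.sectors       = buf[HDR + 5];
	g.sector_size   = get_be16(&buf[HDR + 6]);
	g.cyls          = get_be16(&buf[HDR + 8]);
	g.rpm           = get_be16(&buf[HDR + 28]);
	return g;
}

/* Same product as in the hunk: cyls * heads * sectors * sector_size. */
static uint32_t capacity_bytes(const struct geometry *g)
{
	return (uint32_t)g->cyls * g->heads * g->sectors * g->sector_size;
}
```

With 80 cylinders, 2 heads, 18 sectors and 512-byte sectors the product is 1474560 bytes.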

Re: 2.6.24-rc7-git4: Reported regressions from 2.6.23

2008-01-12 Thread Rafael J. Wysocki

On Saturday, 12 of January 2008, Rafael J. Wysocki wrote:
> [RFC: Would that be useful if I sent regression-fixing patches,

That is, the patches pointed to by the "Patch" fields in the list entries.

> CCed to the appropriate maintainers/lists, along with the reports?]

Rafael

Re: [PATCH 18/21] ide-floppy: fix error handling in idefloppy_probe()

2008-01-12 Thread Borislav Petkov

On Sat, Jan 12, 2008 at 09:18:01PM +0100, Bartlomiej Zolnierkiewicz wrote:
> On Friday 11 January 2008, Borislav Petkov wrote:
> > Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> > ---
> >  drivers/ide/ide-floppy.c |3 ++-
> >  1 files changed, 2 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/ide/ide-floppy.c b/drivers/ide/ide-floppy.c
> > index 89b26ea..0729df5 100644
> > --- a/drivers/ide/ide-floppy.c
> > +++ b/drivers/ide/ide-floppy.c
> > @@ -1737,7 +1737,8 @@ static int ide_floppy_probe(ide_drive_t *drive)
> > " emulation.\n", drive->name);
> > goto failed;
> > }
> > -   if ((floppy = kzalloc(sizeof (idefloppy_floppy_t), GFP_KERNEL)) == NULL) {
> > +   floppy = kzalloc(sizeof(idefloppy_floppy_t), GFP_KERNEL);
> > +   if (!floppy) {
> > printk(KERN_ERR "ide-floppy: %s: Can't allocate a floppy"
> > " structure\n", drive->name);
> > goto failed;
> 
> I'm unable to see any problem with error handling here?

I changed it simply because checkpatch.pl complains so:

ERROR: do not use assignment in if condition (+ if ((floppy = kzalloc(sizeof(idefloppy_floppy_t), GFP_KERNEL)) == NULL))
#1740: FILE: home/boris/tmp/ide-floppy.c:1740:
+   if ((floppy = kzalloc(sizeof (idefloppy_floppy_t), GFP_KERNEL)) == NULL) {

> This change should be combined with the rest of checkpatch.pl fixes.
ok.

-- 
Regards/Gruß,
Boris.
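
A minimal userspace sketch of the two spellings checkpatch.pl distinguishes, using `calloc` in place of `kzalloc`. The struct and function names are hypothetical. Both functions behave identically, which matches the observation that the hunk is a style cleanup and not an error-handling fix.

```c
#include <assert.h>
#include <stdlib.h>

struct floppy_state {
	int wp;
};

/* Spelling that checkpatch.pl flags: assignment inside the condition. */
static struct floppy_state *alloc_old_style(void)
{
	struct floppy_state *f;

	if ((f = calloc(1, sizeof(*f))) == NULL)
		return NULL;
	return f;
}

/* Preferred spelling: assign first, then test the result. */
static struct floppy_state *alloc_new_style(void)
{
	struct floppy_state *f;

	f = calloc(1, sizeof(*f));
	if (!f)
		return NULL;
	return f;
}
```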

[PATCH] kthread: allow kthread_stop calls to run in parallel

2008-01-12 Thread Jeff Layton

Currently, all kthread_stop calls are serialized. If something calls
kthread_stop on a kthread, no other kthread will be able to be
brought down until the first kthread_stop returns. A "rogue" kthread
could therefore prevent other kthreads from ever coming down.

This patch makes it so that kthread_stop calls can be done in parallel.
To do this, it adds a new pointer to the task struct. This works, but
I'd prefer to use an existing pointer that's never used for kthreads
rather than adding a new one. I'm just not sure what pointer would be
a good candidate for this.

This is really an RFC at this point, so comments and suggestions
are appreciated...

Signed-off-by: Jeff Layton <[EMAIL PROTECTED]>
---
 include/linux/sched.h |2 ++
 kernel/kthread.c  |   33 +
 2 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ac3d496..bc38574 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1178,6 +1178,8 @@ struct task_struct {
int make_it_fail;
 #endif
struct prop_local_single dirties;
+
+   struct kthread_stop_info *kthread_stop_info;
 };
 
 /*
diff --git a/kernel/kthread.c b/kernel/kthread.c
index dcfe724..12594fa 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -35,26 +35,23 @@ struct kthread_create_info
 
 struct kthread_stop_info
 {
-   struct task_struct *k;
int err;
struct completion done;
 };
 
-/* Thread stopping is done by setthing this var: lock serializes
- * multiple kthread_stop calls. */
-static DEFINE_MUTEX(kthread_stop_lock);
-static struct kthread_stop_info kthread_stop_info;
-
 /**
  * kthread_should_stop - should this kthread return now?
  *
  * When someone calls kthread_stop() on your kthread, it will be woken
  * and this will return true.  You should then return, and your return
  * value will be passed through to kthread_stop().
+ *
+ * In this implementation, setting the kthread_stop_info pointer to
+ * a non-NULL value means that it should stop.
  */
 int kthread_should_stop(void)
 {
-   return (kthread_stop_info.k == current);
+   return (current->kthread_stop_info != NULL);
 }
 EXPORT_SYMBOL(kthread_should_stop);
 
@@ -68,6 +65,7 @@ static int kthread(void *_create)
/* Copy data: it's on kthread's stack */
threadfn = create->threadfn;
data = create->data;
+   current->kthread_stop_info = NULL;
 
/* OK, tell user we're spawned, wait for stop or wakeup */
__set_current_state(TASK_UNINTERRUPTIBLE);
@@ -79,8 +77,8 @@ static int kthread(void *_create)
 
/* It might have exited on its own, w/o kthread_stop.  Check. */
if (kthread_should_stop()) {
-   kthread_stop_info.err = ret;
-   complete(&kthread_stop_info.done);
+   current->kthread_stop_info->err = ret;
+   complete(&current->kthread_stop_info->done);
}
return 0;
 }
@@ -104,7 +102,7 @@ static void create_kthread(struct kthread_create_info *create)
 
 /**
  * kthread_create - create a kthread.
- * @threadfn: the function to run until signal_pending(current).
+ * @threadfn: the function to run until kthread_should_stop()
  * @data: data ptr for @threadfn.
  * @namefmt: printf-style name for the thread.
  *
@@ -188,29 +186,24 @@ EXPORT_SYMBOL(kthread_bind);
  */
 int kthread_stop(struct task_struct *k)
 {
-   int ret;
-
-   mutex_lock(&kthread_stop_lock);
+   struct kthread_stop_info kstop_info;
 
/* It could exit after stop_info.k set, but before wake_up_process. */
get_task_struct(k);
 
/* Must init completion *before* thread sees kthread_stop_info.k */
-   init_completion(&kthread_stop_info.done);
+   init_completion(&kstop_info.done);
smp_wmb();
 
/* Now set kthread_should_stop() to true, and wake it up. */
-   kthread_stop_info.k = k;
+   k->kthread_stop_info = &kstop_info;
wake_up_process(k);
put_task_struct(k);
 
/* Once it dies, reset stop ptr, gather result and we're done. */
-   wait_for_completion(&kthread_stop_info.done);
-   kthread_stop_info.k = NULL;
-   ret = kthread_stop_info.err;
-   mutex_unlock(&kthread_stop_lock);
+   wait_for_completion(&kstop_info.done);
 
-   return ret;
+   return kstop_info.err;
 }
 EXPORT_SYMBOL(kthread_stop);
 
-- 
1.5.3.7
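
A userspace analogy of the per-thread stop record, written with POSIX threads and C11 atomics. It is a sketch under stated assumptions, not kernel code: every name is hypothetical, the worker spins with `sched_yield()` where a kthread would sleep, and a mutex/condition pair stands in for `struct completion`. The point it illustrates is that each stopper waits on its own on-stack record, so stopping one thread does not serialize against stopping another.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Per-caller stop record, analogous to the on-stack kstop_info. */
struct stop_info {
	int err;
	int done;
	pthread_mutex_t lock;
	pthread_cond_t cond;
};

struct worker {
	pthread_t tid;
	_Atomic(struct stop_info *) stop;	/* non-NULL means "please stop" */
	int result;				/* value reported on exit */
};

static int worker_should_stop(struct worker *w)
{
	return atomic_load(&w->stop) != NULL;
}

static void *worker_main(void *arg)
{
	struct worker *w = arg;
	struct stop_info *si;

	while (!worker_should_stop(w))
		sched_yield();		/* stand-in for real work */

	/* Hand the result to whoever asked us to stop. */
	si = atomic_load(&w->stop);
	pthread_mutex_lock(&si->lock);
	si->err = w->result;
	si->done = 1;
	pthread_cond_signal(&si->cond);
	pthread_mutex_unlock(&si->lock);
	return NULL;
}

static int worker_start(struct worker *w, int result)
{
	w->result = result;
	atomic_init(&w->stop, NULL);
	return pthread_create(&w->tid, NULL, worker_main, w);
}

static int worker_stop(struct worker *w)
{
	struct stop_info si = { .err = 0, .done = 0 };

	/* Initialise the record before the worker can see it. */
	pthread_mutex_init(&si.lock, NULL);
	pthread_cond_init(&si.cond, NULL);
	atomic_store(&w->stop, &si);

	pthread_mutex_lock(&si.lock);
	while (!si.done)
		pthread_cond_wait(&si.cond, &si.lock);
	pthread_mutex_unlock(&si.lock);

	/* After the join the worker can no longer touch si. */
	pthread_join(w->tid, NULL);
	pthread_cond_destroy(&si.cond);
	pthread_mutex_destroy(&si.lock);
	return si.err;
}
```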


Re: Could not set non-blocking flag with 2.6.24-rc5

2008-01-12 Thread Christoph Hellwig

On Thu, Dec 13, 2007 at 09:16:01PM +0100, Tino Keitel wrote:
> Hi folks,
> 
> I often build Debian packages inside a chroot. Today I discovered a
> failure during an "aptitude update", which is a command to download new
> package lists for the package management. In strace, the lines around
> the failure look like this:

Are you using XFS?  This looks a lot like the bug I introduced where
i_rdev gets a wrong value assigned after mknod.   In that case please
try -rc7 as it has a fix for that particular problem.


2.6.24-rc7-git4: Reported regressions from 2.6.23

2008-01-12 Thread Rafael J. Wysocki

[RFC: Would that be useful if I sent regression-fixing patches, CCed to the
appropriate maintainers/lists, along with the reports?]

This message contains a list of some regressions from 2.6.23 reported since
2.6.24-rc1 was released, for which there are no fixes in the mainline I know
of.  If any of them have been fixed already, please let me know.

If you know of any other unresolved regressions from 2.6.23, please let me know
as well and I'll add them to the list.  Also, please let me know if any of the
entries below are invalid.

Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  Today           150       28          14
  2008-01-05      139       28          15
  2008-01-01      139       38          23
  2007-12-21      118       21          13
  2007-12-18      115       29          15
  2007-12-12      106       31          17
  2007-12-08       98       29          19
  2007-12-01       85       29          18
  2007-11-24       75       25          21
  2007-11-19       68       26          21
  2007-11-17       65       25          20


Unresolved regressions
----------------------

Subject     : EHCI causes system to resume instantly from S4
Submitter   : Maxim Levitsky <[EMAIL PROTECTED]>
Date        : 2007-10-28 14:56
References  : http://lkml.org/lkml/2007/10/27/66
              http://bugzilla.kernel.org/show_bug.cgi?id=9258
Handled-By  : "Rafael J. Wysocki" <[EMAIL PROTECTED]>
              David Brownell <[EMAIL PROTECTED]>
              Alan Stern <[EMAIL PROTECTED]>
Workaround  : http://bugzilla.kernel.org/show_bug.cgi?id=9258#c30


Subject     : SError: { DevExch } occuring and causing disruption
Submitter   : Avuton Olrich <[EMAIL PROTECTED]>
Date        : 2007-11-15 22:39
References  : http://bugzilla.kernel.org/show_bug.cgi?id=9393
Handled-By  : Tejun Heo <[EMAIL PROTECTED]>
              Mark Lord <[EMAIL PROTECTED]>


Subject     : BUG: bad unlock balance detected!
Submitter   : Krzysztof Oledzki <[EMAIL PROTECTED]>
Date        : 2007-12-11 03:17
References  : http://bugzilla.kernel.org/show_bug.cgi?id=9542
Handled-By  : Andrew Morton <[EMAIL PROTECTED]>
              Herbert Xu <[EMAIL PROTECTED]>


Subject     : Could not set non-blocking flag with 2.6.24-rc5
Submitter   : Tino Keitel <[EMAIL PROTECTED]>
Date        : 2007-12-13 16:27
References  : http://lkml.org/lkml/2007/12/13/392
              http://bugzilla.kernel.org/show_bug.cgi?id=9557
Handled-By  : Linus Torvalds <[EMAIL PROTECTED]>


Subject     : swapping in 2.6.24-rc5-git3
Submitter   : Lukas Hejtmanek <[EMAIL PROTECTED]>
Date        : 2007-12-17 14:04
References  : http://lkml.org/lkml/2007/12/17/98
              http://bugzilla.kernel.org/show_bug.cgi?id=9592
Handled-By  : Jan Kara <[EMAIL PROTECTED]>


Subject     : Problems on booting
Submitter   : "werner" <[EMAIL PROTECTED]>
Date        : 2007-12-22 14:29
References  : http://lkml.org/lkml/2007/12/22/110
              http://bugzilla.kernel.org/show_bug.cgi?id=9621


Subject     : ACPI or radeon: spontaneous reboot regression
Submitter   : Matt Mackall <[EMAIL PROTECTED]>
Date        : 2007-12-22 16:09
References  : http://lkml.org/lkml/2007/12/22/139
              http://bugzilla.kernel.org/show_bug.cgi?id=9624
Handled-By  : Len Brown <[EMAIL PROTECTED]>


Subject     : iptables won't work
Submitter   : Kristoffer Malmström <[EMAIL PROTECTED]>
Date        : 2007-12-28
References  : http://bugzilla.kernel.org/show_bug.cgi?id=9657
Handled-By  : Patrick McHardy <[EMAIL PROTECTED]>


Subject     : lockdep warning with LTP dio test (v2.6.24-rc6-125-g5356f66)
Submitter   : Erez Zadok <[EMAIL PROTECTED]>
Date        : 2007-12-24 18:02
References  : http://lkml.org/lkml/2007/12/24/107
              http://bugzilla.kernel.org/show_bug.cgi?id=9670


Subject     : kexec buffer error
Submitter   : Randy Dunlap <[EMAIL PROTECTED]>
Date        : 2008-01-04 22:54
References  : http://lkml.org/lkml/2008/1/4/255
              http://bugzilla.kernel.org/show_bug.cgi?id=9693


Subject     : wake on lan fails with sky2 module
Submitter   : cpo <[EMAIL PROTECTED]>
Date        : 2008-01-09 13:05
References  : http://bugzilla.kernel.org/show_bug.cgi?id=9721
              http://marc.info/?t=11999234931&r=1&w=4
Handled-By  : Stephen Hemminger <[EMAIL PROTECTED]>


Subject     : psmouse.c: GlidePoint at isa0060/serio1/input0 lost sync at byte 1
Submitter   : "Vegard Nossum" <[EMAIL PROTECTED]>
Date        : 2007-11-14 21:26
References  : http://lkml.org/lkml/2007/11/14/363
              http://bugzilla.kernel.org/show_bug.cgi?id=9727


Subject     : 2.6.24-rc7 -- WARNING: at kernel/lockdep.c:2662 check_flags()
Submitter   : "Miles

Re: [PATCH 0/4] cpuinitconst and devinitconst

2008-01-12 Thread Sam Ravnborg

On Fri, Jan 11, 2008 at 08:44:28PM +0100, Sam Ravnborg wrote:
> Hi Jan.
> 
> On Fri, Jan 11, 2008 at 08:55:29AM +0000, Jan Beulich wrote:
> > Since __cpuinitdata/__devinitdata don't allow const to be specified with
> > them (otherwise .init.data sections with and without the writeable attribute
> > will be generated by the compiler), and since __devinitdata except for
> > embedded systems evaluates to  unconditionally and
> > __cpuinitdata at least in most production kernel configurations also
> > likely evaluates to , it seems appropriate to add an additional
> > attribute allowing the respective objects to end up in .rodata rather than
> > .data when not used at initialization time only.
> 
> How about a slightly different approach...
> Consider
> __cpuinitconst => function is placed in section .init.const.text
> __devinitconst => function is placed in section .init.const.text
> 
> Then in the linker script we can distinguish between the two
> and locate the sections as appropriate.
> 
> This will require some updates to modpost but aligns with an
> idea I have had for a while.
> All of the following should have dedicated sections associated
> unconditionally:
> 
> __init
> __cpuinit
> __meminit
> __initdata
> __cpuinitdata
> __meminitdata
> 
> And then in the linker script we decide what to do with the section.
> In the built-in case we put them in the "to-be-discarded" section.
> In the module case we put them as today.
> 
> The primary tasks needed to accomplish this are:
> 1) Update all arch linker scripts (and some of them look ugly)
> 2) Teach modpost about the new sections
> 
> If you follow the suggestion above, this is a simple step
> in this direction, which would be good.

The following patch implements the first step in this direction.
It is only an RFC as I have not touched anything other than
64-bit x86 for the arch-specific parts.
But it should show what I tried to say above.

On top of x86.git mm-branch.

Sam

diff --git a/arch/x86/kernel/vmlinux_64.lds.S b/arch/x86/kernel/vmlinux_64.lds.S
index ba8ea97..26c1d81 100644
--- a/arch/x86/kernel/vmlinux_64.lds.S
+++ b/arch/x86/kernel/vmlinux_64.lds.S
@@ -155,12 +155,16 @@ SECTIONS
   __init_begin = .;
   .init.text : AT(ADDR(.init.text) - LOAD_OFFSET) {
_sinittext = .;
+   INIT_TEXT
*(.init.text)
_einittext = .;
   }
-  __initdata_begin = .;
-  .init.data : AT(ADDR(.init.data) - LOAD_OFFSET) { *(.init.data) }
-  __initdata_end = .;
+  .init.data : AT(ADDR(.init.data) - LOAD_OFFSET) {
+   __initdata_begin = .;
+   INIT_DATA
+   __initdata_end = .;
+   }
+
   . = ALIGN(16);
   __setup_start = .;
   .init.setup : AT(ADDR(.init.setup) - LOAD_OFFSET) { *(.init.setup) }
@@ -187,8 +191,12 @@ SECTIONS
   }
   /* .exit.text is discard at runtime, not link time, to deal with references
  from .altinstructions and .eh_frame */
-  .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) { *(.exit.text) }
-  .exit.data : AT(ADDR(.exit.data) - LOAD_OFFSET) { *(.exit.data) }
+  .exit.text : AT(ADDR(.exit.text) - LOAD_OFFSET) {
+   EXIT_TEXT
+  }
+  .exit.data : AT(ADDR(.exit.data) - LOAD_OFFSET) {
+   EXIT_DATA
+  }
 
 /* vdso blob that is mapped into user space */
   vdso_start = . ;
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 9f584cc..2f359d9 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -9,10 +9,48 @@
 /* Align . to a 8 byte boundary equals to maximum function alignment. */
 #define ALIGN_FUNCTION()  . = ALIGN(8)
 
+#ifdef CONFIG_HOTPLUG
+#define DEV_KEEP(sec)
+#define DEV_DISCARD(sec) *(.dev##sec)
+#else
+#define DEV_KEEP(sec)*(.dev##sec)
+#define DEV_DISCARD(sec)
+#endif
+
+#ifdef CONFIG_HOTPLUG_CPU
+#define CPU_KEEP(sec)
+#define CPU_DISCARD(sec) *(.cpu##sec)
+#else
+#define CPU_KEEP(sec)*(.cpu##sec)
+#define CPU_DISCARD(sec)
+#endif
+
+#if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_ACPI_HOTPLUG_MEMORY) \
+|| defined(CONFIG_ACPI_HOTPLUG_MEMORY_MODULE)
+#define MEM_KEEP(sec)
+#define MEM_DISCARD(sec) *(.mem##sec)
+#else
+#define MEM_KEEP(sec)*(.mem##sec)
+#define MEM_DISCARD(sec)
+#endif
+
+
 /* .data section */
 #define DATA_DATA  \
*(.data)\
*(.data.init.refok) \
+   DEV_KEEP(init.data) \
+   DEV_KEEP(init.data.const)   \
+   DEV_KEEP(exit.data) \
+   DEV_KEEP(exit.data.const)   \
+   CPU_KEEP(init.data) \
+   CPU_KEEP(init.data.const)   \
+   CPU_KEEP(exit.data) \
+   CPU_KEEP(exit.data.const)

Re: [PATCH 00/21] ide-floppy redux v2

2008-01-12 Thread Borislav Petkov

On Sat, Jan 12, 2008 at 09:14:39PM +0100, Bartlomiej Zolnierkiewicz wrote:
[...]

> PS1 Please rebase the patches still needing polishing on top of updated
> IDE quilt tree, recast them and respin the patch series (no need to post
> already merged patches).

sure, will do.

> PS2 what happened to "fix DMA error reporting" patch? (#13 is missing, hmm?)

Yeah, this is strange. It seems git-send-email missed some of the patches. I had
to send #16,#17 manually but didn't notice #13 was also missing. Will send with
the next batch.

-- 
Regards/Gruß,
Boris.

Re: [PATCH 06/21] ide-floppy: remove struct idefloppy_flexible_disk_page

2008-01-12 Thread Borislav Petkov


[...]

> This is not an equivalent transformation:
> 
> header->wp is 0 or 1
> pc.buffer[3] & 0x80 is 0 or 0x80
> 
> It seems to work fine for ->wp (because it is needlessly defined as 'int')
> but may seriously confuse set_disk_ro() and thus bdev_read_only() users.
> 
> Should be fixed to '(pc.buffer[3] & 0x80) ? 1 : 0' (or something similar).

upps, sorry, that was silly. I changed it to:

floppy->wp = !!(pc.buffer[3] & 0x80);

> > set_disk_ro(floppy->disk, floppy->wp);
> > -   page = (idefloppy_flexible_disk_page_t *) (header + 1);
> > -
> > -   page->transfer_rate = be16_to_cpu(page->transfer_rate);
> > -   page->sector_size = be16_to_cpu(page->sector_size);
> > -   page->cyls = be16_to_cpu(page->cyls);
> > -   page->rpm = be16_to_cpu(page->rpm);
> > -   capacity = page->cyls * page->heads * page->sectors * page->sector_size;
> > -   if (memcmp (page, &floppy->flexible_disk_page, sizeof (idefloppy_flexible_disk_page_t)))
> > +
> > +   transfer_rate = be16_to_cpu(*(u16 *)&pc.buffer[8 + 2]);
> > +   sector_size   = be16_to_cpu(*(u16 *)&pc.buffer[8 + 6]);
> > +   cyls  = be16_to_cpu(*(u16 *)&pc.buffer[8 + 8]);
> > +   rpm   = be16_to_cpu(*(u16 *)&pc.buffer[8 + 28]);
> > +   heads = pc.buffer[8 + 4];
> > +   sectors   = pc.buffer[8 + 5];
> > +
> > +   capacity = cyls * heads * sectors * sector_size;
> > +
> > +   if ((1UL << IDEFLOPPY_MEDIA_CHANGED) & floppy->flags)
> 
> IDEFLOPPY_MEDIA_CHANGED is set when block device is opened for the first
> time (please check idefloppy_open() for details) so I don't think it is
> the right change.  'Flexible Disk Page' is only 32 bytes so we are better
> off with leaving 'u8 flexible_disk_page[32]' in idefloppy_floppy_t and
> doing things the old way.
> 
> Besides please do not intermix real changes like the above one with purely
> cleanup ones like idefloppy_flexible_disk_page_t removal.  This is bad from
> maintainability point of view.  If some patch causes problems it is easier
> to narrow it down by having purely cleanup changes separated out + if we
> would need to revert the real change we would have to make a separate patch
> doing it instead of just reverting the guilty commit (given that we don't
> want cleanup changes to be reverted as well).

How about we get rid of that chunk altogether? floppy->flexible_disk_page is
used only here for the purpose of printk-ing to the syslog and has no "real"
purpose otherwise. Do we need that info spewed into the syslog at all?

-- 
Regards/Gruß,
Boris.
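
A minimal sketch of why the `!!` matters for the write-protect flag, assuming the same byte and bit as the quoted code (`buf[3] & 0x80`). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WP_BIT 0x80	/* bit tested in the quoted code */

/* Masking alone yields 0 or 0x80, not a clean boolean. */
static int wp_raw(const uint8_t *buf)
{
	return buf[3] & WP_BIT;
}

/* Double negation collapses any non-zero value to exactly 1. */
static int wp_flag(const uint8_t *buf)
{
	return !!(buf[3] & WP_BIT);
}
```

A caller that compares the flag against 1, or stores it in a one-bit field, only behaves as expected with the second form.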

[PATCH] Shrink ext3_inode_info by 8 bytes for !POSIX_ACL.

2008-01-12 Thread Indan Zupancic

i_file_acl and i_dir_acl aren't always needed.

With certain configs this makes 10 ext3_inode_cache objects fit in
one slab instead of the current 9, as the size shrinks from 416 to
408 bytes for 32 bit, !POSIX_ACL and !EXT3_FS_XATTR configs.

Signed-off-by: Indan Zupancic <[EMAIL PROTECTED]>
---
 fs/ext3/ialloc.c  |2 ++
 fs/ext3/inode.c   |   29 +++--
 include/linux/ext3_fs_i.h |2 ++
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
index 1bc8cd8..01745bc 100644
--- a/fs/ext3/ialloc.c
+++ b/fs/ext3/ialloc.c
@@ -574,8 +574,10 @@ got:
ei->i_frag_no = 0;
ei->i_frag_size = 0;
 #endif
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
ei->i_file_acl = 0;
ei->i_dir_acl = 0;
+#endif
ei->i_dtime = 0;
ei->i_block_alloc_info = NULL;
ei->i_block_group = group;
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 9b162cd..20a8aeb 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -46,9 +46,12 @@ static int ext3_writepage_trans_blocks(struct inode *inode);
  */
 static int ext3_inode_is_fast_symlink(struct inode *inode)
 {
-   int ea_blocks = EXT3_I(inode)->i_file_acl ?
-   (inode->i_sb->s_blocksize >> 9) : 0;
+   int ea_blocks = 0;
 
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   if (EXT3_I(inode)->i_file_acl)
+   ea_blocks = inode->i_sb->s_blocksize >> 9;
+#endif
return (S_ISLNK(inode->i_mode) && inode->i_blocks - ea_blocks == 0);
 }
 
@@ -2717,13 +2720,16 @@ void ext3_read_inode(struct inode * inode)
ei->i_frag_no = raw_inode->i_frag;
ei->i_frag_size = raw_inode->i_fsize;
 #endif
-   ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl);
-   if (!S_ISREG(inode->i_mode)) {
-   ei->i_dir_acl = le32_to_cpu(raw_inode->i_dir_acl);
-   } else {
+   if (S_ISREG(inode->i_mode)) {
inode->i_size |=
((__u64)le32_to_cpu(raw_inode->i_size_high)) << 32;
}
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   else {
+   ei->i_dir_acl = le32_to_cpu(raw_inode->i_dir_acl);
+   }
+   ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl);
+#endif
ei->i_disksize = inode->i_size;
inode->i_generation = le32_to_cpu(raw_inode->i_generation);
ei->i_block_group = iloc.block_group;
@@ -2854,10 +2860,7 @@ static int ext3_do_update_inode(handle_t *handle,
raw_inode->i_frag = ei->i_frag_no;
raw_inode->i_fsize = ei->i_frag_size;
 #endif
-   raw_inode->i_file_acl = cpu_to_le32(ei->i_file_acl);
-   if (!S_ISREG(inode->i_mode)) {
-   raw_inode->i_dir_acl = cpu_to_le32(ei->i_dir_acl);
-   } else {
+   if (S_ISREG(inode->i_mode)) {
raw_inode->i_size_high =
cpu_to_le32(ei->i_disksize >> 32);
if (ei->i_disksize > 0x7fffffffULL) {
@@ -2883,6 +2886,12 @@ static int ext3_do_update_inode(handle_t *handle,
}
}
}
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   else {
+   raw_inode->i_dir_acl = cpu_to_le32(ei->i_dir_acl);
+   }
+   raw_inode->i_file_acl = cpu_to_le32(ei->i_file_acl);
+#endif
raw_inode->i_generation = cpu_to_le32(inode->i_generation);
if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
if (old_valid_dev(inode->i_rdev)) {
diff --git a/include/linux/ext3_fs_i.h b/include/linux/ext3_fs_i.h
index 7894dd0..9e7f1b6 100644
--- a/include/linux/ext3_fs_i.h
+++ b/include/linux/ext3_fs_i.h
@@ -75,8 +75,10 @@ struct ext3_inode_info {
__u8 i_frag_no;
__u8 i_frag_size;
 #endif
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
ext3_fsblk_t i_file_acl;
__u32   i_dir_acl;
+#endif
__u32   i_dtime;
 
/*
-- 
1.5.3.7
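
The 9-versus-10 figure in the changelog can be checked with integer division, assuming a 4096-byte slab and no per-slab bookkeeping stored inside it (both are assumptions; the real allocator may differ). The helper name is hypothetical.

```c
#include <assert.h>

/* Whole objects that fit in one slab, ignoring any allocator overhead. */
static unsigned objs_per_slab(unsigned slab_bytes, unsigned obj_bytes)
{
	assert(obj_bytes > 0);
	return slab_bytes / obj_bytes;
}
```

Under those assumptions 4096 / 416 gives 9 objects with 352 bytes left over, while 4096 / 408 gives 10 objects with 16 bytes left over.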


Re: PROBLEM REMAINS: [sata_nv ADMA breaks ATAPI] Crash on accessing DVD-RAM

2008-01-12 Thread James Bottomley

On Sat, 2008-01-12 at 13:25 -0600, Robert Hancock wrote:
> Alexander wrote:
> > Robert Hancock wrote:
> >> There's this patch which was intended to fix it:
> >>
> >> http://lkml.org/lkml/2007/11/22/148
> > 
> > I applied this patch to 2.6.24-rc7. Now at boot time my DVD-RW is
> > normaly detected as:
> > 
> > sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray
> > 
> > But I cannot mount it. All my attempts failed with
> > 
> > ISOFS: Unable to identify CD-ROM format.
> > 
> > With mem<=4098M or sata_nv.adma=0 it still mounts and works ok.
> 
> As I wrote, it would appear that somehow the blk_queue_bounce_limit 
> setting that the driver has made is not being respected and the block 
> layer is still trying to feed it addresses over 4GB. Any ideas anyone?

Actually, I'd be very sceptical that the blk_queue_bounce_limit isn't
working as advertised; there'd be a large number of failures if it were
not.

However, the "as advertised" part doesn't seem to be generally well
understood.  The point being that block commands are only bounced if
they come from the filesystem or the user, not if they're generated
directly inside the kernel.  Since the fault occurs before mount, it's
suggestive of the latter.

A long time ago, GFP_KERNEL allocations meant that the memory allocated
was physically under 4GB.  Then x86_64 (and before it ia64) wanted to
break this.  So they introduced a new flag:  GFP_DMA32 that behaved like
the old GFP_KERNEL and then changed GFP_KERNEL on their architectures to
return memory from anywhere.  I'd strongly suggest some piece of kernel
memory was allocated for a transfer buffer without GFP_DMA32 and then
passed in to the driver.  Unfortunately, that could be anywhere inside
cdrom or sr.  Knowing what the actual command is might help ... some of
the distinctive MMC media ones only have a single source.
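
As a rough illustration of the condition described above (a buffer whose physical address lies above the 4GB boundary reaching a device limited to 32-bit DMA), the check can be modelled in plain user-space C. This is only a sketch: `DMA_LIMIT_32BIT` and `buffer_exceeds_limit()` are hypothetical names chosen for the example, not kernel APIs, and the real bounce decision is made inside the block layer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: highest physical address a 32-bit DMA engine can reach. */
#define DMA_LIMIT_32BIT 0xFFFFFFFFULL

/*
 * Return true when any byte of the buffer [phys, phys + len) lies above
 * 'limit'.  Written to avoid overflow in phys + len.
 */
bool buffer_exceeds_limit(uint64_t phys, uint64_t len, uint64_t limit)
{
	if (len == 0)
		return false;
	if (phys > limit)
		return true;
	/* The last byte sits at phys + (len - 1). */
	return len - 1 > limit - phys;
}
```

A buffer handed straight to the driver from kernel code would need to pass such a check on its own, since, per the explanation above, only requests coming from the filesystem or from user space are bounced.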

James


Re: Lenovo ThinkPads need acpi_osi="Linux"

2008-01-12 Thread Henrique de Moraes Holschuh

On Sat, 12 Jan 2008, Len Brown wrote:
> > Lenovo has been attempting to make things a bit easier for Linux on their
> > ThinkPads, by disabling the more obnoxious behaviours of the firmware (used
> > by their Windows drivers) when in Linux.  It looks like they used the OSI
> > string for that.

[...]

> > The most concrete example I have right now of changed behaviour is the Mute
> > key on the T61, which just plain disappears if acpi_osi="!Linux" (2.6.23
> > default).

[...]

> I discussed this with Lenovo a few months ago
> and made it clear that Linux will unlikely
> default to answer yes to OSI(Linux) in the future --
> for all the reasons I stated when I disabled it.
> 
> However, they told me they wanted to use OSI(Linux)
> to enable the backlight (via video re-post) on S3 resume,
> and the problem at hand was a distro kernel which still
> had OSI(Linux), so that is what they did.
> 
> I was unaware that they're using it for anything else,
> and the fact that they are lends further support to
> the case that as an OS interface string "Linux"
> is too vague to be meaningful.
> 
> They seemed open to looking for another string, say
> "Needs S3 video re-post".  However, we didn't
> agree on such a string, and so it isn't in Linux
> or the Lenovo BIOS.
> 
> I think until we have native Linux graphics driver for (fast)
> backlight restore, we need to add a DMI to enable OSI(Linux)
> for the offending Lenovo models.   We'll need to have
> some coordination with the graphics guys so that they
> can turn this off from user-space when they don't need it.

Unless we get them to lay off using OSI(Linux) altogether, that won't work
well.  You'd stop the slow POST on S3 resume, but you'd also change the
behaviour of the firmware somewhere else too (not to mention it is useless
to change the OSI string in most AML code I've seen after you ran the _INI
stuff).

It would have been way easier if ACPI had a more complete and thought-out
video control API.  There should be a way to request a BIOS re-post of the
video hardware through an optional ACPI call.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

Re: [PATCH] serial: remove "too much work for irq" printk

2008-01-12 Thread Christoph Hellwig

On Sat, Jan 12, 2008 at 03:15:43PM -0500, Benjamin LaHaise wrote:
> When using kvm with a serial console, the serial driver will print out 
> "too much work for irq4" on any heavy activity (ie vi on a file repainting 
> the terminal).  This message is entirely spurious, as output continues to 
> work fine.  Remove the message as it corrupts screen output and is far too 
> easy to trigger.

Yeah, this message has been annoying me too.  Thanks for sending the
patch.


Re: [PATCH 19/21] ide-floppy: fix most of the remaining checkpatch.pl issues

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Friday 11 January 2008, Borislav Petkov wrote:
> i.e.,
> ERROR: switch and case should be at the same indent
> ERROR: need spaces around that '=' (ctx:VxV)
> ERROR: trailing statements should be on next line
> WARNING: no space between function name and open parenthesis '('
> WARNING: printk() should include KERN_ facility level
> ERROR: That open brace { should be on the previous line
> ERROR: use tabs not spaces
> ERROR: do not use assignment in if condition
> WARNING: braces {} are not necessary for single statement blocks
> ERROR: need space after that ',' (ctx:VxV)
> WARNING: line over 80 characters
> ERROR: do not use assignment in if condition
> ...

This should be the very last patch in the series
(and combined with patch #11).

> Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> ---
>  drivers/ide/ide-floppy.c |  147 +++--
>  1 files changed, 75 insertions(+), 72 deletions(-)
> 
> diff --git a/drivers/ide/ide-floppy.c b/drivers/ide/ide-floppy.c
> index 0729df5..3d9b1e5 100644
> --- a/drivers/ide/ide-floppy.c
> +++ b/drivers/ide/ide-floppy.c
> @@ -47,13 +47,13 @@
>  #define IDEFLOPPY_DEBUG_INFO 0
>  #define IDEFLOPPY_DEBUG_BUGS 1
>  
> -#define IDEFLOPPY_DEBUG( fmt, args... )
> +#define IDEFLOPPY_DEBUG(fmt, args...)
>  
>  #if IDEFLOPPY_DEBUG_LOG
>  #define debug_log(fmt, args...) \
>   printk(KERN_INFO "ide-floppy: " fmt, ## args)
>  #else
> -#define debug_log(fmt, args... ) do {} while(0)
> +#define debug_log(fmt, args...) do {} while (0)
>  #endif

Hmmm, these could have been dealt with in patch #4...

[...]

> @@ -1314,34 +1314,34 @@ static int idefloppy_identify_device (ide_drive_t 
> *drive,struct hd_driveid *id)
>  #if IDEFLOPPY_DEBUG_INFO
>   printk(KERN_INFO "Dumping ATAPI Identify Device floppy parameters\n");
>   switch (gcw.protocol) {
> - case 0: case 1: sprintf(buffer, "ATA");break;
> - case 2: sprintf(buffer, "ATAPI");break;
> - case 3: sprintf(buffer, "Reserved (Unknown to 
> ide-floppy)");break;
> + case 0: case 1: sprintf(buffer, "ATA"); break;
> + case 2: sprintf(buffer, "ATAPI"); break;
> + case 3: sprintf(buffer, "Reserved (Unknown to ide-floppy)"); break;
>   }
>   printk(KERN_INFO "Protocol Type: %s\n", buffer);
>   switch (gcw.device_type) {
> - case 0: sprintf(buffer, "Direct-access Device");break;
> - case 1: sprintf(buffer, "Streaming Tape Device");break;
> - case 2: case 3: case 4: sprintf (buffer, "Reserved");break;
> - case 5: sprintf(buffer, "CD-ROM Device");break;
> - case 6: sprintf(buffer, "Reserved");
> - case 7: sprintf(buffer, "Optical memory Device");break;
> - case 0x1f: sprintf(buffer, "Unknown or no Device type");break;
> - default: sprintf(buffer, "Reserved");
> + case 0: sprintf(buffer, "Direct-access Device"); break;
> + case 1: sprintf(buffer, "Streaming Tape Device"); break;
> + case 2: case 3: case 4: sprintf(buffer, "Reserved"); break;
> + case 5: sprintf(buffer, "CD-ROM Device"); break;
> + case 6: sprintf(buffer, "Reserved");
> + case 7: sprintf(buffer, "Optical memory Device"); break;
> + case 0x1f: sprintf(buffer, "Unknown or no Device type"); break;
> + default: sprintf(buffer, "Reserved");
>   }
>   printk(KERN_INFO "Device Type: %x - %s\n", gcw.device_type, buffer);
>   printk(KERN_INFO "Removable: %s\n", gcw.removable ? "Yes":"No");
>   switch (gcw.drq_type) {
> - case 0: sprintf(buffer, "Microprocessor DRQ");break;
> - case 1: sprintf(buffer, "Interrupt DRQ");break;
> - case 2: sprintf(buffer, "Accelerated DRQ");break;
> - case 3: sprintf(buffer, "Reserved");break;
> + case 0: sprintf(buffer, "Microprocessor DRQ"); break;
> + case 1: sprintf(buffer, "Interrupt DRQ"); break;
> + case 2: sprintf(buffer, "Accelerated DRQ"); break;
> + case 3: sprintf(buffer, "Reserved"); break;
>   }
>   printk(KERN_INFO "Command Packet DRQ Type: %s\n", buffer);
>   switch (gcw.packet_size) {
> - case 0: sprintf(buffer, "12 bytes");break;
> - case 1: sprintf(buffer, "16 bytes");break;
> - default: sprintf(buffer, "Reserved");break;
> + case 0: sprintf(buffer, "12 bytes"); break;
> + case 1: sprintf(buffer, "16 bytes"); break;
> + default: sprintf(buffer, "Reserved"); break;
>   }
>   printk(KERN_INFO "Command Packet Size: %s\n", buffer);
>  #endif /* IDEFLOPPY_DEBUG_INFO */

> @@ -1349,13 +1349,16 @@ static int idefloppy_identify_device (ide_drive_t 
> *drive,struct hd_driveid *id)
>   if (gcw.protocol != 2)
>   printk(KERN_ERR "ide-floppy: Protocol is not ATAPI\n");
>   else if (gcw.device_type != 0)
> - printk(KERN_ERR "ide-floppy: Device type is not set to 
> floppy\n");
> + printk(KERN_ERR "ide-floppy: Device typ

Re: [PATCH 18/21] ide-floppy: fix error handling in idefloppy_probe()

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Friday 11 January 2008, Borislav Petkov wrote:
> Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> ---
>  drivers/ide/ide-floppy.c |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/ide/ide-floppy.c b/drivers/ide/ide-floppy.c
> index 89b26ea..0729df5 100644
> --- a/drivers/ide/ide-floppy.c
> +++ b/drivers/ide/ide-floppy.c
> @@ -1737,7 +1737,8 @@ static int ide_floppy_probe(ide_drive_t *drive)
>   " emulation.\n", drive->name);
>   goto failed;
>   }
> - if ((floppy = kzalloc(sizeof (idefloppy_floppy_t), GFP_KERNEL)) == 
> NULL) {
> + floppy = kzalloc(sizeof(idefloppy_floppy_t), GFP_KERNEL);
> + if (!floppy) {
>   printk(KERN_ERR "ide-floppy: %s: Can't allocate a floppy"
>   " structure\n", drive->name);
>   goto failed;

I'm unable to see any problem with error handling here?

This change should be combined with the rest of checkpatch.pl fixes.

Re: [PATCH 20/21] ide-floppy: merge idefloppy_{input,output}_buffers

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Friday 11 January 2008, Borislav Petkov wrote:
> We merge idefloppy_{input,output}_buffers() into idefloppy_io_buffers() by
> introducing a 4th arg. called direction. According to its value
> we atapi_input_bytes() or atapi_output_bytes(). Also, simplify the interrupt

This change is fine but ...

> handler by removing multiple calls testing the data direction and using a 
> local
> variable instead.

... the patch replaces the 'test_bit(PC_WRITING, &pc->flags)' check with a
'rq_data_dir(rq) == WRITE' one.  While this may look like a "trivial" change, it
is not.  It should be done only after auditing the driver and making
sure that we are not introducing subtle regressions (=> I see that some
commands are setting PC_WRITING but are not setting the REQ_RW bit), especially
given that these changes were not tested with real hardware.

Please separate this change to another (post-)patch.

PS It would also be nice to remove IDEFLOPPY_DEBUG_BUGS define in a pre-patch.

Re: [PATCH 14/21] ide-floppy: mv idefloppy_{should_,}report_error

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Friday 11 January 2008, Borislav Petkov wrote:
> In addition to shortening the function name, move the printk-call into the
> function thereby saving some code lines. Also, make the function out_of_line
> since it is not on a performance critical path.
> 
> Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> ---
>  drivers/ide/ide-floppy.c |   37 ++---
>  1 files changed, 14 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/ide/ide-floppy.c b/drivers/ide/ide-floppy.c
> index 49d83a1..b718615 100644
> --- a/drivers/ide/ide-floppy.c
> +++ b/drivers/ide/ide-floppy.c
> @@ -707,16 +707,18 @@ static ide_startstop_t 
> idefloppy_transfer_pc1(ide_drive_t *drive)
>   return ide_started;
>  }
>  
> -/*
> - * Suppresses error messages resulting from Medium not present.
> - */
> -static inline int idefloppy_should_report_error(idefloppy_floppy_t *floppy)
> +static void idefloppy_report_error(idefloppy_floppy_t *floppy,
> + idefloppy_pc_t *pc)
>  {

-> It would make sense to move the comment here instead of removing it
(it is useful unless you remember all ->{sense_key,asc,ascq} values).

>   if (floppy->sense_key == 0x02 &&
>   floppy->asc   == 0x3a &&
>   floppy->ascq  == 0x00)
> - return 0;
> - return 1;
> + return;

Otherwise the patch is fine.
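
To show why the removed comment is worth keeping, here is a small stand-alone sketch of the same test with the triple spelled out. The struct and function names are made up for the example (they are not the driver's); the values are the ones from the quoted hunk, which the original comment describes as "Medium not present".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three sense fields the driver tracks. */
struct sense_data {
	uint8_t sense_key;
	uint8_t asc;
	uint8_t ascq;
};

/*
 * Sense key 0x02 with ASC 0x3a and ASCQ 0x00 is what the drive reports
 * when no medium is present; errors of this kind are not worth logging.
 */
bool sense_is_medium_not_present(const struct sense_data *s)
{
	return s->sense_key == 0x02 && s->asc == 0x3a && s->ascq == 0x00;
}
```

A reporting function would then return early when `sense_is_medium_not_present()` is true, which is what the new `idefloppy_report_error()` does with the open-coded comparison.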

Re: [PATCH 21/21] ide-floppy: remove atomic test_*bit macros

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Friday 11 January 2008, Borislav Petkov wrote:
> This change is temporary and after unification of the IDE subsystem proper
> bit setting and testing macros will be introduced.
> 
> Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> ---
>  drivers/ide/ide-floppy.c |   82 
> +-
>  1 files changed, 45 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/ide/ide-floppy.c b/drivers/ide/ide-floppy.c
> index 4106eb4..29c1983 100644
> --- a/drivers/ide/ide-floppy.c
> +++ b/drivers/ide/ide-floppy.c
> @@ -479,12 +479,12 @@ static ide_startstop_t idefloppy_pc_intr(ide_drive_t 
> *drive)
>  
>   debug_log("Reached %s interrupt handler\n", __FUNCTION__);
>  
> - if (test_bit(PC_DMA_IN_PROGRESS, &pc->flags)) {
> + if ((1UL << PC_DMA_IN_PROGRESS) & pc->flags) {

How about introducing new defines, i.e.

enum {
IDE_FLOPPY_FLAG_PC_ABORT= (1 << 0),
IDE_FLOPPY_FLAG_PC_DMA_RECOMMENDED  = (1 << 1),
IDE_FLOPPY_FLAG_PC_DMA_IN_PROGRESS  = (1 << 2),
...
}

instead of open-coding the bit-shifts?

>   dma_error = HWIF(drive)->ide_dma_end(drive);
>   if (dma_error) {
>   printk(KERN_ERR "%s: DMA %s error\n", drive->name,
>   write ? "write" : "read");
> - set_bit(PC_DMA_ERROR, &pc->flags);
> + pc->flags |= (1UL << PC_DMA_ERROR);
>   } else {
>   pc->actually_transferred = pc->request_transfer;
>   idefloppy_update_buffers(drive, pc);
> @@ -499,11 +499,11 @@ static ide_startstop_t idefloppy_pc_intr(ide_drive_t 
> *drive)
>   /* No more interrupts */
>   debug_log("Packet command completed, %d bytes transferred\n",
>   pc->actually_transferred);
> - clear_bit(PC_DMA_IN_PROGRESS, &pc->flags);
> + pc->flags &= ((1UL << PC_DMA_IN_PROGRESS) ^ ~0UL);

Same can be achieved with:

pc->flags &= ~(1 << PC_DMA_IN_PROGRESS);
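
A minimal user-space sketch of both suggestions combined follows, assuming the flags are stored as ready-made masks. The enum names are taken from the suggestion above; the helper functions are hypothetical and, like the patch under review, deliberately non-atomic.

```c
/* Flags kept as masks rather than bit numbers, as suggested above. */
enum {
	IDE_FLOPPY_FLAG_PC_ABORT           = (1 << 0),
	IDE_FLOPPY_FLAG_PC_DMA_RECOMMENDED = (1 << 1),
	IDE_FLOPPY_FLAG_PC_DMA_IN_PROGRESS = (1 << 2),
};

/* Hypothetical helpers; plain read-modify-write, not atomic. */
unsigned long flag_set(unsigned long flags, unsigned long mask)
{
	return flags | mask;
}

unsigned long flag_clear(unsigned long flags, unsigned long mask)
{
	return flags & ~mask;
}

/* The form used in the patch: XOR with all-ones is the same as ~mask. */
unsigned long flag_clear_xor(unsigned long flags, unsigned long mask)
{
	return flags & (mask ^ ~0UL);
}

int flag_test(unsigned long flags, unsigned long mask)
{
	return (flags & mask) != 0;
}
```

With masks defined once, the call sites read as `pc->flags & IDE_FLOPPY_FLAG_PC_DMA_IN_PROGRESS` and no shift has to be repeated at each use.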

Re: [PATCH 11/21] ide-floppy: fix comments formatting

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Friday 11 January 2008, Borislav Petkov wrote:
> That is,
> - remove unnecessary comments
> - shorten comments
> - shorten lines longer than 80 columns
> - cleanup whitespace
> - add a missing loglevel KERN_ to a printk-call
> - fix misc checkpatch warnings

The majority of this patch consists of checkpatch.pl fixes, so it should be merged
with patch #20.  Also, a lot of the code beautified here is _heavily_ modified in
later patches, so some of the fixups below could be moved there (which would also
decrease the size of this patch significantly).

> Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> ---
>  drivers/ide/ide-floppy.c |  402 
> +-
>  1 files changed, 181 insertions(+), 221 deletions(-)

[...]

> +#define  PC_ABORT0   /* Set when an error is 
> considered\
> +normal - We won't retry */

'\' shouldn't be there and

/* ... */
#define PC_ABORT0

would be probably more readable

> +#define PC_DMA_RECOMMENDED   2   /* DMA use preferred, if possible */

please make it match the other flags while at it (space -> tab)

[...]

> -#define IDEFLOPPY_USE_READ12 2   /* Use READ12/WRITE12 or 
> READ10/WRITE10 */
> +#define IDEFLOPPY_USE_READ12 2   /* READ(10|12)/WRITE(10|12) */

The original comment was way more informative.

Moreover this particular flag is never set so it could be just removed
(together with some dead code for handling it) in a separate (pre-)patch.

>  #define  IDEFLOPPY_FORMAT_IN_PROGRESS3   /* Format in progress */

please make it match the other flags while at it (tab -> space)

> -#define IDEFLOPPY_CLIK_DRIVE 4   /* Avoid commands not supported 
> in Clik drive */
> -#define IDEFLOPPY_ZIP_DRIVE  5   /* Requires BH algorithm for 
> packets */
> +#define IDEFLOPPY_CLIK_DRIVE 4   /* Avoid commands not supported\
> +in Clik drive */
> +#define IDEFLOPPY_ZIP_DRIVE  5   /* Requires BH algorithm for\
> +packets */

no need for '\' characters

[...]

> -static void idefloppy_queue_pc_head (ide_drive_t *drive,idefloppy_pc_t 
> *pc,struct request *rq)
> +static void idefloppy_queue_pc_head(ide_drive_t *drive, idefloppy_pc_t *pc,
> + struct request *rq)

minor CodingStyle nitpick:

static void idefloppy_queue_pc_head(ide_drive_t *drive, idefloppy_pc_t *pc,
struct request *rq)

would be more readable IMO but it is a matter of personal taste

[ same applies for other similar modifications done by this patch series ]

Re: [PATCH 12/21] ide-floppy: factor out ioctl handlers from idefloppy_ioctl()

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Friday 11 January 2008, Borislav Petkov wrote:
> By passing idefloppy_floppy_t *floppy to the factored out functions, we get
> rid of (almost) all local vars so stack usage should be at minimum here. Also,
> we merge idefloppy_begin_format() into idefloppy_format_start() since it is 
> its
> only user.
> 
> Also,
> - remove unneeded scsi ioctl chunk

They are _needed_: despite the name, these ioctls are _not_ limited
to the SCSI subsystem.

[...]

> + int prevent = (arg) ? 1 : 0;

parentheses are unnecessary

[...]

> +static int idefloppy_format_unit(idefloppy_floppy_t *floppy, unsigned long 
> arg)

__user tag was dropped from 'arg'
(I bet that this would make sparse checking unhappy)

> +{
> + int blocks, length, flags, err = 0;
> + int __user *argp = (int __user *)arg;

wouldn't be needed if the 'arg' was of 'int __user' type and the casting was
done in the caller function

[...]

> + if (idefloppy_queue_pc_tail(floppy->drive, &pc)) {
> + err = -EIO;
> + goto out;

'goto out' is unnecessary

> + }
> +
> +out:
> + if (err)
> + clear_bit(IDEFLOPPY_FORMAT_IN_PROGRESS, &floppy->flags);
> + return err;
> +}

[...]

> - /*
> -  * skip SCSI_IOCTL_SEND_COMMAND (deprecated)
> -  * and CDROM_SEND_PACKET (legacy) ioctls
> -  */
> - if (cmd != CDROM_SEND_PACKET && cmd != SCSI_IOCTL_SEND_COMMAND)
> - err = scsi_cmd_ioctl(file, bdev->bd_disk->queue,
> - bdev->bd_disk, cmd, argp);
> - else
> - err = -ENOTTY;
> -
> - if (err == -ENOTTY)
> - err = generic_ide_ioctl(drive, file, bdev, cmd, arg);
> -
> - return err;
> + return generic_ide_ioctl(drive, file, bdev, cmd, arg);

this whole chunk needs to be reverted

Please recast/resubmit.

Re: [PATCH 06/21] ide-floppy: remove struct idefloppy_flexible_disk_page

2008-01-12 Thread Bartlomiej Zolnierkiewicz

On Saturday 12 January 2008, Bartlomiej Zolnierkiewicz wrote:

[...]

> > -   header = (idefloppy_mode_parameter_header_t *) pc.buffer;
> > -   floppy->wp = header->wp;
> > +   floppy->wp = pc.buffer[3] & 0x80;
> 
> This is not an equivalent transformation:
> 
> header->wp is 0 or 1
> pc.buffer[3] & 0x80 is 0 or 0x80
> 
> It seems to work fine for ->wp (because it is needlessly defined as 'int')
> but may seriously confuse set_disk_ro() and thus bdev_read_only() users.
> 
> Should be fixed to '(pc.buffer[3] & 0x80) ? 1 : 0' (or something similar).

Update: this change belongs to patch #10 (+ the need for such change in
patch #6 is a hint that #10 should be before #6)
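
The difference between the two expressions quoted above can be seen in a few lines of stand-alone C. This is just an illustration with made-up function names; `buf` stands in for `pc.buffer`.

```c
#include <stdint.h>

/* What the patch does: yields 0 or 0x80. */
int wp_masked(const uint8_t *buf)
{
	return buf[3] & 0x80;
}

/* The suggested fix: yields 0 or 1, like the old bitfield did. */
int wp_normalized(const uint8_t *buf)
{
	return (buf[3] & 0x80) ? 1 : 0;
}
```

Both are equally "true" in an `if`, but a consumer that compares against 1 or stores the value in a narrower field would see different results with the masked form.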

Re: [PATCH 00/21] ide-floppy redux v2

2008-01-12 Thread Bartlomiej Zolnierkiewicz


Hi,

On Saturday 12 January 2008, Bartlomiej Zolnierkiewicz wrote:
> On Friday 11 January 2008, Borislav Petkov wrote:
> > 
> > Hi Bart,
> > 
> >here's the second version of the ide-floppy refactoring trail. All the
> > patches are based on the version of your quilt tree from the 05.01. Also, 
> > you've
> > already applied patch 5 in this series but i'm submitting it still for the 
> > sake
> > of completeness.
> > 
> > 
> >  drivers/ide/ide-cd.c |2 -
> >  drivers/ide/ide-floppy.c | 1403 
> > ++
> >  include/linux/cdrom.h|1 +
> >  include/linux/ide.h  |3 +
> >  4 files changed, 558 insertions(+), 851 deletions(-)
> 
> applied patches #1-4, #8
> (#4 with some changes)
> 
> #5 was already applied so I just re-ordered it into the right spot
> 
> I have some comments for #6-7

applied #9, #15 and #17

some comments for #11-12, #14 and #18-21 in separate mails

#10 and #16 are fine but because they depend on unmerged patches
I'm unable to apply them currently

Overall: good job!  300 LOC removed from the driver, code size savings
and a lot of preparations for the future ATAPI handling unification. :)

Thanks,
Bart

PS1 Please rebase the patches still needing polishing on top of updated
IDE quilt tree, recast them and respin the patch series (no need to post
already merged patches).

PS2 What happened to the "fix DMA error reporting" patch? (#13 is missing, hmm?)

Re: new runtime scsi warnings in 2.6.24-rc6+git

2008-01-12 Thread James Bottomley

On Sat, 2008-01-12 at 20:58 +0100, Rafael J. Wysocki wrote:
> On Friday, 4 of January 2008, Meelis Roos wrote:
> > Todays git gives the following warning during bootup on a Intel 845+PATA 
> > PC (using libata to drive PATA):
> > 
> > Driver 'sd' needs updating - please use bus_type methods
> > Driver 'sr' needs updating - please use bus_type methods
> 
> They are due to commit 751bf4d7865e4ced406be93b04c7436d866d3684, AFAICS, and
> they don't mean anything wrong.

Yes ... SCSI actually is using the bus_type methods.  The issue
is that we need to cascade the removes so we're using the driver remove
method for this.  The warning appears if you have both a bus_type remove
and a driver remove (and a host of other duplicate methods), I think
because if the bus_type method is set, the driver method won't be
called.

However, we've taken all this into account and our bus type remove calls
into the driver remove itself ... unfortunately there seems to be no way
to persuade the driver core we know what we're doing.
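
The arrangement described above can be sketched as a toy model in user-space C. Everything here is hypothetical (the `toy_*` names are not driver-core or SCSI APIs), and the dispatch rule in `toy_release()` is only an assumption based on the description above: when a bus-level remove exists, the driver-level one is not called by the core.

```c
#include <stddef.h>

struct toy_device;

struct toy_driver {
	int (*remove)(struct toy_device *dev);
};

struct toy_bus {
	int (*remove)(struct toy_device *dev);
};

struct toy_device {
	struct toy_bus *bus;
	struct toy_driver *drv;
	int driver_removed;	/* set by the driver's remove */
	int bus_cleaned;	/* set by the bus's remove */
};

/* Assumed core behaviour: the bus method, if present, wins. */
int toy_release(struct toy_device *dev)
{
	if (dev->bus && dev->bus->remove)
		return dev->bus->remove(dev);
	if (dev->drv && dev->drv->remove)
		return dev->drv->remove(dev);
	return 0;
}

int toy_drv_remove(struct toy_device *dev)
{
	dev->driver_removed = 1;
	return 0;
}

/* The bus remove cascades into the driver remove itself. */
int toy_bus_remove(struct toy_device *dev)
{
	int err = 0;

	if (dev->drv && dev->drv->remove)
		err = dev->drv->remove(dev);
	dev->bus_cleaned = 1;
	return err;
}
```

In this model both methods are set, which is the combination the warning complains about, yet both run because the bus method forwards to the driver method.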

James


[PATCH] serial: remove "too much work for irq" printk

2008-01-12 Thread Benjamin LaHaise

When using kvm with a serial console, the serial driver will print out 
"too much work for irq4" on any heavy activity (ie vi on a file repainting 
the terminal).  This message is entirely spurious, as output continues to 
work fine.  Remove the message as it corrupts screen output and is far too 
easy to trigger.

-ben

Signed-off-by: Benjamin LaHaise <[EMAIL PROTECTED]>
diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
index f94109c..030b8b5 100644
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -1493,12 +1493,8 @@ static irqreturn_t serial8250_interrupt(int irq, void 
*dev_id)
 
l = l->next;
 
-   if (l == i->head && pass_counter++ > PASS_LIMIT) {
-   /* If we hit this, we're dead. */
-   printk(KERN_ERR "serial8250: too much work for "
-   "irq%d\n", irq);
+   if (l == i->head && pass_counter++ > PASS_LIMIT)
break;
-   }
} while (l != end);
 
spin_unlock(&i->lock);

[PATCH] input: psmouse: fix potential input register race in psmouse_connect()

2008-01-12 Thread Andres Salomon


If we successfully call input_register_device() in
psmouse_connect but sysfs_create_group() fails, we'll enter the error
path without ever having called input_unregister_device() (potentially
leaking memory, or creating a race condition if something else attempts
to access the new input device).  This calls input_unregister_device
from the error path, and sets input_dev to NULL so that we don't
attempt to also call input_free_device on it.

Signed-off-by: Andres Salomon <[EMAIL PROTECTED]>
---
 drivers/input/mouse/psmouse-base.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/input/mouse/psmouse-base.c 
b/drivers/input/mouse/psmouse-base.c
index 21a9c0b..df25e7b 100644
--- a/drivers/input/mouse/psmouse-base.c
+++ b/drivers/input/mouse/psmouse-base.c
@@ -1247,6 +1247,8 @@ static int psmouse_connect(struct serio *serio, struct 
serio_driver *drv)
  err_pt_deactivate:
if (parent && parent->pt_deactivate)
parent->pt_deactivate(parent);
+   input_unregister_device(psmouse->dev);
+   input_dev = NULL;
  err_protocol_disconnect:
if (psmouse->disconnect)
psmouse->disconnect(psmouse);
-- 
1.5.3.5
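
The error-path pattern from the changelog above (unregister a device that was already registered, then clear the pointer so the shared "free" step does nothing) can be sketched in user-space C. This is a toy model under stated assumptions: the `toy_*` functions are hypothetical stand-ins, and the model assumes that unregistering releases the object while freeing is only valid for a device that was never registered.

```c
#include <stdlib.h>

struct toy_input {
	int registered;
};

int live_objects;	/* allocations not yet released */

struct toy_input *toy_alloc(void)
{
	struct toy_input *dev = calloc(1, sizeof(*dev));

	if (dev)
		live_objects++;
	return dev;
}

/* Only for never-registered devices; NULL is a no-op. */
void toy_free(struct toy_input *dev)
{
	if (!dev)
		return;
	free(dev);
	live_objects--;
}

/* Assumed to drop the last reference of a registered device. */
void toy_unregister(struct toy_input *dev)
{
	dev->registered = 0;
	free(dev);
	live_objects--;
}

int toy_connect(int fail_late, struct toy_input **out)
{
	struct toy_input *dev = toy_alloc();

	*out = NULL;
	if (!dev)
		return -1;

	dev->registered = 1;	/* stands in for a successful registration */

	if (fail_late)		/* e.g. a later setup step failing */
		goto err_unregister;

	*out = dev;
	return 0;

err_unregister:
	toy_unregister(dev);
	dev = NULL;		/* makes the shared free step below a no-op */
	toy_free(dev);
	return -1;
}
```

Without the `dev = NULL` assignment, the shared free step would release the same object a second time in this model.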


[PATCH] input: psmouse: fix input_dev leak in lifebook driver

2008-01-12 Thread Andres Salomon


The lifebook driver may register a second input device, but it never
unregisters it.  This fixes that.

Signed-off-by: Andres Salomon <[EMAIL PROTECTED]>
---
 drivers/input/mouse/lifebook.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/input/mouse/lifebook.c b/drivers/input/mouse/lifebook.c
index 9ec57d8..df81b0a 100644
--- a/drivers/input/mouse/lifebook.c
+++ b/drivers/input/mouse/lifebook.c
@@ -225,8 +225,13 @@ static void lifebook_set_resolution(struct psmouse 
*psmouse, unsigned int resolu
 
 static void lifebook_disconnect(struct psmouse *psmouse)
 {
+   struct lifebook_data *priv = psmouse->private;
+
psmouse_reset(psmouse);
-   kfree(psmouse->private);
+   if (priv) {
+   input_unregister_device(priv->dev2);
+   kfree(priv);
+   }
psmouse->private = NULL;
 }
 
-- 
1.5.3.5


[PATCH] input: check return value of input_register_device() in hil_ptr.c's init

2008-01-12 Thread Andres Salomon


Signed-off-by: Andres Salomon <[EMAIL PROTECTED]>
---
 drivers/input/mouse/hil_ptr.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/input/mouse/hil_ptr.c b/drivers/input/mouse/hil_ptr.c
index 27f88fb..de8b836 100644
--- a/drivers/input/mouse/hil_ptr.c
+++ b/drivers/input/mouse/hil_ptr.c
@@ -380,7 +380,10 @@ static int hil_ptr_connect(struct serio *serio, struct 
serio_driver *driver)
ptr->dev->id.version= 0x0100; /* TODO: get from ptr->rsc */
ptr->dev->dev.parent= &serio->dev;
 
-   input_register_device(ptr->dev);
+   if (input_register_device(ptr->dev)) {
+   printk(KERN_INFO PREFIX "Unable to register input device\n");
+   goto bail2;
+   }
printk(KERN_INFO "input: %s (%s), ID: %d\n",
ptr->dev->name,
(btntype == BTN_MOUSE) ? "HIL mouse":"HIL tablet or touchpad",
-- 
1.5.3.5


-mm: pnp-do-not-stop-start-devices-in-suspend-resume-path.patch breaks resuming isapnp cards

2008-01-12 Thread Rene Herman


Hi Andrew.

pnp-do-not-stop-start-devices-in-suspend-resume-path.patch in current -mm 
breaks resuming isapnp cards from hibernation. They need the pnp_start_dev 
to enable the device again after hibernation.


They don't really need the pnp_stop_dev() which the above mentioned patch 
also removes but with the pnp_start_dev() restored it seems pnp_stop_dev() 
should also stay. Bjorn Helgaas should decide  -- currently the patch as you 
have it breaks drivers though. Could you drop it?


Then, if so and after you do that, could you apply the attached? That's also 
needed to resume (ALSA) ISA-PnP cards from hibernation due to the 
RES_DO_NOT_CHANGE test triggering for ALSA drivers and the pnp_start_dev() 
still not happening. More in the changelog...


On 12-01-08 20:08, Rafael J. Wysocki wrote:

> On Saturday, 12 of January 2008, Rene Herman wrote:
>
> > It seems all PnP drivers would need to stick a pnp_start_dev in their resume
> > method
>
> Yes.
>
> > then which means it really belongs in core.
>
> Yes, if practical.
>
> > One important point where PnP and PCI differ is that PnP allows to change the
> > resources on a protocol level and I don't see how it could ever not be
> > necessary to restore the state a user may have set if power has been
> > removed. Hibernate is just that, isn't it?
>
> Basically, yes, it is.

Rene.

commit 7d16e8b3e7739599d32c8006f9e84fadb86b8296
Author: Rene Herman <[EMAIL PROTECTED]>
Date:   Sat Jan 12 00:00:35 2008 +0100

PNP: do not test PNP_DRIVER_RES_DO_NOT_CHANGE on suspend/resume

The PNP_DRIVER_RES_DO_NOT_CHANGE flag is meant to signify that
the PNP core should not change resources for the device -- not
that it shouldn't disable/enable the device on suspend/resume.

    ALSA ISAPnP drivers set PNP_DRIVER_RES_DO_NOT_CHANGE (0x0001)
through setting PNP_DRIVER_RES_DISABLE (0x0003). The latter
including the former may in itself be considered rather
unexpected but doesn't change that suspend/resume wouldn't seem
to have any business testing the flag.

As reported by Ondrej Zary for snd-cs4236, ALSA driven ISAPnP
cards don't survive swsusp hibernation with the resume skipping
setting the resources due to testing the flag -- the same test
in the suspend path isn't enough to keep hibernation from
disabling the card it seems.

    These tests were added (in 2005) by Pierre Ossman in commit
68094e3251a664ee1389fcf179497237cbf78331, "alsa: Improved PnP
suspend support" who doesn't remember why. This deletes them.

Signed-off-by: Rene Herman <[EMAIL PROTECTED]>
Tested-by: Ondrej Zary <[EMAIL PROTECTED]>

diff --git a/drivers/pnp/driver.c b/drivers/pnp/driver.c
index a262762..12a1645 100644
--- a/drivers/pnp/driver.c
+++ b/drivers/pnp/driver.c
@@ -161,8 +161,7 @@ static int pnp_bus_suspend(struct device *dev, pm_message_t 
state)
return error;
}
 
-   if (!(pnp_drv->flags & PNP_DRIVER_RES_DO_NOT_CHANGE) &&
-   pnp_can_disable(pnp_dev)) {
+   if (pnp_can_disable(pnp_dev)) {
error = pnp_stop_dev(pnp_dev);
if (error)
return error;
@@ -185,14 +184,17 @@ static int pnp_bus_resume(struct device *dev)
if (pnp_dev->protocol && pnp_dev->protocol->resume)
pnp_dev->protocol->resume(pnp_dev);
 
-   if (!(pnp_drv->flags & PNP_DRIVER_RES_DO_NOT_CHANGE)) {
+   if (pnp_can_write(pnp_dev)) {
error = pnp_start_dev(pnp_dev);
if (error)
return error;
}
 
-   if (pnp_drv->resume)
-   return pnp_drv->resume(pnp_dev);
+   if (pnp_drv->resume) {
+   error = pnp_drv->resume(pnp_dev);
+   if (error)
+   return error;
+   }
 
return 0;
 }
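
For illustration, the flag overlap described in the commit message can be checked in a few lines of userspace C. This is only a sketch: the two values are the ones quoted above, and the helper old_resume_restarts_device() is a hypothetical name that mirrors the test the patch deletes, not a function from the PnP core.

```c
#include <assert.h>

/* Flag values as quoted in the commit message above. */
#define PNP_DRIVER_RES_DO_NOT_CHANGE 0x0001
#define PNP_DRIVER_RES_DISABLE       0x0003

/*
 * Hypothetical helper mirroring the removed test: returns non-zero when
 * the old resume path would have called pnp_start_dev() for a driver
 * with the given flags.
 */
int old_resume_restarts_device(unsigned int drv_flags)
{
	return !(drv_flags & PNP_DRIVER_RES_DO_NOT_CHANGE);
}
```

Because 0x0003 contains bit 0x0001, a driver that only asked for PNP_DRIVER_RES_DISABLE also trips the DO_NOT_CHANGE test, so the old resume path skipped re-enabling it.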

Re: [PATCH 1/7] driver-core : add class iteration api

2008-01-12 Thread Jarek Poplawski

On Sat, Jan 12, 2008 at 05:47:54PM +0800, Dave Young wrote:
> Add the following class iteration functions for driver use:
> class_for_each_device
> class_find_device
> class_for_each_child
> class_find_child
> 
> Signed-off-by: Dave Young <[EMAIL PROTECTED]> 
> 
> ---
>  drivers/base/class.c   |  159 
> +
>  include/linux/device.h |8 ++
>  2 files changed, 167 insertions(+)
> 
> diff -upr linux/drivers/base/class.c linux.new/drivers/base/class.c
> --- linux/drivers/base/class.c2008-01-12 14:42:24.0 +0800
> +++ linux.new/drivers/base/class.c2008-01-12 14:42:24.0 +0800
> @@ -798,6 +798,165 @@ void class_device_put(struct class_devic
>   kobject_put(&class_dev->kobj);
>  }
>  
> +/**
> + *   class_for_each_device - device iterator
> + *   @class: the class we're iterating
> + *   @data: data for the callback
> + *   @fn: function to be called for each device
> + *
> + *   Iterate over @class's list of devices, and call @fn for each,
> + *   passing it @data.
> + *
> + *   We check the return of @fn each time. If it returns anything
> + *   other than 0, we break out and return that value.
> + */
> +int class_for_each_device(struct class *class, void *data,
> +int (*fn)(struct device *, void *))
> +{
> + struct device *dev;
> + int error = 0;
> +
> + if (!class)
> + return -EINVAL;
> + down(&class->sem);
> + list_for_each_entry(dev, &class->devices, node) {

Probably some tiny oversight, but I see this comment to struct class
doesn't mention devices list, so maybe this needs to be updated BTW?:

(from include/linux/device.h)
"struct semaphore sem; /* locks both the children and interfaces lists */"
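
For reference, the iteration contract in the quoted kernel-doc comment (call @fn for each device, break out on the first non-zero return and hand that value back) can be sketched in userspace C. Everything here is an assumption for illustration: struct dev_sketch, for_each_dev_sketch() and match_id() are made-up names, a singly linked list replaces the kernel list, and the locking with class->sem is only noted in a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device on a class's list. */
struct dev_sketch {
	int id;
	struct dev_sketch *next;
};

/* Call fn for each device; stop at the first non-zero return and pass it back. */
int for_each_dev_sketch(struct dev_sketch *head, void *data,
			int (*fn)(struct dev_sketch *, void *))
{
	struct dev_sketch *dev;
	int error = 0;

	/* The quoted kernel version holds class->sem around this loop. */
	for (dev = head; dev != NULL; dev = dev->next) {
		error = fn(dev, data);
		if (error)
			break;
	}
	return error;
}

/* Example callback: returns the matching id (non-zero) to stop the walk. */
int match_id(struct dev_sketch *dev, void *data)
{
	return dev->id == *(int *)data ? dev->id : 0;
}
```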

Regards,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 8/8] ide: add ->cable_detect method to ide_hwif_t

2008-01-12 Thread Sergei Shtylyov


Hello.

Bartlomiej Zolnierkiewicz wrote:


* Add ->cable_detect method to ide_hwif_t.



* Call the new method in ide_init_port() if:
  - the host supports UDMA modes > UDMA2 ('hwif->ultra_mask & 0x78')
  - DMA initialization was successful (if hwif->dma_base is not set
ide_init_port() sets hwif->ultra_mask to zero)
  - "idex=ata66" is not used ('hwif->cbl != ATA_CBL_PATA40_SHORT')



* Convert PCI host drivers to use ->cable_detect method.



While at it:



* Factor out cable detection to separate functions (if not already done).



* hpt366.c/it8213.c/slc90e66.c:
  - don't check cable type if "idex=ata66" is used



* pdc202xx_new.c:
  - add __devinit tag to pdcnew_cable_detect()



* pdc202xx_old.c:
  - rename pdc202xx_old_cable_detect() to pdc2026x_old_cable_detect()
  - add __devinit tag to pdc2026x_old_cable_detect()



Signed-off-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>



Index: b/drivers/ide/ide-probe.c
===
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -1343,6 +1343,11 @@ static void ide_init_port(ide_hwif_t *hw
/* call chipset specific routine for each enabled port */
if (d->init_hwif)
d->init_hwif(hwif);
+
+   if (hwif->cable_detect && (hwif->ultra_mask & 0x78)) {
+   if (hwif->cbl != ATA_CBL_PATA40_SHORT)
+   hwif->cbl = hwif->cable_detect(hwif);
+   }


   Could be collapsed to a single *if* statement...
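
   A rough sketch of the collapsed form, written as standalone C so it can be compiled outside the kernel. The struct hwif_sketch type, the SK_* constants and their values, and always_80() are placeholders invented for this example; only the condition itself comes from the hunk above.

```c
#include <assert.h>

/* Placeholder cable codes; the real ATA_CBL_* values live in kernel headers. */
enum { SK_CBL_PATA40 = 0, SK_CBL_PATA80 = 1, SK_CBL_PATA40_SHORT = 2 };

/* Hypothetical cut-down stand-in for ide_hwif_t. */
struct hwif_sketch {
	unsigned char ultra_mask;
	unsigned char cbl;
	unsigned char (*cable_detect)(struct hwif_sketch *hwif);
};

/* Same three tests as the nested version, joined into one condition. */
void init_port_cable(struct hwif_sketch *hwif)
{
	if (hwif->cable_detect && (hwif->ultra_mask & 0x78) &&
	    hwif->cbl != SK_CBL_PATA40_SHORT)
		hwif->cbl = hwif->cable_detect(hwif);
}

/* Dummy detect method used only to exercise the condition. */
unsigned char always_80(struct hwif_sketch *hwif)
{
	(void)hwif;
	return SK_CBL_PATA80;
}
```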


Index: b/drivers/ide/pci/alim15x3.c
===
--- a/drivers/ide/pci/alim15x3.c
+++ b/drivers/ide/pci/alim15x3.c
@@ -666,13 +666,12 @@ static void __devinit init_hwif_common_a
hwif->set_dma_mode = &ali_set_dma_mode;
hwif->udma_filter = &ali_udma_filter;
 
+	hwif->cable_detect = ata66_ali15x3;


   Why not give that function a "standard" name while at it?


 static const struct ide_port_info atiixp_pci_info[] __devinitdata = {
Index: b/drivers/ide/pci/cmd64x.c
===
--- a/drivers/ide/pci/cmd64x.c
+++ b/drivers/ide/pci/cmd64x.c
@@ -393,6 +393,8 @@ static void __devinit init_hwif_cmd64x(i
hwif->set_pio_mode = &cmd64x_set_pio_mode;
hwif->set_dma_mode = &cmd64x_set_dma_mode;
 
+	hwif->cable_detect = ata66_cmd64x;

+


   Same question...


Index: b/drivers/ide/pci/hpt366.c
===
--- a/drivers/ide/pci/hpt366.c
+++ b/drivers/ide/pci/hpt366.c
@@ -1279,12 +1279,55 @@ static unsigned int __devinit init_chips
return dev->irq;
 }
 
+static u8 __devinit hpt3xx_cable_detect(ide_hwif_t *hwif)

+{
+   struct pci_dev  *dev= to_pci_dev(hwif->dev);
+   struct hpt_info *info   = pci_get_drvdata(dev);
+   u8 chip_type= info->chip_type;
+   u8 scr1 = 0, ata66  = hwif->channel ? 0x01 : 0x02;


   'ata66' is a pretty bad name for this variable (reversed sense), let's 
go with 'mask'...
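
   To spell out the "reversed sense": a set bit means a 40-wire cable, not an 80-wire one. A minimal standalone sketch with the suggested name follows; cable_from_scr1() and the SKETCH_* constants are hypothetical, while the per-channel bit selection and the return expression are taken from the quoted function.

```c
#include <assert.h>

/* Placeholder cable codes for this sketch only. */
enum { SKETCH_PATA40 = 0, SKETCH_PATA80 = 1 };

/*
 * scr1:    value read from config register 0x5a
 * channel: hwif->channel (0 or 1)
 */
int cable_from_scr1(unsigned char scr1, int channel)
{
	unsigned char mask = channel ? 0x01 : 0x02;

	/* Bit set: 40-wire cable; bit clear: 80-wire cable. */
	return (scr1 & mask) ? SKETCH_PATA40 : SKETCH_PATA80;
}
```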



+
+   /*
+* The HPT37x uses the CBLID pins as outputs for MA15/MA16
+* address lines to access an external EEPROM.  To read valid
+* cable detect state the pins must be enabled as inputs.
+*/
+   if (chip_type == HPT374 && (PCI_FUNC(dev->devfn) & 1)) {
+   /*
+* HPT374 PCI function 1
+* - set bit 15 of reg 0x52 to enable TCBLID as input
+* - set bit 15 of reg 0x56 to enable FCBLID as input
+*/
+   u8  mcr_addr = hwif->select_data + 2;
+   u16 mcr;
+
+   pci_read_config_word(dev, mcr_addr, &mcr);
+   pci_write_config_word(dev, mcr_addr, (mcr | 0x8000));
+   /* now read cable id register */
+   pci_read_config_byte(dev, 0x5a, &scr1);
+   pci_write_config_word(dev, mcr_addr, mcr);
+   } else if (chip_type >= HPT370) {
+   /*
+* HPT370/372 and 374 pcifn 0
+* - clear bit 0 of reg 0x5b to enable P/SCBLID as inputs
+*/
+   u8 scr2 = 0;
+
+   pci_read_config_byte(dev, 0x5b, &scr2);
+   pci_write_config_byte(dev, 0x5b, (scr2 & ~1));
+   /* now read cable id register */
+   pci_read_config_byte(dev, 0x5a, &scr1);
+   pci_write_config_byte(dev, 0x5b,  scr2);


   Sigh, my pretty formatting is gone... at least don't leave double spaces 
and needless parens. :-)



+   } else
+   pci_read_config_byte(dev, 0x5a, &scr1);
+
+   return (scr1 & ata66) ? ATA_CBL_PATA40 : ATA_CBL_PATA80;
+}
+

[...]

--- a/drivers/ide/pci/it8213.c
+++ b/drivers/ide/pci/it8213.c
Index: b/drivers/ide/pci/it821x.c
===
--- a/drivers/ide/pci/it821x.c
+++ b/drivers/ide/pci/it821x.c
@@ -579,14 +579,13 @@ static void __devinit init_hwif_it821x(i

[PATCH] [RESEND] Assign IRQs to HPET Timers

2008-01-12 Thread Balaji Rao

The userspace API for the HPET (see Documentation/hpet.txt) did not work. The 
HPET_IE_ON ioctl was failing as there was no IRQ assigned to the timer 
device. This patch fixes it by allocating IRQs to timer blocks in the HPET.
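
The allocation strategy used below (take a candidate IRQ from the timer's routing-capability bitmap, write it, read it back, and move on to the next candidate if it did not stick) can be sketched in userspace C. This is a simplified model under stated assumptions, not the patch itself: struct timer_sketch, write_route() and assign_irq() are hypothetical, the hardware is faked with an "accepts" mask, and a plain loop stands in for find_first_bit()/find_next_bit() and readq()/writeq().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one timer: only routes in hw_accepts stick. */
struct timer_sketch {
	uint32_t hw_accepts;	/* bit n set: hardware latches route n */
	uint32_t route;		/* currently latched route */
};

/* Fake register write: the value is latched only if the hardware accepts it. */
void write_route(struct timer_sketch *t, unsigned int irq)
{
	if (t->hw_accepts & (UINT32_C(1) << irq))
		t->route = irq;
}

/* Try each IRQ advertised in cap_bitmap, lowest first; 0 means none usable. */
unsigned int assign_irq(struct timer_sketch *t, uint32_t cap_bitmap)
{
	unsigned int irq;

	for (irq = 0; irq < 32; irq++) {
		if (!(cap_bitmap & (UINT32_C(1) << irq)))
			continue;
		write_route(t, irq);
		/* Verify by reading the route back. */
		if (t->route == irq)
			return irq;
	}
	return 0;
}
```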

arch/x86/kernel/hpet.c |   13 +
drivers/char/hpet.c|   45 ++---
include/linux/hpet.h   |2 +-
3 files changed, 44 insertions(+), 16 deletions(-)

Signed-off-by: Balaji Rao <[EMAIL PROTECTED]>

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 4a86ffd..08ee998 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -116,8 +116,7 @@ int is_hpet_enabled(void)
 static void hpet_reserve_platform_timers(unsigned long id)
 {
struct hpet __iomem *hpet = hpet_virt_address;
-   struct hpet_timer __iomem *timer = &hpet->hpet_timers[2];
-   unsigned int nrtimers, i;
+   unsigned int nrtimers;
struct hpet_data hd;
 
nrtimers = ((id & HPET_ID_NUMBER) >> HPET_ID_NUMBER_SHIFT) + 1;
@@ -132,16 +131,14 @@ static void hpet_reserve_platform_timers(unsigned long id)
 #ifdef CONFIG_HPET_EMULATE_RTC
hpet_reserve_timer(&hd, 1);
 #endif
-
hd.hd_irq[0] = HPET_LEGACY_8254;
hd.hd_irq[1] = HPET_LEGACY_RTC;
 
-   for (i = 2; i < nrtimers; timer++, i++)
-   hd.hd_irq[i] = (timer->hpet_config & Tn_INT_ROUTE_CNF_MASK) >>
-   Tn_INT_ROUTE_CNF_SHIFT;
-
+   /*
+* IRQs for the other timers are assigned dynamically
+* in hpet_alloc
+*/
hpet_alloc(&hd);
-
 }
 #else
 static void hpet_reserve_platform_timers(unsigned long id) { }
diff --git a/drivers/char/hpet.c b/drivers/char/hpet.c
index 4c16778..593b32c 100644
--- a/drivers/char/hpet.c
+++ b/drivers/char/hpet.c
@@ -806,14 +806,14 @@ static unsigned long hpet_calibrate(struct hpets *hpetp)
 
 int hpet_alloc(struct hpet_data *hdp)
 {
-   u64 cap, mcfg;
+   u64 cap, mcfg, hpet_config;
struct hpet_dev *devp;
-   u32 i, ntimer;
+   u32 i, ntimer, irq;
struct hpets *hpetp;
size_t siz;
struct hpet __iomem *hpet;
static struct hpets *last = NULL;
-   unsigned long period;
+   unsigned long period, irq_bitmap;
unsigned long long temp;
 
/*
@@ -840,11 +840,41 @@ int hpet_alloc(struct hpet_data *hdp)
hpetp->hp_hpet_phys = hdp->hd_phys_address;
 
hpetp->hp_ntimer = hdp->hd_nirqs;
+   hpet = hpetp->hp_hpet;
 
-   for (i = 0; i < hdp->hd_nirqs; i++)
-   hpetp->hp_dev[i].hd_hdwirq = hdp->hd_irq[i];
+   /* Assign IRQs statically for legacy devices */
+   hpetp->hp_dev[0].hd_hdwirq = hdp->hd_irq[0];
+   hpetp->hp_dev[1].hd_hdwirq = hdp->hd_irq[1];
 
-   hpet = hpetp->hp_hpet;
+   /* Assign IRQs dynamically for the others */
+   for (i = 2, devp = &hpetp->hp_dev[2]; i < hdp->hd_nirqs; i++, devp++) {
+   struct hpet_timer __iomem *timer;
+
+   timer = &hpet->hpet_timers[devp - hpetp->hp_dev];
+
+   hpet_config = readq(&timer->hpet_config);
+   irq_bitmap = (hpet_config & Tn_INT_ROUTE_CAP_MASK)
+   >> Tn_INT_ROUTE_CAP_SHIFT;
+   if (!irq_bitmap)
+   irq = 0;/* No valid IRQ Assignable */
+   else {
+   irq = find_first_bit(&irq_bitmap, 32);
+   do {
+   hpet_config |= irq << Tn_INT_ROUTE_CNF_SHIFT;
+   writeq(hpet_config, &timer->hpet_config);
+
+   /*
+* Verify whether we have written a valid
+* IRQ number by reading it back again
+*/
+   hpet_config = readq(&timer->hpet_config);
+   if (irq == (hpet_config & Tn_INT_ROUTE_CNF_MASK)
+   >> Tn_INT_ROUTE_CNF_SHIFT)
+   break;  /* Success */
+   } while ((irq = (find_next_bit(&irq_bitmap, 32, irq))));
+   }
+   hpetp->hp_dev[i].hd_hdwirq = irq;
+   }
 
cap = readq(&hpet->hpet_cap);
 
@@ -875,7 +905,8 @@ int hpet_alloc(struct hpet_data *hdp)
hpetp->hp_which, hdp->hd_phys_address,
hpetp->hp_ntimer > 1 ? "s" : "");
for (i = 0; i < hpetp->hp_ntimer; i++)
-   printk("%s %d", i > 0 ? "," : "", hdp->hd_irq[i]);
+   printk("%s %d", i > 0 ? "," : "",
+   hpetp->hp_dev[i].hd_hdwirq);
printk("\n");
 
printk(KERN_INFO "hpet%u: %u %d-bit timers, %Lu Hz\n",
diff --git a/include/linux/hpet.h b/include/linux/hpet.h
index 707f7cb..e3c0b2a 100644
--- a/include/linux/hpet.h
+++ b/include/linux/hpet.h
@@ -64,7 +64,7 @@ struct hpet {
  */
 
 #defineTn_INT_ROUTE_CAP_MASK

Re: new runtime scsi warnings in 2.6.24-rc6+git

2008-01-12 Thread Rafael J. Wysocki

On Friday, 4 of January 2008, Meelis Roos wrote:
> Today's git gives the following warning during bootup on an Intel 845+PATA 
> PC (using libata to drive PATA):
> 
> Driver 'sd' needs updating - please use bus_type methods
> Driver 'sr' needs updating - please use bus_type methods

They are due to commit 751bf4d7865e4ced406be93b04c7436d866d3684, AFAICS, and
they don't mean anything wrong.

Thanks,
Rafael

Re: [PATCH 0/5] USB Kconfig: Reorganize USB Kconfig Menu

2008-01-12 Thread Greg KH

On Sat, Jan 12, 2008 at 01:20:46PM +0300, Al Boldi wrote:
> Greg KH wrote:
> > On Sat, Jan 05, 2008 at 06:40:38PM +0300, Al Boldi wrote:
> > > Reorganize USB Kconfig Menu, and move USB_GADGET out into the Device
> > > Driver Menu.  This helps the USB Kconfig Menu to be more logical/usable.
> > >
> > > Patchset against 2.6.23
> >
> > So what was the final verdict in this patch set?
> 
> IMHO, it's a lot better than what we have right now, and it's split up so 
> that you can pick and choose what you think is useful.
> 
> > Can you resend this against 2.6.24-rc7 with the requested changes (if
> > any) in it?
> 
> The only critical change is in patch 2/5:
> 
> menuconfig USB_STORAGE
> tristate "USB Mass Storage support"
> -   depends on USB && SCSI
> +   depends on USB && BLOCK
> +   select SCSI
> 
> 
> I was hoping you could take care of it, or maybe wait until 2.6.25 is out.

Well, if you want such a change to go into 2.6.25, I need the patches
now :)

So, can you respin these against 2.6.24-rc7, with the above fix, so that
I can apply them and test them out?

thanks,

greg k-h

nosmp/maxcpus=0 or 1 -> TSC unstable

2008-01-12 Thread dean gaudet

if i boot an x86 64-bit 2.6.24-rc7 kernel with nosmp, maxcpus=0 or 1 it 
still disables TSC :)

Marking TSC unstable due to TSCs unsynchronized

this is an opteron 2xx box which does have two cpus and no clock-divide in 
halt or cpufreq enabled so TSC should be fine with only one cpu.

pretty sure the culprit is that num_possible_cpus() > 1, which 
would mean cpu_possible_map contains the second cpu... but i'm not quite 
sure what the right fix is... or perhaps this is all intended.

-dean

[PATCH] PM: Acquire device locks on suspend (rev. 4)

2008-01-12 Thread Rafael J. Wysocki

Hi Greg,

Please drop

gregkh-driver-pm-acquire-device-locks-prior-to-suspending.patch

that deadlocks suspend and hibernation on some systems, and apply the appended
$subject patch (which provides the equivalent functionality and introduces
safeguards against deadlocking in the relevant cases) instead.

Thanks,
Rafael

---
From: Rafael J. Wysocki <[EMAIL PROTECTED]>, Alan Stern <[EMAIL PROTECTED]>

This patch reorganizes the way suspend and resume notifications are
sent to drivers.  The major changes are that now the PM core acquires
every device semaphore before calling the methods, and calls to
device_add() during suspends will fail, while calls to device_del()
during suspends will block.

It also provides a way to safely remove a suspended device with the
help of the PM core, by using the device_pm_schedule_removal() callback
introduced specifically for this purpose, and updates two drivers (msr
and cpuid) that need to use it.

Signed-off-by: Alan Stern <[EMAIL PROTECTED]>
Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]>
---
 arch/x86/kernel/cpuid.c|6 
 arch/x86/kernel/msr.c  |6 
 drivers/base/core.c|   65 +
 drivers/base/power/main.c  |  504 +
 drivers/base/power/power.h |   12 +
 include/linux/device.h |8 
 6 files changed, 414 insertions(+), 187 deletions(-)

Index: linux-2.6/drivers/base/core.c
===
--- linux-2.6.orig/drivers/base/core.c
+++ linux-2.6/drivers/base/core.c
@@ -726,11 +726,20 @@ int device_add(struct device *dev)
 {
struct device *parent = NULL;
struct class_interface *class_intf;
-   int error = -EINVAL;
+   int error;
+
+   error = pm_sleep_lock();
+   if (error) {
+   dev_warn(dev, "Suspicious %s during suspend\n", __FUNCTION__);
+   dump_stack();
+   return error;
+   }
 
dev = get_device(dev);
-   if (!dev || !strlen(dev->bus_id))
+   if (!dev || !strlen(dev->bus_id)) {
+   error = -EINVAL;
goto Error;
+   }
 
pr_debug("DEV: registering device: ID = '%s'\n", dev->bus_id);
 
@@ -795,6 +804,7 @@ int device_add(struct device *dev)
}
  Done:
put_device(dev);
+   pm_sleep_unlock();
return error;
  BusError:
device_pm_remove(dev);
@@ -905,6 +915,7 @@ void device_del(struct device * dev)
struct device * parent = dev->parent;
struct class_interface *class_intf;
 
+   device_pm_remove(dev);
if (parent)
klist_del(&dev->knode_parent);
if (MAJOR(dev->devt))
@@ -981,7 +992,6 @@ void device_del(struct device * dev)
if (dev->bus)
blocking_notifier_call_chain(&dev->bus->bus_notifier,
 BUS_NOTIFY_DEL_DEVICE, dev);
-   device_pm_remove(dev);
kobject_uevent(&dev->kobj, KOBJ_REMOVE);
kobject_del(&dev->kobj);
if (parent)
@@ -1156,14 +1166,11 @@ error:
 EXPORT_SYMBOL_GPL(device_create);
 
 /**
- * device_destroy - removes a device that was created with device_create()
+ * find_device - finds a device that was created with device_create()
  * @class: pointer to the struct class that this device was registered with
  * @devt: the dev_t of the device that was previously registered
- *
- * This call unregisters and cleans up a device that was created with a
- * call to device_create().
  */
-void device_destroy(struct class *class, dev_t devt)
+static struct device *find_device(struct class *class, dev_t devt)
 {
struct device *dev = NULL;
struct device *dev_tmp;
@@ -1176,12 +1183,54 @@ void device_destroy(struct class *class,
}
}
up(&class->sem);
+   return dev;
+}
+
+/**
+ * device_destroy - removes a device that was created with device_create()
+ * @class: pointer to the struct class that this device was registered with
+ * @devt: the dev_t of the device that was previously registered
+ *
+ * This call unregisters and cleans up a device that was created with a
+ * call to device_create().
+ */
+void device_destroy(struct class *class, dev_t devt)
+{
+   struct device *dev;
 
+   dev = find_device(class, devt);
if (dev)
device_unregister(dev);
 }
 EXPORT_SYMBOL_GPL(device_destroy);
 
+#ifdef CONFIG_PM_SLEEP
+/**
+ * destroy_suspended_device - asks the PM core to remove a suspended device
+ * @class: pointer to the struct class that this device was registered with
+ * @devt: the dev_t of the device that was previously registered
+ *
+ * This call notifies the PM core of the necessity to unregister a suspended
+ * device created with a call to device_create() (devices cannot be
+ * unregistered directly while suspended, since the PM core holds their
+ * semaphores at that time).
+ *
+ * It can only be called within the scope of a system sleep transition.  In
+ * practice t

Re: [PATCH] Only print SCSI data direction warning once for a command

2008-01-12 Thread James Bottomley

On Wed, 2008-01-02 at 07:03 +0100, Andi Kleen wrote:
> When I use cdparanoia my logs get spammed a lot by
> 
> printk: 464 messages suppressed.
> sg_write: data in/out 30576/30576 bytes for SCSI command 0xbe--guessing data 
> in;
>program cdparanoia not setting count and/or reply_len properly
> printk: 1078 messages suppressed.
> 
> and many more of those. With this patch the message is only printed once
> for a command in a row.

My reaction is that the intent of these warnings is to try to get people
to fix broken applications, so I'm not sure any action is appropriate;
however, it's Doug's driver, so I'll defer to him.

Even if he does say yes, though, your patch looks wrong.  It's still
going to spew the 

printk: 1078 messages suppressed.

to the log because they come from printk_ratelimit().  So all you've
done is halved the volume of flow to the logs and left a dangling printk
suppressed message that keeps spewing, so I don't think the patch even
does what you describe it as doing ...  if you reverse the order of the
operands in the if() it will ...
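
To illustrate the operand-order point (the patch's actual condition is not quoted here, so its shape is an assumption): with && the left operand is evaluated first, so if the rate limiter is on the left it is consulted, and keeps counting suppressed messages, even for a repeated command. In this sketch fake_ratelimit() is a hypothetical stand-in for printk_ratelimit() that just counts calls.

```c
#include <assert.h>

int ratelimit_calls;	/* how often the fake rate limiter was consulted */

/* Hypothetical stand-in for printk_ratelimit(): counts calls, always refuses. */
int fake_ratelimit(void)
{
	ratelimit_calls++;
	return 0;
}

/* Limiter first: it is consulted even when the command is a repeat. */
int should_warn_limiter_first(int opcode, int last_opcode)
{
	return fake_ratelimit() && opcode != last_opcode;
}

/* Repeat check first: short-circuit keeps the limiter out of it for repeats. */
int should_warn_repeat_check_first(int opcode, int last_opcode)
{
	return opcode != last_opcode && fake_ratelimit();
}
```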

James


Re: rtl8187 rate control doesn't work

2008-01-12 Thread Stefano Brivio

On Sat, 12 Jan 2008 19:23:53 +0100
Hauke Mehrtens <[EMAIL PROTECTED]> wrote:

> I have tried wireless-2.6 and I have the same problem. If the rate goes
> over 11M no TCP/IP traffic goes through the wireless connection. If rate
> is set to auto and the rate control algorithm changes it to something
> less than 11M TCP/IP traffic goes through, but if it is more than 11M no
> TCP/IP traffic goes through the wireless link.

And this is a problem if the rate control algorithm often sets the rate to
more than 11M even if connection is unreliable at that rate. Does this
actually happen? Are you currently using 'pid'?

Also, it would be fine to have a dump of the PID events. In order to get
this, you need to:
- build a kernel with debugfs enabled;
- mount debugfs (mount -t debugfs foo /mnt/debug)

Then, before testing:
cat /mnt/debug/ieee80211/phy0/stations/*/rc_pid_events > events_dump

You can even make a graph out of that, with a python script which Mattias
posted on this list some time ago, but a dump would be just fine.


--
Ciao
Stefano

Re: PROBLEM REMAINS: [sata_nv ADMA breaks ATAPI] Crash on accessing DVD-RAM

2008-01-12 Thread Robert Hancock


Alexander wrote:

Robert Hancock wrote:

There's this patch which was intended to fix it:

http://lkml.org/lkml/2007/11/22/148


I applied this patch to 2.6.24-rc7. Now at boot time my DVD-RW is
normally detected as:

sr0: scsi3-mmc drive: 48x/48x writer dvd-ram cd/rw xa/form2 cdda tray

But I cannot mount it. All my attempts failed with

ISOFS: Unable to identify CD-ROM format.

With mem<=4098M or sata_nv.adma=0 it still mounts and works ok.


As I wrote, it would appear that somehow the blk_queue_bounce_limit 
setting that the driver has made is not being respected and the block 
layer is still trying to feed it addresses over 4GB. Any ideas anyone?


Re: The ext3 way of journalling

2008-01-12 Thread Andrey Vul

On Jan 12, 2008 10:06 AM, Theodore Tso <[EMAIL PROTECTED]> wrote:
[snip]
> Unfortunately Ubuntu users [snip] fit this demographic hugely, and
> Ubuntu refuses to fix this problem[1], so it's been personally very
> vexing, because the users complain to *me*, and I can't fix the problem,
> because it's a distribution init script issue.
Ubuntu refuses to be power user friendly. They've forgotten the True
Meaning (tm) of Linux and try to be Windows-friendly, i.e., No Choices
(tm).


> Maybe someday Ubuntu will get this right --- but I'm not counting on it.
The alternative CD installer still looks like a semi-dumbed-down
debian installer. Hell, even the command-line base install is severely
bloated - it's the exact opposite of LFS or gentoo.
Still, it's *usable* in comparison to the livecd.
>
> [1] Something about installer CD's, and not wanting to ask the users
> any questions, not even what time zone they are in, or some other
> craziness.  I never completely understood the argument and their
> design constraints.
Idiot friendliness and no exceptions to power users (e.g., bloated
init scripts, UUID fstab). I switched to debian-unstable ages ago
*just* because apt is _really_ easy to use. Which I use secondarily to
Gentoo, where things Just Work (tm), once you patch the package
ebuilds to process your .patch files anyway and, while the packages
have *lots* of patches, it doesn't bloat the code *and* you can
disable the patches with the "vanilla" USE flag.

-- 
Andrey Vul

Re: [PATCH] SH/Dreamcast - add support for GD-Rom CDROM drive on SEGA Dreamcast

2008-01-12 Thread Andrew Morton

On Sat, 12 Jan 2008 14:14:01 + Adrian McMenamin <[EMAIL PROTECTED]> wrote:

> 
> > > + spin_command->cmd[0] = 0x70;
> > > + spin_command->cmd[2] = 0x1f;
> > > + spin_command->buflen = 0;
> > > + gd.pending = 1;
> > > + gdrom_packetcommand(gd.cd_info, spin_command);
> > > + /* 60 second timeout */
> > > + wait_event_interruptible_timeout(command_queue, gd.pending == 0, HZ * 
> > > 60);
> > > + gd.pending = 0;
> > > + kfree(spin_command);
> > > + if (gd.status & 0x01) {
> > > + /* log an error */
> > > + gdrom_getsense(NULL);
> > > + return -EIO;
> > > + }
> > > + return 0;
> > > +}
> > 
> > If the wait_event_interruptible_timeout() indeed times out, we go ahead and
> > free spin_command.  But someone else could potentially be using it. 
> > 
> > Suppose gdrom_packetcommand() got stuck for a minute due to bad hardware,
> > or some SCHED_FIFO task preempting us here and running for 61 seconds 
> > without
> > yielding or something similarly weird.
> > 
> 
> 
> Maybe I am being stupid here, but I don't follow this. They'll get a
> non-fatal error, that's all. Who else would be using spin_command? It's
> just a series of bytes to plug into the GD Rom registers, that's all.
> 

After programming the registers we need to wait for the interrupt to clear
gd.pending, don't we?



oh, I see.  gd is a global singleton and we only support one command at a
time and one device.  hrm.

> > > +
> > > +static int __devinit gdrom_set_interrupt_handlers(void)
> > > +{
> > > + int err;
> > > + init_waitqueue_head(&command_queue);
> > > + err = request_irq(HW_EVENT_GDROM_CMD, gdrom_command_interrupt, 
> > > IRQF_DISABLED, "gdrom_command", &gd);
> > > + if (err)
> > > + return err;
> > > + init_waitqueue_head(&request_queue);
> > 
> > You can initialise command_queue and request_queue at compile-time with
> > DECLARE_WAIT_QUEUE_HEAD().
> > 
> 
> Are you saying that is better?

Yup.  Less source code, less object code, no startup-ordering issues.



Re: [alsa-devel] PNP_DRIVER_RES_DISABLE breaks swsusp at least with snd_cs4236

2008-01-12 Thread Rafael J. Wysocki

On Saturday, 12 of January 2008, Rene Herman wrote:
> On 12-01-08 16:21, Pierre Ossman wrote:
> 
> > Ah, sorry. It was a different thread. Look for a mail with the subject 
> > "PNP: do not stop/start devices in suspend/resume path" in the LKML or 
> > linux-pm archives.
> 
> Right, and I see that the removal of start/stop is already in -mm. That's 
> not going to work. Something (such as removing power) disabled Ondrej's 
> CS4236 and the pnp_start_dev() is needed to re-enable it upon resume.
> 
> >> But we certainly need the pnp_start_dev() in the current flow of
> >> things. It not being called is the problem this fixes...
> > 
> > I think the previous suggestion was that the drivers should call this,
> > not the core, so that it behaved more like other parts of the kernel
> > (e.g. PCI).
> 
> It seems all PnP drivers would need to stick a pnp_start_dev in their resume 
> method

Yes.

> then which means it really belongs in core.

Yes, if practical.

> One important point where PnP and PCI differ is that PnP allows to change the
> resources on a protocol level and I don't see how it could ever not be
> necessary to restore the state a user may have set if power has been
> removed. Hibernate is just that, isn't it?

Basically, yes, it is.

Thanks,
Rafael