Re: [patch] CFS scheduler, -v6
* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> I don't know if Mike still has problems with SD, but there are now
> several interesting reports of SD giving better feedback than CFS on
> real work. In my experience, CFS seems smoother on *technical* tests,
> which I agree that they do not really simulate real work.

well, there are several reports of CFS being significantly better than
SD on a number of workloads - and i know of only two reports where SD
was reported to be better than CFS: in Kasper's test (where i'd like to
know what the "3D stuff" he uses is and take a good look at that
workload), and another 3D report which was done against -v6. (And even
in these two reports the 'smoothness advantage' was not dramatic. If you
know of any other reports then please let me know!)

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: [patch] CFS scheduler, -v7
* S.Çağlar Onur <[EMAIL PROTECTED]> wrote:

> Ingo, please ignore my first report until i found a proper way to
> reproduce the slowness cause currently CFS-v7, CFS-v7 + "renice
> patch", CFS-v7 + renice + your private mail suggestions and CFS-v6 +
> "PI support for futexes patch" seems works equally (which is a good
> thing so X renicing seems really not needed, [...]

oh, good!

> [...] and there were no regression instead of my daydreams) or im too
> tired to understand the differences.

could the CPU have dropped speed for that bootup (some CPUs do that
automatically upon overheating), or perhaps if you are using some RAID
array, could it have done a background resync? Especially the bootup
slowdown you saw seemed significant, and because bootup speed is 90% IO
dominated, the CPU scheduler seems an unlikely candidate.

	Ingo
Re: [2.6.21-git2] sk_buff changes break Cisco VPN client
On 4/29/07, David Miller <[EMAIL PROTECTED]> wrote:
> From: Roland Dreier <[EMAIL PROTECTED]>
> Date: Sat, 28 Apr 2007 14:05:27 -0700
>
> > However I can suggest vpnc (http://www.unix-ag.uni-kl.de/~massar/vpnc/)
> > as an alternative. I'm not forced to use Cisco VPN access any more,
> > but when I tried it, vpnc was tons better than the Cisco product.
>
> Also, and I know this might be a COMPLETE SHOCK to some people, but we
> do have a full in-kernel IPSEC stack and using it with openswan to
> connect to VPNs works perfectly fine. I use it every day.
>
> It's quite amusing that people use a userland IPSEC implementation
> via VPNC, in spite of this.

Have a look here
https://lists.dulug.duke.edu/pipermail/dulug/2007-March/010792.html
where someone seems to be in the same boat as me. No, openswan/IPSEC
does not work in all configurations with Cisco VPN concentrators - and
by Murphy's law, I'm in the non-working configuration.

Hope this clarifies the reason I asked. And if anyone out there is in
doubt about it, I absolutely hate having to rebuild an out-of-kernel
module every time I build a kernel, crossing fingers it doesn't break
(again). I would use *anything* else.

--alessandro

 "Did you get married but forgot to get divorced ?"

    (Danny and Dusty, 'The Good Old Days')
Re: High Resolution Timer DOS
* Lee Revell <[EMAIL PROTECTED]> wrote:

> > Well, it is not really a DoS. The rescheduling of the process is
> > limited by the scheduler and the available CPU time (depending on
> > the number of runnable tasks in the system).
>
> Shouldn't an unprivileged process be rate limited somehow to avoid
> flooding the machine with interrupts? We restrict nonroot users from
> setting the RTC interrupt rate higher than 64Hz for a similar reason
> (granted, this limit dates back to the 486 days and should probably be
> increased to 1024 Hz).

No. An interrupt in this case is really just 'CPU time used up', and an
unprivileged process can take up as much CPU time as the scheduler
allows. So it's _not_ a DoS, and neither is any other unprivileged
infinite loop (or high-rate context-switching task) a DoS.

	Ingo
Re: [patch] CFS scheduler, -v6
On Sun, Apr 29, 2007 at 08:59:01AM +0200, Ingo Molnar wrote:
>
> * Willy Tarreau <[EMAIL PROTECTED]> wrote:
>
> > I don't know if Mike still has problems with SD, but there are now
> > several interesting reports of SD giving better feedback than CFS on
> > real work. In my experience, CFS seems smoother on *technical* tests,
> > which I agree that they do not really simulate real work.
>
> well, there are several reports of CFS being significantly better than
> SD on a number of workloads - and i know of only two reports where SD
> was reported to be better than CFS: in Kasper's test (where i'd like to
> know what the "3D stuff" he uses is and take a good look at that
> workload), and another 3D report which was done against -v6. (And even
> in these two reports the 'smoothness advantage' was not dramatic. If you
> know of any other reports then please let me know!)

There was Caglar Onur too but he said he will redo all the tests. I'm
not tracking all tests nor versions, so it might be possible that some
of the differences vanish with v7.

In fact, what I'd like to see in 2.6.22 is something better for everybody
and with *no* regression, even if it's not perfect. I had the feeling
that SD matched that goal right now, except for Mike who has not tested
recent versions. Don't get me wrong, I still think that CFS is a more
interesting long-term target. But it may require more time to satisfy
everyone. At least with one of them in 2.6.22, we won't waste time
comparing to current mainline.

> 	Ingo

Willy
Re: Linux 2.6.21
On Sat, 28 Apr 2007, David Miller wrote:

> From: "Markus Rechberger" <[EMAIL PROTECTED]>
> Date: Sun, 29 Apr 2007 00:58:09 +0200
>
>> On 4/29/07, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>>> On Sat, 28 Apr 2007, Adrian Bunk wrote:
>>>> We are already quite good at ignoring bug reports that come through
>>>> linux-kernel, and it's an _advantage_ of the kernel Bugzilla to see
>>>> more than 1600 open bugs because this tells how bad we are at
>>>> handling bugs.
>>>
>>> No, it just shows that bugzilla doesn't matter for most of the kernel.
>>>
>>> Don't say that "bugzilla tells how bad we are at handling bugs". It
>>> tells how bad *bugzilla* is for handling bugs, nothing more.
>>
>> I totally disagree here, bugzilla is a very good tool.
>
> No, Bugzilla really does suck, and I personally refuse to use it when
> I have a choice. And guess what? You better be concerned about that
> because I maintain all of the networking code :-)
>
> It puts the onus FAR too much on the developer and not enough on the
> reporter and other minions. We have a small resource of developers,
> yet lots of users, bug reporters, and minions, so something that
> doesn't take advantage of the larger resource we have is going to not
> function efficiently at all. Yet that is what bugzilla does.

I'll say that as a user I hate having to deal with bugzilla.

There's nothing more frustrating than spending a good chunk of time
trying to find a similar bug, then jumping through all the bugzilla
hoops to file a report, only to get a message (days or weeks later)
saying 'closed because it's a duplicate report', then having to go and
track down what it's a duplicate of and read through that bug report,
only to find that it's not solved there either. To top it off, the
people working on that bug won't see my report or that I'm available to
troubleshoot it.

From a user's point of view, e-mailing the kernel list (and retrying a
few days later if there is no response) tends to work _much_ better.

David Lang
Re: [patches] [PATCH] [21/22] x86_64: Extend bzImage protocol for relocatable bzImage
Eric W. Biederman wrote:
> All it does is set a flag that tells a bootloader.
> "Hey. I can run when loaded a non-default address, and this is what
> you have to align me to."
>
> All relocation processing happens in the kernel itself.
>

Is it possible to decompress and extract the kernel image from the
bzImage without executing it? Ie, is there enough information to find
the compressed data part of the bzImage by inspection?

At some point we'll need to change the Xen domain builder to handle
bzImage files, and it would be best if we didn't need to run them.

    J
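As an illustration of what "by inspection" could cover, here is a minimal C sketch that reads the setup-header fields a loader would look at, without executing anything. The offsets are the ones listed in the kernel's x86 boot protocol documentation (boot flag at 0x1fe, "HdrS" magic at 0x202, version at 0x206, kernel_alignment at 0x230 and relocatable_kernel at 0x234 from protocol 2.05 on). The names `struct bz_info` and `bz_parse_header` are hypothetical, and finding the compressed payload itself is deliberately left out of the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the header fields of interest. */
struct bz_info {
	uint16_t version;          /* boot protocol version, e.g. 0x0205 */
	uint32_t setup_bytes;      /* size of the real-mode setup part */
	int      relocatable;      /* non-zero if the kernel may be moved */
	uint32_t kernel_alignment; /* alignment required when relocated */
};

/* Little-endian readers, independent of host byte order. */
static uint16_t rd16(const unsigned char *p)
{
	return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf does not look like a bzImage. */
static int bz_parse_header(const unsigned char *buf, size_t len,
			   struct bz_info *out)
{
	unsigned int setup_sects;

	if (len < 0x238)
		return -1;
	if (rd16(buf + 0x1fe) != 0xAA55)
		return -1;
	if (memcmp(buf + 0x202, "HdrS", 4) != 0)
		return -1;

	memset(out, 0, sizeof(*out));
	out->version = rd16(buf + 0x206);

	/* A setup_sects value of 0 means 4, for backward compatibility. */
	setup_sects = buf[0x1f1];
	if (setup_sects == 0)
		setup_sects = 4;
	out->setup_bytes = (setup_sects + 1) * 512;

	/* The relocation fields only exist from protocol 2.05 onwards. */
	if (out->version >= 0x0205) {
		out->kernel_alignment = rd32(buf + 0x230);
		out->relocatable = buf[0x234] != 0;
	}
	return 0;
}
```

A domain builder could call this on the first kilobyte of the file and refuse images that report an older protocol version; the protected-mode part of the image starts `setup_bytes` into the file.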
Fwd: Re: [RFC] pata_icside driver
Resend.  Copying linux-ide as requested appears to result in being ignored. ;(

- Forwarded message from Russell King <[EMAIL PROTECTED]> -

Date: Sat, 21 Apr 2007 16:09:03 +0100
From: Russell King <[EMAIL PROTECTED]>
To: linux-kernel@vger.kernel.org, Andrew Morton <[EMAIL PROTECTED]>,
	Jeff Garzik <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: [RFC] pata_icside driver

On Sun, Apr 08, 2007 at 11:18:26AM +0100, Russell King wrote:
> Below is an initial attempt at converting the ICS IDE driver to fit
> into the PATA infrastructure.
>
> There's a number of FIXMEs in there: due to the hardware missing
> resistors on the interrupt signals from the drives, a port
> without any drives attached results in spurious interrupts being
> generated.
>
> To prevent this, we need to disable the interrupts from the port
> on the card if no drives are found, but unfortunately ATA doesn't
> call the "port_disable" method in this circumstance.

Here's an updated version. I've removed the correction of the cycle
time - since we're checking whether all of active, recovery and cycle
periods fit the hardware, the correction becomes unnecessary. I still
suggest that the PATA core folk consider fixing their timing calculation
function in that respect though.

This driver continues to have the so far ignored issue concerning
port_disable. It would be good to have some feedback on this instead
of this driver continuing to be crippled by the libata core code.
This really needs resolving before this driver can be merged, though
I'm not sure how.

diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index 7bdbe5a..9cd8a61 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -552,6 +552,14 @@ config PATA_PLATFORM
 
 	  If unsure, say N.
 
+config PATA_ICSIDE
+	tristate "Acorn ICS PATA support"
+	depends on ARM && ARCH_ACORN
+	help
+	  On Acorn systems, say Y here if you wish to use the ICS PATA
+	  interface card.  This is not required for ICS partition support.
+	  If you are unsure, say N to this.
+
 config PATA_IXP4XX_CF
 	tristate "IXP4XX Compact Flash support"
 	depends on ARCH_IXP4XX
diff --git a/drivers/ata/Makefile b/drivers/ata/Makefile
index 13d7397..cc8798b 100644
--- a/drivers/ata/Makefile
+++ b/drivers/ata/Makefile
@@ -61,6 +61,7 @@ obj-$(CONFIG_PATA_TRIFLEX)+= pata_triflex.o
 obj-$(CONFIG_PATA_IXP4XX_CF)	+= pata_ixp4xx_cf.o
 obj-$(CONFIG_PATA_SCC)		+= pata_scc.o
 obj-$(CONFIG_PATA_PLATFORM)	+= pata_platform.o
+obj-$(CONFIG_PATA_ICSIDE)	+= pata_icside.o
 # Should be last but one libata driver
 obj-$(CONFIG_ATA_GENERIC)	+= ata_generic.o
 # Should be last libata driver
diff --git a/drivers/ata/pata_icside.c b/drivers/ata/pata_icside.c
new file mode 100644
index 000..75b22da
--- /dev/null
+++ b/drivers/ata/pata_icside.c
@@ -0,0 +1,686 @@
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#define DRV_NAME	"pata_icside"
+
+#define ICS_IDENT_OFFSET		0x2280
+
+#define ICS_ARCIN_V5_INTRSTAT		0x
+#define ICS_ARCIN_V5_INTROFFSET		0x0004
+
+#define ICS_ARCIN_V6_INTROFFSET_1	0x2200
+#define ICS_ARCIN_V6_INTRSTAT_1		0x2290
+#define ICS_ARCIN_V6_INTROFFSET_2	0x3200
+#define ICS_ARCIN_V6_INTRSTAT_2		0x3290
+
+struct portinfo {
+	unsigned int dataoffset;
+	unsigned int ctrloffset;
+	unsigned int stepping;
+};
+
+static const struct portinfo pata_icside_portinfo_v5 = {
+	.dataoffset	= 0x2800,
+	.ctrloffset	= 0x2b80,
+	.stepping	= 6,
+};
+
+static const struct portinfo pata_icside_portinfo_v6_1 = {
+	.dataoffset	= 0x2000,
+	.ctrloffset	= 0x2380,
+	.stepping	= 6,
+};
+
+static const struct portinfo pata_icside_portinfo_v6_2 = {
+	.dataoffset	= 0x3000,
+	.ctrloffset	= 0x3380,
+	.stepping	= 6,
+};
+
+#define PATA_ICSIDE_MAX_SG	128
+
+struct pata_icside_state {
+	void __iomem *irq_port;
+	void __iomem *ioc_base;
+	unsigned int type;
+	unsigned int dma;
+	struct {
+		u8 port_sel;
+		u8 disabled;
+		unsigned int speed[ATA_MAX_DEVICES];
+	} port[2];
+	struct scatterlist sg[PATA_ICSIDE_MAX_SG];
+};
+
+#define ICS_TYPE_A3IN	0
+#define ICS_TYPE_A3USER	1
+#define ICS_TYPE_V6	3
+#define ICS_TYPE_V5	15
+#define ICS_TYPE_NOTYPE	((unsigned int)-1)
+
+/* Version 5 PCB Support Functions - */
+/* Prototype: pata_icside_irqenable_arcin_v5 (struct expansion_card *ec, int irqnr)
+ * Purpose  : enable interrupts from card
+ */
+static void pata_icside_irqenable_arcin_v5 (struct expansion_card *ec, int irqnr)
+{
+	struct pata_icside_state *state = ec->irq_data;
+
+	writeb(0, state->irq_port + ICS_ARCIN_V5_INTROFFSET);
+}
+
+/* Pro
Re: [patch] CFS scheduler, -v6
* Willy Tarreau <[EMAIL PROTECTED]> wrote:

> > know of any other reports then please let me know!)
>
> There was Caglar Onur too but he said he will redo all the tests.
> [...]

well, Caglar said CFSv7 works as well as CFSv6 in his latest tests and
that he'll redo all the tests to re-verify his original regression
report :)

> In fact, what I'd like to see in 2.6.22 is something better for
> everybody and with *no* regression, even if it's not perfect.
>
> I had the feeling that SD matched that goal right now, [...]

curious, which are the reports where in your opinion CFS behaves worse
than vanilla? There were two audio skipping reports against CFS, the
most serious one got resolved and i hope the other one has been resolved
by the same fix as well. (i'm still waiting for feedback on that one)

> [...] except for Mike who has not tested recent versions. [...]

actually, don't discount Mark Lord's test results either. And it might be
a good idea for Mike to re-test SD 0.46?

	Ingo
Re: Linux 2.6.21
On Sun, Apr 29, 2007 at 12:49:04AM +0200, Adrian Bunk wrote:
> On Sat, Apr 28, 2007 at 09:27:01PM +0100, Russell King wrote:
> > On Sat, Apr 28, 2007 at 09:53:20PM +0200, Adrian Bunk wrote:
> > > We are already quite good at ignoring bug reports that come through
> > > linux-kernel, and it's an _advantage_ of the kernel Bugzilla to see more
> > > than 1600 open bugs because this tells how bad we are at handling bugs.
> > > How many thousand bug reports have been ignored during the same time on
> > > linux-kernel?
> >
> > However, look at this bug:
> >
> > http://bugme.osdl.org/show_bug.cgi?id=7760
> >
> > It's outside my knowledge to be able to fix for various reasons:
> >...
> > I'm personally very tempted to close it as "won't fix" (I wish there was
> > a "can't fix" category.)
> >...
>
> So this is a completely debugged bug in a well-maintained subsystem
> (no matter what the status in Bugzilla is).

You're being very optimistic. I'm not sure where you get the idea that
it's "completely debugged". It isn't - I've no real idea what the problem
is, let alone what the solution might be. I've only one guess based upon
what is sane in the kernel, and that isn't even based on the data provided
in the bug report.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:
Re: [PATCH] crypto: Use padlock.ko only as a module
Hi Scott,

On Sunday 29 April 2007, Simon Arlott wrote:
> Ideally I'd just remove that module completely, all it does is
> trigger the loading of the other two modules when modules are
> used - so I'll submit a patch for that instead.

That's much better! When you force a feature to be a module on a kernel
without module support, it will effectively be disabled.

And if it is so simple to do the same in userspace like you suggest,
then that's much better.

Best Regards

Ingo Oeser
Re: Linux 2.6.21
On Sun, Apr 29, 2007 at 12:58:09AM +0200, Markus Rechberger wrote:
> I totally disagree here, bugzilla is a very good tool. If someone is
> too lazy to look at it it's his problem.

If you think so, try reading my email and responding constructively on
how the issues there can be resolved. That email contains good examples
where bugzilla fails, and bugs end up sitting around for ages untouched.
And no, it's not because I'm "lazy".

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:
[PATCH] reduce AER init error information
PCI-Express AER support in the kernel requires the BIOS to provide _OSC
support so that the AER root port service driver can request native
control of AER. If a root port supports the AER capability but the BIOS
doesn't provide _OSC for it, aerdriver prints a lot of debug information
to the system console. Below is a log example.

***Error information example
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
aer_init: AER service init fails - Run ACPI _OSC fails
aer: probe of :00:02.0:pcie01 failed with error 2
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
aer_init: AER service init fails - Run ACPI _OSC fails
aer: probe of :00:04.0:pcie01 failed with error 2
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
aer_init: AER service init fails - Run ACPI _OSC fails
aer: probe of :00:06.0:pcie01 failed with error 2
**End of Error information example**

As _OSC is an optional capability of the BIOS, such error information
looks overly verbose. The patch against kernel 2.6.21 changes it to
print just a one-line report message if aerdriver fails to attach the
root port service device. Below is an example of the new output.

AER service couldn't init device :00:02.0:pcie01 - no _OSC support

Signed-off-by: Zhang Yanmin <[EMAIL PROTECTED]>

---

diff -Nraup linux-2.6.21/drivers/pci/pci-acpi.c linux-2.6.21_aer/drivers/pci/pci-acpi.c
--- linux-2.6.21/drivers/pci/pci-acpi.c	2007-02-05 02:44:54.0 +0800
+++ linux-2.6.21_aer/drivers/pci/pci-acpi.c	2007-04-30 22:03:08.0 +0800
@@ -55,8 +55,6 @@ acpi_query_osc (
 
 	status = acpi_evaluate_object(handle, "_OSC", &input, &output);
 	if (ACPI_FAILURE (status)) {
-		printk(KERN_DEBUG
-			"Evaluate _OSC Set fails. Status = 0x%04x\n", status);
 		*ret_status = status;
 		return status;
 	}
@@ -124,11 +122,9 @@ acpi_run_osc (
 	in_params[3].buffer.pointer = (u8 *)context;
 
 	status = acpi_evaluate_object(handle, "_OSC", &input, &output);
-	if (ACPI_FAILURE (status)) {
-		printk(KERN_DEBUG
-			"Evaluate _OSC Set fails. Status = 0x%04x\n", status);
+	if (ACPI_FAILURE (status))
 		return status;
-	}
+
 	out_obj = output.pointer;
 	if (out_obj->type != ACPI_TYPE_BUFFER) {
 		printk(KERN_DEBUG
diff -Nraup linux-2.6.21/drivers/pci/pcie/aer/aerdrv_acpi.c linux-2.6.21_aer/drivers/pci/pcie/aer/aerdrv_acpi.c
--- linux-2.6.21/drivers/pci/pcie/aer/aerdrv_acpi.c	2007-02-05 02:44:54.0 +0800
+++ linux-2.6.21_aer/drivers/pci/pcie/aer/aerdrv_acpi.c	2007-04-30 21:06:43.0 +0800
@@ -27,10 +27,9 @@
  * Invoked when PCIE bus loads AER service driver. To avoid conflict with
  * BIOS AER support requires BIOS to yield AER control to OS native driver.
  **/
-int aer_osc_setup(struct pci_dev *dev)
+acpi_status aer_osc_setup(struct pci_dev *dev)
 {
-	int retval = OSC_METHOD_RUN_SUCCESS;
-	acpi_status status;
+	acpi_status status = AE_NOT_FOUND;
 	acpi_handle handle = DEVICE_ACPI_HANDLE(&dev->dev);
 	struct pci_dev *pdev = dev;
 	struct pci_bus *parent;
@@ -51,18 +50,12 @@ int aer_osc_setup(struct pci_dev *dev)
 	}
 
 	if (!handle)
-		return OSC_METHOD_NOT_SUPPORTED;
+		return status;
 
 	pci_osc_support_set(OSC_EXT_PCI_CONFIG_SUPPORT);
 	status = pci_osc_control_set(handle, OSC_PCI_EXPRESS_AER_CONTROL |
 		OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL);
 
-	if (ACPI_FAILURE(status)) {
-		if (status == AE_SUPPORT)
-			retval = OSC_METHOD_NOT_SUPPORTED;
-		else
-			retval = OSC_METHOD_RUN_FAILURE;
-	}
-	return retval;
+	return status;
 }
 
diff -Nraup linux-2.6.21/drivers/pci/pcie/aer/aerdrv_core.c linux-2.6.21_aer/drivers/pci/pcie/aer/aerdrv_core.c
--- linux-2.6.21/drivers/pci/pcie/aer/aerdrv_core.c	2007-02-05 02:44:54.0 +0800
+++ linux-2.6.21_aer/drivers/pci/pcie/aer/aerdrv_core.c	2007-04-30 22:33:16.0 +0800
@@ -733,19 +733,19 @@ void aer_delete_rootport(struct aer_rpc
  **/
 int aer_init(struct pcie_device *dev)
 {
-	int status;
+	acpi_status status;
 
 	/* Run _OSC Method */
 	status = aer_osc_setup(dev->port);
 
-	if(status != OSC_METHOD_RUN_SUCCESS) {
-		printk(KERN_DEBUG "%s: AER service init fails - %s\n",
-		__FUNCTION__,
-		(status == OSC_METHOD_NOT_SUPPORTED) ?
-			"No ACPI _OSC support" : "Run ACPI _OSC fails");
+	if (ACPI_FAILURE(status)) {
+		printk(KERN_DEBUG "AER service couldn't init device %s - %s\n",
+			dev->device.bus_id,
+			(status == AE_SUPPORT || status == AE_NOT_FOUND
Re: [patch] CFS scheduler, -v6
On Sun, Apr 29, 2007 at 09:30:30AM +0200, Ingo Molnar wrote:
> > In fact, what I'd like to see in 2.6.22 is something better for
> > everybody and with *no* regression, even if it's not perfect.
> >
> > I had the feeling that SD matched that goal right now, [...]
>
> curious, which are the reports where in your opinion CFS behaves worse
> than vanilla?

see below :-)

> There were two audio skipping reports against CFS, the
> most serious one got resolved and i hope the other one has been resolved
> by the same fix as well. (i'm still waiting for feedback on that one)

your answer to your question above ;-)

Yes, we're all waiting for feedback. And I said I did not track the
versions involved, so it is possible that all previously encountered
regressions are fixed by now.

> > [...] except for Mike who has not tested recent versions. [...]
>
> actually, dont discount Mark Lord's test results either. And it might be
> a good idea for Mike to re-test SD 0.46?

In any case, it might be a good idea because Mike encountered a problem
that nobody could reproduce. It may come from hardware, scheduler design,
scheduler bug, or any other bug, but whatever the cause, it would be
interesting to conclude on it.

> 	Ingo

Willy
Re: [PATCH 22/25] xen: xen-netfront: use skb.cb for storing private data
Herbert Xu wrote:
> BTW, the version I posted to you is missing the following line.
>
> --- linux-2.6.20.i386/drivers/xen/core/skbuff.c	2007-04-28 15:30:16.0 +1000
> +++ build-2.6.20.i386/drivers/xen/core/skbuff.c	2007-04-28 15:30:52.0 +1000
> @@ -89,6 +89,7 @@
>  	skb->h.raw = (unsigned char *)skb->nh.iph + 4*skb->nh.iph->ihl;
>  	if (skb->h.raw >= skb->tail)
>  		goto out;
> +	skb->csum_start = skb->h.raw - skb->head;
>  	switch (skb->nh.iph->protocol) {
>  	case IPPROTO_TCP:
>  		skb->csum_offset = offsetof(struct tcphdr, check);
>

drivers/xen/core/skbuff.c? What's that?

    J
Re: [DOC] Fix wrong identifier name in Documentation/driver-model/devres.txt
Jeff Garzik wrote:
> Now that devres is in the kernel, I don't think I am the best person to
> merge these sort of patches. Certainly I can, and I know the code from
> my original review and subsequent usage, but I think the patch is more
> appropriate for Greg, going through normal maintainership channels.
>
> IOW, I think devres is too generic to be queued via libata-dev.git.
>
> Tejun, comments?

I don't have a problem either way. If it's okay with Greg, I'll queue
future devres updates through Greg.

BTW, converting network drivers to devres is on my ever-growing todo
list and when those are done they will go through you, Jeff. :-)

-- 
tejun
Re: What's in infiniband.git for 2.6.22
> Quoting Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: What's in infiniband.git for 2.6.22
>
> > What about the mthca patch to use separate HW queues for kernel
> > RC/UD/userspace RC?
>
> right, I'll queue that up too.

I think you want to queue the following obvious bugfix up as well:
http://www.openfabrics.org/git/?p=~vlad/ofed_1_2/.git;a=blob;f=kernel_patches/fixes/ipoib_crash_on_error.patch;hb=HEAD

-- 
MST
Re: [patch] CFS scheduler, -v6
On Sun, Apr 29, 2007 at 09:16:27AM +0200, Willy Tarreau wrote:
> In fact, what I'd like to see in 2.6.22 is something better for everybody
> and with *no* regression, even if it's not perfect. I had the feeling
> that SD matched that goal right now, except for Mike who has not tested
> recent versions. Don't get me wrong, I still think that CFS is a more
> interesting long-term target. But it may require more time to satisfy
> everyone. At least with one of them in 2.6.22, we won't waste time
> comparing to current mainline.

I think it'd be a good idea to merge scheduler classes before changing
over the policy so future changes to policy have smaller code impact.
Basically, get scheduler classes going with the mainline scheduler.

There are other pieces that can be merged earlier, too, for instance,
the correction to the comment in init/main.c. Directed yields can
probably also go in as nops or -ENOSYS returns if not fully implemented,
though I suspect there shouldn't be much in the way of implementing them.
p->array vs. p->on_rq can be merged early too. Common code for
rbtree-based priority queues can be factored out of cfq, cfs, and
hrtimers. There are extensive /proc/ reporting changes, large chunks of
which could go in before the policy as well.

I'm camping in this weekend, so I'll see what I can eke out.

-- wli
Re: [patch] CFS scheduler, -v6
On Sun, 2007-04-29 at 08:59 +0200, Ingo Molnar wrote:
> * Willy Tarreau <[EMAIL PROTECTED]> wrote:
>
> > I don't know if Mike still has problems with SD, but there are now
> > several interesting reports of SD giving better feedback than CFS on
> > real work. In my experience, CFS seems smoother on *technical* tests,
> > which I agree that they do not really simulate real work.
>
> well, there are several reports of CFS being significantly better than
> SD on a number of workloads - and i know of only two reports where SD
> was reported to be better than CFS: in Kasper's test (where i'd like to
> know what the "3D stuff" he uses is and take a good look at that
> workload), and another 3D report which was done against -v6. (And even
> in these two reports the 'smoothness advantage' was not dramatic. If you
> know of any other reports then please let me know!)

I can tell you one thing: it's not just me that has observed the
smoothness in 3D stuff. After I tried rsdl first, I've had lots of
people try rsdl and subsequently SD because of the significant
improvement in smoothness, and they have all found the same results.

The stuff I have tested with in particular is Unreal Tournament 2004 and
World of Warcraft through wine, both running OpenGL and consuming all
the CPU time they can get.

What happens is simply that even when there's only that process, SD is
still smoother, but the difference is much larger once something else
starts. If the mail client starts fetching mail and runs some somewhat
demanding stuff like spamassassin, the only way you notice it is by the
drop in fps; smoothness is 100% intact with SD (of course, if you
started a HUGE load it would probably get so little CPU that it would
stutter). With every other scheduler you will notice immediate and
quite severe stuttering; in fact, to many it will seem intolerable.

I can tell you how I first noticed this. I was experimenting in ut2k4
with SD. Usually I always have to close my mail client, because when
spamassassin starts (nice 0) the game stutters quite a lot. While I was
playing I noticed some IO activity and work noises from my disk, but
that was all: no noticeable stutter or problems with the 3D. I couldn't
figure out why, and then discovered I had forgotten to close my mail
client, which I previously ALWAYS had to do.

If you have some ideas on how these problems might be fixed, I'd surely
try fixes and stuff, or collect some data if you need it to better
understand what's going on. But I suspect any somewhat demanding 3D
application will do, and the difference is so staggering that when you
see it in effect, you can't miss it.

>
> 	Ingo
>
Re: [2.6 patch] drivers/kvm/mmu.c: fix an if() condition
Adrian Bunk wrote:
> It might have worked in this case since PT_PRESENT_MASK is 1, but let's
> express this correctly.

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function
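The patch body is not quoted in this reply, so the following is only a sketch of the kind of condition the subject line and the remark about PT_PRESENT_MASK suggest: a logical NOT that binds to the variable before the bit mask is applied, rather than to the masked value. The function names are hypothetical, the exact expression in drivers/kvm/mmu.c is an assumption, and PT_PRESENT_MASK is taken to be bit 0 as stated above.

```c
#include <stdint.h>

#define PT_PRESENT_MASK (1ULL << 0)	/* bit 0, as stated in the thread */

/*
 * Assumed shape of the original test. C parses "!ent & PT_PRESENT_MASK"
 * as "(!ent) & PT_PRESENT_MASK"; the parentheses are spelled out here.
 */
static int pte_not_present_unparenthesized(uint64_t ent)
{
	return (int)((uint64_t)(!ent) & PT_PRESENT_MASK);
}

/* The intended test: is the present bit clear? */
static int pte_not_present_fixed(uint64_t ent)
{
	return !(ent & PT_PRESENT_MASK);
}
```

Because the mask is 1, both forms agree for an all-zero entry and for an entry with the present bit set. They disagree for a non-zero entry whose present bit is clear (for example 0x2), and with any mask other than bit 0 the first form would always evaluate to 0.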
[PATCH (v2)] crypto: Remove pointless padlock module
When this is compiled in it is run too early to do anything useful: [6.052000] padlock: No VIA PadLock drivers have been loaded. [6.052000] padlock: Using VIA PadLock ACE for AES algorithm. [6.052000] padlock: Using VIA PadLock ACE for SHA1/SHA256 algorithms. When it's a module it isn't doing anything special, the same functionality can be provided in userspace by "probeall padlock padlock-aes padlock-sha" in modules.conf if it is required. Signed-off-by: Simon Arlott <[EMAIL PROTECTED]> Cc: Herbert Xu <[EMAIL PROTECTED]> Cc: Michal Ludvig <[EMAIL PROTECTED]> --- On 29/04/07 02:59, Randy Dunlap wrote: Simon Arlott wrote: +depends on CRYPTO && X86_32 All of drivers/crypto/Kconfig already depends on CRYPTO, so just depends on X86_32 should be enough. Ok, I've changed this for geode too. On 29/04/07 08:28, Ingo Oeser wrote: On Sunday 29 April 2007, Simon Arlott wrote: > Ideally I'd just remove that module completely, all it does is > trigger the loading of the other two modules when modules are > used - so I'll submit a patch for that instead. That's much better! When you force a feature to be a module on a kernel without module support, it will effectivly be disabled. Well that's mostly the point - it shouldn't get compiled in - ever, but it also has other modules depending on it in Kconfig that shouldn't need to be modules. 
 drivers/crypto/Kconfig   |   16 ++--
 drivers/crypto/Makefile  |    1 -
 drivers/crypto/padlock.c |   58 --
 3 files changed, 3 insertions(+), 72 deletions(-)
 delete mode 100644 drivers/crypto/padlock.c

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index ff8c4be..f21fe66 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -1,10 +1,10 @@
 menu "Hardware crypto devices"
 
 config CRYPTO_DEV_PADLOCK
-	tristate "Support for VIA PadLock ACE"
+	bool "Support for VIA PadLock ACE"
 	depends on X86_32
 	select CRYPTO_ALGAPI
-	default m
+	default y
 	help
 	  Some VIA processors come with an integrated crypto engine
 	  (so called VIA PadLock ACE, Advanced Cryptography Engine)
@@ -14,16 +14,6 @@ config CRYPTO_DEV_PADLOCK
 	  The instructions are used only when the CPU supports them.
 	  Otherwise software encryption is used.
 
-	  Selecting M for this option will compile a helper module
-	  padlock.ko that should autoload all below configured
-	  algorithms. Don't worry if your hardware does not support
-	  some or all of them. In such case padlock.ko will
-	  simply write a single line into the kernel log informing
-	  about its failure but everything will keep working fine.
-
-	  If you are unsure, say M. The compiled module will be
-	  called padlock.ko
-
 config CRYPTO_DEV_PADLOCK_AES
 	tristate "PadLock driver for AES algorithm"
 	depends on CRYPTO_DEV_PADLOCK
@@ -55,7 +45,7 @@ source "arch/s390/crypto/Kconfig"
 
 config CRYPTO_DEV_GEODE
 	tristate "Support for the Geode LX AES engine"
-	depends on CRYPTO && X86_32 && PCI
+	depends on X86_32 && PCI
 	select CRYPTO_ALGAPI
 	select CRYPTO_BLKCIPHER
 	default m
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index 6059cf8..d070030 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -1,4 +1,3 @@
-obj-$(CONFIG_CRYPTO_DEV_PADLOCK) += padlock.o
 obj-$(CONFIG_CRYPTO_DEV_PADLOCK_AES) += padlock-aes.o
 obj-$(CONFIG_CRYPTO_DEV_PADLOCK_SHA) += padlock-sha.o
 obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o
diff --git a/drivers/crypto/padlock.c b/drivers/crypto/padlock.c
deleted file mode 100644
index d6d7dd5..000
--- a/drivers/crypto/padlock.c
+++ /dev/null
@@ -1,58 +0,0 @@
-/*
- * Cryptographic API.
- *
- * Support for VIA PadLock hardware crypto engine.
- *
- * Copyright (c) 2006 Michal Ludvig <[EMAIL PROTECTED]>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- */
-
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include "padlock.h"
-
-static int __init padlock_init(void)
-{
-	int success = 0;
-
-	if (crypto_has_cipher("aes-padlock", 0, 0))
-		success++;
-
-	if (crypto_has_hash("sha1-padlock", 0, 0))
-		success++;
-
-	if (crypto_has_hash("sha256-padlock", 0, 0))
-		success++;
-
-	if (!success) {
-		printk(KERN_WARNING PFX "No VIA PadLock drivers have been loaded.\n");
-		return -ENODEV;
-	}
-
-	printk(KERN_NOTICE PFX "%d drivers are available.\n", success);
-
-	return 0;
-}
-
-static void __exit padlock_fini(void)
-{
-}
-
-module_init(padlock_init);
-module_exit(padlock_fini);
Re: [patch] CFS scheduler, -v6
* Willy Tarreau <[EMAIL PROTECTED]> wrote: > > > [...] except for Mike who has not tested recent versions. [...] > > > > actually, dont discount Mark Lord's test results either. And it > > might be a good idea for Mike to re-test SD 0.46? > > In any case, it might be a good idea because Mike encountered a > problem that nobody could reproduce. [...] actually, Mark Lord too reproduced something similar to Mike's results. Please try those workloads yourself. Ingo
Re: [GIT PATCH] ACPI patches for 2.6.22
On Sun, 29 Apr 2007 01:02:33 -0400 Len Brown <[EMAIL PROTECTED]> wrote: > please pull from: > > git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6.git release > > This batch mostly updates the platform-specific drivers that use ACPI. > The EC and sbs changes are primarily cleanups. > There are no changes to the ACPICA core, except a single bugfix > that was related to a 2.6.21 boot regression on some older machines. > And then the usual mix of random tweaks. There might still be a few regressions in this lot: - Miles Lane's "2.6.21-rc7-mm2 -- gnome-power-manager always shows the power as coming from AC" - "battery caching introduces a lock up" http://bugzilla.kernel.org/show_bug.cgi?id=8351 These are older and might have been fixed: - Mat Mackall's "Thinkpads not waking up on lid open with -rc6-mm1" - Helge Hafting's "2.6.21-rc3-mm2 hangs my opteron during bootup, ACPI?"
Re: [patch] CFS scheduler, -v6
On Sun, Apr 29, 2007 at 10:00:28AM +0200, Ingo Molnar wrote: > > * Willy Tarreau <[EMAIL PROTECTED]> wrote: > > > > > [...] except for Mike who has not tested recent versions. [...] > > > > > > actually, dont discount Mark Lord's test results either. And it > > > might be a good idea for Mike to re-test SD 0.46? > > > > In any case, it might be a good idea because Mike encountered a > > problem that nobody could reproduce. [...] > > actually, Mark Lord too reproduced something similar to Mike's results. OK. > Please try those workloads yourself. Unfortunately, I do not have their tools, environments nor hardware. That's the advantage of having multiple testers ;-) > Ingo Willy
Re: 2.6.21-rc7-mm2
On Thu, 26 Apr 2007, Randy Dunlap wrote: > On Thu, 26 Apr 2007 13:37:20 -0700 Andrew Morton wrote: > > On Thu, 26 Apr 2007 13:47:14 +0200 Gabriel C <[EMAIL PROTECTED]> wrote: > > > Andrew Morton wrote: > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc7/2.6.21-rc7-mm2/ > > > > > > > > > > > > - this has everything which is in 2.6.21. Plus more! > > > > > > > > - a number of nasty bugs were fixed. This should be (a lot) more stable > > > > than 2.6.21-rc7-mm1. > > > > > > > > > > > > > > I get this warning here : > > > > > > > > > drivers/net/Kconfig:2327:warning: 'select' used by config symbol > > > 'UCC_GETH' refer to undefined symbol 'UCC_FAST' > > > > Yes, we get so many of those that I tend to ignore them, assuming that > > someone will pick it up and fix it. > > > > This one was added by git-powerpc, presumably > > 7d776cb596994219584257eb5956b87628e5deaf "QE: automatically select QE > > options" > > There was a similar problem with PS3_xyz (don't recall exactly > which PS3_option it was) that was "solved" by introducing an > intermediate config symbol IIRC. Maybe that trick^W fix can be > done here also... CONFIG_PS3_ADVANCED, commit 3f555c700b6c90f9ac24bc81a4f509583d906278 Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- Sony Network and Software Technology Center Europe (NSCE) [EMAIL PROTECTED] --- Sint-Stevens-Woluwestraat 55 Voice +32-2-2908453 Fax +32-2-7262686 B-1130 Brussels, Belgium
Re: [patch] CFS scheduler, -v6
* William Lee Irwin III <[EMAIL PROTECTED]> wrote: > I think it'd be a good idea to merge scheduler classes before changing > over the policy so future changes to policy have smaller code impact. > Basically, get scheduler classes going with the mainline scheduler. i've got a split up patch for the class stuff already, but lets first get some wider test-coverage before even thinking about upstream integration. This is all v2.6.22 stuff at the earliest. Ingo
Re: [patch] CFS scheduler, -v6
* Kasper Sandberg <[EMAIL PROTECTED]> wrote: > If you have some ideas on how these problems might be fixed i'd surely > try fixes and stuff, or if you have some data you need me to collect > to better understand whats going on. But i suspect any somewhat > demanding 3d application will do, and the difference is so staggering > that when you see it in effect, you cant miss it. it would be great if you could try a simple experiment: does something as simple as glxgears resized to a large window trigger this 'stuttering' phenomenon when other stuff is running? If not, could you try to find the simplest 3D stuff under Linux that already triggers it so that i can reproduce it? (Also, as an independent debug-test, could you try CONFIG_PREEMPT too perhaps? I.e. is this 'stuttering' behavior independent of the preemption model and a general property of CFS?) Ingo
Re: [PATCH 22/25] xen: xen-netfront: use skb.cb for storing private data
On Sun, Apr 29, 2007 at 12:43:33AM -0700, Jeremy Fitzhardinge wrote:
> Herbert Xu wrote:
> > BTW, the version I posted to you is missing the following line.
> >
> > --- linux-2.6.20.i386/drivers/xen/core/skbuff.c 2007-04-28 15:30:16.0 +1000
> > +++ build-2.6.20.i386/drivers/xen/core/skbuff.c 2007-04-28 15:30:52.0 +1000
> > @@ -89,6 +89,7 @@
> >  	skb->h.raw = (unsigned char *)skb->nh.iph + 4*skb->nh.iph->ihl;
> >  	if (skb->h.raw >= skb->tail)
> >  		goto out;
> > +	skb->csum_start = skb->h.raw - skb->head;
> >  	switch (skb->nh.iph->protocol) {
> >  	case IPPROTO_TCP:
> >  		skb->csum_offset = offsetof(struct tcphdr, check);
> >
>
> drivers/xen/core/skbuff.c? What's that?

It's part of the skb_checksum_setup function which we still need for this because the current netback protocol doesn't pass the csum_start and csum_offset values along.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
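The role of the two values Herbert mentions can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct fake_skb`, `struct fake_tcphdr`, `fake_checksum_setup()` and `fake_csum_location()` are hypothetical stand-ins, not kernel definitions. The only things taken from the quoted patch are that `csum_start` is the transport header's offset from the buffer head and that `csum_offset` is the offset of the checksum field inside the TCP header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fixed part of a TCP header. */
struct fake_tcphdr {
	uint16_t source;
	uint16_t dest;
	uint32_t seq;
	uint32_t ack_seq;
	uint16_t flags;
	uint16_t window;
	uint16_t check;     /* the checksum field the offsets point at */
	uint16_t urg_ptr;
};

/* Hypothetical stand-in for the few sk_buff fields discussed above. */
struct fake_skb {
	unsigned char *head;       /* start of the buffer */
	unsigned char *transport;  /* plays the role of skb->h.raw */
	size_t csum_start;         /* transport header offset from head */
	size_t csum_offset;        /* checksum field offset within that header */
};

/* Mirrors the two assignments in the quoted hunk for the TCP case. */
static void fake_checksum_setup(struct fake_skb *skb)
{
	skb->csum_start = (size_t)(skb->transport - skb->head);
	skb->csum_offset = offsetof(struct fake_tcphdr, check);
}

/* Where a later checksumming step would store the result. */
static unsigned char *fake_csum_location(const struct fake_skb *skb)
{
	return skb->head + skb->csum_start + skb->csum_offset;
}
```

With a 14-byte Ethernet header and a 20-byte IP header in front of the TCP header, `csum_start` comes out as 34 and the checksum lands at byte 50 of the buffer. Without `csum_start`, a consumer that only sees `csum_offset` has no base to add it to, which is presumably why the missing line mattered.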
Re: [patch] CFS scheduler, -v6
On Sun, 2007-04-29 at 09:16 +0200, Willy Tarreau wrote: > In fact, what I'd like to see in 2.6.22 is something better for everybody > and with *no* regression, even if it's not perfect. I had the feeling > that SD matched that goal right now, except for Mike who has not tested > recent versions. Don't get me wrong, I still think that CFS is a more > interesting long-term target. While I haven't tested recent SD versions, unless its design has radically changed recently, I know what to expect. CFS is giving me a very high quality experience already (it's at a whopping v7), while RSDL/SD irritated me greatly at version v40. As far as I'm concerned, CFS is the superior target, short-term, long-term whatever-term. For the tree where I make the decisions, the hammer has fallen, and RSDL/SD is history. Heck, I'm _almost_ ready to rm -rf my own scheduler trees as well... I could really use some free disk space. -Mike
Re: [patch] CFS scheduler, -v6
On Sun, Apr 29, 2007 at 12:54:36AM -0700, William Lee Irwin III wrote: > On Sun, Apr 29, 2007 at 09:16:27AM +0200, Willy Tarreau wrote: > > In fact, what I'd like to see in 2.6.22 is something better for everybody > > and with *no* regression, even if it's not perfect. I had the feeling > > that SD matched that goal right now, except for Mike who has not tested > > recent versions. Don't get me wrong, I still think that CFS is a more > > interesting long-term target. But it may require more time to satisfy > > everyone. At least with one of them in 2.6.22, we won't waste time > > comparing to current mainline. > > I think it'd be a good idea to merge scheduler classes before changing > over the policy so future changes to policy have smaller code impact. > Basically, get scheduler classes going with the mainline scheduler. > > There are other pieces that can be merged earlier, too, for instance, > the correction to the comment in init/main.c. Directed yields can > probably also go in as nops or -ENOSYS returns if not fully implemented, > though I suspect there shouldn't be much in the way of implementing them. > p->array vs. p->on_rq can be merged early too. I agree that merging some framework is a good way to proceed. > Common code for rbtree-based priority queues can be factored out of > cfq, cfs, and hrtimers. In my experience, rbtrees are painfully slow. Yesterday, I spent the day replacing them in haproxy with other trees I developed a few years ago, which look like radix trees. They are about 2-3 times as fast to insert 64-bit data, and you walk through them in O(1). I have many changes to apply to them before they could be used in the kernel, but at least I think we already have code available for other types of trees. > There are extensive /proc/ reporting changes, large chunks of which > could go in before the policy as well. > > I'm camping in this weekend, so I'll see what I can eke out. Good luck!
Willy
Re: vmstat: use our own timer events
On Sat, 28 Apr 2007 22:09:04 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > vmstat is currently using the cache reaper to periodically bring the > statistics up to date. The cache reaper does only exists in SLUB > as a way to provide compatibility with SLAB. This patch removes > the vmstat calls from the slab allocators and provides its own > handling. > > The advantage is also that we can use a different frequency for the > updates. Refreshing vm stats is a pretty fast job so we can run this > every second and stagger this by only one tick. This will lead to > some overlap in large systems. F.e a system running at 250 HZ with > 1024 processors will have 4 vm updates occurring at once. > > However, the vm stats update only accesses per node information. > It is only necessary to stagger the vm statistics updates per > processor in each node. Vm counter updates occurring on distant > nodes will not cause cacheline contention. > > We could implement an alternate approach that runs the first processor > on each node at the second and then each of the other processor on a > node on a subsequent tick. That may be useful to keep a large amount > of the second free of timer activity. Maybe the timer folks will have > some feedback on this one? The one-per-second timer interrupt will upset the people who are really aggressive about power consumption (eg, OLPC). Perhaps there isn't (yet) an intersection between those people and SMP. However a knob to set the frequency would be nice, if it's not too expensive to implement. Presumably anyone who cares enough will come along and add one, but then they have to wait for a long period for that change to propagate out to their users, which is a bit sad for something which we already knew about. Having each CPU touch every zone looks a bit expensive - I'd have thought that it would be showing up a little on your monster NUMA machines? 
> @@ -648,11 +664,21 @@ static int __cpuinit vmstat_cpuup_callba
>  		unsigned long action,
>  		void *hcpu)
>  {
> +	long cpu = (long)hcpu;
> +
>  	switch (action) {
> -	case CPU_UP_PREPARE:
> -	case CPU_UP_PREPARE_FROZEN:
> -	case CPU_UP_CANCELED:
> -	case CPU_UP_CANCELED_FROZEN:
> +	case CPU_ONLINE:
> +	case CPU_ONLINE_FROZEN:
> +		start_cpu_timer(cpu);
> +		break;
> +	case CPU_DOWN_PREPARE:
> +	case CPU_DOWN_PREPARE_FROZEN:
> +		cancel_rearming_delayed_work(&per_cpu(vmstat_work, cpu));
> +		per_cpu(vmstat_work, cpu).work.func = NULL;
> +	case CPU_DOWN_FAILED:
> +	case CPU_DOWN_FAILED_FROZEN:
> +		start_cpu_timer(cpu);
> +		break;
>  	case CPU_DEAD:
>  	case CPU_DEAD_FROZEN:
>  		refresh_zone_stat_thresholds();

Oh dear. Some of these new notifier types are added by a patch which is a few hundred patches later than slub. I can park this patch after that one, but that introduces a risk that later slub patches will also get disconnected. Oh well, we'll see how things go.
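The overlap figure in Christoph's description ("a system running at 250 HZ with 1024 processors will have 4 vm updates occurring at once") follows from simple modular arithmetic. A rough userspace sketch, assuming each CPU's once-per-second update is offset by one tick from the previous CPU's and wraps around after HZ ticks; the helper names `stagger_slot()` and `cpus_in_slot()` are hypothetical and do not appear in the patch:

```c
/* Hypothetical helper: tick slot (0 .. hz-1) within each second at which
 * a given CPU's update fires, assuming a stagger of one tick per CPU. */
static unsigned int stagger_slot(unsigned int cpu, unsigned int hz)
{
	return cpu % hz;
}

/* Hypothetical helper: how many of ncpus CPUs share one tick slot. */
static unsigned int cpus_in_slot(unsigned int slot, unsigned int ncpus,
				 unsigned int hz)
{
	unsigned int cpu, count = 0;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (stagger_slot(cpu, hz) == slot)
			count++;
	return count;
}
```

Under these assumptions, 1024 CPUs at HZ=250 put 4 CPUs into most slots and 5 into the first 24 (1024 = 4 * 250 + 24), which matches the "4 at once" estimate. Staggering only within each node, as the quoted text suggests, would shrink `ncpus` in this calculation to the number of CPUs per node.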
bad_page from quicklist patches
I am getting many 'bad_page' failures from the quicklist patches in 2.6.21-rc7-mm1. I have bisected the problem down the following patches: quicklists-for-page-table-pages.patch quicklists-for-page-table-pages-avoid-useless-virt_to_page-conversion.patch quicklist-support-for-ia64.patch quicklist-support-for-x86_64.patch quicklist-support-for-sparc64.patch This is on an ia64, compiled with the sn2_defconfig configuration. If I have these quicklist patches included in the build, then starting with the init process entering user land, I start getting many complaints such as the following. I don't know if it ever actually gets past the init scripts to finish userland booting, due the large number of such complaints, and due to another problem that I haven't looked into yet, involving my boot hanging after the "Setting up service network" message is displayed during the init script sequence. But ... back to this problem ... here is the boot output, up through the first few such 'bad_page' failures that I see when booting with a kernel that includes the above quicklist patches: === ELILO Uncompressing Linux... 
done Linux version 2.6.21-rc7-mm1 ([EMAIL PROTECTED]) (gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)) #12 SMP PREEMPT Sun Apr 29 00:43:14 PDT 2007 EFI v1.10 by INTEL: SALsystab=0x3002a092f0 ACPI 2.0=0x3002a09ac0 ACPI: RSDP 3002A09AC0, 0024 (r2SGI) ACPI: XSDT 3002A09B00, 0044 (r1SGI XSDTSN2100011) ACPI: APIC 3002A09B60, 008C (r1SGI APICSN2100011) ACPI: SRAT 3002A09C00, 0150 (r1SGI SRATSN2100011) ACPI: SLIT 3002A09D60, 003C (r1SGI SLITSN2100011) ACPI: FACP 3002A09E40, 00F4 (r3SGI FACPSN2300011) ACPI: DSDT 3002A09E00, 0024 (r2SGI DSDTSN2200011) ACPI: FACS 3002A09DB0, 0040 Number of logical nodes in system = 4 Number of memory chunks in system = 4 SAL 2.9: SGI SN2 version 4.50 SAL Platform features: ITC_Drift SAL: AP wakeup using external interrupt vector 0x12 No logical to physical processor mapping available SAL_CAL_FLUSH failed with -1 ACPI: Local APIC address c000fee0 ACPI: Error parsing MADT - no IOSAPIC entries register_intr: No IOSAPIC for GSI 52 8 CPUs available, 8 CPUs total Increasing MCA rendezvous timeout from 2 to 49000 milliseconds MCA related initialization done ACPI: RSDP 3002A09AC0, 0024 (r2SGI) ACPI: XSDT 3002A09B00, 0044 (r1SGI XSDTSN2100011) ACPI: APIC 3002A09B60, 008C (r1SGI APICSN2100011) ACPI: SRAT 3002A09C00, 0150 (r1SGI SRATSN2100011) ACPI: SLIT 3002A09D60, 003C (r1SGI SLITSN2100011) ACPI: FACP 3002A09E40, 00F4 (r3SGI FACPSN2300011) ACPI: DSDT 3002A09E00, 0024 (r2SGI DSDTSN2200011) ACPI: FACS 3002A09DB0, 0040 SGI SAL version 4.50 Virtual mem_map starts at 0xa0007ffe85c8 Zone PFN ranges: Normal 12585984 -> 113307648 Movable zone start PFN for each node early_node_map[7] active PFN ranges 0: 12585984 -> 12709887 1: 46140416 -> 46264320 2: 79694848 -> 79753216 3: 113249280 -> 113306111 3: 113307136 -> 113307481 3: 113307496 -> 113307524 3: 113307552 -> 113307560 Built 4 zonelists, mobility grouping on. 
Total pages: 362143 Kernel command line: BOOT_IMAGE=scsi1:\efi\SuSE\vmlinuz.pj6 root=/dev/sda5 console=ttySG0 splash=silent thash_entries=2097152 kdb=on ro PID hash table entries: 4096 (order: 12, 32768 bytes) Console: colour dummy device 80x25 Memory: 5644256k/5796608k available (7960k code, 169936k reserved, 6062k data, 1664k init) McKinley Errata 9 workaround not needed; disabling it SLUB: General Slabs=20, HW alignment=128, Processors=8, Nodes=1024 Dentry cache hash table entries: 1048576 (order: 9, 8388608 bytes) Inode-cache hash table entries: 524288 (order: 8, 4194304 bytes) Mount-cache hash table entries: 1024 ACPI: Core revision 20070126 Boot processor id 0x0/0x0 Brought up 8 CPUs Total of 8 processors activated (15564.80 BogoMIPS). migration_cost=5743,44436 DMI not present or invalid. NET: Registered protocol family 16 ACPI DSDT OEM Rev 0x20001 ACPI: bus type pci registered ACPI: SCI (ACPI GSI 52) not registered ACPI: Interpreter enabled ACPI: Using IOSAPIC for interrupt routing Linux Plug and Play Support v0.97 (c) Adam Belay pnp: PnP ACPI init pnp: PnP ACPI: found 0 devices SCSI subsystem initialized NET: Registered protocol family 2 IP route cache hash table entries: 262144 (order: 7, 2097152 bytes) TCP established hash table entries: 2097152 (order: 11, 50331648 bytes) TCP bind hash table entries: 65536 (order: 6, 1048576 bytes) TCP: Hash tables configured (established 2097152 bind 65536) TCP reno registered perfmon: version 2.0 IRQ 238 perfmon: Itanium 2 PMU detected, 16 PMCs, 18 PMDs, 4 counters (47 bits) PAL Information Facility v0.5 perfmon: added sampling format d
Re: [patch] CFS scheduler, -v6
* William Lee Irwin III <[EMAIL PROTECTED]> wrote: >> I think it'd be a good idea to merge scheduler classes before changing >> over the policy so future changes to policy have smaller code impact. >> Basically, get scheduler classes going with the mainline scheduler. On Sun, Apr 29, 2007 at 10:03:59AM +0200, Ingo Molnar wrote: > i've got a split up patch for the class stuff already, but lets first > get some wider test-coverage before even thinking about upstream > integration. This is all v2.6.22 stuff at the earliest. I'd like to get some regression testing (standard macrobenchmarks) in on the scheduler class bits in isolation, as they do have rather non-negligible impacts on load balancing code, to changes in which such macrobenchmarks are quite sensitive. This shouldn't take much more than kicking off a benchmark on an internal box at work already set up to do such testing routinely. I won't need to write any fresh testcases etc. for it. Availability of the test systems may have to wait until Monday, since various people not wanting benchmarks disturbed are likely to be out for the weekend. It would also be beneficial for the other schedulers to be able to standardize on the scheduling class framework as far in advance as possible. In such a manner comparative testing by end-users and more industrial regression testing can be facilitated. -- wli
Re: [PATCH] mm/memory.c: remove warning from an uninitialized spinlock. was: Re: 2.6.21-rc7-mm2
On Sun, 29 Apr 2007 08:50:49 +0200 Borislav Petkov <[EMAIL PROTECTED]> wrote:
>
> Introduce a macro for suppressing gcc from generating a warning about a probable
> unitialized state of a variable.
>
> Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
>
> ---
>
> Index: linux-mm/include/linux/compiler.h
> ===
> --- linux-mm.orig/include/linux/compiler.h
> +++ linux-mm/include/linux/compiler.h
> @@ -109,6 +109,10 @@ extern int do_check_likely(struct likeli
>  	(typeof(ptr)) (__ptr + (off)); })
>  #endif
>
> +#ifndef unitialized_var
> +# define unitialized_var(x) x = x
> +#endif
> +
>  #endif /* __KERNEL__ */
>
>  #endif /* __ASSEMBLY__ */
> Index: linux-mm/mm/memory.c
> ===
> --- linux-mm.orig/mm/memory.c
> +++ linux-mm/mm/memory.c
> @@ -1488,7 +1488,7 @@ static int apply_to_pte_range(struct mm_
>  	pte_t *pte;
>  	int err;
>  	struct page *pmd_page;
> -	spinlock_t *ptl;
> +	spinlock_t *unitialized_var(ptl);
>
>  	pte = (mm == &init_mm) ?
>  		pte_alloc_kernel(pmd, addr) :

Ho hum. I guess I'll slide this over to Linus if there's not too much howling, and unless someone can come up with anything better.

I will, however, fix the spelling to "uninitialized" ;)
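For readers unfamiliar with the idiom, the macro in the quoted patch turns a declaration into a self-initialization, which GCC treats as a request not to emit its "may be used uninitialized" warning for that variable. A minimal userspace sketch, using the corrected spelling Andrew mentions; the function `pick()` is hypothetical and only exists to show the expansion:

```c
/* Corrected-spelling version of the macro from the quoted patch. */
#define uninitialized_var(x) x = x

/* Hypothetical example: 'val' is assigned on every path before it is
 * read, but via two separate tests that a compiler may fail to connect. */
static int pick(int have_a, int a, int b)
{
	int uninitialized_var(val);   /* expands to: int val = val; */

	if (have_a)
		val = a;
	if (!have_a)
		val = b;
	return val;
}
```

The idiom only silences the diagnostic. It does not give the variable a defined value, and it relies on compiler behaviour rather than on anything the C standard promises, so it is only appropriate where every path that reads the variable is known to have assigned it first, as is claimed for `ptl` in `apply_to_pte_range()`.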
Re: bad_page from quicklist patches
On Sun, 29 Apr 2007 01:16:10 -0700 Paul Jackson <[EMAIL PROTECTED]> wrote:
> I am getting many 'bad_page' failures from the quicklist patches
> in 2.6.21-rc7-mm1. I have bisected the problem down the following
> patches:
>
> quicklists-for-page-table-pages.patch
> quicklists-for-page-table-pages-avoid-useless-virt_to_page-conversion.patch
> quicklist-support-for-ia64.patch
> quicklist-support-for-x86_64.patch
> quicklist-support-for-sparc64.patch
>
> This is on an ia64, compiled with the sn2_defconfig configuration.

That should have been fixed in -mm2, by the below:

--- a/include/linux/quicklist.h~quicklists-for-page-table-pages-avoid-useless-virt_to_page-conversion-fix
+++ a/include/linux/quicklist.h
@@ -61,7 +61,7 @@ static inline void __quicklist_free(int
 	if (unlikely(nid != numa_node_id())) {
 		if (dtor)
 			dtor(p);
-		free_hot_page(page);
+		__free_page(page);
 		return;
 	}
_
Re: send_IPI_mask_bitmask() (Re: 2.6.21 known regressions (v2) (for -stable team))
On Sun, 2007-04-29 at 01:34 -0400, Len Brown wrote: > > clockevents_notify() is called with the power verify information for an > > offline CPU. I can handle this in the clockevents code, but I think acpi > > is the correct place. > > So the CONFIG_GENERIC_CLOCKEVENTS=y case is broken, > but the CONFIG_GENERIC_CLOCKEVENTS=n below is okay? > Not immediately clear why both cases can't fail. True, that's strange. Even more strange is that 2.6.21-rc7 does not have the problem, but 2.6.21 has. We did not change anything in those code paths between rc7 and final. Jeff, can you please verify which -rcX was the last which did not have this problem? Thanks, tglx
Re: Back to the future.
Hi! > > > The freezer has *caused* those deadlocks (eg by stopping threads that > > > were > > > needed for the suspend writeouts to succeed!), not solved them. > > > > I can't remember anything like this, but I believe you have a specific test > > case in mind. > > Ehh.. Why do you think we _have_ that PF_NOFREEZE thing in the first place? > > Rafael, you really don't know what you're talking about, do you? > > Just _look_ at them. It's the IO threads etc that shouldn't be frozen, > exactly *because* they do IO. You claim that kernel threads shouldn't do > IO, but that's the point: if you cannot do IO when snapshotting to disk, > here's a damn big clue for you: how do you think that snapshot is going to > get written? > > I *guarantee* you that we've had a lot more problems with threads that > should *not* have been frozen than with those hypothetical threads that > you think should have been frozen. Well, we had nasty corruption on XFS, caused by a thread that was not frozen and should have been. (While the other case leads "only" to deadlocks, so it is easier to debug.) The locking point.. when I added freezing to swsusp, I knew very little about kernel locking, so I "simply" decided to avoid the problem altogether... using the freezer. You may be right that locks are not a big problem for the hibernation after all; I just do not know. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: bad_page from quicklist patches
> That should have been fixed in -mm2, by the below: Ah - ok - you're quick - thanks. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: [GIT PATCH] UIO patches for 2.6.21
On Sat, 2007-04-28 at 18:23 -0700, Greg KH wrote: > On Sat, Apr 28, 2007 at 10:31:37PM +0200, Thomas Gleixner wrote: > > On Sat, 2007-04-28 at 21:15 +0100, Alan Cox wrote: > > > > > I have a political question, if I have a user space driver, is my > > > > > kernel > > > > > tainted or not? > > > > > > > > Surely not. By using the kernel's userspace interface, you create no > > > > "derived work" of the kernel. See COPYING in the root directory of the > > > > kernel sources for details. > > > > > > That only covers normal system calls - but I don't think thats what is > > > relevant, taints are for debug assistance not politics. > > > > > > I think we should have a taint flag for UIO type drivers. Not for any > > > licensing or political reason but for the simple fact it means that there > > > may be other complexities to debugging - and not the same one as a binary > > > module. Probably we want the same marker for mmap /dev/mem too. > > > > I agree, if we make it entirely clear that the flag is nonpolitical. > > Hm, I don't know, what makes this different from the fact that we can > mmap PCI device space today through the proc and sysfs entries? That's > how X gets direct access to the hardware for a number of different > cards, and that's pretty much the same thing as the UIO interface is > doing. > > Unless you think we should also use the same "taint" flag on those > accesses too, and if so, I have no objection. Right, this is just a hint, that something in user space is accessing the hardware directly. Not a too bad idea, but pretty much useless when we add X to the picture as it will be set always :) tglx
Re: Linux 2.6.21
On Sat, 2007-04-28 at 20:41 -0700, David Miller wrote: > From: Adrian Bunk <[EMAIL PROTECTED]> > Date: Sun, 29 Apr 2007 01:04:16 +0200 > > > Bugzilla has an email interface. > > Andrew forwards bugs from Bugzilla to developers. > > Therefore, bugzilla only works at all when Andrew forwards things > around by-hand. That's not entirely true. There are people watching the bugs which might be relevant for them on their own. It does not make bugzilla better though. The user interface sucks and getting things correlated is simply not possible. tglx
Re: [PATCH -mm 1/2] Separate freezer from PM code
On Fri, Apr 27, 2007 at 11:29:34PM +0200, Rafael J. Wysocki wrote: > On Friday, 27 April 2007 22:20, Jeremy Fitzhardinge wrote: > > Rafael J. Wysocki wrote: > > > Makes sense. Please have a look at the updated patch below. > > > > > > Sam, does this one look better to you? > > > > > > > If freezer.c is in kernel/, then shouldn't the corresponding config var > > be in a non-arch Kconfig file? > > Well, I though it would look strange. Still, I can do that, of course: It would have been much better to have a single kernel/Kconfig so you avoided all the arch specific source lines. But thats not this patch and can come later. So I'm OK with this one. Sam > > --- > From: Rafael J. Wysocki <[EMAIL PROTECTED]> > > Now that the freezer is used by kprobes, it is no longer a PM-specific piece > of > code. Move the freezer code out of kernel/power and introduce the > CONFIG_FREEZER option that will be chosen automatically if PM or KPROBES is > set. > > Signed-off-by: Rafael J. Wysocki <[EMAIL PROTECTED]> > --- > arch/arm/Kconfig |2 > arch/avr32/Kconfig |2 > arch/avr32/Kconfig.debug |1 > arch/blackfin/Kconfig|2 > arch/frv/Kconfig |2 > arch/i386/Kconfig|3 > arch/ia64/Kconfig|3 > arch/mips/Kconfig|2 > arch/powerpc/Kconfig |3 > arch/ppc/Kconfig |2 > arch/s390/Kconfig|3 > arch/sh/Kconfig |2 > arch/sparc64/Kconfig |3 > arch/x86_64/Kconfig |3 > include/linux/freezer.h |2 > kernel/Kconfig.freezer |5 > kernel/Makefile |1 > kernel/freezer.c | 236 > +++ > kernel/kprobes.c |2 > kernel/power/Kconfig |1 > kernel/power/Makefile|2 > kernel/power/process.c | 236 > --- > 22 files changed, 279 insertions(+), 239 deletions(-) > > Index: linux-2.6.21-rc7-mm2/arch/x86_64/Kconfig > === > --- linux-2.6.21-rc7-mm2.orig/arch/x86_64/Kconfig 2007-04-27 > 21:41:05.0 +0200 > +++ linux-2.6.21-rc7-mm2/arch/x86_64/Kconfig 2007-04-27 23:20:43.0 > +0200 > @@ -703,6 +703,8 @@ config GENERIC_PENDING_IRQ > depends on GENERIC_HARDIRQS && SMP > default y > > +source "kernel/Kconfig.freezer" > + > menu "Power 
management options" > > source kernel/power/Kconfig > @@ -791,6 +793,7 @@ source "arch/x86_64/oprofile/Kconfig" > config KPROBES > bool "Kprobes (EXPERIMENTAL)" > depends on KALLSYMS && EXPERIMENTAL && MODULES > + select FREEZER > help > Kprobes allows you to trap at almost any kernel address and > execute a callback function. register_kprobe() establishes > Index: linux-2.6.21-rc7-mm2/arch/avr32/Kconfig.debug > === > --- linux-2.6.21-rc7-mm2.orig/arch/avr32/Kconfig.debug2007-04-27 > 21:41:05.0 +0200 > +++ linux-2.6.21-rc7-mm2/arch/avr32/Kconfig.debug 2007-04-27 > 21:54:19.0 +0200 > @@ -12,6 +12,7 @@ menu "Instrumentation Support" > config KPROBES > bool "Kprobes" > depends on DEBUG_KERNEL > + select FREEZER > help > Kprobes allows you to trap at almost any kernel address and >execute a callback function. register_kprobe() establishes > Index: linux-2.6.21-rc7-mm2/arch/frv/Kconfig > === > --- linux-2.6.21-rc7-mm2.orig/arch/frv/Kconfig2007-04-27 > 21:41:05.0 +0200 > +++ linux-2.6.21-rc7-mm2/arch/frv/Kconfig 2007-04-27 23:13:27.0 > +0200 > @@ -364,6 +364,8 @@ source "drivers/pcmcia/Kconfig" > # sleep-deprived psychotic hacker types can say Y now, everyone else > # should probably wait a while. 
> > +source "kernel/Kconfig.freezer" > + > menu "Power management options" > source kernel/power/Kconfig > endmenu > Index: linux-2.6.21-rc7-mm2/arch/i386/Kconfig > === > --- linux-2.6.21-rc7-mm2.orig/arch/i386/Kconfig 2007-04-27 > 21:41:05.0 +0200 > +++ linux-2.6.21-rc7-mm2/arch/i386/Kconfig2007-04-27 23:17:36.0 > +0200 > @@ -912,6 +912,8 @@ config ARCH_ENABLE_MEMORY_HOTPLUG > def_bool y > depends on HIGHMEM > > +source "kernel/Kconfig.freezer" > + > menu "Power management options (ACPI, APM)" > depends on !X86_VOYAGER > > @@ -1218,6 +1220,7 @@ source "arch/i386/oprofile/Kconfig" > config KPROBES > bool "Kprobes (EXPERIMENTAL)" > depends on KALLSYMS && EXPERIMENTAL && MODULES > + select FREEZER > help > Kprobes allows you to trap at almost any kernel address and > execute a callback function. register_kprobe() establishes > Index: linux-2.6.21-rc7-mm2/arch/ia64/Kconf
Re: Back to the future.
On Sunday, 29 April 2007 01:45, Linus Torvalds wrote:
>
> On Sun, 29 Apr 2007, Rafael J. Wysocki wrote:
> >
> > OK, more precisely: fs-related threads should not try to process their
> > queues, etc., after the snapshot is done, because that may cause some fs
> > data to be written at that time and then the fs in question may be
> > corrupted after the restore.  Not all of the I/O in general, fs data.
>
> But that's not true _either_. That's only true because right now I think
> we cannot even suspend to a swapfile (I might be wrong).

You are.

> If you have a swapfile on a filesystem, you'd need those fs queues
> running!

No, I don't.  It's done by bmapping the file and writing directly to the
underlying blockdev.  Otherwise we'd have corrupted filesystems after the
restore.

Swapfiles are handled this way anyway, so we just use the same code.

> > Well, I'm not sure whether or not that still would have been the case if
> > we had stopped freezing kernel threads for the hibernation/suspend.
>
> Did you miss the email where Paul pointed out that Mac/PowerPC didn't use
> to do any of this?

No, I didn't.

> And apparently never had any issues with it?

On one platform with a limited subset of device drivers.

> And probably worked more reliably several years ago than suspend/hibernation
> does _today_?

I have no problems with the hibernation on my test boxes (six of them),
except for one network driver that doesn't bother to define a .suspend()
callback.

There are problems with the suspend (s2ram), but they are _not_ related to
the freezing of kernel threads.  Some of them are related to the other issue
that you have raised, which is that the same callbacks should not be used
for the suspend and hibernation, and which I think is absolutely valid.  The
remaining ones are related to the fact that graphics card vendors don't care
about us at all.

> Ie we do have history of _not_ freezing things. The freezing came later,
> and came with the subsystem that had more problems..

It doesn't have as many problems as you are trying to suggest.  At present,
the only problems with it happen if someone tries to "improve" it in the way
I did with the workqueues.

Anyway, the freezing of tasks, including kernel threads, is one of the few
things that Pavel, Nigel and I completely agree should be done, so perhaps
you could accept that?

Greetings,
Rafael
Re: bad_page from quicklist patches
> That should have been fixed in -mm2

Verified.  2.6.21-rc7-mm2 builds and boots on my SN2 (ia64).

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: Back to the future.
Hi!

> > Ie we do have history of _not_ freezing things. The freezing came later,
> > and came with the subsystem that had more problems..
>
> It doesn't have as many problems as you are trying to suggest.  At present,
> the only problems with it happen if someone tries to "improve" it in the way
> I did with the workqueues.
>
> Anyway, the freezing of tasks, including kernel threads, is one of the few
> things that Pavel, Nigel and I completely agree should be done, so perhaps
> you could accept that?

Actually, if we want to support OLPC _nicely_, we'll need to get rid
of the freezer in suspend-to-RAM. Of course, that _will_ put more
pressure on the drivers -- and break a few of them...

	Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [patch] CFS scheduler, -v6
On Sun, Apr 29, 2007 at 12:54:36AM -0700, William Lee Irwin III wrote:
>> Common code for rbtree-based priority queues can be factored out of
>> cfq, cfs, and hrtimers.

On Sun, Apr 29, 2007 at 10:13:17AM +0200, Willy Tarreau wrote:
> In my experience, rbtrees are painfully slow. Yesterday, I spent the
> day replacing them in haproxy with other trees I developed a few
> years ago, which look like radix trees. They are about 2-3 times as
> fast to insert 64-bit data, and you walk through them in O(1). I have
> many changes to apply to them before they could be used in kernel, but
> at least I think we already have code available for other types of trees.

Dynamic allocation of auxiliary indexing structures is problematic for
the scheduler, which significantly constrains the algorithms one may
use for this purpose.

rbtrees are not my favorite either. Faster alternatives to rbtrees
exist even among binary trees; for instance, it's not so difficult to
implement a heap-ordered tree maintaining the red-black invariant with
looser constraints on the tree structure and hence less rebalancing.

One could always try implementing a van Emde Boas queue, if he felt
particularly brave. Some explanation of the structure may be found at:

http://courses.csail.mit.edu/6.897/spring03/scribe_notes/L1/lecture1.pdf

According to that, y-trees use less space, and exponential trees are
asymptotically faster with a worst-case asymptotic running time of

O(min(lg(lg(u))*lg(lg(n))/lg(lg(lg(u))), sqrt(lg(n)/lg(lg(n)))))

for all operations, so van Emde Boas is not the ultimate algorithm by
any means at O(lg(lg(u))); in these estimates, u is the size of the
"universe," or otherwise the range of the key data type, and n is the
number of keys stored.

Not to say that any of those are appropriate for the kernel; it's
rather likely we'll have to settle for something less interesting,
if we bother ditching rbtrees at all, on account of the constraints
of the kernel environment.

I'll see what I can do about a userspace test harness for priority
queues more comprehensive than smart-queue.c. I have in mind the
ability to replay traces obtained from queues in the kernel and to
load priority queue implementations via dlopen()/dlsym() et al.
valgrind can do most of the dirty work. Otherwise, running a trace for
some period of time and emitting the number of operations it got
through should serve as a benchmark. With that in hand, people can
grind out priority queue implementations and see how they compare on
real operation sequences logged from live kernels.

-- wli
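A minimal sketch of what the core of such a userspace harness might look like, assuming a simple operation table per implementation. None of this is from smart-queue.c or from any posted code: `struct pq_ops`, `struct pq_trace_op`, `heap_ops` and `replay_trace()` are hypothetical names, the array-backed binary heap is only a stand-in reference implementation, and the dlopen()/dlsym() loading step is left out (the table is what a loaded object would be expected to export).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table each priority-queue implementation would export. */
struct pq_ops {
	const char *name;
	void *(*create)(size_t capacity);
	int (*insert)(void *pq, uint64_t key);      /* 0 on success, -1 if full */
	int (*delete_min)(void *pq, uint64_t *key); /* 0 on success, -1 if empty */
	void (*destroy)(void *pq);
};

/* One logged operation; key is only meaningful for PQ_INSERT. */
enum pq_op_type { PQ_INSERT, PQ_DELETE_MIN };
struct pq_trace_op { enum pq_op_type type; uint64_t key; };

/* Stand-in implementation: array-backed binary min-heap. */
struct heap { uint64_t *keys; size_t len, cap; };

static void *heap_create(size_t capacity)
{
	struct heap *h = malloc(sizeof(*h));

	if (!h)
		return NULL;
	h->cap = capacity ? capacity : 1;
	h->len = 0;
	h->keys = malloc(h->cap * sizeof(*h->keys));
	if (!h->keys) {
		free(h);
		return NULL;
	}
	return h;
}

static int heap_insert(void *pq, uint64_t key)
{
	struct heap *h = pq;
	size_t i;

	if (h->len == h->cap)
		return -1;
	/* Sift up: move larger parents down until key fits. */
	for (i = h->len++; i > 0; i = (i - 1) / 2) {
		if (h->keys[(i - 1) / 2] <= key)
			break;
		h->keys[i] = h->keys[(i - 1) / 2];
	}
	h->keys[i] = key;
	return 0;
}

static int heap_delete_min(void *pq, uint64_t *key)
{
	struct heap *h = pq;
	uint64_t last;
	size_t i = 0, child;

	if (h->len == 0)
		return -1;
	*key = h->keys[0];
	last = h->keys[--h->len];
	/* Sift down: pull the smaller child up until last fits. */
	while ((child = 2 * i + 1) < h->len) {
		if (child + 1 < h->len && h->keys[child + 1] < h->keys[child])
			child++;
		if (last <= h->keys[child])
			break;
		h->keys[i] = h->keys[child];
		i = child;
	}
	h->keys[i] = last;
	return 0;
}

static void heap_destroy(void *pq)
{
	struct heap *h = pq;

	free(h->keys);
	free(h);
}

static const struct pq_ops heap_ops = {
	"binary-heap", heap_create, heap_insert, heap_delete_min, heap_destroy,
};

/* Replay a trace against one implementation; return the number of
 * operations that succeeded (the raw figure a benchmark would report). */
static size_t replay_trace(const struct pq_ops *ops,
			   const struct pq_trace_op *trace, size_t n,
			   size_t capacity)
{
	void *pq = ops->create(capacity);
	size_t done = 0, i;
	uint64_t key;

	if (!pq)
		return 0;
	for (i = 0; i < n; i++) {
		int rc = trace[i].type == PQ_INSERT ?
			ops->insert(pq, trace[i].key) :
			ops->delete_min(pq, &key);
		if (rc == 0)
			done++;
	}
	ops->destroy(pq);
	return done;
}
```

Keeping the queue behind an opaque pointer and a table of function pointers is one way to let the same replay loop drive any implementation; timing the loop, or running it under valgrind, would sit on top of this.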
[PATCH] paravirt: fix startup_ipi_hook config dependency
startup_ipi_hook depends on CONFIG_X86_LOCAL_APIC, so move it to the
right part of the paravirt_ops initialization.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 arch/i386/kernel/paravirt.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

===
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -292,6 +292,7 @@ struct paravirt_ops paravirt_ops = {
 	.apic_read = native_apic_read,
 	.setup_boot_clock = setup_boot_APIC_clock,
 	.setup_secondary_clock = setup_secondary_APIC_clock,
+	.startup_ipi_hook = paravirt_nop,
 #endif
 	.set_lazy_mode = paravirt_nop,
 
@@ -342,8 +343,6 @@ struct paravirt_ops paravirt_ops = {
 	.dup_mmap = paravirt_nop,
 	.exit_mmap = paravirt_nop,
 	.activate_mm = paravirt_nop,
-
-	.startup_ipi_hook = paravirt_nop,
 };
 
 EXPORT_SYMBOL(paravirt_ops);
Re: Back to the future.
On Sunday, 29 April 2007 10:23, Pavel Machek wrote:
> Hi!
>
> > > > The freezer has *caused* those deadlocks (eg by stopping threads that
> > > > were needed for the suspend writeouts to succeed!), not solved them.
> > >
> > > I can't remember anything like this, but I believe you have a specific
> > > test case in mind.
> >
> > Ehh.. Why do you think we _have_ that PF_NOFREEZE thing in the first place?
> >
> > Rafael, you really don't know what you're talking about, do you?
> >
> > Just _look_ at them. It's the IO threads etc that shouldn't be frozen,
> > exactly *because* they do IO. You claim that kernel threads shouldn't do
> > IO, but that's the point: if you cannot do IO when snapshotting to disk,
> > here's a damn big clue for you: how do you think that snapshot is going to
> > get written?
> >
> > I *guarantee* you that we've had a lot more problems with threads that
> > should *not* have been frozen than with those hypothetical threads that
> > you think should have been frozen.
>
> Well, we had nasty corruption on XFS, caused by thread that was not
> frozen and should be. (While the other case leads "only" to deadlocks,
> so it is easier to debug.)
>
> The locking point.. when I added freezing to swsusp, I knew very
> little about kernel locking, so I "simply" decided to avoid the
> problem altogether... using the freezer.
>
> You may be right that locks are not a big problem for the hibernation
> after all; I just do not know.

Still, I think that if a kernel thread is part of a device driver, then
_in_ _principle_ it needs _some_ synchronization with the driver's
suspend/freeze and resume/thaw callbacks.  For example, it's reasonable to
assume that the thread should be quiet between suspend/freeze and
resume/thaw.  With the freezing of kernel threads we provide a simple means
of such synchronization: use try_to_freeze() in a suitable place in your
kernel thread and you're done.

[Well, there should be a second part for making the thread die if the thaw
callback doesn't find the device, but that's in the works.]

Without it, there may be race conditions that we are not even aware of and
that may trigger in, say, 1 in 10 suspends or so, and I wish you good luck
with debugging such things.

Greetings,
Rafael
Re: [PATCH] mm/memory.c: remove warning from an uninitialized spinlock. was: Re: 2.6.21-rc7-mm2
On Sun, 29 Apr 2007 08:50:49 +0200 Borislav Petkov <[EMAIL PROTECTED]> wrote: > Introduce a macro for suppressing gcc from generating a warning about a > probable > unitialized state of a variable. I ended up doing the below. It's better to make this a per-compiler-version thing: later versions of gcc might need different tricks, or might provide __attribute__((stfu)) or whatever. Plus I don't know if the x=x trick is needed on the intel compiler, nor if it even works, so I left ICC alone. From: Borislav Petkov <[EMAIL PROTECTED]> Introduce a macro for suppressing gcc from generating a warning about a probable uninitialized state of a variable. Example: - spinlock_t *ptl; + spinlock_t *uninitialized_var(ptl); Not a happy solution, but those warnings are obnoxious. - Using the usual pointlessly-set-it-to-zero approach wastes several bytes of text. - Using a macro means we can (hopefully) do something else if gcc changes cause the `x = x' hack to stop working - Using a macro means that people who are worried about hiding true bugs can easily turn it off. 
Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- include/linux/compiler-gcc3.h |6 ++ include/linux/compiler-gcc4.h |6 ++ include/linux/compiler-intel.h |2 ++ 3 files changed, 14 insertions(+) diff -puN include/linux/compiler-gcc3.h~add-unitialized_var-macro-for-suppressing-gcc-warnings include/linux/compiler-gcc3.h --- a/include/linux/compiler-gcc3.h~add-unitialized_var-macro-for-suppressing-gcc-warnings +++ a/include/linux/compiler-gcc3.h @@ -13,4 +13,10 @@ #define __must_check __attribute__((warn_unused_result)) #endif +/* + * A trick to suppress uninitialized variable warning without generating any + * code + */ +#define uninitialized_var(x) x = x + #define __always_inlineinline __attribute__((always_inline)) diff -puN include/linux/compiler-gcc4.h~add-unitialized_var-macro-for-suppressing-gcc-warnings include/linux/compiler-gcc4.h --- a/include/linux/compiler-gcc4.h~add-unitialized_var-macro-for-suppressing-gcc-warnings +++ a/include/linux/compiler-gcc4.h @@ -16,3 +16,9 @@ #define __must_check __attribute__((warn_unused_result)) #define __compiler_offsetof(a,b) __builtin_offsetof(a,b) #define __always_inlineinline __attribute__((always_inline)) + +/* + * A trick to suppress uninitialized variable warning without generating any + * code + */ +#define uninitialized_var(x) x = x diff -puN include/linux/compiler-intel.h~add-unitialized_var-macro-for-suppressing-gcc-warnings include/linux/compiler-intel.h --- a/include/linux/compiler-intel.h~add-unitialized_var-macro-for-suppressing-gcc-warnings +++ a/include/linux/compiler-intel.h @@ -22,3 +22,5 @@ (typeof(ptr)) (__ptr + (off)); }) #endif + +#define uninitialized_var(x) x _ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
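To make the `x = x` trick concrete, here is a small userspace sketch. The macro is defined exactly as in the gcc3/gcc4 hunks above; the two helper functions are hypothetical and exist only to show the pattern the changelog describes, where a variable is filled in through a pointer on the success path and read only afterwards. Whether a given gcc actually emits the warning for code like this depends on its version and optimization level, so treat it as an illustration of the idiom, not a reproduction of the mm/memory.c case.

```c
#include <stddef.h>

/* Same definition the patch adds for gcc 3 and gcc 4. */
#define uninitialized_var(x) x = x

/*
 * Hypothetical helper: writes *out only when it finds something, which is
 * the kind of conditional initialization gcc's flow analysis can lose
 * track of.
 */
static int find_first_negative(const int *v, size_t n, size_t *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (v[i] < 0) {
			*out = i;
			return 1;
		}
	}
	return 0;
}

static long first_negative_or(const int *v, size_t n, long fallback)
{
	/* Expands to "size_t idx = idx;", which gcc treats as "initialized"
	 * for warning purposes without emitting a store. */
	size_t uninitialized_var(idx);

	if (!find_first_negative(v, n, &idx))
		return fallback;
	return v[idx];	/* idx is only read after the helper has set it */
}
```

The cost of the idiom is the one the changelog already names: it also silences the warning in places where the variable really is used uninitialized, which is why routing it through a macro that can be redefined or turned off is the safer form.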
Re: Back to the future.
On Sunday, 29 April 2007 10:59, Pavel Machek wrote:
> Hi!
>
> > > Ie we do have history of _not_ freezing things. The freezing came later,
> > > and came with the subsystem that had more problems..
> >
> > It doesn't have as many problems as you are trying to suggest.  At present,
> > the only problems with it happen if someone tries to "improve" it in the
> > way I did with the workqueues.
> >
> > Anyway, the freezing of tasks, including kernel threads, is one of the few
> > things that Pavel, Nigel and I completely agree should be done, so perhaps
> > you could accept that?
>
> Actually, if we want to support OLPC _nicely_, we'll need to get rid
> of the freezer in suspend-to-RAM. Of course, that _will_ put more
> pressure on the drivers -- and break a few of them...

I think the removal of sys_sync() from freeze_processes() in the s2ram case
might help.

I'm really afraid of dropping the freezing of kernel threads from the
hibernation/suspend altogether before we know we won't break drivers,
because we can introduce some very subtle and difficult-to-debug problems
this way.  Moreover, apart from speeding up the suspend slightly (kernel
threads are frozen very quickly), this won't buy us anything, since kprobes
uses the freezer and all of the infrastructure is needed anyway.

Greetings,
Rafael
Re: Linux 2.6.21
David Lang wrote:
> I'll say that as a user I hate having to deal with bugzilla.
>
> there's nothing more frustrating than spending a good chunk of time
> trying to find a similar bug, then jumping through all the bugzilla
> hoops to file a report to eventually (days/weeks later) get a message
> 'closed because it's a duplicate report', then have to go and track down
> what it's a duplicate of, read through that bug report, only to find
> that it's not solved there either, and to top it off, the people working
> on that bug won't see my report or that I'm available to troubleshoot it.

Ideally, joining duplicate reports should be a low-cost, lossless operation.

That said, when bug B is marked as duplicate of bug A, people at bug A
at least get a link to bug B, don't they?  If they are too lazy to read
report B, they obviously are not very interested in A either.  Tough luck.

Vice versa, people at bug B get notified that the matter is now
continued at bug A and can add their Cc there.  Of course that addition
is one of the very few things that could probably be automated.

Joining duplicate reports at a mailinglist involves responding to
multiple threads and sending links into web archives of the list, which
happens to be redundant to and disparate from your local e-mail storage.
I can't see how this aspect of bug-handling works easier on mailinglists.

> from a user point of view, e-mailing the kernel list (retrying a few days
> later if there is no response) tends to work _much_ better.

What I agree with, from a maintainer's POV, is that a report to the
appropriate mailinglist is often easier to triage than a report at
bugzilla, because the reporter often needs initial help to properly
define the problem.

Bugzilla becomes useful after a report has reached a minimum level of
quality (after minimum initial triage) and if the bug can be clearly
associated with a maintained subsystem of the kernel (as e.g. Linus
already pointed out in this thread).
--
Stefan Richter
-=-=-=== -=-- ===-=
http://arcgraph.de/sr/
Re: [PATCH 0/9] Containers (V9): Generic Process Containers
I'm afraid that this patch set doesn't do cpusets very well.

It builds and boots and mounts the cpuset file system ok.
But trying to write the 'mems' file hangs the system hard.

I had to test it against 2.6.21-rc7-mm2, because I can't boot
2.6.21-rc7-mm1, due to the 'bad_page' problem that I noted in
an earlier post this evening on lkml.

These container patches seemed to apply ok against 2.6.21-rc7-mm2,
and built and booted ok.

I built with an sn2_defconfig configuration, having the following
CONTAINER and CPUSET settings:

    CONFIG_CONTAINERS=y
    CONFIG_CONTAINER_DEBUG=y
    CONFIG_CPUSETS=y
    CONFIG_CONTAINER_CPUACCT=y
    CONFIG_PROC_PID_CPUSET=y
    # CONFIG_ACPI_CONTAINER is not set

I could mount the cpuset file system on /dev/cpuset just fine.

Then I invoked the following commands:

    # cd /dev/cpuset
    # mkdir foo
    # cd foo
    # echo 0-3 > cpus
    # echo 0-1 > mems

At that point, the system hangs.  Reproduced three times, on two boots.
I never get a shell prompt back from that second echo.  I have to hit
Reset.

The three different hangs were done with the following three different
values:

    echo 0-3 > mems
    echo 0-1 > mems
    echo 1 > mems

On that last one, "echo 1 > mems", I did not do the echo to cpus first.

The test system had 8 cpus, numbered 0-7, and 4 mems, numbered 0-3.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[EMAIL PROTECTED]> 1.925.600.0401
Re: Linux 2.6.21
I wrote:
> Joining duplicate reports at a mailinglist involves responding to
> multiple threads and sending links into web archives of the list, which
> happens to be redundant to and disparate from your local e-mail storage.
> I can't see how this aspect of bug-handling works easier on mailinglists.

PS: Of course what _does_ work better on mailinglists than on bugzilla
is to recognize duplicates as such in the first place, when the symptoms
seem only loosely related.  (I.e. seeing the big picture and recognizing
patterns.)
--
Stefan Richter
-=-=-=== -=-- ===-=
http://arcgraph.de/sr/
Re: Linux 2.6.21
Willy Tarreau wrote:
> On Sun, Apr 29, 2007 at 12:58:09AM +0200, Markus Rechberger wrote:
>> I totally disagree here, bugzilla is a very good tool. If someone is
>> too lazy to look at it it's his problem.
>
> I'm glad we finally found _the_ person using it !
>
> More seriously, it's so much a complicated interface ! It's hard to
> bring more people into a discussion, it's hard to comment on code or
> suggested patches, etc... Mail is by far more adapted to the job !

To continue on the sarcastic tangent:  This flaw of bugzilla is
irrelevant for subsystems where fewer than two or three people
steadily hunt bugs anyway.  In the field I work on, I wouldn't have
anybody else to bring in in the first place, except that I sometimes
suggest to reporters that they subscribe to a bug ticket.
--
Stefan Richter
-=-=-=== -=-- ===-=
http://arcgraph.de/sr/
Re: [patch] CFS scheduler, -v6
On Sunday 29 April 2007 18:00, Ingo Molnar wrote:
> * Willy Tarreau <[EMAIL PROTECTED]> wrote:
> > > > [...] except for Mike who has not tested recent versions. [...]
> > >
> > > actually, dont discount Mark Lord's test results either. And it
> > > might be a good idea for Mike to re-test SD 0.46?
> >
> > In any case, it might be a good idea because Mike encountered a
> > problem that nobody could reproduce. [...]
>
> actually, Mark Lord too reproduced something similar to Mike's results.
> Please try those workloads yourself.

I see no suggestion that either Mark or Mike have tested, or for that matter
_have any intention of testing_, the current version of SD without fancy
renicing or anything involved. Willy I greatly appreciate you trying, but I
don't know why you're bothering even trying here since clearly 1. Ingo is the
scheduler maintainer 2. he's working on a competing implementation and 3. in
my excellent physical and mental state I seem to have slighted the two
testers (both?) somewhere along the line. Mike feels his testing was a
complete waste of time yet it would be ludicrous for me to say that SD didn't
evolve 20 versions further due to his earlier testing, and was the impetus
for you to start work on CFS. The crunch came that we couldn't agree that
fair was appropriate for mainline and we parted ways. That fairness has not
been a problem for his view on CFS though but he has only tested older
versions of SD that still had bugs.

Given facts 1 and 2 above I have all but resigned myself to the fact that SD
has -less than zero- chance of ever being considered for mainline and it's my
job to use it as something to compare your competing design with to make sure
that when (and I do mean when since there seems no doubt in everyone else's
mind) CFS becomes part of mainline that it is as good as SD. Saying people
found CFS better than SD is, in my humble opinion, an exaggeration since
every one I could find was a glowing standalone report of CFS rather than any
comparison to the current very stable bug free version of SD. On the other
hand I still see that when people compare them side by side they find SD is
better, so I will hold CFS against that comparison - when comparing fairness
based designs.

On a related note - implementing a framework is nice but doesn't address any
of the current fairness/starvation/corner case problems mainline has. I don't
see much point in rushing the framework merging since it's still in flux.

--
-ck
Re: [Fastboot] restoring x86 BIOS state before reboot
* Eric W. Biederman <[EMAIL PROTECTED]> [2007-04-29 08:51]:
>
> It may also be worth investigating if it is possible to bypass
> the part of windows that uses BIOS calls.  I really don't have
> a clue how a modern windows systems boots.

I think ReactOS has a bootloader that can also boot Windows, or at
least they are close to booting the Windows kernel. That bootloader
could then be used for booting Windows without BIOS calls.

I also heard that the 100-$-Laptop should now be able to use Windows.
As it's using LinuxBIOS, this could also be interesting. However, maybe
Microsoft simply modifies a special Windows for this, or they integrate
a closed-source part into that LinuxBIOS, I don't know ... :-(

Thanks,
   Bernhard
Re: PCI Express MMCONFIG and BIOS Bug messages..
>
> I tried adapting a patch by Rajesh Shah to do this for current kernels:

The Intel patches checked against ACPI, which also didn't work in all
cases.

You're right that the e820 check is overzealous and has a lot of false
positives, but it is the only generic way we know right now to handle a
common i965 BIOS bug. Also there is the nasty case of the Apple EFI boxes,
where only mmconfig works, which has to be handled too.

I expect eventually the logic to be:
- If we know the hardware: read it from hw registers; trust them; ignore
  BIOS.
- Otherwise check e820 and ACPI resources and be very trigger happy at not
  using it.

> It walks through all the motherboard resource devices and tries to pull
> out the resource settings for all of them using the _CRS method.

I tested it originally on an Intel system with the above BIOS problem
and it didn't help there.

> (Depending on how you do the probing, the _STA method is called as well,
> either before or after.) From my limited ACPI knowledge, the problem is
> that the PCI MMCONFIG initialization is called before the main ACPI
> interpreter is enabled, and these control methods may try to access
> operation regions that don't have handlers set up for them yet, so a
> bunch of "no handler for region" errors show up.

mmconfig access can be switched later without problems; so it would
be possible to boot using Type1 if it works (e.g. detect the Apple case)
and switch later.

It's all quite tricky unfortunately; that is why I left it at the current
relatively safe state for now. After all, mmconfig is normally not needed.

> So essentially if we want to do this check based on ACPI resource
> reservations, we need to be able to execute control methods at the point
> that MMCONFIG is set up. Is there a reason why this can't be made
> possible (like by moving the necessary parts of ACPI initialization
> earlier)?

The ACPI interpreter wants to allocate memory and use other kernel services
that are not available in really early boot. It could probably be done
somehow, but it would be quite ugly, with lots of special cases.

-Andi
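The two-branch logic expected above could be sketched as a pure decision function. This is only an illustration under stated assumptions, not code from any posted patch: `struct mmcfg_probe`, its fields and `choose_pci_cfg()` are hypothetical names, "check e820 and ACPI" is read here as "both must reserve the window", and falling back to MMCONFIG when Type1 access does not work is just one possible way to cover the Apple EFI case, which the message says must be handled without saying how.

```c
#include <stdbool.h>

enum pci_cfg_method { PCI_CFG_TYPE1, PCI_CFG_MMCONFIG };

/* Hypothetical summary of what early probing found. */
struct mmcfg_probe {
	bool chipset_known;  /* MMCONFIG base can be read from hw registers */
	bool hw_reg_enabled; /* those registers say the window is enabled */
	bool e820_reserved;  /* BIOS e820 map reserves the window */
	bool acpi_reserved;  /* ACPI motherboard resources cover the window */
	bool type1_works;    /* legacy Type1 port access is usable */
};

static enum pci_cfg_method choose_pci_cfg(const struct mmcfg_probe *p)
{
	/* Assumption: with no working Type1 access (the Apple EFI case),
	 * MMCONFIG is the only option left. */
	if (!p->type1_works)
		return PCI_CFG_MMCONFIG;

	/* Known hardware: trust the registers and ignore the BIOS tables. */
	if (p->chipset_known)
		return p->hw_reg_enabled ? PCI_CFG_MMCONFIG : PCI_CFG_TYPE1;

	/* Unknown hardware: be quick to refuse. Assumption: e820 and ACPI
	 * must both reserve the window before MMCONFIG is used. */
	if (p->e820_reserved && p->acpi_reserved)
		return PCI_CFG_MMCONFIG;

	return PCI_CFG_TYPE1;
}
```

Because the ACPI side is not available in very early boot, a real implementation would presumably have to run a check like this twice, starting on Type1 and switching later, as described above.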
Re: [patch] CFS scheduler, -v6
On Sun, 2007-04-29 at 19:52 +1000, Con Kolivas wrote:
> On Sunday 29 April 2007 18:00, Ingo Molnar wrote:
> > * Willy Tarreau <[EMAIL PROTECTED]> wrote:
> > > > > [...] except for Mike who has not tested recent versions. [...]
> > > >
> > > > actually, dont discount Mark Lord's test results either. And it
> > > > might be a good idea for Mike to re-test SD 0.46?
> > >
> > > In any case, it might be a good idea because Mike encountered a
> > > problem that nobody could reproduce. [...]
> >
> > actually, Mark Lord too reproduced something similar to Mike's results.
> > Please try those workloads yourself.
>
> I see no suggestion that either Mark or Mike have tested, or for that matter
> _have any intention of testing_, the current version of SD without fancy
> renicing or anything involved. Willy I greatly appreciate you trying, but I
> don't know why you're bothering even trying here since clearly 1. Ingo is the
> scheduler maintainer 2. he's working on a competing implementation and 3. in
> my excellent physical and mental state I seem to have slighted the two
> testers (both?) somewhere along the line. Mike feels his testing was a
> complete waste of time yet it would be ludicrous for me to say that SD didn't
> evolve 20 versions further due to his earlier testing, and was the impetus
> for you to start work on CFS. The crunch came that we couldn't agree that
> fair was appropriate for mainline and we parted ways. That fairness has not
> been a problem for his view on CFS though but he has only tested older
> versions of SD that still had bugs.

The crunch for me came when you started hand-waving and spin-doctoring
as you are doing now.  Listening to twisted echoes of my voice is not my
idea of a good time.

	-Mike
Re: [patch] CFS scheduler, -v6
Willy,

On Sun, 2007-04-29 at 09:16 +0200, Willy Tarreau wrote:
> In fact, what I'd like to see in 2.6.22 is something better for everybody
> and with *no* regression, even if it's not perfect. I had the feeling
> that SD matched that goal right now, except for Mike who has not tested
> recent versions. Don't get me wrong, I still think that CFS is a more
> interesting long-term target. But it may require more time to satisfy
> everyone. At least with one of them in 2.6.22, we won't waste time
> comparing to current mainline.

Oh no, we really do _NOT_ want to throw SD or anything else at mainline
in a hurry just for not wasting time on comparing to the current
scheduler.

I agree that CFS is the more interesting target and I prefer to push the
more interesting one even if it takes a release cycle longer. The main
reason for me is the design of CFS. Even if it is not really modular
right now, it is not rocket science to make it fully modular.

Looking at the areas people are working on, e.g. containers, resource
management, cpu isolation, fully tickless systems, we really need to go
in that direction if we want to avoid permanent tinkering in the core
scheduler code for the next five years.

As a sidenote: I really wonder if anybody noticed yet, that the whole
CFS / SD comparison is so ridiculous, that it is not even funny anymore.
CFS modifies the scheduler and nothing else, SD fiddles all over the
kernel in interesting ways.

This is worse than apples and oranges, it's more like apples and
screwdrivers.

Can we please stop this useless pissing contest and sit down and get a
modular design into mainline, which allows folks to work and integrate
their "workload X perfect scheduler" and gives us the flexibility to
adjust to the needs of upcoming functionality.

	tglx
Re: [patch] CFS scheduler, -v6
On Sun, Apr 29, 2007 at 12:30:54PM +0200, Thomas Gleixner wrote:
> Can we please stop this useless pissing contest and sit down and get a
> modular design into mainline, which allows folks to work and integrate
> their "workload X perfect scheduler" and gives us the flexibility to
> adjust to the needs of upcoming functionality.

If I don't see some sort of modularity patch soon I'll post one myself.

-- wli
Re: [2.6.22 patch] the scheduled removal of the i8xx_tco watchdog driver
Hi Adrian,

> I might be blind, but I'm not seeing what should be removed.

[EMAIL PROTECTED] linux-2.6]$ grep -i tco Documentation/watchdog/*
Documentation/watchdog/watchdog-api.txt:i810-tco.c -- Intel 810 chipset
Documentation/watchdog/watchdog.txt:The i810 TCO watchdog modules can be configured with the "i810_margin"
Documentation/watchdog/watchdog.txt:The i810 TCO watchdog driver also implements the WDIOC_GETSTATUS and
Documentation/watchdog/watchdog.txt:and WDIOC_GETBOOTSTATUS returns the value of TCO2 Status Register (see Intel's
Documentation/watchdog/watchdog.txt:WDT501P WDT500P SoftwareBerkshire i810 TCOSA1100WD

But I need to clean this up anyway. So no real need to put this in the
same patch yet. I'll create a new patch for this that deals with the
complete watchdog Documentation.

Greetings,
Wim.
[2.6.21-rc7-mm2] BUG while suspend to ram
BUG: at kernel/kthread.c:166 kthread_bind()
 [] _cpu_down+0x16c/0x250
 [] disable_nonboot_cpus+0x60/0xf0
 [] pm_suspend_disk+0x177/0x2c0
 [] enter_state+0xb5/0x200
 [] state_store+0xbd/0xd0
 [] state_store+0x0/0xd0
 [] subsys_attr_store+0x29/0x40
 [] sysfs_write_file+0xd4/0x160
 [] vfs_write+0xc1/0x160
 [] sysfs_write_file+0x0/0x160
 [] sys_write+0x41/0x70
 [] sys_dup2+0xd5/0x100
 [] sysenter_past_esp+0x5f/0x85
 [] xfrm_policy_insert+0x210/0x400
 ===

dmesg: http://www.unixy.pl/maciek/download/kernel/2.6.21-rc7--mm2/dmesg.txt.gz
lsmod: http://www.unixy.pl/maciek/download/kernel/2.6.21-rc7--mm2/lsmod.txt.gz
ver_linux: http://www.unixy.pl/maciek/download/kernel/2.6.21-rc7--mm2/ver_linux.txt.gz
lspci: http://www.unixy.pl/maciek/download/kernel/2.6.21-rc7--mm2/lspci.txt.gz
config: http://www.unixy.pl/maciek/download/kernel/2.6.21-rc7--mm2/config-2.6.21-rc7-mm2.gz

-- 
Maciej Rutecki <[EMAIL PROTECTED]>
www.unixy.pl
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
Re: [patch] CFS scheduler, -v6
On Sun, 2007-04-29 at 12:30 +0200, Thomas Gleixner wrote:
> Willy,
> As a sidenote: I really wonder if anybody noticed yet, that the whole
> CFS / SD comparison is so ridiculous, that it is not even funny anymore.
> CFS modifies the scheduler and nothing else, SD fiddles all over the
> kernel in interesting ways.
>

have you looked at diffstat lately? :)

sd:
 Documentation/sched-design.txt  |  241 +++
 Documentation/sysctl/kernel.txt |   14
 Makefile                        |    2
 fs/pipe.c                       |    7
 fs/proc/array.c                 |    2
 include/linux/init_task.h       |    4
 include/linux/sched.h           |   32 -
 kernel/sched.c                  | 1279 +++-
 kernel/softirq.c                |    2
 kernel/sysctl.c                 |   26
 kernel/workqueue.c              |    2
 11 files changed, 919 insertions(+), 692 deletions(-)

cfs:
 Documentation/kernel-parameters.txt |   43
 Documentation/sched-design-CFS.txt  |  107 +
 Makefile                            |    2
 arch/i386/kernel/smpboot.c          |   13
 arch/i386/kernel/tsc.c              |    8
 arch/ia64/kernel/setup.c            |    6
 arch/mips/kernel/smp.c              |   11
 arch/sparc/kernel/smp.c             |   10
 arch/sparc64/kernel/smp.c           |   36
 fs/proc/array.c                     |   11
 fs/proc/base.c                      |    2
 fs/proc/internal.h                  |    1
 include/asm-i386/unistd.h           |    3
 include/asm-x86_64/unistd.h         |    4
 include/linux/hardirq.h             |   13
 include/linux/sched.h               |   94 +
 init/main.c                         |    2
 kernel/exit.c                       |    3
 kernel/fork.c                       |    4
 kernel/posix-cpu-timers.c           |   34
 kernel/sched.c                      | 2288 +---
 kernel/sched_debug.c                |  152 ++
 kernel/sched_fair.c                 |  601 +
 kernel/sched_rt.c                   |  184 ++
 kernel/sched_stats.h                |  235 +++
 kernel/sysctl.c                     |   32
 26 files changed, 2062 insertions(+), 1837 deletions(-)

> This is worse than apples and oranges, it's more like apples and
> screwdrivers.
>
> 	tglx
[PATCH] [3/48] x86_64: Use new shared sched_clock in x86-64 too
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/x86_64/kernel/Makefile |3 ++- arch/x86_64/kernel/time.c |1 - arch/x86_64/kernel/tsc.c| 28 include/asm-x86_64/timer.h |1 + include/asm-x86_64/timex.h |1 - 5 files changed, 3 insertions(+), 31 deletions(-) Index: linux/arch/x86_64/kernel/Makefile === --- linux.orig/arch/x86_64/kernel/Makefile +++ linux/arch/x86_64/kernel/Makefile @@ -8,7 +8,7 @@ obj-y := process.o signal.o entry.o trap ptrace.o time.o ioport.o ldt.o setup.o i8259.o sys_x86_64.o \ x8664_ksyms.o i387.o syscall.o vsyscall.o \ setup64.o bootflag.o e820.o reboot.o quirks.o i8237.o \ - pci-dma.o pci-nommu.o alternative.o hpet.o tsc.o + pci-dma.o pci-nommu.o alternative.o hpet.o tsc.o sched-clock.o obj-$(CONFIG_STACKTRACE) += stacktrace.o obj-$(CONFIG_X86_MCE) += mce.o therm_throt.o @@ -57,3 +57,4 @@ i8237-y += ../../i386/kernel/i8237.o msr-$(subst m,y,$(CONFIG_X86_MSR)) += ../../i386/kernel/msr.o alternative-y += ../../i386/kernel/alternative.o pcspeaker-y+= ../../i386/kernel/pcspeaker.o +sched-clock-y += ../../i386/kernel/sched-clock.o Index: linux/arch/x86_64/kernel/tsc.c === --- linux.orig/arch/x86_64/kernel/tsc.c +++ linux/arch/x86_64/kernel/tsc.c @@ -16,32 +16,6 @@ EXPORT_SYMBOL(cpu_khz); unsigned int tsc_khz; EXPORT_SYMBOL(tsc_khz); -static unsigned int cyc2ns_scale __read_mostly; - -void set_cyc2ns_scale(unsigned long khz) -{ - cyc2ns_scale = (NSEC_PER_MSEC << NS_SCALE) / khz; -} - -static unsigned long long cycles_2_ns(unsigned long long cyc) -{ - return (cyc * cyc2ns_scale) >> NS_SCALE; -} - -unsigned long long sched_clock(void) -{ - unsigned long a = 0; - - /* Could do CPU core sync here. Opteron can execute rdtsc speculatively, -* which means it is not completely exact and may not be monotonous -* between CPUs. But the errors should be too small to matter for -* scheduling purposes. 
-*/ - - rdtscll(a); - return cycles_2_ns(a); -} - static int tsc_unstable; static inline int check_tsc_unstable(void) @@ -114,8 +88,6 @@ static int time_cpufreq_notifier(struct mark_tsc_unstable(); } - set_cyc2ns_scale(tsc_khz_ref); - return 0; } Index: linux/include/asm-x86_64/timer.h === --- /dev/null +++ linux/include/asm-x86_64/timer.h @@ -0,0 +1 @@ +#define get_scheduled_cycles(x) rdtscll(x) Index: linux/arch/x86_64/kernel/time.c === --- linux.orig/arch/x86_64/kernel/time.c +++ linux/arch/x86_64/kernel/time.c @@ -404,7 +404,6 @@ void __init time_init(void) else vgetcpu_mode = VGETCPU_LSL; - set_cyc2ns_scale(tsc_khz); printk(KERN_INFO "time.c: Detected %d.%03d MHz processor.\n", cpu_khz / 1000, cpu_khz % 1000); init_tsc_clocksource(); Index: linux/include/asm-x86_64/timex.h === --- linux.orig/include/asm-x86_64/timex.h +++ linux/include/asm-x86_64/timex.h @@ -28,5 +28,4 @@ extern int read_current_timer(unsigned l #define US_SCALE32 /* 2^32, arbitralrily chosen */ extern void mark_tsc_unstable(void); -extern void set_cyc2ns_scale(unsigned long khz); #endif - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
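The set_cyc2ns_scale() / cycles_2_ns() pair removed above is a fixed-point conversion from TSC cycles to nanoseconds. A minimal userspace sketch of the same arithmetic follows, assuming NS_SCALE is 10 (the patch does not show its definition) and using uint64_t throughout; cyc2ns_example() is a hypothetical helper added only to show the numbers.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL
#define NS_SCALE      10	/* assumption: fixed-point factor of 2^10 */

static uint64_t cyc2ns_scale;

/* Precompute nanoseconds-per-cycle, scaled up by 2^NS_SCALE. */
void set_cyc2ns_scale(uint64_t khz)
{
	cyc2ns_scale = (NSEC_PER_MSEC << NS_SCALE) / khz;
}

/* One multiply and one shift per conversion, no division. */
uint64_t cycles_2_ns(uint64_t cyc)
{
	return (cyc * cyc2ns_scale) >> NS_SCALE;
}

/* Hypothetical usage: a 2 GHz clock is 2,000,000 kHz. */
void cyc2ns_example(void)
{
	set_cyc2ns_scale(2000000);
	/* scale = (1000000 << 10) / 2000000 = 512 */
	assert(cyc2ns_scale == 512);
	/* 2000 cycles at 2 GHz are 1000 ns: (2000 * 512) >> 10 */
	assert(cycles_2_ns(2000) == 1000);
}
```

The multiplication can overflow 64 bits for very large cycle counts, which this sketch does not guard against.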
[PATCH] [0/48] x86 candidate patches for review III: various stuff
- Rewritten all dancing sched_clock()
- Extended numa emulation support from David Rientjes; now the sizes of
  the emulated nodes can be configured on the command line.
- Faster vgettimeofday from Eric Dumazet
- GDT cleanups from Rusty
- cpa() fixes and better kernel protection from Jan Beulich
- PGD handling cleanup from Christoph Lameter
- Various other changes
- Lots of minor cleanups from various people

Please review.

-Andi
[PATCH] [28/48] i386: prevent ACPI quirk warning mass spamming in logs
From: Thierry Vignaud <[EMAIL PROTECTED]>

The following patch prevents this warning from being displayed again and
again (eg: nine times on my NForce2 motherboard) and thus improves the
signal to noise ratio in logs.

The ATI quirk below probably needs a similar "fix" but I don't have the
hardware to test.

Btw arch/x86_64/kernel/early-quirks.c::nvidia_bugs() would probably need
to be synced (but I don't have an x86_64 NVidia motherboard to boot test
it). Still it shows the usefulness of the recent x86 merge thread.

[EMAIL PROTECTED]: cleanup]
Signed-off-by: Thierry Vignaud <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Len Brown <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 arch/i386/kernel/acpi/earlyquirk.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: linux/arch/i386/kernel/acpi/earlyquirk.c
===
--- linux.orig/arch/i386/kernel/acpi/earlyquirk.c
+++ linux/arch/i386/kernel/acpi/earlyquirk.c
@@ -21,11 +21,14 @@ static int __init nvidia_hpet_check(stru
 
 static int __init check_bridge(int vendor, int device)
 {
+	static int warned;
 #ifdef CONFIG_ACPI
 	/* According to Nvidia all timer overrides are bogus unless HPET
 	   is enabled. */
 	if (!acpi_use_timer_override && vendor == PCI_VENDOR_ID_NVIDIA) {
-		if (acpi_table_parse(ACPI_SIG_HPET, nvidia_hpet_check)) {
+		if (!warned && acpi_table_parse(ACPI_SIG_HPET,
+						nvidia_hpet_check)) {
+			warned = 1;
 			acpi_skip_timer_override = 1;
 			printk(KERN_INFO "Nvidia board "
 				"detected. Ignoring ACPI "
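The patch above relies on a function-local static flag to print a message only once. A minimal userspace sketch of that pattern follows; check_bridge_once(), warnings_printed and check_bridge_example() are hypothetical names for illustration and are not part of the kernel source.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter, present only so the behaviour can be observed. */
static int warnings_printed;

/* Returns 1 the first time the quirk is seen, 0 on every other call. */
int check_bridge_once(int quirk_detected)
{
	static int warned;	/* function-local, keeps its value across calls */

	if (!warned && quirk_detected) {
		warned = 1;
		warnings_printed++;
		printf("quirk detected, reporting it once\n");
		return 1;
	}
	return 0;
}

/* Hypothetical usage: many matching bridges, a single log line. */
void check_bridge_example(void)
{
	int i;

	for (i = 0; i < 9; i++)
		check_bridge_once(1);
	assert(warnings_printed == 1);
}
```

Because the flag is tested first, the (possibly expensive) detection step on the right-hand side of `&&` is skipped entirely after the first hit, as in the patched `acpi_table_parse()` call.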
[PATCH] [30/48] i386: Use per-cpu variables for GDT, PDA
From: Rusty Russell <[EMAIL PROTECTED]> Allocating PDA and GDT at boot is a pain. Using simple per-cpu variables adds happiness (although we need the GDT page-aligned for Xen, which we do in a followup patch). [EMAIL PROTECTED]: build fix] Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/i386/kernel/cpu/common.c| 94 --- arch/i386/kernel/smpboot.c | 21 --- arch/i386/mach-voyager/voyager_smp.c | 10 --- include/asm-generic/percpu.h |1 include/asm-i386/desc.h |1 include/asm-i386/pda.h |7 +- include/asm-i386/processor.h |2 7 files changed, 21 insertions(+), 115 deletions(-) Index: linux/arch/i386/kernel/cpu/common.c === --- linux.orig/arch/i386/kernel/cpu/common.c +++ linux/arch/i386/kernel/cpu/common.c @@ -25,8 +25,10 @@ DEFINE_PER_CPU(struct Xgt_desc_struct, cpu_gdt_descr); EXPORT_PER_CPU_SYMBOL(cpu_gdt_descr); -struct i386_pda *_cpu_pda[NR_CPUS] __read_mostly; -EXPORT_SYMBOL(_cpu_pda); +DEFINE_PER_CPU(struct desc_struct, cpu_gdt[GDT_ENTRIES]); + +DEFINE_PER_CPU(struct i386_pda, _cpu_pda); +EXPORT_PER_CPU_SYMBOL(_cpu_pda); static int cachesize_override __cpuinitdata = -1; static int disable_x86_fxsr __cpuinitdata; @@ -609,52 +611,6 @@ struct pt_regs * __devinit idle_regs(str return regs; } -static __cpuinit int alloc_gdt(int cpu) -{ - struct Xgt_desc_struct *cpu_gdt_descr = &per_cpu(cpu_gdt_descr, cpu); - struct desc_struct *gdt; - struct i386_pda *pda; - - gdt = (struct desc_struct *)cpu_gdt_descr->address; - pda = cpu_pda(cpu); - - /* -* This is a horrible hack to allocate the GDT. 
The problem -* is that cpu_init() is called really early for the boot CPU -* (and hence needs bootmem) but much later for the secondary -* CPUs, when bootmem will have gone away -*/ - if (NODE_DATA(0)->bdata->node_bootmem_map) { - BUG_ON(gdt != NULL || pda != NULL); - - gdt = alloc_bootmem_pages(PAGE_SIZE); - pda = alloc_bootmem(sizeof(*pda)); - /* alloc_bootmem(_pages) panics on failure, so no check */ - - memset(gdt, 0, PAGE_SIZE); - memset(pda, 0, sizeof(*pda)); - } else { - /* GDT and PDA might already have been allocated if - this is a CPU hotplug re-insertion. */ - if (gdt == NULL) - gdt = (struct desc_struct *)get_zeroed_page(GFP_KERNEL); - - if (pda == NULL) - pda = kmalloc_node(sizeof(*pda), GFP_KERNEL, cpu_to_node(cpu)); - - if (unlikely(!gdt || !pda)) { - free_pages((unsigned long)gdt, 0); - kfree(pda); - return 0; - } - } - - cpu_gdt_descr->address = (unsigned long)gdt; - cpu_pda(cpu) = pda; - - return 1; -} - /* Initial PDA used by boot CPU */ struct i386_pda boot_pda = { ._pda = &boot_pda, @@ -670,31 +626,17 @@ static inline void set_kernel_fs(void) asm volatile ("mov %0, %%fs" : : "r" (__KERNEL_PDA) : "memory"); } -/* Initialize the CPU's GDT and PDA. The boot CPU does this for - itself, but secondaries find this done for them. */ -__cpuinit int init_gdt(int cpu, struct task_struct *idle) +/* Initialize the CPU's GDT and PDA. This is either the boot CPU doing itself + (still using cpu_gdt_table), or a CPU doing it for a secondary which + will soon come up. */ +__cpuinit void init_gdt(int cpu, struct task_struct *idle) { struct Xgt_desc_struct *cpu_gdt_descr = &per_cpu(cpu_gdt_descr, cpu); - struct desc_struct *gdt; - struct i386_pda *pda; - - /* For non-boot CPUs, the GDT and PDA should already have been - allocated. 
*/ - if (!alloc_gdt(cpu)) { - printk(KERN_CRIT "CPU%d failed to allocate GDT or PDA\n", cpu); - return 0; - } - - gdt = (struct desc_struct *)cpu_gdt_descr->address; - pda = cpu_pda(cpu); - - BUG_ON(gdt == NULL || pda == NULL); + struct desc_struct *gdt = per_cpu(cpu_gdt, cpu); + struct i386_pda *pda = &per_cpu(_cpu_pda, cpu); - /* -* Initialize the per-CPU GDT with the boot GDT, -* and set up the GDT descriptor: -*/ memcpy(gdt, cpu_gdt_table, GDT_SIZE); + cpu_gdt_descr->address = (unsigned long)gdt; cpu_gdt_descr->size = GDT_SIZE - 1; pack_descriptor((u32 *)&gdt[GDT_ENTRY_PDA].a, @@ -706,17 +648,12 @@ __cpuinit int init_gdt(int cpu, struct t pda->_pda = pda; pda->cpu_number = cpu; pda->pcurrent = idle; - - return 1; } void __cpuinit c
[PATCH] [40/48] x86_64: use lru instead of page->index and page->private for pgd lists management.
From: Christoph Lameter <[EMAIL PROTECTED]> x86_64 currently simulates a list using the index and private fields of the page struct. Seems that the code was inherited from i386. But x86_64 does not use the slab to allocate pgds and pmds etc. So the lru field is not used by the slab and therefore available. This patch uses standard list operations on page->lru to realize pgd tracking. Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/x86_64/mm/fault.c |5 ++--- include/asm-x86_64/pgalloc.h | 14 +++--- include/asm-x86_64/pgtable.h |2 +- 3 files changed, 6 insertions(+), 15 deletions(-) Index: linux/arch/x86_64/mm/fault.c === --- linux.orig/arch/x86_64/mm/fault.c +++ linux/arch/x86_64/mm/fault.c @@ -585,7 +585,7 @@ do_sigbus: } DEFINE_SPINLOCK(pgd_lock); -struct page *pgd_list; +LIST_HEAD(pgd_list); void vmalloc_sync_all(void) { @@ -605,8 +605,7 @@ void vmalloc_sync_all(void) if (pgd_none(*pgd_ref)) continue; spin_lock(&pgd_lock); - for (page = pgd_list; page; -page = (struct page *)page->index) { + list_for_each_entry(page, &pgd_list, lru) { pgd_t *pgd; pgd = (pgd_t *)page_address(page) + pgd_index(address); if (pgd_none(*pgd)) Index: linux/include/asm-x86_64/pgalloc.h === --- linux.orig/include/asm-x86_64/pgalloc.h +++ linux/include/asm-x86_64/pgalloc.h @@ -44,24 +44,16 @@ static inline void pgd_list_add(pgd_t *p struct page *page = virt_to_page(pgd); spin_lock(&pgd_lock); - page->index = (pgoff_t)pgd_list; - if (pgd_list) - pgd_list->private = (unsigned long)&page->index; - pgd_list = page; - page->private = (unsigned long)&pgd_list; + list_add(&page->lru, &pgd_list); spin_unlock(&pgd_lock); } static inline void pgd_list_del(pgd_t *pgd) { - struct page *next, **pprev, *page = virt_to_page(pgd); + struct page *page = virt_to_page(pgd); spin_lock(&pgd_lock); - next = (struct page *)page->index; - pprev = (struct page 
**)page->private; - *pprev = next; - if (next) - next->private = (unsigned long)pprev; + list_del(&page->lru); spin_unlock(&pgd_lock); } Index: linux/include/asm-x86_64/pgtable.h === --- linux.orig/include/asm-x86_64/pgtable.h +++ linux/include/asm-x86_64/pgtable.h @@ -410,7 +410,7 @@ static inline pte_t pte_modify(pte_t pte #define __swp_entry_to_pte(x) ((pte_t) { (x).val }) extern spinlock_t pgd_lock; -extern struct page *pgd_list; +extern struct list_head pgd_list; void vmalloc_sync_all(void); extern int kern_addr_valid(unsigned long addr); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
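For readers unfamiliar with the list idiom this patch switches to, here is a minimal userspace sketch of an intrusive doubly linked list. The list_head, list_add and list_del definitions below are simplified re-implementations written for this example, not the kernel's <linux/list.h>; struct fake_page, count_pages() and pgd_list_example() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry; no traversal and no knowledge of the head needed. */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry->prev = NULL;
}

/* Stand-in for struct page: the link is embedded in the object. */
struct fake_page {
	int id;
	struct list_head lru;
};

static struct list_head pgd_list = LIST_HEAD_INIT(pgd_list);

int count_pages(void)
{
	struct list_head *pos;
	int n = 0;

	for (pos = pgd_list.next; pos != &pgd_list; pos = pos->next)
		n++;
	return n;
}

/* Hypothetical usage: add two pages, drop one, count what is left. */
int pgd_list_example(void)
{
	struct fake_page a = { .id = 1 }, b = { .id = 2 };
	int n;

	list_add(&a.lru, &pgd_list);
	list_add(&b.lru, &pgd_list);	/* b is now the first entry */
	assert(container_of(pgd_list.next, struct fake_page, lru)->id == 2);

	list_del(&a.lru);
	n = count_pages();		/* only b remains */
	list_del(&b.lru);		/* leave the global list empty */
	return n;
}
```

Compared with the open-coded index/private chaining removed by the patch, deletion here needs no special case for the first element, because the head itself is a node in the ring.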
[PATCH] [38/48] x86_64: Remove unused stext symbol
suggested by Jan Beulich

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 arch/x86_64/kernel/head.S |    1 -
 1 file changed, 1 deletion(-)

Index: linux/arch/x86_64/kernel/head.S
===
--- linux.orig/arch/x86_64/kernel/head.S
+++ linux/arch/x86_64/kernel/head.S
@@ -279,7 +279,6 @@ early_idt_ripmsg:
 	.asciz "RIP %s\n"
 
 	.balign PAGE_SIZE
-ENTRY(stext)
 
 #define NEXT_PAGE(name) \
 	.balign	PAGE_SIZE; \
[PATCH] [39/48] i386: remove the APM_RTC_IS_GMT config option.
From: "Parag Warudkar" <[EMAIL PROTECTED]>

Signed-off-by: Parag Warudkar <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---

 arch/i386/Kconfig |   13 -------------
 1 file changed, 13 deletions(-)

Index: linux/arch/i386/Kconfig
===
--- linux.orig/arch/i386/Kconfig
+++ linux/arch/i386/Kconfig
@@ -1029,19 +1029,6 @@ config APM_DISPLAY_BLANK
 	  backlight at all, or it might print a lot of errors to the console,
 	  especially if you are using gpm.
 
-config APM_RTC_IS_GMT
-	bool "RTC stores time in GMT"
-	depends on APM
-	help
-	  Say Y here if your RTC (Real Time Clock a.k.a. hardware clock)
-	  stores the time in GMT (Greenwich Mean Time). Say N if your RTC
-	  stores localtime.
-
-	  It is in fact recommended to store GMT in your RTC, because then you
-	  don't have to worry about daylight savings time changes. The only
-	  reason not to use GMT in your RTC is if you also run a broken OS
-	  that doesn't understand GMT.
-
 config APM_ALLOW_INTS
 	bool "Allow interrupts during APM BIOS calls"
 	depends on APM
[PATCH] [45/48] x86_64: Inhibit machine from asserting an NMI when doing Alt-SysRq-M operation.
From: Konrad Rzeszutek <[EMAIL PROTECTED]>

This patch touches the NMI watchdog every MAX_ORDER_NR_PAGES to inhibit
the machine from triggering an NMI while the CPUs are locked. This
situation happens on boxes with more than 64 CPUs and 128GB of RAM when
Alt-SysRq-m is performed.

It has been successfully tested for regression on uni, 2, 4, 8, 32, and
64 CPU boxes with various memory configurations.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 arch/x86_64/mm/init.c |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux/arch/x86_64/mm/init.c
===
--- linux.orig/arch/x86_64/mm/init.c
+++ linux/arch/x86_64/mm/init.c
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -73,6 +74,11 @@ void show_mem(void)
 
 	for_each_online_pgdat(pgdat) {
 		for (i = 0; i < pgdat->node_spanned_pages; ++i) {
+			/* this loop can take a while with 256 GB and 4k pages
+			   so update the NMI watchdog */
+			if (unlikely(i % MAX_ORDER_NR_PAGES == 0)) {
+				touch_nmi_watchdog();
+			}
 			page = pfn_to_page(pgdat->node_start_pfn + i);
 			total++;
 			if (PageReserved(page))
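The idea in this patch, poking a watchdog at a fixed stride inside a long loop, can be sketched in plain C. Everything below is an illustrative stand-in: TOUCH_INTERVAL plays the role of MAX_ORDER_NR_PAGES (1024 is an assumed value), and touch_watchdog() merely counts calls instead of resetting a real NMI watchdog.

```c
#include <assert.h>

#define TOUCH_INTERVAL 1024UL	/* assumed stand-in for MAX_ORDER_NR_PAGES */

static unsigned long watchdog_touches;

/* Stand-in for touch_nmi_watchdog(): only records that it was called. */
static void touch_watchdog(void)
{
	watchdog_touches++;
}

/* Walk nr_pages items, touching the watchdog once per interval. */
unsigned long scan_pages(unsigned long nr_pages)
{
	unsigned long i, total = 0;

	for (i = 0; i < nr_pages; ++i) {
		if (i % TOUCH_INTERVAL == 0)
			touch_watchdog();
		total++;	/* placeholder for the per-page work */
	}
	return total;
}

/* Hypothetical usage: 4096 iterations touch at i = 0, 1024, 2048, 3072. */
void scan_pages_example(void)
{
	watchdog_touches = 0;
	assert(scan_pages(4096) == 4096);
	assert(watchdog_touches == 4);
}
```

Touching on a stride rather than on every iteration keeps the overhead of the check negligible while still bounding the time between touches.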
[PATCH] [36/48] i386: get rid of unused variables
From: Parag Warudkar <[EMAIL PROTECTED]>

Signed-off-by: Parag Warudkar <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---

 arch/i386/kernel/apm.c |    7 -------
 1 file changed, 7 deletions(-)

Index: linux/arch/i386/kernel/apm.c
===
--- linux.orig/arch/i386/kernel/apm.c
+++ linux/arch/i386/kernel/apm.c
@@ -384,13 +384,6 @@ static int ignore_sys_suspend;
 static int			ignore_normal_resume;
 static int			bounce_interval __read_mostly = DEFAULT_BOUNCE_INTERVAL;
 
-#ifdef CONFIG_APM_RTC_IS_GMT
-#	define	clock_cmos_diff	0
-#	define	got_clock_diff	1
-#else
-static long			clock_cmos_diff;
-static int			got_clock_diff;
-#endif
 static int			debug __read_mostly;
 static int			smp __read_mostly;
 static int			apm_disabled = -1;
[PATCH] [43/48] x86_64: fix vtime() vsyscall
From: Eric Dumazet <[EMAIL PROTECTED]>

There is a tiny probability that the return value from vtime(time_t *t) is
different than the value stored in *t.

Using a temporary variable solves the problem and gives faster code.

  17:	48 85 ff             	test   %rdi,%rdi
  1a:	48 8b 05 00 00 00 00 	mov    0(%rip),%rax        # __vsyscall_gtod_data.wall_time_tv.tv_sec
  21:	74 03                	je     26
  23:	48 89 07             	mov    %rax,(%rdi)
  26:	c9                   	leaveq
  27:	c3                   	retq

Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 arch/x86_64/kernel/vsyscall.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Index: linux/arch/x86_64/kernel/vsyscall.c
===
--- linux.orig/arch/x86_64/kernel/vsyscall.c
+++ linux/arch/x86_64/kernel/vsyscall.c
@@ -156,11 +156,13 @@ int __vsyscall(0) vgettimeofday(struct t
  * unlikely */
 time_t __vsyscall(1) vtime(time_t *t)
 {
+	time_t result;
 	if (unlikely(!__vsyscall_gtod_data.sysctl_enabled))
 		return time_syscall(t);
-	else if (t)
-		*t = __vsyscall_gtod_data.wall_time_tv.tv_sec;
-	return __vsyscall_gtod_data.wall_time_tv.tv_sec;
+	result = __vsyscall_gtod_data.wall_time_tv.tv_sec;
+	if (t)
+		*t = result;
+	return result;
 }
 
 /* Fast way to get current CPU and node.
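The fix in this patch boils down to reading the shared seconds value exactly once. A minimal userspace sketch of that pattern follows; shared_seconds stands in for __vsyscall_gtod_data.wall_time_tv.tv_sec, and vtime_sketch() and vtime_example() are hypothetical names, not the kernel's vsyscall.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Stand-in for the timekeeping value that another context may update. */
static volatile time_t shared_seconds;

/*
 * Read the shared value once into a local. Both *t and the return value
 * come from that single read, so they cannot disagree even if
 * shared_seconds changes while the function runs.
 */
time_t vtime_sketch(time_t *t)
{
	time_t result = shared_seconds;

	if (t)
		*t = result;
	return result;
}

/* Hypothetical usage. */
void vtime_example(void)
{
	time_t stored = 0;

	shared_seconds = 1177840000;
	assert(vtime_sketch(&stored) == stored);
	assert(vtime_sketch(NULL) == 1177840000);
}
```

In the pre-patch code the value was loaded twice, once for the store to *t and once for the return, so an update landing between the two loads could make them differ.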
[PATCH] [32/48] i386: clean up cpu_init()
From: Rusty Russell <[EMAIL PROTECTED]> We now have cpu_init() and secondary_cpu_init() doing nothing but calling _cpu_init() with the same arguments. Rename _cpu_init() to cpu_init() and use it as a replcement for secondary_cpu_init(). Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/i386/kernel/cpu/common.c | 34 +- arch/i386/kernel/smpboot.c|8 include/asm-i386/processor.h |2 +- 3 files changed, 14 insertions(+), 30 deletions(-) Index: linux/arch/i386/kernel/cpu/common.c === --- linux.orig/arch/i386/kernel/cpu/common.c +++ linux/arch/i386/kernel/cpu/common.c @@ -644,9 +644,16 @@ struct i386_pda boot_pda = { .pcurrent = &init_task, }; -/* Common CPU init for both boot and secondary CPUs */ -static void __cpuinit _cpu_init(int cpu, struct task_struct *curr) +/* + * cpu_init() initializes state that is per-CPU. Some data is already + * initialized (naturally) in the bootstrap process, such as the GDT + * and IDT. We reload them nevertheless, this function acts as a + * 'CPU state barrier', nothing should get across. + */ +void __cpuinit cpu_init(void) { + int cpu = smp_processor_id(); + struct task_struct *curr = current; struct tss_struct * t = &per_cpu(init_tss, cpu); struct thread_struct *thread = &curr->thread; @@ -706,29 +713,6 @@ static void __cpuinit _cpu_init(int cpu, mxcsr_feature_mask_init(); } -/* Entrypoint to initialize secondary CPU */ -void __cpuinit secondary_cpu_init(void) -{ - int cpu = smp_processor_id(); - struct task_struct *curr = current; - - _cpu_init(cpu, curr); -} - -/* - * cpu_init() initializes state that is per-CPU. Some data is already - * initialized (naturally) in the bootstrap process, such as the GDT - * and IDT. We reload them nevertheless, this function acts as a - * 'CPU state barrier', nothing should get across. 
- */ -void __cpuinit cpu_init(void) -{ - int cpu = smp_processor_id(); - struct task_struct *curr = current; - - _cpu_init(cpu, curr); -} - #ifdef CONFIG_HOTPLUG_CPU void __cpuinit cpu_uninit(void) { Index: linux/arch/i386/kernel/smpboot.c === --- linux.orig/arch/i386/kernel/smpboot.c +++ linux/arch/i386/kernel/smpboot.c @@ -378,14 +378,14 @@ set_cpu_sibling_map(int cpu) static void __cpuinit start_secondary(void *unused) { /* -* Don't put *anything* before secondary_cpu_init(), SMP -* booting is too fragile that we want to limit the -* things done here to the most necessary things. +* Don't put *anything* before cpu_init(), SMP booting is too +* fragile that we want to limit the things done here to the +* most necessary things. */ #ifdef CONFIG_VMI vmi_bringup(); #endif - secondary_cpu_init(); + cpu_init(); preempt_disable(); smp_callin(); while (!cpu_isset(smp_processor_id(), smp_commenced_mask)) Index: linux/include/asm-i386/processor.h === --- linux.orig/include/asm-i386/processor.h +++ linux/include/asm-i386/processor.h @@ -744,6 +744,6 @@ extern void enable_sep_cpu(void); extern int sysenter_setup(void); extern void cpu_set_gdt(int); -extern void secondary_cpu_init(void); +extern void cpu_init(void); #endif /* __ASM_I386_PROCESSOR_H */ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] [33/48] i386: Rename boot_gdt_table to boot_gdt
From: Sebastien Dugue <[EMAIL PROTECTED]> Rename boot_gdt_table to boot_gdt to avoid the duplicate T(able). Signed-off-by: Sebastien Dugue <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Acked-by: Rusty Russell <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/i386/kernel/head.S |9 - arch/i386/kernel/trampoline.S | 12 ++-- 2 files changed, 10 insertions(+), 11 deletions(-) Index: linux/arch/i386/kernel/head.S === --- linux.orig/arch/i386/kernel/head.S +++ linux/arch/i386/kernel/head.S @@ -147,8 +147,7 @@ page_pde_offset = (__PAGE_OFFSET >> 20); /* * Non-boot CPU entry point; entered from trampoline.S * We can't lgdt here, because lgdt itself uses a data segment, but - * we know the trampoline has already loaded the boot_gdt_table GDT - * for us. + * we know the trampoline has already loaded the boot_gdt for us. * * If cpu hotplug is not supported then this code can go in init section * which will be freed later @@ -588,7 +587,7 @@ fault_msg: .word 0 # 32 bit align gdt_desc.address boot_gdt_descr: .word __BOOT_DS+7 - .long boot_gdt_table - __PAGE_OFFSET + .long boot_gdt - __PAGE_OFFSET .word 0 # 32-bit align idt_desc.address idt_descr: @@ -602,11 +601,11 @@ ENTRY(early_gdt_descr) .long per_cpu__cpu_gdt /* Overwritten for secondary CPUs */ /* - * The boot_gdt_table must mirror the equivalent in setup.S and is + * The boot_gdt must mirror the equivalent in setup.S and is * used only for booting. */ .align L1_CACHE_BYTES -ENTRY(boot_gdt_table) +ENTRY(boot_gdt) .fill GDT_ENTRY_BOOT_CS,8,0 .quad 0x00cf9a00/* kernel 4GB code at 0x */ .quad 0x00cf9200/* kernel 4GB data at 0x */ Index: linux/arch/i386/kernel/trampoline.S === --- linux.orig/arch/i386/kernel/trampoline.S +++ linux/arch/i386/kernel/trampoline.S @@ -29,7 +29,7 @@ * * TYPE VALUE * R_386_32 startup_32_smp - * R_386_32 boot_gdt_table + * R_386_32 boot_gdt */ #include @@ -62,8 +62,8 @@ r_base = . * to 32 bit. 
*/ - lidtl boot_idt - r_base # load idt with 0, 0 - lgdtl boot_gdt - r_base # load gdt with whatever is appropriate + lidtl boot_idt_descr - r_base # load idt with 0, 0 + lgdtl boot_gdt_descr - r_base # load gdt with whatever is appropriate xor %ax, %ax inc %ax # protected mode (PE) bit @@ -73,11 +73,11 @@ r_base = . # These need to be in the same 64K segment as the above; # hence we don't use the boot_gdt_descr defined in head.S -boot_gdt: +boot_gdt_descr: .word __BOOT_DS + 7 # gdt limit - .long boot_gdt_table-__PAGE_OFFSET# gdt base + .long boot_gdt - __PAGE_OFFSET# gdt base -boot_idt: +boot_idt_descr: .word 0 # idt limit = 0 .long 0 # idt base = 0L - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] [31/48] i386: Use per-cpu GDT immediately upon boot
From: Rusty Russell <[EMAIL PROTECTED]> Now we are no longer dynamically allocating the GDT, we don't need the "cpu_gdt_table" at all: we can switch straight from "boot_gdt_table" to the per-cpu GDT. This means initializing the cpu_gdt array in C. The boot CPU uses the per-cpu var directly, then in smp_prepare_cpus() it switches to the per-cpu copy just allocated. For secondary CPUs, the early_gdt_descr is set to point directly to their per-cpu copy. For UP the code is very simple: it keeps using the "per-cpu" GDT as per SMP, but we never have to move. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/i386/kernel/cpu/common.c| 72 +-- arch/i386/kernel/head.S | 55 -- arch/i386/kernel/smpboot.c | 59 ++-- arch/i386/mach-voyager/voyager_smp.c |6 -- include/asm-i386/desc.h |2 include/asm-i386/processor.h |1 6 files changed, 75 insertions(+), 120 deletions(-) Index: linux/arch/i386/kernel/cpu/common.c === --- linux.orig/arch/i386/kernel/cpu/common.c +++ linux/arch/i386/kernel/cpu/common.c @@ -25,7 +25,33 @@ DEFINE_PER_CPU(struct Xgt_desc_struct, cpu_gdt_descr); EXPORT_PER_CPU_SYMBOL(cpu_gdt_descr); -DEFINE_PER_CPU(struct desc_struct, cpu_gdt[GDT_ENTRIES]); +DEFINE_PER_CPU(struct desc_struct, cpu_gdt[GDT_ENTRIES]) = { + [GDT_ENTRY_KERNEL_CS] = { 0x, 0x00cf9a00 }, + [GDT_ENTRY_KERNEL_DS] = { 0x, 0x00cf9200 }, + [GDT_ENTRY_DEFAULT_USER_CS] = { 0x, 0x00cffa00 }, + [GDT_ENTRY_DEFAULT_USER_DS] = { 0x, 0x00cff200 }, + /* +* Segments used for calling PnP BIOS have byte granularity. +* They code segments and data segments have fixed 64k limits, +* the transfer segment sizes are set at run time. 
+*/ + [GDT_ENTRY_PNPBIOS_CS32] = { 0x, 0x00409a00 },/* 32-bit code */ + [GDT_ENTRY_PNPBIOS_CS16] = { 0x, 0x9a00 },/* 16-bit code */ + [GDT_ENTRY_PNPBIOS_DS] = { 0x, 0x9200 }, /* 16-bit data */ + [GDT_ENTRY_PNPBIOS_TS1] = { 0x, 0x9200 },/* 16-bit data */ + [GDT_ENTRY_PNPBIOS_TS2] = { 0x, 0x9200 },/* 16-bit data */ + /* +* The APM segments have byte granularity and their bases +* are set at run time. All have 64k limits. +*/ + [GDT_ENTRY_APMBIOS_BASE] = { 0x, 0x00409a00 },/* 32-bit code */ + /* 16-bit code */ + [GDT_ENTRY_APMBIOS_BASE+1] = { 0x, 0x9a00 }, + [GDT_ENTRY_APMBIOS_BASE+2] = { 0x, 0x00409200 }, /* data */ + + [GDT_ENTRY_ESPFIX_SS] = { 0x, 0x00c09200 }, + [GDT_ENTRY_PDA] = { 0x, 0x00c09200 }, /* set in setup_pda */ +}; DEFINE_PER_CPU(struct i386_pda, _cpu_pda); EXPORT_PER_CPU_SYMBOL(_cpu_pda); @@ -618,46 +644,6 @@ struct i386_pda boot_pda = { .pcurrent = &init_task, }; -static inline void set_kernel_fs(void) -{ - /* Set %fs for this CPU's PDA. Memory clobber is to create a - barrier with respect to any PDA operations, so the compiler - doesn't move any before here. */ - asm volatile ("mov %0, %%fs" : : "r" (__KERNEL_PDA) : "memory"); -} - -/* Initialize the CPU's GDT and PDA. This is either the boot CPU doing itself - (still using cpu_gdt_table), or a CPU doing it for a secondary which - will soon come up. 
*/ -__cpuinit void init_gdt(int cpu, struct task_struct *idle) -{ - struct Xgt_desc_struct *cpu_gdt_descr = &per_cpu(cpu_gdt_descr, cpu); - struct desc_struct *gdt = per_cpu(cpu_gdt, cpu); - struct i386_pda *pda = &per_cpu(_cpu_pda, cpu); - - memcpy(gdt, cpu_gdt_table, GDT_SIZE); - cpu_gdt_descr->address = (unsigned long)gdt; - cpu_gdt_descr->size = GDT_SIZE - 1; - - pack_descriptor((u32 *)&gdt[GDT_ENTRY_PDA].a, - (u32 *)&gdt[GDT_ENTRY_PDA].b, - (unsigned long)pda, sizeof(*pda) - 1, - 0x80 | DESCTYPE_S | 0x2, 0); /* present read-write data segment */ - - memset(pda, 0, sizeof(*pda)); - pda->_pda = pda; - pda->cpu_number = cpu; - pda->pcurrent = idle; -} - -void __cpuinit cpu_set_gdt(int cpu) -{ - struct Xgt_desc_struct *cpu_gdt_descr = &per_cpu(cpu_gdt_descr, cpu); - - load_gdt(cpu_gdt_descr); - set_kernel_fs(); -} - /* Common CPU init for both boot and secondary CPUs */ static void __cpuinit _cpu_init(int cpu, struct task_struct *curr) { @@ -740,10 +726,6 @@ void __cpuinit cpu_init(void) int cpu = smp_processor_id(); struct task_struct *curr = current; - /* Set up the real GDT and PDA, so we can transition from the - boot_gdt_table & b
[PATCH] [48/48] i386: cleanup GDT Access
From: Rusty Russell <[EMAIL PROTECTED]> Now we have an explicit per-cpu GDT variable, we don't need to keep the descriptors around to use them to find the GDT: expose cpu_gdt directly. We could go further and make load_gdt() pack the descriptor for us, or even assume it means "load the current cpu's GDT" which is what it always does. Signed-off-by: Rusty Russell <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/i386/kernel/cpu/common.c |4 +--- arch/i386/kernel/efi.c| 16 arch/i386/kernel/entry.S |3 +-- arch/i386/kernel/smpboot.c| 12 ++-- arch/i386/kernel/traps.c |4 +--- include/asm-i386/desc.h |7 ++- 6 files changed, 19 insertions(+), 27 deletions(-) Index: linux/arch/i386/kernel/cpu/common.c === --- linux.orig/arch/i386/kernel/cpu/common.c +++ linux/arch/i386/kernel/cpu/common.c @@ -22,9 +22,6 @@ #include "cpu.h" -DEFINE_PER_CPU(struct Xgt_desc_struct, cpu_gdt_descr); -EXPORT_PER_CPU_SYMBOL(cpu_gdt_descr); - DEFINE_PER_CPU(struct desc_struct, cpu_gdt[GDT_ENTRIES]) = { [GDT_ENTRY_KERNEL_CS] = { 0x, 0x00cf9a00 }, [GDT_ENTRY_KERNEL_DS] = { 0x, 0x00cf9200 }, @@ -52,6 +49,7 @@ DEFINE_PER_CPU(struct desc_struct, cpu_g [GDT_ENTRY_ESPFIX_SS] = { 0x, 0x00c09200 }, [GDT_ENTRY_PDA] = { 0x, 0x00c09200 }, /* set in setup_pda */ }; +EXPORT_PER_CPU_SYMBOL_GPL(cpu_gdt); DEFINE_PER_CPU(struct i386_pda, _cpu_pda); EXPORT_PER_CPU_SYMBOL(_cpu_pda); Index: linux/arch/i386/kernel/efi.c === --- linux.orig/arch/i386/kernel/efi.c +++ linux/arch/i386/kernel/efi.c @@ -69,13 +69,11 @@ static void efi_call_phys_prelog(void) _ { unsigned long cr4; unsigned long temp; - struct Xgt_desc_struct *cpu_gdt_descr; + struct Xgt_desc_struct gdt_descr; spin_lock(&efi_rt_lock); local_irq_save(efi_rt_eflags); - cpu_gdt_descr = &per_cpu(cpu_gdt_descr, 0); - /* * If I don't have PSE, I should just duplicate two entries in page * directory. 
If I have PSE, I just need to duplicate one entry in @@ -105,17 +103,19 @@ static void efi_call_phys_prelog(void) _ */ local_flush_tlb(); - cpu_gdt_descr->address = __pa(cpu_gdt_descr->address); - load_gdt(cpu_gdt_descr); + gdt_descr.address = __pa(get_cpu_gdt_table(0)); + gdt_descr.size = GDT_SIZE - 1; + load_gdt(&gdt_descr); } static void efi_call_phys_epilog(void) __releases(efi_rt_lock) { unsigned long cr4; - struct Xgt_desc_struct *cpu_gdt_descr = &per_cpu(cpu_gdt_descr, 0); + struct Xgt_desc_struct gdt_descr; - cpu_gdt_descr->address = (unsigned long)__va(cpu_gdt_descr->address); - load_gdt(cpu_gdt_descr); + gdt_descr.address = (unsigned long)get_cpu_gdt_table(0); + gdt_descr.size = GDT_SIZE - 1; + load_gdt(&gdt_descr); cr4 = read_cr4(); Index: linux/arch/i386/kernel/entry.S === --- linux.orig/arch/i386/kernel/entry.S +++ linux/arch/i386/kernel/entry.S @@ -561,8 +561,7 @@ END(syscall_badsys) #define FIXUP_ESPFIX_STACK \ /* since we are on a wrong stack, we cant make it a C code :( */ \ movl %fs:PDA_cpu, %ebx; \ - PER_CPU(cpu_gdt_descr, %ebx); \ - movl GDS_address(%ebx), %ebx; \ + PER_CPU(cpu_gdt, %ebx); \ GET_DESC_BASE(GDT_ENTRY_ESPFIX_SS, %ebx, %eax, %ax, %al, %ah); \ addl %esp, %eax; \ pushl $__KERNEL_DS; \ Index: linux/arch/i386/kernel/smpboot.c === --- linux.orig/arch/i386/kernel/smpboot.c +++ linux/arch/i386/kernel/smpboot.c @@ -786,13 +786,9 @@ static inline struct task_struct * alloc secondary which will soon come up. 
*/ static __cpuinit void init_gdt(int cpu, struct task_struct *idle) { - struct Xgt_desc_struct *cpu_gdt_descr = &per_cpu(cpu_gdt_descr, cpu); - struct desc_struct *gdt = per_cpu(cpu_gdt, cpu); + struct desc_struct *gdt = get_cpu_gdt_table(cpu); struct i386_pda *pda = &per_cpu(_cpu_pda, cpu); - cpu_gdt_descr->address = (unsigned long)gdt; - cpu_gdt_descr->size = GDT_SIZE - 1; - pack_descriptor((u32 *)&gdt[GDT_ENTRY_PDA].a, (u32 *)&gdt[GDT_ENTRY_PDA].b, (unsigned long)pda, sizeof(*pda) - 1, @@ -1187,7 +1183,11 @@ void __init smp_prepare_cpus(unsigned in * it's on the real one. */ static inline void switch_to_new_gdt(void) { - load_gdt(&per_cpu(cpu_gdt_descr, smp_processor_id())); + struct Xgt_desc_struct gdt_descr; + + gdt_descr.address = (long)get_cpu_gdt_table(smp_processor_id()); + gdt
Re: Probable PCIE prob
Syren Baran <[EMAIL PROTECTED]> writes:

> i got a problem with the combination of an Asrock AM2NF4G-SATA2
> mainboard with a Radeon X1900 (chip 1002,724b) graphics
> card. /i386/pci/mmconfig.c reports a buggy bios (e000 is not
> E820-reserved).

This message is harmless and likely unrelated.

> System crashes only happen when viewing films (neither
> xine nor mplayer run with root privileges) and independent of video
> drivers (framebuffer, vesa and fglrx). Logs dont show any anomalies
> before crashing. Anybody got a clue?

Sounds like some sort of hardware problem. Maybe the power supply is
not up to the task? Or the board could be broken. Also always worth
updating the BIOS.

-Andi
[PATCH] [24/48] i386: Initialize esp0 properly all the time
From: Rusty Russell <[EMAIL PROTECTED]>

Whenever we schedule, __switch_to calls load_esp0 which does:

	tss->esp0 = thread->esp0;

This is never initialized for the initial thread (ie "swapper"), so when
we're scheduling that, we end up setting esp0 to 0.  This is fine: the
swapper never leaves ring 0, so this field is never used.

lguest, however, gets upset that we're trying to use an unmapped page
as our kernel stack.  Rather than work around it there, let's
initialize it.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 include/asm-i386/processor.h |    1 +
 1 file changed, 1 insertion(+)

Index: linux/include/asm-i386/processor.h
===================================================================
--- linux.orig/include/asm-i386/processor.h
+++ linux/include/asm-i386/processor.h
@@ -421,6 +421,7 @@ struct thread_struct {
 };
 
 #define INIT_THREAD  {							\
+	.esp0 = sizeof(init_stack) + (long)&init_stack,			\
 	.vm86_info = NULL,						\
 	.sysenter_cs = __KERNEL_CS,					\
 	.io_bitmap_ptr = NULL,						\
[PATCH] [25/48] x86_64: Introduce load_TLS to the "for" loop.
From: Rusty Russell <[EMAIL PROTECTED]>

GCC (4.1 at least) unrolls it anyway, but I can't believe this code
was ever justifiable.  (I've also submitted a patch which cleans up
i386, which is even uglier).

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 include/asm-x86_64/desc.h |   11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

Index: linux/include/asm-x86_64/desc.h
===================================================================
--- linux.orig/include/asm-x86_64/desc.h
+++ linux/include/asm-x86_64/desc.h
@@ -135,16 +135,13 @@ static inline void set_ldt_desc(unsigned
 	(info)->useable		== 0	&& \
 	(info)->lm		== 0)
 
-#if TLS_SIZE != 24
-# error update this code.
-#endif
-
 static inline void load_TLS(struct thread_struct *t, unsigned int cpu)
 {
+	unsigned int i;
 	u64 *gdt = (u64 *)(cpu_gdt(cpu) + GDT_ENTRY_TLS_MIN);
-	gdt[0] = t->tls_array[0];
-	gdt[1] = t->tls_array[1];
-	gdt[2] = t->tls_array[2];
+
+	for (i = 0; i < GDT_ENTRY_TLS_ENTRIES; i++)
+		gdt[i] = t->tls_array[i];
 }
 
 /*
[PATCH] [47/48] x86_64: Fix "Section mismatch" compile warning
From: Bernhard Walle <[EMAIL PROTECTED]>

Fix "Section mismatch" warnings in arch/x86_64/kernel/time.c

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/time.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: linux/arch/x86_64/kernel/time.c
===================================================================
--- linux.orig/arch/x86_64/kernel/time.c
+++ linux/arch/x86_64/kernel/time.c
@@ -328,7 +328,7 @@ static unsigned int __init pit_calibrate
 #define PIT_MODE 0x43
 #define PIT_CH0  0x40
 
-static void __init __pit_init(int val, u8 mode)
+static void __pit_init(int val, u8 mode)
 {
 	unsigned long flags;
 
@@ -344,12 +344,12 @@ void __init pit_init(void)
 	__pit_init(LATCH, 0x34); /* binary, mode 2, LSB/MSB, ch 0 */
 }
 
-void __init pit_stop_interrupt(void)
+void pit_stop_interrupt(void)
 {
 	__pit_init(0, 0x30); /* mode 0 */
 }
 
-void __init stop_timer_interrupt(void)
+void stop_timer_interrupt(void)
 {
 	char *name;
 	if (hpet_address) {
[PATCH] [44/48] x86_64: vsyscall_gtod_data diet and vgettimeofday() fix
From: Eric Dumazet <[EMAIL PROTECTED]>

Current vsyscall_gtod_data is large (3 or 4 cache lines dirtied at timer
interrupt). We can shrink it to exactly 64 bytes (1 cache line on AMD64).

Instead of copying a whole struct clocksource, we copy only needed fields.

I deleted an unused field: offset_base

This patch fixes one oddity in vgettimeofday(): it can return a timeval with
tv_usec = 1000000. Maybe not a bug, but why not do the right thing?

Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---

 arch/x86_64/kernel/vsyscall.c |   53 ++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 17 deletions(-)

Index: linux/arch/x86_64/kernel/vsyscall.c
===================================================================
--- linux.orig/arch/x86_64/kernel/vsyscall.c
+++ linux/arch/x86_64/kernel/vsyscall.c
@@ -51,13 +51,28 @@
 	asm("" : "=r" (v) : "0" (x)); \
 	((v - VSYSCALL_FIRST_PAGE) + __pa_symbol(&__vsyscall_0)); })
 
+/*
+ * vsyscall_gtod_data contains data that is :
+ * - readonly from vsyscalls
+ * - writen by timer interrupt or systcl (/proc/sys/kernel/vsyscall64)
+ * Try to keep this structure as small as possible to avoid cache line ping pongs
+ */
 struct vsyscall_gtod_data_t {
-	seqlock_t lock;
-	int sysctl_enabled;
-	struct timeval wall_time_tv;
+	seqlock_t	lock;
+
+	/* open coded 'struct timespec' */
+	time_t		wall_time_sec;
+	u32		wall_time_nsec;
+
+	int		sysctl_enabled;
 	struct timezone sys_tz;
-	cycle_t offset_base;
-	struct clocksource clock;
+	struct { /* extract of a clocksource struct */
+		cycle_t (*vread)(void);
+		cycle_t	cycle_last;
+		cycle_t	mask;
+		u32	mult;
+		u32	shift;
+	} clock;
 };
 int __vgetcpu_mode __section_vgetcpu_mode;
 
@@ -73,9 +88,13 @@ void update_vsyscall(struct timespec *wa
 
 	write_seqlock_irqsave(&vsyscall_gtod_data.lock, flags);
 	/* copy vsyscall data */
-	vsyscall_gtod_data.clock = *clock;
-	vsyscall_gtod_data.wall_time_tv.tv_sec = wall_time->tv_sec;
-	vsyscall_gtod_data.wall_time_tv.tv_usec = wall_time->tv_nsec/1000;
+	vsyscall_gtod_data.clock.vread = clock->vread;
+	vsyscall_gtod_data.clock.cycle_last = clock->cycle_last;
+	vsyscall_gtod_data.clock.mask = clock->mask;
+	vsyscall_gtod_data.clock.mult = clock->mult;
+	vsyscall_gtod_data.clock.shift = clock->shift;
+	vsyscall_gtod_data.wall_time_sec = wall_time->tv_sec;
+	vsyscall_gtod_data.wall_time_nsec = wall_time->tv_nsec;
 	vsyscall_gtod_data.sys_tz = sys_tz;
 	write_sequnlock_irqrestore(&vsyscall_gtod_data.lock, flags);
 }
@@ -110,7 +129,8 @@ static __always_inline long time_syscall
 static __always_inline void do_vgettimeofday(struct timeval * tv)
 {
 	cycle_t now, base, mask, cycle_delta;
-	unsigned long seq, mult, shift, nsec_delta;
+	unsigned seq;
+	unsigned long mult, shift, nsec;
 	cycle_t (*vread)(void);
 	do {
 		seq = read_seqbegin(&__vsyscall_gtod_data.lock);
@@ -126,21 +146,20 @@ static __always_inline void do_vgettimeo
 		mult = __vsyscall_gtod_data.clock.mult;
 		shift = __vsyscall_gtod_data.clock.shift;
 
-		*tv = __vsyscall_gtod_data.wall_time_tv;
-
+		tv->tv_sec = __vsyscall_gtod_data.wall_time_sec;
+		nsec = __vsyscall_gtod_data.wall_time_nsec;
 	} while (read_seqretry(&__vsyscall_gtod_data.lock, seq));
 
 	/* calculate interval: */
 	cycle_delta = (now - base) & mask;
 	/* convert to nsecs: */
-	nsec_delta = (cycle_delta * mult) >> shift;
+	nsec += (cycle_delta * mult) >> shift;
 
-	/* convert to usecs and add to timespec: */
-	tv->tv_usec += nsec_delta / NSEC_PER_USEC;
-	while (tv->tv_usec > USEC_PER_SEC) {
+	while (nsec >= NSEC_PER_SEC) {
 		tv->tv_sec += 1;
-		tv->tv_usec -= USEC_PER_SEC;
+		nsec -= NSEC_PER_SEC;
 	}
+	tv->tv_usec = nsec / NSEC_PER_USEC;
 }
 
 int __vsyscall(0) vgettimeofday(struct timeval * tv, struct timezone * tz)
@@ -159,7 +178,7 @@ time_t __vsyscall(1) vtime(time_t *t)
 	time_t result;
 	if (unlikely(!__vsyscall_gtod_data.sysctl_enabled))
 		return time_syscall(t);
-	result = __vsyscall_gtod_data.wall_time_tv.tv_sec;
+	result = __vsyscall_gtod_data.wall_time_sec;
 	if (t)
 		*t = result;
 	return result;
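
For readers following the tv_usec part of this patch, here is a minimal
userspace sketch of the normalization it switches to. It is not the kernel
code: struct tv_sketch and wall_plus_delta() are hypothetical names made up
for illustration. The old loop compared microseconds with ">", so a value of
exactly one second's worth of microseconds slipped through; accumulating in
nanoseconds and comparing with ">=" closes that gap.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000UL
#define NSEC_PER_USEC 1000UL

/* Hypothetical stand-in for struct timeval, so no OS headers are needed. */
struct tv_sketch {
	long sec;
	long usec;
};

/*
 * Add a nanosecond delta to a (sec, nsec) wall time and return the sum
 * as seconds plus microseconds.  Normalizing in nanoseconds with ">="
 * guarantees 0 <= usec < 1000000 on return.
 */
static struct tv_sketch wall_plus_delta(long wall_sec, uint32_t wall_nsec,
					uint64_t nsec_delta)
{
	struct tv_sketch tv = { wall_sec, 0 };
	uint64_t nsec = (uint64_t)wall_nsec + nsec_delta;

	while (nsec >= NSEC_PER_SEC) {
		tv.sec += 1;
		nsec -= NSEC_PER_SEC;
	}
	/* Convert only once, after the carry into seconds is done. */
	tv.usec = (long)(nsec / NSEC_PER_USEC);
	return tv;
}
```

Calling wall_plus_delta(10, 999999000, 1000) lands exactly on a second
boundary and carries into the seconds field, the case the old ">"
comparison would have left in the microseconds field.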
[PATCH] [46/48] x86_64: adjust EDID retrieval
From: "Jan Beulich" <[EMAIL PROTECTED]>

commit 5e518d7672dea4cd7c60871e40d0490c52f01d13 did the same change to
i386's variant.

With this change, i386's and x86-64's versions are identical, raising
the question whether the x86-64 one should go (just like there's only
one instance of edd.S).

Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---

 arch/x86_64/boot/video.S |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/arch/x86_64/boot/video.S
===================================================================
--- linux.orig/arch/x86_64/boot/video.S
+++ linux/arch/x86_64/boot/video.S
@@ -1977,7 +1977,7 @@ store_edid:
 	movw	$0x4f15, %ax			# do VBE/DDC
 	movw	$0x01, %bx
 	movw	$0x00, %cx
-	movw	$0x01, %dx
+	movw	$0x00, %dx
 	movw	$0x140, %di
 	int	$0x10
 
Re: [patch] CFS scheduler, -v6
On Sunday 29 April 2007 20:30, Thomas Gleixner wrote:
> As a sidenote: I really wonder if anybody noticed yet, that the whole
> CFS / SD comparison is so ridiculous, that it is not even funny anymore.
> CFS modifies the scheduler and nothing else, SD fiddles all over the
> kernel in interesting ways.

This is a WTF if ever I saw one.

--
-ck
[PATCH] [27/48] i386: Allow i386 crash kernels to handle x86_64 dumps
From: Ian Campbell <[EMAIL PROTECTED]>

The specific case I am encountering is kdump under Xen with a 64 bit
hypervisor and 32 bit kernel/userspace.  The dump created is 64 bit due to
the hypervisor but the dump kernel is 32 bit for maximum compatibility.

It's possibly less likely to be useful in a purely native scenario but I
see no reason to disallow it.

[[EMAIL PROTECTED]: build fix]
Signed-off-by: Ian Campbell <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Acked-by: Vivek Goyal <[EMAIL PROTECTED]>
Cc: Horms <[EMAIL PROTECTED]>
Cc: Magnus Damm <[EMAIL PROTECTED]>
Cc: "Eric W. Biederman" <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/proc/vmcore.c           |    2 +-
 include/asm-i386/kexec.h   |    3 +++
 include/linux/crash_dump.h |    8 ++++++++
 3 files changed, 12 insertions(+), 1 deletion(-)

Index: linux/fs/proc/vmcore.c
===================================================================
--- linux.orig/fs/proc/vmcore.c
+++ linux/fs/proc/vmcore.c
@@ -514,7 +514,7 @@ static int __init parse_crash_elf64_head
 	/* Do some basic Verification. */
 	if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
 		(ehdr.e_type != ET_CORE) ||
-		!elf_check_arch(&ehdr) ||
+		!vmcore_elf_check_arch(&ehdr) ||
 		ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
 		ehdr.e_ident[EI_VERSION] != EV_CURRENT ||
 		ehdr.e_version != EV_CURRENT ||
Index: linux/include/asm-i386/kexec.h
===================================================================
--- linux.orig/include/asm-i386/kexec.h
+++ linux/include/asm-i386/kexec.h
@@ -42,6 +42,9 @@
 /* The native architecture */
 #define KEXEC_ARCH KEXEC_ARCH_386
 
+/* We can also handle crash dumps from 64 bit kernel. */
+#define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)
+
 #define MAX_NOTE_BYTES 1024
 
 /* CPU does not save ss and esp on stack if execution is already
Index: linux/include/linux/crash_dump.h
===================================================================
--- linux.orig/include/linux/crash_dump.h
+++ linux/include/linux/crash_dump.h
@@ -14,5 +14,13 @@ extern ssize_t copy_oldmem_page(unsigned
 extern const struct file_operations proc_vmcore_operations;
 extern struct proc_dir_entry *proc_vmcore;
 
+/* Architecture code defines this if there are other possible ELF
+ * machine types, e.g. on bi-arch capable hardware. */
+#ifndef vmcore_elf_check_arch_cross
+#define vmcore_elf_check_arch_cross(x) 0
+#endif
+
+#define vmcore_elf_check_arch(x) (elf_check_arch(x) || vmcore_elf_check_arch_cross(x))
+
 #endif /* CONFIG_CRASH_DUMP */
 #endif /* LINUX_CRASHDUMP_H */
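
The override pattern in this patch (an arch header may define
vmcore_elf_check_arch_cross, the generic header falls back to 0) can be
tried out in userspace roughly as follows. This is only a sketch under
assumptions: struct ehdr_sketch and dump_header_ok() are hypothetical, and
the elf_check_arch() shown is a simplified stand-in for the real i386
definition. The EM_386 and EM_X86_64 values are the standard ELF machine
numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EM_386     3	/* ELF e_machine value for Intel 80386 */
#define EM_X86_64  62	/* ELF e_machine value for AMD x86-64 */

/* Hypothetical, trimmed-down ELF header: only the field the check reads. */
struct ehdr_sketch {
	uint16_t e_machine;
};

/* Simplified native check, standing in for the i386 elf_check_arch(). */
#define elf_check_arch(x) ((x)->e_machine == EM_386)

/* Arch-side override, defined before the generic fallback is seen. */
#define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)

/* Generic fallback: with no override, no extra machine type is accepted. */
#ifndef vmcore_elf_check_arch_cross
#define vmcore_elf_check_arch_cross(x) 0
#endif

#define vmcore_elf_check_arch(x) \
	(elf_check_arch(x) || vmcore_elf_check_arch_cross(x))

/* Returns nonzero if a dump with this header may be parsed. */
static int dump_header_ok(const struct ehdr_sketch *ehdr)
{
	assert(ehdr != NULL);
	return vmcore_elf_check_arch(ehdr);
}
```

Removing the arch-side #define makes the fallback kick in, and only the
native machine type is accepted again.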
[PATCH] [29/48] x86: add command line length to boot protocol
From: Bernhard Walle <[EMAIL PROTECTED]> Because the command line is increased to 2048 characters after 2.6.21, it's not possible for boot loaders and userspace tools to determine the length of the command line the kernel can understand. The benefit of knowing the length is that users can be warned if the command line size is too long which prevents surprise if things don't work after bootup. This patch updates the boot protocol to contain a field called "cmdline_size" that contain the length of the command line (excluding the terminating zero). The patch also adds missing fields (of protocol version 2.05) to the x86_64 setup code. Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Cc: Alon Bar-Lev <[EMAIL PROTECTED]> Acked-by: H. Peter Anvin <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- Documentation/i386/boot.txt | 23 +-- arch/i386/boot/setup.S |7 ++- arch/x86_64/boot/setup.S|7 ++- 3 files changed, 29 insertions(+), 8 deletions(-) Index: linux/Documentation/i386/boot.txt === --- linux.orig/Documentation/i386/boot.txt +++ linux/Documentation/i386/boot.txt @@ -2,7 +2,7 @@ H. Peter Anvin <[EMAIL PROTECTED]> - Last update 2007-01-26 + Last update 2007-03-06 On the i386 platform, the Linux kernel uses a rather complicated boot convention. This has evolved partially due to historical aspects, as @@ -35,9 +35,13 @@ Protocol 2.03: (Kernel 2.4.18-pre1) Expl initrd address available to the bootloader. Protocol 2.04: (Kernel 2.6.14) Extend the syssize field to four bytes. + Protocol 2.05: (Kernel 2.6.20) Make protected mode kernel relocatable. Introduce relocatable_kernel and kernel_alignment fields. 
+Protocol 2.06: (Kernel 2.6.22) Added a field that contains the size of + the boot command line + MEMORY LAYOUT @@ -133,6 +137,8 @@ Offset Proto NameMeaning 022C/4 2.03+ initrd_addr_max Highest legal initrd address 0230/4 2.05+ kernel_alignment Physical addr alignment required for kernel 0234/1 2.05+ relocatable_kernel Whether kernel is relocatable or not +0235/3 N/A pad2Unused +0238/4 2.06+ cmdline_sizeMaximum size of the kernel command line (1) For backwards compatibility, if the setup_sects field contains 0, the real value is 4. @@ -233,6 +239,12 @@ filled out, however: if your ramdisk is exactly 131072 bytes long and this field is 0x37FF, you can start your ramdisk at 0x37FE.) + cmdline_size: + The maximum size of the command line without the terminating + zero. This means that the command line can contain at most + cmdline_size characters. With protocol version 2.05 and + earlier, the maximum size was 255. + THE KERNEL COMMAND LINE @@ -241,11 +253,10 @@ loader to communicate with the kernel. relevant to the boot loader itself, see "special command line options" below. -The kernel command line is a null-terminated string currently up to -255 characters long, plus the final null. A string that is too long -will be automatically truncated by the kernel, a boot loader may allow -a longer command line to be passed to permit future kernels to extend -this limit. +The kernel command line is a null-terminated string. The maximum +length can be retrieved from the field cmdline_size. Before protocol +version 2.06, the maximum was 255 characters. A string that is too +long will be automatically truncated by the kernel. 
If the boot protocol version is 2.02 or later, the address of the kernel command line is given by the header field cmd_line_ptr (see Index: linux/arch/i386/boot/setup.S === --- linux.orig/arch/i386/boot/setup.S +++ linux/arch/i386/boot/setup.S @@ -52,6 +52,7 @@ #include #include #include +#include /* Signature words to ensure LILO loaded us right */ #define SIG1 0xAA55 @@ -81,7 +82,7 @@ start: # This is the setup header, and it must start at %cs:2 (old 0x9020:2) .ascii "HdrS" # header signature - .word 0x0205 # header version number (>= 0x0105) + .word 0x0206 # header version number (>= 0x0105) # or else old loadlin-1.5 will fail) realmode_swtch:.word 0, 0# default_switch, SETUPSEG start_sys_seg: .word SYSSEG @@ -171,6 +172,10 @@ relocatable_kernel:.byte 0 pad2: .byte 0 pad3: .word 0 +cmdline_size: .long COMMAND_LINE_SIZE-1 #length of the command line, +
[PATCH] [1/48] x86_64: fix x86_64-mm-sched-clock-share
From: Andrew Morton <[EMAIL PROTECTED]>

Fix for the following patch. Provide dummy cpufreq functions when
CPUFREQ is not compiled in.

Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Dave Jones <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---

 include/linux/cpufreq.h |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Index: linux/include/linux/cpufreq.h
===================================================================
--- linux.orig/include/linux/cpufreq.h
+++ linux/include/linux/cpufreq.h
@@ -32,7 +32,15 @@
  *                     CPUFREQ NOTIFIER INTERFACE                    *
  *********************************************************************/
 
+#ifdef CONFIG_CPU_FREQ
 int cpufreq_register_notifier(struct notifier_block *nb, unsigned int list);
+#else
+static inline int cpufreq_register_notifier(struct notifier_block *nb,
+						unsigned int list)
+{
+	return 0;
+}
+#endif
 int cpufreq_unregister_notifier(struct notifier_block *nb, unsigned int list);
 
 #define CPUFREQ_TRANSITION_NOTIFIER	(0)
@@ -261,17 +269,22 @@ int cpufreq_set_policy(struct cpufreq_po
 int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
 int cpufreq_update_policy(unsigned int cpu);
 
-/* query the current CPU frequency (in kHz). If zero, cpufreq couldn't detect it */
-unsigned int cpufreq_get(unsigned int cpu);
 
-/* query the last known CPU freq (in kHz). If zero, cpufreq couldn't detect it */
+/*
+ * query the last known CPU freq (in kHz). If zero, cpufreq couldn't detect it
+ */
 #ifdef CONFIG_CPU_FREQ
 unsigned int cpufreq_quick_get(unsigned int cpu);
+unsigned int cpufreq_get(unsigned int cpu);
 #else
 static inline unsigned int cpufreq_quick_get(unsigned int cpu)
 {
 	return 0;
 }
+static inline unsigned int cpufreq_get(unsigned int cpu)
+{
+	return 0;
+}
 #endif
 
 
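
The stub pattern used here (real prototypes when the option is on, static
inline dummies returning 0 when it is off) lets callers compile unchanged
either way. A small userspace sketch, assuming made-up names throughout:
CONFIG_CPU_FREQ_SKETCH, struct notifier_block_sketch, the *_sketch
functions and cpu_khz_or_default() are all hypothetical and not part of
the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct notifier_block. */
struct notifier_block_sketch {
	int priority;
};

/* CONFIG_CPU_FREQ_SKETCH is left undefined, so the stubs are used. */
#ifdef CONFIG_CPU_FREQ_SKETCH
int cpufreq_register_notifier_sketch(struct notifier_block_sketch *nb,
				     unsigned int list);
unsigned int cpufreq_get_sketch(unsigned int cpu);
#else
static inline int cpufreq_register_notifier_sketch(
		struct notifier_block_sketch *nb, unsigned int list)
{
	(void)nb;
	(void)list;
	return 0;	/* registering with nothing always "succeeds" */
}
static inline unsigned int cpufreq_get_sketch(unsigned int cpu)
{
	(void)cpu;
	return 0;	/* 0 kHz means the frequency could not be detected */
}
#endif

/* A caller that needs no #ifdef of its own. */
static unsigned int cpu_khz_or_default(unsigned int cpu,
				       unsigned int fallback_khz)
{
	unsigned int khz = cpufreq_get_sketch(cpu);

	assert(fallback_khz != 0);
	return khz ? khz : fallback_khz;
}
```

With the option undefined, cpu_khz_or_default() always returns the
fallback, which mirrors the "If zero, cpufreq couldn't detect it"
convention in the header comment.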
[PATCH] [37/48] i386: ignore vgacon if hardware not present
From: Rusty Russell <[EMAIL PROTECTED]>

On Thu, 2007-03-29 at 12:36 +0200, Andi Kleen wrote:
> On Thu, Mar 29, 2007 at 05:46:48PM +1000, Rusty Russell wrote:
> > (Did this fall through the cracks?  I don't see it in -mm.  It's
> > standalone, and saves some silly code in lguest and presumably others).
>
> Normally it should go to some some console maintainer? Hmm, but who?
> Ok I can add it.

Thanks.  While you're in a patch-applying mood, how about this?

Cheers,
Rusty.
==
Use X86_EFLAGS_IF in irqflags.h.

Move X86_EFLAGS_IF et al out to a new header: processor-flags.h, so we
can include it from irqflags.h and use it in raw_irqs_disabled_flags().

As a side-effect, we could now use these flags in .S files.

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---
 include/asm-i386/irqflags.h        |    3 ++-
 include/asm-i386/processor-flags.h |   26 ++++++++++++++++++++++++++
 include/asm-i386/processor.h       |   22 +---------------------
 3 files changed, 29 insertions(+), 22 deletions(-)

===================================================================
Index: linux/include/asm-i386/processor-flags.h
===================================================================
--- /dev/null
+++ linux/include/asm-i386/processor-flags.h
@@ -0,0 +1,26 @@
+#ifndef __ASM_I386_PROCESSOR_FLAGS_H
+#define __ASM_I386_PROCESSOR_FLAGS_H
+/* Various flags defined: can be included from assembler. */
+
+/*
+ * EFLAGS bits
+ */
+#define X86_EFLAGS_CF	0x00000001 /* Carry Flag */
+#define X86_EFLAGS_PF	0x00000004 /* Parity Flag */
+#define X86_EFLAGS_AF	0x00000010 /* Auxillary carry Flag */
+#define X86_EFLAGS_ZF	0x00000040 /* Zero Flag */
+#define X86_EFLAGS_SF	0x00000080 /* Sign Flag */
+#define X86_EFLAGS_TF	0x00000100 /* Trap Flag */
+#define X86_EFLAGS_IF	0x00000200 /* Interrupt Flag */
+#define X86_EFLAGS_DF	0x00000400 /* Direction Flag */
+#define X86_EFLAGS_OF	0x00000800 /* Overflow Flag */
+#define X86_EFLAGS_IOPL	0x00003000 /* IOPL mask */
+#define X86_EFLAGS_NT	0x00004000 /* Nested Task */
+#define X86_EFLAGS_RF	0x00010000 /* Resume Flag */
+#define X86_EFLAGS_VM	0x00020000 /* Virtual Mode */
+#define X86_EFLAGS_AC	0x00040000 /* Alignment Check */
+#define X86_EFLAGS_VIF	0x00080000 /* Virtual Interrupt Flag */
+#define X86_EFLAGS_VIP	0x00100000 /* Virtual Interrupt Pending */
+#define X86_EFLAGS_ID	0x00200000 /* CPUID detection flag */
+
+#endif	/* __ASM_I386_PROCESSOR_FLAGS_H */
Index: linux/include/asm-i386/irqflags.h
===================================================================
--- linux.orig/include/asm-i386/irqflags.h
+++ linux/include/asm-i386/irqflags.h
@@ -9,6 +9,7 @@
  */
 #ifndef _ASM_IRQFLAGS_H
 #define _ASM_IRQFLAGS_H
+#include <asm/processor-flags.h>
 
 #ifndef __ASSEMBLY__
 static inline unsigned long native_save_fl(void)
@@ -119,7 +120,7 @@ static inline unsigned long __raw_local_
 
 static inline int raw_irqs_disabled_flags(unsigned long flags)
 {
-	return !(flags & (1 << 9));
+	return !(flags & X86_EFLAGS_IF);
 }
 
 static inline int raw_irqs_disabled(void)
Index: linux/include/asm-i386/processor.h
===================================================================
--- linux.orig/include/asm-i386/processor.h
+++ linux/include/asm-i386/processor.h
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include <asm/processor-flags.h>
 
 /* flag for disabling the tsc */
 extern int tsc_disable;
@@ -126,27 +127,6 @@ extern void detect_ht(struct cpuinfo_x86
 static inline void detect_ht(struct cpuinfo_x86 *c) {}
 #endif
 
-/*
- * EFLAGS bits
- */
-#define X86_EFLAGS_CF	0x00000001 /* Carry Flag */
-#define X86_EFLAGS_PF	0x00000004 /* Parity Flag */
-#define X86_EFLAGS_AF	0x00000010 /* Auxillary carry Flag */
-#define X86_EFLAGS_ZF	0x00000040 /* Zero Flag */
-#define X86_EFLAGS_SF	0x00000080 /* Sign Flag */
-#define X86_EFLAGS_TF	0x00000100 /* Trap Flag */
-#define X86_EFLAGS_IF	0x00000200 /* Interrupt Flag */
-#define X86_EFLAGS_DF	0x00000400 /* Direction Flag */
-#define X86_EFLAGS_OF	0x00000800 /* Overflow Flag */
-#define X86_EFLAGS_IOPL	0x00003000 /* IOPL mask */
-#define X86_EFLAGS_NT	0x00004000 /* Nested Task */
-#define X86_EFLAGS_RF	0x00010000 /* Resume Flag */
-#define X86_EFLAGS_VM	0x00020000 /* Virtual Mode */
-#define X86_EFLAGS_AC	0x00040000 /* Alignment Check */
-#define X86_EFLAGS_VIF	0x00080000 /* Virtual Interrupt Flag */
-#define X86_EFLAGS_VIP	0x00100000 /* Virtual Interrupt Pending */
-#define X86_EFLAGS_ID	0x00200000 /* CPUID detection flag */
-
 static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
 				unsigned int *ecx, unsigned int *edx)
 {
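
The point of the named constant is simply that the Interrupt Flag is bit 9
of EFLAGS, so the old (1 << 9) and the new X86_EFLAGS_IF test the same
bit. A tiny userspace sketch of the check; irqs_disabled_flags_sketch() is
a hypothetical name, not the kernel function, and the flags word is passed
in rather than read from the CPU.

```c
#include <assert.h>

#define X86_EFLAGS_IF 0x00000200UL	/* Interrupt Flag, bit 9 of EFLAGS */

/* Returns nonzero when a saved flags word has interrupts disabled. */
static int irqs_disabled_flags_sketch(unsigned long flags)
{
	return !(flags & X86_EFLAGS_IF);
}
```

A saved flags value of 0x202 (IF set, plus the always-one bit 1) reads as
"interrupts enabled", while 0x002 reads as "interrupts disabled".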
[PATCH] [26/48] x86_64: Clarify CONFIG_REORDER explanation
From: Rusty Russell <[EMAIL PROTECTED]>

if (1 && X) => if (X).

Signed-off-by: Rusty Russell <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 arch/x86_64/Kconfig |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/arch/x86_64/Kconfig
===================================================================
--- linux.orig/arch/x86_64/Kconfig
+++ linux/arch/x86_64/Kconfig
@@ -665,8 +665,8 @@ config REORDER
 	default n
 	help
          This option enables the toolchain to reorder functions for a more
-         optimal TLB usage. If you have pretty much any version of binutils,
-	 this can increase your kernel build time by roughly one minute.
+         optimal TLB usage. This will slow your kernel build by
+	 roughly one minute.
 
 config K8_NB
 	def_bool y
[PATCH] [4/48] x86_64: Don't disable basic block reordering
When compiling with -Os (which is default) the compiler defaults to it
anyways. And with -O2 it probably generates somewhat better (although also
larger) code.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---

 arch/x86_64/Makefile |    3 ---
 1 file changed, 3 deletions(-)

Index: linux/arch/x86_64/Makefile
===================================================================
--- linux.orig/arch/x86_64/Makefile
+++ linux/arch/x86_64/Makefile
@@ -41,9 +41,6 @@ cflags-y += -mno-red-zone
 cflags-y += -mcmodel=kernel
 cflags-y += -pipe
 cflags-kernel-$(CONFIG_REORDER) += -ffunction-sections
-# this makes reading assembly source easier, but produces worse code
-# actually it makes the kernel smaller too.
-cflags-y += -fno-reorder-blocks
 cflags-y += -Wno-sign-compare
 cflags-y += -fno-asynchronous-unwind-tables
 ifneq ($(CONFIG_DEBUG_INFO),y)
[PATCH] [10/48] i386: type cast clean up for find_next_zero_bit
From: "Ken Chen" <[EMAIL PROTECTED]>

Clean up unneeded type casts by properly declaring the data type.

Signed-off-by: Ken Chen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
---

 arch/i386/lib/bitops.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/arch/i386/lib/bitops.c
===
--- linux.orig/arch/i386/lib/bitops.c
+++ linux/arch/i386/lib/bitops.c
@@ -43,7 +43,7 @@ EXPORT_SYMBOL(find_next_bit);
  */
 int find_next_zero_bit(const unsigned long *addr, int size, int offset)
 {
-	unsigned long * p = ((unsigned long *) addr) + (offset >> 5);
+	const unsigned long *p = addr + (offset >> 5);
 	int set = 0, bit = offset & 31, res;
 
 	if (bit) {
@@ -64,7 +64,7 @@ int find_next_zero_bit(const unsigned lo
 	/*
 	 * No zero yet, search remaining full bytes for a zero
 	 */
-	res = find_first_zero_bit (p, size - 32 * (p - (unsigned long *) addr));
+	res = find_first_zero_bit(p, size - 32 * (p - addr));
 	return (offset + set + res);
 }
 EXPORT_SYMBOL(find_next_zero_bit);
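For readers who want to see why the casts were unnecessary, here is a minimal userspace sketch. It is not the i386 implementation (which relies on find_first_zero_bit and 32-bit words); sketch_find_next_zero_bit and BITS_PER_WORD are hypothetical names made up for this illustration. The point is only that once the local pointer is declared const, pointer arithmetic on addr needs no cast and the const qualifier is never dropped.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper macro for this sketch, not a kernel symbol. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first zero bit at or after 'offset',
 * or 'size' if every bit in [offset, size) is set.
 * A plain bit-by-bit scan; the kernel version is word-at-a-time.
 */
static size_t sketch_find_next_zero_bit(const unsigned long *addr,
					size_t size, size_t offset)
{
	assert(addr != NULL);

	for (size_t i = offset; i < size; i++) {
		/* const-correct pointer arithmetic: no cast required */
		const unsigned long *p = addr + i / BITS_PER_WORD;

		if (!((*p >> (i % BITS_PER_WORD)) & 1UL))
			return i;
	}
	return size;
}
```

Calling it on a two-word bitmap whose only clear bit is bit 3 of the second word returns BITS_PER_WORD + 3; starting the search past that bit returns the bitmap size.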
[PATCH] [9/48] i386: make struct vmi_ops static
From: Adrian Bunk <[EMAIL PROTECTED]>

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Zachary Amsden <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 arch/i386/kernel/vmi.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/arch/i386/kernel/vmi.c
===
--- linux.orig/arch/i386/kernel/vmi.c
+++ linux/arch/i386/kernel/vmi.c
@@ -56,7 +56,7 @@ static int disable_noidle;
 static int disable_vmi_timer;
 
 /* Cached VMI operations */
-struct {
+static struct {
 	void (*cpuid)(void /* non-c */);
 	void (*_set_ldt)(u32 selector);
 	void (*set_tr)(u32 selector);
[PATCH] [8/48] i386: modpost apic related warning fixes
From: Vivek Goyal <[EMAIL PROTECTED]> o Modpost generates warnings for i386 if compiled with CONFIG_RELOCATABLE=y WARNING: vmlinux - Section mismatch: reference to .init.text:find_unisys_acpi_oem_table from .text between 'acpi_madt_oem_check' (at offset 0xc0101eda) and 'enable_apic_mode' WARNING: vmlinux - Section mismatch: reference to .init.text:acpi_get_table_header_early from .text between 'acpi_madt_oem_check' (at offset 0xc0101ef0) and 'enable_apic_mode' WARNING: vmlinux - Section mismatch: reference to .init.text:parse_unisys_oem from .text between 'acpi_madt_oem_check' (at offset 0xc0101f2e) and 'enable_apic_mode' WARNING: vmlinux - Section mismatch: reference to .init.text:setup_unisys from .text between 'acpi_madt_oem_check' (at offset 0xc0101f37) and 'enable_apic_mode'WARNING: vmlinux - Section mismatch: reference to .init.text:parse_unisys_oem from .text between 'mps_oem_check' (at offset 0xc0101ec7) and 'acpi_madt_oem_check' WARNING: vmlinux - Section mismatch: reference to .init.text:es7000_sw_apic from .text between 'enable_apic_mode' (at offset 0xc0101f48) and 'check_apicid_present' o Some functions which are inline (acpi_madt_oem_check) are not inlined by compiler as these functions are accessed using function pointer. These functions are put in .text section and they in-turn access __init type functions hence modpost generates warnings. o Do not iniline acpi_madt_oem_check, instead make it __init. 
Signed-off-by: Vivek Goyal <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Cc: Len Brown <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/i386/mach-generic/es7000.c | 41 include/asm-i386/mach-es7000/mach_apic.h|7 include/asm-i386/mach-es7000/mach_mpparse.h | 32 - scripts/mod/modpost.c |1 4 files changed, 42 insertions(+), 39 deletions(-) Index: linux/arch/i386/mach-generic/es7000.c === --- linux.orig/arch/i386/mach-generic/es7000.c +++ linux/arch/i386/mach-generic/es7000.c @@ -25,4 +25,45 @@ static int probe_es7000(void) return 0; } +extern void es7000_sw_apic(void); +static void __init enable_apic_mode(void) +{ + es7000_sw_apic(); + return; +} + +static __init int mps_oem_check(struct mp_config_table *mpc, char *oem, + char *productid) +{ + if (mpc->mpc_oemptr) { + struct mp_config_oemtable *oem_table = + (struct mp_config_oemtable *)mpc->mpc_oemptr; + if (!strncmp(oem, "UNISYS", 6)) + return parse_unisys_oem((char *)oem_table); + } + return 0; +} + +#ifdef CONFIG_ACPI +/* Hook from generic ACPI tables.c */ +static int __init acpi_madt_oem_check(char *oem_id, char *oem_table_id) +{ + unsigned long oem_addr; + if (!find_unisys_acpi_oem_table(&oem_addr)) { + if (es7000_check_dsdt()) + return parse_unisys_oem((char *)oem_addr); + else { + setup_unisys(); + return 1; + } + } + return 0; +} +#else +static int __init acpi_madt_oem_check(char *oem_id, char *oem_table_id) +{ + return 0; +} +#endif + struct genapic apic_es7000 = APIC_INIT("es7000", probe_es7000); Index: linux/include/asm-i386/mach-es7000/mach_apic.h === --- linux.orig/include/asm-i386/mach-es7000/mach_apic.h +++ linux/include/asm-i386/mach-es7000/mach_apic.h @@ -73,13 +73,6 @@ static inline void init_apic_ldr(void) apic_write_around(APIC_LDR, val); } -extern void es7000_sw_apic(void); -static inline void enable_apic_mode(void) -{ - es7000_sw_apic(); - return; -} - extern int apic_version [MAX_APICS]; static inline void 
setup_apic_routing(void) { Index: linux/include/asm-i386/mach-es7000/mach_mpparse.h === --- linux.orig/include/asm-i386/mach-es7000/mach_mpparse.h +++ linux/include/asm-i386/mach-es7000/mach_mpparse.h @@ -18,18 +18,6 @@ extern int parse_unisys_oem (char *oempt extern int find_unisys_acpi_oem_table(unsigned long *oem_addr); extern void setup_unisys(void); -static inline int mps_oem_check(struct mp_config_table *mpc, char *oem, - char *productid) -{ - if (mpc->mpc_oemptr) { - struct mp_config_oemtable *oem_table = - (struct mp_config_oemtable *)mpc->mpc_oemptr; - if (!strncmp(oem, "UNISYS", 6)) - return parse_unisys_oem((char *)oem_table); - } - return 0; -} - #ifdef CONFIG_ACPI static inline int es7000_check_dsdt(void) @@ -41,26 +29,6 @@ static inline int es7000_check_dsdt(void return 1; return 0; }
[PATCH] [11/48] i386: workaround for a -Wmissing-prototypes warning
From: Adrian Bunk <[EMAIL PROTECTED]>

Work around a warning with -Wmissing-prototypes in
arch/i386/kernel/asm-offsets.c

The warning isn't gcc's fault - asm-offsets.c is simply a special file.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 arch/i386/kernel/asm-offsets.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux/arch/i386/kernel/asm-offsets.c
===
--- linux.orig/arch/i386/kernel/asm-offsets.c
+++ linux/arch/i386/kernel/asm-offsets.c
@@ -25,6 +25,9 @@
 #define OFFSET(sym, str, mem) \
 	DEFINE(sym, offsetof(struct str, mem));
 
+/* workaround for a warning with -Wmissing-prototypes */
+void foo(void);
+
 void foo(void)
 {
 	OFFSET(SIGCONTEXT_eax, sigcontext, eax);
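As a small illustration of the workaround (a sketch, not kernel code): gcc's -Wmissing-prototypes warns when a non-static function is defined without a previous prototype, so declaring the function immediately before defining it is enough to silence the warning. The struct sketch_ctx and the function sketch_offset_of_b are hypothetical names chosen for this example; offsetof is used only because asm-offsets.c is built around it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure standing in for something like sigcontext. */
struct sketch_ctx {
	int a;
	long b;
};

/* The prototype comes first; this is what silences -Wmissing-prototypes. */
size_t sketch_offset_of_b(void);

/* External (non-static) definition that would otherwise trigger the warning. */
size_t sketch_offset_of_b(void)
{
	return offsetof(struct sketch_ctx, b);
}
```

Building this with gcc -Wmissing-prototypes and then again with the prototype line removed should show the difference.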
[PATCH] [13/48] x86_64: fix ia32_binfmt.c build error
From: Ralf Baechle <[EMAIL PROTECTED]> Reorder code to avoid multiple inclusion of elf.h. #undef several symbols to avoid build errors over redefinitions. Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> --- arch/x86_64/ia32/ia32_binfmt.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) Index: linux/arch/x86_64/ia32/ia32_binfmt.c === --- linux.orig/arch/x86_64/ia32/ia32_binfmt.c +++ linux/arch/x86_64/ia32/ia32_binfmt.c @@ -5,6 +5,11 @@ * This tricks binfmt_elf.c into loading 32bit binaries using lots * of ugly preprocessor tricks. Talk about very very poor man's inheritance. */ +#define __ASM_X86_64_ELF_H 1 + +#undef ELF_CLASS +#define ELF_CLASS ELFCLASS32 + #include #include #include @@ -50,9 +55,6 @@ struct elf_phdr; #undef ELF_ARCH #define ELF_ARCH EM_386 -#undef ELF_CLASS -#define ELF_CLASS ELFCLASS32 - #define ELF_DATA ELFDATA2LSB #define USE_ELF_CORE_DUMP 1 @@ -136,7 +138,7 @@ struct elf_prpsinfo #define user user32 -#define __ASM_X86_64_ELF_H 1 +#undef elf_read_implies_exec #define elf_read_implies_exec(ex, executable_stack) (executable_stack != EXSTACK_DISABLE_X) //#include #include - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
2.6.21 frozen for a few minutes, swapping to disk
Hi all,

today, with 2.6.21, my laptop showed really odd behaviour. It started
writing to disk for a few minutes with no interactivity at all (no redraw
on screen, only the HDD LED on). It's the first time I have noticed the
OOM killer starting to kill programs.

It was totally unresponsive for minutes; after coming back to life it had
a load of ~19.0 and 300+ MB in swap (first time I saw this).

It's an HP Pavilion Core Duo 2.0 GHz, 1 GB RAM.

kern.log details: http://www.debianpt.org/~elmig/pool/kernel/20070429/kern.log
.config: http://www.debianpt.org/~elmig/pool/kernel/20070429/2.6.21.config
dmesg: http://www.debianpt.org/~elmig/pool/kernel/20070429/dmesg

As this is the first time it happened and it felt odd, I am reporting it.
If additional info is needed please CC me as I am not on the list.

--
Com os melhores cumprimentos/Best regards,
Miguel Figueiredo
http://www.DebianPT.org
[PATCH] [2/48] i386: Rewrite sched_clock
Move it into an own file for easy sharing. Do everything per CPU. This avoids problems with TSCs that tick at different frequencies per CPU. Resync properly on cpufreq changes. CPU frequency is instable around cpu frequency changing, so fall back during a backing clock during this period. Hopefully TSC will work now on all systems except when there isn't a physical TSC. And +From: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Three cleanups there: - change "instable" -> "unstable" - it's better to use get_cpu_var for getting this cpu's variables - change cycles_2_ns to do the full computation rather than just the tsc->ns scaling. It's a simpler interface, and it makes the function Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/i386/kernel/Makefile |3 arch/i386/kernel/sched-clock.c | 213 + arch/i386/kernel/tsc.c | 62 --- 3 files changed, 215 insertions(+), 63 deletions(-) Index: linux/arch/i386/kernel/sched-clock.c === --- /dev/null +++ linux/arch/i386/kernel/sched-clock.c @@ -0,0 +1,213 @@ +/* A fast clock for the scheduler. */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * convert from cycles(64bits) => nanoseconds (64bits) + * basic equation: + * ns = cycles / (freq / ns_per_sec) + * ns = cycles * (ns_per_sec / freq) + * ns = cycles * (10^9 / (cpu_khz * 10^3)) + * ns = cycles * (10^6 / cpu_khz) + * + * Then we use scaling math (suggested by [EMAIL PROTECTED]) to get: + * ns = cycles * (10^6 * SC / cpu_khz) / SC + * ns = cycles * cyc2ns_scale / SC + * + * And since SC is a constant power of two, we can convert the div + * into a shift. + * + * We can use khz divisor instead of mhz to keep a better percision, since + * cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits. + * ([EMAIL PROTECTED]) + * + * [EMAIL PROTECTED] "math is hard, lets go shopping!" 
+ */ + +#define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */ + +struct sc_data { + unsigned cyc2ns_scale; + unsigned unstable; + unsigned long long sync_base; /* TSC or jiffies at syncpoint*/ + unsigned long long ns_base; /* nanoseconds at sync point */ + unsigned long long last_val;/* Last returned value */ +}; + +static DEFINE_PER_CPU(struct sc_data, sc_data) = + { .unstable = 1, .sync_base = INITIAL_JIFFIES }; + +static inline u64 cycles_2_ns(struct sc_data *sc, u64 cyc) +{ + u64 ns; + + cyc -= sc->sync_base; + ns = (cyc * sc->cyc2ns_scale) >> CYC2NS_SCALE_FACTOR; + ns += sc->ns_base; + + return ns; +} + +/* + * Scheduler clock - returns current time in nanosec units. + * All data is local to the CPU. + * The values are approximately[1] monotonic local to a CPU, but not + * between CPUs. There might be also an occasionally random error, + * but not too bad. Between CPUs the values can be non monotonic. + * + * [1] no attempt to stop CPU instruction reordering, which can hit + * in a 100 instruction window or so. + * + * The clock can be in two states: stable and unstable. + * When it is stable we use the TSC per CPU. + * When it is unstable we use jiffies as fallback. + * stable->unstable->stable transitions can happen regularly + * during CPU frequency changes. + * There is special code to avoid having the clock jump backwards + * when we switch from TSC to jiffies, which needs to keep some state + * per CPU. This state is protected against parallel state changes + * with interrupts off. + */ +unsigned long long sched_clock(void) +{ + unsigned long long r; + struct sc_data *sc = &get_cpu_var(sc_data); + + if (sc->unstable) { + unsigned long flags; + r = (jiffies_64 - sc->sync_base) * (10 / HZ); + r += sc->ns_base; + local_irq_save(flags); + /* last_val is used to avoid non monotonity on a + stable->unstable transition. 
Make sure the time + never goes to before the last value returned by + the TSC clock */ + if (r <= sc->last_val) + r = sc->last_val + 1; + sc->last_val = r; + local_irq_restore(flags); + } else { + get_scheduled_cycles(r); + r = cycles_2_ns(sc, r); + sc->last_val = r; + } + + put_cpu_var(sc_data); + + return r; +} + +/* Resync with new CPU frequency */ +static void resync_sc_freq(struct sc_data *sc, unsigned int newfreq) +{ + sc->sync_base = jiffies; + if (!cpu_has_tsc) { + sc->unstable = 1; + return; + } + /* Handle nesting, but when we're z
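The scaling math described in the sched_clock patch comment above can be tried out in userspace. This is a minimal sketch under stated assumptions, not the code from sched-clock.c: the helper names sketch_cyc2ns_scale and sketch_cycles_to_ns are hypothetical, and the sync_base/ns_base offsets that the patch's cycles_2_ns() applies are left out so that only the multiply-and-shift step is visible. It uses SC = 2^10 as in the patch, so the division by cpu_khz is done once when computing the scale and every conversion afterwards is a multiply plus a shift.

```c
#include <assert.h>
#include <stdint.h>

#define CYC2NS_SCALE_FACTOR 10	/* SC = 2^10, same constant as the patch */

/*
 * cyc2ns_scale = 10^6 * SC / cpu_khz.
 * With SC = 2^10 the largest possible value is 10^6 * 2^10, which
 * still fits in 32 bits, as the patch comment notes.
 */
static uint32_t sketch_cyc2ns_scale(uint32_t cpu_khz)
{
	assert(cpu_khz != 0);
	return (uint32_t)(((uint64_t)1000000 << CYC2NS_SCALE_FACTOR) / cpu_khz);
}

/* ns = cycles * cyc2ns_scale / SC, with the divide turned into a shift. */
static uint64_t sketch_cycles_to_ns(uint64_t cycles, uint32_t scale)
{
	return (cycles * scale) >> CYC2NS_SCALE_FACTOR;
}
```

At 2 GHz (cpu_khz = 2000000) the scale is 512 and 2000 cycles convert to exactly 1000 ns. At 3 GHz the scale truncates to 341, so 3000 cycles come out as 999 ns rather than 1000; that small rounding error is the price of avoiding a 64-bit division in the scheduler's hot path.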
[PATCH] [5/48] x86_64: Allow sys_uselib unconditionally
Previously it wasn't enabled when binfmt_aout was built as a module.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 arch/x86_64/ia32/ia32entry.S |    4 ----
 1 file changed, 4 deletions(-)

Index: linux/arch/x86_64/ia32/ia32entry.S
===
--- linux.orig/arch/x86_64/ia32/ia32entry.S
+++ linux/arch/x86_64/ia32/ia32entry.S
@@ -481,11 +481,7 @@ ia32_sys_call_table:
 	.quad sys_symlink
 	.quad sys_lstat
 	.quad sys_readlink		/* 85 */
-#ifdef CONFIG_IA32_AOUT
 	.quad sys_uselib
-#else
-	.quad quiet_ni_syscall
-#endif
 	.quad sys_swapon
 	.quad sys_reboot
 	.quad compat_sys_old_readdir
[PATCH] [6/48] x86_64: Minor white space cleanup in traps.c
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

---
 arch/x86_64/kernel/traps.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Index: linux/arch/x86_64/kernel/traps.c
===
--- linux.orig/arch/x86_64/kernel/traps.c
+++ linux/arch/x86_64/kernel/traps.c
@@ -426,8 +426,7 @@ void show_registers(struct pt_regs *regs
 	const int cpu = smp_processor_id();
 	struct task_struct *cur = cpu_pda(cpu)->pcurrent;
 
-		rsp = regs->rsp;
-
+	rsp = regs->rsp;
 	printk("CPU %d ", cpu);
 	__show_regs(regs);
 	printk("Process %s (pid: %d, threadinfo %p, task %p)\n",
@@ -438,7 +437,6 @@ void show_registers(struct pt_regs *regs
 	 * time of the fault..
 	 */
 	if (in_kernel) {
-
 		printk("Stack: ");
 		_show_stack(NULL, regs, (unsigned long*)rsp);
 
[PATCH] [19/48] x86_64: split remaining fake nodes equally
From: David Rientjes <[EMAIL PROTECTED]>

Extends the numa=fake x86_64 command-line option to split the remaining
system memory into equal-sized nodes.

For example:
numa=fake=2*512,4* gives two 512M nodes and the remaining system memory is
split into four approximately equal chunks.

This is beneficial for systems where the exact size of RAM is unknown or not
necessarily relevant, but the granularity with which nodes shall be allocated
is known.

Cc: Andi Kleen <[EMAIL PROTECTED]>
Signed-off-by: David Rientjes <[EMAIL PROTECTED]>
Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>
Cc: Paul Jackson <[EMAIL PROTECTED]>
Cc: Christoph Lameter <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 Documentation/x86_64/boot-options.txt |    4 +++-
 arch/x86_64/mm/numa.c                 |   22 ++++++++++++++++++----
 2 files changed, 21 insertions(+), 5 deletions(-)

Index: linux/Documentation/x86_64/boot-options.txt
===
--- linux.orig/Documentation/x86_64/boot-options.txt
+++ linux/Documentation/x86_64/boot-options.txt
@@ -155,7 +155,9 @@ NUMA
 		depending on the sizes and coefficients listed. For example:
 			numa=fake=2*512,1024,4*256
 		gives two 512M nodes, a 1024M node, and four 256M nodes. The
-		remaining system RAM is allocated to an additional node.
+		remaining system RAM is allocated to an additional node. If
+		the last character of CMDLINE is a *, the remaining system RAM
+		is instead divided up equally among its coefficient.
 
   numa=hotadd=percent
 		Only allow hotadd memory to preallocate page structures upto
Index: linux/arch/x86_64/mm/numa.c
===
--- linux.orig/arch/x86_64/mm/numa.c
+++ linux/arch/x86_64/mm/numa.c
@@ -418,11 +418,25 @@ static int __init numa_emulation(unsigne
 done:
 	if (!num_nodes)
 		return -1;
-	/* Fill remainder of system RAM with a final node, if appropriate. */
+	/* Fill remainder of system RAM, if appropriate. */
 	if (addr < max_addr) {
-		setup_node_range(num_nodes, nodes, &addr, max_addr - addr,
-				 max_addr);
-		num_nodes++;
+		switch (*(cmdline - 1)) {
+		case '*':
+			/* Split remaining nodes into coeff chunks */
+			if (coeff <= 0)
+				break;
+			num_nodes += split_nodes_equally(nodes, &addr, max_addr,
+							 num_nodes, coeff);
+			break;
+		case ',':
+			/* Do not allocate remaining system RAM */
+			break;
+		default:
+			/* Give one final node */
+			setup_node_range(num_nodes, nodes, &addr,
+					 max_addr - addr, max_addr);
+			num_nodes++;
+		}
 	}
 out:
 	memnode_shift = compute_hash_shift(nodes, num_nodes);
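To make the trailing "*" case concrete, here is a rough userspace sketch of dividing the leftover address range into coeff roughly equal nodes. It is not the kernel's split_nodes_equally(), whose body is not shown in this patch; struct sketch_node and sketch_split_equally are hypothetical names, and the choice to hand the rounding remainder to the last node is an assumption made for the illustration. Any alignment or minimum-size rules the real code may apply are ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node descriptor: half-open range [start, end). */
struct sketch_node {
	uint64_t start;
	uint64_t end;
};

/*
 * Split [addr, max_addr) into 'coeff' nodes of (nearly) equal size.
 * Returns the number of nodes written to 'out', or 0 if nothing
 * could be split. The last node absorbs the rounding remainder.
 */
static int sketch_split_equally(uint64_t addr, uint64_t max_addr,
				int coeff, struct sketch_node *out)
{
	uint64_t chunk;

	assert(out);
	if (coeff <= 0 || addr >= max_addr)
		return 0;

	chunk = (max_addr - addr) / (uint64_t)coeff;
	if (chunk == 0)
		return 0;

	for (int i = 0; i < coeff; i++) {
		out[i].start = addr;
		out[i].end = (i == coeff - 1) ? max_addr : addr + chunk;
		addr = out[i].end;
	}
	return coeff;
}
```

With numa=fake=2*512,4* on a machine where the two 512M nodes end at address 1024 (in MB units for simplicity) and RAM ends at 4098, the call sketch_split_equally(1024, 4098, 4, nodes) yields three 768-unit nodes and a final 770-unit node, matching the "approximately equal chunks" wording in the changelog.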
[PATCH] [12/48] x86: Log reason why TSC was marked unstable
From: john stultz <[EMAIL PROTECTED]> Change mark_tsc_unstable() so it takes a string argument, which holds the reason the TSC was marked unstable. This is then displayed the first time mark_tsc_unstable is called. This should help us better debug why the TSC was marked unstable on certain systems and allow us to make sure we're not being overly paranoid when throwing out this troublesome clocksource. Cc: Ingo Molnar <[EMAIL PROTECTED]> Cc: Thomas Gleixner <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> --- arch/i386/kernel/cpu/cyrix.c|2 +- arch/i386/kernel/tsc.c |5 +++-- arch/x86_64/kernel/time.c |2 +- arch/x86_64/kernel/tsc.c|5 +++-- arch/x86_64/kernel/tsc_sync.c |2 +- drivers/acpi/processor_idle.c |4 ++-- include/asm-i386/mach-summit/mach_mpparse.h |4 ++-- include/asm-i386/tsc.h |2 +- include/asm-x86_64/timex.h |2 +- 9 files changed, 15 insertions(+), 13 deletions(-) Index: linux/arch/i386/kernel/cpu/cyrix.c === --- linux.orig/arch/i386/kernel/cpu/cyrix.c +++ linux/arch/i386/kernel/cpu/cyrix.c @@ -279,7 +279,7 @@ static void __cpuinit init_cyrix(struct */ if (vendor == PCI_VENDOR_ID_CYRIX && (device == PCI_DEVICE_ID_CYRIX_5510 || device == PCI_DEVICE_ID_CYRIX_5520)) - mark_tsc_unstable(); + mark_tsc_unstable("cyrix 5510/5520 detected"); } #endif c->x86_cache_size=16; /* Yep 16K integrated cache thats it */ Index: linux/arch/i386/kernel/tsc.c === --- linux.orig/arch/i386/kernel/tsc.c +++ linux/arch/i386/kernel/tsc.c @@ -172,7 +172,7 @@ time_cpufreq_notifier(struct notifier_bl ref_freq, freq->new); if (!(freq->flags & CPUFREQ_CONST_LOOPS)) { tsc_khz = cpu_khz; - mark_tsc_unstable(); + mark_tsc_unstable("cpufreq changes"); } } } @@ -220,11 +220,12 @@ static struct clocksource clocksource_ts CLOCK_SOURCE_MUST_VERIFY, }; -void mark_tsc_unstable(void) +void mark_tsc_unstable(char *reason) { if (!tsc_unstable) { tsc_unstable = 1; tsc_enabled = 0; + printk("Marking TSC 
unstable due to: %s.\n", reason); /* Can be called before registration */ if (clocksource_tsc.mult) clocksource_change_rating(&clocksource_tsc, 0); Index: linux/arch/x86_64/kernel/time.c === --- linux.orig/arch/x86_64/kernel/time.c +++ linux/arch/x86_64/kernel/time.c @@ -397,7 +397,7 @@ void __init time_init(void) cpu_khz = tsc_calibrate_cpu_khz(); if (unsynchronized_tsc()) - mark_tsc_unstable(); + mark_tsc_unstable("TSCs unsynchronized"); if (cpu_has(&boot_cpu_data, X86_FEATURE_RDTSCP)) vgetcpu_mode = VGETCPU_RDTSCP; Index: linux/arch/x86_64/kernel/tsc.c === --- linux.orig/arch/x86_64/kernel/tsc.c +++ linux/arch/x86_64/kernel/tsc.c @@ -85,7 +85,7 @@ static int time_cpufreq_notifier(struct tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new); if (!(freq->flags & CPUFREQ_CONST_LOOPS)) - mark_tsc_unstable(); + mark_tsc_unstable("cpufreq changes"); } return 0; @@ -171,10 +171,11 @@ static struct clocksource clocksource_ts .vread = vread_tsc, }; -void mark_tsc_unstable(void) +void mark_tsc_unstable(char *reason) { if (!tsc_unstable) { tsc_unstable = 1; + printk("Marking TSC unstable due to %s\n", reason); /* Change only the rating, when not registered */ if (clocksource_tsc.mult) clocksource_change_rating(&clocksource_tsc, 0); Index: linux/arch/x86_64/kernel/tsc_sync.c === --- linux.orig/arch/x86_64/kernel/tsc_sync.c +++ linux/arch/x86_64/kernel/tsc_sync.c @@ -138,7 +138,7 @@ void __cpuinit check_tsc_sync_source(int printk("\n"); printk(KERN_WARNING "Measured %Ld cycles TSC warp between CPUs," " turning off TSC clock.\n", max_warp); - mark_tsc_unstable(); + mark_tsc_unstable("check_ts