date:20080205

Re: [git pull] Input updates for 2.6.25-rc0

2008-02-05 Thread Sam Ravnborg

> > i8042_platform_init():
> > 
> > +#if defined(__i386__) || defined(__x86_64__)
> > 
> > use #ifdef CONFIG_X86?
> > 
> 
> I considered it, but the above was tested and is in line with the style of the
> rest of the file...
Then please change the rest of the file so it is consistent
with the usual style to use our CONFIG_ symbols for
conditionals like the above.

Sam
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

tool to keep multiple architectures/kernel configs straight?

2008-02-05 Thread Marty Leisner

When supporting a number of kernels/platforms, there are:
 1) common options (which should all be the same across all platforms)
 2) platform/processor-specific options
 3) options that are common across some kernel versions but not across
all versions

Is there a good way to maintain this (beyond eyeballing the
.config files)?

marty

Re: [PATCH] DS1WM: decouple host IRQ and INTR active state settings.

2008-02-05 Thread Andrew Morton

On Tue, 8 Jan 2008 09:21:55 +0100 "pHilipp Zabel" <[EMAIL PROTECTED]> wrote:

> On Jan 8, 2008 1:13 AM, Matt Reimer <[EMAIL PROTECTED]> wrote:
> > On Mon, 2008-01-07 at 15:10 -0800, Andrew Morton wrote:
> > > On Sun, 06 Jan 2008 14:46:14 +0100
> > > Philipp Zabel <[EMAIL PROTECTED]> wrote:
> > >
> > > > The DS1WM driver incorrectly infers the IAS bit (1-wire interrupt active
> > > > high) from IRQ settings. There are devices that have IAS=0 but still need
> > > > the IRQ to trigger on a rising edge. With this patch, machines with DS1WM
> > > > that need IAS=1 have to set .active_high=1 in the ds1wm_platform_data.
> >
> > > But no drivers are converted to set ds1wm_platform_data.active_high.  Won't
> > > IORESOURCE_IRQ_HIGHEDGE devices be broken by this change?
> >
> > Good point; I think you're right. I'd guess the other platforms that use
> > this driver are in the handhelds.org tree, but I've been out of the loop
> > a while. Philipp, is this the case?
> 
> Yes, I think so. I am only aware of four chips that include a DS1WM:
> HTC's ASIC3, PASIC2 and PASIC3 and Samsung SAMCOP.
> All of those drivers have yet to be submitted.
> 
> I will also apply this patch to hh.org CVS and fix up the devices that are
> affected by this change (aximx30, blueangel, magician, h1900, h4000,
> h5400, himalaya, hx4700, sable, universal).
> But none of those set IORESOURCE_IRQ_HIGHEDGE (most are just
> missing the IORESOURCE_IRQ_LOWEDGE flag). I am not sure about
> the status of rx3000 or other devices that might live in other trees.
> 
> I'm currently cleaning up the PASIC2/3 driver. After that I'll try to help
> cleaning up ASIC3 and finally getting it ready for submission.
> A whole load of devices in the hh.org tree depend on it.
> 

Guys, I'm thinking that by the time we actually need this patch in the
mainline tree, it may well be obsolete.  So I should drop the copy I have?



Re: Information about MMU

2008-02-05 Thread Matti Aarnio

On Wed, Feb 06, 2008 at 10:31:05AM +0530, Pravin Nanaware wrote:
> Hi,
> 
> Can somebody point me to where I could get the MMU (Memory Management Unit)
> details?

From the system programming manuals of the processor you are interested in,
and in general, from any half-decent computer architecture book written
after about 1980.

> Regards,
> Pravin
> 
> -**Nihilent***
> " *** All information contained in this communication is confidential, 
..

Do make sure that future emails to this list no longer carry any of that
stupid boilerplate text, which is of dubious legal status even in the USA
and entirely ineffective in most other parts of the world.

/Matti Aarnio

Re: [alsa-devel] [PATCH] duplicate strcasecmp test for "rj-master" in mpc8610_hpcd_probe()

2008-02-05 Thread Takashi Iwai

At Tue,  5 Feb 2008 10:19:43 +,
Mark Brown wrote:
> 
> From: Roel Kluin <[EMAIL PROTECTED]>
> 
> In linus' git tree I found this problem. Is it also in the alsa tree?
> please confirm it's the right fix. The patch was not yet tested.

Thanks, I have applied it to the ALSA tree now.  Let me know if it's the
wrong fix.


Takashi

Re: [PATCH] Add IPv6 support to TCP SYN cookies

2008-02-05 Thread Andi Kleen

> +static __init int init_syncookies(void)
> +{
> + get_random_bytes(syncookie_secret, sizeof(syncookie_secret));
> + return 0;
> +}
> +module_init(init_syncookies);

I didn't think a module could have multiple module_inits. Are you
sure that works?
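
For reference, a minimal userspace sketch of the conventional alternative, assuming the worry is that a loadable module has only one init entry point. As far as I know, module_init() in a module build aliases the function to init_module, so a second use in the same module would clash, while built-in code simply gets another initcall. Here one init function calls the sub-initializers in turn. `other_init`, `combined_init` and `fake_random_bytes` are hypothetical names; only `init_syncookies` and `syncookie_secret` come from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static unsigned char syncookie_secret[16];
static int secret_ready;

/* Stand-in for the kernel's get_random_bytes(); fills with a fixed pattern. */
static void fake_random_bytes(void *buf, size_t len)
{
	assert(buf != NULL);
	memset(buf, 0x5a, len);
}

/* Sub-initializer modelled on the quoted patch, minus __init. */
static int init_syncookies(void)
{
	fake_random_bytes(syncookie_secret, sizeof(syncookie_secret));
	secret_ready = 1;
	return 0;
}

/* Hypothetical second initializer; fails on request for demonstration. */
static int other_init(int fail)
{
	return fail ? -1 : 0;
}

/* The single function that would be handed to module_init(). */
static int combined_init(int fail_other)
{
	int err;

	err = init_syncookies();
	if (err)
		return err;

	err = other_init(fail_other);
	if (err) {
		secret_ready = 0;	/* undo the earlier step on failure */
		return err;
	}
	return 0;
}
```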

-Andi


Re: T61P sound issue

2008-02-05 Thread Takashi Iwai

At Tue, 5 Feb 2008 22:16:08 +0100 (CET),
Jiri Kosina wrote:
> 
> [ added Takashi ]
> 
> On Tue, 5 Feb 2008, Felipe Balbi wrote:
> 
> > > > > > Could anyone make T61P's ICH8 sound controller to work properly?
> > Good that there's a lot of people using T61p, it's a good machine.
> > I'll upgrade my BIOS and try again the crappy sound.
> 
> I have just bought X61s, and it seems to have the very same soundcard as 
> your T61p does:
> 
> 00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)
> Subsystem: Lenovo Lenovo Thinkpad T61
> Flags: bus master, fast devsel, latency 0, IRQ 17
> Memory at fe22 (64-bit, non-prefetchable) [size=16K]
> Capabilities: [50] Power Management version 2
> Capabilities: [60] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
> Capabilities: [70] Express Unknown type IRQ 0
> Capabilities: [100] Virtual Channel
> Capabilities: [130] Unknown (5)
> 
> The sound also doesn't work with 2.6.24 (tried modprobing the 
> snd-hda-intel with 'model=thinkpad', didn't make any difference). The 
> mixer settings seem to be correct, but there is no sound.
> 
> Two strange things in alsamixer:
> 
> - it is possible to change volume of the "PCM" toggle, but it is missing 
>   the possibility to mute/unmute (the box with "MM"/"OO" simply isn't 
>   there)
> 
> - the "Headphone" toggle has "OO" as it is unmuted, but there is no 
>   possibility to change its volume, the volume box is completely missing

These are intentional.  There are no proper widgets (in HD-audio terms,
corresponding to registers) to serve these purposes.
On 2.6.25, you have an additional "Master" volume and switch, so
these won't matter much, I hope.

> Takashi, if you need any other information which would help resolving 
> this, please let me know.

First, make sure that you unmuted the hardware volume / mute via
laptop keyboards.

If everything looks OK ("PCM" adjusted, "Headphone" and "Speaker"
unmuted) but still it doesn't work, it can be a driver problem.
If you set CONFIG_SND_HDA_POWER_SAVE, try to undefine it.

Or, try Ingo's patch that Ted suggested.  (Note that it couldn't get
into the mainline because it causes problems on other devices.
2.6.24-git already includes this patch, but with some workarounds.)


Takashi

Re: [git pull] Input updates for 2.6.25-rc0

2008-02-05 Thread Andrew Morton

On Wed, 6 Feb 2008 01:08:30 -0500 Dmitry Torokhov <[EMAIL PROTECTED]> wrote:

> Andrew Morton (1):
>   Input: i8042 - non-x86 build fix

Also, please prefer to fold patches like this into the patch which they
fix, to avoid breaking git-bisect, thanks.

(I add a little one-line mention in the changelog so the contributor gets
a nod, and as a record of what happened)

Re: [PATCH] Whine about suspicious return values from module's ->init() hook

2008-02-05 Thread Rusty Russell

On Wednesday 06 February 2008 10:37:52 Andrew Morton wrote:
> On Wed, 6 Feb 2008 09:48:10 +1100
>
> Rusty Russell <[EMAIL PROTECTED]> wrote:
> > > It's a no-brainer.
> >
> > For non-developers, WARN_ON is a noop.
>
> Oh..  Rusty.  The mailing list and bugzilla are *full* of WARN_ON reports
> from testers.  Your statement is empirically wrong.

My apologies.  I had extrapolated from my own behaviour: I don't notice 
WARN_ON unless something else goes wrong to make me look in the logs.

> > BUG_ON() will make us fix it in return for short-term pain.
>
> Pain to our users and testers.  People upon whom we are very dependent and
> to whom we are hugely indebted.  People who I have to spend a lot of time
> defending from the likes of you!

I think you misunderstand.  I proposed that we audit all the code before such 
a change.  We shouldn't do *anything* until we can estimate the impact this 
change will have.

Our users deserve better than "I don't know if this will break anything so I 
used WARN_ON".  They deserve "we have confidence that this change won't break 
any existing code".

Now, if an audit is impractical or unreliable, we are better off with a 
WARN_ON.  But it is still an admission of ignorance.
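
The trade-off can be sketched in userspace as follows, with hypothetical `SKETCH_` macros standing in for the kernel's WARN_ON() and BUG_ON(): the warning variant logs and carries on, while the bug variant stops immediately. `check_init_ret` is an assumed helper for illustration, not the code from the patch under discussion.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static int warn_count;

/* Userspace stand-in for WARN_ON(): log, count, and evaluate to 1 if hit. */
#define SKETCH_WARN_ON(cond) \
	((cond) ? (warn_count++, fprintf(stderr, "WARNING: %s\n", #cond), 1) : 0)

/* Userspace stand-in for BUG_ON(): log and abort the program if hit. */
#define SKETCH_BUG_ON(cond) \
	do { \
		if (cond) { \
			fprintf(stderr, "BUG: %s\n", #cond); \
			abort(); \
		} \
	} while (0)

/*
 * Hypothetical check of a module init return value, where 0 or a
 * negative errno is expected. A positive value is reported and then
 * treated as success so that execution continues.
 */
static int check_init_ret(const char *modname, int ret)
{
	assert(modname != NULL);
	if (SKETCH_WARN_ON(ret > 0)) {
		fprintf(stderr, "%s: init returned %d, expected 0 or -errno\n",
			modname, ret);
		return 0;
	}
	return ret;
}
```

Swapping SKETCH_WARN_ON for SKETCH_BUG_ON in the helper would turn the same condition into an immediate stop, which is the short-term pain referred to above.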

Cheers,
Rusty.

Re: [git pull] Input updates for 2.6.25-rc0

2008-02-05 Thread Dmitry Torokhov

On Wednesday 06 February 2008 01:32, Andrew Morton wrote:
> 
> Looks OK.  Minorish things from a quick scan:
> 
> 
> 
> tosakbd_scankeyboard() looks like it'll spend a perfectly wicked amount of
> time under spin_lock_irqsave().
>

I think you are right. I will check with Dmitry if it can be relaxed a bit.
 
> 
> 
> This code, in tosakbd_probe():
> 
> +fail2:
> + while (--i >= 0)
> + gpio_free(TOSA_GPIO_KEY_STROBE(i));
> +
> + i = TOSA_KEY_SENSE_NUM;
> +fail:
> + while (--i >= 0) {
> + free_irq(gpio_to_irq(TOSA_GPIO_KEY_SENSE(i)), pdev);
> + gpio_free(TOSA_GPIO_KEY_SENSE(i));
> + }
> 
> looks like it'll free irqs and gpios which were never allocated (if i <
> TOSA_KEY_SENSE_NUM on entry).
>

Umm? There are 2 groups of gpios (sense and strobe) with sense group
registered first. Looks ok to me.
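
A userspace sketch of that unwind order, assuming the sense group is acquired first and that the `fail` label is only ever reached with `i` equal to the number of sense entries already acquired. The sizes, the `acquire`/`release` helpers and the `bad_frees` counter are hypothetical; whether the real driver meets the same assumption is not verified here.

```c
#include <assert.h>
#include <stdbool.h>

#define KEY_SENSE_NUM  3	/* illustrative sizes, not the Tosa values */
#define KEY_STROBE_NUM 4

static bool sense_held[KEY_SENSE_NUM];
static bool strobe_held[KEY_STROBE_NUM];
static int bad_frees;		/* releases of something never acquired */

static int acquire(bool *slot, bool fail)
{
	if (fail)
		return -1;
	*slot = true;
	return 0;
}

static void release(bool *slot)
{
	if (!*slot)
		bad_frees++;
	*slot = false;
}

/* Number of resources currently held across both groups. */
static int held_count(void)
{
	int i, n = 0;

	for (i = 0; i < KEY_SENSE_NUM; i++)
		n += sense_held[i];
	for (i = 0; i < KEY_STROBE_NUM; i++)
		n += strobe_held[i];
	return n;
}

/* fail_at: which acquisition (0-based, across both groups) fails; -1 for none. */
static int probe(int fail_at)
{
	int i, step = 0;

	assert(fail_at >= -1);

	for (i = 0; i < KEY_SENSE_NUM; i++)
		if (acquire(&sense_held[i], step++ == fail_at))
			goto fail;

	for (i = 0; i < KEY_STROBE_NUM; i++)
		if (acquire(&strobe_held[i], step++ == fail_at))
			goto fail2;

	return 0;

fail2:
	while (--i >= 0)		/* undo the strobe entries acquired so far */
		release(&strobe_held[i]);
	i = KEY_SENSE_NUM;		/* every sense entry was acquired earlier */
fail:
	while (--i >= 0)		/* undo the sense entries acquired so far */
		release(&sense_held[i]);
	return -1;
}
```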
 
> 
> 
> +static int __devinit tosakbd_probe(struct platform_device *pdev) {
> 
> please integrate checkpatch into your merging process.
>

Will do.
 
> 
> 
> 
> i8042_platform_init():
> 
> +#if defined(__i386__) || defined(__x86_64__)
> 
> use #ifdef CONFIG_X86?
> 

I considered it, but the above was tested and is in line with the style of the
rest of the file...
 
-- 
Dmitry

Re: [2.6.24 regression][BUGFIX] numactl --interleave=all doesn't works on memoryless node.

2008-02-05 Thread KOSAKI Motohiro

Hi Lee-san

> Here's a patch that addresses the problem w/o requiring change to
> numactl or libnuma.  It DOES have side effects, discussed in the
> description.

Thank you!

But unfortunately, my machine physically broke today ;-)
I will test it tomorrow or later.




Re: [PATCH for review] ACPI: Create /sys/firmware/acpi/interrupts/ counters

2008-02-05 Thread Greg KH

On Wed, Feb 06, 2008 at 01:33:25AM -0500, Len Brown wrote:
> Thanks for the reply, Greg.
> 
> If these counters are not always available by default, then
> I've lost most of their value for supporting systems in the field.
> 
> Plus, I think I can live with just the individual sysfs files,
> given Bjorn's "grep . *" tip.
> 
> So if I follow the rules, do you think it is okay to put
> these stats in sysfs?  In the hopes the answer is "yes",
> I'll reply with a refreshed patch

Sure, one value per file is fine.  I'll be glad to review a patch like
that, with a Documentation/ABI/ entry :)

thanks,

greg k-h

[PATCH] ACPI: create /sys/firmware/acpi/interrupts/ (v2)

2008-02-05 Thread Len Brown

From: Len Brown <[EMAIL PROTECTED]>

See Documentation/ABI/testing/sysfs-firmware-acpi

Inspired-by: Luming Yu <[EMAIL PROTECTED]>
Signed-off-by: Len Brown <[EMAIL PROTECTED]>
---
 Documentation/ABI/testing/sysfs-firmware-acpi |   99 
 drivers/acpi/events/evevent.c |2 +-
 drivers/acpi/events/evgpe.c   |2 +-
 drivers/acpi/osl.c|   12 ++-
 drivers/acpi/system.c |  208 +
 drivers/acpi/utilities/utglobal.c |2 -
 include/acpi/acglobal.h   |4 -
 include/acpi/acpiosxf.h   |3 +
 include/linux/acpi.h  |2 +
 9 files changed, 325 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-firmware-acpi

diff --git a/Documentation/ABI/testing/sysfs-firmware-acpi 
b/Documentation/ABI/testing/sysfs-firmware-acpi
new file mode 100644
index 000..9470ed9
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-firmware-acpi
@@ -0,0 +1,99 @@
+What:  /sys/firmware/acpi/interrupts/
+Date:  February 2008
+Contact:   Len Brown <[EMAIL PROTECTED]>
+Description:
+   All ACPI interrupts are handled via a single IRQ,
+   the System Control Interrupt (SCI), which appears
+   as "acpi" in /proc/interrupts.
+
+   However, one of the main functions of ACPI is to make
+   the platform understand random hardware without
+   special driver support.  So while the SCI handles a few
+   well known (fixed feature) interrupt sources, such
+   as the power button, it can also handle a variable
+   number of "General Purpose Events" (GPE).
+
+   A GPE vectors to a specified handler in AML, which
+   can do anything the BIOS writer wants from
+   OS context.  GPE 0x12, for example, would vector
+   to a level or edge handler called _L12 or _E12.
+   The handler may do its business and return.
+   Or the handler may send a Notify event
+   to a Linux device driver registered on an ACPI device,
+   such as a battery, or a processor.
+
+   To figure out where all the SCI's are coming from,
+   /sys/firmware/acpi/interrupts contains a file listing
+   every possible source, and the count of how many
+   times it has triggered.
+
+   $ cd /sys/firmware/acpi/interrupts
+   $ grep . *
+   error:0
+   ff_gbl_lock:0
+   ff_pmtimer:0
+   ff_pwr_btn:0
+   ff_rt_clk:0
+   ff_slp_btn:0
+   gpe00:0
+   gpe01:0
+   gpe02:0
+   gpe03:0
+   gpe04:0
+   gpe05:0
+   gpe06:0
+   gpe07:0
+   gpe08:0
+   gpe09:174
+   gpe0A:0
+   gpe0B:0
+   gpe0C:0
+   gpe0D:0
+   gpe0E:0
+   gpe0F:0
+   gpe10:0
+   gpe11:60
+   gpe12:0
+   gpe13:0
+   gpe14:0
+   gpe15:0
+   gpe16:0
+   gpe17:0
+   gpe18:0
+   gpe19:7
+   gpe1A:0
+   gpe1B:0
+   gpe1C:0
+   gpe1D:0
+   gpe1E:0
+   gpe1F:0
+   gpe_all:241
+   sci:241
+
+   sci - The total number of times the ACPI SCI
+   has claimed an interrupt.
+
+   gpe_all - count of SCI caused by GPEs.
+
+   gpeXX - count for individual GPE source
+
+   ff_gbl_lock - Global Lock
+
+   ff_pmtimer - PM Timer
+
+   ff_pwr_btn - Power Button
+
+   ff_rt_clk - Real Time Clock
+
+   ff_slp_btn - Sleep Button
+
+   error - an interrupt that can't be accounted for above.
+
+   Root has permission to clear any of these counters.  Eg.
+   # echo 0 > gpe11
+
+   All counters can be cleared by clearing the total "sci":
+   # echo 0 > sci
+
+   None of these counters has an effect on the function
+   of the system, they are simply statistics.
diff --git a/drivers/acpi/events/evevent.c b/drivers/acpi/events/evevent.c
index e412878..3048801 100644
--- a/drivers/acpi/events/evevent.c
+++ b/drivers/acpi/events/evevent.c
@@ -259,7 +259,7 @@ u32 acpi_ev_fixed_event_detect(void)
enable_bit_mask)) {
 
/* Found an active (signalled) event */
-
+   acpi_os_fixed_event_count(i);
int_status |= acpi_ev_fixed_event_dispatch((u32) i);
}
}
diff --git
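
The sample listing in the ABI description above can be cross-checked with a small userspace sketch: it parses "name:count" lines as printed by `grep . *` and sums the per-GPE counters so the result can be compared with `gpe_all`. The function names and the 32-byte name buffer are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse one "name:count" line. Returns 0 on success, -1 on malformed input. */
static int parse_counter(const char *line, char *name, size_t name_len,
			 unsigned long *count)
{
	const char *colon;
	char *end;
	size_t n;

	assert(line != NULL && name != NULL && count != NULL);

	colon = strchr(line, ':');
	if (!colon || colon == line)
		return -1;

	n = (size_t)(colon - line);
	if (n >= name_len)
		return -1;	/* name would not fit with its terminator */
	memcpy(name, line, n);
	name[n] = '\0';

	*count = strtoul(colon + 1, &end, 10);
	if (end == colon + 1)
		return -1;	/* no digits after the colon */
	return 0;
}

/* Sum the individual "gpeXX" counters, skipping the "gpe_all" total. */
static unsigned long sum_gpe(const char *const *lines, size_t nlines)
{
	unsigned long total = 0, count;
	char name[32];
	size_t i;

	for (i = 0; i < nlines; i++) {
		if (parse_counter(lines[i], name, sizeof(name), &count))
			continue;
		if (strncmp(name, "gpe", 3) == 0 && strcmp(name, "gpe_all") != 0)
			total += count;
	}
	return total;
}
```

With the non-zero entries from the sample (gpe09:174, gpe11:60, gpe19:7), the sum is 241, which matches the listed gpe_all and sci values.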

Re: RT scheduler config, suggestions and questions

2008-02-05 Thread Peter Zijlstra


On Tue, 2008-02-05 at 15:37 -0800, Max Krasnyanskiy wrote:
> Folks,
> 
> I just realized that in the latest Linus tree the following sysctls are under
> SCHED_DEBUG:
>   sched_rt_period
>   sched_rt_ratio
> 
> I do not believe that is correct. I know that we do not want to expose
> scheduler knobs in general, but these are not the heuristic kind of knobs.
> There is no way the scheduler can magically figure out what the correct
> setting should be here.

Yeah, since fixed.

> Also, shouldn't those new RT features that recently went in be configurable
> and _disabled_ by default? For example, "RT watchdog" and "RT throttling"
> actually seem very questionable. SCHED_FIFO is clearly defined as
> "
>   A SCHED_FIFO process runs until either it is blocked by an I/O request, it
>   is preempted by a higher priority process, or it calls sched_yield(2).
> "

The watchdog is disabled by default, the bandwidth is .95s every 1s,
which is mainly a safe-guard against run-away real-time tasks. As long
as real-time usage stays within those limits nothing happens. If you
don't like it set sched_rt_runtime [*] to -1.

[*] provided in the interface changes posted a few days ago.
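
As a rough model of the limit described above (0.95s of real-time execution per 1s period, with -1 meaning no limit), here is a userspace sketch. The struct, its field names and the accounting functions are hypothetical and only mirror the numbers in this mail, not the scheduler's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one throttling period; all values in microseconds. */
struct rt_bandwidth_sketch {
	long period_us;		/* length of the accounting period */
	long runtime_us;	/* RT budget per period, -1 means unlimited */
	long used_us;		/* RT time consumed so far in this period */
};

/* Charge delta_us of RT execution; return true if RT tasks are now throttled. */
static bool rt_charge(struct rt_bandwidth_sketch *bw, long delta_us)
{
	assert(bw->period_us > 0 && delta_us >= 0);

	bw->used_us += delta_us;
	if (bw->runtime_us < 0)
		return false;	/* limit disabled */
	return bw->used_us > bw->runtime_us;
}

/* Start a new period: the budget is replenished. */
static void rt_new_period(struct rt_bandwidth_sketch *bw)
{
	bw->used_us = 0;
}
```

In this model, real-time usage that stays within 950000us per 1000000us period is never throttled, and setting the budget to -1 removes the limit altogether.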

> Both the watchdog and the throttling are clearly breaking that rule. I think
> it's good to have those features, but not enabled by default and certainly
> not with the sysctls that disable them hidden under debugging.
> How about this:
> - We introduce Kconfig options for them ?

I don't see why this would be needed.

> - Expose all rt sysctls outside of #ifdef DEBUG

Already did this somewhere along the line.

> btw I can see "watchdog" being very useful to catch hard-RT tasks that exceed
> the deadline.
> But it's gotta be per thread.

It is.

> Single setting per user is not enough. Unless a user has a single RT task.

?


Re: [PATCH for review] ACPI: Create /sys/firmware/acpi/interrupts/ counters

2008-02-05 Thread Len Brown

Thanks for the reply, Greg.

If these counters are not always available by default, then
I've lost most of their value for supporting systems in the field.

Plus, I think I can live with just the individual sysfs files,
given Bjorn's "grep . *" tip.

So if I follow the rules, do you think it is okay to put
these stats in sysfs?  In the hopes the answer is "yes",
I'll reply with a refreshed patch

cheers,
-Len

Re: [git pull] Input updates for 2.6.25-rc0

2008-02-05 Thread Andrew Morton


Looks OK.  Minorish things from a quick scan:



tosakbd_scankeyboard() looks like it'll spend a perfectly wicked amount of
time under spin_lock_irqsave().



This code, in tosakbd_probe():

+fail2:
+   while (--i >= 0)
+   gpio_free(TOSA_GPIO_KEY_STROBE(i));
+
+   i = TOSA_KEY_SENSE_NUM;
+fail:
+   while (--i >= 0) {
+   free_irq(gpio_to_irq(TOSA_GPIO_KEY_SENSE(i)), pdev);
+   gpio_free(TOSA_GPIO_KEY_SENSE(i));
+   }

looks like it'll free irqs and gpios which were never allocated (if i <
TOSA_KEY_SENSE_NUM on entry).



+static int __devinit tosakbd_probe(struct platform_device *pdev) {

please integrate checkpatch into your merging process.




i8042_platform_init():

+#if defined(__i386__) || defined(__x86_64__)

use #ifdef CONFIG_X86?



Re: {2.6.22.y} quicklists must keep even off node pages on the quicklists until the TLB flush has been completed.

2008-02-05 Thread Oliver Pinter

I use this without errors ... but the machine is an i386 desktop

On Feb 6, 2008 7:02 AM, Dhaval Giani <[EMAIL PROTECTED]> wrote:
>
> On Tue, Feb 05, 2008 at 10:06:02PM +0100, Oliver Pinter wrote:
> > it is already in the queue for 2.6.23,
> >
> > 8<-
> > >From [EMAIL PROTECTED] Sat Dec 22 14:04:08 2007
> > From: Christoph Lameter <[EMAIL PROTECTED]>
> > Date: Sat, 22 Dec 2007 14:03:23 -0800
> > Subject: quicklists: do not release off node pages early
> > To: [EMAIL PROTECTED]
> > Cc: [EMAIL PROTECTED], [EMAIL PROTECTED],
> > [EMAIL PROTECTED], [EMAIL PROTECTED]
> > Message-ID: <[EMAIL PROTECTED]>
> >
> >
> > From: Christoph Lameter <[EMAIL PROTECTED]>
> >
> > patch ed367fc3a7349b17354c7acef55157764859 in mainline.
> >
> > quicklists must keep even off node pages on the quicklists until the TLB
> > flush has been completed.
> >
> > Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> > Cc: Dhaval Giani <[EMAIL PROTECTED]>
> > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> > Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
> > Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>
> >
> > ---
> >  include/linux/quicklist.h |8 
> >  1 file changed, 8 deletions(-)
> >
> > --- a/include/linux/quicklist.h
> > +++ b/include/linux/quicklist.h
> > @@ -56,14 +56,6 @@ static inline void __quicklist_free(int
> >   struct page *page)
> >  {
> >   struct quicklist *q;
> > - int nid = page_to_nid(page);
> > -
> > - if (unlikely(nid != numa_node_id())) {
> > - if (dtor)
> > - dtor(p);
> > - __free_page(page);
> > - return;
> > - }
> >
> >   q = _cpu_var(quicklist)[nr];
> >   *(void **)p = q->page;
> >
> > >8--
> > Tested-by: Oliver Pinter <[EMAIL PROTECTED]> (on i386)
> >
>
> Christoph,
>
> Is this one also supposed to be backported?
>
> --
> regards,
> Dhaval
>



-- 
Thanks,
Oliver

Re: uli526x doesn't get link if no link when interface is set up

2008-02-05 Thread Grant Grundler

On Sat, Feb 02, 2008 at 03:56:38PM +0100, Santiago Garcia Mantinan wrote:
> Hi!
> 
> I've been experiencing problems with this (internal) card ever since I
> bought this motherboard, lately I've been doing some tests and I found out
> some things, maybe not enough to let us debug this, but I'll explain it just
> in case.

Santiago,
Thanks for the excellent bug report.
Is this perhaps the same problem as described here?

http://bugzilla.kernel.org/show_bug.cgi?id=5839

It sounds similar but I only skimmed over your report.
If you think it is, could you add (just cut/paste)
your email to that bug for me?
Could you also try the patch I attached to that bug report?
If it works for you, we can both pester Kyle McMartin to
push it upstream. ;)

Thanks for the excellent bug report...I'll take a closer look
at it later this week or weekend.

thanks,
grant

> 
> The problem is that if the uli526x card is set up (ifconfig ethX up) when
> there is no cable plugged in, or the cable is plugged into something that is
> off at that time (that is, when there is no link for the card to detect),
> then when the cable or the switch or whatever is plugged in, the link is not
> detected by the uli526x card and thus no link is established (the LED on the
> switch remains off).
> 
> This is the status reported by ethtool when the link should be on but the
> driver is not detecting this condition and thus it is off:
> 
> Supported ports: [ MII ]
> Supported link modes:   10baseT/Half 10baseT/Full 
> 100baseT/Half 100baseT/Full 
> Supports auto-negotiation: Yes
> Advertised link modes:  10baseT/Half 10baseT/Full 
> 100baseT/Half 100baseT/Full 
> Advertised auto-negotiation: Yes
> Speed: Unknown! (65535)
> Duplex: Unknown! (255)
> Port: MII
> PHYAD: 1
> Transceiver: external
> Auto-negotiation: on
> Supports Wake-on: pg
> Wake-on: d
> Link detected: no
> 
> If at this time we run "ifconfig ethX down" and, after some time, "ifconfig
> ethX up", then the link is detected; but if we run these two commands without
> waiting between them, the link remains undetected.
> 
> In fact, if with an established and detected link we run "ifconfig ethX
> down;ifconfig ethX up", the link is lost again and is not detected until we
> run the two commands with some time between them.
> 
> Once the link is established, if we don't touch the interface config (we don't
> ifconfig it down), then we can unplug the cable or turn off the switch or
> whatever, and the card will detect the link whenever it becomes available
> again.
> 
> This makes me think that the problem is related to the way in which we are
> setting the card up.
> 
> I'm running 2.6.24 on an amd64 machine; these are the messages I get from the
> driver on load:
> uli526x: ULi M5261/M5263 net driver, version 0.9.3 (2005-7-29)
> ACPI: PCI Interrupt :00:12.0[A] -> GSI 20 (level, low) -> IRQ 20
> eth1: ULi M5263 at pci:00:12.0, 00:13:8f:a7:af:b4, irq 20.
> uli526x: eth1 NIC Link is Up 100 Mbps Full duplex
> 
> This is my lspci output:
> 00:12.0 0200: 10b9:5263 (rev 60)
> Subsystem: 1849:5263
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- 
> Stepping- SERR+ FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
> SERR-  Latency: 32 (5000ns min, 1ns max), Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 20
> Region 0: I/O ports at c800 [size=256]
> Region 1: Memory at dedefc00 (32-bit, non-prefetchable) [size=256]
> Capabilities: [50] Power Management version 2
> Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA 
> PME(D0-,D1-,D2-,D3hot+,D3cold+)
> Status: D0 PME-Enable- DSel=0 DScale=0 PME-
> Kernel driver in use: uli526x
> Kernel modules: uli526x
> 
> I don't know what else to add, but I offer to run any tests you need.
> 
> Regards...
> -- 
> Manty/BestiaTester -> http://manty.net

[git pull] Input updates for 2.6.25-rc0

2008-02-05 Thread Dmitry Torokhov

Hi Linus,

Please pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git for-linus
or
master.kernel.org:/pub/scm/linux/kernel/git/dtor/input.git for-linus

to receive updates for the input subsystem. Most of the changes were in -mm for
some time with the exception of PXA keypad driver update. However, since there
are currently no users of said driver in mainline kernel, I consider the update
pretty safe. 

Changelog:
----------

Andre Haupt (1):
  Input: remove duplicate includes

Andrew Morton (1):
  Input: i8042 - non-x86 build fix

Bruce Duncan (1):
  Input: i8042 - enable DMI quirks on x86-64

Carlos Corbacho (2):
  Input: i8042 - add Dritek keyboard extension quirk
  Input: i8042 - add Dritek quirk for Acer Aspire 9110

David Brownell (1):
  Input: ads7846 - stop updating dev->power.power_state

Dmitry Baryshkov (1):
  Input: add Tosa keyboard driver

Dmitry Torokhov (10):
  Input: Add proper locking when changing device's keymap
  Input: keyspan_remote - add support for loadable keymaps
  Input: atlas_btns - add support for loadable keymaps
  Input: cobalt_btns - add support for loadable keymaps
  Input: atkbd - remove unneeded synchronize_sched()
  Input: i8042 - use synchronize_irq() instead of synchronize_sched()
  Input: iforce - don't access input_dev->private directly
  V4L/DVB: Don't access input_dev->private directly
  Input: remove cdev from input_dev structure
  Input: mousedev - use BIT_MASK instead of BIT

Eric Miao (8):
  Input: pxa27x_keypad - rename the driver (was pxa27x_keyboard)
  Input: pxa27x_keypad - remove pin configuration from the driver
  Input: pxa27x_keypad - introduce driver structure and use KEY() to define matrix keys
  Input: pxa27x_keypad - introduce pxa27x_keypad_config()
  Input: pxa27x_keypad - enable rotary encoders and direct keys
  Input: pxa27x_keypad - use device resources for I/O memory mapping and IRQ
  Input: pxa27x_keypad - add debounce_interval to the keypad platform data
  Input: pxa27x_keypad - also enable on PXA3xx

Francisco Alecrim (1):
  Input: remove duplicated headers in drivers/char/keyboard.c

Giel de Nijs (1):
  Input: atkbd - properly handle special keys on Dell Latitudes

Jan Engelhardt (1):
  Input: constify function pointer tables (seq_operations)

Jiri Kosina (1):
  Input: i8042 - add Fujitsu-Siemens Amilo Pro 2010 to nomux list

Julia Lawall (1):
  Input: drop redundant includes of moduleparam.h

Richard Purdie (1):
  Input: add input event to APM event bridge

Stephen Hemminger (2):
  Input: implement proper timer rounding for polled devices
  Input: add driver for Fujitsu application buttons

Steven Whitehouse (1):
  Input: fix bug in example code

Diffstat:


 Documentation/input/input-programming.txt   |    2 +-
 arch/arm/mach-pxa/tosa.c                    |   43 ++
 drivers/char/keyboard.c                     |    5 +-
 drivers/input/Kconfig                       |   12 +
 drivers/input/Makefile                      |    1 +
 drivers/input/apm-power.c                   |  131 +
 drivers/input/evdev.c                       |    6 +-
 drivers/input/input-polldev.c               |   18 +-
 drivers/input/input.c                       |   85 +++-
 drivers/input/joystick/amijoy.c             |    1 -
 drivers/input/joystick/analog.c             |    1 -
 drivers/input/joystick/db9.c                |    1 -
 drivers/input/joystick/gamecon.c            |    1 -
 drivers/input/joystick/iforce/iforce-main.c |   17 +-
 drivers/input/joystick/turbografx.c         |    1 -
 drivers/input/joystick/xpad.c               |    1 -
 drivers/input/keyboard/Kconfig              |   29 +-
 drivers/input/keyboard/Makefile             |    3 +-
 drivers/input/keyboard/atkbd.c              |   91 +++-
 drivers/input/keyboard/lkkbd.c              |    1 -
 drivers/input/keyboard/pxa27x_keyboard.c    |  274 --
 drivers/input/keyboard/pxa27x_keypad.c      |  572 +
 drivers/input/keyboard/tosakbd.c            |  415 +++
 drivers/input/misc/Kconfig                  |   14 +
 drivers/input/misc/Makefile                 |    1 +
 drivers/input/misc/apanel.c                 |  378 ++
 drivers/input/misc/ati_remote.c             |    1 -
 drivers/input/misc/atlas_btns.c             |   39 +-
 drivers/input/misc/cobalt_btns.c            |   73 ++--
 drivers/input/misc/keyspan_remote.c         |  119 +++--
 drivers/input/mouse/inport.c                |    1 -
 drivers/input/mouse/logibm.c                |    1 -
 drivers/input/mouse/psmouse-base.c          |    1 -
 drivers/input/mouse/trackpoint.c            |    1 -

Re: {2.6.22.y} quicklists must keep even off node pages on the quicklists until the TLB flush has been completed.

2008-02-05 Thread Dhaval Giani

On Tue, Feb 05, 2008 at 10:06:02PM +0100, Oliver Pinter wrote:
> it is already in the queue for 2.6.23,
> 
> 8<-
> >From [EMAIL PROTECTED] Sat Dec 22 14:04:08 2007
> From: Christoph Lameter <[EMAIL PROTECTED]>
> Date: Sat, 22 Dec 2007 14:03:23 -0800
> Subject: quicklists: do not release off node pages early
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED], [EMAIL PROTECTED],
> [EMAIL PROTECTED], [EMAIL PROTECTED]
> Message-ID: <[EMAIL PROTECTED]>
> 
> 
> From: Christoph Lameter <[EMAIL PROTECTED]>
> 
> patch ed367fc3a7349b17354c7acef55157764859 in mainline.
> 
> quicklists must keep even off node pages on the quicklists until the TLB
> flush has been completed.
> 
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> Cc: Dhaval Giani <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
> Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>
> 
> ---
>  include/linux/quicklist.h |8 
>  1 file changed, 8 deletions(-)
> 
> --- a/include/linux/quicklist.h
> +++ b/include/linux/quicklist.h
> @@ -56,14 +56,6 @@ static inline void __quicklist_free(int
>   struct page *page)
>  {
>   struct quicklist *q;
> - int nid = page_to_nid(page);
> -
> - if (unlikely(nid != numa_node_id())) {
> - if (dtor)
> - dtor(p);
> - __free_page(page);
> - return;
> - }
> 
>   q = &get_cpu_var(quicklist)[nr];
>   *(void **)p = q->page;
> 
> >8--
> Tested-by: Oliver Pinter <[EMAIL PROTECTED]> (on i386)
> 

Christoph,

Is this one also supposed to be backported?

-- 
regards,
Dhaval
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Access MSR functions in userspace?

2008-02-05 Thread H. Peter Anvin


Maxim Levitsky wrote:

On Wednesday, 6 February 2008 05:29:47 Hasan Rashid wrote:

Is there a way to use RDMSR and WRMSR in userspace? I only need to write
to one register, writing a module with a userspace piece to handle that
seems like overkill.

TIA!



Fortunately, Linux has /dev/cpu/*/msr.
Take a look at $LINUX_SOURCES/arch/x86/kernel/msr.c to learn how to use it.



You might also find these useful:

http://www.kernel.org/pub/linux/utils/cpu/msr-tools/msr-tools-1.1.2.tar.bz2

-hpa
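
A minimal userspace sketch of the /dev/cpu/*/msr approach mentioned above, assuming the msr driver is loaded and the caller has sufficient privileges. The driver takes the MSR index as the file offset and transfers 8 bytes per register. The helper names msr_path() and read_u64_at() are made up for this illustration and are not part of msr-tools or the kernel.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Build the per-CPU device path, e.g. "/dev/cpu/0/msr".
 * Returns 0 on success, -1 if the buffer is too small. */
int msr_path(char *buf, size_t len, unsigned int cpu)
{
	int n = snprintf(buf, len, "/dev/cpu/%u/msr", cpu);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Read one 64-bit value at byte offset `reg` of `path`.
 * For the msr device the offset selects the MSR index, so this is
 * the userspace counterpart of RDMSR. MSR indices above 0x7fffffff
 * need a 64-bit off_t. Returns 0 on success, -1 on any failure. */
int read_u64_at(const char *path, uint32_t reg, uint64_t *val)
{
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	n = pread(fd, val, sizeof(*val), (off_t)reg);
	close(fd);
	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```

Writing works the same way with O_WRONLY and pwrite() at the same offset. As root, something like read_u64_at("/dev/cpu/0/msr", 0x10, &tsc) would be expected to return the time stamp counter on x86, but that call is untested here.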

Re: ide-tape redux (was: Re:)

2008-02-05 Thread Borislav Petkov

... and while we're at it ...

commit c824f79fe4040f7541d7e35c546bb57a22d2fe11
Author: Borislav Petkov <[EMAIL PROTECTED]>
Date:   Wed Feb 6 06:23:10 2008 +0100

ide-tape: move all struct and other defs to the top

Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>

diff --git a/drivers/ide/ide-tape.c b/drivers/ide/ide-tape.c
index 9455ce4..398aea8 100644
--- a/drivers/ide/ide-tape.c
+++ b/drivers/ide/ide-tape.c
@@ -225,6 +225,69 @@ enum {
PC_FL_WRITING   = (1 << 5),
 };
 
+/* Tape door status */
+#define DOOR_UNLOCKED  0
+#define DOOR_LOCKED  1
+#define DOOR_EXPLICITLY_LOCKED 2
+
+/* Tape flag bits values. */
+enum {
+   IDETAPE_FL_IGNORE_DSC   = (1 << 0),
+   /* 0 When the tape position is unknown */
+   IDETAPE_FL_ADDRESS_VALID= (1 << 1),
+   /* Device already opened */
+   IDETAPE_FL_BUSY = (1 << 2),
+   /* Error detected in a pipeline stage */
+   IDETAPE_FL_PIPELINE_ERR = (1 << 3),
+   /* Attempt to auto-detect the current user block size */
+   IDETAPE_FL_DETECT_BS= (1 << 4),
+   /* Currently on a filemark */
+   IDETAPE_FL_FILEMARK = (1 << 5),
+   /* DRQ interrupt device */
+   IDETAPE_FL_DRQ_INTERRUPT= (1 << 6),
+   /* pipeline active */
+   IDETAPE_FL_PIPELINE_ACTIVE  = (1 << 7),
+   /* 0 = no tape is loaded, so we don't rewind after ejecting */
+   IDETAPE_FL_MEDIUM_PRESENT   = (1 << 8),
+};
+
+/* A define for the READ BUFFER command */
+#define IDETAPE_RETRIEVE_FAULTY_BLOCK  6
+
+/* Some defines for the SPACE command */
+#define IDETAPE_SPACE_OVER_FILEMARK  1
+#define IDETAPE_SPACE_TO_EOD   3
+
+/* Some defines for the LOAD UNLOAD command */
+#define IDETAPE_LU_LOAD_MASK   1
+#define IDETAPE_LU_RETENSION_MASK  2
+#define IDETAPE_LU_EOT_MASK  4
+
+/*
+ * Special requests for our block device strategy routine.
+ *
+ * In order to service a character device command, we add special requests to
+ * the tail of our block device request queue and wait for their completion.
+ */
+
+enum {
+   REQ_IDETAPE_PC1 = (1 << 0), /* packet command (first stage) */
+   REQ_IDETAPE_PC2 = (1 << 1), /* packet command (second stage) */
+   REQ_IDETAPE_READ= (1 << 2),
+   REQ_IDETAPE_WRITE   = (1 << 3),
+   REQ_IDETAPE_READ_BUFFER = (1 << 4),
+};
+
+/* Error codes returned in rq->errors to the higher part of the driver. */
+#define IDETAPE_ERROR_GENERAL   101
+#define IDETAPE_ERROR_FILEMARK  102
+#define IDETAPE_ERROR_EOD   103
+
+/* Structures related to the SELECT SENSE / MODE SENSE packet commands. */
+#define IDETAPE_BLOCK_DESCRIPTOR   0
+#define IDETAPE_CAPABILITIES_PAGE   0x2a
+
+
 /* A pipeline stage. */
 typedef struct idetape_stage_s {
struct request rq;  /* The corresponding request */
@@ -445,68 +508,6 @@ static void ide_tape_put(struct ide_tape_obj *tape)
mutex_unlock(&idetape_ref_mutex);
 }
 
-/* Tape door status */
-#define DOOR_UNLOCKED  0
-#define DOOR_LOCKED  1
-#define DOOR_EXPLICITLY_LOCKED 2
-
-/* Tape flag bits values. */
-enum {
-   IDETAPE_FL_IGNORE_DSC   = (1 << 0),
-   /* 0 When the tape position is unknown */
-   IDETAPE_FL_ADDRESS_VALID= (1 << 1),
-   /* Device already opened */
-   IDETAPE_FL_BUSY = (1 << 2),
-   /* Error detected in a pipeline stage */
-   IDETAPE_FL_PIPELINE_ERR = (1 << 3),
-   /* Attempt to auto-detect the current user block size */
-   IDETAPE_FL_DETECT_BS= (1 << 4),
-   /* Currently on a filemark */
-   IDETAPE_FL_FILEMARK = (1 << 5),
-   /* DRQ interrupt device */
-   IDETAPE_FL_DRQ_INTERRUPT= (1 << 6),
-   /* pipeline active */
-   IDETAPE_FL_PIPELINE_ACTIVE  = (1 << 7),
-   /* 0 = no tape is loaded, so we don't rewind after ejecting */
-   IDETAPE_FL_MEDIUM_PRESENT   = (1 << 8),
-};
-
-/* A define for the READ BUFFER command */
-#define IDETAPE_RETRIEVE_FAULTY_BLOCK  6
-
-/* Some defines for the SPACE command */
-#define IDETAPE_SPACE_OVER_FILEMARK  1
-#define IDETAPE_SPACE_TO_EOD   3
-
-/* Some defines for the LOAD UNLOAD command */
-#define IDETAPE_LU_LOAD_MASK   1
-#define IDETAPE_LU_RETENSION_MASK  2
-#define IDETAPE_LU_EOT_MASK  4
-
-/*
- * Special requests for our block device strategy routine.
- *
- * In order to service a character device command, we add special requests to
- * the tail of our block device request queue and wait for their completion.
- */
-
-enum {
-   REQ_IDETAPE_PC1 = (1 << 0), /* packet command (first stage) */
-   REQ_IDETAPE_PC2 = (1 << 1), /* packet command (second stage) */
-   REQ_IDETAPE_READ= (1 << 2),
-

Re: ide-tape redux (was: Re:)

2008-02-05 Thread Borislav Petkov

On Tue, Feb 05, 2008 at 02:20:22AM +0100, Bartlomiej Zolnierkiewicz wrote:

[...]

> w.r.t. #11 ide-tape uses char devices and supports DSC so it is not as obvious
> as in ide-floppy case that all atomic bitops can be just removed (extra audit
> and some time -mm are required) so please resync/resubmit

Ok, here's what i think we should do here: There are two flags that handle DSC:
PC_FL_WAIT_FOR_DSC and IDETAPE_FL_IGNORE_DSC. The first one is per pc and is set
in all the packet command init functions ..create_bla_cmd() after their callers
have created a pc on the stack and reached its ptr down for initialization. This
case is carefree since the bit will be tested first in the interrupt handler and
this happens only after the pc is queued (ide_do_drive_cmd()) into the request
buffer.

The other flag, IDETAPE_FL_IGNORE_DSC, is polled for in the request handler and
can be set when a pc is being retried and we should leave only those atomic
tests intact, imho, but i'm definitely gonna need a second opinion here.
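
To make the distinction above concrete, here is a small userspace sketch, not the driver code. It assumes C11 <stdatomic.h> as a stand-in for the kernel's atomic bitops (which take a bit number and an unsigned long pointer). Flags that only one context touches use plain mask operations, while a flag that two contexts can race on keeps an atomic test-and-set. The struct and helper names are hypothetical.

```c
#include <stdatomic.h>

/* Mask-style flag values, in the spirit of the enum the patch introduces. */
enum {
	FL_IGNORE_DSC    = (1 << 0),
	FL_ADDRESS_VALID = (1 << 1),
	FL_BUSY          = (1 << 2),
};

struct tape_state {
	unsigned int flags;        /* only touched from one context */
	atomic_uint shared_flags;  /* may be touched from two contexts */
};

/* Plain, non-atomic helpers: fine when no other context can interleave. */
static inline void flag_set(struct tape_state *t, unsigned int mask)
{
	t->flags |= mask;
}

static inline void flag_clear(struct tape_state *t, unsigned int mask)
{
	t->flags &= ~mask;
}

static inline int flag_test(const struct tape_state *t, unsigned int mask)
{
	return (t->flags & mask) != 0;
}

/* Atomic test-and-set: returns 1 if the flag was already set. */
static inline int shared_test_and_set(struct tape_state *t, unsigned int mask)
{
	return (atomic_fetch_or(&t->shared_flags, mask) & mask) != 0;
}

/* Atomic test-and-clear: returns 1 if the flag was set before clearing. */
static inline int shared_test_and_clear(struct tape_state *t, unsigned int mask)
{
	return (atomic_fetch_and(&t->shared_flags, ~mask) & mask) != 0;
}
```

In this sketch a flag like IDETAPE_FL_IGNORE_DSC would live in shared_flags, while the per-pc flags could use the plain helpers. Whether that split is safe in the real driver is exactly the question raised above.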

---

commit 1ed8ae92249d5dff7af4ee88710ea08ff3f3356f
Author: Borislav Petkov <[EMAIL PROTECTED]>
Date:   Tue Feb 5 08:05:35 2008 +0100

ide-tape: remove atomic test/set macros

Also, since the driver supports DSC, leave the atomic tests
for the IDETAPE_FL_IGNORE_DSC bit untouched because this is polled
for in the request handler and can be set in the interrupt
handler through idetape_retry_pc() after enabling interrupts.

Finally, remove flag IDETAPE_READ_ERROR since it is unused.

Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>

diff --git a/drivers/ide/ide-tape.c b/drivers/ide/ide-tape.c
index e59e49e..9455ce4 100644
--- a/drivers/ide/ide-tape.c
+++ b/drivers/ide/ide-tape.c
@@ -206,24 +206,24 @@ typedef struct idetape_packet_command_s {
/* Temporary buffer */
u8 pc_buffer[IDETAPE_PC_BUFFER_SIZE];
/* Status/Action bit flags: long for set_bit */
-   unsigned long flags;
+   unsigned int flags;
 } idetape_pc_t;
 
-/*
- * Packet command flag bits.
- */
-/* Set when an error is considered normal - We won't retry */
-#define PC_ABORT  0
-/* 1 When polling for DSC on a media access command */
-#define PC_WAIT_FOR_DSC  1
-/* 1 when we prefer to use DMA if possible */
-#define PC_DMA_RECOMMENDED 2
-/* 1 while DMA in progress */
-#define PC_DMA_IN_PROGRESS  3
-/* 1 when encountered problem during DMA */
-#define PC_DMA_ERROR  4
-/* Data direction */
-#define PC_WRITING  5
+/* Packet command flag bits. */
+enum {
+   /* Set when an error is considered normal - We won't retry */
+   PC_FL_ABORT = (1 << 0),
+   /* 1 When polling for DSC on a media access command */
+   PC_FL_WAIT_FOR_DSC  = (1 << 1),
+   /* 1 when we prefer to use DMA if possible */
+   PC_FL_DMA_RECOMMENDED   = (1 << 2),
+   /* 1 while DMA in progress */
+   PC_FL_DMA_IN_PROGRESS   = (1 << 3),
+   /* 1 when encountered problem during DMA */
+   PC_FL_DMA_ERROR = (1 << 4),
+   /* Data direction */
+   PC_FL_WRITING   = (1 << 5),
+};
 
 /* A pipeline stage. */
 typedef struct idetape_stage_s {
@@ -357,8 +357,7 @@ typedef struct ide_tape_obj {
/* Wasted space in each stage */
int excess_bh_size;
 
-   /* Status/Action flags: long for set_bit */
-   unsigned long flags;
+   unsigned int flags;
/* protects the ide-tape queue */
spinlock_t lock;
 
@@ -451,20 +450,26 @@ static void ide_tape_put(struct ide_tape_obj *tape)
 #define DOOR_LOCKED  1
 #define DOOR_EXPLICITLY_LOCKED 2
 
-/*
- * Tape flag bits values.
- */
-#define IDETAPE_IGNORE_DSC 0
-#define IDETAPE_ADDRESS_VALID  1   /* 0 When the tape position is unknown */
-#define IDETAPE_BUSY   2   /* Device already opened */
-#define IDETAPE_PIPELINE_ERROR 3   /* Error detected in a pipeline stage */
-#define IDETAPE_DETECT_BS  4   /* Attempt to auto-detect the current user block size */
-#define IDETAPE_FILEMARK   5   /* Currently on a filemark */
-#define IDETAPE_DRQ_INTERRUPT  6   /* DRQ interrupt device */
-#define IDETAPE_READ_ERROR 7
-#define IDETAPE_PIPELINE_ACTIVE  8   /* pipeline active */
-/* 0 = no tape is loaded, so we don't rewind after ejecting */
-#define IDETAPE_MEDIUM_PRESENT 9
+/* Tape flag bits values. */
+enum {
+   IDETAPE_FL_IGNORE_DSC   = (1 << 0),
+   /* 0 When the tape position is unknown */
+   IDETAPE_FL_ADDRESS_VALID= (1 << 1),
+   /* Device already opened */
+   IDETAPE_FL_BUSY = (1 << 2),
+   /* Error detected in a pipeline stage */
+   IDETAPE_FL_PIPELINE_ERR = (1 << 3),
+   /* Attempt to auto-detect the

Re: [PATCH 0/9] firewire-sbp2: misc hotplug related patches

2008-02-05 Thread Jarod Wilson

On Sunday 03 February 2008 05:00:54 pm Stefan Richter wrote:
> Here is various stuff to hopefully improve fw-sbp2's behavior during bus
> resets.  The main piece is patch 9/9 which considerably raises the
> chance that ongoing I/O survives plugging and unplugging of other
> devices on the same bus as the device which services the I/O.
>
> The other patches are basically side products of patch 9/9 but contain
> quite useful fixes as well.
>
> I got quite good results with several OxSemi based SBP-2 devices

I've got one setup on which this doesn't seem to help much... Two firewire 
drives (both ox911 bridge, v4.0 firmware) hooked to a system, both of which 
are recognized, logged into, etc., on startup. However, pretty much without 
fail, at least one of them has to perform a reconnection. That claims to 
succeed, but the device isn't actually usable when this happens -- 
fdisk /dev/sdx fails with 'unable to read /dev/sdx'.

Example dmesg output when one of the two drives has to be reconnected:

firewire_core: created device fw0: GUID 00023c0031037366, S400
scsi6 : SBP-2 IEEE-1394
firewire_core: created device fw1: GUID 0050c501e001c394, S400
firewire_sbp2: fw1.0: logged in to LUN  (0 retries)
firewire_core: phy config: card 0, new root=ffc1, gap_count=5
scsi 6:0:0:0: Direct-Access-RBC ST312002 6A   8.01 PQ: 0 ANSI: 4
firewire_core: created device fw2: GUID 00010800f605, S800
sd 6:0:0:0: [sdc] 234441648 512-byte hardware sectors (120034 MB)
sd 6:0:0:0: [sdc] Write Protect is off
sd 6:0:0:0: [sdc] Mode Sense: 00 00 00 00
sd 6:0:0:0: [sdc] Asking for cache data failed
firewire_core: created device fw3: GUID d10080a575eb, S400
sd 6:0:0:0: [sdc] Assuming drive cache: write through
sd 6:0:0:0: [sdc] READ CAPACITY failed
sd 6:0:0:0: [sdc] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
sd 6:0:0:0: [sdc] Sense not available.
sd 6:0:0:0: [sdc] Write Protect is off
sd 6:0:0:0: [sdc] Mode Sense: 00 00 00 00
sd 6:0:0:0: [sdc] Asking for cache data failed
sd 6:0:0:0: [sdc] Assuming drive cache: write through
sd 6:0:0:0: [sdc] Attached SCSI disk
scsi7 : SBP-2 IEEE-1394
firewire_sbp2: fw1.0: reconnected to LUN  (0 retries)
firewire_core: created device fw4: GUID 0050c501e00b23e9, S400
firewire_core: phy config: card 2, new root=ffc1, gap_count=5
firewire_sbp2: fw4.0: logged in to LUN  (0 retries)
scsi 7:0:0:0: Direct-Access-RBC ST312002 2A   8.01 PQ: 0 ANSI: 4
sd 7:0:0:0: [sdd] 234441648 512-byte hardware sectors (120034 MB)
sd 7:0:0:0: [sdd] Write Protect is off
sd 7:0:0:0: [sdd] Mode Sense: 11 00 00 00
sd 7:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 7:0:0:0: [sdd] 234441648 512-byte hardware sectors (120034 MB)
sd 7:0:0:0: [sdd] Write Protect is off
sd 7:0:0:0: [sdd] Mode Sense: 11 00 00 00
sd 7:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdd: sdd1
sd 7:0:0:0: [sdd] Attached SCSI disk

fw device decoder ring:
fw0 = fw400 card
fw1 = /dev/sdc (120G HD in ox911 case, hooked to fw0)
fw2 = fw800 card
fw3 = fw400 card
fw4 = /dev/sdd (120G HD in ox911 case, hooked to fw3)


# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [raid0] 
md4 : active raid1 sdd1[1]
  117218176 blocks [2/1] [_U]

# fdisk /dev/sdc

Unable to read /dev/sdc


Given the READ CAPACITY failed and DID_BUS_BUSY messages for sdc (and lack of 
notice about its partitions), it sort of looks like we never set the disk up 
correctly in the first place, and we're subsequently just reconnecting to 
that failed setup... So the reconnect code may be doing the right thing, and 
the real problem I'm looking at is us screwing up the setup of the device in 
the first place, for some reason...


-- 
Jarod Wilson
[EMAIL PROTECTED]

Re: [PATCH 1/3] exporting capability code/name pairs (try #3)

2008-02-05 Thread Serge E. Hallyn

Quoting Kohei KaiGai ([EMAIL PROTECTED]):
> Serge E. Hallyn wrote:
>> Quoting Kohei KaiGai ([EMAIL PROTECTED]):
>>>> > All that being said, the friendliness factor of this is somewhat
>>>> > undeniable, and so I can see why folk might want it in the kernel
>>>> > anyway. If so, would it possible to move this code into
>>>> > security/capability.c and not in the main kernel per-se - protected with
>>>> > a configuration option? If it does appear in the kernel, we'll obviously
>>>> > add your libcap changes too. If it doesn't, then perhaps we can meet
>>>> > your needs with a slight modification to your libcap patch to read the
>>>> > capabilities from an optional /etc/XXX file - and make text visibility
>>>> > of 'late breaking' capabilities something that the admin can tweak as
>>>> > needed?
>>>> I think optional configuration file is not a good idea.
>>>> It can make unneeded confusion.
>>>>
>>>> If necessary, I'll move this features into security/capability.c and
>>>> add a Kconfig option to select it.
>>> The following patch enables to export the list of capabilities supported
>>> on the running kernel, under /sys/kernel/capability .
>>>
>>> Changelog from the previous version:
>>> - Implementation is moved into security/capability.c from kernel/capability.c
>>> - A Kconfig option SECURITY_CAPABILITIES_EXPORT is added to turn on/off this feature.
>> can you explain one more time exactly what this lets you do that you
>> absolutely can't do with the current api?
>
> Please consider the following situation:
>
> A user intends to run an application which uses a new capability supported
> by a new kernel, without a synced libcap. In this case, the application cannot
> work well, because libcap prevents it from using the new capability.

(Though we don't want to encourage application writers to not use
libcap...)

> When the kernel and libcap are not synced, the header files provided by the
> libcap package are not reliable. Kernel developers typically face such a
> situation from time to time. :)

Yeah, it would definitely be nice for me.

> This feature can fill the gap by providing a new interface to correctly
> collect the capabilities supported by the running kernel.
>
>> I for one don't really object even if it is "duplicated" since it is far
>> easier to use, and I frequently have systems where kernel and userspace
>> are out of sync so /usr/include/sys/capabilities is worthless...  Though
>> I'm a little worried that b/scripts/mkcapnames.sh is the kind of thing
>> that'll eventually break, but I suppose that's my fault for objecting
>> to duplicated lists of capability definitions :)
>
> Are you worried about "mkcapnames.sh" getting broken in a future version?
>
> If so, we can add code to check whether this script works correctly or not,
>
> like:
>   -- at security/capability.c
>   #include 
>:
>   #if CAP_LAST_CAP != ARRAY_SIZE(capability_attrs)
>   #error "mkcapnames.sh added fewer or more entries than expected!"
>   #endif

Yeah, the regexp misfiring was my biggest concern so this should help.
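
A small aside on the check quoted above, as a sketch under stated assumptions. ARRAY_SIZE() is built on sizeof, which the preprocessor cannot evaluate, so an `#if` comparison of this kind would need to become a compile-time assertion (BUILD_BUG_ON() in the kernel, _Static_assert in C11). Also, if the "last" constant is the highest valid index, the table should hold one more entry than that value. All DEMO_* names and the table below are invented for this example and are not the real capability list.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the capability numbers. */
enum {
	DEMO_CAP_A,
	DEMO_CAP_B,
	DEMO_CAP_C,
	DEMO_CAP_LAST = DEMO_CAP_C,	/* highest valid index */
};

/* Hypothetical stand-in for the table a generator script would emit. */
static const char *const demo_cap_names[] = {
	[DEMO_CAP_A] = "cap_a",
	[DEMO_CAP_B] = "cap_b",
	[DEMO_CAP_C] = "cap_c",
};

/* Fails the build if the generated table and the enum drift apart. */
_Static_assert(ARRAY_SIZE(demo_cap_names) == DEMO_CAP_LAST + 1,
	       "name table out of sync with capability list");

/* Look up a name; returns NULL for an out-of-range number. */
const char *demo_cap_name(unsigned int cap)
{
	return cap <= DEMO_CAP_LAST ? demo_cap_names[cap] : NULL;
}
```

With a check like this, a script that emits too few or too many entries stops the build instead of producing a silently wrong table.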

thanks,
-serge

Information about MMU

2008-02-05 Thread Pravin Nanaware

Hi,

Can somebody point me to where I could get details on the MMU (Memory Management
Unit)?

Regards,
Pravin




Re: [PATCH 2.6.24-rc8-mm1 09/15] (RFC) IPC: new kernel API to change an ID

2008-02-05 Thread Serge E. Hallyn

Quoting Oren Laadan ([EMAIL PROTECTED]):
>
>
> Serge E. Hallyn wrote:
>> Quoting Oren Laadan ([EMAIL PROTECTED]):
>>> I strongly second Kirill on this matter.
>>>
>>> IMHO, we should _avoid_ as much as possible exposing internal kernel
>>> state to applications, unless a _real_ need for it is _clearly_
>>> demonstrated. The reasons for this are quite obvious.
>> Hmm, sure, but this sentence is designed to make us want to agree.  Yes,
>> we want to avoid exporting kernel internals, but generally that means
>> things like the precise layout of the task_struct.  What Pierre is doing
>> is in fact the opposite, exporting resource information in a kernel
>> version invariant way.
>
> LOL ... a bit of misunderstanding - let me put some order here:
>
> my response was with respect to the new interface that Pierre
> suggested, that is - to add a new IPC call to change an identifier
> after it has been allocated (and assigned). This is necessary for the
> restart because applications expect to see the same resource id's as
> they had at the time of the checkpoint.
>
> What you are referring to is the more recent part of the thread, where
> the topic became how data should be saved - in other words, the format
> of the checkpoint data. This is entirely orthogonal to my argument.
>
> Now please re-read my email :)

Heh - by the end of my response I was pretty sure that was the case :)

> That said, I'd advocate for something in between a raw dump and a pure
> "parametric" representation of the data. Raw data tends to be, well,
> too raw, which makes the task of reading data from older version by
> newer kernels harder to maintain. On the other hand, it is impossible
> to abstract everything into kernel-independent format.

Well, that's probably getting a little pedantic, but true.

>> In fact, the very reason not to go the route you and Pavel are
>> advocating is that if we just dump task state to a file or filesystem
>> from the kernel in one shot, we'll be much more tempted to lay out data
>> in a way that exports and ends up depending on kernel internals.  So
>> we'll just want to read and write the task_struct verbatim.
>> So, there are two very different approaches we can start with.
>> Whichever one we follow, we want to avoid having kernel version
>> dependencies.  They both have their merits to be sure.
>
> You will never be able to avoid that completely, simply because new
> kernels will require saving more (or less) data per object, because
> of new (or dropped) features.

Sure.

> The best solution in this sense is to provide a filter (hopefully
> in user space, utility) that would convert a checkpoint image file
> from the old format to a newer format.

Naturally.

> And you keep a lot of compatibility code of the kernel, too.
>
>> But note that in either case we need to deal with a bunch of locking.
>> So getting back to Pierre's patchset, IIRC 1-8 are cleanups worth
>> doing no matter what.  9-11 sound like they are contentious until
>> we decide whether we want to go with a create_with_id() type approach
>> or a set_id().  12 is IMO a good locking cleanup regardless.  13 and
>> 15 are contentious until we decide whether we want userspace-controlled
>> checkpoint or a one-shot fs.  14 IMO is useful for both c/r approaches.
>> Is that pretty accurate?
>
> (context switch back to my original reply)
>
> I prefer not to add a new interface to IPC that will provide a new
> functionality that isn't needed, except for the checkpoint - because
> there is a better alternative to do the same task; this alternative
> is more suitable because (a) it can be applied incrementally, (b) it
> provides a consistent method to pre-select identifiers of all syscalls,
> (whereas the current suggestion proposes one way for IPC and will
> suggest other hacks for other resources).
>
> (context switch back to the current reply)
>
> I definitely welcome a cleanup of the (insanely multiplexed) IPC
> code. However I argue that the interface need not be extended.
>
>>> It isn't strictly necessary to export a new interface in order to
>>> support checkpoint/restart. **. Hence, I think that the speculation
>>> "we may need it in the future" is too abstract and isn't a good
>>> excuse to commit to a new, currently unneeded, interface.
>> OTOH it did succeed in starting some conversation :)
>>> Should the
>>> need arise in the future, it will be easy to design a new interface
>>> (also based on aggregated experience until then).
>> What aggregated experience?  We have to start somewhere...
>
> :)  well, assuming the selection of resource IDs is done as I suggested,
> we'll have the restart use it. If someone finds a good reason (other
> than checkpoint/restart) to pre-select/modify an identifier, it will
> be easy to _then_ add an interface. That (hypothetical) interface is
> likely to come out more clever after X months using checkpoint/restart.
>
>>> ** In fact, the suggested interface may prove problematic (as noted
>>> earlier in

Re: [PATCH 1/4] x86 mmiotrace: use lookup_address()

2008-02-05 Thread Christoph Hellwig

On Tue, Feb 05, 2008 at 10:28:07PM +0200, Pekka Paalanen wrote:
> Use lookup_address() from pageattr.c instead of doing the same
> manually. Also had to EXPORT_SYMBOL(lookup_address) to make this
> work for modules. This also fixes "undefined symbol 'init_mm'"
> compile error for x86_32.

This should be a _GPL export for sure.


Re: [PATCH mm] stop c_p_a corrupting the pds

2008-02-05 Thread Hugh Dickins

On Tue, 5 Feb 2008, [EMAIL PROTECTED] wrote:
> On Tue, 05 Feb 2008 22:27:21 GMT, Hugh Dickins said:
> > When change_page_attr splits a large page on x86_32 (without PAE), it is
> > currently corrupting every process's page directory: fix that by removing
> > the thinko which passes down a physical instead of a virtual address -
> > this version of the patch being the hotfix for 2.6.24-mm1.
> > 
> > Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> 
> 
> I *knew* there was a reason we should have had this patch series in -mm for a while.
> 
> 
> :)

Seriously, I do agree with you on that.  It seems like the excitement
of making great changes has overtaken proper caution here.

Though I guess it was just coincidence that made it more debuggable
in my -mm kernel (which gave "bad pgd" errors after starting X), when
the -git kernel just crashed somehow in starting X.  And I was lucky
to have CONFIG_VMSPLIT_2G_OPT on that machine, which placed the
corruption somewhere that soon got noticed.

Hugh

RE: [git patches] net driver updates #2

2008-02-05 Thread Pravin Nanaware

Hi,

Can somebody point me to where I could get details on the MMU (Memory Management
Unit)?

Regards,
Pravin



Re: [PATCH 1/3] KGDB: Major refactoring

2008-02-05 Thread Jason Wessel

Jan Kiszka wrote:
> As most changes are tightly coupled, this refactoring patch for
> KGDB_8250 as well as the core and the new KGDBOC driver comes as a
> single chunk. The changes are:
>  - Reorganized configuration: I/O drivers can be independently
>configured as module or built-in
>  - Dynamic reconfiguration for KGDB_8250 (just like for KGDBOC)
>  - Reworked KGDB_8250 configuration string format
>  - attachwait removed, arming the debugger via assigning an I/O driver
>implies "attachwait"
>  - Cleaned up I/O driver management of the core
>  - Matured the various boot-up, configure, unconfigure code paths for
>both I/O drivers
>  - IRQ vs. KGDB_CONSOLE-output SMP race fixed for KGDB_8250
>  - Reduced and cleaned up hooks into serial_core/8250
>  - Kconfig cleanups
>
> What we no longer have:
>  - Simple serial configuration for _early_ debugging, use the io/mem
>format instead or wait until the debugger is able to resolve "ttySx"
>during late-init
>
> To-do:
>  - KGDBOC does not yet cleanly interact with the TTY subsystem to
>attach to some console
>
> Signed-off-by: Jan Kiszka <[EMAIL PROTECTED]>
>   

Jan,

I pulled in all your changes and made some minor white space fixes. 

I started the 2.6.25 branch with all Ingo's changes, your changes and
several additional patches I received.

http://git.kernel.org/?p=linux/kernel/git/jwessel/linux-2.6-kgdb.git;a=summary

Thanks,
Jason.

more iommu sg merging fallout

2008-02-05 Thread David Miller


The sparc64 change:

commit fde6a3c82d67f592eb587be4d1b0ae6d4321
Author: FUJITA Tomonori <[EMAIL PROTECTED]>
Date:   Mon Feb 4 22:28:02 2008 -0800

iommu sg merging: sparc64: make iommu respect the segment size limits

This patch makes iommu respect segment size limits when merging sg
lists.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
Cc: Jeff Garzik <[EMAIL PROTECTED]>
Cc: James Bottomley <[EMAIL PROTECTED]>
Acked-by: Jens Axboe <[EMAIL PROTECTED]>
Cc: "David S. Miller" <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

has significant errors and is going to eat people's disks, as it just
nearly did to mine.

Typically what you'll see are NULL pointer dereferences in
dma_4v_map_sg() and dma_4u_map_sg() and then the kernel usually craps
on your superblock very shortly thereafter.

The changeset above modified only prepare_sg() but that is only the
first pass of the SG mapping algorithm of the sparc64 IOMMU layer.

The second pass that fills in the entries depends upon how the first
pass does things.  So if you change the first pass decision making you
have to update the second pass's as well.

That second pass is implemented in fill_sg() (there is a version in
both arch/sparc64/kernel/iommu.c and arch/sparc64/kernel/pci_sun4v.c),
which probably needs new logic as was added to prepare_sg() to handle
dma_get_max_seg_size().
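
The dependency between the two passes can be shown with a small userspace model. This is only a sketch under stated assumptions, not the sparc64 code: can_merge(), count_segments() and fill_segments() are hypothetical names, the model merges on length alone, and it ignores the address contiguity that the real IOMMU code also has to check. The point is that both passes call the same predicate, so they cannot disagree about where a segment ends.

```c
#include <stddef.h>

/* Hypothetical predicate shared by both passes: may the next entry be
 * folded into the current segment without exceeding max_seg bytes? */
static int can_merge(size_t cur_len, size_t next_len, size_t max_seg)
{
	return cur_len + next_len <= max_seg;
}

/* Pass 1 (the role prepare_sg() plays): count the merged segments. */
static size_t count_segments(const size_t *len, size_t n, size_t max_seg)
{
	size_t segs = 0, cur = 0, i;

	for (i = 0; i < n; i++) {
		if (segs && can_merge(cur, len[i], max_seg)) {
			cur += len[i];
		} else {
			segs++;
			cur = len[i];
		}
	}
	return segs;
}

/* Pass 2 (the role fill_sg() plays): emit the segment lengths.  It has
 * to make the same decisions as pass 1, so it uses the same predicate.
 * out[] must have room for n entries. */
static size_t fill_segments(const size_t *len, size_t n, size_t max_seg,
			    size_t *out)
{
	size_t segs = 0, i;

	for (i = 0; i < n; i++) {
		if (segs && can_merge(out[segs - 1], len[i], max_seg))
			out[segs - 1] += len[i];
		else
			out[segs++] = len[i];
	}
	return segs;
}
```

If only the first pass learned about the size limit, the second pass would walk off the end of what the first pass allocated, which is consistent with the NULL dereferences described above.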

Re: [PATCH] [NFS]: Lock daemon start/stop rework.

2008-02-05 Thread Christoph Hellwig

On Thu, Jan 31, 2008 at 10:48:32AM +0300, Denis V. Lunev wrote:
> Christoph Hellwig wrote:
> > On Wed, Jan 30, 2008 at 02:41:34PM +0300, Denis V. Lunev wrote:
> >> The pid of the locking daemon can be substituted with a task struct
> >> without a problem. Namely, the value is filled in the context of the lockd
> >> thread and used in lockd_up/lockd_down.
> >>
> >> It is possible to save task struct instead and use it to kill the process.
> >> The safety of this operation is guaranteed by the RCU, i.e. task can't
> >> disappear without passing a quiescent state.
> > 
> > We have a patch series pending on the nfs list that does this plus a lot
> > more in the area.
> > 
> > 
> where can I look at them? :)

The latest version was just posted on the linux-nfs list:
http://marc.info/?l=linux-nfs&m=120224048613393&w=2

Re: [PATCH] badness() dramatically overcounts memory

2008-02-05 Thread Balbir Singh

KOSAKI Motohiro wrote:
> Hi
> 
>>>> The interesting thing is the use of total_vm and not the RSS which is used
>>>> as the basis by the OOM killer. I need to read/understand the code a bit more.
>>> RSS makes more sense to me as well.
>> Andrea Arcangeli has patches pending which change this to the RSS.  
>> Specifically:
>>
>>  http://marc.info/?l=linux-mm&m=119977937126925
> 
> I agreed with you that RSS is better :)
> 
> 
> 
> but..
> on NUMA systems with many nodes, per-zone RSS would be even better..

Do we have a per zone RSS per task? I don't remember seeing it.


-- 
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL

Re: [NET/IPv6] Race condition with flow_cache_genid?

2008-02-05 Thread David Miller


You'll get a better set of eyes on this if you post it
to [EMAIL PROTECTED] which is where the networking
developers hang out.

linux-net is for user questions.

Re: Access MSR functions in userspace?

2008-02-05 Thread Maxim Levitsky

On Wednesday, 6 February 2008 05:29:47 Hasan Rashid wrote:
> Is there a way to use RDMSR and WRMSR in userspace? I only need to write
> to one register; writing a module with a userspace piece to handle that
> seems like overkill.
> 
> TIA!


Fortunately, Linux has /dev/cpu/*/msr.
Take a look at $LINUX_SOURCES/arch/x86/kernel/msr.c to learn how to use it.
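
A minimal userspace sketch of that interface follows, assuming the msr driver is loaded and the caller has permission to open the device (normally root). The file offset passed to pread()/pwrite() selects the MSR number and each transfer is 8 bytes. The helper names msr_path(), read_msr() and write_msr() are made up for this example.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Build the per-CPU device path, e.g. "/dev/cpu/0/msr". */
static int msr_path(char *buf, size_t len, int cpu)
{
	int n = snprintf(buf, len, "/dev/cpu/%d/msr", cpu);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read one 64-bit MSR on the given CPU.
 * Returns 0 on success, -1 on failure (errno comes from open/pread). */
static int read_msr(int cpu, uint32_t reg, uint64_t *val)
{
	char path[64];
	ssize_t got;
	int fd;

	if (msr_path(path, sizeof(path), cpu) < 0)
		return -1;
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	got = pread(fd, val, sizeof(*val), reg);
	close(fd);
	return got == (ssize_t)sizeof(*val) ? 0 : -1;
}

/* Write one 64-bit MSR on the given CPU, same addressing scheme. */
static int write_msr(int cpu, uint32_t reg, uint64_t val)
{
	char path[64];
	ssize_t put;
	int fd;

	if (msr_path(path, sizeof(path), cpu) < 0)
		return -1;
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	put = pwrite(fd, &val, sizeof(val), reg);
	close(fd);
	return put == (ssize_t)sizeof(val) ? 0 : -1;
}
```

Note that an MSR is per-CPU state, so a value that has to be the same everywhere needs one write_msr() call per CPU.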

Best regards,
Maxim Levitsky

Re: [PATCH] ipvs: Make wrr "no available servers" error message rate-limited

2008-02-05 Thread David Miller

From: Simon Horman <[EMAIL PROTECTED]>
Date: Wed, 6 Feb 2008 11:19:09 +0900

> On Tue, Feb 05, 2008 at 09:30:21PM +0100, Sven Wegener wrote:
> > No available servers is more an error message than something informational.
> > It should also be rate-limited, else we're going to flood our logs on a busy
> > director, if all real servers are out of order with a weight of zero.
> > 
> > Signed-off-by: Sven Wegener <[EMAIL PROTECTED]>
> 
> Hi Sven,
> 
> this looks good to me.
> 
> Acked-by: Simon Horman <[EMAIL PROTECTED]>

Applied, thanks.

Access MSR functions in userspace?

2008-02-05 Thread Hasan Rashid

Is there a way to use RDMSR and WRMSR in userspace? I only need to write
to one register; writing a module with a userspace piece to handle that
seems like overkill.

TIA!
-- 
Regards, Hasan R.


[PATCH 3/3] IB: expand ib_umem_get() prototype

2008-02-05 Thread akepner


Add a new parameter, dmasync, to the ib_umem_get() prototype. Use 
dmasync = 1 when mapping user-allocated CQs with ib_umem_get().

Signed-off-by: Arthur Kepner <[EMAIL PROTECTED]>

---

 drivers/infiniband/core/umem.c   |   17 +
 drivers/infiniband/hw/amso1100/c2_provider.c |2 -
 drivers/infiniband/hw/cxgb3/iwch_provider.c  |2 -
 drivers/infiniband/hw/ehca/ehca_mrmw.c   |2 -
 drivers/infiniband/hw/ipath/ipath_mr.c   |3 +-
 drivers/infiniband/hw/mlx4/cq.c  |2 -
 drivers/infiniband/hw/mlx4/doorbell.c|2 -
 drivers/infiniband/hw/mlx4/mr.c  |3 +-
 drivers/infiniband/hw/mlx4/qp.c  |2 -
 drivers/infiniband/hw/mlx4/srq.c |2 -
 drivers/infiniband/hw/mthca/mthca_provider.c |8 +-
 drivers/infiniband/hw/mthca/mthca_user.h |   10 +++-
 include/rdma/ib_umem.h   |4 +--
 include/rdma/ib_verbs.h  |   33 +++
 14 files changed, 74 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 4e3128f..f0e0d10 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "uverbs.h"
 
@@ -72,9 +73,10 @@ static void __ib_umem_release(struct ib_device *dev, struct 
ib_umem *umem, int d
  * @addr: userspace virtual address to start at
  * @size: length of region to pin
  * @access: IB_ACCESS_xxx flags for memory being pinned
+ * @dmasync: flush in-flight DMA when the memory region is written 
  */
 struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
-   size_t size, int access)
+   size_t size, int access, int dmasync)
 {
struct ib_umem *umem;
struct page **page_list;
@@ -87,6 +89,10 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
int ret;
int off;
int i;
+   DECLARE_DMA_ATTRS(attrs);
+
+   if (dmasync)
+   dma_set_attr(&attrs, DMA_ATTR_SYNC_ON_WRITE);
 
if (!can_do_mlock())
return ERR_PTR(-EPERM);
@@ -174,10 +180,11 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, 
unsigned long addr,
sg_set_page(&chunk->page_list[i], page_list[i + off], PAGE_SIZE, 0);
}
 
-   chunk->nmap = ib_dma_map_sg(context->device,
-   &chunk->page_list[0],
-   chunk->nents,
-   DMA_BIDIRECTIONAL);
+   chunk->nmap = ib_dma_map_sg_attrs(context->device,
+ &chunk->page_list[0],
+ chunk->nents,
+ DMA_BIDIRECTIONAL, 
+ &attrs);
if (chunk->nmap <= 0) {
for (i = 0; i < chunk->nents; ++i)
put_page(sg_page(&chunk->page_list[i]));
diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c 
b/drivers/infiniband/hw/amso1100/c2_provider.c
index 7a6cece..f571dff 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -449,7 +449,7 @@ static struct ib_mr *c2_reg_user_mr(struct ib_pd *pd, u64 
start, u64 length,
return ERR_PTR(-ENOMEM);
c2mr->pd = c2pd;
 
-   c2mr->umem = ib_umem_get(pd->uobject->context, start, length, acc);
+   c2mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0);
if (IS_ERR(c2mr->umem)) {
err = PTR_ERR(c2mr->umem);
kfree(c2mr);
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index b5436ca..66d9d65 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -601,7 +601,7 @@ static struct ib_mr *iwch_reg_user_mr(struct ib_pd *pd, u64 
start, u64 length,
if (!mhp)
return ERR_PTR(-ENOMEM);
 
-   mhp->umem = ib_umem_get(pd->uobject->context, start, length, acc);
+   mhp->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0);
if (IS_ERR(mhp->umem)) {
err = PTR_ERR(mhp->umem);
kfree(mhp);
diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c 
b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index e239bbf..62a382c 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -325,7 +325,7 @@ struct ib_mr *ehca_reg_user_mr(struct ib_pd *pd, u64 start, 
u64 length,
}
 
e_mr->umem = ib_umem_get(pd->uobject->context, start, length,
-

Re: brk randomization breaks columns

2008-02-05 Thread Randy Dunlap

On Tue, 5 Feb 2008 23:35:27 +0100 (CET) Jiri Kosina wrote:

> On Tue, 5 Feb 2008, Arjan van de Ven wrote:
> 
> > the combo of a config option + sysctl sounds the right way forward then 
> > ;(
> 
> OK, so I propose the one below (untested yet, but should be trivial). Does 
> anyone have any objections?
> 
> 
> 
> From: Jiri Kosina <[EMAIL PROTECTED]>
> 
> ASLR: add possibility for more fine-grained tweaking
> 
> Some prehistoric binaries don't like when start of brk area is located 
> anywhere else than just after code+bss.
> 
> This patch adds possibility to configure the default behavior of address 
> space randomization. In addition to that, randomize_va_space now can have 
> value of '2', which means full randomization including brk space.
> 
> Also, documentation of randomize_va_space is added.

Thanks.

> Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>
> 
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index 8984a53..0373bbe 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -41,6 +41,7 @@ show up in /proc/sys/kernel:
>  - pid_max
>  - powersave-nap   [ PPC only ]
>  - printk
> +- randomize_va_space
>  - real-root-dev   ==> Documentation/initrd.txt
>  - reboot-cmd  [ SPARC only ]
>  - rtsig-max
> @@ -280,6 +281,37 @@ send before ratelimiting kicks in.
>  
>  ==
>  
> +randomize-va-space:
> +
> +This option can be used to select the type of process address
> +space randomization is used in the system, for architectures

s/is used/that is used/

> +that support this feature.
> +
> +One of the following numeric values is possible:
> +
> +0 - [none]
> + Turn the process address space randomization off by default.
> +
> +1 - [conservative]
> + Conservative address space randomization makes the addresses of
> + mmap base and VDSO page randomized. This, among other things,
> + implies that shared libraries will be loaded to random addresses.
> + Also for PIE binaries, the location of code start is randomized.
> +
> +2 - [full]
> +
> + This includes all the features that Conservative randomization
> + provides. In addition to that, also start of the brk area is
> + randomized.
> + There a few legacy applications out there (such as some ancient
> + versions of libc.so.5 from 1996), that assume that brk area starts

Drop comma  ^ that the brk area

> + just after the end of the code+bss. These applications break when
> + start of the brk area is randomized. There are however no known
> + non-legacy applications that would be broken this way, so for most
> + systems it is safe to chose Full randomization.

  choose

> +
> +==
> +
>  reboot-cmd: (Sparc only)
>  
>  ??? This seems to be a way to give an argument to the Sparc

> diff --git a/init/Kconfig b/init/Kconfig
> index 87f50df..804a3a6 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -662,6 +662,46 @@ config SLOB
>  
>  endchoice
>  
> +choice
> + prompt "Address space randomization type"
> + default RANDOMIZATION_CONSERVATIVE
> + help
> +This option allows to select the type of process address space
> +randomization that will be used by default (for those architectures
> +that support address space randomization). This option can be
> +overriden in runtime through kernel.randomize_va_space sysctl.
> +
> +config RANDOMIZATION_NONE
> + bool "NONE"
> + help
> +Turn the process address space randomization off by default.
> +Equivalent to sysctl kernel.randomize_va_space = 0.
> +
> +config RANDOMIZATION_CONSERVATIVE
> + bool "CONSERVATIVE"
> + help
> +Conservative address space randomization makes the addresses of
> +mmap base and VDSO page randomized. This, among other things,
> +implies that shared libraries will be loaded to random addresses.
> +Also for PIE binaries, the location of code start is randomized.
> +Equivalent to sysctl kernel.randomize_va_space = 1.
> +
> +config RANDOMIZATION_FULL
> + bool "FULL"
> + help
> +This includes all the features that Conservative randomization
> +provides. In addition to that, also start of the brk area is
> +randomized.
> +There a few legacy applications out there (such as some ancient
> +versions of libc.so.5 from 1996), that assume that brk area starts

Drop comma.  s/that brk/that the brk/

> +just after the end of the code+bss. These applications break when
> +start of the brk area is randomized. There are however no known
> +non-legacy applications that would be broken this way, so for most
> +systems it is safe to chose Full randomization.

s/chose/choose/

>

[PATCH 2/3] dma/ia64: update ia64 machvecs

2008-02-05 Thread akepner


Change all ia64 machvecs to use the new dma_{un}map_*_attrs()
interfaces. Implement the old dma_{un}map_*() interfaces in
terms of the corresponding new interfaces. For ia64/sn, make
use of one dma attribute, DMA_ATTR_SYNC_ON_WRITE.

Signed-off-by: Arthur Kepner <[EMAIL PROTECTED]>

---

 arch/ia64/hp/common/hwsw_iommu.c |   60 
 arch/ia64/hp/common/sba_iommu.c  |   62 ++--
 arch/ia64/sn/pci/pci_dma.c   |   77 ---
 include/asm-ia64/dma-mapping.h   |   28 +--
 include/asm-ia64/machvec.h   |   52 
 include/asm-ia64/machvec_hpzx1.h |   16 +++---
 include/asm-ia64/machvec_hpzx1_swiotlb.h |   16 +++---
 include/asm-ia64/machvec_sn2.h   |   16 +++---
 include/linux/dma-attrs.h|   49 +++
 lib/swiotlb.c|   50 
 10 files changed, 290 insertions(+), 136 deletions(-)

diff --git a/arch/ia64/hp/common/hwsw_iommu.c b/arch/ia64/hp/common/hwsw_iommu.c
index 94e5710..8cedd6c 100644
--- a/arch/ia64/hp/common/hwsw_iommu.c
+++ b/arch/ia64/hp/common/hwsw_iommu.c
@@ -20,10 +20,10 @@
 extern int swiotlb_late_init_with_default_size (size_t size);
 extern ia64_mv_dma_alloc_coherent  swiotlb_alloc_coherent;
 extern ia64_mv_dma_free_coherent   swiotlb_free_coherent;
-extern ia64_mv_dma_map_single  swiotlb_map_single;
-extern ia64_mv_dma_unmap_singleswiotlb_unmap_single;
-extern ia64_mv_dma_map_sg  swiotlb_map_sg;
-extern ia64_mv_dma_unmap_sgswiotlb_unmap_sg;
+extern ia64_mv_dma_map_single_attrsswiotlb_map_single_attrs;
+extern ia64_mv_dma_unmap_single_attrs  swiotlb_unmap_single_attrs;
+extern ia64_mv_dma_map_sg_attrsswiotlb_map_sg_attrs;
+extern ia64_mv_dma_unmap_sg_attrs  swiotlb_unmap_sg_attrs;
 extern ia64_mv_dma_supported   swiotlb_dma_supported;
 extern ia64_mv_dma_mapping_error   swiotlb_dma_mapping_error;
 
@@ -31,19 +31,19 @@ extern ia64_mv_dma_mapping_error
swiotlb_dma_mapping_error;
 
 extern ia64_mv_dma_alloc_coherent  sba_alloc_coherent;
 extern ia64_mv_dma_free_coherent   sba_free_coherent;
-extern ia64_mv_dma_map_single  sba_map_single;
-extern ia64_mv_dma_unmap_singlesba_unmap_single;
-extern ia64_mv_dma_map_sg  sba_map_sg;
-extern ia64_mv_dma_unmap_sgsba_unmap_sg;
+extern ia64_mv_dma_map_single_attrssba_map_single_attrs;
+extern ia64_mv_dma_unmap_single_attrs  sba_unmap_single_attrs;
+extern ia64_mv_dma_map_sg_attrssba_map_sg_attrs;
+extern ia64_mv_dma_unmap_sg_attrs  sba_unmap_sg_attrs;
 extern ia64_mv_dma_supported   sba_dma_supported;
 extern ia64_mv_dma_mapping_error   sba_dma_mapping_error;
 
 #define hwiommu_alloc_coherent sba_alloc_coherent
 #define hwiommu_free_coherent  sba_free_coherent
-#define hwiommu_map_single sba_map_single
-#define hwiommu_unmap_single   sba_unmap_single
-#define hwiommu_map_sg sba_map_sg
-#define hwiommu_unmap_sg   sba_unmap_sg
+#define hwiommu_map_single_attrs   sba_map_single_attrs
+#define hwiommu_unmap_single_attrs sba_unmap_single_attrs
+#define hwiommu_map_sg_attrs   sba_map_sg_attrs
+#define hwiommu_unmap_sg_attrs sba_unmap_sg_attrs
 #define hwiommu_dma_supported  sba_dma_supported
 #define hwiommu_dma_mapping_error  sba_dma_mapping_error
 #define hwiommu_sync_single_for_cpumachvec_dma_sync_single
@@ -98,40 +98,44 @@ hwsw_free_coherent (struct device *dev, size_t size, void 
*vaddr, dma_addr_t dma
 }
 
 dma_addr_t
-hwsw_map_single (struct device *dev, void *addr, size_t size, int dir)
+hwsw_map_single_attrs (struct device *dev, void *addr, size_t size, int dir, 
+  struct dma_attrs *attrs)
 {
if (use_swiotlb(dev))
-   return swiotlb_map_single(dev, addr, size, dir);
+   return swiotlb_map_single_attrs(dev, addr, size, dir, attrs);
else
-   return hwiommu_map_single(dev, addr, size, dir);
+   return hwiommu_map_single_attrs(dev, addr, size, dir, attrs);
 }
 
 void
-hwsw_unmap_single (struct device *dev, dma_addr_t iova, size_t size, int dir)
+hwsw_unmap_single_attrs (struct device *dev, dma_addr_t iova, size_t size, 
+int dir, struct dma_attrs *attrs)
 {
if (use_swiotlb(dev))
-   return swiotlb_unmap_single(dev, iova, size, dir);
+   return swiotlb_unmap_single_attrs(dev, iova, size, dir, attrs);
else
-   return hwiommu_unmap_single(dev, iova, size, dir);
+   return hwiommu_unmap_single_attrs(dev, iova, size, dir, attrs);
 }
 
 
 int
-hwsw_map_sg (struct device *dev, struct scatterlist *sglist, int nents, int 
dir)
+hwsw_map_sg_attrs (struct device *dev, struct scatterlist

Re: Pull request: DMA pool updates

2008-02-05 Thread Andrew Morton

On Tue, 5 Feb 2008 19:52:47 -0700 Matthew Wilcox <[EMAIL PROTECTED]> wrote:

> Could I ask you to pull the DMA Pool changes detailed below?
> 
> All the patches have been posted to linux-kernel before, and various
> comments (and acks) have been taken into account.  (see
> http://thread.gmane.org/gmane.linux.kernel/609943)
> 
> It's a fairly nice performance improvement, so would be good to get in.
> It's survived a few hours of *mumble* high-stress database benchmark,
> so I have high confidence in its stability.
> 
> The following changes since commit 21511abd0a248a3f225d3b611cfabb93124605a7:
>   Linus Torvalds (1):
> Merge branch 'release' of git://git.kernel.org/.../aegl/linux-2.6
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc.git dmapool

Looks OK to me - I think I reviewed these a while back?  We really should
have had this tree in -mm for general tyre-kicking.

What's with the #ifdef CONFIG_SLAB stuff in the dmapool code?  Are we just
overloading a convenient Kconfig label here, or is it required for some
reason?  Why not CONFIG_SLUB_DEBUG too?

For the future:

This code:

+struct dma_pool *dma_pool_create(const char *name, struct device *dev,
+size_t size, size_t align, size_t boundary)
+{
+   struct dma_pool *retval;
+   size_t allocation;
+
+   if (align == 0) {
+   align = 1;
+   } else if (align & (align - 1)) {
+   return NULL;
+   }
+
+   if (size == 0) {
+   return NULL;
+   } else if (size < 4) {
+   size = 4;
+   }
+
+   if ((size % align) != 0)
+   size = ALIGN(size, align);
+
+   allocation = max_t(size_t, size, PAGE_SIZE);
+
+   if (!boundary) {
+   boundary = allocation;
+   } else if ((boundary < size) || (boundary & (boundary - 1))) {
+   return NULL;
+   }

could do with some relief from its brace fetish and some education about
is_power_of_2().  And the `if ((size % align) != 0)' can just go away,
which will save code and is likely faster.
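
To make the suggestion concrete, here is a userspace sketch of the same argument checks with the braces trimmed, the power-of-two tests spelled as is_power_of_2(), and the `size % align' test dropped. It is an illustration only: pool_check_args() and MODEL_PAGE_SIZE are hypothetical, is_power_of_2() and ALIGN() are local stand-ins for the kernel helpers of the same names, and the function returns -1 where the quoted code returns NULL.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel helper of the same name. */
static int is_power_of_2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN(x, a)	(((x) + (a) - 1) & ~((size_t)(a) - 1))

#define MODEL_PAGE_SIZE	4096	/* assumption: 4 KiB pages */

/* Validate and adjust the pool parameters in place.
 * Returns 0 on success, -1 for arguments the quoted code rejects. */
static int pool_check_args(size_t *size, size_t *align, size_t *boundary)
{
	size_t allocation;

	if (*align == 0)
		*align = 1;
	else if (!is_power_of_2(*align))
		return -1;

	if (*size == 0)
		return -1;
	if (*size < 4)
		*size = 4;

	/* No modulo test needed: ALIGN() leaves an aligned size unchanged. */
	*size = ALIGN(*size, *align);

	allocation = *size > MODEL_PAGE_SIZE ? *size : MODEL_PAGE_SIZE;

	if (*boundary == 0)
		*boundary = allocation;
	else if (*boundary < *size || !is_power_of_2(*boundary))
		return -1;

	return 0;
}
```

Dropping the modulo test is safe because rounding an already aligned value up to the same alignment is a no-op, so the branch only guarded work that costs a couple of instructions.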

It's a separate thing I guess, but I don't see any reason why
dma_pool_alloc() _has_ to use GFP_ATOMIC.  Looks like it can do the
allocation outside spin_lock_irqsave() and use mem_flags instead.  If that
has __GFP_WAIT then we're in much better shape.

I wish all this code wouldn't do

struct dma_page *page;

because one very much expects a local variable called "page" to be of type
`struct page *'.  


[PATCH 1/3 v2] dma/doc: document dma_{un}map_{single|sg}_attrs() interface

2008-02-05 Thread akepner


Document the new dma_{un}map_{single|sg}_attrs() functions.

Signed-off-by: Arthur Kepner <[EMAIL PROTECTED]>

---

 DMA-API.txt|   65 +
 DMA-attributes.txt |   29 +++
 2 files changed, 94 insertions(+)

diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index b939ebb..4c471fc 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -395,6 +395,71 @@ Notes:  You must do this:
 
 See also dma_map_single().
 
+dma_addr_t 
+dma_map_single_attrs(struct device *dev, void *cpu_addr, size_t size, 
+enum dma_data_direction dir, 
+struct dma_attrs* attrs)
+
+void 
+dma_unmap_single_attrs(struct device *dev, dma_addr_t dma_addr,
+  size_t size, enum dma_data_direction dir,
+  struct dma_attrs* attrs)
+
+int 
+dma_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
+int nents, enum dma_data_direction dir, 
+struct dma_attrs *attrs)
+
+void 
+dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sgl, 
+  int nents, enum dma_data_direction dir,
+  struct dma_attrs *attrs)
+
+The four functions above are just like the counterpart functions 
+without the _attrs suffixes, except that they pass an optional 
+struct dma_attrs*. 
+
+struct dma_attrs encapsulates a set of "dma attributes". For the 
+definition of struct dma_attrs see linux/dma-attrs.h. 
+
+The interpretation of dma attributes is architecture-specific, and 
+each attribute should be documented in Documentation/DMA-attributes.txt. 
+
+If struct dma_attrs* is NULL, the semantics of each of these 
+functions is identical to those of the corresponding function 
+without the _attrs suffix. As a result dma_map_single_attrs() 
+can generally replace dma_map_single(), etc.
+
+As an example of the use of the *_attrs functions, here's how 
+you could pass an attribute DMA_ATTR_FOO when mapping memory 
+for DMA:
+
+#include <linux/dma-attrs.h>
+/* DMA_ATTR_FOO should be defined in linux/dma-attrs.h and 
+ * documented in Documentation/DMA-attributes.txt */
+...
+
+   DECLARE_DMA_ATTRS(attrs);
+   dma_set_attr(&attrs, DMA_ATTR_FOO);
+   
+   n = dma_map_sg_attrs(dev, sg, nents, DMA_TO_DEVICE, &attrs);
+   
+
+Architectures that care about DMA_ATTR_FOO would check for its 
+presence in their implementations of the mapping and unmapping 
+routines, e.g.:
+
+void whizco_dma_map_sg_attrs(struct device *dev, dma_addr_t dma_addr, 
+size_t size, enum dma_data_direction dir, 
+struct dma_attrs* attrs)
+{
+   
+   int foo =  dma_get_attr(attrs, DMA_ATTR_FOO);
+   
+   if (foo) 
+   /* twizzle the frobnozzle */
+   
+
 
 Part II - Advanced dma_ usage
 -
diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
index e69de29..36baea5 100644
--- a/Documentation/DMA-attributes.txt
+++ b/Documentation/DMA-attributes.txt
@@ -0,0 +1,29 @@
+   DMA attributes
+   ==
+
+This document describes the semantics of the DMA attributes that are 
+defined in linux/dma-attrs.h. 
+
+
+DMA_ATTR_SYNC_ON_WRITE
+--
+
+DMA_ATTR_SYNC_ON_WRITE is used on the IA64_SGI_SN2 architecture.
+It provides a mechanism for devices to explicitly order their DMA 
+writes.
+
+On IA64_SGI_SN2 machines, DMA may be reordered within the NUMA 
+interconnect. Allowing reordering improves performance, but in some 
+situations it may be necessary to ensure that one DMA write is 
+complete before another is visible. For example, if the device does 
+a DMA write to indicate that data is available in memory, DMA of the 
+"completion indication" can race with DMA of data.
+
+When a memory region is mapped with the DMA_ATTR_SYNC_ON_WRITE attribute, 
+a write to that region causes all in-flight DMA to be flushed to memory. 
+Any pending DMA will complete and be visible in memory before the write 
+to the region with the DMA_ATTR_SYNC_ON_WRITE attribute becomes visible. 
+
+(For more information, see the document titled "SGI Altix Architecture 
+Considerations for Linux Device Drivers" at http://techpubs.sgi.com/.)
+

[PATCH 0/3 v2] dma: dma_{un}map_{single|sg}_attrs() interface

2008-02-05 Thread akepner


Introduce a new interface for passing architecture-specific
attributes when memory is mapped and unmapped for DMA. Give
the interface a default implementation which ignores
attributes.
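
The shape of that default can be modelled in plain C. Everything below is a hypothetical userspace stand-in (the model_ names, the single-word bitmap, the fake bus offset); the real struct dma_attrs and its accessors are defined elsewhere in this series. The sketch assumes a NULL attrs pointer means "no attributes" and shows a default _attrs wrapper that accepts attributes and then ignores them.

```c
#include <stddef.h>

/* Hypothetical attribute numbering and storage, for illustration only. */
enum model_dma_attr { MODEL_ATTR_SYNC_ON_WRITE = 0, MODEL_ATTR_MAX };

struct model_dma_attrs {
	unsigned long flags;	/* one bit per attribute */
};

static void model_set_attr(struct model_dma_attrs *attrs,
			   enum model_dma_attr a)
{
	attrs->flags |= 1UL << a;
}

/* A NULL attrs pointer is treated as "no attributes" (assumption). */
static int model_get_attr(const struct model_dma_attrs *attrs,
			  enum model_dma_attr a)
{
	return attrs != NULL && (attrs->flags & (1UL << a)) != 0;
}

/* Stand-in for an architecture's plain mapping routine. */
static unsigned long model_map_single(unsigned long cpu_addr)
{
	return cpu_addr + 0x1000;	/* arbitrary fake bus offset */
}

/* Default _attrs variant: take the attributes, then ignore them. */
static unsigned long model_map_single_attrs(unsigned long cpu_addr,
					    const struct model_dma_attrs *attrs)
{
	(void)attrs;
	return model_map_single(cpu_addr);
}
```

An architecture that cares about an attribute would replace the wrapper with one that calls something like model_get_attr() before choosing how to map.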

Signed-off-by: Arthur Kepner <[EMAIL PROTECTED]>

---

 dma-mapping.h |   35 +++
 1 files changed, 35 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 101a2d4..4a49abe 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -116,4 +116,39 @@ static inline void dmam_release_declared_memory(struct 
device *dev)
 }
 #endif /* ARCH_HAS_DMA_DECLARE_COHERENT_MEMORY */
 
+#ifndef ARCH_USES_DMA_ATTRS
+struct dma_attrs;
+
+static inline dma_addr_t dma_map_single_attrs(struct device *dev, 
+ void *cpu_addr, size_t size, 
+ enum dma_data_direction dir, 
+ struct dma_attrs* attrs)
+{
+   return dma_map_single(dev, cpu_addr, size, dir);
+}
+
+static inline void dma_unmap_single_attrs(struct device *dev, 
+ dma_addr_t dma_addr, size_t size, 
+ enum dma_data_direction dir, 
+ struct dma_attrs* attrs)
+{
+   return dma_unmap_single(dev, dma_addr, size, dir);
+}
+
+static inline int dma_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
+  int nents, enum dma_data_direction dir, 
+  struct dma_attrs *attrs)
+{
+   return dma_map_sg(dev, sgl, nents, dir);
+}
+
+static inline void dma_unmap_sg_attrs(struct device *dev, 
+ struct scatterlist *sgl, int nents, 
+ enum dma_data_direction dir, 
+ struct dma_attrs *attrs)
+{
+   return dma_unmap_sg(dev, sgl, nents, dir);
+}
+#endif /* ARCH_USES_DMA_ATTRS */
+
 #endif


Re: [git pull] x86 arch updates for v2.6.25

2008-02-05 Thread David Cullen


Dear Kernel Maintainers,

I am with Phil Oester and Andrew Morton when it comes to getting 
kgdb into the mainline kernel.  I _am_ a full time developer, and 
when I have to work with Linux kernel code, kgdb makes things a lot 
easier.  I work on many different platforms, with many different 
operating systems, and there just is not enough time in the day to 
learn every line of every version of the kernel.  kgdb allows me to 
get in, see what I need, and get out.


That being said, if kgdb _does_ become mainline, and people are able 
to get more visibility into how the kernel works in real time, you 
will probably see more exploits.  This may be the secret reason for 
reluctance by the powers on high.


CC me if you want me to see your reply as I am not on the list.

--
Thank you,
David Cullen
[EMAIL PROTECTED]


Re: down_killable implementations for every architecture

2008-02-05 Thread Matthew Wilcox

On Tue, Jan 29, 2008 at 05:23:52AM +0100, Andi Kleen wrote:
> On Tuesday 29 January 2008 00:19, Matthew Wilcox wrote:
> > As part of the TASK_KILLABLE changes, we're going to need
> > down_killable().  Unfortunately, semaphores are implemented for every
> > architecture, which we should probably fix at some point.
> 
> It would be best to just change it now before doing further changes. Right now
> we have the bizarre situation that semaphores are more optimized
> with fast path inline assembly code than the far more critical spinlocks.
> But that clearly doesn't make much sense. So the best approach would
> be likely to just pick some generic C implementation from some architecture
> and use it everywhere.

We don't really have an appropriate one.  So I've invented my own.

 104 files changed, 338 insertions(+), 7335 deletions(-)

(and 228k).  That seems inappropriate to post.

Here's the whole patch (includes deletions from every architecture):
http://www.parisc-linux.org/~willy/generic-semaphore.diff

Here's the interesting/useful bit.  I've only tested it briefly on my
laptop -- it could be full of holes, but I've tried very hard to think
of all the interesting edge/race conditions with multiple
sleepers/wakers.

Please review.  I won't be around to respond to comments for another
four days.
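
For readers who want to poke at the counting rules before reading the patch, here is a single-threaded userspace model of the non-sleeping paths only. It is a sketch, not the patch: the model_ names are hypothetical, there is no spinlock and no wait list, and the count never goes negative because nothing in the model sleeps. The return convention follows the down_trylock() comment in the header below (0 if acquired, 1 if contended).

```c
/* Hypothetical model; the real code guards count with sem->lock and
 * disables interrupts around every access. */
struct model_sem {
	int count;	/* > 0: how many more tasks may acquire */
};

static void model_sema_init(struct model_sem *sem, int val)
{
	sem->count = val;
}

/* Never sleeps: returns 0 if acquired, 1 if the semaphore is contended. */
static int model_down_trylock(struct model_sem *sem)
{
	if (sem->count > 0) {
		sem->count--;
		return 0;
	}
	return 1;
}

/* Release one unit; may be called without a matching down. */
static void model_up(struct model_sem *sem)
{
	sem->count++;
}
```

With an initial count of 1 this behaves like the DECLARE_MUTEX() case: the first trylock succeeds and every later one reports contention until someone calls up.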

diff --git a/include/linux/semaphore.h b/include/linux/semaphore.h
new file mode 100644
index 000..8e563bc
--- /dev/null
+++ b/include/linux/semaphore.h
@@ -0,0 +1,81 @@
+/*
+ * Copyright (c) 2008 Intel Corporation
+ * Author: Matthew Wilcox <[EMAIL PROTECTED]>
+ *
+ * Distributed under the terms of the GNU GPL, version 2
+ *
+ * Counting semaphores allow up to <n> tasks to acquire the semaphore
+ * simultaneously.
+ */
+#ifndef __LINUX_SEMAPHORE_H
+#define __LINUX_SEMAPHORE_H
+
+#include 
+#include 
+
+/*
+ * The spinlock controls access to the other members of the semaphore.
+ * 'count' is decremented by every task which calls down*() and incremented
+ * by every call to up().  Thus, if it is positive, it indicates how many
+ * more tasks may acquire the lock.  If it is negative, it indicates how
+ * many tasks are waiting for the lock.  Tasks waiting for the lock are
+ * kept on the wait_list.
+ */
+struct semaphore {
+   spinlock_t  lock;
+   int count;
+   struct list_headwait_list;
+};
+
+#define __SEMAPHORE_INITIALIZER(name, n)   \
+{  \
+   .lock   = __SPIN_LOCK_UNLOCKED((name).lock),\
+   .count  = n,\
+   .wait_list  = LIST_HEAD_INIT((name).wait_list), \
+}
+
+#define __DECLARE_SEMAPHORE_GENERIC(name, count) \
+   struct semaphore name = __SEMAPHORE_INITIALIZER(name, count)
+
+#define DECLARE_MUTEX(name)__DECLARE_SEMAPHORE_GENERIC(name, 1)
+ 
+static inline void sema_init(struct semaphore *sem, int val)
+{
+   *sem = (struct semaphore) __SEMAPHORE_INITIALIZER(*sem, val);
+}
+
+#define init_MUTEX(sem)sema_init(sem, 1)
+#define init_MUTEX_LOCKED(sem) sema_init(sem, 0)
+
+/*
+ * Attempt to acquire the semaphore.  If another task is already holding the
+ * semaphore, sleep until the semaphore is released.
+ */
+extern void fastcall down(struct semaphore *sem);
+
+/*
+ * As down(), except the sleep may be interrupted by a signal.  If it is,
+ * this function will return -EINTR.
+ */
+extern int __must_check fastcall down_interruptible(struct semaphore *sem);
+
+/*
+ * As down_interruptible(), except the sleep may only be interrupted by
+ * signals which are fatal to this process.
+ */
+extern int __must_check fastcall down_killable(struct semaphore *sem);
+
+/*
+ * As down, except this function will not sleep.  It will return 0 if it
+ * acquired the semaphore and 1 if the semaphore was contended.  This
+ * function may be called from any context, including interrupt and softirq.
+ */
+extern int __must_check fastcall down_trylock(struct semaphore *sem);
+
+/*
+ * Release the semaphore.  Unlike mutexes, up() may be called from any
+ * context and even by tasks which have never called down().
+ */
+extern void fastcall up(struct semaphore *sem);
+
+#endif /* __LINUX_SEMAPHORE_H */
diff --git a/kernel/semaphore.c b/kernel/semaphore.c
new file mode 100644
index 000..94f65a1
--- /dev/null
+++ b/kernel/semaphore.c
@@ -0,0 +1,208 @@
+/*
+ * Copyright (c) 2008 Intel Corporation
+ * Author: Matthew Wilcox <[EMAIL PROTECTED]>
+ *
+ * Distributed under the terms of the GNU GPL, version 2
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Some notes on the implementation:
+ *
+ * down_trylock() and up() can be called from interrupt context.
+ * So we have to disable interrupts when taking the lock.
+ *
+ * The ->count variable, if positive, defines how many more tasks can
+ * acquire the semaphore.  If negative,

Re: [PATCH 4/4] x86 mmiotrace: move files into arch/x86/mm/

2008-02-05 Thread Randy Dunlap

On Tue, 5 Feb 2008 22:39:58 +0200 Pekka Paalanen wrote:

> As this patch is too big for the list, it can be found at:
> http://jumi.lut.fi/~paalanen/scratch/mmio25-b/0004-x86-mmiotrace-move-files-into-arch-x86-mm.patch
> 
> The patch is 85 kB and Documentation/SubmittingPatches says I should not
> post it to the list if it exceeds 40 kB.

Hm, we need to change that comment. The mailing list limit
(for lkml) is now 400 KB AFAIK.  Certainly we see plenty of
patches that are larger than 40 KB.

> Ingo, I can also send this directly to your email, if you wish.

Please email it to the mailing list.

---
~Randy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread James Bottomley

On Tue, 2008-02-05 at 16:12 -0800, Andrew Morton wrote:
> On Sun, 03 Feb 2008 18:16:51 -0600
> James Bottomley <[EMAIL PROTECTED]> wrote:
> 
> > 
> > From: James Bottomley <[EMAIL PROTECTED]>
> > Date: Sun, 3 Feb 2008 15:40:56 -0600
> > Subject: [SCSI] enclosure: add support for enclosure services
> > 
> > The enclosure misc device is really just a library providing sysfs
> > support for physical enclosure devices and their components.
> > 
> 
> Thanks for sending it out for review.
> 
> > +struct enclosure_device *enclosure_find(struct device *dev)
> > +{
> > +   struct enclosure_device *edev = NULL;
> > +
> > +   mutex_lock(_list_lock);
> > +   list_for_each_entry(edev, _list, node) {
> > +   if (edev->cdev.dev == dev) {
> > +   mutex_unlock(_list_lock);
> > +   return edev;
> > +   }
> > +   }
> > +   mutex_unlock(_list_lock);
> > +
> > +   return NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(enclosure_find);
> 
> This looks a little odd.  We don't take a ref on the object after looking
> it up, so what prevents some other thread of control from freeing or
> otherwise altering the returned object while the caller is playing with it?

The use case is for enclosure destruction, so the free should never
happen, but I take the point; I've added a class_device_get().
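
For readers following along, the pattern under discussion (take the reference while the list lock is still held, so the object cannot go away between lookup and use) can be sketched in plain userspace C. This is only an illustration under stated assumptions: every demo_ name is hypothetical, the flag is a single-threaded stand-in for the mutex, and none of it is the actual enclosure code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real driver types. */
struct demo_enclosure {
	const void *dev;              /* device this enclosure is bound to */
	int refcount;                 /* references held by callers */
	struct demo_enclosure *next;  /* singly linked registry */
};

static struct demo_enclosure *demo_list;  /* head of the registry */
static int demo_list_locked;              /* stand-in for the list mutex */

static void demo_lock(void)
{
	assert(!demo_list_locked);
	demo_list_locked = 1;
}

static void demo_unlock(void)
{
	assert(demo_list_locked);
	demo_list_locked = 0;
}

/* Add an entry to the front of the registry. */
static void demo_register(struct demo_enclosure *e)
{
	demo_lock();
	e->next = demo_list;
	demo_list = e;
	demo_unlock();
}

/* Find the entry for dev and take a reference before dropping the lock. */
static struct demo_enclosure *demo_find(const void *dev)
{
	struct demo_enclosure *e;

	demo_lock();
	for (e = demo_list; e; e = e->next) {
		if (e->dev == dev) {
			e->refcount++;  /* the caller now owns one reference */
			break;
		}
	}
	demo_unlock();
	return e;  /* NULL when nothing matched */
}

/* Drop a reference obtained from demo_find(). */
static void demo_put(struct demo_enclosure *e)
{
	assert(e->refcount > 0);
	e->refcount--;
}
```

A caller would pair every successful demo_find() with a demo_put() once it is done with the object; the unregister path would then wait for, or be triggered by, the last reference going away.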

> > +/**
> > + * enclosure_for_each_device - calls a function for each enclosure
> > + * @fn:the function to call
> > + * @data:  the data to pass to each call
> > + *
> > + * Loops over all the enclosures calling the function.
> > + *
> > + * Note, this function uses a mutex which will be held across calls to
> > + * @fn, so it must have user context, and @fn should not sleep or
> 
> Probably "non atomic context" would be more accurate.
> 
> fn() actually _can_ sleep.

"should" to me means you don't have to do this but ought to. I'll add a
may (but should not).

> > +   if (!cb) {
> > +   kfree(edev);
> > +   return ERR_PTR(-EINVAL);
> > +   }
> 
> It would be less fuss if this were to test cb before doing the kzalloc().
> 
> Can cb==NULL actually and legitimately happen?

Not really ... I'll make it a BUG_ON.

> > +void enclosure_unregister(struct enclosure_device *edev)
> > +{
> > +   int i;
> > +
> > +   if (!edev)
> > +   return;
> 
> Is this legal?

No ... it'll oops on the null deref later ... I'll remove this.

> > +   mutex_lock(_list_lock);
> > +   list_del(>node);
> > +   mutex_unlock(_list_lock);
> 
> See, right now, someone who found this enclosure_device via
> enclosure_find() could still be playing with it?

Yes, fixed.

> > +   if (!edev || number >= edev->components)
> > +   return ERR_PTR(-EINVAL);
> 
> Is !edev possible and legitimate?

It shouldn't be, no ... I can remove it.

> > +   snprintf(cdev->class_id, BUS_ID_SIZE, "%d", number);
> 
> %u :)

Nitpicker!

> > +   return snprintf(buf, 40, "%d\n", edev->components);
> > +}
> 
> "40"?

I just followed precedent ;-P

There doesn't seem to be a define for this maximum length, so 40 is the
most commonly picked constant.

> > +static char *enclosure_type [] = {
> > +   [ENCLOSURE_COMPONENT_DEVICE] = "device",
> > +   [ENCLOSURE_COMPONENT_ARRAY_DEVICE] = "array device",
> > +};
> 
> One could play with const here, if sufficiently keen.

One will try to summon up the enthusiasm.

> > +static ssize_t set_component_fault(struct class_device *cdev, const char *buf,
> > +  size_t count)
> > +{
> > +   struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> > +   struct enclosure_component *ecomp = to_enclosure_component(cdev);
> > +   int val = simple_strtoul(buf, NULL, 0);
> 
> hrm, we do this conversion about 1e99 times in the kernel and we have to go
> and pass three args where only one was needed. katoi()?

Yes ... I'll add it to the todo list.

> > +   for (i = 0; enclosure_status[i]; i++) {
> > +   if (strncmp(buf, enclosure_status[i],
> > +   strlen(enclosure_status[i])) == 0 &&
> > +   buf[strlen(enclosure_status[i])] == '\n')
> > +   break;
> > +   }
> 
> So if an application does
> 
>   write(fd, "foo", 3)
> 
> it won't work?  They have to do
> 
>   write(fd, "foo\n", 4)
> 
> ?

No ... it's designed for echo; however, I'll add a check for '\0' which
will catch the write case.
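
A rough userspace sketch of that extra check, assuming the buffer handed to the store method is NUL-terminated. The table contents and the demo_ names are made up for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative table; the driver's enclosure_status[] contents may differ. */
static const char *const demo_status[] = {
	"ok", "critical", "unavailable", NULL
};

/*
 * Return the index of the status name that buf starts with, provided the
 * name is followed by a newline (the echo case) or by the terminating NUL
 * (the bare write case).  Return -1 when nothing matches.
 */
static int demo_match_status(const char *buf)
{
	int i;

	assert(buf != NULL);
	for (i = 0; demo_status[i]; i++) {
		size_t len = strlen(demo_status[i]);

		if (strncmp(buf, demo_status[i], len) == 0 &&
		    (buf[len] == '\n' || buf[len] == '\0'))
			return i;
	}
	return -1;
}
```

With this, both "ok\n" and "ok" select entry 0, while a longer word that merely starts with a status name, such as "okay", is still rejected.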

> > +#define to_enclosure_device(x) container_of((x), struct enclosure_device, cdev)
> > +#define to_enclosure_component(x) container_of((x), struct enclosure_component, cdev)
> 
> These could be C functions...

OK ... I was just following precedent again, but I can make them
inlines.
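
As a sketch of what the inline form buys (the argument type is checked by the compiler, which the bare macro does not do), here is a userspace approximation. The demo_ structures are trimmed-down, hypothetical stand-ins, and demo_container_of() is a simplified version of the kernel macro without its type check.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(); no type checking. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-ins for the driver structures. */
struct demo_class_device {
	int id;
};

struct demo_enclosure_device {
	int components;
	struct demo_class_device cdev;  /* embedded member we convert from */
};

/*
 * Typed wrapper: passing anything other than a struct demo_class_device *
 * draws a compiler diagnostic instead of silently computing a bad pointer.
 */
static inline struct demo_enclosure_device *
demo_to_enclosure_device(struct demo_class_device *cdev)
{
	assert(cdev != NULL);
	return demo_container_of(cdev, struct demo_enclosure_device, cdev);
}
```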

> Nice looking driver.

Thanks,

James

---

Here's the incremental diff.

diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
index 42e6e43..6fcb0e9 100644
--- a/drivers/misc/enclosure.c
+++ b/drivers/misc/enclosure.c
@@ -39,7 +39,8 @@ static struct class enclosure_component_class;
  *
  * Looks

Pull request: DMA pool updates

2008-02-05 Thread Matthew Wilcox

Hi Linus,

Could I ask you to pull the DMA Pool changes detailed below?

All the patches have been posted to linux-kernel before, and various
comments (and acks) have been taken into account.  (see
http://thread.gmane.org/gmane.linux.kernel/609943)

It's a fairly nice performance improvement, so would be good to get in.
It's survived a few hours of *mumble* high-stress database benchmark,
so I have high confidence in its stability.

The following changes since commit 21511abd0a248a3f225d3b611cfabb93124605a7:
  Linus Torvalds (1):
Merge branch 'release' of git://git.kernel.org/.../aegl/linux-2.6

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc.git dmapool

Matthew Wilcox (7):
  Move dmapool.c to mm/ directory
  dmapool: Fix style problems
  Avoid taking waitqueue lock in dmapool
  dmapool: Validate parameters to dma_pool_create
  dmapool: Tidy up includes and add comments
  Change dmapool free block management
  pool: Improve memory usage for devices which can't cross boundaries

 drivers/base/Makefile  |    2 +-
 drivers/base/dmapool.c |  481 --
 mm/Makefile            |    1 +
 mm/dmapool.c           |  500
 4 files changed, 502 insertions(+), 482 deletions(-)
 delete mode 100644 drivers/base/dmapool.c
 create mode 100644 mm/dmapool.c

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

Re: [PATCH for review] ACPI: Create /sys/firmware/acpi/interrupts/ counters

2008-02-05 Thread Greg KH

On Tue, Feb 05, 2008 at 08:58:50PM -0500, Len Brown wrote:
> On Tuesday 05 February 2008 18:18, Greg KH wrote:
> > On Tue, Feb 05, 2008 at 06:12:09PM -0500, Len Brown wrote:
> > > On Tuesday 05 February 2008 17:18, Greg KH wrote:
> > > > On Tue, Feb 05, 2008 at 02:30:10AM -0500, Len Brown wrote:
> > > > > # cat /sys/firmware/acpi/interrupts/summary
> > > > > pm_timer     0
> > > > > glbl_lock    0
> > > > > power_btn    0
> > > > > sleep_btn    0
> > > > > rtc          0
> > > > > gpe00        0
> > > ...
> > > > > gpe1F        0
> > > > > gpe_hi       0
> > > > > gpe_total   63
> > > > > acpi_irq    63
> > > > 
> > > > Eeek!  Why?  What's wrong with individual files here?
> > > 
> > > My expectation is that this is a shell interface for debugging,
> > > not an API for programs.  ala /proc/interrupts.
> > 
> > Great, then use debugfs for it.  Please, don't put debug stuff like this
> > in sysfs, that's not what it is there for.  You can do whatever you want
> > in debugfs :)
> 
> Can you point to a model of good behaviour that I can copy?

Any user of the debugfs api you could copy for this.

> note that I want this information to be available on every system,
> just like /proc/interrupts is.

Ah, then /proc perhaps?

> /proc/ has seqfile support, is there a reason I shouldn't use it?
> I'd banned additional files from /proc/acpi for a long time
> since the directory layout was ill-conceived.  But maybe I
> should re-consider the headlong rush to use sysfs?

One of the main problems of /proc was that the files were not
documented, or that the format would change between versions, or that
they were different on different arches.

For something like this, yes, maybe you do need to use proc.  It can
handle almost infinite length files, and just make sure you document it
well.

But I would just stick with debugfs, all distros enable it when
shipping, you just have to ask them to mount it by hand usually:
mount -t debugfs none /sys/kernel/debug/

Either way, I wouldn't recommend sysfs for this.

thanks,

greg k-h

Re: Pull request: TASK_KILLABLE

2008-02-05 Thread Andrew Morton

On Tue, 5 Feb 2008 19:19:42 -0700 Matthew Wilcox <[EMAIL PROTECTED]> wrote:

> > And going back through the mailing list all I can find is a series of five
> > patches in October - it's unclear where and when the other 17 were
> > reviewed, if they were.
> 
> A large number of these patches are just a resplit of the patches sent
> back in October -- you complained they weren't split up enough.  So I
> resplit them.  And sent them to you.  Asking if this was how you
> preferred it.  Which you didn't reply to.

Well, I apologise if that's the case, but...  I can find no record of the
later patch series.  Maybe an MTA ate them?



RE: ipw3945: not only it periodically dies, it also BUG()s

2008-02-05 Thread Chatre, Reinette

On Tuesday, February 05, 2008 1:45 PM, Pavel Machek  wrote:

> 
> ...I've reported this before, with full debugging. Not sure if
> anything happened. 

Could you please point me to where you have reported it before?

> Now, I got BUG() in iwl3945-base.c: 3824

Which driver and kernel are you using?

Reinette


Re: [PATCH 1/3] exporting capability code/name pairs (try #3)

2008-02-05 Thread Kohei KaiGai


Serge E. Hallyn wrote:

Quoting Kohei KaiGai ([EMAIL PROTECTED]):

All that being said, the friendliness factor of this is somewhat
undeniable, and so I can see why folk might want it in the kernel
anyway. If so, would it possible to move this code into
security/capability.c and not in the main kernel per-se - protected with
a configuration option? If it does appear in the kernel, we'll obviously
add your libcap changes too. If it doesn't, then perhaps we can meet
your needs with a slight modification to your libcap patch to read the
capabilities from an optional /etc/XXX file - and make text visibility
of 'late breaking' capabilities something that the admin can tweak as
needed?

I think optional configuration file is not a good idea.
It can make unneeded confusion.

If necessary, I'll move this features into security/capability.c and
add a Kconfig option to select it.

The following patch exports the list of capabilities supported
by the running kernel under /sys/kernel/capability.

Changelog from the previous version:
- Implementation is moved into security/capability.c from kernel/capability.c
- A Kconfig option SECURITY_CAPABILITIES_EXPORT is added to turn this feature on/off.


can you explain one more time exactly what this lets you do that you
absolutely can't do with the current api?


Please consider the following situation:

A user intends to run an application that uses a new capability supported
by a new kernel, but libcap has not been updated to match. In this case the
application cannot work properly, because libcap prevents it from using the
new capability.

When the kernel and libcap are not in sync, the header files provided by the
libcap package are not reliable. Kernel developers in particular run into
this situation from time to time. :)

This feature fills the gap by providing a new interface that correctly
reports the capabilities supported by the running kernel.


I for one don't really object even if it is "duplicated" since it is far
easier to use, and I frequently have systems where kernel and userspace
are out of sync so /usr/include/sys/capabilities is worthless...  Though
I'm a little worried that b/scripts/mkcapnames.sh is the kind of thing
that'll eventually break, but I suppose that's my fault for objecting
to two duplicated lists of capability definitions :)


Are you worried that "mkcapnames.sh" will break in a future version?

If so, we can add code to check whether the script works correctly,
like:
  -- at security/capability.c
  #include 
   :
  #if CAP_LAST_CAP != ARRAY_SIZE(capability_attrs)
  #error "mkcapnames.sh added fewer or more entries than expected!"
  #endif
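
One caveat on the form of that check: the preprocessor cannot evaluate sizeof, so an #if on ARRAY_SIZE() is unlikely to build as written, and the array also carries the version, index and NULL entries on top of one entry per capability. A compile-time assertion in the style of BUILD_BUG_ON() would express the same intent. The following is only a userspace sketch: the demo_ names, the three-capability table and the macro are assumptions for illustration, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))

/*
 * Fails to compile (negative array size) when cond is false; similar in
 * spirit to the kernel's BUILD_BUG_ON(), but usable at file scope.
 */
#define DEMO_BUILD_CHECK(name, cond)  typedef char name[(cond) ? 1 : -1]

#define DEMO_CAP_LAST_CAP 2  /* hypothetical: capabilities numbered 0..2 */

/* Stand-in for the generated table: version, index, one entry per cap, NULL. */
static const char *const demo_capability_attrs[] = {
	"version", "index",
	"cap_chown", "cap_dac_override", "cap_dac_read_search",
	NULL,
};

/* Expected entries: (last + 1) capabilities + version + index + terminator. */
DEMO_BUILD_CHECK(demo_attrs_in_sync,
	DEMO_ARRAY_SIZE(demo_capability_attrs) == DEMO_CAP_LAST_CAP + 1 + 3);
```

If the script ever emitted too few or too many entries, the build would stop at the DEMO_BUILD_CHECK line rather than producing a mismatched table.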

Thanks,


-serge



[EMAIL PROTECTED] ~]$ for x in /sys/kernel/capability/*

do
  echo "$x --> `cat $x`"
done

/sys/kernel/capability/cap_audit_control --> 30
/sys/kernel/capability/cap_audit_write --> 29
- snip -
/sys/kernel/capability/cap_sys_time --> 25
/sys/kernel/capability/cap_sys_tty_config --> 26
/sys/kernel/capability/index --> 31
/sys/kernel/capability/version --> 0x19980330
[EMAIL PROTECTED] ~]$

Thanks,

Signed-off-by: KaiGai Kohei <[EMAIL PROTECTED]>

 scripts/mkcapnames.sh |   50 +
 security/Kconfig      |    9
 security/Makefile     |   11 ++
 security/capability.c |   49
 4 files changed, 119 insertions(+), 0 deletions(-)

diff --git a/scripts/mkcapnames.sh b/scripts/mkcapnames.sh
index e69de29..262478e 100644
--- a/scripts/mkcapnames.sh
+++ b/scripts/mkcapnames.sh
@@ -0,0 +1,50 @@
+#!/bin/sh
+
+#
+# generate a cap_names.h file from include/linux/capability.h
+#
+
+BASEDIR=`dirname $0`
+
+echo '#ifndef CAP_NAMES_H'
+echo '#define CAP_NAMES_H'
+echo
+echo '/*'
+echo ' * Do NOT edit this file directly.'
+echo ' * This file is generated from include/linux/capability.h automatically'
+echo ' */'
+echo
+echo '#ifndef SYSFS_CAPABILITY_ENTRY'
+echo '#error cap_names.h should be included from kernel/capability.c'
+echo '#else'
+
+echo 'SYSFS_CAPABILITY_ENTRY(version, "0x%08x\n", _LINUX_CAPABILITY_VERSION);'
+
+cat ${BASEDIR}/../include/linux/capability.h   \
+| egrep '^#define CAP_[A-Z_]+[ ]+[0-9]+$'  \
+| awk 'BEGIN {
+max_code = -1;
+}
+{
+if ($3 > max_code)
+max_code = $3;
+printf("SYSFS_CAPABILITY_ENTRY(%s, \"%%u\\n\", %s);\n", tolower($2), $2);
+}
+END {
+printf("SYSFS_CAPABILITY_ENTRY(index, \"%%u\\n\", %u);\n", max_code);
+}'
+
+echo
+echo 'static struct attribute *capability_attrs[] = {'
+echo '_attr.attr,'
+echo '_attr.attr,'
+
+cat ${BASEDIR}/../include/linux/capability.h\
+| egrep '^#define CAP_[A-Z_]+[ ]+[0-9]+$'   \
+| awk '{ printf ("&%s_attr.attr,\n", tolower($2)); }'
+
+echo 'NULL,'
+echo '};'
+
+echo '#endif   /* SYSFS_CAPABILITY_ENTRY */'
+echo '#endif   /* CAP_NAMES_H */'
diff --git a/security/Kconfig

Re: T61P sound issue

2008-02-05 Thread Theodore Tso

On Tue, Feb 05, 2008 at 10:16:08PM +0100, Jiri Kosina wrote:
> [ added Takashi ]
> 
> On Tue, 5 Feb 2008, Felipe Balbi wrote:
> 
> > > > > > Could anyone make T61P's ICH8 sound controller to work properly?
> > Good that there's a lot of people using T61p, it's a good machine.
> > I'll upgrade my BIOS and try again the crappy sound.
> 
> I have just bought X61s, and it seems to have the very same soundcard as 
> your T61p does:
>
> The sound also doesn't work with 2.6.24 (tried modprobing the 
> snd-hda-intel with 'model=thinkpad', didn't make any difference). The 
> mixer settings seem to be correct, but there is no sound.
> 

Hmm.. sound works just fine for me on my X61s (model #7668-CTO)
running 2.6.24.  

I do have this private patch applied --- maybe it makes a difference
for you?  I don't think it should make a difference, but

- Ted


commit c9001b03378048cad0f5c4f87dbb97fff1f80c51
Author: Theodore Ts'o <[EMAIL PROTECTED]>
Date:   Wed Jan 9 05:14:14 2008 -0500

hda_intel suspend latency: shorten codec read

not sleeping for every codec read/write but doing a short udelay and
a conditional reschedule has cut suspend+resume latency by about 1
second on my T60.

The patch also fixes the unexpected codec-connection errors that
happen more often in the new power-save mode:
http://lkml.org/lkml/2007/11/8/255
http://bugzilla.kernel.org/show_bug.cgi?id=9332

This had been applied, and then reverted due to problems.  See commit
d238998fbfa49f30b02f0a5de5294ca53c58348c

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Acked-by: Takashi Iwai <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index 3fa0f97..62b9fb3 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -555,7 +555,8 @@ static unsigned int azx_rirb_get_response(struct hda_codec *codec)
}
if (!chip->rirb.cmds)
return chip->rirb.res; /* the last value */
-   schedule_timeout_uninterruptible(1);
+   udelay(10);
+   cond_resched();
} while (time_after_eq(timeout, jiffies));
 
if (chip->msi) {

Re: Problem burning DVDs with Marvell 88SE6121 on pata_marvell

2008-02-05 Thread Mike Hokenson



On Tuesday, February 05, 2008 at 03:21PM, Andrew Morton wrote:


(added linux-ide)


thanks


On Sat, 2 Feb 2008 16:30:04 -0600
Mike Hokenson <[EMAIL PROTECTED]> wrote:


I recently put together a new system with a MSI P35 PLATINUM and although
reading from data CDs, DVDs, and watching DVD movies is working fine, DVD
burning isn't. MSI's manual says "1 IDE port by Marvell 88SE6111", but
lspci says it's a 88SE6121. I have two DVD burners, a SONY DRU-510A and
a DRU-820A. They were both working fine with TDK DVD+R media on my ASUS
K8V SE DELUXE (VIA IDE controller) prior to the upgrade.

Here's what I see with dvd+rw-tools version 7.0-9:

sh# dvd+rw-mediainfo /dev/dvd > /dev/null  # this is blank media
:-[ GET CURRENT PERFORMACE failed with SK=5h/ASC=24h/ACQ=00h]: Input/output error
:-[ READ TOC failed with SK=5h/ASC=24h/ACQ=00h]: Input/output error

sh# growisofs -dvd-compat -speed=2.4 -Z /dev/dvd -rJ -joliet-long -quiet "burn"
Executing 'genisoimage -rJ -joliet-long -quiet burn | builtin_dd of=/dev/dvd obs=32k seek=0'
/dev/dvd: "Current Write Speed" is 2.5x1352KBps.
:-[ [EMAIL PROTECTED] failed with SK=3h/ASC=0Ch/ACQ=00h]: Input/output error
:-( write failed: Input/output error
/dev/dvd: flushing cache
/dev/dvd: closing track
:-[ CLOSE TRACK failed with SK=5h/ASC=30h/ACQ=05h]: Wrong medium type
/dev/dvd: closing disc
:-[ CLOSE DISC failed with SK=5h/ASC=30h/ACQ=05h]: Wrong medium type

sh# dvd+rw-mediainfo /dev/dvd > /dev/null
:-[ READ TRACK INFORMATION failed with SK=3h/ASC=11h/ACQ=05h]: Input/output error


(this is 2.6.24)

If nothing happens with this report in the next few days, please create an
entry at bugzilla.kernel.org so we can keep an eye on it, thanks.

Trying older kernels might be interesting, find out if it's a regression or
if it was always this way.


2.6.13.14 shows the same problems and 2.6.22 doesn't see the controller.

It looks like most of the changes happened between its introduction in
2.6.20 (v0.1.1) and 2.6.22 (v0.1.4), with some minor updates for internal
changes after that...

Mike

Re: Pull request: TASK_KILLABLE

2008-02-05 Thread Matthew Wilcox

On Thu, Jan 31, 2008 at 06:02:05PM -0800, Andrew Morton wrote:
> No such export was needed in the patches which I added to -mm.  So
> something changed between then and now.

Not sure about that problem -- still on holiday, so just checking my
mail quickly.

> And going back through the mailing list all I can find is a series of five
> patches in October - it's unclear where and when the other 17 were
> reviewed, if they were.

A large number of these patches are just a resplit of the patches sent
back in October -- you complained they weren't split up enough.  So I
resplit them.  And sent them to you.  Asking if this was how you
preferred it.  Which you didn't reply to.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

Re: [PATCH] ipvs: Make wrr "no available servers" error message rate-limited

2008-02-05 Thread Simon Horman

On Tue, Feb 05, 2008 at 09:30:21PM +0100, Sven Wegener wrote:
> No available servers is more an error message than something informational. It
> should also be rate-limited, else we're going to flood our logs on a busy
> director, if all real servers are out of order with a weight of zero.
> 
> Signed-off-by: Sven Wegener <[EMAIL PROTECTED]>

Hi Sven,

this looks good to me.

Acked-by: Simon Horman <[EMAIL PROTECTED]>
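
For context, the general idea behind rate-limiting a log message (allow a small burst, then at most one message per interval) can be sketched in userspace as below. This is a generic illustration with hypothetical demo_ names and an explicit tick argument; it is not the kernel's ratelimit implementation and not what the patch itself contains.

```c
#include <assert.h>

/* Hypothetical limiter state: up to burst messages, one refill per interval. */
struct demo_ratelimit {
	long interval;  /* ticks needed to earn one message */
	int burst;      /* maximum messages that can be saved up */
	int tokens;     /* messages currently allowed */
	long last;      /* tick up to which refills have been accounted */
};

/* Return 1 if a message may be emitted at tick now, 0 if it should be dropped. */
static int demo_ratelimit_ok(struct demo_ratelimit *rl, long now)
{
	long earned;

	assert(rl->interval > 0);
	earned = (now - rl->last) / rl->interval;
	if (earned > 0) {
		/* Refill, but never beyond the configured burst. */
		if (earned >= rl->burst - rl->tokens)
			rl->tokens = rl->burst;
		else
			rl->tokens += (int)earned;
		rl->last += earned * rl->interval;
	}
	if (rl->tokens > 0) {
		rl->tokens--;
		return 1;
	}
	return 0;
}
```

A caller would wrap the error message in "if (demo_ratelimit_ok(&rl, now))", so a director with every real server at weight zero logs a handful of lines instead of one per packet.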

> ---
> 
> Actually, do we need this message at all? The wrr scheduler is the only one
> printing an error message in such a case.

I was wondering about that too. Though I'd err on the side of adding
it to the other schedulers as necessary rather than removing it here.
But if you'd rather just get rid of it, I have no strong objections.

-- 
Horms


Re: [2.6.24 regression][BUGFIX] numactl --interleave=all doesn't works on memoryless node.

2008-02-05 Thread David Rientjes

On Tue, 5 Feb 2008, Lee Schermerhorn wrote:

> Index: Linux/mm/mempolicy.c
> ===
> --- Linux.orig/mm/mempolicy.c 2008-02-05 11:25:17.0 -0500
> +++ Linux/mm/mempolicy.c  2008-02-05 16:03:11.0 -0500
> @@ -131,7 +131,7 @@ static int mpol_check_policy(int mode, n
>   return -EINVAL;
>   break;
>   }
> - return nodes_subset(*nodes, node_states[N_HIGH_MEMORY]) ? 0 : -EINVAL;
> + return 0;
>  }
>  
>  /* Generate a custom zonelist for the BIND policy. */

This change will be necessary when the nodemask passed from the syscall is 
saved in the struct mempolicy as the intent of the application as well.

> @@ -188,8 +188,6 @@ static struct mempolicy *mpol_new(int mo
>   switch (mode) {
>   case MPOL_INTERLEAVE:
>   policy->v.nodes = *nodes;
> - nodes_and(policy->v.nodes, policy->v.nodes,
> - node_states[N_HIGH_MEMORY]);
>   if (nodes_weight(policy->v.nodes) == 0) {
>   kmem_cache_free(policy_cache, policy);
>   return ERR_PTR(-EINVAL);
> @@ -426,9 +424,13 @@ static int contextualize_policy(int mode
>   if (!nodes)
>   return 0;
>  
> + /*
> +  * Restrict the nodes to the allowed nodes in the cpuset.
> +  * This is guaranteed to be a subset of nodes with memory.
> +  */
>   cpuset_update_task_memory_state();
> - if (!cpuset_nodes_subset_current_mems_allowed(*nodes))
> - return -EINVAL;
> + nodes_and(*nodes, *nodes, cpuset_current_mems_allowed);
> +
>   return mpol_check_policy(mode, nodes);
>  }
>  

I would defer the intersection until later because contextualize_policy() 
is called before mpol_new() so we have no struct mempolicy to save the 
intent in.  It doesn't matter for the sake of this change, I know, but you 
could move this intersection to mpol_new() and give us an opportunity to 
store the user's nodemask in the mempolicy with a one-line change and get 
the same desired result.

You can now remove cpuset_nodes_subset_current_mems_allowed() from 
linux/cpuset.h.
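
To make the suggestion concrete, here is a minimal userspace sketch of a policy object that keeps the caller's requested nodemask alongside the effective one, with the intersection done at creation time. The demo_ types and functions are hypothetical, a plain unsigned long stands in for nodemask_t, and this is not the mempolicy code itself.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long demo_nodemask;  /* one bit per node */

struct demo_mempolicy {
	demo_nodemask user_nodes;  /* what the application asked for */
	demo_nodemask nodes;       /* what is usable right now */
};

/* Fill pol; return 0 on success or -1 when no requested node is allowed. */
static int demo_mpol_new(struct demo_mempolicy *pol,
			 demo_nodemask requested, demo_nodemask allowed)
{
	assert(pol != NULL);
	pol->user_nodes = requested;       /* remember the intent */
	pol->nodes = requested & allowed;  /* effective set */
	return pol->nodes ? 0 : -1;
}

/* Recompute the effective set after the allowed nodes change. */
static int demo_mpol_rebind(struct demo_mempolicy *pol, demo_nodemask allowed)
{
	assert(pol != NULL);
	pol->nodes = pol->user_nodes & allowed;
	return pol->nodes ? 0 : -1;
}
```

Because the original request is retained, a later change to the allowed set can restore nodes that were masked out at creation time, which is not possible if only the intersected mask is stored.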

> @@ -797,7 +799,7 @@ static long do_mbind(unsigned long start
>   if (end == start)
>   return 0;
>  
> - if (mpol_check_policy(mode, nmask))
> + if (contextualize_policy(mode, nmask))
>   return -EINVAL;
>  
>   new = mpol_new(mode, nmask);
> @@ -915,10 +917,6 @@ asmlinkage long sys_mbind(unsigned long 
>   err = get_nodes(, nmask, maxnode);
>   if (err)
>   return err;
> -#ifdef CONFIG_CPUSETS
> - /* Restrict the nodes to the allowed nodes in the cpuset */
> - nodes_and(nodes, nodes, current->mems_allowed);
> -#endif
>   return do_mbind(start, len, mode, , flags);
>  }
>  

Looks good, thanks for doing this.

Re: [patch 2/4] forcedeth: fix MAC address detection on network card (regression in 2.6.23)

2008-02-05 Thread Michael Pyne

On Monday 04 February 2008, Ayaz Abdulla wrote:
> Jeff Garzik wrote:
> > Ayaz Abdulla wrote:
> >> I believe Michael determined that a newer BIOS fixes this issue.
> >
> > That's a solution that makes vendors happy... but we still have to deal
> > with it in Linux.  There are plenty of the old broken BIOS still out in
> > the field...
> >
> > Jeff
>
> Michael, can you provide which BIOS version had this issue and which
> version fixed the issue?

Ayaz,

One of my earlier messages to the list was from BIOS revision F3 from what I 
can tell (which matches pretty well with what I remember having).  I am 
currently on F8.

I may go back to F3 if I can get booting from USB to work just to verify 
because I could have sworn it was still broken after going to F8.  But since 
unpatched Linux 2.6.23.12 apparently works fine and I'm not sure when exactly 
that happened (I use Ketchup to maintain the sources and somewhere it 
unpatched my forcedeth.c :) I want to double-check that a simple BIOS upgrade 
will solve it.

But I also don't have a lot of time before I go underway for a few months. :-/

Regards,
 - Michael Pyne


Re: [PATCH mm] stop c_p_a corrupting the pds

2008-02-05 Thread Valdis . Kletnieks

On Tue, 05 Feb 2008 22:27:21 GMT, Hugh Dickins said:
> When change_page_attr splits a large page on x86_32 (without PAE), it is
> currently corrupting every process's page directory: fix that by removing
> the thinko which passes down a physical instead of a virtual address -
> this version of the patch being the hotfix for 2.6.24-mm1.
> 
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>


I *knew* there was a reason we should have had this patch series in -mm for a 
while.


:)



Re: [PATCH] badness() dramatically overcounts memory

2008-02-05 Thread David Rientjes

On Wed, 6 Feb 2008, KOSAKI Motohiro wrote:

> > Andrea Arcangeli has patches pending which change this to the RSS.  
> > Specifically:
> > 
> > http://marc.info/?l=linux-mm=119977937126925
> 
> I agreed with you that RSS is better :)
> 
> 
> 
> but..
> on many node numa, per zone rss is more better..
> 

It depends on how your applications are taking advantage of NUMA 
optimizations.  If they're constrained by mempolicies to a subset of nodes 
then the badness scoring isn't even used: the task that triggered the OOM 
condition is the one that is automatically killed.

At this point, I think you're going to need to present an actual case 
study where Andrea's patch isn't sufficient for selecting the appropriate 
task on large NUMA machines.

David

Re: [PATCH,RESEND] Basic braille screen reader support

2008-02-05 Thread Samuel Thibault

Andrew Morton, le Tue 05 Feb 2008 16:58:53 -0800, a écrit :
> On Tue, 5 Feb 2008 22:00:54 +
> Samuel Thibault <[EMAIL PROTECTED]> wrote:
> 
> > This adds a minimalistic braille screen reader support.
> > This is meant to be used by blind people e.g. on boot failures or when /
> > cannot be mounted etc and thus the userland screen readers can not work.
> 
> Could you feed it through scripts/checkpatch.pl please?  That finds a lot
> of trivial stuff which we'd prefer be fixed.

Oops, sorry, it's probably been some time since I last read
SubmittingPatches.  Actually, this discovered a false positive in
checkpatch.pl for kbd_table :)

> `lastwrite' is a kernel-wide singleton and hence at least needs some
> locking protecting its consistency.
> 
> If this is a single-opener-only device then I _guess_ this approach is OK. 

It is meant to be yes, though it was lacking protection against this.
Fixed in this new version.

> > +   serial8250_console.write(braille_co, data, c - data);
> 
> hm.  Is it appropriate that this driver wire itself directly into
> serial8250?

We want to have output as early as possible for debugging, just like
early serial consoles.

> What if the screen reader is attached to some other sort of
> uart, or a terminal server, or...

Indeed that's an issue.  For now, there is no clean way to attach to the
early serial drivers, that's why I chose 8250, which should be correct
99% of the time for the current users of this.  We could add a parameter
to the console=brl option that specifies which early serial console to
use.

> Maybe this all should be implemented as a line discipline, or something
> like that?

For a permanent screen reader, yes (that's what we will probably do for
SpeakUp), but for a boot reader, I don't think it would even work.

> > +#ifndef MODULE
> > +   ret = serial8250_console.setup(co, options);
> > +   if (ret < 0)
> > +   return ret;
> > +#endif
> 
> That's pretty ungainly.

The problem is that there is currently no better way to set up the serial
port so early.

> Again, if we had some clear separation between the
> protocol layer and the device-driver layer and some way of binding them
> under userspace control (like a line discipline), all this would get better.

Again, we need the screen reader working during boot, even before init
exists.  A line discipline would indeed be fine if we had userspace
control, but this tool is precisely intended for the case when userspace
can't be booted :)

> > @@ -0,0 +1,22 @@
> > +menuconfig A11Y
> > +   bool "Accessibility support"
> 
> That's cute,

Well, that's the official name (www.a11y.org) :)

> but perhaps we should be boring and call it
> CONFIG_ACCESSIBILITY.  That would be more accessible ;)

Why not :)

> > +   ---help---
> > + Enable a submenu where accessibility items may be enabled.
> > +
> > + If unsure, say N.
> > +
> > +if A11Y
> > +config A11Y_BRAILLE_CONSOLE
> 
> And that would get very lengthy.

Then keep A11Y here?

Here is a revised patch.

Samuel

This adds minimalistic braille screen reader support.
This is meant to be used by blind people, e.g. on boot failures or when /
cannot be mounted etc., and thus the userland screen readers cannot work.

Signed-off-by: Samuel Thibault <[EMAIL PROTECTED]>

--- linux-2.6.24-orig/drivers/char/consolemap.c 2008-01-25 08:32:05.0 +
+++ linux-2.6.24-perso/drivers/char/consolemap.c 2008-02-03 21:27:04.0 +
@@ -277,6 +277,7 @@
 		return p->inverse_translations[m][glyph];
 	}
 }
+EXPORT_SYMBOL_GPL(inverse_translate);
 
 static void update_user_maps(void)
 {
--- linux-2.6.24-orig/drivers/char/vt.c 2008-01-25 08:32:06.0 +
+++ linux-2.6.24-perso/drivers/char/vt.c 2008-02-03 21:27:04.0 +
@@ -3982,6 +3982,7 @@
 		c |= 0x100;
 	return c;
 }
+EXPORT_SYMBOL_GPL(screen_glyph);
 
 /* used by vcs - note the word offset */
 unsigned short *screen_pos(struct vc_data *vc, int w_offset, int viewed)
--- linux-2.6.24-orig/drivers/char/keyboard.c 2008-01-25 08:32:06.0 +
+++ linux-2.6.24-perso/drivers/char/keyboard.c 2008-02-04 02:44:37.0 +
@@ -110,6 +110,7 @@
 const int NR_TYPES = ARRAY_SIZE(max_vals);
 
 struct kbd_struct kbd_table[MAX_NR_CONSOLES];
+EXPORT_SYMBOL_GPL(kbd_table);
 static struct kbd_struct *kbd = kbd_table;
 
 struct vt_spawn_console vt_spawn_con = {
@@ -260,6 +261,7 @@
 	} else
 		kd_nosound(0);
 }
+EXPORT_SYMBOL_GPL(kd_mksound);
 
 /*
  * Setting the keyboard rate.
--- linux-2.6.24-orig/drivers/Kconfig   2008-01-25 08:32:04.0 +
+++ linux-2.6.24-perso/drivers/Kconfig  2008-02-04 01:32:17.0 +
@@ -95,4 +95,6 @@
 source "drivers/uio/Kconfig"
 
 source "drivers/virtio/Kconfig"
+
+source "drivers/a11y/Kconfig"
 endmenu
--- linux-2.6.24-orig/drivers/Makefile  2008-01-25 08:32:04.0 +
+++ linux-2.6.24-perso/drivers/Makefile 2008-02-04 01:33:27.0 +
@@ -28,6 +28,8

Re: [PATCH 2.6.24-rc8-mm1 09/15] (RFC) IPC: new kernel API to change an ID

2008-02-05 Thread Oren Laadan




Serge E. Hallyn wrote:
> Quoting Oren Laadan ([EMAIL PROTECTED]):
> > I strongly second Kirill on this matter.
> > 
> > IMHO, we should _avoid_ as much as possible exposing internal kernel
> > state to applications, unless a _real_ need for it is _clearly_
> > demonstrated. The reasons for this are quite obvious.
> 
> Hmm, sure, but this sentence is designed to make us want to agree.  Yes,
> we want to avoid exporting kernel internals, but generally that means
> things like the precise layout of the task_struct.  What Pierre is doing
> is in fact the opposite, exporting resource information in a kernel
> version invariant way.

LOL ... a bit of misunderstanding - let me put some order here:

My response was with respect to the new interface that Pierre
suggested, that is - to add a new IPC call to change an identifier
after it has been allocated (and assigned). This is necessary for the
restart because applications expect to see the same resource IDs as
they had at the time of the checkpoint.

What you are referring to is the more recent part of the thread, where
the topic became how data should be saved - in other words, the format
of the checkpoint data. This is entirely orthogonal to my argument.

Now please re-read my email :)

That said, I'd advocate for something in between a raw dump and a pure
"parametric" representation of the data. Raw data tends to be, well,
too raw, which makes the task of reading data from older versions by
newer kernels harder to maintain. On the other hand, it is impossible
to abstract everything into a kernel-independent format.

> In fact, the very reason not to go the route you and Pavel are
> advocating is that if we just dump task state to a file or filesystem
> from the kernel in one shot, we'll be much more tempted to lay out data
> in a way that exports and ends up depending on kernel internals.  So
> we'll just want to read and write the task_struct verbatim.
> 
> So, there are two very different approaches we can start with.
> Whichever one we follow, we want to avoid having kernel version
> dependencies.  They both have their merits to be sure.

You will never be able to avoid that completely, simply because new
kernels will require saving more (or less) data per object, because
of new (or dropped) features.
The best solution in this sense is to provide a filter (hopefully
a user-space utility) that would convert a checkpoint image file
from the old format to a newer format.
And you keep a lot of compatibility code out of the kernel, too.

> But note that in either case we need to deal with a bunch of locking.
> So getting back to Pierre's patchset, IIRC 1-8 are cleanups worth
> doing no matter what.  9-11 sound like they are contentious until
> we decide whether we want to go with a create_with_id() type approach
> or a set_id().  12 is IMO a good locking cleanup regardless.  13 and
> 15 are contentious until we decide whether we want userspace-controlled
> checkpoint or a one-shot fs.  14 IMO is useful for both c/r approaches.
> 
> Is that pretty accurate?

(context switch back to my original reply)

I prefer not to add a new interface to IPC that will provide a new
functionality that isn't needed, except for the checkpoint - because
there is a better alternative to do the same task; this alternative
is more suitable because (a) it can be applied incrementally, (b) it
provides a consistent method to pre-select identifiers of all syscalls
(whereas the current suggestion suggests one way for IPC and will
suggest other hacks for other resources).

(context switch back to the current reply)

I definitely welcome a cleanup of the (insanely multiplexed) IPC
code. However, I argue that the interface need not be extended.

> > It isn't strictly necessary to export a new interface in order to
> > support checkpoint/restart **. Hence, I think that the speculation
> > "we may need it in the future" is too abstract and isn't a good
> > excuse to commit to a new, currently unneeded, interface.
> 
> OTOH it did succeed in starting some conversation :)
> 
> > Should the
> > need arise in the future, it will be easy to design a new interface
> > (also based on aggregated experience until then).
> 
> What aggregated experience?  We have to start somewhere...

:)  well, assuming the selection of resource IDs is done as I suggested,
we'll have the restart use it. If someone finds a good reason (other
than checkpoint/restart) to pre-select/modify an identifier, it will
be easy to _then_ add an interface. That (hypothetical) interface is
likely to come out more clever after X months using checkpoint/restart.

> > ** In fact, the suggested interface may prove problematic (as noted
> > earlier in this thread): if you first create the resource with some
> > arbitrary identifier and then modify the identifier (in our case,
> > IPC id), then the restart procedure is bound to execute sequentially,
> > because of lack of atomicity.
> 
> Hmm?  Lack of atomicity wrt what?  All the tasks being restarted were
> checkpointed at the same time so there will be no conflict in the
> requested IDs, so I don't

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger

On Wed, 2008-02-06 at 10:29 +0900, FUJITA Tomonori wrote:
> On Tue, 05 Feb 2008 18:09:15 +0100
> Matteo Tescione <[EMAIL PROTECTED]> wrote:
> 
> > On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:
> > 
> > > On Tue, 05 Feb 2008 08:14:01 +0100
> > > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > > 
> > >> James Bottomley schrieb:
> > >> 
> > >>> These are both features being independently worked on, are they not?
> > >>> Even if they weren't, the combination of the size of SCST in kernel plus
> > >>> the problem of having to find a migration path for the current STGT
> > >>> users still looks to me to involve the greater amount of work.
> > >> 
> > >> I don't want to be mean, but does anyone actually use STGT in
> > >> production? Seriously?
> > >> 
> > >> In the latest development version of STGT, it's only possible to stop
> > >> the tgtd target daemon using KILL / 9 signal - which also means all
> > >> iSCSI initiator connections are corrupted when tgtd target daemon is
> > >> started again (kernel upgrade, target daemon upgrade, server reboot 
> > >> etc.).
> > > 
> > > I don't know what "iSCSI initiator connections are corrupted"
> > > means. But if you reboot a server, how can an iSCSI target
> > > implementation keep iSCSI tcp connections?
> > > 
> > > 
> > >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> > >> server. Not only that - your data is probably corrupted, or at least the
> > >> filesystem deserves checking...
> > 

The TCP connection will drop; remember that the TCP connection state for
one side has completely vanished.  Depending on the iSCSI/iSER
ErrorRecoveryLevel that is set, this will mean:

1) Session Recovery, ERL=0 - Restarting the entire nexus and all
connections across all of the possible subnets or comm-links.  All
outstanding un-StatSN acknowledged commands will be returned back to the
SCSI subsystem with RETRY status.  Once a single connection has been
reestablished to start the nexus, the CDBs will be resent.

2) Connection Recovery, ERL=2 - CDBs from the failed connection(s) will
be retried (nothing changes in the PDU) to fill the iSCSI CmdSN ordering
gap, or be explicitly retried with TMR TASK_REASSIGN for ones already
acknowledged by the ExpCmdSN that are returned to the initiator in
response packets or by way of unsolicited NopINs.

> > Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> > rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> > manages stop/crash, by sending unit attention to clients on reconnect.
> > Drbd+heartbeat correctly manages those things too.
> > Still from an end-user POV, i was able to reboot/survive a crash only with
> > SCST, IETD still has reconnect problems and STGT are even worst.
> 
> Please tell us on stgt-devel mailing list if you see problems. We will
> try to fix them.
> 

FYI, the LIO code also supports rmmoding iscsi_target_mod while at full
10 Gb/sec speed.  I think it should be a requirement to be able to
control per initiator, per portal group, per LUN, per device, per HBA in
the design without restarting any other objects.

--nab

> Thanks,

Re: [PATCH] ide-pci-generic: kill the unused ifdef/endif/MODULE code

2008-02-05 Thread Bartlomiej Zolnierkiewicz

On Saturday 02 February 2008, Denis Cheng wrote:
> with module_param macro, the __setup code can be killed now:
>   const __setup("all-generic-ide", ide_generic_all_on);
> 
> and the module name "generic.ko" is not descriptive to its functionality,
> can be changed in Makefile, the "ide-pci-generic.ko" is better.
> 
> the ide-pci-generic.all-generic-ide parameter also documented
> in Documentation/kernel-parameters.txt
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>

applied, thanks

PS: the other patch will take some more time to review

Re: [PATCH for review] ACPI: Create /sys/firmware/acpi/interrupts/ counters

2008-02-05 Thread Len Brown

On Tuesday 05 February 2008 18:18, Greg KH wrote:
> On Tue, Feb 05, 2008 at 06:12:09PM -0500, Len Brown wrote:
> > On Tuesday 05 February 2008 17:18, Greg KH wrote:
> > > On Tue, Feb 05, 2008 at 02:30:10AM -0500, Len Brown wrote:
> > > > # cat /sys/firmware/acpi/interrupts/summary
> > > > pm_timer     0
> > > > glbl_lock    0
> > > > power_btn    0
> > > > sleep_btn    0
> > > > rtc          0
> > > > gpe00        0
> > ...
> > > > gpe1F        0
> > > > gpe_hi       0
> > > > gpe_total   63
> > > > acpi_irq    63
> > > 
> > > Eeek!  Why?  What's wrong with individual files here?
> > 
> > My expectation is that this is a shell interface for debugging,
> > not an API for programs.  ala /proc/interrupts.
> 
> Great, then use debugfs for it.  Please, don't put debug stuff like this
> in sysfs, that's not what it is there for.  You can do whatever you want
> in debugfs :)

Can you point to a model of good behaviour that I can copy?

note that I want this information to be available on every system,
just like /proc/interrupts is.

/proc/ has seqfile support, is there a reason I shouldn't use it?
I'd banned additional files from /proc/acpi for a long time
since the directory layout was ill-conceived.  But maybe I
should re-consider the headlong rush to use sysfs?

thanks,
-Len

Re: [PATCH] badness() dramatically overcounts memory

2008-02-05 Thread KOSAKI Motohiro

Hi

> > > The interesting thing is the use of total_vm and not the RSS which is
> > > used as the basis by the OOM killer. I need to read/understand the
> > > code a bit more.
> > 
> > RSS makes more sense to me as well.
> 
> Andrea Arcangeli has patches pending which change this to the RSS.
> Specifically:
> 
>   http://marc.info/?l=linux-mm&m=119977937126925

I agree with you that RSS is better :)

But on NUMA machines with many nodes, per-zone RSS would be even better.


- kosaki



Re: [rft] s2ram wakeup moves to .c, could fix few machines

2008-02-05 Thread Rafael J. Wysocki

On Wednesday, 6 of February 2008, H. Peter Anvin wrote:
> Rafael J. Wysocki wrote:
> >> The asm() for making beeps really need to be moved to a function and 
> >> cleaned up (redone in C using inb()/outb()) if they are to be retained 
> >> at all.
> > 
> > Yes, they are.  For some people they're the only tool to debug broken 
> > resume.
> 
> That's fine, but they should get cleaned up.

I 100% agree.

> /me is tempted to provide a version which can send messages in Morse Code ;)

That would be great.  It could also play some music or something. ;-)

Rafael

Re: [PATCH for review] ACPI: Create /sys/firmware/acpi/interrupts/ counters

2008-02-05 Thread Len Brown

On Tuesday 05 February 2008 19:44, Bjorn Helgaas wrote:
> On Tuesday 05 February 2008 04:12:09 pm Len Brown wrote:
> > is there
> > a version of cat that prints the file name before
> > the contents of each file?
> 
> I use "grep . *" for this sort of thing.

good tip, bjorn, thanks!

/sys/devices/system/cpu/cpu0/cpufreq # grep . *
affected_cpus:0
cpuinfo_cur_freq:1596000
cpuinfo_max_freq:2660000
cpuinfo_min_freq:1596000
scaling_available_frequencies:2660000 2128000 1596000
scaling_available_governors:conservative ondemand userspace powersave performance
scaling_cur_freq:1596000
scaling_driver:centrino
scaling_governor:ondemand
scaling_max_freq:2660000
scaling_min_freq:1596000

Re: [rft] s2ram wakeup moves to .c, could fix few machines

2008-02-05 Thread H. Peter Anvin


Rafael J. Wysocki wrote:
> > The asm() for making beeps really need to be moved to a function and 
> > cleaned up (redone in C using inb()/outb()) if they are to be retained 
> > at all.
> 
> Yes, they are.  For some people they're the only tool to debug broken resume.

That's fine, but they should get cleaned up.
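
For what "redone in C using inb()/outb()" could look like, here is a
rough userspace sketch of the usual PC speaker sequence (PIT channel 2
plus the gate bits on port 0x61).  It is not the wakeup code's actual
routine: struct port_ops, the fake_* helpers and the beep_* names are
made up for illustration, and the port hooks stand in for the real
inb()/outb() so the sequence can be exercised without touching hardware.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_HZ 1193182u	/* PIT input clock, in Hz */

/* Hypothetical port-I/O hooks; in the kernel these would be inb()/outb(). */
struct port_ops {
	uint8_t (*inb)(uint16_t port);
	void (*outb)(uint8_t value, uint16_t port);
};

/* Simulated port space so the sequence can be exercised in userspace. */
static uint8_t fake_ports[0x100];

static uint8_t fake_inb(uint16_t port)
{
	return fake_ports[port & 0xff];
}

static void fake_outb(uint8_t value, uint16_t port)
{
	fake_ports[port & 0xff] = value;
}

/* Program PIT channel 2 for a square wave at `hz` and gate it to the speaker. */
static void beep_on(const struct port_ops *io, unsigned int hz)
{
	unsigned int count;

	assert(hz > 18);	/* keeps the divisor within 16 bits */
	count = PIT_HZ / hz;

	io->outb(0xB6, 0x43);			/* channel 2, lo/hi byte, mode 3 */
	io->outb(count & 0xff, 0x42);		/* divisor low byte */
	io->outb((count >> 8) & 0xff, 0x42);	/* divisor high byte */
	io->outb(io->inb(0x61) | 3, 0x61);	/* enable timer gate and speaker */
}

/* Clear the two low bits of port 0x61 to silence the speaker. */
static void beep_off(const struct port_ops *io)
{
	io->outb(io->inb(0x61) & 0xFC, 0x61);
}
```

A caller would pair beep_on() and beep_off() around a delay; a Morse
variant would only need to vary the length of that delay.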

/me is tempted to provide a version which can send messages in Morse Code ;)

-hpa

Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger

On Tue, 2008-02-05 at 16:11 -0800, Nicholas A. Bellinger wrote:
> On Tue, 2008-02-05 at 22:21 +0300, Vladislav Bolkhovitin wrote:
> > Jeff Garzik wrote:
> > >>> iSCSI is way, way too complicated. 
> > >>
> > >> I fully agree. From one side, all that complexity is unavoidable for 
> > >> case of multiple connections per session, but for the regular case of 
> > >> one connection per session it must be a lot simpler.
> > > 
> > > Actually, think about those multiple connections...  we already had to 
> > > implement fast-failover (and load bal) SCSI multi-pathing at a higher 
> > > level.  IMO that portion of the protocol is redundant:   You need the 
> > > same capability elsewhere in the OS _anyway_, if you are to support 
> > > multi-pathing.
> > 
> > I'm thinking about MC/S as about a way to improve performance using 
> > several physical links. There's no other way, except MC/S, to keep 
> > commands processing order in that case. So, it's really valuable 
> > property of iSCSI, although with a limited application.
> > 
> > Vlad
> > 
> 
> Greetings,
> 
> I have always observed with LIO SE/iSCSI target mode (as well as with
> other software initiators, which we can leave out of the discussion for
> now; and congrats to the Open/iSCSI folks on their recent release :-)
> that execution core hardware thread and inter-nexus performance per
> 1 Gb/sec ethernet port scales very well with MC/S, up to 4x and 2x core
> x86_64.  I have been seeing 450 MB/sec using 2x socket 4x core x86_64 for
> a number of years with MC/S.  The same goes for MC/S on 10 Gb/sec (on
> PCI-X v2.0 266 MHz as well, which was the first transport that LIO Target
> ran on that was able to handle duplex ~1200 MB/sec with 3 initiators and
> MC/S).  In the point to point 10 Gb/sec tests on IBM p404 machines, the
> initiators were able to reach ~910 MB/sec with MC/S.  Open/iSCSI was
> able to go a bit faster (~950 MB/sec) because it uses struct sk_buff
> directly. 
> 
 
Sorry, these were IBM p505 express (not p404, duh), which had a 2x
socket 2x core POWER5 setup.  These (along with an IBM X-series machine)
were the only ones available for PCI-X v2.0, and this probably is still
the case. :-)

Also, these numbers were with a ~9000 MTU (I don't recall what the
hardware limit on the 10 Gb/sec switch was) doing direct struct iovec
to preallocated struct page mapping for payload on the target side.
This is known as the RAMDISK_DR plugin in the LIO-SE.  On the initiator, LTP
disktest and O_DIRECT were used for direct to SCSI block device access.

I can dig up this paper if anyone is interested.

--nab


inotify_add_watch() returning ENOSPC in 2.6.24 [watch descriptor leak?]

2008-02-05 Thread Clem Taylor

I'm trying to move a MIPS based embedded system from 2.6.16.16 to
2.6.24. Most things seem to be working, but I'm having troubles with
inotify. The code is using inotify to detect a file written to /tmp
(tmpfs). The writer creates a file with a temporary name and then
rename()s the tmp file over the file I'm monitoring.

With 2.6.16.16, everything works fine, but with 2.6.24, the inotify
process runs for a while (~100 events) and then inotify_add_watch()
returns ENOSPC. Once this happens, I can't add new watches, even if I
kill the process and restart it. fs.inotify.max_user_instances and
fs.inotify.max_user_watches are both 128, so I'd imagine I'm hitting
this limit. For some reason the watches aren't getting cleaned up
(even after the process is killed).

In a loop, the code is doing:
wd = inotify_add_watch(fd, file, IN_CLOSE_WRITE|IN_DELETE_SELF|IN_ONESHOT);
blocking read on notify fd

Has something changed in the inotify() API since 2.6.16.16, or could
this be a leak?
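
For anyone who wants to reproduce this, the loop described above can be
sketched roughly as follows.  This is only an approximation of the code
in question: arm_watch(), read_event_mask() and write_and_close() are
illustrative names, and write_and_close() is a simplified stand-in for
the writer (it rewrites the file in place rather than rename()ing a
temporary file over it).  Because the watch is IN_ONESHOT, it has to be
re-armed after every event.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Arm (or re-arm) a one-shot watch on `path`; returns the wd or -1. */
static int arm_watch(int ifd, const char *path)
{
	assert(path != NULL);
	return inotify_add_watch(ifd, path,
				 IN_CLOSE_WRITE | IN_DELETE_SELF | IN_ONESHOT);
}

/*
 * Read one batch of events from `ifd` and OR their masks together.
 * Returns 0 if read() returned nothing (errno is left as set by read()).
 */
static uint32_t read_event_mask(int ifd)
{
	char buf[4096]
		__attribute__((aligned(__alignof__(struct inotify_event))));
	uint32_t mask = 0;
	ssize_t len = read(ifd, buf, sizeof(buf));
	ssize_t off = 0;

	while (len > 0 && off < len) {
		const struct inotify_event *ev =
			(const struct inotify_event *)(buf + off);

		mask |= ev->mask;
		off += (ssize_t)sizeof(*ev) + ev->len;
	}
	return mask;
}

/* Stand-in for the writer: create/truncate `path`, write a byte, close. */
static int write_and_close(const char *path)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

	if (fd < 0)
		return -1;
	if (write(fd, "x", 1) != 1) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

The monitoring side would call arm_watch() and then read_event_mask() in
a loop on a blocking inotify fd, checking the return value of
arm_watch() for ENOSPC on each pass.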

  --Clem

Re: [PATCH] Add IPv6 support to TCP SYN cookies

2008-02-05 Thread Glenn Griffin

I realized an earlier email I sent had an incorrect timestamp and wasn't
associated with the thread, so I thought it would be better to resend.
I apologize if this is duplicated for anyone.

Here is a reworked patch that moves the IPv6 syncookie support out of
the ipv4/syncookies.c file and into its own ipv6/syncookies.c.  It uses the
same CONFIG options and sysctl variables as ipv4, but this way the code
is isolated to the ipv6 module.


Signed-off-by: Glenn Griffin <[EMAIL PROTECTED]>
---
 include/net/tcp.h     |    6 +
 net/ipv6/Makefile     |    1 +
 net/ipv6/syncookies.c |  273 +
 net/ipv6/tcp_ipv6.c   |   77 ++
 4 files changed, 335 insertions(+), 22 deletions(-)
 create mode 100644 net/ipv6/syncookies.c

diff --git a/include/net/tcp.h b/include/net/tcp.h
index cb5b033..d7f620c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -436,6 +436,11 @@ extern struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
 extern __u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, 
 __u16 *mss);
 
+/* From net/ipv6/syncookies.c */
+extern struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb);
+extern __u32 cookie_v6_init_sequence(struct sock *sk, struct sk_buff *skb,
+__u16 *mss);
+
 /* tcp_output.c */
 
 extern void __tcp_push_pending_frames(struct sock *sk, unsigned int cur_mss,
@@ -1337,6 +1342,7 @@ extern int tcp_proc_register(struct tcp_seq_afinfo *afinfo);
 extern void tcp_proc_unregister(struct tcp_seq_afinfo *afinfo);
 
 extern struct request_sock_ops tcp_request_sock_ops;
+extern struct request_sock_ops tcp6_request_sock_ops;
 
 extern int tcp_v4_destroy_sock(struct sock *sk);
 
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index 87c23a7..d1a1056 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -15,6 +15,7 @@ ipv6-$(CONFIG_XFRM) += xfrm6_policy.o xfrm6_state.o xfrm6_input.o \
 ipv6-$(CONFIG_NETFILTER) += netfilter.o
 ipv6-$(CONFIG_IPV6_MULTIPLE_TABLES) += fib6_rules.o
 ipv6-$(CONFIG_PROC_FS) += proc.o
+ipv6-$(CONFIG_SYN_COOKIES) += syncookies.o
 
 ipv6-objs += $(ipv6-y)
 
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
new file mode 100644
index 000..521c9da
--- /dev/null
+++ b/net/ipv6/syncookies.c
@@ -0,0 +1,273 @@
+/*
+ *  IPv6 Syncookies implementation for the Linux kernel
+ *
+ *  Authors:
+ *  Glenn Griffin  <[EMAIL PROTECTED]>
+ *
+ *  Based on IPv4 implementation by Andi Kleen
+ *  linux/net/ipv4/syncookies.c
+ *
+ * This program is free software; you can redistribute it and/or
+ *  modify it under the terms of the GNU General Public License
+ *  as published by the Free Software Foundation; either version
+ *  2 of the License, or (at your option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+extern int sysctl_tcp_syncookies;
+
+static __u32 syncookie_secret[2][16-10+SHA_DIGEST_WORDS];
+
+static __init int init_syncookies(void)
+{
+   get_random_bytes(syncookie_secret, sizeof(syncookie_secret));
+   return 0;
+}
+module_init(init_syncookies);
+
+#define COOKIEBITS 24  /* Upper bits store count */
+#define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
+
+/*
+ * This table has to be sorted and terminated with (__u16)-1.
+ * XXX generate a better table.
+ * Unresolved Issues: HIPPI with a 64k MSS is not well supported.
+ * 
+ * Taken directly from ipv4 implementation. 
+ * Should this list be modified for ipv6 use or is it close enough?
+ * rfc 2460 8.3 suggests mss values 20 bytes less than ipv4 counterpart
+ */
+static __u16 const msstab[] = {
+   64 - 1,
+   256 - 1,
+   512 - 1,
+   536 - 1,
+   1024 - 1,
+   1440 - 1,
+   1460 - 1,
+   4312 - 1,
+   (__u16)-1
+};
+/* The number doesn't include the -1 terminator */
+#define NUM_MSS (ARRAY_SIZE(msstab) - 1)
+
+/*
+ * This (misnamed) value is the age of syncookie which is permitted.
+ * Its ideal value should be dependent on TCP_TIMEOUT_INIT and
+ * sysctl_tcp_retries1. It's a rather complicated formula (exponential
+ * backoff) to compute at runtime so it's currently hardcoded here.
+ */
+#define COUNTER_TRIES 4
+
+static inline struct sock *get_cookie_sock(struct sock *sk, struct sk_buff *skb,
+  struct request_sock *req,
+  struct dst_entry *dst)
+{
+   struct inet_connection_sock *icsk = inet_csk(sk);
+   struct sock *child;
+
+   child = icsk->icsk_af_ops->syn_recv_sock(sk, skb, req, dst);
+   if (child)
+   inet_csk_reqsk_queue_add(sk, req, child);
+   else
+   reqsk_free(req);
+
+   return child;
+}
+
+static u32 cookie_hash(struct in6_addr *saddr, struct in6_addr *daddr,
+  __be16 sport, __be16 dport, u32 count, int c)
+{
+   __u32 tmp[16 + 5 + SHA_WORKSPACE_WORDS];
+
+   /*
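
The part of the patch that uses msstab is cut off above.  As a rough
illustration of how a sorted, (__u16)-1 terminated table like this can
be used, here is a userspace sketch modelled on my recollection of the
ipv4 lookup loop; mss_to_index() and index_to_mss() are illustrative
names, not functions from the patch, and uint16_t stands in for __u16.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the table in the patch above. */
static const uint16_t msstab[] = {
	64 - 1, 256 - 1, 512 - 1, 536 - 1, 1024 - 1, 1440 - 1, 1460 - 1,
	4312 - 1,
	(uint16_t)-1
};
/* The number doesn't include the -1 terminator */
#define NUM_MSS (sizeof(msstab) / sizeof(msstab[0]) - 1)

/*
 * Walk the table until the next entry is no longer below `mss`.
 * The (uint16_t)-1 terminator guarantees the loop stops for any input.
 */
static unsigned int mss_to_index(uint16_t mss)
{
	unsigned int i;

	for (i = 0; mss > msstab[i + 1]; i++)
		;
	return i;
}

/* Entries are stored as value - 1, so add the 1 back when decoding. */
static uint16_t index_to_mss(unsigned int i)
{
	assert(i < NUM_MSS);
	return (uint16_t)(msstab[i] + 1);
}
```

With this scheme only the small index has to travel in the cookie, and
an advertised MSS of 1500 comes back as 1460 after the round trip.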

Re: [rft] s2ram wakeup moves to .c, could fix few machines

2008-02-05 Thread Rafael J. Wysocki

On Wednesday, 6 of February 2008, H. Peter Anvin wrote:
> Rafael J. Wysocki wrote:
> > On Tuesday, 5 of February 2008, Pavel Machek wrote:
> >> This rewrites wakeup code to .c, and it fixes stack (should use movl
> >> ,%esp, not movw). Testers wanted. Makefile infrastructure was done by
> >> hpa, cleanups by rjw.
> > 
> > I'll test it tomorrow and I still have some more cleanups (I was distracted
> > by a nasty scheduler issue in the current mainline).
> 
> The asm() for making beeps really need to be moved to a function and 
> cleaned up (redone in C using inb()/outb()) if they are to be retained 
> at all.

Yes, they are.  For some people they're the only tool to debug broken resume.

Rafael

Re: [rft] s2ram wakeup moves to .c, could fix few machines

2008-02-05 Thread H. Peter Anvin


Rafael J. Wysocki wrote:
> On Tuesday, 5 of February 2008, Pavel Machek wrote:
> > This rewrites wakeup code to .c, and it fixes stack (should use movl
> > ,%esp, not movw). Testers wanted. Makefile infrastructure was done by
> > hpa, cleanups by rjw.
> 
> I'll test it tomorrow and I still have some more cleanups (I was distracted by
> a nasty scheduler issue in the current mainline).

The asm() for making beeps really need to be moved to a function and 
cleaned up (redone in C using inb()/outb()) if they are to be retained 
at all.

-hpa

Re: [PATCH 3/8] sched: rt-group: interface

2008-02-05 Thread Randy Dunlap

On Mon, 04 Feb 2008 22:03:01 +0100 Peter Zijlstra wrote:

> Change the rt_ratio interface to rt_runtime_us, to match rt_period_us.
> This avoids picking a granularity for the ratio.
> 
> Extend the /sys/kernel/uids// interface to allow setting
> the group's rt_runtime.
> 
> Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]>
> ---
>  Documentation/ABI/testing/sysfs-kernel-uids |    6 +
>  Documentation/sched-rt-group.txt            |   59 +++
>  include/linux/sched.h                       |    7 -
>  kernel/sched.c                              |  145 +---
>  kernel/sched_rt.c                           |   53 --
>  kernel/sysctl.c                             |   32 +++---
>  kernel/user.c                               |   28 +
>  7 files changed, 250 insertions(+), 80 deletions(-)

> Index: linux-2.6/kernel/sched.c
> ===
> --- linux-2.6.orig/kernel/sched.c
> +++ linux-2.6/kernel/sched.c
> @@ -7780,30 +7783,76 @@ unsigned long sched_group_shares(struct 
>  }
>  
>  /*
> - * Ensure the total rt_ratio <= sysctl_sched_rt_ratio
> + * Ensure that the real time constraints are schedulable.
>   */
> -int sched_group_set_rt_ratio(struct task_group *tg, unsigned long rt_ratio)
> +static DEFINE_MUTEX(rt_constraints_mutex);
> +
> +static unsigned long to_ratio(u64 period, u64 runtime)
> +{
> + if (runtime == RUNTIME_INF)
> + return 1ULL << 16;
> +
> + runtime *= (1ULL << 16);
> + do_div(runtime, period);

Isn't do_div() defined as taking (uint64_t, uint32_t) ?

> + return runtime;
> +}
> +
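
To make the do_div() concern concrete: if the divisor is limited to 32
bits, as asked above, a u64 period larger than that would be truncated
before the division.  A userspace sketch of the same 16.16 fixed-point
ratio using a plain 64-by-64 division is below.  to_ratio_sketch() and
RUNTIME_INF_SKETCH are illustrative names standing in for the patch's
to_ratio() and RUNTIME_INF, and the sentinel value is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stands in for the patch's RUNTIME_INF sentinel. */
#define RUNTIME_INF_SKETCH ((uint64_t)~0ULL)

/* 16.16 fixed-point ratio runtime/period; 1 << 16 means 100%. */
static uint64_t to_ratio_sketch(uint64_t period, uint64_t runtime)
{
	assert(period != 0);

	if (runtime == RUNTIME_INF_SKETCH)
		return 1ULL << 16;

	/* Full 64-by-64 division: a period above 32 bits is not truncated. */
	return (runtime << 16) / period;
}
```

In the kernel the plain '/' would have to be replaced by a 64-bit
division helper on 32-bit architectures.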

> Index: linux-2.6/Documentation/sched-rt-group.txt
> ===
> --- /dev/null
> +++ linux-2.6/Documentation/sched-rt-group.txt
> @@ -0,0 +1,59 @@
> +
> +
> +Real-Time group scheduling.
> +
> +The problem space:
> +
> +In order to schedule multiple groups of realtime tasks each group must
> +be assigned a fixed portion of the cpu time available. Without a minimum

Use "cpu" or "CPU" consistently, please.  (I prefer CPU, but )

> +guarantee a realtime group can obviously fall short. A fuzzy upper limit
> +is of no use since it cannot be relied upon. Which leaves us with just
> +the single fixed portion.
> +
> +CPU time is divided by means of specifying how much time can be spend

s/spend/spent/

> +running in a given period. Say a frame fixed realtime renderer must
> +deliver a 25 frames a second, which yields a period of 0.04s. Now say

   drop "a"^

> +it will also have to play some music and respond to input, leaving it
> +with around 80% for the graphics. We can then give this group a runtime
> +of 0.8 * 0.04s = 0.032s.
> +
> +This way the graphics group will have a 0.04s period with a 0.032s runtime
> +limit.
> +
> +Now if the audio thread needs to refill the dma buffer every 0.005s, but

DMA preferably.

> +needs only about 3% cpu time to do so, it will can do with a 0.03 * 0.005s

   s/will can do/can do/

> += 0.00015s.
> +
> +
> +The Interface:
> +
> +system wide:
> +
> +/proc/sys/kernel/sched_rt_period_ms
> +/proc/sys/kernel/sched_rt_runtime_us
> +
> +CONFIG_FAIR_USER_SCHED
> +
> +/sys/kernel/uids//cpu_rt_runtime_us
> +
> +or
> +
> +CONFIG_FAIR_CGROUP_SCHED
> +
> +/cgroup//cpu.rt_runtime_us
> +
> +[ time is specified in us because the interface is s32, this gives an

s/,/;/

> +  operating range of ~35m to 1us ]
> +
> +The period takes values in [ 1, INT_MAX ], runtime in [ -1, INT_MAX - 1 ].
> +
> +A runtime of -1 specifies runtime == period, ie. no limit.
> +
> +New groups get the period from /proc/sys/kernel/sched_rt_period_us and
> +a runtime of 0.
> +
> +Settings are constrainted to:

constrained

> +
> +   \Sum_{i} runtime_{i} / global_period <= global_runtime / global_period
> +
> +in order to keep the configuration schedulable.

---
~Randy

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 18:09:15 +0100
Matteo Tescione <[EMAIL PROTECTED]> wrote:

> On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley schrieb:
> >> 
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> 
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >> 
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> > 
> > 
> >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> >> server. Not only that - your data is probably corrupted, or at least the
> >> filesystem deserves checking...
> 
> Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> manages stop/crash, by sending unit attention to clients on reconnect.
> Drbd+heartbeat correctly manages those things too.
> Still from an end-user POV, i was able to reboot/survive a crash only with
> SCST, IETD still has reconnect problems and STGT are even worst.

Please tell us on stgt-devel mailing list if you see problems. We will
try to fix them.

Thanks,

Re: [rft] s2ram wakeup moves to .c, could fix few machines

2008-02-05 Thread Rafael J. Wysocki

On Tuesday, 5 of February 2008, Pavel Machek wrote:
> 
> This rewrites wakeup code to .c, and it fixes stack (should use movl
> ,%esp, not movw). Testers wanted. Makefile infrastructure was done by
> hpa, cleanups by rjw.

I'll test it tomorrow and I still have some more cleanups (I was distracted by
a nasty scheduler issue in the current mainline).

Thanks,
Rafael

Re: New RTC drivers don't provide the sysctl to set max user frequency

2008-02-05 Thread Lee Revell

On Feb 5, 2008 5:54 PM, Chuck Ebbert <[EMAIL PROTECTED]> wrote:
> Is every application that uses /proc/sys/dev/rtc/max-user-freq
> supposed to be updated to use the new /sys interface?

IMHO the default should be increased to 1024 - the current default of
64 dates back to the 486 era.  This would eliminate the need for a lot
of apps to touch it at all.

Unprivileged users have lots of other ways to generate more than 1024
interrupts/second anyway.

Lee

Re: [Regression] 2.6.24-git9: RT sched mishandles artswrapper (bisected)

2008-02-05 Thread Rafael J. Wysocki

On Tuesday, 5 of February 2008, Dmitry Adamushko wrote:
> Rafael, any progress with this issue? (a few questions below).
> 
> > >
> > > Does this artsmessage thing also run with RT priority?
> >
> > Well, it's in a strange state (after it's broken).  From top:
> >
> > PR = -51
> > NI = 0
> > S = R
> > %CPU = 0.0
> > %MEM = 0.0
> 
> cat /proc/$PID/stat ; sleep 3; cat /proc/$PID/stat ?
> cat /proc/sched_debug; sleep 3 ; cat /proc/sched_debug

Well, instead please find appended a test program that allows me to trigger
the issue.

To reproduce it do:

$ gcc -o break_scheduler break_scheduler.c
$ su -
[...]
# chown root.root $PATH_TO_BINARY/break_scheduler
# chmod u+s $PATH_TO_BINARY/break_scheduler
^D
$ ./break_scheduler

It behaves normally if run directly by root and it also behaves normally if
the execv() at the end is removed.

Hope that helps to understand what the problem is.

Thanks,
Rafael

---
#include <stdio.h>
#include <unistd.h>
#include <sched.h>
#include <sys/types.h>

#define EXECUTE "/bin/ls"

void adjust_priority()
{
int sched = sched_getscheduler(0);

if(sched == SCHED_FIFO || sched == SCHED_RR) {
puts(">> non-standard scheduling policy");
} else {
struct sched_param sp;
long int priority = (sched_get_priority_max(SCHED_FIFO) +
 sched_get_priority_min(SCHED_FIFO))/2;

sp.sched_priority = priority;

if (sched_setscheduler(0, SCHED_FIFO, &sp) != -1) {
printf(">> running as realtime process now "
"(priority %ld)\n", priority);
} else {
/* can't set realtime priority */
puts(">> could not set realtime priority");
}
}
}

int main(int argc, char **argv)
{
adjust_priority();

/* drop root privileges if running setuid root
   (due to realtime priority stuff) */
if (geteuid() != getuid()) {
seteuid(getuid());
if (geteuid() != getuid()) {
perror("setuid()");
return 2;
}
}

puts("OK");

if(argc == 0)
return 1;

argv[0] = EXECUTE;
execv(EXECUTE, argv);
perror(EXECUTE);

return 0;
}

Re: [PATCH] x86: add a crc32 checksum to the kernel image.

2008-02-05 Thread Randy Dunlap

On Fri, 01 Feb 2008 09:02:48 + Ian Campbell wrote:

> ---
> >From 1c614383dc9cb0c7791ebab386dc012db336b28c Mon Sep 17 00:00:00 2001
> From: Ian Campbell <[EMAIL PROTECTED]>
> Date: Fri, 1 Feb 2008 09:01:22 +
> Subject: [PATCH] x86: add a crc32 checksum to the kernel image.
> 
> Signed-off-by: Ian Campbell <[EMAIL PROTECTED]>
> Cc: Thomas Gleixner <[EMAIL PROTECTED]>
> Cc: Ingo Molnar <[EMAIL PROTECTED]>
> Cc: H. Peter Anvin <[EMAIL PROTECTED]>
> ---
>  Documentation/i386/boot.txt |7 +++
>  arch/x86/boot/tools/build.c |   88 
> ++-
>  2 files changed, 94 insertions(+), 1 deletions(-)
> 
> diff --git a/Documentation/i386/boot.txt b/Documentation/i386/boot.txt
> index b5f5ba1..f853d49 100644
> --- a/Documentation/i386/boot.txt
> +++ b/Documentation/i386/boot.txt
> @@ -531,6 +531,13 @@ Protocol:2.08+
>  
>The length of the compressed payload.
>  
> + THE IMAGE CHECKSUM
> +
> +The CRC-32 is calculated over the entire file using an initial
> +remainder of 0x.  The checksum is appended to the file

Run-on sentences.
Use period (full stop) or semi-colon at end of above sentence.
If using a period, capitalize the next word.  Thanks.


> +therefore the CRC of the file up to the limit specified in the syssize
> +field of the header is always 0.
> +
>   THE KERNEL COMMAND LINE
>  
>  The kernel command line has become an important way for the boot

---
~Randy

[patch] mtd maps: document MTD_PHYSMAP module name in kconfig

2008-02-05 Thread Mike Frysinger

Help out users by telling them the module name in the Kconfig help when using
the MTD_PHYSMAP option.

Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]>
---
diff --git a/drivers/mtd/maps/Kconfig b/drivers/mtd/maps/Kconfig
index a592fc0..2f15330 100644
--- a/drivers/mtd/maps/Kconfig
+++ b/drivers/mtd/maps/Kconfig
@@ -21,6 +21,9 @@ config MTD_PHYSMAP
  particular board as well as the bus width, either statically
  with config options or at run-time.
 
+ To compile this driver as a module, choose M here: the
+ module will be called physmap.
+
 config MTD_PHYSMAP_START
hex "Physical start address of flash mapping"
depends on MTD_PHYSMAP

Re: tg3 broken after "PCI: Fix bus resource assignment on 32 bits with 64b resources"

2008-02-05 Thread Benjamin Herrenschmidt

Looks like your attachment got corrupted for some reason (got here as a
corrupted bzip2 file that was encoded as plain text).

Ben.



Re: [PATCH,RESEND] Basic braille screen reader support

2008-02-05 Thread Andrew Morton

On Tue, 5 Feb 2008 22:00:54 +
Samuel Thibault <[EMAIL PROTECTED]> wrote:

> This adds a minimalistic braille screen reader support.
> This is meant to be used by blind people e.g. on boot failures or when /
> cannot be mounted etc and thus the userland screen readers can not work.

Could you feed it through scripts/checkpatch.pl please?  That finds a lot
of trivial stuff which we'd prefer be fixed.

> +#define beep(freq) do { if (sound) kd_mksound(freq, HZ/10); } while(0)

This can (and hence should!) be implemented in a C function (I think?).
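
A minimal sketch of what the function form could look like, written so it builds outside the kernel. The HZ value, the `sound` variable and the kd_mksound() body here are stand-ins for illustration only; in the driver they would be the real kernel definitions.

```c
#include <assert.h>

#define HZ 100			/* stand-in for the kernel's HZ */

static int sound = 1;		/* stand-in for the driver's `sound` flag */
static unsigned int last_freq, last_ticks, calls;

/* Stub that records its arguments; the real kd_mksound() lives in the
 * kernel and actually drives the speaker. */
static void kd_mksound(unsigned int freq, unsigned int ticks)
{
	last_freq = freq;
	last_ticks = ticks;
	calls++;
}

/* The macro rewritten as a function: the argument is type-checked and
 * evaluated exactly once. */
static void beep(unsigned int freq)
{
	if (sound)
		kd_mksound(freq, HZ / 10);
}
```

Callers stay unchanged (beep(440) and so on), and the compiler is free to inline the function.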

> +
> +/* mini console */
> +#define WIDTH 40
> +#define BRAILLE_KEY KEY_INSERT
> +static u16 console_buf[WIDTH];
> +static int console_cursor = 0;

Unneeded initialisation (checkpatch will tell you about this)

> +/* mini view of VC */
> +static int vc_x, vc_y, lastvc_x, lastvc_y;
> +
> +/* show console ? (or show VC) */
> +static int console_show = 1;
> +/* pending newline ? */
> +static int console_newline = 1;
> +static int lastVC = -1;
> +
> +static struct console *braille_co;
> +
> +/* Very VisioBraille-specific */
> +static void braille_write(u16 *buf)
> +{
> + static u16 lastwrite[WIDTH];
> + unsigned char data[1 + 1 + 2*WIDTH + 2 + 1], csum = 0, *c;
> + u16 out;
> + int i;
> +
> + if (!braille_co)
> + return;
> +
> + if (!memcmp(lastwrite, buf, WIDTH * sizeof(*buf)))
> + return;
> + memcpy(lastwrite, buf, WIDTH * sizeof(*buf));

`lastwrite' is a kernel-wide singleton and hence at least needs some
locking protecting its consistency.

If this is a single-opener-only device then I _guess_ this approach is OK. 
If not, `lastwrite' really should be some dynamically-allocated, per-open thing,
presumably accessed by file.private_data.

> +#define SOH 1
> +#define STX 2
> +#define ETX 2
> +#define EOT 4
> +#define ENQ 5
> + data[0] = STX;
> + data[1] = '>';
> + csum ^= '>';
> + c = &data[2];
> + for (i=0; i<WIDTH; i++) {
> + out = buf[i];
> + if (out >= 0x100)
> + out = '?';
> + else if (out == 0x00)
> + out = ' ';
> + csum ^= out;
> + if (out <= 0x05) {
> + *c++ = SOH;
> + out |= 0x40;
> + }
> + *c++ = out;
> + }
> +
> + if (csum <= 0x05) {
> + *c++ = SOH;
> + csum = 0x40;
> + }
> + *c++ = csum;
> + *c++ = ETX;
> +
> + serial8250_console.write(braille_co, data, c - data);
> +}

hm.  Is it appropriate that this driver wire itself directly into
serial8250?  What if the screen reader is attached to some other sort of
uart, or a terminal server, or...

Maybe this all should be implemented as a line discipline, or something
like that?

> +/*
> + * Link to keyboard
> + */
> +
> +static int keyboard_notifier_call(struct notifier_block *blk, unsigned long code, void *_param)
> +{
> + struct keyboard_notifier_param *param = _param;
> + struct vc_data *vc = param->vc;
> + int ret = NOTIFY_OK;
> +
> + if (!param->down)
> + return ret;
> +
> + switch (code) {
> + case KBD_KEYCODE:

Maybe Dmitry and Jiri would have time to review this code?

> +static struct notifier_block keyboard_notifier_block = {
> + .notifier_call = keyboard_notifier_call,
> +};
> +
> +static int vt_notifier_call(struct notifier_block *blk, unsigned long code, void *_param)
> +{
> + struct vt_notifier_param *param = _param;
> + struct vc_data *vc = param->vc;
> + switch (code) {
> + case VT_ALLOCATE:
> + break;
> + case VT_DEALLOCATE:
> + break;
> + case VT_WRITE:
> + {
> + unsigned char c = param->c;
> + switch (c) {
> + case '\b':
> + case 127:
> + if (console_cursor > 0) {
> + console_cursor--;
> + console_buf[console_cursor] = ' ';
> + }
> + break;
> + case '\n':
> + case '\v':
> + case '\f':
> + case '\r':
> + console_newline = 1;
> + break;
> + case '\t':
> + c = ' ';
> + /* Fallthrough */
> + default:
> + if (c < 32)
> + /* Ignore other control sequences */
> + break;
> + if (console_newline) {
> +

Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger

On Tue, 2008-02-05 at 16:48 -0800, Nicholas A. Bellinger wrote:
> On Tue, 2008-02-05 at 22:01 +0300, Vladislav Bolkhovitin wrote:
> > Jeff Garzik wrote:
> > > Alan Cox wrote:
> > > 
> > >>>better. So for example, I personally suspect that ATA-over-ethernet is 
> > >>>way 
> > >>>better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
> > >>>low-level, and against those crazy SCSI people to begin with.
> > >>
> > >>Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > >>would probably trash iSCSI for latency if nothing else.
> > > 
> > > 
> > > AoE is truly a thing of beauty.  It has a two/three page RFC (say no 
> > > more!).
> > > 
> > > But quite so...  AoE is limited to MTU size, which really hurts.  Can't 
> > > really do tagged queueing, etc.
> > > 
> > > 
> > > iSCSI is way, way too complicated. 
> > 
> > I fully agree. From one side, all that complexity is unavoidable for 
> > case of multiple connections per session, but for the regular case of 
> > one connection per session it must be a lot simpler.
> > 
> > And now think about iSER, which brings iSCSI on the whole new complexity 
> > level ;)
> 
> Actually, the iSER protocol wire protocol itself is quite simple,
> because it builds on iSCSI and IPS fundamentals, and because traditional
> iSCSI's recovery logic for CRC failures (and hence alot of
> acknowledgement sequence PDUs that go missing, etc) and the RDMA
> Capable
> Protocol (RCaP).

this should be:

.. and instead the RDMA Capable Protocol (RCaP) provides the 32-bit or
greater data integrity.

--nab


[patch] usb net: asix does not really need 10/100mbit

2008-02-05 Thread Mike Frysinger

The asix usb driver currently depends on NET_ETHERNET which means you cannot
enable this driver if you only have 1000mbit enabled in your kernel.  Since
there is no real dependency between the NET_ETHERNET portion and the asix
driver, simply drop it.

Signed-off-by: Mike Frysinger <[EMAIL PROTECTED]>
---
diff --git a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig
index a12c9c4..0604f3f 100644
--- a/drivers/net/usb/Kconfig
+++ b/drivers/net/usb/Kconfig
@@ -129,7 +129,7 @@ config USB_USBNET
 
 config USB_NET_AX8817X
tristate "ASIX AX88xxx Based USB 2.0 Ethernet Adapters"
-   depends on USB_USBNET && NET_ETHERNET
+   depends on USB_USBNET
select CRC32
default y
help

Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger

On Tue, 2008-02-05 at 22:01 +0300, Vladislav Bolkhovitin wrote:
> Jeff Garzik wrote:
> > Alan Cox wrote:
> > 
> >>>better. So for example, I personally suspect that ATA-over-ethernet is way 
> >>>better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
> >>>low-level, and against those crazy SCSI people to begin with.
> >>
> >>Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> >>would probably trash iSCSI for latency if nothing else.
> > 
> > 
> > AoE is truly a thing of beauty.  It has a two/three page RFC (say no more!).
> > 
> > But quite so...  AoE is limited to MTU size, which really hurts.  Can't 
> > really do tagged queueing, etc.
> > 
> > 
> > iSCSI is way, way too complicated. 
> 
> I fully agree. From one side, all that complexity is unavoidable for 
> case of multiple connections per session, but for the regular case of 
> one connection per session it must be a lot simpler.
> 
> And now think about iSER, which brings iSCSI on the whole new complexity 
> level ;)

Actually, the iSER wire protocol itself is quite simple,
because it builds on iSCSI and IPS fundamentals, and because traditional
iSCSI's recovery logic for CRC failures (and hence a lot of
acknowledgement sequence PDUs that go missing, etc) and the RDMA Capable
Protocol (RCaP).

The logic that iSER collectively disables is known as within-connection
and within-command recovery (negotiated as ErrorRecoveryLevel=1 on the
wire). RFC 5046 requires that the iSCSI layer on which iSER is being
enabled disable CRC32C checksums and any associated timeouts for ERL=1.

Also, have a look at Appendix A. in the iSER spec.

  A.1. iWARP Message Format for iSER Hello Message ...73
  A.2. iWARP Message Format for iSER HelloReply Message ..74
  A.3. iWARP Message Format for SCSI Read Command PDU 75
  A.4. iWARP Message Format for SCSI Read Data ...76
  A.5. iWARP Message Format for SCSI Write Command PDU ...77
  A.6. iWARP Message Format for RDMA Read Request 78
  A.7. iWARP Message Format for Solicited SCSI Write Data 79
  A.8. iWARP Message Format for SCSI Response PDU 80

This is about 1/2 as many as the traditional iSCSI PDUs that iSER
encapsulates.

--nab


Re: softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks

2008-02-05 Thread Andrew Morton

On Fri, 25 Jan 2008 22:59:17 GMT
Linux Kernel Mailing List  wrote:

> Gitweb: 
> http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=82a1fcb90287052aabfa235e7ffc693ea003fe69
> Commit: 82a1fcb90287052aabfa235e7ffc693ea003fe69
> Parent: d0d23b5432fe61229dd3641c5e94d4130bc4e61b
> Author: Ingo Molnar <[EMAIL PROTECTED]>
> AuthorDate: Fri Jan 25 21:08:02 2008 +0100
> Committer:  Ingo Molnar <[EMAIL PROTECTED]>
> CommitDate: Fri Jan 25 21:08:02 2008 +0100
> 
> softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks

One of my test boxes (an 8-way x86_64 software-development thing from Intel
- I'm not sure what's inside it) no longer powers itself off when I run `halt
-pfn'.

During bisection I found two different problems.  Sometimes the machine
wouldn't power off at all.  Other times it would power off after a pause of
around twenty seconds.

Bisection indicates that this commit is what caused the 20-second pause. 
It could be that some later commit caused the infinity-second pause.


Re: [PATCH for review] ACPI: Create /sys/firmware/acpi/interrupts/ counters

2008-02-05 Thread Bjorn Helgaas

On Tuesday 05 February 2008 04:12:09 pm Len Brown wrote:
> is there
> a version of cat that prints the file name before
> the contents of each file?

I use "grep . *" for this sort of thing.

Re: [git patches] IDE updates part #4

2008-02-05 Thread Bartlomiej Zolnierkiewicz


Hi Linus,

On Tuesday 05 February 2008, Linus Torvalds wrote:
> 
> On Sat, 2 Feb 2008, Bartlomiej Zolnierkiewicz wrote:
> > 
> > * next part of IDE probing code re-organization saga
> >   (that would be me)
> 
> This seems to cause very irritating and bogus messages for me:
> 
>   Probing IDE interface ide0...
>   Probing IDE interface ide1...
>   ide2: I/O resource 0x0-0x7 not free.
>   ide2: ports already in use, skipping probe
>   ide3: I/O resource 0x0-0x7 not free.
>   ide3: ports already in use, skipping probe
>   ide4: I/O resource 0x0-0x7 not free.
>   ide4: ports already in use, skipping probe
>   ide5: I/O resource 0x0-0x7 not free.
>   ide5: ports already in use, skipping probe
>   ide6: I/O resource 0x0-0x7 not free.
>   ide6: ports already in use, skipping probe
>   ide7: I/O resource 0x0-0x7 not free.
>   ide7: ports already in use, skipping probe
>   ide8: I/O resource 0x0-0x7 not free.
>   ide8: ports already in use, skipping probe
>   ide9: I/O resource 0x0-0x7 not free.
>   ide9: ports already in use, skipping probe
> 
> and that's just totally bogus. It shouldn't even request that region, 
> since it's not been allocated!
> 
> So that "ide_device_add_all()" is missing some checks. Should it check the 
> probe[] array like ideprobe_init() used to, or what?

This is an ide-generic problem exposed by the recent ide_device_add_all() changes.
Fix below (it works for me) - you may merge the patch as it is or wait an hour
or so for the next IDE tree pull request.

Also sorry for the issue in the first place (it turned out that it slipped
through my testing because I have been running with IDE_MAX_HWIFS=2 for some time :).


From: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
Subject: [PATCH] ide-generic: probing bugfix

On Tuesday 05 February 2008, Linus Torvalds wrote:
> 
> On Sat, 2 Feb 2008, Bartlomiej Zolnierkiewicz wrote:
> > 
> > * next part of IDE probing code re-organization saga
> >   (that would be me)
> 
> This seems to cause very irritating and bogus messages for me:
> 
>   Probing IDE interface ide0...
>   Probing IDE interface ide1...
>   ide2: I/O resource 0x0-0x7 not free.
>   ide2: ports already in use, skipping probe
>   ide3: I/O resource 0x0-0x7 not free.
>   ide3: ports already in use, skipping probe
>   ide4: I/O resource 0x0-0x7 not free.
>   ide4: ports already in use, skipping probe
>   ide5: I/O resource 0x0-0x7 not free.
>   ide5: ports already in use, skipping probe
>   ide6: I/O resource 0x0-0x7 not free.
>   ide6: ports already in use, skipping probe
>   ide7: I/O resource 0x0-0x7 not free.
>   ide7: ports already in use, skipping probe
>   ide8: I/O resource 0x0-0x7 not free.
>   ide8: ports already in use, skipping probe
>   ide9: I/O resource 0x0-0x7 not free.
>   ide9: ports already in use, skipping probe
> 
> and that's just totally bogus. It shouldn't even request that region, 
> since it's not been allocated!

The commit 139ddfcab50e5eabcc88341c8743a990ac1be6a2 ("ide: move handling of
I/O resources out of ide_probe_port()") changed the ordering of the hwif->noprobe
check vs the ide_hwif_request_regions() call (so that we now reserve I/O regions
before checking for hwif->noprobe).  However, the ide-generic host driver depended
on hwif->noprobe being set for skipping probing of empty ide_hwifs[] slots.

Fix it by passing only indexes of non-empty slots to ide_device_add_all()
from ide_generic_init().

Signed-off-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
---
 drivers/ide/ide-generic.c |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

Index: b/drivers/ide/ide-generic.c
===
--- a/drivers/ide/ide-generic.c
+++ b/drivers/ide/ide-generic.c
@@ -20,8 +20,14 @@ static int __init ide_generic_init(void)
if (ide_hwifs[0].io_ports[IDE_DATA_OFFSET])
ide_get_lock(NULL, NULL); /* for atari only */
 
-   for (i = 0; i < MAX_HWIFS; i++)
-   idx[i] = ide_hwifs[i].present ? 0xff : i;
+   for (i = 0; i < MAX_HWIFS; i++) {
+   ide_hwif_t *hwif = &ide_hwifs[i];
+
+   if (hwif->io_ports[IDE_DATA_OFFSET] && !hwif->present)
+   idx[i] = i;
+   else
+   idx[i] = 0xff;
+   }
 
ide_device_add_all(idx, NULL);
 

Re: [PATCH] [PPPOL2TP] Label unused warning when CONFIG_PROC_FS is not set.

2008-02-05 Thread David Miller

From: Jeff Garzik <[EMAIL PROTECTED]>
Date: Tue, 05 Feb 2008 13:22:19 -0500

> David or Paul, wanna pick this up?

I took it, no worries.

Re: [PATCH] [PPPOL2TP] Label unused warning when CONFIG_PROC_FS is not set.

2008-02-05 Thread David Miller

From: James Chapman <[EMAIL PROTECTED]>
Date: Tue, 05 Feb 2008 17:38:10 +

> Acked-by: James Chapman <[EMAIL PROTECTED]>

Applied, thanks everyone.

Re: PCIE ASPM support hangs my laptop pretty often

2008-02-05 Thread Kok, Auke

?? ??? wrote:
> I've patched my kernel with the PCIe ASPM and after setting
> echo powersave > /sys/module/pcie_aspm/parameters/policy
>
> I started to experience random hangs of my laptop.
> Hardware info:
> Thinkpad x60s 1704-5UG
 the x60's chipset doesn't support ASPM properly afaik... bad idea.
>>> Well, the code shouldn't then cause a crash of the machine :)
>> The user enabled it specifically (where it is disabled by default)
>>
>> ASPM has been crashing e1000(e), which is why I've recently merged a patch
>> to disable L1 ASPM for the onboard 82573 nic on those platforms.
>>
>> this new infrastructure should work in the default configuration - enabling
>> ASPM where this system leaves it disabled is expected to give problems
>> unless you know what you are doing.
> 
> In my defense, the patch documentation didn't say it doesn't work with my 
> hardware, nor that it hangs the chipset :) and the promised 1.3w surelly 
> looked nice.
> 
> So, are there any benefits of ASPM if I have it in the kernel but it's set to 
> default? I got the impression that "default" means not much power savings?

did the Kconfig not come with a big fat (EXPERIMENTAL) ?

it actually depends for each device on the PCI-Express bus. Most PCI-E ports
support it but the device has the option of advertising enablement of that
capability or not.

both platform and each device on the pci-e bus are involved. some sata chipsets
work great with it, some that might not even advertise the capability... but it's
really hit and miss.

Your report is great of course, no doubt about it. I hope that people understand
that this feature can seriously break things at the bus level. It makes me feel a
lot better about the issues we had with some of our network cards and ASPM :)

once we get some feeling about how well ASPM works in the field for people we
might have to blacklist certain platforms or devices.

you could (for instance) try to see which devices on your buses support ASPM and
work on per-device ASPM parameters (which is one of the things I suggested before)
so that we get an idea of which device is badly behaving with ASPM on your system.

Cheers,

Auke


Re: [BUG] 2.6.24 refuses to boot - NMI watchdog problem?

2008-02-05 Thread Chris Rankin

--- Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Sat, 2 Feb 2008 23:36:42 + (GMT)
> Chris Rankin <[EMAIL PROTECTED]> wrote:

> > I have a 1 GHz Coppermine PC with 512 MB RAM, and it is failing to boot 
> > with the
> nmi_watchdog=1
> > option. This kernel was rebuilt after doing a "make mrproper". The dmesg 
> > log follows:
> 
> Can you tell us if earlier kernels worked OK, and if so which version(s)?
> From your other mail it appears that 2.6.23 was OK?

Oh yes, 2.6.23.14 is fine with nmi_watchdog=1. (This is on a UP machine with an
SMP/PREEMPT kernel, BTW. "Just for fun.")

Cheers,
Chris




Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger

On Tue, 2008-02-05 at 14:12 -0500, Jeff Garzik wrote:
> Vladislav Bolkhovitin wrote:
> > Jeff Garzik wrote:
> >> iSCSI is way, way too complicated. 
> > 
> > I fully agree. From one side, all that complexity is unavoidable for 
> > case of multiple connections per session, but for the regular case of 
> > one connection per session it must be a lot simpler.
> 
> 
> Actually, think about those multiple connections...  we already had to 
> implement fast-failover (and load bal) SCSI multi-pathing at a higher 
> level.  IMO that portion of the protocol is redundant:   You need the 
> same capability elsewhere in the OS _anyway_, if you are to support 
> multi-pathing.
> 
>   Jeff
> 
> 

Hey Jeff,

I put a whitepaper on the LIO cluster recently about this topic.. It is
from a few years ago but the datapoints are very relevant.

http://linux-iscsi.org/builds/user/nab/Inter.vs.OuterNexus.Multiplexing.pdf

The key advantage to MC/S and ERL=2 has always been that they are
completely OS independent.  They are designed to work together and
actually benefit from one another.

They are also protocol independent between Traditional iSCSI and
iSER.

--nab

PS: Many thanks to my former colleague Edward Cheng for putting this
together.

> 
> 


Re: [PATCH][v4] ipwireless: driver for 3G PC Card

2008-02-05 Thread Jiri Kosina

On Tue, 5 Feb 2008, Linus Torvalds wrote:

> And especially with wireless, I think the impact for any future changes 
> are about the wireless infrastructure, not PCMCIA. There's little reason 
> to believe that we'll ever make any big PCMCIA overhauls exctly because 
> PCMCIA is so dead. So I'd happily merge it, but I would be even happier 
> if I got ack's from the wireless people or if it was simply merged that 
> way too.

Unfortunately the device has a quite confusing name (and the driver of
course follows). It has 'wireless' in its name, but it has literally
nothing to do with "wireless" networking, as far as 802.11 is
concerned.

It is in fact a PCMCIA card, that acts as a GPRS/EDGE/UMTS modem. So its 
This has been already discussed, please see 
http://www.ussg.iu.edu/hypermail/linux/kernel/0711.3/2198.html

Thanks,

-- 
Jiri Kosina
SUSE Labs

Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread Andrew Morton

On Sun, 03 Feb 2008 18:16:51 -0600
James Bottomley <[EMAIL PROTECTED]> wrote:

> 
> From: James Bottomley <[EMAIL PROTECTED]>
> Date: Sun, 3 Feb 2008 15:40:56 -0600
> Subject: [SCSI] enclosure: add support for enclosure services
> 
> The enclosure misc device is really just a library providing sysfs
> support for physical enclosure devices and their components.
> 

Thanks for sending it out for review.

> +struct enclosure_device *enclosure_find(struct device *dev)
> +{
> +	struct enclosure_device *edev = NULL;
> +
> +	mutex_lock(_list_lock);
> +	list_for_each_entry(edev, _list, node) {
> +		if (edev->cdev.dev == dev) {
> +			mutex_unlock(_list_lock);
> +			return edev;
> +		}
> +	}
> +	mutex_unlock(_list_lock);
> +
> +	return NULL;
> +}
> +EXPORT_SYMBOL_GPL(enclosure_find);

This looks a little odd.  We don't take a ref on the object after looking
it up, so what prevents some other thread of control from freeing or
otherwise altering the returned object while the caller is playing with it?
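
As an illustration of the concern (not part of the patch or of the review itself), here is a minimal userspace sketch of a lookup that takes a reference while the list lock is still held. All names (struct encl, encl_find_get(), encl_put(), encl_remove()) are hypothetical stand-ins, a pthread mutex plays the role of the kernel mutex, and in the kernel one would presumably use the device model's own reference counting instead of a bare integer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct enclosure_device. */
struct encl {
	struct encl *next;
	const void *dev;	/* key used for the lookup */
	int refs;		/* protected by list_lock */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct encl *encl_list;

/* Add to the list; the list itself holds one reference. */
static void encl_add(struct encl *e, const void *dev)
{
	pthread_mutex_lock(&list_lock);
	e->dev = dev;
	e->refs = 1;
	e->next = encl_list;
	encl_list = e;
	pthread_mutex_unlock(&list_lock);
}

/* Look up by device and take a reference before dropping the lock. */
static struct encl *encl_find_get(const void *dev)
{
	struct encl *e;

	pthread_mutex_lock(&list_lock);
	for (e = encl_list; e; e = e->next) {
		if (e->dev == dev) {
			e->refs++;
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return e;
}

/* Drop a reference; returns 1 when the last one went away. */
static int encl_put(struct encl *e)
{
	int last;

	pthread_mutex_lock(&list_lock);
	last = (--e->refs == 0);
	pthread_mutex_unlock(&list_lock);
	return last;
}

/* Unlink from the list and drop the list's reference. */
static int encl_remove(struct encl *e)
{
	struct encl **pp;

	pthread_mutex_lock(&list_lock);
	for (pp = &encl_list; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return encl_put(e);
}
```

With this shape, a caller that obtained the object from encl_find_get() keeps it alive across a concurrent encl_remove(); the object may only be freed by whoever sees encl_put() return 1.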

> +/**
> + * enclosure_for_each_device - calls a function for each enclosure
> + * @fn:  the function to call
> + * @data:the data to pass to each call
> + *
> + * Loops over all the enclosures calling the function.
> + *
> + * Note, this function uses a mutex which will be held across calls to
> + * @fn, so it must have user context, and @fn should not sleep or

Probably "non atomic context" would be more accurate.

fn() actually _can_ sleep.

> + * otherwise cause the mutex to be held for indefinite periods
> + */
> +int enclosure_for_each_device(int (*fn)(struct enclosure_device *, void *),
> +			      void *data)
> +{
> +	int error = 0;
> +	struct enclosure_device *edev;
> +
> +	mutex_lock(_list_lock);
> +	list_for_each_entry(edev, _list, node) {
> +		error = fn(edev, data);
> +		if (error)
> +			break;
> +	}
> +	mutex_unlock(_list_lock);
> +
> +	return error;
> +}
> +EXPORT_SYMBOL_GPL(enclosure_for_each_device);
> +
> +/**
> + * enclosure_register - register device as an enclosure
> + *
> + * @dev: device containing the enclosure
> + * @components:  number of components in the enclosure
> + *
> + * This sets up the device for being an enclosure.  Note that @dev does
> + * not have to be a dedicated enclosure device.  It may be some other type
> + * of device that additionally responds to enclosure services
> + */
> +struct enclosure_device *
> +enclosure_register(struct device *dev, const char *name, int components,
> +		   struct enclosure_component_callbacks *cb)
> +{
> +	struct enclosure_device *edev =
> +		kzalloc(sizeof(struct enclosure_device) +
> +			sizeof(struct enclosure_component)*components,
> +			GFP_KERNEL);
> +	int err, i;
> +
> +	if (!edev)
> +		return ERR_PTR(-ENOMEM);
> +
> +	if (!cb) {
> +		kfree(edev);
> +		return ERR_PTR(-EINVAL);
> +	}

It would be less fuss if this were to test cb before doing the kzalloc().

Can cb==NULL actually and legitimately happen?
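
To make the suggestion concrete (again only a sketch, not the patch author's code), the argument check can simply move ahead of the allocation so that the error path has nothing to free. The types and the encl_alloc() helper below are hypothetical userspace stand-ins, calloc() replaces kzalloc(), and the check for a negative component count is an extra assumption that is not in the original.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the structures in the patch. */
struct component { int number; };
struct callbacks { int unused; };
struct encl_dev {
	int components;
	const struct callbacks *cb;
	struct component component[];	/* flexible array, one per slot */
};

/* Returns 0 and stores the new object in *out, or a negative errno. */
static int encl_alloc(int components, const struct callbacks *cb,
		      struct encl_dev **out)
{
	struct encl_dev *edev;
	int i;

	/* Reject bad arguments first, so no cleanup is needed. */
	if (!cb || components < 0)
		return -EINVAL;

	edev = calloc(1, sizeof(*edev) +
			 sizeof(struct component) * (size_t)components);
	if (!edev)
		return -ENOMEM;

	edev->components = components;
	edev->cb = cb;
	for (i = 0; i < components; i++)
		edev->component[i].number = -1;	/* slot not yet in use */

	*out = edev;
	return 0;
}
```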

> +	edev->components = components;
> +
> +	edev->cdev.class = _class;
> +	edev->cdev.dev = get_device(dev);
> +	edev->cb = cb;
> +	snprintf(edev->cdev.class_id, BUS_ID_SIZE, "%s", name);
> +	err = class_device_register(&edev->cdev);
> +	if (err)
> +		goto err;
> +
> +	for (i = 0; i < components; i++)
> +		edev->component[i].number = -1;
> +
> +	mutex_lock(_list_lock);
> +	list_add_tail(&edev->node, _list);
> +	mutex_unlock(_list_lock);
> +
> +	return edev;
> +
> + err:
> +	put_device(edev->cdev.dev);
> +	kfree(edev);
> +	return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL_GPL(enclosure_register);
> +
> +static struct enclosure_component_callbacks enclosure_null_callbacks;
> +
> +/**
> + * enclosure_unregister - remove an enclosure
> + *
> + * @edev:the registered enclosure to remove;
> + */
> +void enclosure_unregister(struct enclosure_device *edev)
> +{
> +	int i;
> +
> +	if (!edev)
> +		return;

Is this legal?

> +	mutex_lock(_list_lock);
> +	list_del(&edev->node);
> +	mutex_unlock(_list_lock);

See, right now, someone who found this enclosure_device via
enclosure_find() could still be playing with it?

> +	for (i = 0; i < edev->components; i++)
> +		if (edev->component[i].number != -1)
> +			class_device_unregister(&edev->component[i].cdev);
> +
> +	/* prevent any callbacks into service user */
> +	edev->cb = &enclosure_null_callbacks;
> +	class_device_unregister(&edev->cdev);
> +}
> +EXPORT_SYMBOL_GPL(enclosure_unregister);
> +
> +/**
> + * enclosure_component_register - add a particular component to an enclosure
> + * @edev:the enclosure to add the component
> + * @num: the device number
> + * @type:the type of

Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger

On Tue, 2008-02-05 at 22:21 +0300, Vladislav Bolkhovitin wrote:
> Jeff Garzik wrote:
> >>> iSCSI is way, way too complicated. 
> >>
> >> I fully agree. From one side, all that complexity is unavoidable for 
> >> case of multiple connections per session, but for the regular case of 
> >> one connection per session it must be a lot simpler.
> > 
> > Actually, think about those multiple connections...  we already had to 
> > implement fast-failover (and load bal) SCSI multi-pathing at a higher 
> > level.  IMO that portion of the protocol is redundant:   You need the 
> > same capability elsewhere in the OS _anyway_, if you are to support 
> > multi-pathing.
> 
> I'm thinking about MC/S as about a way to improve performance using 
> several physical links. There's no other way, except MC/S, to keep 
> commands processing order in that case. So, it's really valuable 
> property of iSCSI, although with a limited application.
> 
> Vlad
> 

Greetings,

I have always observed with LIO SE/iSCSI target mode (as well as with
other software initiators, which we can leave out of the discussion for
now; and congrats to the Open/iSCSI folks on their recent release :-)
that execution core hardware thread and inter-nexus performance per
1 Gb/sec ethernet port scales very well with MC/S, up to 4x and 2x core
x86_64.  I have been seeing 450 MB/sec using 2x socket 4x core x86_64
for a number of years with MC/S.  With MC/S on 10 Gb/sec (on PCI-X v2.0
266 MHz as well), the first transport LIO Target ran on that could
handle it, duplex throughput reached ~1200 MB/sec with 3 initiators.
In the point-to-point 10 Gb/sec tests on IBM p404 machines, the
initiators were able to reach ~910 MB/sec with MC/S.  Open/iSCSI was
able to go a bit faster (~950 MB/sec) because it uses struct sk_buff
directly.

A good rule to keep in mind when considering performance is that,
because of context switching overhead and pipeline <-> bus stalling
(along with other legacy OS-specific storage stack limitations with
BLOCK and VFS with O_DIRECT et al., which I will leave out of the
discussion for iSCSI and SE engine target mode), an initiator will
scale roughly 1/2 as well as a target, given comparable hardware and
virsh output.  The software target case also depends, to a great degree
in many cases, on whether we are talking about something as simple as
doing contiguous DMA memory allocations from a SINGLE kernel thread,
and handling direct execution to a storage hardware DMA ring that may
not have been allocated in the current kernel thread.  In MC/S mode
this breaks down to:

1) Sorting logic that handles the pre-execution state machine for
transport from local RDMA memory and OS-specific data buffers (TCP
application data buffer, struct sk_buff, or RDMA struct page or SG).
This should be generic between iSCSI and iSER.

2) Allocation of said memory buffers to OS subsystem dependent code that
can be queued up to these drivers.  It breaks down to what you can get
drivers and OS subsystem folks to agree to implement, and can be made
generic in a Transport / BLOCK / VFS layered storage stack.  In the
"allocate thread DMA ring and use OS supported software and vendor
available hardware" case, I don't think the kernel space requirement
will ever be able to go away completely.

Without diving into RFC-3720 specifics, the state machine on the MC/S
side covers memory allocation, login and logout generic to iSCSI and
iSER, and ERL=2 recovery.  My plan is to post the locations in the LIO
code where this has been implemented, and where we can make this
easier, etc.  Early in the development of what eventually became the
LIO Target code, ERL was broken into separate files and separate
function prefixes:

iscsi_target_erl0, iscsi_target_erl1 and iscsi_target_erl2.

The statemachine for ERL=0 and ERL=2 is pretty simple in RFC-3720 (have
a look for those interested in the discussion)

7.1.1.  State Descriptions for Initiators and Targets

The LIO target code is also pretty simple for this:

[EMAIL PROTECTED] target]# wc -l iscsi_target_erl*
  1115 iscsi_target_erl0.c
    45 iscsi_target_erl0.h
   526 iscsi_target_erl0.o
  1426 iscsi_target_erl1.c
    51 iscsi_target_erl1.h
  1253 iscsi_target_erl1.o
   605 iscsi_target_erl2.c
    45 iscsi_target_erl2.h
   447 iscsi_target_erl2.o
  5513 total

erl1.c is a bit larger than the others because it contains the MC/S
state machine functions.  iscsi_target_erl1.c:iscsi_execute_cmd() and
iscsi_target_util.c:iscsi_check_received_cmdsn() do most of the work for
the LIO MC/S state machine.  It would probably benefit from being
broken out into, say, iscsi_target_mcs.c.  Note that all of this code is
MC/S safe, with the exception of the specific SCSI TMR functions.  For
the SCSI TMR pieces, I have always hoped to use SCST code for doing
this...
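
For readers who have not looked at the CmdSN logic, the core idea of keeping command order across several connections can be sketched roughly as follows.  This is NOT the LIO code: struct session, session_receive(), the fixed WINDOW and the executed[] log are hypothetical names made up for the illustration, and a real target derives its window from ExpCmdSN/MaxCmdSN as described in RFC-3720.  The only thing taken from the protocol is that CmdSN is compared with 32-bit serial number arithmetic and that commands are executed in CmdSN order no matter which connection delivered them.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW  8	/* hypothetical reorder window, power of two */
#define MAX_LOG 64	/* size of the demo execution log */

struct session {
	uint32_t exp_cmdsn;		/* next CmdSN to execute */
	uint8_t pending[WINDOW];	/* 1 if that CmdSN has arrived */
	uint32_t executed[MAX_LOG];	/* execution order, for the demo */
	int nexec;
};

/* Serial-number comparison: is a before b, modulo 2^32? */
static int sn_lt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Called when a command arrives on any connection of the session.
 * Returns -1 if the CmdSN is outside the window (dropped), otherwise
 * the number of commands that became executable.
 */
static int session_receive(struct session *s, uint32_t cmdsn)
{
	int ran = 0;

	if (sn_lt(cmdsn, s->exp_cmdsn) ||
	    !sn_lt(cmdsn, s->exp_cmdsn + WINDOW))
		return -1;

	s->pending[cmdsn % WINDOW] = 1;

	/* Drain every command that is now in sequence. */
	while (s->pending[s->exp_cmdsn % WINDOW]) {
		s->pending[s->exp_cmdsn % WINDOW] = 0;
		if (s->nexec < MAX_LOG)
			s->executed[s->nexec++] = s->exp_cmdsn;
		s->exp_cmdsn++;
		ran++;
	}
	return ran;
}
```

A command that arrives early on one connection simply waits in the window until the missing CmdSNs show up on the other connections, at which point the whole run is executed in order.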

Most of the login/logout code is done in iscsi_target.c, which could
probably also benefit from getting broken out...

--nab


Re: PCIE ASPM support hangs my laptop pretty often

2008-02-05 Thread Дамјан Георгиевски

> >>> I've patched my kernel with the PCIe ASPM and after setting
> >>> echo powersave > /sys/module/pcie_aspm/parameters/policy
> >>>
> >>> I started to experience random hangs of my laptop.
> >>> Hardware info:
> >>> Thinkpad x60s 1704-5UG
> >>
> >> the x60's chipset doesn't support ASPM properly afaik... bad idea.
> >
> > Well, the code shouldn't then cause a crash of the machine :)
>
> The user enabled it specifically (where it is disabled by default)
>
> ASPM has been crashing e1000(e), which is why I've recently merged a patch
> to disable L1 ASPM for the onboard 82573 nic on those platforms.
>
> this new infrastructure should work in the default configuration - enabling
> ASPM where this system leaves it disabled is expected to give problems
> unless you know what you are doing.

In my defense, the patch documentation didn't say it doesn't work with my 
hardware, nor that it hangs the chipset :) and the promised 1.3 W surely 
looked nice.

So, are there any benefits of ASPM if I have it in the kernel but it's set to 
default? I got the impression that "default" means not much power savings?


-- 
Damjan Georgievski
Free Software Macedonia

Re: [PATCH] mmu notifiers #v5

2008-02-05 Thread Christoph Lameter

On Wed, 6 Feb 2008, Andrea Arcangeli wrote:

> > You can of course setup a 2M granularity lock to get the same granularity 
> > as the pte lock. That would even work for the cases where you have to page 
> > pin now.
> 
> If you set a 2M granularity lock, the _start callback would need to
> do:
> 
>   for_each_2m_lock()
>   mutex_lock()
> 
> so you'd run a zillion mutex_lock calls in a row, you're the one with the
> million of operations argument.

There is no requirement to do a linear search. No one in his right mind 
would implement a performance critical operation that way.
 
> > The size of the mmap is relevant if you have to perform callbacks on 
> > every mapped page that involve taking mmu specific locks. That seems to be 
> > the case with this approach.
> 
> mmap should never trigger any range_start/_end callback unless it's
> overwriting an older mapping which is definitely not the interesting
> workload for those apps including kvm.

There is still at least the need for teardown on exit. And you need to 
consider the boundless creativity of user land programmers. You would not 
believe what I have seen

> > Optimizing do_exit by taking a single lock to zap all external references 
> > instead of 1 mio callbacks somehow leads to slowdown?
> 
> It can if the application runs for more than a couple of seconds,
> i.e. not a fork flood in which you care about do_exit speed. Keep in
> mind if you had 1mio invalidate_pages callback it means you previously
> called follow_page 1 mio of times too...

That is another problem where we are also in need of solutions. I believe 
we have discussed that elsewhere.
