Re: [linux-pm] [demo patch/RFC] sleepy linux

2008-02-26 Thread Randy Dunlap
On Tue, 26 Feb 2008 11:26:53 +0100 Pavel Machek wrote:


Hi Pavel,

Is this limited to UP and only one disk?

[comments below]


> Sleepy linux support, demo version, but it works on my thinkpad x60 ;-).
> 
> Signed-off-by: Pavel Machek <[EMAIL PROTECTED]>
> 
> diff --git a/Documentation/power/sleepy.txt b/Documentation/power/sleepy.txt
> new file mode 100644
> index 000..a9caf05
> --- /dev/null
> +++ b/Documentation/power/sleepy.txt
> @@ -0,0 +1,55 @@
> + Sleepy Linux
> + 
> +
> +Copyright 2007 Pavel Machek <[EMAIL PROTECTED]>
> +   GPLv2
> +
> +Current Linux versions can enter suspend-to-RAM just fine, but only
> +can do it on explicit request. But suspend-to-RAM is important, eating

  Usually "can only do it" AFAIK.

> +something like 10% of power needed for idle system. Starting suspend
> +manually is not too convinient; it is not an option on multiuser

   convenient;

> +machine, and even on single user machine, some things are not easy:
> +
> +1) Download this big chunk in mozilla, then go to sleep
> +
> +2) Compile this, then go to sleep
> +
> +3) You can sleep now, but wake me up in 8:30 with mp3 player
> +
> +Todays hardware is mostly capable of doing better: with correctly set

   Today's

> +up wakeups, machine can sleep and successfully pretend it is not
> +sleeping -- by waking up whenever something interesting happens. Of
> +course, it is easier on machines not connected to the network, and on
> +notebook computers.
> +
> +Requirements:
> +
> +0) Working suspend-to-RAM, with kernel being able to bring video back.
> +
> +1) RTC clock that can wake up system
> +
> +2) Lid that can wake up a system,
> +   or keyboard that can wake up system and does not loose keypress

   lose

> +   or special screensaver setup
> +
> +3) Network card that is either down
> +   or can wake up system on any packet (and not loose too many packets)

   lose

> +
> +How to use it
> +~
> +
> +First, make sure your config is tiny enough that cpu sleeps at least

CPU (please)

> +five or so seconds between wakeups. You'll probably need to disable
> +USB, make some kernel timers way longer than default and boot with
> +init=/bin/bash.
> +
> +Then, enable SCSI powersave by something like:
> +
> +mount /sys

Isn't /sys auto-mounted by kernel?

> +echo auto > 
> /sys/devices/pci:00/:00:1f.2/host0/target0:0:0/0:0:0:0/power/level
> +echo 3 > 
> /sys/devices/pci:00/:00:1f.2/host0/target0:0:0/0:0:0:0/power/autosuspend
> +echo adisk > /sys/power/state
> +mount / -oremount,commit=900
> +
> +Then, echo auto > /sys/power/state should enable sleepy support. Do it
> +twice, and it will ignore open lid and sleep anyway.

> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
> index 29e71bd..0197b1f 100644
> --- a/drivers/ata/ahci.c
> +++ b/drivers/ata/ahci.c
> @@ -268,6 +269,41 @@ static struct class_device_attribute *ah
>   NULL
>  };
>  
> +struct pci_dev *my_pdev;
> +int autosuspend_enabled;
> +
> +struct sleep_disabled_reason ahci_active = {
> +"ahci"
> +};
> +
> +/* The host and its devices are all idle so we can autosuspend */
> +static int autosuspend(struct Scsi_Host *host)
> +{
> + if (my_pdev && autosuspend_enabled) {
> + printk("ahci: should autosuspend\n");

Use printk() KERN_* levels (multiple places).


> + ahci_pci_device_suspend(my_pdev, PMSG_SUSPEND);
> + enable_auto_sleep(_active);
> + return 0;
> + } 
> + printk("ahci: autosuspend disabled\n");
> + return -EINVAL;
> +}
> +
...
> +}
> +
> +
> +
>  static struct scsi_host_template ahci_sht = {
>   .module = THIS_MODULE,
>   .name   = DRV_NAME,
> @@ -1820,6 +1858,10 @@ static void ahci_thaw(struct ata_port *a
>  
>  static void ahci_error_handler(struct ata_port *ap)
>  {
> + struct ata_host *host = ap->host;
> + int rc;
> + extern int slept;

Eh?

> +
>   if (!(ap->pflags & ATA_PFLAG_FROZEN)) {
>   /* restart engine */
>   ahci_stop_engine(ap);

General comment:  Lots of the comment fixes in libata should be part
of a standalone patch, not part of this patch.


> diff --git a/include/linux/ata.h b/include/linux/ata.h
> index 78bbaca..df2dd4f 100644
> --- a/include/linux/ata.h
> +++ b/include/linux/ata.h
> @@ -298,6 +298,13 @@ enum {
>   SCR_ACTIVE  = 3,
>   SCR_NOTIFICATION= 4,
>  
> + /* SControl subfields, each field is 4 bit wide */

   bits

> + ATA_SCTL_DET= 0, /* lsb */
> + ATA_SCTL_SPD= 1,
> + ATA_SCTL_IPM= 2,
> + ATA_SCTL_SPM= 3,
> + ATA_SCTL_PMP= 4,
> +
>   /* SError bits */
>   SERR_DATA_RECOVERED 

Re: Proposal for "proper" durable fsync() and fdatasync()

2008-02-26 Thread Jeff Garzik

Nick Piggin wrote:

Anyway, the idea of making fsync/fdatasync etc. safe by default is
a good idea IMO, and is a bad bug that we don't do that :(


Agreed...  it's also disappointing that [unless I'm mistaken] you have 
to hack each filesystem to support barriers.


It seems far easier to make sync_blkdev() Do The Right Thing, and 
magically make all filesystems data-safe.


Jeff


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24.2: 4KSTACKS + pcdrw + dm + mount -> stack overflow: ide-cd related? dm-related?

2008-02-26 Thread Jan Kara
On Tue 26-02-08 12:37:17, Jiri Kosina wrote:
> On Tue, 26 Feb 2008, Ingo Molnar wrote:
> 
> > > + name = kmalloc(sizeof(char) * UDF_NAME_LEN, GFP_KERNEL);
> > > + fname = kmalloc(sizeof(char) * UDF_NAME_LEN, GFP_KERNEL);
> > > +
> > > + if (!name || !fname) {
> > > + *err = -ENOMEM;
> > > + return NULL;
> > > + }
> > > +
> > >   if (dentry) {
> > >   if (!dentry->d_name.len) {
> > >   *err = -EINVAL;
> > this bit is missing i think:
> > if (name)
> > kfree(name);
> > if (fname)
> > kfree(fname);
> 
> Ergh, of course, stupid me, sorry, it should be freed on all exit paths. I 
> am not sending updated patch, as Jan is probably working on complete 
> removal of one of those fields ... ?
  Yes, I'll convert one variable to kmalloc and the other one remove
completely. Stay tuned ;).

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: vmlinux.lds cleanup

2008-02-26 Thread Cyrill Gorcunov
[Sam Ravnborg - Mon, Feb 25, 2008 at 09:33:07PM +0100]
| On Mon, Feb 25, 2008 at 12:58:00PM +0300, Cyrill Gorcunov wrote:
| > Hi Sam,
| > 
| > you know I've just take a look on different architectures and I suddenly
| > realized that I even can't test my changes I'm bringnin in. For example -
| > xtensa arch, most of lds numeric constants could (and should) be changed
| > to PAGE_SIZE and THREAD_SIZE but this requires to include additional
| > heades in lds script and I'm not even sure if it link without errors...
| > (actually, i'm absolutely sure there would be errors ;)
| > 
| > I'm not sure, but maybe it would be more convenient to ask mainteiners
| > fix their scripts? At least their have access to an appropriate hardware
| > to test.
| 
| I have in most cases the relevant toolchains and then I manually inspect
| the generated .lds file.
| In other cases I just do my best to make it correct and tell in my
| submisison that htis is not build tested. (This type of
| info belongs below the three dasches '---').
| 
|   Sam
| 

got it, thanks ;)

- Cyrill -
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] ext3 freeze feature ver 0.2

2008-02-26 Thread Eric Sandeen
Takashi Sato wrote:

> o Elevate XFS ioctl numbers (XFS_IOC_FREEZE and XFS_IOC_THAW) to the VFS
>   As Andreas Dilger and Christoph Hellwig advised me, I have elevated
>   them to include/linux/fs.h as below.
> #define FIFREEZE_IOWR('X', 119, int)
>    #define FITHAW  _IOWR('X', 120, int)
>   The ioctl numbers used by XFS applications don't need to be changed.
>   But my following ioctl for the freeze needs the parameter
>   as the timeout period.  So if XFS applications don't want the timeout
>   feature as the current implementation, the parameter needs to be
>   changed 1 (level?) into 0.

So, existing xfs applications calling the xfs ioctl now will behave
differently, right?  We can only keep the same ioctl number if the
calling semantics are the same.  Keeping the same number but changing
the semantics is harmful, IMHO

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] using smp_processor_id() in preemptible as suspending

2008-02-26 Thread Andrew Morton
On Tue, 26 Feb 2008 16:24:11 +0800 Dave Young <[EMAIL PROTECTED]> wrote:

> I don't know whom I should mail to, could you cc the proper guy? Thanks.
> 
> [  118.331674] acpi LNXSYSTM:00: suspend
> [  118.331674] Disabling non-boot CPUs ...
> [  118.331674] CPU0 attaching NULL sched-domain.
> [  118.331674] CPU1 attaching NULL sched-domain.
> [  118.438750] CPU 1 is now offline
> [  118.438750] lockdep: fixing up alternatives.
> [  118.438750] SMP alternatives: switching to UP code
> [  118.438750] BUG: using smp_processor_id() in preemptible [] code: 
> s2ram/2818
> [  118.438750] caller is rcu_offline_cpu+0x15a/0x1c0
> [  118.438750] Pid: 2818, comm: s2ram Not tainted 2.6.25-rc3-test #2
> [  118.438750]  [] ? printk+0x18/0x20
> [  118.438750]  [] debug_smp_processor_id+0xb1/0xc0
> [  118.438750]  [] rcu_offline_cpu+0x15a/0x1c0
> [  118.438750]  [] rcu_cpu_notify+0x3f/0x60
> [  118.438750]  [] notifier_call_chain+0x3d/0x80
> [  118.438750]  [] __raw_notifier_call_chain+0x19/0x20
> [  118.438750]  [] raw_notifier_call_chain+0x1a/0x20
> [  118.438750]  [] _cpu_down+0x13b/0x230
> [  118.438750]  [] disable_nonboot_cpus+0x49/0xd0
> [  118.438750]  [] suspend_devices_and_enter+0x72/0x130
> [  118.438750]  [] ? printk+0x18/0x20
> [  118.438750]  [] enter_state+0xb3/0xe0
> [  118.438750]  [] state_store+0x7d/0xc0
> [  118.438750]  [] ? state_store+0x0/0xc0
> [  118.438750]  [] kobj_attr_store+0x2e/0x40
> [  118.438750]  [] flush_write_buffer+0x47/0x70
> [  118.438750]  [] sysfs_write_file+0x49/0x70
> [  118.438750]  [] vfs_write+0x91/0x140
> [  118.438750]  [] sys_write+0x3d/0x70
> [  118.438750]  [] syscall_call+0x7/0xb
> [  118.438750]  ===
> [  118.438750] CPU0 attaching NULL sched-domain.
> [  118.440335] CPU1 is down

Paul & Ingo I guess

> My .config 

Doesn't tell us whether you'r eusing CONFIG_CLASSIC_RCU or
CONFIG_PREEMPT_RCU.  I assume CONFIG_CLASSIC_RCU, if you ran `make
oldconfig'.

Which kernel are you running here?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24-git: kmap_atomic() WARN_ON()

2008-02-26 Thread Jeff Garzik

Ingo Molnar wrote:

* Jeff Garzik <[EMAIL PROTECTED]> wrote:


+   unsigned long flags;
+
+   local_irq_save(flags);


hm, couldnt we attach the irq disabling to some spinlock, in a natural 
way? Explicit flags fiddling is a PITA once we do things like threaded 
irq handlers, -rt, etc.


Attaching the irq disabling to some spinlock is what would be 
artificial...  See the ahci.c patch earlier in this thread.  It is taken 
without spin_lock_irqsave() in the interrupt handler, and there is no 
reason to disable interrupts for the entirety of the interrupt handler 
run -- only the part where we call kmap.


This is only being done to satisfy kmap_atomic's requirements, not libata's.

I could add a "kmap lock" but that just seems silly.

Jeff




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Proposal for "proper" durable fsync() and fdatasync()

2008-02-26 Thread Andrew Morton
On Tue, 26 Feb 2008 15:07:45 + Jamie Lokier <[EMAIL PROTECTED]> wrote:

> SYNC_FILE_RANGE_WRITE scans all pages in the range, looking for dirty
> pages which aren't already queued for write-out.  It marks those with
> a "write-out" flag, and starts write I/Os at some unspecified time in
> the near future; it can be assumed writes for all the pages will
> complete eventually if there's no errors.  When I/O completes on a
> page, it cleans the page and also clears the write-out flag.
> 
> SYNC_FILE_RANGE_WAIT_AFTER waits until all pages in the range don't
> have the "write-out" flag set.
> 
> SYNC_FILE_RANGE_WAIT_BEFORE does the same wait, but before marking
> pages for write-out.  I don't actually see the point in this.  Isn't a
> preceding call with SYNC_FILE_RANGE_WAIT_AFTER equivalent, making
> BEFORE a redundant flag?

Consider the case of pages which are dirty but are already under writeout. 
ie: someone redirtied the page after someone started writing the page out. 
For these pages the kernel needs to

a) wait for the current writeout to complete

b) start new writeout

c) wait for that writeout to complete.

those are the three stages of sync_file_range().  They are independently
selectable and various combinations provide various results.

The reason for providing b) only (SYNC_FILE_RANGE_WRITE) is so that
userspace can get as much data into the queue as possible, to permit the
kernel to optimise IO scheduling better.

If you perform a) and b) together
(SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE) then you are guaranteed
that all data which was dirty when sync_file_range() executed will be sent
into the queue, but you won't get as much data into the queue if the kernel
encounters dirty, under-writeout pages.  This is especially hurtful if
you're trying to feed a lot of little segments into the queue.  In that
case perhaps userspace should do an asynchrnous pass
(SYNC_FILE_RANGE_WRITE) to stuff as much data as poss into the queue, then
a SYNC_FILE_RANGE_WAIT_AFTER pass then a
SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE|SYNC_FILE_RANGE_WAIT_AFTER
pass to clean up any stragglers.  WHich mode is best very much depends on
the application's file dirtying patterns.  One would have to experiment
with it, and tuning of sync_file_range() usage would occur alongside tuning
of the application's write() design.

It's an interesting problem, with potentially high payback.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sata_nv: fix nmi intr or system hanging in rhel4u6 adma.

2008-02-26 Thread Jeff Garzik

Robert Hancock wrote:



Kuan Luo wrote:

Hi, robert
One customer reported that their system received a nmi interrupt after
issuing "dd if=/dev/sdb of=/dev/null" on a defective disk in rhel4u6.
I tested it and found  that my system hung both in rhel4u6(2.6.9-67) and
2.6.24-rc7.
The patch can work well,  but I am not sure if the patch has other
potential effect on adma.
I attached a  file in case of lines breaked.

The below info comes from Gunther Mayer to reproduce the issue.
"
used a Seagate ST3500841NS 3.AE for my test; probably other seagate 
drives are also capable of creating media errors with the new hdparm-8.1:
- compile hdparm-8.1 - hdparm -- yes-i-know-what-i-am-doing 
--make-bad-sector 6 /dev/sdb

Unfortunately this does not succeed for nvidia sata controller (timeouts
et al.), but it worked fine on AHCI machine (e.g. FSC R640).
When I insert this newly created defective disk in Ultra 20, it 
reboots within seconds after issueing "dd if=/dev/sdb of=/dev/null". "


Signed-off-by: [EMAIL PROTECTED]

---
 
drivers/ata/sata_nv.c |5 +++--

 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/sata_nv.c b/drivers/ata/sata_nv.c
index ed5473b..e824260 100644
--- a/drivers/ata/sata_nv.c
+++ b/drivers/ata/sata_nv.c
@@ -837,9 +837,10 @@ static void nv_adma_tf_read(struct ata_port *ap,
struct ata_taskfile *tf)
all shortly be aborted anyway. We assume that NCQ commands
are not
issued via passthrough, which is the only way that switching
into
ADMA mode could abort outstanding commands. */
-nv_adma_register_mode(ap);
+struct nv_adma_port_priv *pp = ap->private_data;
 
-ata_tf_read(ap, tf);

+if (pp->flags & NV_ADMA_PORT_REGISTER_MODE)
+ata_tf_read(ap, tf);
 }
 
 static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, __le16

*cpb)


This is basically avoiding switching into register mode, right? I don't 
think this is a very good solution as the point of the tf_read function 
is that it's supposed to read the taskfile provided by the drive to 
diagnose the error, so not doing this isn't a good thing.


Agree with this analysis -- if ->tf_read() is being called, then 
obviously the core wants a current copy of the device's ATA registers.


It is not a good solution to simply avoiding returning meaningful data, 
because -- as Robert notes -- we need tf_read for analysis.


Jeff



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ugly patch] Save .15W-.5W by AHCI powersaving

2008-02-26 Thread Matthew Garrett
On Mon, Feb 25, 2008 at 05:42:58PM -0500, Jeff Garzik wrote:

> BTW we can also save power by allowing the user to choose to disable 
> hotplugging support.  Then we can power down PHYs that are not in use.
> 
> That requires the addition of some policy controls, because it is 
> user-specific whether or not to waste power waiting for a plug-in event.

For AHCI, if you've enabled link power management then you've already 
disabled hotplug. We might as well power down unused phys in that case. 
Note that laptop bays still seem to tend to use platform-specific 
hotplug notification, even when they're sata - we'll get the hotplug 
notify for them even if the phy's powered down, so that case also needs 
to be handled.

-- 
Matthew Garrett | [EMAIL PROTECTED]
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [dm-devel] Re: device mapper not reporting no-barrier-support?

2008-02-26 Thread Jens Axboe
On Tue, Feb 26 2008, Alasdair G Kergon wrote:
> On Mon, Feb 25, 2008 at 03:20:50PM -0800, Andrew Morton wrote:
> > On Mon, 25 Feb 2008 14:26:15 +0100 Anders Henke <[EMAIL PROTECTED]> wrote:
> > > I'm currently stuck between Kernel LVM and DRBD, as I'm using Kernel
> > > 2.6.24.2 with DRBD 8.2.5 on top of an LVM2 device (LV).
> > > -LVM2/device mapper doesn't support write barriers
> 
> That's right.
> 
> > > -DRBD uses blkdev_issue_flush() to flush its metadata to disk.
> 
> Which won't work if device-mapper is underneath.
> 
> > >  On a no-barrier-device, DRBD should receive EOPNOTSUPP, but
> > >  it really does receive an EIO. Promptly, DRBD gives the
> > >  error message "drbd0: local disk flush failed with status -5".
> > > I've posted a lengty summary of my findings to
> > > http://lists.linbit.com/pipermail/drbd-user/2008-February/008665.html
> > > ... that DRBD does catch the EOPNOTSUPP for blkdev_issue_flush and
> > > BIO_RW_BARRIER, but the lvm implementation of blkdev_issue_flush in
> > > 2.6.24.2 aparently does return EIO for blkdev_issue_flush.
> > I'd say it's a DM bug.
> 
> The dm code is unchanged, but look at the limited endio handling in
> ll_rw_blk.c:
> 
> static void bio_end_empty_barrier(struct bio *bio, int err)
> {
> if (err)
> clear_bit(BIO_UPTODATE, >bi_flags);
> 
> complete(bio->bi_private);
> }
> 
> int blkdev_issue_flush(struct block_device *bdev, sector_t *error_sector)
> {
> ...
> wait_for_completion();
> if (error_sector)
> *error_sector = bio->bi_sector;
> ret = 0;
> if (!bio_flagged(bio, BIO_UPTODATE))
> ret = -EIO;

You are right, the return value got broken there. Does this make it
return -EOPNOTSUPP properly for you?

diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index 6901eed..55c5f1f 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -259,8 +259,11 @@ int blk_do_ordered(struct request_queue *q, struct request 
**rqp)
 
 static void bio_end_empty_barrier(struct bio *bio, int err)
 {
-   if (err)
+   if (err) {
+   if (err == -EOPNOTSUPP)
+   set_bit(BIO_EOPNOTSUPP, >bi_flags);
clear_bit(BIO_UPTODATE, >bi_flags);
+   }
 
complete(bio->bi_private);
 }
@@ -309,7 +312,9 @@ int blkdev_issue_flush(struct block_device *bdev, sector_t 
*error_sector)
*error_sector = bio->bi_sector;
 
ret = 0;
-   if (!bio_flagged(bio, BIO_UPTODATE))
+   if (bio_flagged(bio, BIO_EOPNOTSUPP))
+   ret = -EOPNOTSUPP;
+   else if (!bio_flagged(bio, BIO_UPTODATE))
ret = -EIO;
 
bio_put(bio);

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] pidns: make pid->level and pid_ns->level unsigned

2008-02-26 Thread Pavel Emelyanov
These values represent the nesting level of a namespace and 
pids living in it, and it's always non-negative.

Turning this from int to unsigned int saves some space in 
pid.c (11 bytes on x86 and 64 on ia64) by letting the compiler 
optimize the pid_nr_ns a bit. E.g. on ia64 this removes the 
sign extension calls, which compiler adds to optimize access
to pid->nubers[ns->level].

Signed-off-by: Pavel Emelyanov <[EMAIL PROTECTED]>

---

diff --git a/include/linux/pid.h b/include/linux/pid.h
index c798081..03573e3 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -60,7 +60,7 @@ struct pid
/* lists of tasks that use this pid */
struct hlist_head tasks[PIDTYPE_MAX];
struct rcu_head rcu;
-   int level;
+   unsigned int level;
struct upid numbers[1];
 };
 
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index fcd61fa..caff528 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -20,7 +20,7 @@ struct pid_namespace {
int last_pid;
struct task_struct *child_reaper;
struct kmem_cache *pid_cachep;
-   int level;
+   unsigned int level;
struct pid_namespace *parent;
 #ifdef CONFIG_PROC_FS
struct vfsmount *proc_mnt;
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index 6d792b6..cb17497 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -66,7 +66,7 @@ err_alloc:
return NULL;
 }
 
-static struct pid_namespace *create_pid_namespace(int level)
+static struct pid_namespace *create_pid_namespace(unsigned int level)
 {
struct pid_namespace *ns;
int i;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] Fix b43 driver build for arm

2008-02-26 Thread Gordon Farquharson
Hi Ben

On Tue, Feb 26, 2008 at 7:37 AM, Ben Dooks <[EMAIL PROTECTED]> wrote:

>  I build all of my ARM kernels on an x86 box, it is much faster
>  and I don't have to ensure I have a read/write capable filesystem
>  for any of my ARM boards.

The patch has been merged into Andrew's -mm tree.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg35079.html

Gordon

-- 
Gordon Farquharson
GnuPG Key ID: 32D6D676
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ext2: correct type miss (linux-2.6.24.3)

2008-02-26 Thread ohyama_sec
From: Hiroyasu Ohyama

Maybe I found a type miss in fs/ext2/ext2.h which is in linux-2.6.24.3, and 
write difference below.

Signed-off-by: Hiroyasu OHYAMA <[EMAIL PROTECTED]>
---

--- fs/ext2/ext2.h.orig 2008-02-27 00:56:34.0 +0900
+++ fs/ext2/ext2.h  2008-02-26 19:12:55.0 +0900
@@ -27,7 +27,7 @@ struct ext2_inode_info {
/*
 * i_block_group is the number of the block group which contains
 * this file's inode.  Constant across the lifetime of the inode,
-* it is ued for making block allocation decisions - we try to
+* it is used for making block allocation decisions - we try to
 * place a file's data blocks near its inode block, and new inodes
 * near to their parent directory's inode.
 */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: problem with starting 2.6.25-rc1 and latest git

2008-02-26 Thread Mariusz Kozlowski
Hello,

> On Mon, 18 Feb 2008 14:28:32 +0100, Jean Delvare wrote:
> > On Thu, 14 Feb 2008 00:27:34 +0100, Mariusz Kozlowski wrote:
> > > Of course there is a typo in the subject :)
> > > 
> > > 2.5.25-rc1 -> 2.6.25-rc1
> > > 
> > > > Hello,
> > > > 
> > > > I tried 2.6.25-rc1 and latest git on my laptop (x86 32bit) and 
> > > > have a problem.
> > > > Linux boots but with huge delay due to some issue with loading usb 
> > > > modules.
> > > > Udev complains:
> > > > 
> > > > 'Could not lock modprobe uhci_hcd'
> > > > 'Could not lock modprobe yenta_socket'
> > > > 'Unknown symbol usb_*'
> > > > 'Gave up waiting for init of module usbcore'
> > > > (...)
> > 
> > Have you tried upgrading to rc2? I used to have the same problem you
> > reported, but I was unable to reproduce it since I upgraded to rc2.
> 
> I take this back. It happened to me again today, while I am now running
> rc3.

I can reproduce this here on rc3 as well. Weird - on git tree before
rc3 was released this didn't happen. rc2 and rc1 are broken. I also
don't know why nobody else see this but us :) And not sure how
to debug it some more or provide more accurate information.

Mariusz
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal

2008-02-26 Thread Alan Stern
On Wed, 27 Feb 2008, David Newall wrote:

> David Brownell wrote:
> > On Tuesday 26 February 2008, David Newall wrote:
> >   
> >> Hardware can be inserted and removed while we're in a suspend state; and
> >> there's nothing that we can do about it until we resume.  Is it fair to
> >> say, then, that having started suspend, we could reasonably ignore any
> >> device insertion and removal, and handle it on resume?
> >> 
> >
> > "Ignore" seems a bit strong; those events may be wakeup triggers,
> > which would cause the hardware to make it a very short suspend state.
> >
> > "Defer handling" is more to the point, be it by hardware or software.
> >
> >   
> 
> Of course, "defer".  The insertion has to be handled eventually.  What
> I'm wondering is if we can ignore it, and catch it on the resume.

Certainly.  If hardware-change events can get lost because of the
system sleep, the resume method should make every effort to verify that 
what it remembers of the hardware state matches the current reality.

> >> Presumably we need to scan for hardware changes on resume.
> >> 
> >
> > Not on most busses I work with; the hardware issues notifications
> > whenever the devices are removable.
> >   
> 
> There's no notification while we're suspended.  Isn't it necessary to
> scan all busses on resume, just to know what's on them?

It depends on the bus.  If the bus doesn't support hotplugging then 
scanning isn't necessary.  If the bus does support hotplugging then 
scanning after suspend may or may not be necessary, depending on 
whether or not the bus controller remained powered during the suspend.  
For hotpluggable buses, scanning after hibernation is always necessary.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Sata-MV, Intergated Sata Device Support

2008-02-26 Thread Mark Lord

saeed wrote:


On Mon, 25 Feb 2008, Jeff Garzik wrote:


...

Saeed:  isn't this what your SOC patches already implemented for us?
As near as I can tell, sata_mv now already has support for the 60x1C0.

Saeed's stuff didn't support PCI though, and Jon Li is definitely talking
about PCI...
yes, my patch added support for the SoC sata like in the 5182, and this 
is what Jon Li was concerned about. he mentioneded the 60x1C0 pci device 
just to suggest to use it's code for the SoC sata as it is very similar.

..

I don't think I understand your english there.

Does the current sata_mv driver work as-is with the chipset this person wants?
If not, then exactly what has to change to make it work?

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Tabs, spaces, indent and 80 character lines

2008-02-26 Thread Krzysztof Halasa
Jan Engelhardt <[EMAIL PROTECTED]> writes:

> Now back to coding, oh and don't forget send a patch for CodingStyle 
> since a mail without one is often taken even less seriously.

Someone with a patch to Emacs to use tabs for ident + spaces for
alignment maybe? :-)
-- 
Krzysztof Halasa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ugly patch] Save .15W-.5W by AHCI powersaving

2008-02-26 Thread Mark Lord

Pavel Machek wrote:

Hi!


This is a patch (very ugly, assumes you have just one disk) to bring
powersaving to AHCI. You need Alan's SCSI autosuspend (attached) patch
as a base.

It saves .5W compared to config with disk spinning, and even .15W
compared to hdparm -y... on my thinkpad x60 anyway.

..

There was a discussion of this here today.


Real-life discussion, or something I could read? :-).


It makes good use of AHCI-specific features.

Has it been tested with a Port-Multiplier yet?


I do not know what port-multiplier is, sorry. But it was not really
tested. It is not expected to work on any other config than notebook
very similar to mine.


This is cool enough that we really ought to do a hardware-independent
version, so that all SATA interfaces could benefit.  Especially ata_piix,
but others too.


Well, it seems like it is 10 lines per driver once Alan's SCSI
autosuspend patches are in...

..

Cool (literally)!

I think I might have gotten your patch confused in my mind
with another AHCI patch, which uses features of the chip itself
to automatically negotiate/change link power status on the fly
(no s/w needed, other than to turn it on).

That one is very ACPI specific, though.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Kernel oops with bluetooth usb dongle

2008-02-26 Thread Quel Qun
 -- Original message --
From: Marcel Holtmann <[EMAIL PROTECTED]>
> Hi Quel,
> 
> > Bad news: I still cannot use the device.
> >
> > hcitool inq, hcitool scan, hcitool name  and hcitool info  
> > 
> > commands work.
> >
> > hcitool cc , sdptool , rfcomm connect command fail,  
> > most of them
> > with a 'Connection reset by peer' error.
> 
> what does "hciconfig hci0 version" tell you about your device? Some of  
> the none major based Bluetooth chips are broken and might need an  
> extra tweak within the USB driver.
> 

Marcel,

# hciconfig hci0 version
hci0:   Type: USB
BD Address: 00:03:0D:00:15:47 ACL MTU: 192:8 SCO MTU: 64:8
HCI Ver: 1.1 (0x1) HCI Rev: 0xbc LMP Ver: 1.1 (0x1) LMP Subver: 0xbc
Manufacturer: Cambridge Silicon Radio (10)

# lsusb | grep Cambridge
Bus 003 Device 002: ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle 
(HCI mode)

This device works fine in 2.6.23.1 and got broken circa 2.6.24 rcs.

Thank you,
--
kk1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/28] Swap over NFS -v16

2008-02-26 Thread Miklos Szeredi
> > > > mm-page_file_methods.patch
> > > > 
> > > > This makes page_offset and others more expensive by adding a
> > > > conditional jump to a function call that is not usually made.
> > > > 
> > > > Why do swap pages have a different index to everyone else?
> > > 
> > > Because the page->index of an anonymous page is related to its (anon)vma
> > > so that it satisfies the constraints for vm_normal_page().
> > > 
> > > The index in the swap file it totally unrelated and quite random. Hence
> > > the swap-cache uses page->private to store it in.
> > 
> > Yeah, and putting the condition into page_offset() will confuse code
> > which uses it for finding the offset in the VMA or in a tmpfs file.
> > 
> > So why not just have a separate page_swap_offset() function, used
> > exclusively by swap_in/out()?
> 
> Ah, we can do the page_file_offset() to match page_file_index() and
> page_file_mapping(). And convert NFS to use page_file_offset() where
> appropriate, as I already did for these others.
> 
> That would sort out the mess, right?

Yes, that sounds perfect.

Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal

2008-02-26 Thread Alan Stern
On Tue, 26 Feb 2008, Rafael J. Wysocki wrote:

> > > IMO the device driver should assure that no new children will be 
> > > registered
> > > concurrently with the ->suspend() method (IOW, ->suspend() should wait for
> > > all such registrations to complete and should prevent any new ones from
> > > being started) and it should make it impossible to register any new 
> > > children
> > > after ->suspend() has run.  It's the driver's problem how to achieve that.
> > 
> > Exactly; this has to be added to the PM documentation.
> 
> Into Documentation/power/devices.txt, I gather?

Yes.

> > > > The PM core could help detect errors here.  If it tries to suspend a 
> > > > device and sees that the device's parent is already suspended, then the 
> > > > parent's driver has a bug.
> > > 
> > > Yes, I think we ought to fail the suspend in such cases.  Still, that's 
> > > not
> > > sufficient to prevent a child from being registered after we've run
> > > dpm_suspend().  For this reason, we could also leave dpm_suspend() with
> > > dpm_list_mtx held and not release it until the next dpm_resume() is run.
> > 
> > The pm_sleep_rwsem will do a better job of catching such errors.
> 
> But we should not leave a window between releasing dpm_list_mtx and taking
> pm_sleep_rwsem.  Either that, or we should make sure that dpm_active is
> empty after acquiring pm_sleep_rwsem.

I've got some ideas on how to implement this.

We can add a new field "suspend_called" to dev->power.  It would be
owned by the PM core (protect by dpm_list_mtx) and read-only to
drivers.  Normally it will contain 0, but when the suspend method is
running we set it to SUSPEND_RUNNING and when the method returns
successfully we set it to SUSPEND_DONE.  Before calling the resume
method we set it back to 0.  Drivers can use this field as an easy way
of checking that all the child devices have been suspended.

When a new device is registered we check its parent's suspend_called
value.  If it is SUSPEND_DONE then the caller has a bug and we have to
fail the registration.  If it is SUSPEND_RUNNING then the registration
is legal, but we remember what happened.  Then when the
currently-running suspend method returns and we reacquire the
dpm_list_mtx, we will realize that a race was lost.  If the method
completed successfully (which it shouldn't) we can resume that device
immediately without ever taking it off the dpm_active list; but either
way we should continue the suspend loop.  Now the new child will be at
the end of the dpm_active_list, so it will be suspended before the
parent is reached again.

This way we can recover from drivers that are willing to suspend their 
device even though there are unsuspended children.  The only drawback 
will be that for a short time the child will be active while its parent 
is suspended.

We should not abort the entire sleep transition simply because we lost 
a race.  With this scheme we won't even need the pm_sleep_rwsem; the 
dpm_list_mtx will provide all the necessary protection.

This is more intricate than it should be.  It would have been better to
have had "disable_new_children" and "enable_new_children" methods from
the beginning; then there wouldn't be any races at all.  That's life...

The one tricky thing to watch out for is when a suspend or resume 
method wants to unregister the device being suspended or resumed.  Even 
that should be doable (set suspend_called to UNREGISTERED or something 
like that).

> > > That will potentially cause some trouble to CPU hotplug cotifiers, but we 
> > > can
> > > handle that, for example, by using the in_suspend_context() test.
> > 
> > Do they need to register new CPUs at some point?  There ought to be a 
> > way to handle that.
> 
> No, they don't, but there are some CPU-related device objects that get
> uregistered/registered.  Still, all of this work is really redundant if the 
> CPU
> in question comes back up during the resume, so it should be avoided in
> general.  The CPU hotplug notifiers should only unregister those objects if
> the CPU hasn't gone on line during the resume and they have all information
> necessary for discovering that.

Unregistration should always be allowed, and registration should be 
allowed whenever the parent isn't suspended.  For devices with no 
parent, we can imagine there is a fictitious parent at the root of the 
device tree.  Conceptually it gets suspended after every real device 
and resumed before.  Maybe even before dpm_power_up(), meaning that 
devices with no parent could be registered by a resume_early method.

When your lock-removal stuff gets into Greg's tree, I'll write all 
this.  Sound good?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/28] Swap over NFS -v16

2008-02-26 Thread Peter Zijlstra

On Tue, 2008-02-26 at 16:29 +0100, Miklos Szeredi wrote:
> > > mm-page_file_methods.patch
> > > 
> > > This makes page_offset and others more expensive by adding a
> > > conditional jump to a function call that is not usually made.
> > > 
> > > Why do swap pages have a different index to everyone else?
> > 
> > Because the page->index of an anonymous page is related to its (anon)vma
> > so that it satisfies the constraints for vm_normal_page().
> > 
> > The index in the swap file it totally unrelated and quite random. Hence
> > the swap-cache uses page->private to store it in.
> 
> Yeah, and putting the condition into page_offset() will confuse code
> which uses it for finding the offset in the VMA or in a tmpfs file.
> 
> So why not just have a separate page_swap_offset() function, used
> exclusively by swap_in/out()?

Ah, we can do the page_file_offset() to match page_file_index() and
page_file_mapping(). And convert NFS to use page_file_offset() where
appropriate, as I already did for these others.

That would sort out the mess, right?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Proposal for "proper" durable fsync() and fdatasync()

2008-02-26 Thread Jamie Lokier
Ric Wheeler wrote:
> >>I was surprised that fsync() doesn't do this already.  There was a lot
> >>of effort put into block I/O write barriers during 2.5, so that
> >>journalling filesystems can force correct write ordering, using disk
> >>flush cache commands.
> >>
> >>After all that effort, I was very surprised to notice that Linux 2.6.x
> >>doesn't use that capability to ensure fsync() flushes the disk cache
> >>onto stable storage.
> >
> >It's surprising you are surprised, given that this [lame] fsync behavior 
> >has remaining consistently lame throughout Linux's history.
> 
> Maybe I am confused, but isn't this is what fsync() does today whenever 
> barriers are enabled (the fsync() invalidates the drive's write cache).

No, fsync() doesn't always flush the drive's write cache.  It often
does, any I think many people are under the impression it always does,
but it doesn't.

Try this code on ext3:

fd = open ("test_file", O_RDWR | O_CREAT | O_TRUNC, 0666);
while (1) {
char byte;
usleep (10);
pwrite (fd, , 1, 0);
fsync (fd);
}

It will do just over 10 write ops per second on an idle system (13 on
mine), and 1 flush op per second.

That's because ext3 fsync() only does a journal commit when the inode
has changed.  The inode mtime is changed by write only with 1 second
granularity.  Without a journal commit, there's no barrier, which
translates to not flushing disk write cache.

If you add "fchmod (fd, 0644); fchmod (fd, 0664);" between the write
and fsync, you'll see at least 20 write ops and 20 flush ops per
second, and you'll here the disk seeking more.  That's because the
fchmod dirties the inode, so fsync() writes the inode with a journal
commit.

It turns out even _that_ is not sufficient according to the kernel
internals.  A journal commit uses an ordered request, which isn't the
same as a flush potentially, it just happens to use flush in this
instance.  I'm not sure if ordered requests are actually implemented
by any drivers at the moment.  If not now, they will be one day.

We could change ext3 fsync() to always do a journal commit, and depend
on the non-existence of block drivers which do ordered (not flush)
barrier requests.  But there's lots of things wrong with that.  Not
least, it sucks performance for database-like applications and virtual
machines, a lot due to unnecessary seeks.  That way lies wrongness.

Rightness is to make fdatasync() work well, with a genuine flush (or
equivalent (see FUA), only when required, and not a mere ordered
barrier), no inode write, and to make sync_file_range()[*] offer the
fancier applications finer controls which reflect what they actually
need.

[*] - or whatever.

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] 2.6.25-rc2-mm1 - fix mcount GPL bogosity.

2008-02-26 Thread Krzysztof Halasa
"David Schwartz" <[EMAIL PROTECTED]> writes:

> I don't know who told you that or why, but it's obvious nonsense,

Correct.

> Exports should be marked GPL if and only if they cannot be used
> except in a derivative work. If it is possible to use them without taking
> sufficient protectable expression, they should not be marked GPL.

This isn't very obvious to me.

The licence doesn't talk about GPL or non-GPL exports. It doesn't
restrict the use, only distribution of the software. One is free to
remove _GPL from the code and distribute it anyway (except perhaps for
some DMCA nonsense).

If a code is a derivative work it has to be distributed (use is not
restricted) under GPL, EXPORT _GPL or not _GPL.

One may say _GPL is a strong indication that all users are
automatically a derivative works, but it's only that - indication. It
doesn't mean they are really derivative works and it doesn't mean a
module not using any _GPL exports isn't a derivative.

I think introducing these _GPL symbols was a mistake in the first place.
-- 
Krzysztof Halasa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/28] Swap over NFS -v16

2008-02-26 Thread Peter Zijlstra

On Tue, 2008-02-26 at 16:29 +0100, Miklos Szeredi wrote:
> > > mm-page_file_methods.patch
> > > 
> > > This makes page_offset and others more expensive by adding a
> > > conditional jump to a function call that is not usually made.
> > > 
> > > Why do swap pages have a different index to everyone else?
> > 
> > Because the page->index of an anonymous page is related to its (anon)vma
> > so that it satisfies the constraints for vm_normal_page().
> > 
> > The index in the swap file it totally unrelated and quite random. Hence
> > the swap-cache uses page->private to store it in.
> 
> Yeah, and putting the condition into page_offset() will confuse code
> which uses it for finding the offset in the VMA 

Right, do we do that anywhere?

> or in a tmpfs file.

Good point. I really should go read tmpfs some day, its really a blind
spot for me.

> So why not just have a separate page_swap_offset() function, used
> exclusively by swap_in/out()?

That would require duplicating quite a lot of NFS code from what I can
see.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[BUG] Potential data corruption when splice data spliced from socket to another socket

2008-02-26 Thread Changli Gao
After reviewing the tcp splice receive code, I found that instead of
increasing the page reference counter, pipe buffer holds the socket
buffer by calling skb_get(skb). When you splice this pipe buffer to
another socket, such as a TCP socket, though the function sendpage
returns, the page buffer will be still in use, then you drop the
reference to the skb, so the buffer is free to another process. At
this time, the buffer is shared between socket and another part of
Linux kernel silently. It is possible that the data sent out is
corrupted.

The reason is splice send process knows nothing but page, so before
submitting the buffer to sendpage, we must ensure that the page is an
actual page not a fake one. A solution is adding a member function
get_page, which is used to get a actual page, to structure
pipe_buffer_operations. It the page in structure pipe_buffer isn't an
actual page, a page will be allocated, filled with the corresponding
data and returned. Before calling sendpage, get_page should be called
to get the actual page, and after calling sendpage, the page will be
freed by calling put_page.

Beside splice send process, other code paths maybe have the same problem.

-- 
Regards,
Changli Gao([EMAIL PROTECTED])
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24.2-rt2

2008-02-26 Thread Jan Kiszka
Steven Rostedt wrote:
> 
> On Tue, 26 Feb 2008, Jan Kiszka wrote:
> 
>> Jan Kiszka wrote:
>>> At this chance: We still see the same unbalanced sched-other load on our
>>> NUMA box as Gernot once reported [1]:
>>>
>>> top - 11:19:20 up 4 min,  1 user,  load average: 29.52, 9.54, 3.37
>>> Tasks: 502 total,  41 running, 461 sleeping,   0 stopped,   0 zombie
>>> Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu1  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu4  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu12 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu13 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu14 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Cpu15 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>>> 0.0%st
>>> Mem:  65513284k total,  1032032k used, 64481252k free, 6444k buffers
>>> Swap:  3204896k total,0k used,  3204896k free,37312k cached
>>>
>> ETOOMANYKERNELS, this was from 2.6.23.12-rt14. 2.6.24.2-rt2 shows a
>> different patter under identical load:
> 
> There has been CFS updates, which may account for the differences. Seems
> better though.
> 
>> top - 12:55:27 up 2 min,  1 user,  load average: 9.97, 2.42, 0.81
>> Tasks: 491 total,  42 running, 449 sleeping,   0 stopped,   0 zombie
>> Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu1  : 99.7%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  
>> 0.0%st
>> Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu4  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu5  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu6  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu7  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu9  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu10 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu12 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu13 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu14 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Cpu15 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
>> 0.0%st
>> Mem:  65512480k total,   580704k used, 64931776k free, 8964k buffers
>> Swap:  3204896k total,0k used,  3204896k free,   129720k cached
>>
> 
> What's the NUMA topology?

4 nodes. I'm not sure if it is really NUMA related, but the same kernel
runs that test as expected on a non-NUMA 2x2 box.

> What tasks are running, and at what priorities?

40 pthreads, created with default parameters from a main thread which
runs with default parameters as well. The threads simply run endless loops.

> 
> Those three idle CPUS, should they have tasks running on them?

For sure, given the overload situation of the system (40x full load vs.
16 cores). Neither did we fiddle with any parameter of the system
(knowingly, its a standard openSUSE 10.3 underneath) nor did we set
thread affinities.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.25-rc[12] Video4Linux Bttv Regression

2008-02-26 Thread Robert Fitzsimons
> Bisecting this won't be that easy. The support for the depreciated V4L1 API
> were removed from bttv driver. Now, it uses v4l1-compat module, that 
> translates
> a V4L1 call into a V4L2 one. I'll try to seek for troubles at the current 
> code.

I think I might have seen this problem but it didn't cause a oops for
me, just that the radio program would hang waiting for the ioctl syscall
to return.  I did tried looking for a new radio program that used the
V4L2 API but couldn't find one.  I'll have a more in-depth look at the
bttv driver when I get home tonight.

Robert

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.25-rc1] jerky mouse cursor and randoooom key repeats

2008-02-26 Thread Chris Holvenstot
Jiri - 

For what it is worth, and understand that it is hard to prove a
negitive, on slack moments over the weekend I repeatedly booted my
system into single user (console) mode using a kernel with
CONFIG_GROUP_SCHED set to yes.

To date I have NOT been able to recreate the repeating key issue outside
of X.

Chris


On Tue, 2008-02-26 at 16:05 +0100, Jiri Kosina wrote:
> On Tue, 26 Feb 2008, Lennart Sorensen wrote:
> 
> > Hmm, I have been seeing repeated keys a lot under X on my athlon 700, 
> > but mainly when I have firefox running (which is of course quite a load 
> > on the poor old thing).  This has been going on for probably the last 
> > year or so.  I thought it was just the machine getting weird, although 
> > whenever it wasn't running firefox or other memory/cpu heavy loads it 
> > seemed fine.
> 
> This could be caused by the fact that as far as I know, X are not using 
> kernel-autorepeat, but they are handling it themselves, right? So if their 
> sense of time (probably due to some change of kernel timekeeping) gets 
> wrong, the autorepeat in X might also get wrong.
> 
> It would be nice to know if when you hit the situation when autorepeat 
> goes strange in X, if it is still OK in console.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[2.6.25 patch] drivers/crypto/hifn_795x.c: fix 64bit division

2008-02-26 Thread Adrian Bunk
On Tue, Feb 26, 2008 at 01:21:00PM +0100, Martin Michlmayr wrote:
> With 2.6.25-rc3 and a config file with
> 
> CONFIG_CRYPTO_DEV_HIFN_795X=m
> CONFIG_CRYPTO_DEV_HIFN_795X_RNG=y
> 
> I get the following build error on at least ARM and MIPS:
> 
>   Building modules, stage 2.
>   MODPOST 759 modules
> ERROR: "__divdi3" [drivers/crypto/hifn_795x.ko] undefined!

Fix below.

> Martin Michlmayr

cu
Adrian


<--  snip  -->


Using ndelay() with a 64bit variable as parameter can result in build 
errors like the following on some 32bit systems when it results in a 
64bit division:

<--  snip  -->

 ...
  MODPOST 759 modules
ERROR: "__divdi3" [drivers/crypto/hifn_795x.ko] undefined!

<--  snip  -->

Reported by Martin Michlmayr.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

40b45041ddc587c20b872a86a6a36952c28b02c7 diff --git 
a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
index 3110bf7..b1541c6 100644
--- a/drivers/crypto/hifn_795x.c
+++ b/drivers/crypto/hifn_795x.c
@@ -807,7 +807,7 @@ static int hifn_rng_data_present(struct hwrng *rng, int 
wait)
return 1;
if (!wait)
return 0;
-   ndelay(nsec);
+   ndelay((u32)nsec);
return 1;
 }
 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/28] Swap over NFS -v16

2008-02-26 Thread Miklos Szeredi
> > mm-page_file_methods.patch
> > 
> > This makes page_offset and others more expensive by adding a
> > conditional jump to a function call that is not usually made.
> > 
> > Why do swap pages have a different index to everyone else?
> 
> Because the page->index of an anonymous page is related to its (anon)vma
> so that it satisfies the constraints for vm_normal_page().
> 
> The index in the swap file it totally unrelated and quite random. Hence
> the swap-cache uses page->private to store it in.

Yeah, and putting the condition into page_offset() will confuse code
which uses it for finding the offset in the VMA or in a tmpfs file.

So why not just have a separate page_swap_offset() function, used
exclusively by swap_in/out()?

Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.24.3 (if_addrlabel.h HEADERS_CHECK failure)

2008-02-26 Thread Stephen Hemminger
On Tue, 26 Feb 2008 14:38:47 +
Daniel Drake <[EMAIL PROTECTED]> wrote:

> Randy Dunlap wrote:
> >> We (the -stable team) are announcing the release of the 2.6.24.3
> >> kernel.
> > 
> > When HEADERS_CHECK=y:
> > 
> > make[3]: *** No rule to make target 
> > `/local/linsrc/linux-2.6.24.3/include/linux/if_addrlabel.h', needed by 
> > `/local/linsrc/linux-2.6.24.3/usr/include/linux/if_addrlabel.h'.  Stop.
> > make[2]: *** [linux] Error 2
> 
> This appears to have been caused by the patch titled:
> 
>   NET: Add if_addrlabel.h to sanitized headers.
> 
> The patch only adds the unifdef-y entry for this header file, however 
> that header was only added after 2.6.24.
> 
> It seems that this patch was submitted to -stable in error. Stephen, can 
> you confirm?

The patch was meant for 2.6.25 only. 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Proposal for "proper" durable fsync() and fdatasync()

2008-02-26 Thread Jamie Lokier
Jörn Engel wrote:
> On Tue, 26 February 2008 20:16:11 +1100, Nick Piggin wrote:
> > Yeah, sync_file_range has slightly unusual semantics and introduce
> > the new concept, "writeout", to userspace (does "writeout" include
> > "in drive cache"? the kernel doesn't think so, but the only way to
> > make sync_file_range "safe" is if you do consider it writeout).
> 
> If sync_file_range isn't safe, it should get replaced by a noop
> implementation.  There really is no point in promising "a little"
> safety.

Sometimes there is a point in "a little" safety.

There's a spectrum of durability (meaning how safely stored the data
is).  In the cases we're imagining, it's application -> main memory
cache -> disk cache -> disk surface.  There are others.

_None_ of those provide perfect safety for your data.  They are a
spectrum, and how far along you want data to be committed before you
say "fine, the data is safe enough for me" depends on your application.

For example, there are users who like to turn _off_ fdatasync() with
their SQL database of choice.  They prefer speed over safety, and they
don't mind losing an hour's data and doing regular backups (we assume
;-) Some blogs fall into this category; who cares if a rare crash
costs you a comment or two and a restore from backup; it's acceptable
for the speed.

There's users who would really like fdatasync() to commit data to the
drive platters, so after their database says "done", they are very
confident that a power failure won't cause committed data to be lost.
Accepting credit cards is more at this end.  So should be anyone using
a virtual machine of any kind without a journalling fs in the guest!

And there's users who like it where it is right now: a compromise,
where a system crash won't lose committed data; but a power failure
might.  (I'm making assumptions about drive behaviour on reset here.)

My problem with fdatasync() at the moment is, I can't choose what I
want from it, and there's no mechanism to give me the safest option.
Most annoyingly, in-kernel filesystems _do_ have a mechanism; it just
isn't exported to userspace.

(A quick aside: fdatasync() et al. are actually used for two
_different_ things.  1: A program says "I've written it", it can say
so with confidence, e.g. announcing email receipt.  2: It's used for
write ordering with write-ahead logging: write, fdatasync, write.
When you tease at the details, efficient implementations of them are
different...  Think SCSI tagged commands versus cache flushes.)

> One interesting aspect of this comes with COW filesystems like btrfs or
> logfs.  Writing out data pages is not sufficient, because those will get
> lost unless their referencing metadata is written as well.  So either we
> have to call fsync for those filesystems or add another callback and let
> filesystems override the default implementation.

Doesn't the ->fsync callback get called in the sys_fdatasync() case,
with appropriate arguments?

With barriers/flushes it certainly makes those a bit more complicated.
You have to flush not just the disks with data pages, but the _other_
disks in a software RAID with data pointer metadata pages, but ideally
not all of them (think database journal commit).

That can be implemented with per-buffer pending-barrier/flush flags
(like I described for pages in the first mail), which are equally
useful when a database-like application uses a block device.

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] mmiotrace full patch, preview 1

2008-02-26 Thread Andy Whitcroft
On Tue, Feb 26, 2008 at 11:49:48AM +0100, Ingo Molnar wrote:
> 
> * Andy Whitcroft <[EMAIL PROTECTED]> wrote:
> 
> > Ok, so that would be the following, work for everyone?
> > 
> > WARNING: mutexes are preferred for single holder semaphores
> > #1: FILE: Z95.c:1:
> > +   DECLARE_MUTEX();
> > 
> > WARNING: mutexes are preferred for single holder semaphores
> > #3: FILE: Z95.c:3:
> > +   init_MUTEX();
> 
> yeah.
> 
>   Acked-by: Ingo Molnar <[EMAIL PROTECTED]>
> 
> also i guess init_MUTEX_LOCKED() should emit a "this should be a 
> completion" warning.

Thats easy enough.  Though your tone here implies its less definatly
wrong than the other use forms.  Do we want gentle language here?

"consider using a completion"

> i guess non-DEFINE_SPINLOCK old-style spinlock definition:
> 
>   spinlock_t lock = SPIN_LOCK_UNLOCKED;
> 
> should emit a 'use DEFINE_SPINLOCK' warning as well?

Those (SPIN_LOCK_UNLOCKED & RW_LOCK_UNLOCKED) we already pick up and
indicate are deprecated.

-apw
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[2.6.25-rc2-mm1] Oops in __kmalloc

2008-02-26 Thread Jiri Slaby
Hi,

while booting up a notebook on 32 bit, this oopses appeared on the console
after ext3 fsck:
http://www.fi.muni.cz/~xslaby/sklad/mem_oops/

It's 2.6.25-rc2-mm1, I can't find similar reports, is this known or hardware
issue (unlikely, 2.6.24.2 seems to be OK)?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24.2-rt2

2008-02-26 Thread Steven Rostedt


On Tue, 26 Feb 2008, Jan Kiszka wrote:

> Jan Kiszka wrote:
> > At this chance: We still see the same unbalanced sched-other load on our
> > NUMA box as Gernot once reported [1]:
> >
> > top - 11:19:20 up 4 min,  1 user,  load average: 29.52, 9.54, 3.37
> > Tasks: 502 total,  41 running, 461 sleeping,   0 stopped,   0 zombie
> > Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu1  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu4  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu12 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu13 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu14 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Cpu15 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  
> > 0.0%st
> > Mem:  65513284k total,  1032032k used, 64481252k free, 6444k buffers
> > Swap:  3204896k total,0k used,  3204896k free,37312k cached
> >
>
> ETOOMANYKERNELS, this was from 2.6.23.12-rt14. 2.6.24.2-rt2 shows a
> different patter under identical load:

There has been CFS updates, which may account for the differences. Seems
better though.

>
> top - 12:55:27 up 2 min,  1 user,  load average: 9.97, 2.42, 0.81
> Tasks: 491 total,  42 running, 449 sleeping,   0 stopped,   0 zombie
> Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu1  : 99.7%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu4  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu5  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu6  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu7  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu9  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu10 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu12 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu13 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu14 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu15 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:  65512480k total,   580704k used, 64931776k free, 8964k buffers
> Swap:  3204896k total,0k used,  3204896k free,   129720k cached
>

What's the NUMA topology? What tasks are running, and at what priorities?

Those three idle CPUS, should they have tasks running on them?

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUILD_FAILURE] Linux 2.6.25-rc3 - various unexported functions () on powerpc

2008-02-26 Thread Adrian Bunk
On Tue, Feb 26, 2008 at 07:59:08PM +0530, Kamalesh Babulal wrote:
> Hi,
> 
> The 2.6.25-rc3 kernel build fails on powerpc with allyesconfig config option,
> the .config has been attached.
>...

Builds fine here.

Local problem (e.g. disk full) on your machine?

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [(RT RFC) PATCH v2 6/9] add a loop counter based timeout mechanism

2008-02-26 Thread Gregory Haskins
>>> On Mon, Feb 25, 2008 at  5:06 PM, in message
<[EMAIL PROTECTED]>, Pavel Machek <[EMAIL PROTECTED]> wrote: 
> 
> I believe you have _way_ too many config variables. If this can be set
> at runtime, does it need a config option, too?

Generally speaking, I think until this algorithm has an adaptive-timeout in 
addition to an adaptive-spin/sleep, these .config based defaults are a good 
idea.  Sometimes setting these things at runtime are a PITA when you are 
talking about embedded systems that might not have/want a nice userspace 
sysctl-config infrastructure.  And changing the defaults in the code is 
unattractive for some users.  I don't think its a big deal either way, so if 
people hate the config options, they should go.  But I thought I would throw 
this use-case out there to ponder.

Regards,
-Greg

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: epoll design problems with common fork/exec patterns

2008-02-26 Thread Michael Kerrisk


Davide Libenzi wrote:
> On Sun, 28 Oct 2007, David Schwartz wrote:
> 
>> Eric Dumazet wrote:
>>
>>> Events are not necessarly reported "by descriptors". epoll uses an opaque
>>> field provided by the user.
>>>
>>> It's up to the user to properly chose a tag that will makes sense
>>> if the user
>>> app is playing dup()/close() games for example.
>> Great. So the only issue then is that the documentation is confusing. It
>> frequently uses the term "fd" where it means file. For example, it says:
>>
>>   Q1 What  happens  if  you  add  the  same fd to an
>> epoll_set
>>  twice?
>>
>>   A1 You will probably get EEXIST.  However,  it  is
>> possible
>>  that  two  threads  may  add the same fd twice. This is
>> a
>>  harmless condition.
>>
>> This gives no reason to think there's anything wrong with adding the same
>> file twice so long as you do so through different descriptors. (One can
>> imagine an application that does this to segregate read and write operations
>> to avoid a race where the descriptor is closed from under a writer due to
>> handling a fatal read error.) Obviously, that won't work.
> 
> I agree, that is confusing. However, you can safely add two different file 
> descriptors pointing to the same file*, with different event masks, and 
> that will work as expected.

So can I summarize what I understand:

a) Adding the same file descriptor twice to an epoll set will cause an
error (EEXIST).

b) In a separate message to linux-man, Chris Heath says that two threads
*can't* add the same fd twice to an epoll set, despite what the existing
man page text says.  I haven't tested that, but it sounds to me as though
it is likely to be true.  Can you comment please Davide?

c) It is possible to add duplicated file descriptors referring to the same
underlying open file description ("file *").  As you note, this can be a
useful filtering technique, if the two file descriptors specify different
masks.

Assuming that is all correct, for man-pages-2.79, I've reworked the text
for Q1/A1 as follows:

   Q1 What  happens  if you add the same file descriptor
  to an epoll set twice?

   A1 You will probably get EEXIST.  However, it is pos-
  sible   to   add  a  duplicate  (dup(2),  dup2(2),
  fcntl(2) F_DUPFD, fork(2)) descriptor to the  same
  epoll  set.   This  can  be a useful technique for
  filtering events, if the duplicate  file  descrip-
  tors are registered with different events masks.

Seem okay Davide?

Cheers,

Michael

PS I've trimmed the part of this thread about Q6/A6, since I dealt with
that in another thread ("epoll and shared fd's").

-- 
Michael Kerrisk
Maintainer of the Linux man-pages project
http://www.kernel.org/doc/man-pages/
Want to report a man-pages bug?  Look here:
http://www.kernel.org/doc/man-pages/reporting_bugs.html

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: epoll and shared fd's

2008-02-26 Thread Michael Kerrisk
Following up after quite some time:

Davide Libenzi wrote:
> On Sat, 26 Jan 2008, Michael Kerrisk wrote:
> 
>> On Jan 25, 2008 12:57 AM, Davide Libenzi <[EMAIL PROTECTED]> wrote:
>>> On Thu, 24 Jan 2008, Pierre Habouzit wrote:
>>>
 On Fri, Jan 18, 2008 at 09:10:18PM +, Davide Libenzi wrote:
> On Fri, 18 Jan 2008, Pierre Habouzit wrote:
>
>>   Hi,
>>
>>   I just came across a strange behavior of epoll that seems to
>> contradict the documentation. Here is what happens:
>>
>> * I have two processes P1 and P2, P1 accept()s connections, and send the
>>   resulting file descriptors to P2 through a unix socket.
>>
>> * P2 registers the received socket in his epollfd.
>>
>>   [time passes]
>>
>> * P2 is done with the socket and closes it
>>
>> * P2 gets events for the socket again !
>>
>>
>>   Though the documentation says that if a process closes a file
>> descriptor, it gets unregistered. And yes I'm sure that P2 doens't dup()
>> the file descriptor. Though (because of a bug) it was still open in
>> P1[0], hence the referenced socket still live at the kernel level.
>>
>>   Of course the userland workaround is to force the EPOLL_CTL_DEL before
>> the close, which I now do, but costs me a syscall where I wanted to
>> spare one :|
> For epoll, a close is when the kernel file* is released (that is, when all
> its instances are gone).
> We could put a special handling in filp_close(), but I don't think is a
> good idea, and we're better live with the current behaviour.
   Okay, maybe updating the linux manpages to be more clear about that is
 the way to go then. Thanks
>>> Sure. I'll send Michael Kerrisk and updated statement for the A6 answer in
>>> the epoll man page.
>> Thanks Davide -- yes please send me a patch.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to [EMAIL PROTECTED]
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>>
> 
> Something like the one below ...
> 
> 
> - Davide
> 
> 
> 
> --- epoll.4   2008-01-26 12:58:21.0 -0800
> +++ epoll.4.new   2008-01-26 13:06:36.0 -0800
> @@ -285,7 +285,19 @@
>  sets automatically?
>  .TP
>  .B A6
> -Yes.
> +A file descriptor is the userspace counterpart of an internal kernel handle.
> +Every time a process calls functions liks
> +.BR dup (2),
> +.BR dup2 (2)
> +or
> +.BR fork (2),
> +a new file descriptor referring to the same internal kernel handle is
> +created. The internal kernel handle remains alive until all the userspace
> +file descriptors have been closed.
> +The
> +.BR epoll (4)
> +interface automatically removes the internal kernel handle from the set,
> +once all the file descriptor instances have been closed.
>  .TP
>  .B Q7
>  If more than one event occurs between

Davide,

Two points.

a) I did a

s/internal kernel handle/open file description/

since that is the POSIX term for the internal handle.

b) It seems to me that you text doesn't quite make the point explicit
enough.  I've tried to rewrite it; could you please check:

   A6 Yes, but be aware of the following point.  A  file
  descriptor is a reference to an open file descrip-
  tion (see  open(2)).   Whenever  a  descriptor  is
  duplicated  via dup(2), dup2(2), fcntl(2) F_DUPFD,
  or fork(2), a new file descriptor referring to the
  same  open  file  description is created.  An open
  file description continues to exist until all file
  descriptors referring to it have been closed.  The
  epoll  interface  automatically  removes  a   file
  descriptor  from  an  epoll set only after all the
  file descriptors referring to the underlying  open
  file  handle  have  been  closed.  This means that
  even after a file descriptor that is  part  of  an
  epoll  set has been closed, events may be reported
  for that file descriptor if other file descriptors
  referring  to the same underlying file description
  remain open.

Does that seem okay?  I plan to include the text in man-pages-2.79.

Was there some reason why removing a file descriptor couldn't have been
made to do the "expected" thing (i.e., remove notifications for that file
descriptor, regardless of whether the underlying file description remains
open)?

Cheers,

Michael

-- 
Michael Kerrisk
Maintainer of the Linux man-pages project
http://www.kernel.org/doc/man-pages/
Want to report a man-pages bug?  Look here:
http://www.kernel.org/doc/man-pages/reporting_bugs.html


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  

Re: Proposal for "proper" durable fsync() and fdatasync()

2008-02-26 Thread Ric Wheeler

Jeff Garzik wrote:

Jamie Lokier wrote:

By durable, I mean that fsync() should actually commit writes to
physical stable storage,


Yes, it should.



I was surprised that fsync() doesn't do this already.  There was a lot
of effort put into block I/O write barriers during 2.5, so that
journalling filesystems can force correct write ordering, using disk
flush cache commands.

After all that effort, I was very surprised to notice that Linux 2.6.x
doesn't use that capability to ensure fsync() flushes the disk cache
onto stable storage.


It's surprising you are surprised, given that this [lame] fsync behavior 
has remaining consistently lame throughout Linux's history.


Maybe I am confused, but isn't this is what fsync() does today whenever 
barriers are enabled (the fsync() invalidates the drive's write cache).


ric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [(RT RFC) PATCH v2 5/9] adaptive real-time lock support

2008-02-26 Thread Gregory Haskins
>>> On Mon, Feb 25, 2008 at  5:03 PM, in message
<[EMAIL PROTECTED]>, Pavel Machek <[EMAIL PROTECTED]> wrote: 

>> +static inline void
>> +prepare_adaptive_wait(struct rt_mutex *lock, struct adaptive_waiter 
> *adaptive)
> ...
>> +#define prepare_adaptive_wait(lock, busy) {}
> 
> This is evil. Use empty inline function instead (same for the other
> function, there you can maybe get away with it).
> 

I went to implement your suggested change and I remembered why I did it this 
way:  I wanted a macro so that the "struct adaptive_waiter" local variable will 
fall away without an #ifdef in the main body of code.  So I have left this 
logic alone for now.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Proposal for "proper" durable fsync() and fdatasync()

2008-02-26 Thread Jamie Lokier
Jörn Engel wrote:
> On Tue, 26 February 2008 20:16:11 +1100, Nick Piggin wrote:
> > 
> > Yeah, sync_file_range has slightly unusual semantics and introduce
> > the new concept, "writeout", to userspace (does "writeout" include
> > "in drive cache"? the kernel doesn't think so, but the only way to
> > make sync_file_range "safe" is if you do consider it writeout).
> 
> If sync_file_range isn't safe, it should get replaced by a noop
> implementation.  There really is no point in promising "a little"
> safety.
> 
> One interesting aspect of this comes with COW filesystems like btrfs or
> logfs.  Writing out data pages is not sufficient, because those will get
> lost unless their referencing metadata is written as well.  So either we
> have to call fsync for those filesystems or add another callback and let
> filesystems override the default implementation.

fdatasync() is required to write data pages _and_ the necessary
metadata to reference those changed pages (btrfs tree etc.), but not
non-data metadata.

It's the filesystem's responsibility to interpret that correctly.
In-place writes don't need anything else.  Phase-tree style writes do.
Some kinds of logged writes don't.

I'm under the impression that sync_file_range() is a sort of
restricted-range asynchronous fdatasync().

By limiting the range of file date which must be written out, it
becomes more refined for database and filesystem-in-a-file type
applications.  Just as fsync() is more refined than sync() - it's
useful to sync less - same goes for syncing just part of a file.

It's still the filesystem's responsibility to sync data access
metadata appropriately.  It can sync more if it wants, but not less.

That's what I understand by
   sync_file_range(fd, start,length, SYNC_FILE_RANGE_WRITE_BEFORE
   | SYNC_FILE_RANGE_WRITE
   | SYNC_FILE_RANGE_WRITE_AFTER);
Largely because the manual says to use that combination of flags for
an equivalent to fdatasync().

The concept of "write-out" is not defined in the manual.  I'm assuming
it to mean this, as a reasonable guess:

SYNC_FILE_RANGE_WRITE scans all pages in the range, looking for dirty
pages which aren't already queued for write-out.  It marks those with
a "write-out" flag, and starts write I/Os at some unspecified time in
the near future; it can be assumed writes for all the pages will
complete eventually if there's no errors.  When I/O completes on a
page, it cleans the page and also clears the write-out flag.

SYNC_FILE_RANGE_WAIT_AFTER waits until all pages in the range don't
have the "write-out" flag set.

SYNC_FILE_RANGE_WAIT_BEFORE does the same wait, but before marking
pages for write-out.  I don't actually see the point in this.  Isn't a
preceding call with SYNC_FILE_RANGE_WAIT_AFTER equivalent, making
BEFORE a redundant flag?

The manual says it is something to do with data-integrity, but it's
not clear to me what that means.

All this implies that "write-out" flag is a concept userspace can rely
on.  That's not so peculiar: WRITE seems to be equivalent to AIO-style
fdatasync() on a limited range of offsets, and WAIT_AFTER seems to be
equivalent to waiting for any previously issued such ops to complete.

Any data access metadata updates that btrfs must make for fdatasync(),
it must also make for sync_file_range(), for the limited range of
offsets.

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.25-rc1] jerky mouse cursor and randoooom key repeats

2008-02-26 Thread Jiri Kosina
On Tue, 26 Feb 2008, Lennart Sorensen wrote:

> Hmm, I have been seeing repeated keys a lot under X on my athlon 700, 
> but mainly when I have firefox running (which is of course quite a load 
> on the poor old thing).  This has been going on for probably the last 
> year or so.  I thought it was just the machine getting weird, although 
> whenever it wasn't running firefox or other memory/cpu heavy loads it 
> seemed fine.

This could be caused by the fact that as far as I know, X are not using 
kernel-autorepeat, but they are handling it themselves, right? So if their 
sense of time (probably due to some change of kernel timekeeping) gets 
wrong, the autorepeat in X might also get wrong.

It would be nice to know if when you hit the situation when autorepeat 
goes strange in X, if it is still OK in console.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] fbdev: Make deferred I/O work as advertized

2008-02-26 Thread Jaya Kumar
On Tue, Feb 26, 2008 at 9:45 AM, Markus Armbruster <[EMAIL PROTECTED]> wrote:
>
>  What about pushing the fb_defio fixes independently of any new
>  fb_defio users?  If fb_defio was worth merging into Linus's tree, it
>  should be worth fixing there, whether new users are in shape already
>  or not.

I think that Andrew's message is saying that there may be a race
condition in the defio patch itself as opposed to the defio user
patch.

If there is no race condition or other problems, then I think it would
make sense to merge the defio patch independent of metronomefb or
other new patches that use defio.

Thanks,
jaya
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.25-rc1] jerky mouse cursor and randoooom key repeats

2008-02-26 Thread Lennart Sorensen
On Thu, Feb 21, 2008 at 03:45:00AM -0600, Chris Holvenstot wrote:
> Jiri - 
> 
> I am tempted to lie to you and say it was in both modes, or when running
> under X only, but to tell you the truth while I have not seen it while
> running in console mode, I have not spent enough time there to make a
> solid statement one way or the other.

Hmm, I have been seeing repeated keys a lot under X on my athlon 700,
but mainly when I have firefox running (which is of course quite a load
on the poor old thing).  This has been going on for probably the last
year or so.  I thought it was just the machine getting weird, although
whenever it wasn't running firefox or other memory/cpu heavy loads it
seemed fine.

The USB bus also keeps detecting and loosing a USB hub again and again,
so perhaps that is also a software problem and not actually the hardware
failing.

For example I continuously get this:

usb-storage: device found at 84
usb-storage: waiting for device to settle before scanning
usb-storage: device scan complete
usb 1-1: USB disconnect, address 82
usb 1-1.1: USB disconnect, address 83
usb 1-1.1.1: USB disconnect, address 84
usb 2-2: USB disconnect, address 44
usb 2-2: new full speed USB device using uhci_hcd and address 45
usb 2-2: configuration #1 chosen from 1 choice
hub 2-2:1.0: USB hub found
usb 1-1: new full speed USB device using uhci_hcd and address 85
usb 1-1: configuration #1 chosen from 1 choice
hub 1-1:1.0: USB hub found
usb 1-1.1: new full speed USB device using uhci_hcd and address 86
usb 1-1.1: configuration #1 chosen from 1 choice
hub 1-1.1:1.0: USB hub found
usb 1-1.1.1: new full speed USB device using uhci_hcd and address 87
usb 1-1.1.1: configuration #1 chosen from 1 choice
scsi20 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 87
usb-storage: waiting for device to settle before scanning
usb-storage: device scan complete
usb 1-1: USB disconnect, address 85
usb 1-1.1: USB disconnect, address 86
usb 1-1.1.1: USB disconnect, address 87
usb 1-1: new full speed USB device using uhci_hcd and address 88
usb 1-1: configuration #1 chosen from 1 choice
hub 1-1:1.0: USB hub found
usb 1-1.1: new full speed USB device using uhci_hcd and address 89
usb 1-1.1: configuration #1 chosen from 1 choice
hub 1-1.1:1.0: USB hub found
usb 1-1.1.1: new full speed USB device using uhci_hcd and address 90
usb 1-1.1.1: configuration #1 chosen from 1 choice
scsi21 : SCSI emulation for USB Mass Storage devices
usb-storage: device found at 90
usb-storage: waiting for device to settle before scanning
usb-storage: device scan complete
usb 1-1: USB disconnect, address 88
usb 1-1.1: USB disconnect, address 89
usb 1-1.1.1: USB disconnect, address 90
usb 2-2: USB disconnect, address 45
usb 2-2: new full speed USB device using uhci_hcd and address 46
usb 2-2: configuration #1 chosen from 1 choice
hub 2-2:1.0: USB hub found

I have never checked if it stops doing that when firefox and such
isn't running and the keyboard isn't repeating characters.

I wonder if my keyboard problems started around 2.6.18 or earlier than
that or later.  I should try booting an older kernel again if I can make
it and see if it changes.

--
Len Sorensen
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] exporting capability name/code pairs (final#2)

2008-02-26 Thread Andrew G. Morgan

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Acked-by: Andrew G. Morgan <[EMAIL PROTECTED]>
Tested-by: Andrew G. Morgan <[EMAIL PROTECTED]>

Cheers

Andrew

Kohei KaiGai wrote:
| [PATCH 2/3] exporting capability name/code pairs
|
| This patch enables to export code/name pairs of capabilities the running
| kernel supported.
|
| A newer kernel sometimes adds new capabilities, like CAP_MAC_ADMIN
| at 2.6.25. However, we have no interface to disclose what capabilities
| are supported on the running kernel. Thus, we have to maintain libcap
| version in appropriate one synchronously.
|
| This patch enables libcap to collect the list of capabilities at run time,
| and provide them for users. It helps to improve portability of library.
|
| It exports these information as regular files under
/sys/kernel/capability.
| The numeric node exports its name, the symbolic node exports its code.
|
| Signed-off-by: KaiGai Kohei <[EMAIL PROTECTED]>
| --
|  Documentation/ABI/testing/sysfs-kernel-capability |   23 +
|  scripts/mkcapnames.sh |   44 +
|  security/Makefile |9 ++
|  security/commoncap.c  |   99
+
|  4 files changed, 175 insertions(+), 0 deletions(-)
|
| diff --git a/Documentation/ABI/testing/sysfs-kernel-capability
b/Documentation/ABI/testing/sysfs-kernel-capability
| index e69de29..d4a14e7 100644
| --- a/Documentation/ABI/testing/sysfs-kernel-capability
| +++ b/Documentation/ABI/testing/sysfs-kernel-capability
| @@ -0,0 +1,23 @@
| +What:/sys/kernel/capability
| +Date:Feb 2008
| +Contact: KaiGai Kohei <[EMAIL PROTECTED]>
| +Description:
| + The entries under /sys/kernel/capability are used to export
| + the list of capabilities the running kernel supports.
| +
| + - /sys/kernel/capability/version
| +   returns the most preferable version number for the
| +   running kernel.
| +   e.g) $ cat /sys/kernel/capability/version
| +0x20071026
| +
| + - /sys/kernel/capability/code/
| +   returns its symbolic representation, on reading.
| +   e.g) $ cat /sys/kernel/capability/codes/30
| +cap_audit_control
| +
| + - /sys/kernel/capability/name/
| +   returns its numerical representation, on reading.
| +   e.g) $ cat /sys/kernel/capability/names/cap_sys_pacct
| +20
| +
| diff --git a/scripts/mkcapnames.sh b/scripts/mkcapnames.sh
| index e69de29..5d36d52 100644
| --- a/scripts/mkcapnames.sh
| +++ b/scripts/mkcapnames.sh
| @@ -0,0 +1,44 @@
| +#!/bin/sh
| +
| +#
| +# generate a cap_names.h file from include/linux/capability.h
| +#
| +
| +CAPHEAD="`dirname $0`/../include/linux/capability.h"
| +REGEXP='^#define CAP_[A-Z_]+[]+[0-9]+$'
| +NUMCAP=`cat "$CAPHEAD" | egrep -c "$REGEXP"`
| +
| +echo '#ifndef CAP_NAMES_H'
| +echo '#define CAP_NAMES_H'
| +echo
| +echo '/*'
| +echo ' * Do NOT edit this file directly.'
| +echo ' * This file is generated from include/linux/capability.h
automatically'
| +echo ' */'
| +echo
| +echo '#if !defined(SYSFS_CAP_NAME_ENTRY) ||
!defined(SYSFS_CAP_CODE_ENTRY)'
| +echo '#error cap_names.h should be included from security/capability.c'
| +echo '#else'
| +echo "#if $NUMCAP != CAP_LAST_CAP + 1"
| +echo '#error mkcapnames.sh cannot collect capabilities correctly'
| +echo '#else'
| +cat "$CAPHEAD" | egrep "$REGEXP" \
| +| awk '{ printf("SYSFS_CAP_NAME_ENTRY(%s,%s);\n", tolower($2),
$2); }'
| +echo
| +echo 'static struct attribute *capability_name_attrs[] = {'
| +cat "$CAPHEAD" | egrep "$REGEXP" \
| +| awk '{ printf("\t&%s_name_attr.attr,\n", tolower($2)); } END {
print "\tNULL," }'
| +echo '};'
| +
| +echo
| +cat "$CAPHEAD" | egrep "$REGEXP" \
| +| awk '{ printf("SYSFS_CAP_CODE_ENTRY(%s,%s);\n", tolower($2),
$2); }'
| +echo
| +echo 'static struct attribute *capability_code_attrs[] = {'
| +cat "$CAPHEAD" | egrep "$REGEXP" \
| +| awk '{ printf("\t&%s_code_attr.attr,\n", tolower($2)); } END {
print "\tNULL," }'
| +echo '};'
| +
| +echo '#endif'
| +echo '#endif'
| +echo '#endif'
| diff --git a/security/Makefile b/security/Makefile
| index 9e8b025..4093e3e 100644
| --- a/security/Makefile
| +++ b/security/Makefile
| @@ -18,3 +18,12 @@ obj-$(CONFIG_SECURITY_SELINUX) += 
selinux/built-in.o
|  obj-$(CONFIG_SECURITY_SMACK) += commoncap.o smack/built-in.o
|  obj-$(CONFIG_SECURITY_CAPABILITIES)  += commoncap.o capability.o
|  obj-$(CONFIG_SECURITY_ROOTPLUG)  += commoncap.o root_plug.o
| +
| +# cap_names.h contains the code/name pair of capabilities.
| +# It is generated using include/linux/capability.h automatically.
| +$(obj)/commoncap.o: $(obj)/cap_names.h
| +quiet_cmd_cap_names  = CAPS$@
| + cmd_cap_names  = /bin/sh $(srctree)/scripts/mkcapnames.sh > $@
| +targets += cap_names.h
| 

Re: 2.6.25-rc1/2 CD/DVD burning broken

2008-02-26 Thread Andreas Schwab
Borislav Petkov <[EMAIL PROTECTED]> writes:

> On Mon, Feb 25, 2008 at 11:08:55PM +0100, Andreas Schwab wrote:
>> Borislav Petkov <[EMAIL PROTECTED]> writes:
>> 
>> > On Mon, Feb 25, 2008 at 08:38:22PM +0100, Andreas Schwab wrote:
>> >> "Kiyoshi Ueda" <[EMAIL PROTECTED]> writes:
>> >> 
>> >> > I'm looking at this problem, but currently no idea why the conversion
>> >> > to blk_end_request causes it.
>> >> 
>> >> cdrom_newpc_intr apparently never sets rq->sense_len.
>> >> 
>> >
>> > actually it does, see the code chunk around line 1188 in 2.6.25-rc2, for
>> > example.
>> 
>> Yes, it does, but it always adds zero.
>
> yep, true. Does that fix your dvd burning problem?

Yes, sure.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Misc: phantom, consistent whitespace

2008-02-26 Thread jan sonnek
Make it consistent with the rest of the header.

Signed-off-by: jan sonnek <[EMAIL PROTECTED]>
Cc: Jiri Slaby <[EMAIL PROTECTED]>
---
 include/linux/phantom.h |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/phantom.h b/include/linux/phantom.h
index a341e21..02268c5 100644
--- a/include/linux/phantom.h
+++ b/include/linux/phantom.h
@@ -27,13 +27,13 @@ struct phm_regs {
 
 #define PH_IOC_MAGIC   'p'
 #define PHN_GET_REG_IOWR(PH_IOC_MAGIC, 0, struct phm_reg *)
-#define PHN_SET_REG_IOW (PH_IOC_MAGIC, 1, struct phm_reg *)
+#define PHN_SET_REG_IOW(PH_IOC_MAGIC, 1, struct phm_reg *)
 #define PHN_GET_REGS   _IOWR(PH_IOC_MAGIC, 2, struct phm_regs *)
-#define PHN_SET_REGS   _IOW (PH_IOC_MAGIC, 3, struct phm_regs *)
+#define PHN_SET_REGS   _IOW(PH_IOC_MAGIC, 3, struct phm_regs *)
 /* this ioctl tells the driver, that the caller is not OpenHaptics and might
  * use improved registers update (no more phantom switchoffs when using
  * libphantom) */
-#define PHN_NOT_OH _IO  (PH_IOC_MAGIC, 4)
+#define PHN_NOT_OH _IO(PH_IOC_MAGIC, 4)
 #define PHN_GETREG _IOWR(PH_IOC_MAGIC, 5, struct phm_reg)
 #define PHN_SETREG _IOW(PH_IOC_MAGIC, 6, struct phm_reg)
 #define PHN_GETREGS_IOWR(PH_IOC_MAGIC, 7, struct phm_regs)
-- 
1.5.4.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] fbdev: Make deferred I/O work as advertized

2008-02-26 Thread Markus Armbruster
"Jaya Kumar" <[EMAIL PROTECTED]> writes:

> On Mon, Feb 25, 2008 at 8:03 AM, Markus Armbruster <[EMAIL PROTECTED]> wrote:
>>
>> Subject: [PATCH 1/1 2.6.24] fbdev: defio and Metronomefb
>> From: Jaya Kumar <[EMAIL PROTECTED]>
>> Date: 2008-02-18 13:41:26
>
> Hi Markus,
>
> Andrew pointed out that there may be race conditions associated with
> this patch. [ http://marc.info/?l=linux-fbdev-devel=120376473020396=2
> ] So I would not encourage anyone to merge it. I'll try to figure
> things out this weekend.
>
> Thanks,
> jaya

Thanks for the timely info.  I'm not in an undue hurry to get this
merged.  As long as we're moving forward, I'm happy.

What about pushing the fb_defio fixes independently of any new
fb_defio users?  If fb_defio was worth merging into Linus's tree, it
should be worth fixing there, whether new users are in shape already
or not.

If we must have a new user, well, I could easily whip up something
like FB_VIRTUAL on top of fb_defio.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sata_nv: fix nmi intr or system hanging in rhel4u6 adma.

2008-02-26 Thread Robert Hancock



Kuan Luo wrote:
Hi, robert 


One customer reported that their system received a nmi interrupt after
issuing "dd if=/dev/sdb of=/dev/null" on a defective disk in rhel4u6.
I tested it and found  that my system hung both in rhel4u6(2.6.9-67) and
2.6.24-rc7.
The patch can work well,  but I am not sure if the patch has other
potential effect on adma.
I attached a  file in case of lines breaked.

The below info comes from Gunther Mayer to reproduce the issue.
"
used a Seagate ST3500841NS 3.AE for my test; probably other 
seagate drives are also capable of creating media errors with 
the new hdparm-8.1: 

- compile hdparm-8.1 
- hdparm -- yes-i-know-what-i-am-doing --make-bad-sector 6 /dev/sdb 


Unfortunately this does not succeed for nvidia sata controller (timeouts
et al.), but it worked fine on AHCI machine (e.g. FSC R640). 

When I insert this newly created defective disk in Ultra 20, 
it reboots within seconds after issueing "dd if=/dev/sdb of=/dev/null". 
"


Signed-off-by: [EMAIL PROTECTED]

---
 
drivers/ata/sata_nv.c |5 +++--

 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/sata_nv.c b/drivers/ata/sata_nv.c
index ed5473b..e824260 100644
--- a/drivers/ata/sata_nv.c
+++ b/drivers/ata/sata_nv.c
@@ -837,9 +837,10 @@ static void nv_adma_tf_read(struct ata_port *ap,
struct ata_taskfile *tf)
   all shortly be aborted anyway. We assume that NCQ commands
are not
   issued via passthrough, which is the only way that switching
into
   ADMA mode could abort outstanding commands. */
-   nv_adma_register_mode(ap);
+   struct nv_adma_port_priv *pp = ap->private_data;
 
-	ata_tf_read(ap, tf);

+   if (pp->flags & NV_ADMA_PORT_REGISTER_MODE)
+   ata_tf_read(ap, tf);
 }
 
 static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, __le16

*cpb)


This is basically avoiding switching into register mode, right? I don't 
think this is a very good solution as the point of the tf_read function 
is that it's supposed to read the taskfile provided by the drive to 
diagnose the error, so not doing this isn't a good thing.


Is there a reason why going into register mode should cause a lockup in 
this case?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH] Fix b43 driver build for arm

2008-02-26 Thread Ben Dooks
On Wed, Feb 20, 2008 at 08:37:09PM +0100, Sam Ravnborg wrote:
> On Wed, Feb 20, 2008 at 03:44:04PM +0100, Michael Buesch wrote:
> > On Wednesday 20 February 2008 01:44:38 Gordon Farquharson wrote:
> > > Hi Michael
> > > 
> > > On Feb 19, 2008 3:41 AM, Michael Buesch <[EMAIL PROTECTED]> wrote:
> > > 
> > > > > [2] 
> > > > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=7492d4a416d68ab4bd254b36ffcc4e0138daa8ff
> > > > >
> > > >
> > > > That doesn't cause me to magically sign off this sort of patches, too.
> > > > The sanity check is clearly broken in file2alias.c, as it checks 
> > > > something
> > > > from the target kernel against the host environment it is compiled on.
> > > > That doesn't make any sense at all.
> > > 
> > > I think that you make some good points, but I'm at a loss as to how to
> > > fix the problem. Do you have any suggestions?
> > 
> > Remove the broken sanity check, if it's not possible the check there.
> The check is valid for > 99% of the kernel builds as
> cross compile builds are not that typical.
> And the check is there for the sake of modutils.

I build all of my ARM kernels on an x86 box, it is much faster
and I don't have to ensure I have a read/write capable filesystem
for any of my ARM boards.

-- 
Ben ([EMAIL PROTECTED], http://www.fluff.org/)

  'a smiley only costs 4 bytes'
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.24.3 (if_addrlabel.h HEADERS_CHECK failure)

2008-02-26 Thread Daniel Drake

Randy Dunlap wrote:

We (the -stable team) are announcing the release of the 2.6.24.3
kernel.


When HEADERS_CHECK=y:

make[3]: *** No rule to make target 
`/local/linsrc/linux-2.6.24.3/include/linux/if_addrlabel.h', needed by 
`/local/linsrc/linux-2.6.24.3/usr/include/linux/if_addrlabel.h'.  Stop.
make[2]: *** [linux] Error 2


This appears to have been caused by the patch titled:

NET: Add if_addrlabel.h to sanitized headers.

The patch only adds the unifdef-y entry for this header file, however 
that header was only added after 2.6.24.


It seems that this patch was submitted to -stable in error. Stephen, can 
you confirm?


Thanks,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-26 Thread David Howells
Daniel Phillips <[EMAIL PROTECTED]> wrote:

> I need to respond to this in pieces... first the bit that is bugging
> me:
> 
> > >   * two new page flags
> > 
> > I need to keep track of two bits of per-cached-page information:
> > 
> >  (1) This page is known by the cache, and that the cache must be informed if
> >  the page is going to go away.
> 
> I still do not understand the life cycle of this bit.  What does the
> cache do when it learns the page has gone away?

That's up to the cache.  CacheFS, for example, unpins some resources when all
the pages managed by a pointer block are taken away from it.  The cache may
also reserve a block on disk to back this page, and that reservation may then
be discarded by the netfs uncaching the page.

The cache may also speculatively take copies of the page if the machine is
idle.

Documentation/filesystems/caching/netfs-api.txt describes the caching API as a
process, including the presentation of netfs pages to the cache and their
uncaching.

> How is it informed?

[Documentation/filesystems/caching/netfs-api.txt]
==
PAGE UNCACHING
==

To uncache a page, this function should be called:

void fscache_uncache_page(struct fscache_cookie *cookie,
  struct page *page);

This function permits the cache to release any in-memory representation it
might be holding for this netfs page.  This function must be called once for
each page on which the read or write page functions above have been called to
make sure the cache's in-memory tracking information gets torn down.

Note that pages can't be explicitly deleted from the data file.  The whole
data file must be retired (see the relinquish cookie function below).

Furthermore, note that this does not cancel the asynchronous read or write
operation started by the read/alloc and write functions.
[/]

> Who owns the page cache in which such a page lives, the nfs client?
> Filesystem that hosts the page?  A third page cache owned by the
> cache itself?  (See my basic confusion about how many page cache
> levels you have, below.)

[Documentation/filesystems/caching/fscache.txt]
 (7) Data I/O is done direct to and from the netfs's pages.  The netfs
 indicates that page A is at index B of the data-file represented by cookie
 C, and that it should be read or written.  The cache backend may or may
 not start I/O on that page, but if it does, a netfs callback will be
 invoked to indicate completion.  The I/O may be either synchronous or
 asynchronous.
[/]

I should perhaps make the documentation more explicit: the pages passed to the
routines defined in include/linux/fscache.h are netfs pages, normally belonging
the pagecache of the appropriate netfs inode.  This is, however, mentioned in
the function banner comments in fscache.h.

> Suppose one were to take a mundane approach to the persistent cache
> problem instead of layering filesystems.  What you would do then is
> change NFS's ->write_page and variants to fiddle the persistent
> cache

It is a requirement laid down by the Linux NFS fs maintainers that the writes
to the cache be asynchronous, even if the writes to NFS aren't.

Note further that NFS's write_page() != writing to the cache.  Writing to the
cache is typically done by NFS's readpages().

Besides, at the moment, caching is suppressed for any NFS file opened for
writing due to coherency issues.  This is something to be revisited later.

> as well as the network, instead of just the network as now.

Not as now.  See above.

> This fiddling could even consist of ->write calls to another
> filesystem, though working directly with the bio interface would
> yield the fastest, and therefore to my mind, best result.

You can't necessarily access the BIO interface, and even if you can, the cache
is still a filesystem.

Essentially, what cachefiles does is to do what you say: to perform ->write
calls on another filesystem.

FS-Cache also protects the netfs against (a) there being no cache, (b) the
cache suffering a fatal I/O error and (c) the cache being removed; and protects
the cache against (d) the netfs uncaching pages that the cache is using and (e)
conflicting operations from the netfs, some of which may be queued for
asynchronous processing.

FS-Cache also groups asynchronous netfs store requests together, which
hopefully, one day, I'll be able to pass on to the backing fs.

> In any case, you find out how to write the page to backing store by
> asking the filesystem, which in the naive approach would be nfs
> augmented with caching library calls.

NFS and AFS and CIFS and ISOFS, but yes, that's what fscache is, if you like, a
caching library.

> The filesystem keeps its own metadata around to know how to map the page to
> disk.  So again naively, this metadata could tell the nfs client that the
> page is not mapped to disk at all.

The netfs should _not_ know about the metadata of a backing fs.  Firstly, there
are many different 

Re: Linux 2.6.24.3 (incr patch missing)

2008-02-26 Thread Daniel Drake

Greg Kroah-Hartman wrote:

We (the -stable team) are announcing the release of the 2.6.24.3
kernel.


patch-2.6.24.2-3.* files are missing from 
http://www.kernel.org/pub/linux/kernel/v2.6/incr/


The 2.6.23.17 patches which were released at the same time are there, so 
it doesn't seem to be a case of kernel.org being behind.


Thanks,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: oops when using git gc --auto

2008-02-26 Thread Otavio Salvador
Nick Piggin <[EMAIL PROTECTED]> writes:

> On Wednesday 27 February 2008 00:22, Otavio Salvador wrote:
>> Hello,
>>
>> Today I got this oops, someone has an idea of what's going wrong?
>>
>> Unable to handle kernel paging request at 0200 RIP:
>>  [] find_get_pages+0x3c/0x69
>
> At this point, the most likely candidate is a memory corruption
> error, probably hardware. Can you run memtest86 for a few hours
> to get a bit more confidence in the hw (preferably overnight)?

Those memories are new, but I can try. No problem. Will get back to
you by tomorrow.

> I did recently see another quite similar corruption in the
> pagecache radix-tree, though. Coincidence maybe?

I hope not.

-- 
O T A V I OS A L V A D O R
-
 E-mail: [EMAIL PROTECTED]  UIN: 5906116
 GNU/Linux User: 239058 GPG ID: 49A5F855
 Home Page: http://otavio.ossystems.com.br
-
"Microsoft sells you Windows ... Linux gives
 you the whole house."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.24.3

2008-02-26 Thread Pascal Hambourg

Hello,

Tino Keitel a écrit :


I can see the patch in http://www.kernel.org/pub/linux/kernel/v2.6/,
but no incremental patch in
http://www.kernel.org/pub/linux/kernel/v2.6/incr/. Is this due to some
delay, or was is just not uploaded?


It seems that the 2->3 incremental patch was uploaded in 
/pub/linux/kernel/v2.6/ too.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/3] fbdev: Make deferred I/O work as advertized

2008-02-26 Thread Jaya Kumar
On Mon, Feb 25, 2008 at 8:03 AM, Markus Armbruster <[EMAIL PROTECTED]> wrote:
>
> Subject: [PATCH 1/1 2.6.24] fbdev: defio and Metronomefb
> From: Jaya Kumar <[EMAIL PROTECTED]>
> Date: 2008-02-18 13:41:26

Hi Markus,

Andrew pointed out that there may be race conditions associated with
this patch. [ http://marc.info/?l=linux-fbdev-devel=120376473020396=2
] So I would not encourage anyone to merge it. I'll try to figure
things out this weekend.

Thanks,
jaya
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Proposal for "proper" durable fsync() and fdatasync()

2008-02-26 Thread Jörn Engel
On Tue, 26 February 2008 20:16:11 +1100, Nick Piggin wrote:
> 
> Yeah, sync_file_range has slightly unusual semantics and introduce
> the new concept, "writeout", to userspace (does "writeout" include
> "in drive cache"? the kernel doesn't think so, but the only way to
> make sync_file_range "safe" is if you do consider it writeout).

If sync_file_range isn't safe, it should get replaced by a noop
implementation.  There really is no point in promising "a little"
safety.

One interesting aspect of this comes with COW filesystems like btrfs or
logfs.  Writing out data pages is not sufficient, because those will get
lost unless their referencing metadata is written as well.  So either we
have to call fsync for those filesystems or add another callback and let
filesystems override the default implementation.

Jörn

-- 
There is no worse hell than that provided by the regrets
for wasted opportunities.
-- Andre-Louis Moreau in Scarabouche
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] I found a type miss in fs/ext2/ext2

2008-02-26 Thread Pekka Enberg
On Tue, Feb 26, 2008 at 12:24 PM,  <[EMAIL PROTECTED]> wrote:
>  Maybe I found a type miss in fs/ext2/ext2.h which is in linux-2.6.24.3, and 
> write difference below.

You probably want to read Documentation/SubmittingPatches first and
re-send the patch in the proper format.

 Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.24.3

2008-02-26 Thread Sven Köhler

no incremental patch in
http://www.kernel.org/pub/linux/kernel/v2.6/incr/. Is this due to some
delay, or was is just not uploaded?


Would be really nice to have indeed.



signature.asc
Description: OpenPGP digital signature


Re: PROBLEM: 2.4.36.1 hangs.

2008-02-26 Thread Pascal Hambourg

dann frazier a écrit :


Correcting the le16_to_cpu placement as Glen described
fixes the issue for me.


One of my boxes has at least six directories triggering the issue (I 
must be very unlucky), and Glen's patch fixes it all. Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


usb detecting only high speed devices only - not detecting low speed devices

2008-02-26 Thread mahendra varman
Hello all

In a project Iam using 7448 powerpc processor . In that board iam
using ISP 1562 philips PCI based usb controller.
The problem iam facing is in Linux level the usb ports are detecting
only ehci high speed devices(flash mem stick)
 But it is not detecting low speed devices(mouse,keyboard) and
reporting Unlink no irq..Controller probably using wrong irq.

As per ISP 1562 the same interrupt is routed to 3 functions inside one
controller( 2 ohci and 1 ehci)
 The interrupt works for the ehci device ( flash mem stick)
The same interrupt is assigned for ohci function. I removed the flash
stick and inserted an ohci device(mouse)
but iam getting unlink after no IRQ

How the interrupt works for ehci and the same interrupt not working for ohci ?

I can ensure that the IRQ assignment has been done properly as well as
I have enabled necesary configs in menuconfig for ehci and ohci

I tried linux version 2.6.12 , 2.6.16.60 and also 2.6.23  , 2.6.24 ..
In all these iam facing the above issues

Please shed some light to solve the issue

Thanks


Below are some observations
 ---
BEFORE INSERTING MOUSE
/ # cat /proc/interrupts
   CPU0
 12: 99  tsi108_pic Level serial
 36:  1  tsi108_PCI_int Level VMEBus (Tsi148)
  39:  0  tsi108_PCI_int Level ehci_hcd:usb1,
ohci_hcd:usb2, ohci_hcd :usb3

AFTER INSERTING MOUSE
/ # usb 3-1: new low speed USB device using ohci_hcd and address 2
ohci_hcd :01:04.1: Unlink after no-IRQ?  Controller is probably
using the wrong IRQ.

/ # cat /proc/interrupts
   CPU0
 12:130  tsi108_pic Level serial
 36:  1  tsi108_PCI_int Level VMEBus (Tsi148)
 39:  2  tsi108_PCI_int Level ehci_hcd:usb1,
ohci_hcd:usb2, ohci_hcd:usb3
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 3/6] Add chip driver for WM9712 touchscreen

2008-02-26 Thread Mark Brown
Signed-off-by: Liam Girdwood <[EMAIL PROTECTED]>
Signed-off-by: Graeme Gregory <[EMAIL PROTECTED]>
Signed-off-by: Mike Arthur <[EMAIL PROTECTED]>
Signed-off-by: Mark Brown <[EMAIL PROTECTED]>
Cc: Dmitry Baryshkov <[EMAIL PROTECTED]>
Cc: Stanley Cai <[EMAIL PROTECTED]>
Cc: Rodolfo Giometti <[EMAIL PROTECTED]>
Cc: Russell King <[EMAIL PROTECTED]>
Cc: Marc Kleine-Budde <[EMAIL PROTECTED]>
Cc: Pete MacKay <[EMAIL PROTECTED]>
Cc: Ian Molton <[EMAIL PROTECTED]>
Cc: Vince Sanders <[EMAIL PROTECTED]>
Cc: Andrew Zabolotny <[EMAIL PROTECTED]>
---
 drivers/input/touchscreen/wm9712.c |  461 
 1 files changed, 461 insertions(+), 0 deletions(-)
 create mode 100644 drivers/input/touchscreen/wm9712.c

diff --git a/drivers/input/touchscreen/wm9712.c 
b/drivers/input/touchscreen/wm9712.c
new file mode 100644
index 000..eaab326
--- /dev/null
+++ b/drivers/input/touchscreen/wm9712.c
@@ -0,0 +1,461 @@
+/*
+ * wm9712.c  --  Codec driver for Wolfson WM9712 AC97 Codecs.
+ *
+ * Copyright 2003, 2004, 2005, 2006, 2007 Wolfson Microelectronics PLC.
+ * Author: Liam Girdwood
+ * [EMAIL PROTECTED] or [EMAIL PROTECTED]
+ * Parts Copyright : Ian Molton <[EMAIL PROTECTED]>
+ *   Andrew Zabolotny <[EMAIL PROTECTED]>
+ *   Russell King <[EMAIL PROTECTED]>
+ *
+ *  This program is free software; you can redistribute  it and/or modify it
+ *  under  the terms of  the GNU General  Public License as published by the
+ *  Free Software Foundation;  either version 2 of the  License, or (at your
+ *  option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define TS_NAME"wm97xx"
+#define WM9712_VERSION "0.61"
+#define DEFAULT_PRESSURE   0xb0c0
+
+/*
+ * Module parameters
+ */
+
+/*
+ * Set internal pull up for pen detect.
+ *
+ * Pull up is in the range 1.02k (least sensitive) to 64k (most sensitive)
+ * i.e. pull up resistance = 64k Ohms / rpu.
+ *
+ * Adjust this value if you are having problems with pen detect not
+ * detecting any down event.
+ */
+static int rpu = 8;
+module_param(rpu, int, 0);
+MODULE_PARM_DESC(rpu, "Set internal pull up resitor for pen detect.");
+
+/*
+ * Set current used for pressure measurement.
+ *
+ * Set pil = 2 to use 400uA
+ * pil = 1 to use 200uA and
+ * pil = 0 to disable pressure measurement.
+ *
+ * This is used to increase the range of values returned by the adc
+ * when measureing touchpanel pressure.
+ */
+static int pil;
+module_param(pil, int, 0);
+MODULE_PARM_DESC(pil, "Set current used for pressure measurement.");
+
+/*
+ * Set threshold for pressure measurement.
+ *
+ * Pen down pressure below threshold is ignored.
+ */
+static int pressure = DEFAULT_PRESSURE & 0xfff;
+module_param(pressure, int, 0);
+MODULE_PARM_DESC(pressure, "Set threshold for pressure measurement.");
+
+/*
+ * Set adc sample delay.
+ *
+ * For accurate touchpanel measurements, some settling time may be
+ * required between the switch matrix applying a voltage across the
+ * touchpanel plate and the ADC sampling the signal.
+ *
+ * This delay can be set by setting delay = n, where n is the array
+ * position of the delay in the array delay_table below.
+ * Long delays > 1ms are supported for completeness, but are not
+ * recommended.
+ */
+static int delay = 3;
+module_param(delay, int, 0);
+MODULE_PARM_DESC(delay, "Set adc sample delay.");
+
+/*
+ * Set five_wire = 1 to use a 5 wire touchscreen.
+ *
+ * NOTE: Five wire mode does not allow for readback of pressure.
+ */
+static int five_wire;
+module_param(five_wire, int, 0);
+MODULE_PARM_DESC(five_wire, "Set to '1' to use 5-wire touchscreen.");
+
+/*
+ * Set adc mask function.
+ *
+ * Sources of glitch noise, such as signals driving an LCD display, may feed
+ * through to the touch screen plates and affect measurement accuracy. In
+ * order to minimise this, a signal may be applied to the MASK pin to delay or
+ * synchronise the sampling.
+ *
+ * 0 = No delay or sync
+ * 1 = High on pin stops conversions
+ * 2 = Edge triggered, edge on pin delays conversion by delay param (above)
+ * 3 = Edge triggered, edge on pin starts conversion after delay param
+ */
+static int mask;
+module_param(mask, int, 0);
+MODULE_PARM_DESC(mask, "Set adc mask function.");
+
+/*
+ * Coordinate Polling Enable.
+ *
+ * Set to 1 to enable coordinate polling. e.g. x,y[,p] is sampled together
+ * for every poll.
+ */
+static int coord;
+module_param(coord, int, 0);
+MODULE_PARM_DESC(coord, "Polling coordinate mode");
+
+/*
+ * ADC sample delay times in uS
+ */
+static const int delay_table[] = {
+   21,/* 1 AC97 Link frames */
+   42,/* 2 */
+   84,/* 4 */
+   167,   /* 8 */
+   333,   /* 16 */
+   667,   /* 32 */
+   1000,  /* 48 */
+   1333,  /* 64 */
+   2000,  /* 96 */
+   2667,  /* 128 */
+   ,  /* 160 */
+   4000,  /* 192 */
+   4667,  /* 224 

Re: Please, put 64-bit counter per task and incr.by.one each ctxt switch.

2008-02-26 Thread Alexey Dobriyan
On 2/26/08, J.C. Pizarro <[EMAIL PROTECTED]> wrote:
> On 2008/2/25, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Sun, 24 Feb 2008 14:12:47 +0100 "J.C. Pizarro" <[EMAIL PROTECTED]>
> wrote:
> >
> > > It's statistic, yes, but it's a very important parameter for the
> CPU-scheduler.
> > > The CPU-scheduler will know the number of context switches of each task
> > > before of to take a blind decision into infinitum!.
> >
> >
> > We already have these:
> >
> > unsigned long nvcsw, nivcsw; /* context switch counts */
> >
> > in the task_struct.
>
> 1. They use "unsigned long" instead "unsigned long long".
> 2. They use "= 0;" instead of "= 0ULL";

Very funny.

> 3. They don't use ++ (incr. by one per ctxt-switch).

No they do, read schedule() already.

> 4. I don't like the separation of voluntary and involuntary ctxt-switches,
> and i don't understand the utility of this separation.

Ah, that's why you don't like it.

> The tsk->nvcsw & tsk->nivcsw mean different to i had proposed.
>
> It's simple, when calling to function kernel/sched.c:context_switch(..)
> to do ++, but they don't do it.
>
> I propose you
> 1. unsigned long long tsk->ncsw = 0ULL; and tsk->ncsw++;
> 2. unsigned long long tsk->last_registered_ncsw = tsk->ncsw; when it's
> polling.
> 3. long tsk->vcsw = ( tsk->ncsw - tsk->last_registered_ncsw ) / ( t2 - t1 )
> /* velocity of task (ctxt-switches per second), (t1 != t2 in seconds
> for no zerodiv)
> 4. long tsk->last_registered_vcsw = tsk->vcsw;
> 5. long tsk->normalized_vcsw =
> (1 - alpha)*tsk->last_registered_vcsw + alpha*tsk->vcsw; /* 0 */

   6. Profit.

As I understood the idea of CFS, all interactivity heuristics were bitbucketed,
so you'll add them back (you won't, of course, because you can't be arsed
to send a patch)

So best course of action it to describe workload and setup (distro, relevant
.config items and so on.) on which CFS behaves poorly.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 6/6] Build system and MAINTAINERS entry for WM97xx touchscreen drivers

2008-02-26 Thread Mark Brown
Signed-off-by: Mark Brown <[EMAIL PROTECTED]>
Signed-off-by: Liam Girdwood <[EMAIL PROTECTED]>
---
 MAINTAINERS|   10 +++
 drivers/input/touchscreen/Kconfig  |   52 
 drivers/input/touchscreen/Makefile |7 +
 3 files changed, 69 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 36c7bc6..96d19c1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4317,6 +4317,16 @@ L:   [EMAIL PROTECTED]
 W: http://oops.ghostprotocols.net:81/blog
 S: Maintained
 
+WM97XX TOUCHSCREEN DRIVERS
+P: Mark Brown
+M: [EMAIL PROTECTED]
+P: Liam Girdwood
+M: [EMAIL PROTECTED]
+L: [EMAIL PROTECTED]
+T: git git://opensource.wolfsonmicro.com/linux-2.6-touch
+W: http://opensource.wolfsonmicro.com/node/7
+S: Supported
+
 X.25 NETWORK LAYER
 P: Henner Eisen
 M: [EMAIL PROTECTED]
diff --git a/drivers/input/touchscreen/Kconfig 
b/drivers/input/touchscreen/Kconfig
index 90e8e92..0be05a2 100644
--- a/drivers/input/touchscreen/Kconfig
+++ b/drivers/input/touchscreen/Kconfig
@@ -158,6 +158,58 @@ config TOUCHSCREEN_TOUCHRIGHT
  To compile this driver as a module, choose M here: the
  module will be called touchright.
 
+config TOUCHSCREEN_WM97XX
+   tristate "Support for WM97xx AC97 touchscreen controllers"
+   depends on AC97_BUS
+
+config TOUCHSCREEN_WM9705
+   bool "WM9705 Touchscreen interface support"
+   depends on TOUCHSCREEN_WM97XX
+   help
+ Say Y here if you have a Wolfson Microelectronics WM9705 touchscreen
+ controller connected to your system.
+
+ If unsure, say N.
+
+ To compile this driver as a module, choose M here: the
+ module will be called wm9705.
+
+config TOUCHSCREEN_WM9712
+   bool "WM9712 Touchscreen interface support"
+   depends on TOUCHSCREEN_WM97XX
+   help
+ Say Y here if you have a Wolfson Microelectronics WM9712 touchscreen
+ controller connected to your system.
+
+ If unsure, say N.
+
+ To compile this driver as a module, choose M here: the
+ module will be called wm9712.
+
+config TOUCHSCREEN_WM9713
+   bool "WM9713 Touchscreen interface support"
+   depends on TOUCHSCREEN_WM97XX
+   help
+ Say Y here if you have a Wolfson Microelectronics WM9713 touchscreen
+ controller connected to your system.
+
+ If unsure, say N.
+
+ To compile this driver as a module, choose M here: the
+ module will be called wm9713.
+
+config TOUCHSCREEN_WM97XX_MAINSTONE
+   tristate "WM97xx Mainstone accelerated touch"
+   depends on TOUCHSCREEN_WM97XX && ARCH_PXA
+   help
+ Say Y here for support for streaming mode with WM97xx touchscreens
+ on Mainstone systems.
+
+ If unsure, say N
+
+ To compile this driver as a module, choose M here: the
+ module will be called mainstone-wm97xx
+
 config TOUCHSCREEN_TOUCHWIN
tristate "Touchwin serial touchscreen"
select SERIO
diff --git a/drivers/input/touchscreen/Makefile 
b/drivers/input/touchscreen/Makefile
index 35d4097..d38156e 100644
--- a/drivers/input/touchscreen/Makefile
+++ b/drivers/input/touchscreen/Makefile
@@ -4,6 +4,8 @@
 
 # Each configuration option enables a list of files.
 
+wm97xx-ts-y := wm97xx-core.o
+
 obj-$(CONFIG_TOUCHSCREEN_ADS7846)  += ads7846.o
 obj-$(CONFIG_TOUCHSCREEN_BITSY)+= h3600_ts_input.o
 obj-$(CONFIG_TOUCHSCREEN_CORGI)+= corgi_ts.o
@@ -19,3 +21,8 @@ obj-$(CONFIG_TOUCHSCREEN_PENMOUNT)+= penmount.o
 obj-$(CONFIG_TOUCHSCREEN_TOUCHRIGHT)   += touchright.o
 obj-$(CONFIG_TOUCHSCREEN_TOUCHWIN) += touchwin.o
 obj-$(CONFIG_TOUCHSCREEN_UCB1400)  += ucb1400_ts.o
+obj-$(CONFIG_TOUCHSCREEN_WM97XX)   += wm97xx-ts.o
+obj-$(CONFIG_TOUCHSCREEN_WM97XX_MAINSTONE) += mainstone-wm97xx.o
+wm97xx-ts-$(CONFIG_TOUCHSCREEN_WM9705)  += wm9705.o
+wm97xx-ts-$(CONFIG_TOUCHSCREEN_WM9712)  += wm9712.o
+wm97xx-ts-$(CONFIG_TOUCHSCREEN_WM9713)  += wm9713.o
-- 
1.5.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 5/6] Driver for WM97xx touchscreens in streaming mode on Mainstone

2008-02-26 Thread Mark Brown
Signed-off-by: Liam Girdwood <[EMAIL PROTECTED]>
Signed-off-by: Graeme Gregory <[EMAIL PROTECTED]>
Signed-off-by: Mike Arthur <[EMAIL PROTECTED]>
Signed-off-by: Mark Brown <[EMAIL PROTECTED]>
Cc: Dmitry Baryshkov <[EMAIL PROTECTED]>
Cc: Stanley Cai <[EMAIL PROTECTED]>
Cc: Rodolfo Giometti <[EMAIL PROTECTED]>
Cc: Russell King <[EMAIL PROTECTED]>
Cc: Marc Kleine-Budde <[EMAIL PROTECTED]>
Cc: Pete MacKay <[EMAIL PROTECTED]>
Cc: Ian Molton <[EMAIL PROTECTED]>
Cc: Vince Sanders <[EMAIL PROTECTED]>
Cc: Andrew Zabolotny <[EMAIL PROTECTED]>
---
 drivers/input/touchscreen/mainstone-wm97xx.c |  298 ++
 1 files changed, 298 insertions(+), 0 deletions(-)
 create mode 100644 drivers/input/touchscreen/mainstone-wm97xx.c

diff --git a/drivers/input/touchscreen/mainstone-wm97xx.c 
b/drivers/input/touchscreen/mainstone-wm97xx.c
new file mode 100644
index 000..8e1c35d
--- /dev/null
+++ b/drivers/input/touchscreen/mainstone-wm97xx.c
@@ -0,0 +1,298 @@
+/*
+ * mainstone-wm97xx.c  --  Mainstone Continuous Touch screen driver for
+ * Wolfson WM97xx AC97 Codecs.
+ *
+ * Copyright 2004, 2007 Wolfson Microelectronics PLC.
+ * Author: Liam Girdwood
+ * [EMAIL PROTECTED] or [EMAIL PROTECTED]
+ * Parts Copyright : Ian Molton <[EMAIL PROTECTED]>
+ *   Andrew Zabolotny <[EMAIL PROTECTED]>
+ *
+ *  This program is free software; you can redistribute  it and/or modify it
+ *  under  the terms of  the GNU General  Public License as published by the
+ *  Free Software Foundation;  either version 2 of the  License, or (at your
+ *  option) any later version.
+ *
+ * Notes:
+ * This is a wm97xx extended touch driver to capture touch
+ * data in a continuous manner on the Intel XScale archictecture
+ *
+ *  Features:
+ *   - codecs supported:- WM9705, WM9712, WM9713
+ *   - processors supported:- Intel XScale PXA25x, PXA26x, PXA27x
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define VERSION"0.13"
+
+struct continuous {
+   u16 id;/* codec id */
+   u8 code;   /* continuous code */
+   u8 reads;  /* number of coord reads per read cycle */
+   u32 speed; /* number of coords per second */
+};
+
+#define WM_READS(sp) ((sp / HZ) + 1)
+
+static const struct continuous cinfo[] = {
+   {WM9705_ID2, 0, WM_READS(94), 94},
+   {WM9705_ID2, 1, WM_READS(188), 188},
+   {WM9705_ID2, 2, WM_READS(375), 375},
+   {WM9705_ID2, 3, WM_READS(750), 750},
+   {WM9712_ID2, 0, WM_READS(94), 94},
+   {WM9712_ID2, 1, WM_READS(188), 188},
+   {WM9712_ID2, 2, WM_READS(375), 375},
+   {WM9712_ID2, 3, WM_READS(750), 750},
+   {WM9713_ID2, 0, WM_READS(94), 94},
+   {WM9713_ID2, 1, WM_READS(120), 120},
+   {WM9713_ID2, 2, WM_READS(154), 154},
+   {WM9713_ID2, 3, WM_READS(188), 188},
+};
+
+/* continuous speed index */
+static int sp_idx;
+static u16 last, tries;
+
+/*
+ * Pen sampling frequency (Hz) in continuous mode.
+ */
+static int cont_rate = 200;
+module_param(cont_rate, int, 0);
+MODULE_PARM_DESC(cont_rate, "Sampling rate in continuous mode (Hz)");
+
+/*
+ * Pen down detection.
+ *
+ * This driver can either poll or use an interrupt to indicate a pen down
+ * event. If the irq request fails then it will fall back to polling mode.
+ */
+static int pen_int;
+module_param(pen_int, int, 0);
+MODULE_PARM_DESC(pen_int, "Pen down detection (1 = interrupt, 0 = polling)");
+
+/*
+ * Pressure readback.
+ *
+ * Set to 1 to read back pen down pressure
+ */
+static int pressure;
+module_param(pressure, int, 0);
+MODULE_PARM_DESC(pressure, "Pressure readback (1 = pressure, 0 = no 
pressure)");
+
+/*
+ * AC97 touch data slot.
+ *
+ * Touch screen readback data ac97 slot
+ */
+static int ac97_touch_slot = 5;
+module_param(ac97_touch_slot, int, 0);
+MODULE_PARM_DESC(ac97_touch_slot, "Touch screen data slot AC97 number");
+
+
+/* flush AC97 slot 5 FIFO on pxa machines */
+#ifdef CONFIG_PXA27x
+static void wm97xx_acc_pen_up(struct wm97xx *wm)
+{
+   set_current_state(TASK_INTERRUPTIBLE);
+   schedule_timeout(1);
+
+   while (MISR & (1 << 2))
+   MODR;
+}
+#else
+static void wm97xx_acc_pen_up(struct wm97xx *wm)
+{
+   int count = 16;
+   set_current_state(TASK_INTERRUPTIBLE);
+   schedule_timeout(1);
+
+   while (count < 16) {
+   MODR;
+   count--;
+   }
+}
+#endif
+
+static int wm97xx_acc_pen_down(struct wm97xx *wm)
+{
+   u16 x, y, p = 0x100 | WM97XX_ADCSEL_PRES;
+   int reads = 0;
+
+   /* data is never immediately available after pen down irq */
+   set_current_state(TASK_INTERRUPTIBLE);
+   schedule_timeout(1);
+
+   if (tries > 5) {
+   tries = 0;
+   return RC_PENUP;
+   }
+
+   x = MODR;
+   if (x == last) {
+   tries++;
+   return RC_AGAIN;
+   }
+   last = 

[PATCH 4/6] Add chip driver for WM9713 touchscreen

2008-02-26 Thread Mark Brown
Signed-off-by: Liam Girdwood <[EMAIL PROTECTED]>
Signed-off-by: Graeme Gregory <[EMAIL PROTECTED]>
Signed-off-by: Mike Arthur <[EMAIL PROTECTED]>
Signed-off-by: Mark Brown <[EMAIL PROTECTED]>
Cc: Dmitry Baryshkov <[EMAIL PROTECTED]>
Cc: Stanley Cai <[EMAIL PROTECTED]>
Cc: Rodolfo Giometti <[EMAIL PROTECTED]>
Cc: Russell King <[EMAIL PROTECTED]>
Cc: Marc Kleine-Budde <[EMAIL PROTECTED]>
Cc: Pete MacKay <[EMAIL PROTECTED]>
Cc: Ian Molton <[EMAIL PROTECTED]>
Cc: Vince Sanders <[EMAIL PROTECTED]>
Cc: Andrew Zabolotny <[EMAIL PROTECTED]>
---
 drivers/input/touchscreen/wm9713.c |  459 
 1 files changed, 459 insertions(+), 0 deletions(-)
 create mode 100644 drivers/input/touchscreen/wm9713.c

diff --git a/drivers/input/touchscreen/wm9713.c 
b/drivers/input/touchscreen/wm9713.c
new file mode 100644
index 000..ddf0a48
--- /dev/null
+++ b/drivers/input/touchscreen/wm9713.c
@@ -0,0 +1,459 @@
+/*
+ * wm9713.c  --  Codec touch driver for Wolfson WM9713 AC97 Codec.
+ *
+ * Copyright 2003, 2004, 2005, 2006, 2007, 2008 Wolfson Microelectronics PLC.
+ * Author: Liam Girdwood
+ * [EMAIL PROTECTED] or [EMAIL PROTECTED]
+ * Parts Copyright : Ian Molton <[EMAIL PROTECTED]>
+ *   Andrew Zabolotny <[EMAIL PROTECTED]>
+ *   Russell King <[EMAIL PROTECTED]>
+ *
+ *  This program is free software; you can redistribute  it and/or modify it
+ *  under  the terms of  the GNU General  Public License as published by the
+ *  Free Software Foundation;  either version 2 of the  License, or (at your
+ *  option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define TS_NAME"wm97xx"
+#define WM9713_VERSION "0.53"
+#define DEFAULT_PRESSURE   0xb0c0
+
+/*
+ * Module parameters
+ */
+
+/*
+ * Set internal pull up for pen detect.
+ *
+ * Pull up is in the range 1.02k (least sensitive) to 64k (most sensitive)
+ * i.e. pull up resistance = 64k Ohms / rpu.
+ *
+ * Adjust this value if you are having problems with pen detect not
+ * detecting any down event.
+ */
+static int rpu = 8;
+module_param(rpu, int, 0);
+MODULE_PARM_DESC(rpu, "Set internal pull up resitor for pen detect.");
+
+/*
+ * Set current used for pressure measurement.
+ *
+ * Set pil = 2 to use 400uA
+ * pil = 1 to use 200uA and
+ * pil = 0 to disable pressure measurement.
+ *
+ * This is used to increase the range of values returned by the adc
+ * when measureing touchpanel pressure.
+ */
+static int pil;
+module_param(pil, int, 0);
+MODULE_PARM_DESC(pil, "Set current used for pressure measurement.");
+
+/*
+ * Set threshold for pressure measurement.
+ *
+ * Pen down pressure below threshold is ignored.
+ */
+static int pressure = DEFAULT_PRESSURE & 0xfff;
+module_param(pressure, int, 0);
+MODULE_PARM_DESC(pressure, "Set threshold for pressure measurement.");
+
+/*
+ * Set adc sample delay.
+ *
+ * For accurate touchpanel measurements, some settling time may be
+ * required between the switch matrix applying a voltage across the
+ * touchpanel plate and the ADC sampling the signal.
+ *
+ * This delay can be set by setting delay = n, where n is the array
+ * position of the delay in the array delay_table below.
+ * Long delays > 1ms are supported for completeness, but are not
+ * recommended.
+ */
+static int delay = 4;
+module_param(delay, int, 0);
+MODULE_PARM_DESC(delay, "Set adc sample delay.");
+
+/*
+ * Set adc mask function.
+ *
+ * Sources of glitch noise, such as signals driving an LCD display, may feed
+ * through to the touch screen plates and affect measurement accuracy. In
+ * order to minimise this, a signal may be applied to the MASK pin to delay or
+ * synchronise the sampling.
+ *
+ * 0 = No delay or sync
+ * 1 = High on pin stops conversions
+ * 2 = Edge triggered, edge on pin delays conversion by delay param (above)
+ * 3 = Edge triggered, edge on pin starts conversion after delay param
+ */
+static int mask;
+module_param(mask, int, 0);
+MODULE_PARM_DESC(mask, "Set adc mask function.");
+
+/*
+ * Coordinate Polling Enable.
+ *
+ * Set to 1 to enable coordinate polling. e.g. x,y[,p] is sampled together
+ * for every poll.
+ */
+static int coord;
+module_param(coord, int, 0);
+MODULE_PARM_DESC(coord, "Polling coordinate mode");
+
+/*
+ * ADC sample delay times in uS
+ */
+static const int delay_table[] = {
+   21,/* 1 AC97 Link frames */
+   42,/* 2 */
+   84,/* 4 */
+   167,   /* 8 */
+   333,   /* 16 */
+   667,   /* 32 */
+   1000,  /* 48 */
+   1333,  /* 64 */
+   2000,  /* 96 */
+   2667,  /* 128 */
+   ,  /* 160 */
+   4000,  /* 192 */
+   4667,  /* 224 */
+   5333,  /* 256 */
+   6000,  /* 288 */
+   0  /* No delay, switch matrix always on */
+};
+
+/*
+ * Delay after issuing a POLL command.
+ *
+ * The delay is 3 AC97 link frames + the touchpanel settling delay
+ */
+static 

[PATCH 2/6] Add chip driver for WM9705 touchscreen

2008-02-26 Thread Mark Brown
Signed-off-by: Liam Girdwood <[EMAIL PROTECTED]>
Signed-off-by: Graeme Gregory <[EMAIL PROTECTED]>
Signed-off-by: Mike Arthur <[EMAIL PROTECTED]>
Signed-off-by: Mark Brown <[EMAIL PROTECTED]>
Cc: Dmitry Baryshkov <[EMAIL PROTECTED]>
Cc: Stanley Cai <[EMAIL PROTECTED]>
Cc: Rodolfo Giometti <[EMAIL PROTECTED]>
Cc: Russell King <[EMAIL PROTECTED]>
Cc: Marc Kleine-Budde <[EMAIL PROTECTED]>
Cc: Pete MacKay <[EMAIL PROTECTED]>
Cc: Ian Molton <[EMAIL PROTECTED]>
Cc: Vince Sanders <[EMAIL PROTECTED]>
Cc: Andrew Zabolotny <[EMAIL PROTECTED]>
---
 drivers/input/touchscreen/wm9705.c |  352 
 1 files changed, 352 insertions(+), 0 deletions(-)
 create mode 100644 drivers/input/touchscreen/wm9705.c

diff --git a/drivers/input/touchscreen/wm9705.c 
b/drivers/input/touchscreen/wm9705.c
new file mode 100644
index 000..f185104
--- /dev/null
+++ b/drivers/input/touchscreen/wm9705.c
@@ -0,0 +1,352 @@
+/*
+ * wm9705.c  --  Codec driver for Wolfson WM9705 AC97 Codec.
+ *
+ * Copyright 2003, 2004, 2005, 2006, 2007 Wolfson Microelectronics PLC.
+ * Author: Liam Girdwood
+ * [EMAIL PROTECTED] or [EMAIL PROTECTED]
+ * Parts Copyright : Ian Molton <[EMAIL PROTECTED]>
+ *   Andrew Zabolotny <[EMAIL PROTECTED]>
+ *   Russell King <[EMAIL PROTECTED]>
+ *
+ *  This program is free software; you can redistribute  it and/or modify it
+ *  under  the terms of  the GNU General  Public License as published by the
+ *  Free Software Foundation;  either version 2 of the  License, or (at your
+ *  option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define TS_NAME"wm97xx"
+#define WM9705_VERSION "0.62"
+#define DEFAULT_PRESSURE   0xb0c0
+
+/*
+ * Module parameters
+ */
+
+/*
+ * Set current used for pressure measurement.
+ *
+ * Set pil = 2 to use 400uA
+ * pil = 1 to use 200uA and
+ * pil = 0 to disable pressure measurement.
+ *
+ * This is used to increase the range of values returned by the adc
+ * when measureing touchpanel pressure.
+ */
+static int pil;
+module_param(pil, int, 0);
+MODULE_PARM_DESC(pil, "Set current used for pressure measurement.");
+
+/*
+ * Set threshold for pressure measurement.
+ *
+ * Pen down pressure below threshold is ignored.
+ */
+static int pressure = DEFAULT_PRESSURE & 0xfff;
+module_param(pressure, int, 0);
+MODULE_PARM_DESC(pressure, "Set threshold for pressure measurement.");
+
+/*
+ * Set adc sample delay.
+ *
+ * For accurate touchpanel measurements, some settling time may be
+ * required between the switch matrix applying a voltage across the
+ * touchpanel plate and the ADC sampling the signal.
+ *
+ * This delay can be set by setting delay = n, where n is the array
+ * position of the delay in the array delay_table below.
+ * Long delays > 1ms are supported for completeness, but are not
+ * recommended.
+ */
+static int delay = 4;
+module_param(delay, int, 0);
+MODULE_PARM_DESC(delay, "Set adc sample delay.");
+
+/*
+ * Pen detect comparator threshold.
+ *
+ * 0 to Vmid in 15 steps, 0 = use zero power comparator with Vmid threshold
+ * i.e. 1 =  Vmid/15 threshold
+ *  15 =  Vmid/1 threshold
+ *
+ * Adjust this value if you are having problems with pen detect not
+ * detecting any down events.
+ */
+static int pdd = 8;
+module_param(pdd, int, 0);
+MODULE_PARM_DESC(pdd, "Set pen detect comparator threshold");
+
+/*
+ * Set adc mask function.
+ *
+ * Sources of glitch noise, such as signals driving an LCD display, may feed
+ * through to the touch screen plates and affect measurement accuracy. In
+ * order to minimise this, a signal may be applied to the MASK pin to delay or
+ * synchronise the sampling.
+ *
+ * 0 = No delay or sync
+ * 1 = High on pin stops conversions
+ * 2 = Edge triggered, edge on pin delays conversion by delay param (above)
+ * 3 = Edge triggered, edge on pin starts conversion after delay param
+ */
+static int mask;
+module_param(mask, int, 0);
+MODULE_PARM_DESC(mask, "Set adc mask function.");
+
+/*
+ * ADC sample delay times in uS
+ */
+static const int delay_table[] = {
+   21,/* 1 AC97 Link frames */
+   42,/* 2  */
+   84,/* 4  */
+   167,   /* 8  */
+   333,   /* 16 */
+   667,   /* 32 */
+   1000,  /* 48 */
+   1333,  /* 64 */
+   2000,  /* 96 */
+   2667,  /* 128*/
+   ,  /* 160*/
+   4000,  /* 192*/
+   4667,  /* 224*/
+   5333,  /* 256*/
+   6000,  /* 288*/
+   0  /* No delay, switch matrix always on */
+};
+
+/*
+ * Delay after issuing a POLL command.
+ *
+ * The delay is 3 AC97 link frames + the touchpanel settling delay
+ */
+static inline void poll_delay(int d)
+{

[PATCH 1/6] Core driver for WM97xx touchscreens

2008-02-26 Thread Mark Brown
This patch series adds support for the touchscreen controllers provided
by Wolfson Microelectronics WM97xx series chips in both polled and
streaming modes.

These drivers have been maintained out of tree since 2003.  During that
time the driver the primary maintainer was Liam Girdwood and a number of
people have made contributions including Dmitry Baryshkov, Stanley Cai,
Rodolfo Giometti, Russell King, Marc Kleine-Budde, Ian Molton, Vincent
Sanders, Andrew Zabolotny, Graeme Gregory, Mike Arthur and myself.
Apologies to anyone I have omitted.

Signed-off-by: Liam Girdwood <[EMAIL PROTECTED]>
Signed-off-by: Graeme Gregory <[EMAIL PROTECTED]>
Signed-off-by: Mike Arthur <[EMAIL PROTECTED]>
Signed-off-by: Mark Brown <[EMAIL PROTECTED]>
Cc: Dmitry Baryshkov <[EMAIL PROTECTED]>
Cc: Stanley Cai <[EMAIL PROTECTED]>
Cc: Rodolfo Giometti <[EMAIL PROTECTED]>
Cc: Russell King <[EMAIL PROTECTED]>
Cc: Pete MacKay <[EMAIL PROTECTED]>
Cc: Marc Kleine-Budde <[EMAIL PROTECTED]>
Cc: Ian Molton <[EMAIL PROTECTED]>
Cc: Vincent Sanders <[EMAIL PROTECTED]>
Cc: Andrew Zabolotny <[EMAIL PROTECTED]>
---
 drivers/input/touchscreen/wm97xx-core.c |  731 +++
 include/linux/wm97xx.h  |  308 +
 2 files changed, 1039 insertions(+), 0 deletions(-)
 create mode 100644 drivers/input/touchscreen/wm97xx-core.c
 create mode 100644 include/linux/wm97xx.h

diff --git a/drivers/input/touchscreen/wm97xx-core.c 
b/drivers/input/touchscreen/wm97xx-core.c
new file mode 100644
index 000..84d9dc5
--- /dev/null
+++ b/drivers/input/touchscreen/wm97xx-core.c
@@ -0,0 +1,731 @@
+/*
+ * wm97xx-core.c  --  Touch screen driver core for Wolfson WM9705, WM9712
+ *and WM9713 AC97 Codecs.
+ *
+ * Copyright 2003, 2004, 2005, 2006, 2007, 2008 Wolfson Microelectronics PLC.
+ * Author: Liam Girdwood
+ * [EMAIL PROTECTED] or [EMAIL PROTECTED]
+ * Parts Copyright : Ian Molton <[EMAIL PROTECTED]>
+ *   Andrew Zabolotny <[EMAIL PROTECTED]>
+ *   Russell King <[EMAIL PROTECTED]>
+ *
+ *  This program is free software; you can redistribute  it and/or modify it
+ *  under  the terms of  the GNU General  Public License as published by the
+ *  Free Software Foundation;  either version 2 of the  License, or (at your
+ *  option) any later version.
+ *
+ * Notes:
+ *
+ *  Features:
+ *   - supports WM9705, WM9712, WM9713
+ *   - polling mode
+ *   - continuous mode (arch-dependent)
+ *   - adjustable rpu/dpp settings
+ *   - adjustable pressure current
+ *   - adjustable sample settle delay
+ *   - 4 and 5 wire touchscreens (5 wire is WM9712 only)
+ *   - pen down detection
+ *   - battery monitor
+ *   - sample AUX adcs
+ *   - power management
+ *   - codec GPIO
+ *   - codec event notification
+ * Todo
+ *   - Support for async sampling control for noisy LCDs.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define TS_NAME"wm97xx"
+#define WM_CORE_VERSION"1.00"
+#define DEFAULT_PRESSURE   0xb0c0
+
+
+/*
+ * Touchscreen absolute values
+ *
+ * These parameters are used to help the input layer discard out of
+ * range readings and reduce jitter etc.
+ *
+ *   o min, max:- indicate the min and max values your touch screen returns
+ *   o fuzz:- use a higher number to reduce jitter
+ *
+ * The default values correspond to Mainstone II in QVGA mode
+ *
+ * Please read
+ * Documentation/input/input-programming.txt for more details.
+ */
+
+static int abs_x[3] = {350, 3900, 5};
+module_param_array(abs_x, int, NULL, 0);
+MODULE_PARM_DESC(abs_x, "Touchscreen absolute X min, max, fuzz");
+
+static int abs_y[3] = {320, 3750, 40};
+module_param_array(abs_y, int, NULL, 0);
+MODULE_PARM_DESC(abs_y, "Touchscreen absolute Y min, max, fuzz");
+
+static int abs_p[3] = {0, 150, 4};
+module_param_array(abs_p, int, NULL, 0);
+MODULE_PARM_DESC(abs_p, "Touchscreen absolute Pressure min, max, fuzz");
+
+/*
+ * wm97xx IO access, all IO locking done by AC97 layer
+ */
+int wm97xx_reg_read(struct wm97xx *wm, u16 reg)
+{
+   if (wm->ac97)
+   return wm->ac97->bus->ops->read(wm->ac97, reg);
+   else
+   return -1;
+}
+EXPORT_SYMBOL_GPL(wm97xx_reg_read);
+
+void wm97xx_reg_write(struct wm97xx *wm, u16 reg, u16 val)
+{
+   /* cache digitiser registers */
+   if (reg >= AC97_WM9713_DIG1 && reg <= AC97_WM9713_DIG3)
+   wm->dig[(reg - AC97_WM9713_DIG1) >> 1] = val;
+
+   /* cache gpio regs */
+   if (reg >= AC97_GPIO_CFG && reg <= AC97_MISC_AFE)
+   wm->gpio[(reg - AC97_GPIO_CFG) >> 1] = val;
+
+   /* wm9713 irq reg */
+   if (reg == 0x5a)
+   wm->misc = val;
+
+   if (wm->ac97)
+   wm->ac97->bus->ops->write(wm->ac97, reg, val);
+}

Re: oops when using git gc --auto

2008-02-26 Thread Nick Piggin
On Wednesday 27 February 2008 00:22, Otavio Salvador wrote:
> Hello,
>
> Today I got this oops, someone has an idea of what's going wrong?
>
> Unable to handle kernel paging request at 0200 RIP:
>  [] find_get_pages+0x3c/0x69

At this point, the most likely candidate is a memory corruption
error, probably hardware. Can you run memtest86 for a few hours
to get a bit more confidence in the hw (preferably overnight)?

I did recently see another quite similar corruption in the
pagecache radix-tree, though. Coincidence maybe?

> PGD 0
> Oops:  [1] SMP
> CPU 3
> Modules linked in: sha256_generic aes_generic aes_x86_64 cbc blkcipher
> nvidia(P) rfcomm l2cap bluetooth ac battery ipv6 nfs lockd nfs_acl sunrpc
> bridge ext2 mbcache dm_crypt tun kvm_intel kvm loop snd_usb_audio
> snd_usb_lib snd_rawmidi snd_hda_intel e1000e i2c_i801 serio_raw
> snd_seq_device snd_pcm intel_agp button snd_timer pcspkr psmouse snd_hwdep
> snd snd_page_alloc soundcore evdev i2c_core xfs dm_mirror dm_snapshot
> dm_mod raid0 md_mod sg sr_mod cdrom sd_mod usbhid hid usb_storage
> pata_marvell floppy ahci ata_generic libata scsi_mod ehci_hcd uhci_hcd
> thermal processor fan Pid: 15684, comm: git Tainted: P   
> 2.6.24-1-amd64 #1
> RIP: 0010:[]  []
> find_get_pages+0x3c/0x69 RSP: 0018:8100394dfd98  EFLAGS: 00010097
> RAX: 0009 RBX: 000e RCX: 0009
> RDX: 0200 RSI: 000a RDI: 0040
> RBP: 810042964350 R08: 0040 R09: 000a
> R10: 8100425a06c8 R11: 000a R12: 000e
> R13: 8100394dfdf8 R14: 810042964350 R15: 
> FS:  2ae326df2190() GS:81007d7aeb40()
> knlGS: CS:  0010 DS:  ES:  CR0: 8005003b
> CR2: 0200 CR3: 358f9000 CR4: 26e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process git (pid: 15684, threadinfo 8100394de000, task
> 8100359cd800) Stack:  000d 8100394dfde8
> 000d 000e 000e 802794d6
> 8100014a7768 80279b04  
>   Call Trace:
>  [] pagevec_lookup+0x17/0x1e
>  [] truncate_inode_pages_range+0x108/0x2bd
>  [] generic_delete_inode+0xbf/0x127
>  [] do_unlinkat+0xd5/0x144
>  [] sys_write+0x45/0x6e
>  [] system_call+0x7e/0x83
>
>
> Code: 48 8b 02 25 00 40 02 00 48 3d 00 40 02 00 75 04 48 8b 52 10
> RIP  [] find_get_pages+0x3c/0x69
>  RSP 
> CR2: 0200
> ---[ end trace cb43a9f4488b815a ]---

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.25-rc3: "__divdi3" [drivers/crypto/hifn_795x.ko] undefined!

2008-02-26 Thread Martin Michlmayr
* Patrick McHardy <[EMAIL PROTECTED]> [2008-02-26 13:28]:
>> I get the following build error on at least ARM and MIPS:
>>   Building modules, stage 2.
>>   MODPOST 759 modules
>> ERROR: "__divdi3" [drivers/crypto/hifn_795x.ko] undefined!
>
> Does this patch fix it?

Nope.
-- 
Martin Michlmayr
http://www.cyrius.com/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal

2008-02-26 Thread David Newall
David Brownell wrote:
> On Tuesday 26 February 2008, David Newall wrote:
>   
>> Hardware can be inserted and removed while we're in a suspend state; and
>> there's nothing that we can do about it until we resume.  Is it fair to
>> say, then, that having started suspend, we could reasonably ignore any
>> device insertion and removal, and handle it on resume?
>> 
>
> "Ignore" seems a bit strong; those events may be wakeup triggers,
> which would cause the hardware to make it a very short suspend state.
>
> "Defer handling" is more to the point, be it by hardware or software.
>
>   

Of course, "defer".  The insertion has to be handled eventually.  What
I'm wondering is if we can ignore it, and catch it on the resume.


>> Presumably we need to scan for hardware changes on resume.
>> 
>
> Not on most busses I work with; the hardware issues notifications
> whenever the devices are removable.
>   

There's no notification while we're suspended.  Isn't it necessary to
scan all busses on resume, just to know what's on them?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: regression: CD burning (k3b) went broke

2008-02-26 Thread Mike Galbraith

On Tue, 2008-02-26 at 10:48 +0100, Mike Galbraith wrote:
> Greetings,
> 
> I straced both a good and a bad kernel (good being .git with attached
> revert patch applied) and filtered/diffed/merged the output.  Scroll
> down to "HERE" to see the problem (resid).
> 
> I'm poking around, but not having much luck.

Seems the problem is data_len changes, but raw_data_len doesn't.  I've
not the foggiest IO-land clue, but k3b works again, so the below may
have some diagnostic value.

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ba21d97..7a6f784 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
scsi_end_bidi_request(cmd);
return;
}
-   req->data_len = scsi_get_resid(cmd);
+   req->data_len = req->raw_data_len = scsi_get_resid(cmd);
}
 
BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet */

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


oops when using git gc --auto

2008-02-26 Thread Otavio Salvador
Hello,

Today I got this oops, someone has an idea of what's going wrong?

Unable to handle kernel paging request at 0200 RIP: 
 [] find_get_pages+0x3c/0x69
PGD 0 
Oops:  [1] SMP 
CPU 3 
Modules linked in: sha256_generic aes_generic aes_x86_64 cbc blkcipher 
nvidia(P) rfcomm l2cap bluetooth ac battery ipv6 nfs lockd nfs_acl sunrpc 
bridge ext2 mbcache dm_crypt tun kvm_intel kvm loop snd_usb_audio snd_usb_lib 
snd_rawmidi snd_hda_intel e1000e i2c_i801 serio_raw snd_seq_device snd_pcm 
intel_agp button snd_timer pcspkr psmouse snd_hwdep snd snd_page_alloc 
soundcore evdev i2c_core xfs dm_mirror dm_snapshot dm_mod raid0 md_mod sg 
sr_mod cdrom sd_mod usbhid hid usb_storage pata_marvell floppy ahci ata_generic 
libata scsi_mod ehci_hcd uhci_hcd thermal processor fan
Pid: 15684, comm: git Tainted: P2.6.24-1-amd64 #1
RIP: 0010:[]  [] find_get_pages+0x3c/0x69
RSP: 0018:8100394dfd98  EFLAGS: 00010097
RAX: 0009 RBX: 000e RCX: 0009
RDX: 0200 RSI: 000a RDI: 0040
RBP: 810042964350 R08: 0040 R09: 000a
R10: 8100425a06c8 R11: 000a R12: 000e
R13: 8100394dfdf8 R14: 810042964350 R15: 
FS:  2ae326df2190() GS:81007d7aeb40() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0200 CR3: 358f9000 CR4: 26e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process git (pid: 15684, threadinfo 8100394de000, task 8100359cd800)
Stack:  000d 8100394dfde8 000d 000e
 000e 802794d6 8100014a7768 80279b04
    
Call Trace:
 [] pagevec_lookup+0x17/0x1e
 [] truncate_inode_pages_range+0x108/0x2bd
 [] generic_delete_inode+0xbf/0x127
 [] do_unlinkat+0xd5/0x144
 [] sys_write+0x45/0x6e
 [] system_call+0x7e/0x83


Code: 48 8b 02 25 00 40 02 00 48 3d 00 40 02 00 75 04 48 8b 52 10 
RIP  [] find_get_pages+0x3c/0x69
 RSP 
CR2: 0200
---[ end trace cb43a9f4488b815a ]---

-- 
O T A V I OS A L V A D O R
-
 E-mail: [EMAIL PROTECTED]  UIN: 5906116
 GNU/Linux User: 239058 GPG ID: 49A5F855
 Home Page: http://otavio.ossystems.com.br
-
"Microsoft sells you Windows ... Linux gives
 you the whole house."
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Please, put 64-bit counter per task and incr.by.one each ctxt switch.

2008-02-26 Thread J.C. Pizarro
On 2008/2/25, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Sun, 24 Feb 2008 14:12:47 +0100 "J.C. Pizarro" <[EMAIL PROTECTED]> wrote:
>
>  > It's statistic, yes, but it's a very important parameter for the 
> CPU-scheduler.
>  > The CPU-scheduler will know the number of context switches of each task
>  >  before of to take a blind decision into infinitum!.
>
>
> We already have these:
>
> unsigned long nvcsw, nivcsw; /* context switch counts */
>
>  in the task_struct.

1. They use "unsigned long" instead "unsigned long long".
2. They use "= 0;" instead of "= 0ULL";
3. They don't use ++ (incr. by one per ctxt-switch).
4. I don't like the separation of voluntary and involuntary ctxt-switches,
and i don't understand the utility of this separation.

The tsk->nvcsw & tsk->nivcsw mean different to i had proposed.

It's simple, when calling to function kernel/sched.c:context_switch(..)
to do ++, but they don't do it.

I propose you
1. unsigned long long tsk->ncsw = 0ULL;  and  tsk->ncsw++;
2. unsigned long long tsk->last_registered_ncsw = tsk->ncsw; when it's polling.
3. long tsk->vcsw =  ( tsk->ncsw - tsk->last_registered_ncsw ) / ( t2 - t1 )
/* velocity of task (ctxt-switches per second), (t1 != t2 in seconds
for no zerodiv)
4. long tsk->last_registered_vcsw = tsk->vcsw;
5. long tsk->normalized_vcsw =
   (1 - alpha)*tsk->last_registered_vcsw + alpha*tsk->vcsw; /* 0http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Compress kernel modules on installation.

2008-02-26 Thread Willy Tarreau
On Tue, Feb 26, 2008 at 02:30:16PM +0200, Adrian Bunk wrote:
> On Tue, Feb 26, 2008 at 11:22:03AM +0100, Willy Tarreau wrote:
> > On Tue, Feb 26, 2008 at 11:14:55AM +0200, Adrian Bunk wrote:
> > > On Mon, Feb 25, 2008 at 11:21:38PM +0100, Willy Tarreau wrote:
> >...
> > > > Have you tried keeping the module names intact (.ko, not .ko.gz) ?
> > > > It's what I was doing with modutils in 2.4 and what I'm still doing
> > > > with module-init-tools in 2.6. While I don't particularly use mkinitrd,
> > > > I think that keeping the name intact is preferable and should help.
> > > 
> > > How would you see if, and if yes with what program, a module was 
> > > compressed if the name is kept intact?
> > 
> > depmod/modinfo/insmod/modprobe already know it. And quite honnestly,
> > I don't know about any other program which really needs to process
> > those files once installed. Well, maybe ksymoops, but I'd have to
> > check, as I don't recall having ever been annoyed with this.
> >...
> 
> depmod/modinfo/insmod/modprobe know only if you compile 
> module-init-tools with zlib support.
> 
> And what about the busybox versions?
> 
> A different name would e.g.:
> - easily allow proper error handling if the userspace modules program 
>   doesn't support the compression used
> - better scale to support additional compressions
> - give the user a hint what is happening and what might be the problem
>   when anythig goes wrong

I agree, but right now module.dep references existing files with their
real names. Maybe something should define exactly what it should contain
(eg: module_name.ko even if .ko.gz is used) so that all tools relying on
it do not stop after noticing that the file referenced there does not
exist.

regards,
Willy

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/28] mm: add support for non block device backed swap files

2008-02-26 Thread Peter Zijlstra

On Tue, 2008-02-26 at 13:45 +0100, Miklos Szeredi wrote:
> Starting review in the middle, because this is the part I'm most
> familiar with.
> 
> > New addres_space_operations methods are added:
> >   int swapfile(struct address_space *, int);
> 
> Separate ->swapon() and ->swapoff() methods would be so much cleaner IMO.

I'm ok with that, but its a_ops bloat, do we care about that? I guess
since it has limited instances - typically one per filesystem - there is
no issue here.

> Also is there a reason why 'struct file *' cannot be supplied to these
> functions?

No real reason here. I guess its cleaner indeed. Thanks.

> > +int swap_set_page_dirty(struct page *page)
> > +{
> > +   struct swap_info_struct *sis = page_swap_info(page);
> > +
> > +   if (sis->flags & SWP_FILE) {
> > +   const struct address_space_operations *a_ops =
> > +   sis->swap_file->f_mapping->a_ops;
> > +   int (*spd)(struct page *) = a_ops->set_page_dirty;
> > +#ifdef CONFIG_BLOCK
> > +   if (!spd)
> > +   spd = __set_page_dirty_buffers;
> > +#endif
> 
> This ifdef is not really needed.  Just require ->set_page_dirty() be
> filled in by filesystems which want swapfiles (and others too, in the
> longer term, the fallback is just historical crud).

Agreed. This is a good motivation to clean up that stuff.

> Here's an incremental patch addressing these issues and beautifying
> the new code.

Thanks, I'll fold it into the patch and update the documentation. I'll
put your creds in akpm style.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Print long messages to console from kernel module

2008-02-26 Thread Arvid Brodin
On 2008-02-25 23:27, linux-os (Dick Johnson) wrote:
> On Mon, 25 Feb 2008, Arvid Brodin wrote:
> 
>> I need to write messages > 1023 characters long to the console from a 
>> module*. printk() is limited to 1023 characters, and splitting the message 
>> over several printk()'s results in a line break and "Month hh:mm:ss host 
>> kernel:" being inserted in my text.
>>
>> I tried including  and using the console_drivers declared 
>> there, but get
>> "WARNING: "console_drivers" [/log.ko] undefined!" when compiling and
>> "insmod: error inserting 'log.ko': -1 Unknown symbol in module" when 
>> insmodding.
>>
>> I guess this is because non EXPORT_SYMBOL'd symbols are only accessible to 
>> statically linked code, and not to modules? I see in printk.c that 
>> console_drivers is set up there, and I haven't been able to find any other 
>> interface to console_drivers.
>>
>> In short: is there any way to print messages to the console from a kernel 
>> module, except printk()? Is opening /dev/tty and writing to it the way to go?
>>
>>
>> * I'm writing an in-memory logger to be included in a module. The log can be 
>> several megabytes. The idea is to use SysRq to print the contents of the log 
>> to console after a kernel panic or otherwise when writing to disk might not 
>> work.
>>
> 
> Write the data to a kernel buffer. Impliment read() or ioctl() and
> poll(). Have a user-mode task sleep in poll, waiting for data to
> become available. That user-mode task can do anything it wants,
> unrestricted, with the data including writing it to files or any
> tty it wants to open.

Thank you for your answer. However, I don't see how a user-mode task will help 
me print my log after a kernel panic, through SysRq? Please clarify.

What we want is essentially a replacement for printk(), where the messages are 
instead logged in a big ring buffer, and can be printed with Alt-SysRq-l when 
need be. And the problem is the actual printing of the buffer to the console, 
since printk() inserts its timestamp after every linebreak or 1023 characters, 
whichever comes first.

-- 
Arvid Brodin
Enea LCC

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: GAK!!!! Re: PCI: AMD SATA IDE mode quirk

2008-02-26 Thread Alan Cox
> I agree.  I [obviously] missed this when I ack'd, mainly ack'ing the 
> overall change.
> 
> BIOS certainly may modify that PCI config register, but that's before 
> the kernel boots.  So, using pdev->class is fine.

I don't think the resume quirk is needed either as the core PCI
save/restore code rewrites the PCI registers so should be rewriting the
class back on resume as well.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Doesn't work in 2.6.24.3 either - Was: Re: cd/dvd inaccessible in 2.6.24-rc2

2008-02-26 Thread Felix Homann

Hi,

it's still an issue in 2.6.24.3. Syslog looks just like in 2.6.24.2, 
look here:


http://lkml.org/lkml/2008/2/21/241

Latest kernel I've tried with working CD/DVD access was 2.6.23.14.

Kind regards,

Felix

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] Disk shock protection (revisited)

2008-02-26 Thread Alan Cox
> The general idea: A daemon running in user space monitors input data
> from an accelerometer. When the daemon detects a critical condition,

That sounds like a non starter. What if the box is busy, what if the
daemon or something you touch needs memory and causes paging ?

Given the accelerometer data should be very simple doesn't it actually
make sense in this specific case to put the logic (not thresholds) in
kernel space.

> state. To this end, the kernel has to issue an idle immediate command
> with unload feature and stop the block layer queue afterwards. Once the

Yep. Pity the worst case completion time for an IDE I/O is 60 seconds or
so.

> 1. Who is to be in charge for the shock protection application? Should
>userspace speak to libata / ide directly (through sysfs) and the low

I think it has to be kernel side for speed, and because you will need to
issue idle immediate while a command sequence is active which is
*extremely* hairy as you have to recover from the mess and restart the
relevant I/O. Plus you may need controller specific knowledge on issuing
it (and changes to libata).

> 2. Depending on the answer to the previous question, by what mechanism
>should block layer and lld interact? Special requests, queue hooks or
>something in some way similar to power management functions (once
>suggested by James Bottomley)?

Idle immediate seem to simply fit the queue model, it happens in
*parallel* to I/O events and is special in all sorts of ways.

> 3. What is the preferred way to pass device specific configuration
>options to libata (preferrably at runtime, i.e., after module
>loading)?

sysfs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/28] mm: add support for non block device backed swap files

2008-02-26 Thread Miklos Szeredi
Starting review in the middle, because this is the part I'm most
familiar with.

> New addres_space_operations methods are added:
>   int swapfile(struct address_space *, int);

Separate ->swapon() and ->swapoff() methods would be so much cleaner IMO.

Also is there a reason why 'struct file *' cannot be supplied to these
functions?

[snip]

> +int swap_set_page_dirty(struct page *page)
> +{
> + struct swap_info_struct *sis = page_swap_info(page);
> +
> + if (sis->flags & SWP_FILE) {
> + const struct address_space_operations *a_ops =
> + sis->swap_file->f_mapping->a_ops;
> + int (*spd)(struct page *) = a_ops->set_page_dirty;
> +#ifdef CONFIG_BLOCK
> + if (!spd)
> + spd = __set_page_dirty_buffers;
> +#endif

This ifdef is not really needed.  Just require ->set_page_dirty() be
filled in by filesystems which want swapfiles (and others too, in the
longer term, the fallback is just historical crud).

Here's an incremental patch addressing these issues and beautifying
the new code.

Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>

Index: linux/mm/page_io.c
===
--- linux.orig/mm/page_io.c 2008-02-26 11:15:58.0 +0100
+++ linux/mm/page_io.c  2008-02-26 13:40:55.0 +0100
@@ -106,8 +106,10 @@ int swap_writepage(struct page *page, st
}
 
if (sis->flags & SWP_FILE) {
-   ret = sis->swap_file->f_mapping->
-   a_ops->swap_out(sis->swap_file, page, wbc);
+   struct file *swap_file = sis->swap_file;
+   struct address_space *mapping = swap_file->f_mapping;
+
+   ret = mapping->a_ops->swap_out(swap_file, page, wbc);
if (!ret)
count_vm_event(PSWPOUT);
return ret;
@@ -136,12 +138,13 @@ void swap_sync_page(struct page *page)
struct swap_info_struct *sis = page_swap_info(page);
 
if (sis->flags & SWP_FILE) {
-   const struct address_space_operations *a_ops =
-   sis->swap_file->f_mapping->a_ops;
-   if (a_ops->sync_page)
-   a_ops->sync_page(page);
-   } else
+   struct address_space *mapping = sis->swap_file->f_mapping;
+
+   if (mapping->a_ops->sync_page)
+   mapping->a_ops->sync_page(page);
+   } else {
block_sync_page(page);
+   }
 }
 
 int swap_set_page_dirty(struct page *page)
@@ -149,17 +152,12 @@ int swap_set_page_dirty(struct page *pag
struct swap_info_struct *sis = page_swap_info(page);
 
if (sis->flags & SWP_FILE) {
-   const struct address_space_operations *a_ops =
-   sis->swap_file->f_mapping->a_ops;
-   int (*spd)(struct page *) = a_ops->set_page_dirty;
-#ifdef CONFIG_BLOCK
-   if (!spd)
-   spd = __set_page_dirty_buffers;
-#endif
-   return (*spd)(page);
-   }
+   struct address_space *mapping = sis->swap_file->f_mapping;
 
-   return __set_page_dirty_nobuffers(page);
+   return mapping->a_ops->set_page_dirty(page);
+   } else {
+   return __set_page_dirty_nobuffers(page);
+   }
 }
 
 int swap_readpage(struct file *file, struct page *page)
@@ -172,8 +170,10 @@ int swap_readpage(struct file *file, str
BUG_ON(PageUptodate(page));
 
if (sis->flags & SWP_FILE) {
-   ret = sis->swap_file->f_mapping->
-   a_ops->swap_in(sis->swap_file, page);
+   struct file *swap_file = sis->swap_file;
+   struct address_space *mapping = swap_file->f_mapping;
+
+   ret = mapping->a_ops->swap_in(swap_file, page);
if (!ret)
count_vm_event(PSWPIN);
return ret;
Index: linux/include/linux/fs.h
===
--- linux.orig/include/linux/fs.h   2008-02-26 11:15:58.0 +0100
+++ linux/include/linux/fs.h2008-02-26 13:29:40.0 +0100
@@ -485,7 +485,8 @@ struct address_space_operations {
/*
 * swapfile support
 */
-   int (*swapfile)(struct address_space *, int);
+   int (*swapon)(struct file *file);
+   int (*swapoff)(struct file *file);
int (*swap_out)(struct file *file, struct page *page,
struct writeback_control *wbc);
int (*swap_in)(struct file *file, struct page *page);
Index: linux/mm/swapfile.c
===
--- linux.orig/mm/swapfile.c2008-02-26 12:43:57.0 +0100
+++ linux/mm/swapfile.c 2008-02-26 13:34:57.0 +0100
@@ -1014,9 +1014,11 @@ static void destroy_swap_extents(struct 
}
 
if (sis->flags & SWP_FILE) {
+   struct file *swap_file = 

Re: [PATCH] 2.6.25-rc2-mm1 - fix mcount GPL bogosity.

2008-02-26 Thread Alan Cox
> I don't know who told you that or why, but it's obvious nonsense, as this
> issue shows. Exports should be marked GPL if and only if they cannot be used
> except in a derivative work. If it is possible to use them without taking
> sufficient protectable expression, they should not be marked GPL.
> 
> This was what everyone agreed to when GPL exports were created.

No it wasn't.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH try #1] Kconfig: cleanup block/Kconfig.iosched help descriptions

2008-02-26 Thread Nick Andrew
Modify the help descriptions of block/Kconfig.iosched for clarity, accuracy and 
consistency.

More information is added to each of the I/O scheduler choices and they
are also reordered to improve the flow of information to the user.

IOSCHED_CFQ is the default because it distributes the bandwidth fairly.
It's also the place I decided to add the most help text, because it seems
not possible to add help text to the menu "IO Schedulers" itself.

So IOSCHED_CFQ is annotated with two things. Firstly a hint about how
to use the prioritisation at runtime, i.e. through the ionice(1) command.
Secondly, a reference to Documentation/block/switching-sched.txt to
show how to switch the scheduler for block devices at runtime, and/or
choose a new scheduler via the 'elevator=' kernel parameter.

These schedulers can be compiled as modules (except for noop-iosched)
so this is mentioned in each case. However the Kconfig won't allow a
scheduler which is built as a module to be chosen as the default
scheduler, and I noted that.

The boilerplate "If unsure, say Y" was added to IOSCHED_CFQ, IOSCHED_AS
and IOSCHED_DEADLINE.

A 1-line help description was added to each of the 4 choices under
"Default I/O scheduler". It won't explain anything, but it's friendlier
than saying nothing at all.


Signed-off-by: Nick Andrew <[EMAIL PROTECTED]>
---
Questions: Why can't a module be chosen as the default scheduler? The
elevator_get() function in block/elevator.c seems to support it. Is
it that the kernel won't be able to load the module off disk before
it has a scheduler loaded?

Also for people who want to compile only the bare minimum in a kernel
it would seem reasonable to allow IOSCHED_NOOP to be turned off. Is
the lack of a setting here another safety feature? elevator.c certainly
assumes that 'noop' is available:

if (!e) {
e = elevator_get(CONFIG_DEFAULT_IOSCHED);
if (!e) {
printk(KERN_ERR
"Default I/O scheduler not found. " \
"Using noop.\n");
e = elevator_get("noop");
}
}

(note lack of a further test for !e)

Finally my understanding is that USB storage devices, for example,
have constant access time, and so use of an elevator algorithm won't
benefit anything. For these devices why can't the kernel choose "noop"
instead of the default and save the time/space used by the default?


 block/Kconfig.iosched |   74 ++---
 1 files changed, 58 insertions(+), 16 deletions(-)


diff --git a/block/Kconfig.iosched b/block/Kconfig.iosched
index 7e803fc..96a01b3 100644
--- a/block/Kconfig.iosched
+++ b/block/Kconfig.iosched
@@ -2,15 +2,28 @@ if BLOCK
 
 menu "IO Schedulers"
 
-config IOSCHED_NOOP
-   bool
+config IOSCHED_CFQ
+   tristate "CFQ I/O scheduler"
default y
---help---
- The no-op I/O scheduler is a minimal scheduler that does basic merging
- and sorting. Its main uses include non-disk based block devices like
- memory devices, and specialised software or hardware environments
- that do their own scheduling and require only minimal assistance from
- the kernel.
+ The CFQ I/O scheduler tries to distribute bandwidth equally
+ among all processes in the system. It should provide a fair
+ working environment, suitable for desktop systems.
+
+ This is the default I/O scheduler.
+
+ The CFQ I/O scheduler supports prioritisation for individual
+ processes; see ionice(1) for details.
+
+ See  for details
+ on how to choose a default I/O scheduler at boot time and also
+ on a per-device basis at run time.
+
+ To compile this scheduler as a module, choose M here; the module
+ will be called cfq-iosched. A scheduler built as a module cannot
+ be chosen as default.
+
+ If unsure, say Y.
 
 config IOSCHED_AS
tristate "Anticipatory I/O scheduler"
@@ -21,6 +34,15 @@ config IOSCHED_AS
  deadline I/O scheduler, it can also be slower in some cases
  especially some database loads.
 
+ See  for detailed
+ information on this scheduler.
+
+ To compile this scheduler as a module, choose M here; the module
+ will be called as-iosched. A scheduler built as a module cannot
+ be chosen as default.
+
+ If unsure, say Y.
+
 config IOSCHED_DEADLINE
tristate "Deadline I/O scheduler"
default y
@@ -31,14 +53,26 @@ config IOSCHED_DEADLINE
  a disk at any one time, its behaviour is almost identical to the
  anticipatory I/O scheduler and so is a good choice.
 
-config IOSCHED_CFQ
-   tristate "CFQ I/O scheduler"
+ See  for detailed
+ information on this scheduler.
+
+ To compile this scheduler as a module, choose M here; the module
+ will be 

Re: SMACK or SELinux, but not both

2008-02-26 Thread Stephen Smalley

On Tue, 2008-02-26 at 20:28 +1100, James Morris wrote:
> On Tue, 26 Feb 2008, Alexey Dobriyan wrote:
> 
> > If SELinux is registered before SMACK, SMACK panics after
> > register_security() call.
> > 
> > If SMACK is registered before SELinux, SELinux panics after
> > register_security() call.
> > 
> > Consequently allmodconfig kernel doesn't boot. It would be nice if
> > some Kconfig magic to exclude each other will be in place.
> 
> People want to be able to select the security model at boot time, so the 
> option to build both LSMs is required.
> 
> You can stop SELinux from attempting to register as an LSM via selinux=0, 
> which should allow you to boot with just Smack enabled.

Ideally, one could just boot with security= to select the
desired primary security module.  security=smack, security=selinux, or
security=capability.

Having to specify selinux=0 smack=0 foo=0 just to get bar wouldn't be
pretty.  Not that anyone would want to do that, of course...

-- 
Stephen Smalley
National Security Agency

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] video: limit stack usage of ir-kbd-i2c.c

2008-02-26 Thread Jean Delvare
Hi Marcin,

On Mon, 25 Feb 2008 21:51:00 +0100, Marcin Slusarz wrote:
> ir_probe allocated struct i2c_client on stack;
> it's pretty big structure, so allocate it with kzalloc
> 
> make checkstack output without this patch:
> x059d ir_probe [ir-kbd-i2c]:   1000
> 
> compile tested only
> 
> Signed-off-by: Marcin Slusarz <[EMAIL PROTECTED]>
> Cc: Mauro Carvalho Chehab <[EMAIL PROTECTED]>
> Cc: Jean Delvare <[EMAIL PROTECTED]>
> ---
>  drivers/media/video/ir-kbd-i2c.c |   18 +++---
>  1 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/media/video/ir-kbd-i2c.c 
> b/drivers/media/video/ir-kbd-i2c.c
> index 9851987..aec122f 100644
> --- a/drivers/media/video/ir-kbd-i2c.c
> +++ b/drivers/media/video/ir-kbd-i2c.c
> @@ -510,9 +510,9 @@ static int ir_probe(struct i2c_adapter *adap)
>   static const int probe_cx88[] = { 0x18, 0x6b, 0x71, -1 };
>   static const int probe_cx23885[] = { 0x6b, -1 };
>   const int *probe = NULL;
> - struct i2c_client c;
> + struct i2c_client *c;
>   unsigned char buf;
> - int i,rc;
> + int i, rc;
>  
>   switch (adap->id) {
>   case I2C_HW_B_BT848:
> @@ -537,19 +537,23 @@ static int ir_probe(struct i2c_adapter *adap)
>   if (NULL == probe)
>   return 0;
>  
> - memset(,0,sizeof(c));
> - c.adapter = adap;
> + c = kzalloc(sizeof(*c), GFP_KERNEL);
> + if (!c)
> + return -ENOMEM;
> +
> + c->adapter = adap;
>   for (i = 0; -1 != probe[i]; i++) {
> - c.addr = probe[i];
> - rc = i2c_master_recv(,,0);
> + c->addr = probe[i];
> + rc = i2c_master_recv(c, , 0);
>   dprintk(1,"probe 0x%02x @ %s: %s\n",
>   probe[i], adap->name,
>   (0 == rc) ? "yes" : "no");
>   if (0 == rc) {
> - ir_attach(adap,probe[i],0,0);
> + ir_attach(adap, probe[i], 0, 0);
>   break;
>   }
>   }
> + kfree(c);
>   return 0;
>  }
>  

While this works, I'd rather change the code to call i2c_transfer()
instead of i2c_master_recv(). i2c_transfer() is meant exactly for this
case (no i2c_client at hand.) This solves the stack usage problem
without requiring a temporary memory allocation:

* * * * *

Limit stack usage in ir_probe by calling i2c_transfer, which doesn't
require a struct i2c_client, instead of i2c_master_recv which does.

Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
---
 drivers/media/video/ir-kbd-i2c.c |   17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

--- linux-2.6.25-rc3.orig/drivers/media/video/ir-kbd-i2c.c  2008-02-26 
11:35:51.0 +0100
+++ linux-2.6.25-rc3/drivers/media/video/ir-kbd-i2c.c   2008-02-26 
11:44:54.0 +0100
@@ -510,8 +510,11 @@ static int ir_probe(struct i2c_adapter *
static const int probe_cx88[] = { 0x18, 0x6b, 0x71, -1 };
static const int probe_cx23885[] = { 0x6b, -1 };
const int *probe = NULL;
-   struct i2c_client c;
-   unsigned char buf;
+   struct i2c_msg msg = {
+   .flags = I2C_M_RD,
+   .len = 0,
+   .buf = NULL,
+   };
int i,rc;
 
switch (adap->id) {
@@ -537,15 +540,13 @@ static int ir_probe(struct i2c_adapter *
if (NULL == probe)
return 0;
 
-   memset(,0,sizeof(c));
-   c.adapter = adap;
for (i = 0; -1 != probe[i]; i++) {
-   c.addr = probe[i];
-   rc = i2c_master_recv(,,0);
+   msg.addr = probe[i];
+   rc = i2c_transfer(adap, , 1);
dprintk(1,"probe 0x%02x @ %s: %s\n",
probe[i], adap->name,
-   (0 == rc) ? "yes" : "no");
-   if (0 == rc) {
+   (1 == rc) ? "yes" : "no");
+   if (1 == rc) {
ir_attach(adap,probe[i],0,0);
break;
}


Built-tested, I've also tested loading the ir-kbd-i2c driver on an
unsupported cx88 adapter. Review and more testing welcome.

-- 
Jean Delvare
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Compress kernel modules on installation.

2008-02-26 Thread Adrian Bunk
On Tue, Feb 26, 2008 at 11:22:03AM +0100, Willy Tarreau wrote:
> On Tue, Feb 26, 2008 at 11:14:55AM +0200, Adrian Bunk wrote:
> > On Mon, Feb 25, 2008 at 11:21:38PM +0100, Willy Tarreau wrote:
>...
> > > Have you tried keeping the module names intact (.ko, not .ko.gz) ?
> > > It's what I was doing with modutils in 2.4 and what I'm still doing
> > > with module-init-tools in 2.6. While I don't particularly use mkinitrd,
> > > I think that keeping the name intact is preferable and should help.
> > 
> > How would you see if, and if yes with what program, a module was 
> > compressed if the name is kept intact?
> 
> depmod/modinfo/insmod/modprobe already know it. And quite honnestly,
> I don't know about any other program which really needs to process
> those files once installed. Well, maybe ksymoops, but I'd have to
> check, as I don't recall having ever been annoyed with this.
>...

depmod/modinfo/insmod/modprobe know only if you compile 
module-init-tools with zlib support.

And what about the busybox versions?

A different name would e.g.:
- easily allow proper error handling if the userspace modules program 
  doesn't support the compression used
- better scale to support additional compressions
- give the user a hint what is happening and what might be the problem
  when anythig goes wrong

> Regards,
> Willy

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem)

2008-02-26 Thread Robin Holt
> > That is it.  That is all our allowed interaction with the users process.
> 
> OK, when you said something along the lines of "the MPT library has
> control of the comm buffer", then I assumed it was an area of virtual
> memory which is set up as part of initialization, rather than during
> runtime. I guess I jumped to conclusions.

There are six regions the MPT library typically makes.  The most basic
one is a fixed size.  It describes the MPT internal buffers, the stack,
the heap, the application text, and finally the entire address space.
That last region is seldom used.  MPT only has control over the first
two.

> > That doesn't seem too unreasonable, except when you compare it to how the
> > driver currently works.  Remember, this is done from a library which has
> > no insight into what the user has done to its own virtual address space.
> > As a result, each MPI_Send() would result in a system call (or we would
> > need to have a set of callouts for changes to a processes VMAs) which
> > would be a significant increase in communication overhead.
> >
> > Maybe I am missing what you intend to do, but what we need is a means of
> > tracking one processes virtual address space changes so other processes
> > can do direct memory accesses without the need for a system call on each
> > communication event.
> 
> Yeah it's tricky. BTW. what is the performance difference between
> having a system call or no?

The system call takes many microseconds and still requires the same
latency of the communication.  Without it, our latency is
usually below two microseconds.

> > > Because you don't need to swap, you don't need coherency, and you
> > > are in control of the areas, then this seems like the best choice.
> > > It would allow you to use heap, stack, file-backed, anything.
> >
> > You are missing one point here.  The MPI specifications that have
> > been out there for decades do not require the process use a library
> > for allocating the buffer.  I realize that is a horrible shortcoming,
> > but that is the world we live in.  Even if we could change that spec,
> 
> Can you change the spec? Are you working on it?

Even if we changed the spec, the old specs will continue to be
supported.  I personally am not involved.  Not sure if anybody else is
working this issue.

> > we would still need to support the existing specs.  As a result, the
> > user can change their virtual address space as they need and still expect
> > communications be cheap.
> 
> That's true. How has it been supported up to now? Are you using
> these kind of notifiers in patched kernels?

At fault time, we check to see if it is an anon or mspec vma.  We pin
the page an insert them.  The remote OS then losses synchronicity with
the owning processes page tables.  If an unmap, madvise, etc occurs the
page tables are updated without regard to our references.  Fork or exit
(fork is caught using an LD_PRELOAD library) cause the user pages to be
recalled from the remote side and put_page returns them to the kernel.
We have documented that this loss of synchronicity is due to their
action and not supported.  Essentially, we rely upon the application
being well behaved.  To this point, that has remainded true.

Thanks,
Robin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ofa-general] Re: [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem)

2008-02-26 Thread Robin Holt
On Tue, Feb 26, 2008 at 07:52:41PM +1100, Nick Piggin wrote:
> On Tuesday 26 February 2008 18:21, Gleb Natapov wrote:
> > On Tue, Feb 26, 2008 at 05:11:32PM +1100, Nick Piggin wrote:
> > > > You are missing one point here.  The MPI specifications that have
> > > > been out there for decades do not require the process use a library
> > > > for allocating the buffer.  I realize that is a horrible shortcoming,
> > > > but that is the world we live in.  Even if we could change that spec,
> > >
> > > Can you change the spec?
> >
> > Not really. It will break all existing codes.
> 
> I meant as in eg. submit changes to MPI-3
> 
> 
> > MPI-2 provides a call for 
> > memory allocation (and it's beneficial to use this call for some
> > interconnects), but many (most?) applications are still written for MPI-1
> > and those that are written for MPI-2 mostly uses the old habit of
> > allocating memory by malloc(), or even use stack or BSS memory for
> > communication buffer purposes.
> 
> OK, so MPI-2 already has some way to do that... I'm not saying that we
> can now completely dismiss the idea of using notifiers for this, but it
> is just a good data point to know.

It is in MPI-2, but MPI-2 does not prohibit communication from regions
not allocated by the MPI call.

Thanks,
Robin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.25-rc3: "__divdi3" [drivers/crypto/hifn_795x.ko] undefined!

2008-02-26 Thread Patrick McHardy

Martin Michlmayr wrote:

With 2.6.25-rc3 and a config file with

CONFIG_CRYPTO_DEV_HIFN_795X=m
CONFIG_CRYPTO_DEV_HIFN_795X_RNG=y

I get the following build error on at least ARM and MIPS:

  Building modules, stage 2.
  MODPOST 759 modules
ERROR: "__divdi3" [drivers/crypto/hifn_795x.ko] undefined!
  


Does this patch fix it?


diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
index 3110bf7..92c53ce 100644
--- a/drivers/crypto/hifn_795x.c
+++ b/drivers/crypto/hifn_795x.c
@@ -825,8 +825,8 @@ static int hifn_register_rng(struct hifn_device *dev)
/*
 * We must wait at least 256 Pk_clk cycles between two reads of the rng.
 */
-   dev->rng_wait_time  = DIV_ROUND_UP(NSEC_PER_SEC, dev->pk_clk_freq) *
- 256;
+   dev->rng_wait_time  = DIV_ROUND_UP((unsigned int)NSEC_PER_SEC,
+  dev->pk_clk_freq) * 256;
 
dev->rng.name   = dev->name;
dev->rng.data_present   = hifn_rng_data_present,


2.6.25-rc3: "__divdi3" [drivers/crypto/hifn_795x.ko] undefined!

2008-02-26 Thread Martin Michlmayr
With 2.6.25-rc3 and a config file with

CONFIG_CRYPTO_DEV_HIFN_795X=m
CONFIG_CRYPTO_DEV_HIFN_795X_RNG=y

I get the following build error on at least ARM and MIPS:

  Building modules, stage 2.
  MODPOST 759 modules
ERROR: "__divdi3" [drivers/crypto/hifn_795x.ko] undefined!

-- 
Martin Michlmayr
http://www.cyrius.com/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: problem with starting 2.5.26-rc1 and latest git

2008-02-26 Thread Jean Delvare
On Mon, 18 Feb 2008 14:28:32 +0100, Jean Delvare wrote:
> On Thu, 14 Feb 2008 00:27:34 +0100, Mariusz Kozlowski wrote:
> > Of course there is a typo in the subject :)
> > 
> > 2.5.25-rc1 -> 2.6.25-rc1
> > 
> > > Hello,
> > > 
> > >   I tried 2.6.25-rc1 and latest git on my laptop (x86 32bit) and have a 
> > > problem.
> > > Linux boots but with huge delay due to some issue with loading usb 
> > > modules.
> > > Udev complains:
> > > 
> > > 'Could not lock modprobe uhci_hcd'
> > > 'Could not lock modprobe yenta_socket'
> > > 'Unknown symbol usb_*'
> > > 'Gave up waiting for init of module usbcore'
> > > (...)
> 
> Have you tried upgrading to rc2? I used to have the same problem you
> reported, but I was unable to reproduce it since I upgraded to rc2.

I take this back. It happened to me again today, while I am now running
rc3.

-- 
Jean Delvare
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24.2-rt2

2008-02-26 Thread Jan Kiszka
Jan Kiszka wrote:
> At this chance: We still see the same unbalanced sched-other load on our
> NUMA box as Gernot once reported [1]:
> 
> top - 11:19:20 up 4 min,  1 user,  load average: 29.52, 9.54, 3.37
> Tasks: 502 total,  41 running, 461 sleeping,   0 stopped,   0 zombie
> Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu1  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu4  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu12 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu13 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu14 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu15 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:  65513284k total,  1032032k used, 64481252k free, 6444k buffers
> Swap:  3204896k total,0k used,  3204896k free,37312k cached
> 

ETOOMANYKERNELS, this was from 2.6.23.12-rt14. 2.6.24.2-rt2 shows a
different patter under identical load:

top - 12:55:27 up 2 min,  1 user,  load average: 9.97, 2.42, 0.81
Tasks: 491 total,  42 running, 449 sleeping,   0 stopped,   0 zombie
Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 99.7%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  65512480k total,   580704k used, 64931776k free, 8964k buffers
Swap:  3204896k total,0k used,  3204896k free,   129720k cached

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/28] Swap over NFS -v16

2008-02-26 Thread Peter Zijlstra

On Tue, 2008-02-26 at 11:50 +0100, Peter Zijlstra wrote:

> > mm-reserve.patch
> > 
> >I'm confused by __mem_reserve_add.
> > 
> > +   reserve = mem_reserve_root.pages;
> > +   __calc_reserve(res, pages, 0);
> > +   reserve = mem_reserve_root.pages - reserve;
> > 
> >__calc_reserve will always add 'pages' to mem_reserve_root.pages.
> >So this is a complex way of doing
> > reserve = pages;
> > __calc_reserve(res, pages, 0);
> > 
> > And as you can calculate reserve before calling __calc_reserve
> > (which seems odd when stated that way), the whole function looks
> > like it could become:
> > 
> >ret = adjust_memalloc_reserve(pages);
> >if (!ret)
> > __calc_reserve(res, pages, limit);
> >return ret;
> > 
> > What am I missing?
> 
> Probably the horrible twist my brain has. Looking at it makes me doubt
> my own sanity. I think you're right - it would also clean up
> __calc_reserve() a little.
> 
> This is what review for :-)

Ah, you confused me. Well, I confused me - this does deserve a comment
its tricksy.

Its correct. The trick is, the mem_reserve in question (res) need not be
connected to mem_reserve_root.

In that case, mem_reserve_root.pages will not change, but we do
propagate the change as far up as possible, so that
mem_reserve_connect() can just observe the parent and child without
being bothered by the rest of the hierarchy.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [xfs-masters] Re: filesystem corruption on xfs after 2.6.25-rc1 (bisected, powerpc related?)

2008-02-26 Thread Gaudenz Steinlin
On Tue, Feb 26, 2008 at 01:13:56AM +0100, Rafael J. Wysocki wrote:
> On Tuesday, 26 of February 2008, Christoph Hellwig wrote:
> > On Tue, Feb 26, 2008 at 12:52:56AM +0100, Rafael J. Wysocki wrote:
> > > > I'm not suggesting a partial revert; I just wonder which part of the
> > > > change is causing the problem, as part of the debugging process.

I debuged this a bit further by testing the 4 changed functions
individually. The problem only occurs with the new version of
xfs_lowbit64. 

Gaudenz

-- 
Ever tried. Ever failed. No matter.
Try again. Fail again. Fail better.
~ Samuel Beckett ~
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + rcu-split-listh-and-move-rcu-protected-lists-into-rculisth.patch added to -mm tree

2008-02-26 Thread Josh Triplett
[I did not see this patch go by on any mailing list, so I replied to
the -mm mail and CCed LKML.]

[EMAIL PROTECTED] wrote:
> The patch titled
>  rcu: split list.h and move rcu-protected lists into rculist.h
> has been added to the -mm tree.  Its filename is
>  rcu-split-listh-and-move-rcu-protected-lists-into-rculisth.patch
[...]
> Subject: rcu: split list.h and move rcu-protected lists into rculist.h
> From: Franck Bui-Huu <[EMAIL PROTECTED]>
> 
> Move rcu-protected lists from list.h into a new header file rculist.h.
> 
> This is done because list are a very used primitive structure all over the
> kernel and it's currently impossible to include other header files in this
> list.h without creating some circular dependencies.
> 
> For example, list.h implements rcu-protected list and uses rcu_dereference()
> without including rcupdate.h.  It actually compiles because users of
> rcu_dereference() are macros.  Others RCU functions could be used too but
> aren't probably because of this.
> 
> Therefore this patch creates rculist.h which includes rcupdates without to
> many changes/troubles.
> 
> Signed-off-by: Franck Bui-Huu <[EMAIL PROTECTED]>
> Acked-by: Paul E. McKenney <[EMAIL PROTECTED]>
> Cc: Josh Triplett <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>

This seems fine to me.  Having the headers separate might make it more
difficult to keep the two in sync, but the list primitives don't
change, so that doesn't really matter much.

Acked-by: Josh Triplett <[EMAIL PROTECTED]>

- Josh Triplett

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24.2: 4KSTACKS + pcdrw + dm + mount -> stack overflow: ide-cd related? dm-related?

2008-02-26 Thread Jiri Kosina
On Tue, 26 Feb 2008, Ingo Molnar wrote:

> > +   name = kmalloc(sizeof(char) * UDF_NAME_LEN, GFP_KERNEL);
> > +   fname = kmalloc(sizeof(char) * UDF_NAME_LEN, GFP_KERNEL);
> > +
> > +   if (!name || !fname) {
> > +   *err = -ENOMEM;
> > +   return NULL;
> > +   }
> > +
> > if (dentry) {
> > if (!dentry->d_name.len) {
> > *err = -EINVAL;
> this bit is missing i think:
>   if (name)
>   kfree(name);
>   if (fname)
>   kfree(fname);

Ergh, of course, stupid me, sorry, it should be freed on all exit paths. I 
am not sending updated patch, as Jan is probably working on complete 
removal of one of those fields ... ?

Thanks,

-- 
Jiri Kosina
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.25-rc2-mm1 - boot hangs on ia64

2008-02-26 Thread Ingo Molnar

* KOSAKI Motohiro <[EMAIL PROTECTED]> wrote:

> Fujitsu machine can't boot too. my bisect indicate git-sched.patch 
> cause regression too.

hm, that's a bit weird - nothing really should have broken it. Could you 
try to do a specific bisection of sched-devel.git:

   http://people.redhat.com/mingo/sched-devel.git/README

it's just a handful of commits so it should be relatively quick to 
figure out. My only guess would be:

  Subject: sched: make early bootup sched_clock() use safer

but i think this has been ruled out before ...

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


<    1   2   3   4   5   6   >