Re: [PATCH -next] ashmem: Fix ashmem_shrink deadlock.
On Thu, May 16, 2013 at 1:19 PM, Andrew Morton wrote:
> On Thu, 16 May 2013 13:08:17 -0400 Robert Love wrote:
>> This problem seems a rare proper use of mutex_trylock.
>
> Not really. The need for a trylock is often an indication that a
> subsystem has a locking misdesign. That is indeed the case here.

It is exactly the same as PF_MEMALLOC. We've got an effectively
asynchronous event (shrinking) that can occur while you are holding
locks requisite to that shrinking. Given that the shrinkage is best
effort, a trylock actually communicates the intent pretty well: "If
possible, grab this lock and shrink."

I think the idiomatic fix is to introduce a GFP_SHMEM but that seems
overkill. Lots of the GFP flags are really just preventing recursing
into the shrinkage code and it seems ill-designed that we require
developers to know where they might end up. But we can disagree. :)

> Well, it's not exactly a ton of work, but adding a per-ashmem_area lock
> to protect ->file would rather be putting lipstick on a pig. I suppose
> we can put the trylock in there and run away, but it wouldn't hurt to
> drop in a big fat comment somewhere explaining that the driver should be
> migrated to a per-object locking scheme.

Unfortunately I think ashmem_shrink would need to grab the per-object
lock too; it needs to update the ranges. I'm sure we could re-design
this but I don't think it is as easy as simply pushing the locking
into the objects.

Robert
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH -next] ashmem: Fix ashmem_shrink deadlock.
On Thu, May 16, 2013 at 12:45 PM, Andrew Morton wrote:
> A better approach would be to add a new __GFP_NOSHRINKERS, but it's all
> variations on a theme.

I don't like this proposal, either. Many of the existing GFP flags
already exist to prevent recursion into that flag's respective
shrinker.

This problem seems a rare proper use of mutex_trylock.

> The mutex_trylock(ashmem_mutex) will actually have the best
> performance, because it skips the least amount of memory reclaim
> opportunities.

Right.

> But it still sucks! The real problem is that there exists a lock
> called "ashmem_mutex", taken by both the high-level mmap() and by the
> low-level shrinker. And taken by everything else too! The ashmem
> locking is pretty crude...

The locking is "crude" because I optimized for space, not time, and
there was (and is) no indication we were suffering lock contention due
to the global lock. I haven't thought through the implications of
pushing locking into the ashmem_area and ashmem_range objects, but it
does look like we'd end up often grabbing all of the locks ...

> What is the mutex_lock() in ashmem_mmap() actually protecting? I don't
> see much, apart from perhaps some incidental races around the contents
> of the file's ashmem_area, and those could/should be protected by a
> per-object lock, not a global one?

... but not, as you note, in ashmem_mmap. The main race there is
around the allocation of asma->file. That could definitely be a lock
local to ashmem_area. I'm OK if anyone wants to take that on but it
seems a lot of work for a driver with an unclear future.

Robert
Re: [PATCH -next] ashmem: Fix ashmem_shrink deadlock.
On Thu, May 16, 2013 at 4:15 AM, Raul Xiong wrote:
> The issue happens in such sequence:
> ashmem_mmap acquired ashmem_mutex --> ashmem_mutex:shmem_file_setup
> called kmem_cache_alloc --> shrink due to low memory --> ashmem_shrink
> tries to acquire the same ashmem_mutex -- it blocks here.
>
> I think this reports the bug clearly. Please have a look.

There is no debate about the nature of the bug. Only the fix.

My mutex_trylock patch fixes the problem. I prefer that solution.

Andrew's suggestion of GFP_ATOMIC won't work as we'd have to propagate
that down into shmem and elsewhere.

Using PF_MEMALLOC will work. You'd want to define something like:

    static int set_memalloc(void)
    {
            if (current->flags & PF_MEMALLOC)
                    return 0;
            current->flags |= PF_MEMALLOC;
            return 1;
    }

    static void clear_memalloc(int memalloc)
    {
            if (memalloc)
                    current->flags &= ~PF_MEMALLOC;
    }

and then set/clear PF_MEMALLOC around every memory allocation and
function that descends into a memory allocation.

As I said, I prefer my solution, but if someone wants to put together a
patch with this approach, fine by me.

Robert
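The save-and-restore pattern in the two helpers above can be tried outside the kernel. What follows is only a user-space sketch: `task_flags` stands in for `current->flags`, and `FAKE_PF_MEMALLOC` is a made-up bit value, not the kernel's definition. It illustrates why set_memalloc() reports whether it actually set the flag: a nested caller must not clear a flag that an outer caller still relies on.

```c
/* Hypothetical stand-ins: not the kernel's current->flags or PF_MEMALLOC. */
#define FAKE_PF_MEMALLOC 0x0800u

static unsigned int task_flags;

/* Returns 1 if this call set the flag, 0 if it was already set. */
static int set_memalloc(void)
{
	if (task_flags & FAKE_PF_MEMALLOC)
		return 0;
	task_flags |= FAKE_PF_MEMALLOC;
	return 1;
}

/* Clears the flag only if the matching set_memalloc() call set it. */
static void clear_memalloc(int memalloc)
{
	if (memalloc)
		task_flags &= ~FAKE_PF_MEMALLOC;
}

/*
 * Nest two set/clear pairs and report whether the flag is still set
 * after the inner pair has finished (it should be).
 */
static int nested_sections_keep_flag(void)
{
	int outer = set_memalloc();
	int inner = set_memalloc();	/* flag already set: returns 0 */
	int still_set;

	clear_memalloc(inner);		/* no-op, the outer section owns the flag */
	still_set = (task_flags & FAKE_PF_MEMALLOC) != 0;
	clear_memalloc(outer);		/* outer section restores the old state */
	return still_set;
}
```

Calling nested_sections_keep_flag() leaves `task_flags` as it found it, which is the property that matters when these helpers wrap functions that may themselves wrap allocations.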
[PATCH -next] ashmem: Fix ashmem_shrink deadlock.
Don't acquire ashmem_mutex in ashmem_shrink if we've somehow recursed
into the shrinker code from within ashmem. Just bail out, avoiding a
deadlock. This is fine, as ashmem cache pruning is advisory anyhow.

Signed-off-by: Robert Love
---
 drivers/staging/android/ashmem.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c
index e681bdd..82c6768 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -363,7 +363,11 @@ static int ashmem_shrink(struct shrinker *s, struct shrink_control *sc)
 	if (!sc->nr_to_scan)
 		return lru_count;

-	mutex_lock(&ashmem_mutex);
+	/* avoid recursing into this code from within ashmem itself */
+	if (!mutex_trylock(&ashmem_mutex)) {
+		return -1;
+	}
+
 	list_for_each_entry_safe(range, next, &ashmem_lru_list, lru) {
 		loff_t start = range->pgstart * PAGE_SIZE;
 		loff_t end = (range->pgend + 1) * PAGE_SIZE;
--
1.8.2.1
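The bail-out behaviour of the patch can be illustrated in user space. This is a toy sketch, not the driver code: the lock is a C11 `atomic_flag` rather than a kernel mutex, and `cache_shrink`, `shrink_while_locked` and `cached_objects` are hypothetical names. It shows a best-effort shrinker that gives up with -1 when the lock is already held, including when the holder is the caller itself.

```c
#include <stdatomic.h>

/* Toy lock: set means "held". Stands in for ashmem_mutex. */
static atomic_flag cache_lock = ATOMIC_FLAG_INIT;
static int cached_objects = 8;

/* Returns nonzero if the lock was acquired, like mutex_trylock(). */
static int cache_trylock(void)
{
	return !atomic_flag_test_and_set(&cache_lock);
}

static void cache_unlock(void)
{
	atomic_flag_clear(&cache_lock);
}

/*
 * Best-effort shrinker: frees up to nr_to_scan objects, or returns -1
 * without doing anything if the lock is busy.
 */
static int cache_shrink(int nr_to_scan)
{
	int freed;

	if (!cache_trylock())
		return -1;

	freed = nr_to_scan < cached_objects ? nr_to_scan : cached_objects;
	cached_objects -= freed;

	cache_unlock();
	return freed;
}

/*
 * Simulates an allocation made while the lock is held that recurses
 * into the shrinker. With a blocking lock this would deadlock.
 */
static int shrink_while_locked(void)
{
	int ret;

	while (!cache_trylock())
		;			/* spin until acquired (uncontended here) */
	ret = cache_shrink(4);		/* lock is busy: bails out with -1 */
	cache_unlock();
	return ret;
}
```

A later call to cache_shrink() with the lock free does the work that the recursive call skipped, which is the "they'll go on the next pass" argument made elsewhere in this thread.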
Re: [BUG] staging: android: ashmem: Deadlock during ashmem_shrink
On Tue, Apr 30, 2013 at 9:29 AM, Shankar Brahadeeswaran wrote:
> Question:
> On occasions when we return because of the lock unavailability, what
> could be the worst case number of ashmem pages that are left
> unfreed (lru_count). Will it be very huge and have side effects?

On that VM shrink path, all of them, but they'll go on the next pass.

Even if they didn't, however, that is fine: The ashmem cache
functionality is advisory. From user-space's point of view, it doesn't
even know when VM pressure will occur, so it can't possibly care.

> To get the answer for this question, I added some instrumentation code
> to ashmem_shrink function on top of the patch. I ran Android monkey
> tests with lot of memory hungry applications so as to hit the Low
> Memory situation more frequently. After running this for almost a day
> I did not see a situation where the shrinker did not have the mutex.
> In fact what I found is that (in this use case at-least) most of the
> time the "lru_count" is zero, which means the application has not
> unpinned the pages. So the shrinker has no job to do (basically
> shrink_slab does not call ashmem_shrinker second time). So worst case
> if we hit a scenario where the shrinker is called I'm sure the
> lru_count would be very low. So even if the shrinker returns without
> freeing them (because of unavailability of the lock) its not going to
> be costly.

That is expected. This race window is very, very small.

> After this experiment, I too think that this patch (returning from
> ashmem_shrink if the lock is not available) is good enough and does
> not seem to have any major side effects.
>
> PS: Any plans of submitting this patch formally?

Sure. Greg? :)

Robert
Re: [BUG] staging: android: ashmem: Deadlock during ashmem_shrink
On Thu, Apr 25, 2013 at 9:54 AM, Shankar Brahadeeswaran wrote:
> Also, there are other places in the code where ashmem_mutex is held and memory
> allocation functions are called, ex:- range_alloc, calls kmem_cache_zalloc
>
> Since ashmem_shrink holds the ashmem_mutex, any where from ashmem driver
> if a memory allocation function is called with the ashmem_mutex held
> && if there is a low memory condition that leads to shrinkers being called
> we'll hit the deadlock.

The usual way this is solved is by checking the gfp_mask in the
shrinker code and bailing out (like we do now) for certain masks. So
e.g. the kmem_cache_zalloc in range_alloc is fixed by changing the
mask to GFP_FS.

> I'm trying to see if the ashmem_shrink should really hold the ashmem_mutex,
> but looks like its necessary.

Yes, it needs to hold ashmem_mutex.

There's no reason we have to run ashmem_shrink, though. See attached
(untested).

Robert

ashmem-lock-fix-rlove-2.patch
Description: Binary data
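The "check the gfp_mask and bail out" idea can be sketched in plain C. Everything here is a placeholder: `FAKE_GFP_FS`, `struct fake_shrink_control` and `fake_shrink` are invented for illustration, and which real flag a driver should test, and which allocation masks should omit it, is a decision the sketch does not settle. The only point is the shape of the check: the caller of the allocation advertises, through the mask, whether re-entering this shrinker is safe.

```c
/* Hypothetical flag bit: "caller permits re-entering this subsystem". */
#define FAKE_GFP_FS 0x80u

/* Minimal stand-in for the fields of struct shrink_control used below. */
struct fake_shrink_control {
	unsigned int gfp_mask;
	unsigned long nr_to_scan;
};

/*
 * Shrinker skeleton. A zero nr_to_scan is a query for the current
 * count; otherwise bail out with -1 when the allocation that triggered
 * reclaim did not allow re-entry, and only then do real work.
 */
static int fake_shrink(const struct fake_shrink_control *sc, int lru_count)
{
	if (!sc->nr_to_scan)
		return lru_count;

	if (!(sc->gfp_mask & FAKE_GFP_FS))
		return -1;	/* caller may hold our lock: do nothing */

	return 0;		/* pretend every object was reclaimed */
}
```

The limitation discussed later in the thread is visible here: every allocation made under the lock has to carry the right mask, so the developer must know in advance where a call chain might end up.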
Re: [BUG] staging: android: ashmem: Deadlock during ashmem_shrink
On Tue, Apr 23, 2013 at 12:20 PM, Shankar Brahadeeswaran wrote:
> I'm unable to think of a straight forward way to fix this. If you have
> any suggestions please provide the same.
> If we are unable to solve this too with minor mods, as suggested by
> Dan we have to re-look at the locking in this driver.

This doesn't look insurmountable. It isn't necessary AFAICT to hold
ashmem_mutex across shmem_file_setup.

Patch attached (untested).

Robert

ashmem-lock-fix.patch
Description: Binary data
Re: [BUG] staging: android: ashmem: Deadlock during ashmem_shrink
On Mon, Apr 22, 2013 at 10:22 AM, Dan Carpenter wrote:
> Read Al's email again: https://lkml.org/lkml/2013/3/20/458
>
> I don't know much about VFS locking, but the ashmem locking seems
> pretty bogus to me. Why can't multiple threads read() at the same
> time?

ashmem originally did not support read or write operations, just mmap,
which is all 99% of users want. The original concurrency model with
per-mapping ashmem_mutex's works fine there. It is only with the later
addition of read and write that locking becomes a cluster.

If there isn't an obvious way to refactor the locking, I'd suggest
removing read and write.

Robert
Re: [BUG] staging: android: ashmem: Deadlock during ashmem_mmap and ashmem_read
On Thu, Mar 21, 2013 at 10:06 AM, Bjorn Bringert wrote:
> I did implement ashmem_read, but I had no idea what I was doing. Calling the
> VFS read function seemed like an obvious way to do it, but it might be
> wrong. If that needs fixing, then the similar VFS call in ashmem_llseek
> probably needs fixing too.

You don't want to hold ashmem_mutex across the VFS calls. It is only
needed to protect the ashmem-internal structures.

FWIW is Android now using ashmem_read()? I left it out of the original
ashmem implementation on purpose.

Robert
[patch] updated hdaps driver.
Andrew, Attached patch updates the hdaps driver in 2.6.13-mm2. It is a drop-in replacement for the current patch. Changes: bug fixes and an absolute input device. Thanks, Robert Love driver for hdaps MAINTAINERS|7 drivers/hwmon/Kconfig | 17 + drivers/hwmon/Makefile |1 drivers/hwmon/hdaps.c | 696 + 4 files changed, 721 insertions(+) diff -urN linux-2.6.13/drivers/hwmon/hdaps.c linux/drivers/hwmon/hdaps.c --- linux-2.6.13/drivers/hwmon/hdaps.c 1969-12-31 19:00:00.0 -0500 +++ linux/drivers/hwmon/hdaps.c 2005-09-08 12:21:21.0 -0400 @@ -0,0 +1,696 @@ +/* + * drivers/hwmon/hdaps.c - driver for IBM's Hard Drive Active Protection System + * + * Copyright (C) 2005 Robert Love <[EMAIL PROTECTED]> + * Copyright (C) 2005 Jesper Juhl <[EMAIL PROTECTED]> + * + * The HardDisk Active Protection System (hdaps) is present in the IBM ThinkPad + * T41, T42, T43, R50, R50p, R51, and X40, at least. It provides a basic + * two-axis accelerometer and other data, such as the device's temperature. + * + * This driver is based on the document by Mark A. Smith available at + * http://www.almaden.ibm.com/cs/people/marksmith/tpaps.html and a lot of trial + * and error. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License v2 as published by the + * Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. 
+ * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define HDAPS_LOW_PORT 0x1600 /* first port used by hdaps */ +#define HDAPS_NR_PORTS 0x30/* number of ports: 0x1600 - 0x162f */ + +#define HDAPS_PORT_STATE 0x1611 /* device state */ +#define HDAPS_PORT_YPOS0x1612 /* y-axis position */ +#defineHDAPS_PORT_XPOS 0x1614 /* x-axis position */ +#define HDAPS_PORT_TEMP1 0x1616 /* device temperature, in celcius */ +#define HDAPS_PORT_YVAR0x1617 /* y-axis variance (what is this?) */ +#define HDAPS_PORT_XVAR0x1619 /* x-axis variance (what is this?) */ +#define HDAPS_PORT_TEMP2 0x161b /* device temperature (again?) */ +#define HDAPS_PORT_UNKNOWN 0x161c /* what is this? */ +#define HDAPS_PORT_KMACT 0x161d /* keyboard or mouse activity */ + +#define STATE_FRESH0x50/* accelerometer data is fresh */ + +#define KEYBD_MASK 0x20/* set if keyboard activity */ +#define MOUSE_MASK 0x40/* set if mouse activity */ +#define KEYBD_ISSET(n) (!! (n & KEYBD_MASK)) /* keyboard used? */ +#define MOUSE_ISSET(n) (!! (n & MOUSE_MASK)) /* mouse used? */ + +#define INIT_TIMEOUT_MSECS 4000/* wait up to 4s for device init ... */ +#define INIT_WAIT_MSECS200 /* ... in 200ms increments */ + +#define HDAPS_POLL_PERIOD (HZ/20) /* poll for input every 1/20s */ +#define HDAPS_INPUT_FUZZ 4 /* input event threshold */ + +static struct timer_list hdaps_timer; +static unsigned int hdaps_mousedev; +static unsigned int hdaps_invert; +static u8 km_activity; +static int rest_x; +static int rest_y; + +static DECLARE_MUTEX(hdaps_sem); + +/* + * __get_latch - Get the value from a given port. Callers must hold hdaps_sem. + */ +static inline u8 __get_latch(u16 port) +{ + return inb(port) & 0xff; +} + +/* + * __check_latch - Check a port latch for a given value. 
Returns zero if the + * port contains the given value. Callers must hold hdaps_sem. + */ +static inline int __check_latch(u16 port, u8 val) +{ + if (__get_latch(port) == val) + return 0; + return -EINVAL; +} + +/* + * __wait_latch - Wait up to 100us for a port latch to get a certain value, + * returning zero if the value is obtained. Callers must hold hdaps_sem. + */ +static int __wait_latch(u16 port, u8 val) +{ + unsigned int i; + + for (i = 0; i < 20; i++) { + if (!__check_latch(port, val)) + return 0; + udelay(5); + } + + return -EIO; +} + +/* + * __device_refresh - request a refresh from the accelerometer. Does not wait + * for refresh to complete. Callers must hold hdaps_sem. + */ +static void __device_refresh(void) +{ + udelay(200); + if (inb(0x1604) != STATE_FRESH) { + outb(0x11, 0x1610); + outb(0x01, 0x161f); + } +} + +/*
[patch] updated hdaps driver.
Below find an updated hdaps driver. Various bug fixes, clean ups, additions to the DMI whitelist, and a new automatic inversion detector (some ThinkPads have the axises negated). Andrew, since a new 2.6-mm has yet to come out, feel free to replace the original patch with this one. Thanks, Robert Love Driver for the IBM Hard Drive Active Protection System (HDAPS), an accelerometer found in most modern ThinkPads. Signed-off-by: Robert Love <[EMAIL PROTECTED]> diff -urN linux-2.6.13/drivers/hwmon/hdaps.c linux/drivers/hwmon/hdaps.c --- linux-2.6.13/drivers/hwmon/hdaps.c 1969-12-31 19:00:00.0 -0500 +++ linux/drivers/hwmon/hdaps.c 2005-08-31 23:50:36.0 -0400 @@ -0,0 +1,739 @@ +/* + * drivers/hwmon/hdaps.c - driver for IBM's Hard Drive Active Protection System + * + * Copyright (C) 2005 Robert Love <[EMAIL PROTECTED]> + * Copyright (C) 2005 Jesper Juhl <[EMAIL PROTECTED]> + * + * The HardDisk Active Protection System (hdaps) is present in the IBM ThinkPad + * T41, T42, T43, R51, and X40, at least. It provides a basic two-axis + * accelerometer and other data, such as the device's temperature. + * + * Based on the document by Mark A. Smith available at + * http://www.almaden.ibm.com/cs/people/marksmith/tpaps.html and a lot of trial + * and error. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License v2 as published by the + * Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. 
+ * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define HDAPS_LOW_PORT 0x1600 /* first port used by hdaps */ +#define HDAPS_NR_PORTS 0x30/* 0x1600 - 0x162f */ + +#define STATE_FRESH0x50/* accelerometer data is fresh */ + +#define REFRESH_ASYNC 0x00/* do asynchronous refresh */ +#define REFRESH_SYNC 0x01/* do synchronous refresh */ + +#define HDAPS_PORT_STATE 0x1611 /* device state */ +#define HDAPS_PORT_YPOS0x1612 /* y-axis position */ +#defineHDAPS_PORT_XPOS 0x1614 /* x-axis position */ +#define HDAPS_PORT_TEMP1 0x1616 /* device temperature, in celcius */ +#define HDAPS_PORT_YVAR0x1617 /* y-axis variance (what is this?) */ +#define HDAPS_PORT_XVAR0x1619 /* x-axis variance (what is this?) */ +#define HDAPS_PORT_TEMP2 0x161b /* device temperature (again?) */ +#define HDAPS_PORT_UNKNOWN 0x161c /* what is this? */ +#define HDAPS_PORT_KMACT 0x161d /* keyboard or mouse activity */ + +#define HDAPS_READ_MASK0xff/* some reads have the low 8 bits set */ + +#define KEYBD_MASK 0x20/* set if keyboard activity */ +#define MOUSE_MASK 0x40/* set if mouse activity */ +#define KEYBD_ISSET(n) (!! (n & KEYBD_MASK)) /* keyboard used? */ +#define MOUSE_ISSET(n) (!! (n & MOUSE_MASK)) /* mouse used? */ + +#define INIT_TIMEOUT_MSECS 4000/* wait up to 4s for device init ... */ +#define INIT_WAIT_MSECS200 /* ... 
in 200ms increments */ + +static struct platform_device *pdev; +static struct input_dev hdaps_idev; +static struct timer_list hdaps_timer; +static unsigned int hdaps_mousedev_threshold = 4; +static unsigned long hdaps_poll_ms = 50; +static unsigned int hdaps_mousedev; +static unsigned int hdaps_invert; +static u8 km_activity; +static int rest_x; +static int rest_y; + +static DECLARE_MUTEX(hdaps_sem); + +/* + * __get_latch - Get the value from a given port. Callers must hold hdaps_sem. + */ +static inline u8 __get_latch(u16 port) +{ + return inb(port) & HDAPS_READ_MASK; +} + +/* + * __check_latch - Check a port latch for a given value. Callers must hold + * hdaps_sem. Returns zero if the port contains the given value. + */ +static inline unsigned int __check_latch(u16 port, u8 val) +{ + if (__get_latch(port) == val) + return 0; + return -EINVAL; +} + +/* + * __wait_latch - Wait up to 100us for a port latch to get a certain value, + * returning zero if the value is obtained. Callers must hold hdaps_sem. + */ +static unsigned int __wait_latch(u16 port, u8 val) +{ + unsigned int i; + + for (i = 0; i < 20; i++) { + if (!__check_latch(port, val)) + return 0; + udelay(5); + } + + return -EINVAL; +} + +/* + * __device_refresh - Request a refresh
Re: inotify and IN_UNMOUNT-events
On Tue, 2005-08-30 at 21:46 +0200, Juergen Quade wrote:
> Playing around with inotify I have some problems
> to generate/receive IN_UNMOUNT-events (using
> a self written application and inotify_utils-0.25;
> kernel 2.6.13).
>
> Doing:
> - mount /dev/hda1 /mnt
> - add a watch to the path /mnt/ ("./inotify_test /mnt")
> - umount /mnt
>
> results in two events:
> 1. IN_DELETE_SELF (mask=0x0400)
> 2. IN_IGNORED (mask=0x8000)
>
> Any ideas?

"/mnt" is not unmounted, stuff inside of it is.

Watch, say, "/mnt/foo/bar" and when /dev/hda1 is unmounted, you will
get an IN_UNMOUNT on the watch.

Robert Love
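When debugging reports like the one above, it can help to decode the raw mask values rather than read hex. The following is a small sketch that assumes a Linux system with <sys/inotify.h>; `describe_event` is a hypothetical helper, not part of inotify or inotify_utils, and it only distinguishes the three events mentioned in this message.

```c
#include <stdint.h>
#include <sys/inotify.h>

/*
 * Name the event bits discussed in this thread. IN_UNMOUNT is checked
 * first because it is the event the reporter was looking for.
 */
static const char *describe_event(uint32_t mask)
{
	if (mask & IN_UNMOUNT)
		return "IN_UNMOUNT";
	if (mask & IN_DELETE_SELF)
		return "IN_DELETE_SELF";
	if (mask & IN_IGNORED)
		return "IN_IGNORED";
	return "other";
}
```

In a reader loop, the `mask` field of each `struct inotify_event` returned by read() would be passed to describe_event().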
[patch] fix: dmi_check_system
Background:

1) dmi_check_system() returns the count of the number of matches.
   Zero thus means no matches.

2) A match callback can return nonzero to stop the match checking.

Bug: The count is incremented after we check for the nonzero return
value, so it does not reflect the actual count. We could say this is
intended, for some dumb reason, except that it means that a match on
the first check returns zero--no matches--if the callback returns
nonzero.

Attached patch increments the count before calling the callback and
thus before potentially short-circuiting.

	Robert Love

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/i386/kernel/dmi_scan.c | 2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -urN linux-2.6.13/arch/i386/kernel/dmi_scan.c linux/arch/i386/kernel/dmi_scan.c
--- linux-2.6.13/arch/i386/kernel/dmi_scan.c 2005-08-29 14:28:33.0 -0400
+++ linux/arch/i386/kernel/dmi_scan.c 2005-08-29 14:35:06.0 -0400
@@ -218,9 +218,9 @@
 				/* No match */
 				goto fail;
 			}
+		count++;
 		if (d->callback && d->callback(d))
 			break;
-		count++;
 fail:		d++;
 	}
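The off-by-one is easy to reproduce in isolation. The sketch below is a user-space model, not the dmi_scan.c code: `struct fake_dmi_id`, `stop_cb`, `sample`, `check_buggy` and `check_fixed` are all hypothetical names, and matching is reduced to a precomputed flag. The two functions differ only in whether the count is incremented before or after the callback gets a chance to stop the scan.

```c
#include <stddef.h>

/* Reduced model of a DMI table entry: a match flag and a callback. */
struct fake_dmi_id {
	int matches;
	int (*callback)(const struct fake_dmi_id *);
};

/* Callback that asks the scan to stop, as a real one may. */
static int stop_cb(const struct fake_dmi_id *d)
{
	(void)d;
	return 1;
}

/* First entry matches and its callback stops the scan. */
static const struct fake_dmi_id sample[] = {
	{ 1, stop_cb },
	{ 1, NULL },
};

/* Old order: break happens before the increment. */
static int check_buggy(const struct fake_dmi_id *list, size_t n)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		const struct fake_dmi_id *d = &list[i];

		if (!d->matches)
			continue;
		if (d->callback && d->callback(d))
			break;
		count++;
	}
	return count;
}

/* Patched order: count the match, then let the callback stop us. */
static int check_fixed(const struct fake_dmi_id *list, size_t n)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		const struct fake_dmi_id *d = &list[i];

		if (!d->matches)
			continue;
		count++;
		if (d->callback && d->callback(d))
			break;
	}
	return count;
}
```

With `sample`, check_buggy() reports zero matches even though the first entry matched, which a caller would read as "no such hardware"; check_fixed() reports one.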
Re: [patch] IBM HDAPS accelerometer driver, with probing.
On Fri, 2005-08-26 at 17:44 -0500, Dmitry Torokhov wrote:
> Is this function used in a hot path to warrant using "unlikely"? There
> are to many "unlikely" in the code for my taste.

unlikely() can result in better, smaller, faster code, and it acts as a
nice directive to programmers reading the code.

> input_[un]register_device and del_timer_sync are "long" operations. I
> think a semaphore would be better here.

I was considering moving all locking to a single semaphore, actually.

	Robert Love
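For readers unfamiliar with the macro, here is a minimal user-space sketch of how unlikely() is typically built and used. It assumes GCC or Clang, since it relies on the `__builtin_expect` builtin; `checked_div` is a hypothetical example function, not taken from the hdaps driver. The hint tells the compiler which branch to lay out as the fall-through path; whether that yields a measurable gain depends on the code and the compiler.

```c
/* Branch hint: the condition is expected to be false almost always. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Divide num by den, storing the result in *out.
 * Returns 0 on success and -1 on the rare error path.
 */
static int checked_div(int num, int den, int *out)
{
	if (unlikely(den == 0))
		return -1;	/* error path, marked as cold */

	*out = num / den;
	return 0;
}
```

The annotation does not change what the function computes, only how the compiler is allowed to arrange it, and it documents for the reader that the marked branch is the exceptional one.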
[patch] IBM HDAPS accelerometer driver, with probing.
Andrew, Attached patch provides a driver for the IBM Hard Drive Active Protection System (hdaps) on top of 2.6.13-rc6-mm2. Over the previous post, it contains several fixes and improvements, including a dev->probe() routine and a DMI whitelist. Robert Love Driver for the IBM HDAPS Signed-off-by: Robert Love <[EMAIL PROTECTED]> MAINTAINERS|7 drivers/hwmon/Kconfig | 17 + drivers/hwmon/Makefile |1 drivers/hwmon/hdaps.c | 664 + 4 files changed, 689 insertions(+) diff -urN linux-2.6.13-rc6-mm2/drivers/hwmon/hdaps.c linux/drivers/hwmon/hdaps.c --- linux-2.6.13-rc6-mm2/drivers/hwmon/hdaps.c 1969-12-31 19:00:00.0 -0500 +++ linux/drivers/hwmon/hdaps.c 2005-08-26 18:17:33.0 -0400 @@ -0,0 +1,664 @@ +/* + * drivers/hwmon/hdaps.c - driver for IBM's Hard Drive Active Protection System + * + * Copyright (C) 2005 Robert Love <[EMAIL PROTECTED]> + * Copyright (C) 2005 Jesper Juhl <[EMAIL PROTECTED]> + * + * The HardDisk Active Protection System (hdaps) is present in the IBM ThinkPad + * T41, T42, T43, and R51, at least. It provides a basic two-axis + * accelerometer and other misc. data. + * + * Based on the document by Mark A. Smith available at + * http://www.almaden.ibm.com/cs/people/marksmith/tpaps.html and a lot of trial + * and error. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define HDAPS_LOW_PORT 0x1600 /* first port used by hdaps */ +#define HDAPS_HIGH_PORT0x162f /* last port used by hdaps */ + +#define STATE_FRESH0x50/* accelerometer data is fresh */ + +#define REFRESH_ASYNC 0x00/* do asynchronous refresh */ +#define REFRESH_SYNC 0x01/* do synchronous refresh */ + +#define HDAPS_PORT_STATE 0x1611 /* device state */ +#defineHDAPS_PORT_XPOS 0x1612 /* x-axis position */ +#define HDAPS_PORT_YPOS0x1614 /* y-axis position */ +#define HDAPS_PORT_TEMP0x1616 /* device temperature, in celcius */ +#define HDAPS_PORT_XVAR0x1617 /* x-axis variance (what is this?) */ +#define HDAPS_PORT_YVAR0x1619 /* y-axis variance (what is this?) */ +#define HDAPS_PORT_TEMP2 0x161b /* device temperature (again?) */ +#define HDAPS_PORT_UNKNOWN 0x161c /* what is this? */ +#define HDAPS_PORT_KMACT 0x161d /* keyboard or mouse activity */ + +#define HDAPS_READ_MASK0xff/* some reads have the low 8 bits set */ + +#define KEYBD_MASK 0x20/* set if keyboard activity */ +#define MOUSE_MASK 0x40/* set if mouse activity */ + +#define KEYBD_ISSET(n) (!! (n & KEYBD_MASK)) +#define MOUSE_ISSET(n) (!! (n & MOUSE_MASK)) + +static spinlock_t hdaps_lock = SPIN_LOCK_UNLOCKED; + + +/* + * __get_latch - Get the value from a given port latch. Callers must hold + * hdaps_lock. + */ +static inline unsigned short __get_latch(unsigned short port) +{ + return inb(port) & HDAPS_READ_MASK; +} + +/* + * __check_latch - Check a port latch for a given value. Callers must hold + * hdaps_lock. + */ +static inline unsigned int __check_latch(unsigned short port, unsigned char val) +{ + if (__get_latch(port) == val) + return 1; + return 0; +} + +/* + * __wait_latch - Wait up to 100us for a port latch to get a certain value, + * returning nonzero if the value is obtained and zero otherwise. Callers + * must hold hdaps_lock. 
+ */ +static unsigned int __wait_latch(unsigned short port, unsigned char val) +{ + unsigned int i; + + for (i = 0; i < 20; i++) { + if (__check_latch(port, val)) + return 1; + udelay(5); + } + + return 0; +} + +/* + * __request_refresh - Request a refresh from the accelerometer. + * + * If sync is REFRESH_SYNC, we perform a synchronous refresh and will wait for + * the refresh. Returns nonzero if successful or zero on error. + * + * If sync is REFRESH_ASYNC, we merely kick off a new refresh if the device is + * not up-to-date. Always returns true. On the next read from the device
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 15:39 -0400, Robert Love wrote:
> > This is racy - 2 threads can try to do this simultaneously.
>
> Fixed. Thanks.

Actually, doesn't sysfs and/or the vfs layer serialize the two
simultaneous writes?

	Robert Love
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 15:33 -0400, Jeff Garzik wrote:
> Since such a check is possible, that's definitely a merge-stopper IMO

First, I am not asking that Linus merge this. Everyone needs to relax.

Second, we don't know a DMI-based solution will work. I'll check it
out.

	Robert Love
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 14:27 -0500, Dmitry Torokhov wrote:
> What this completion is used for? I don't see any other references to it.

It was the start of the release() routine, but I decided to move to
platform_device_register_simple() and use its release, instead. So
this is gone now in my tree.

> I'd rather you used absolute coordinates and set up
> hdaps_idev->absfuzz to do the filtering.

Me too.

> This is racy - 2 threads can try to do this simultaneously.

Fixed. Thanks.

> > +
> > + device_create_file(&hdaps_plat_dev.dev, &dev_attr_position);
> > + device_create_file(&hdaps_plat_dev.dev, &dev_attr_variance);
> > + device_create_file(&hdaps_plat_dev.dev, &dev_attr_temp);
> > + device_create_file(&hdaps_plat_dev.dev, &dev_attr_calibrate);
> > + device_create_file(&hdaps_plat_dev.dev, &dev_attr_mousedev);
> > + device_create_file(&hdaps_plat_dev.dev,
> > &dev_attr_mousedev_threshold);
> > + device_create_file(&hdaps_plat_dev.dev, &dev_attr_mousedev_poll_ms);
> > +
>
> What about using sysfs_attribute_group?

I don't see this in my tree?

	Robert Love
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 20:55 +0100, Alan Cox wrote:
> I think that should be fixed before its merged.

Let me be clear, it has an init routine that effectively probes for
the device. It just lacks a simple quick non-invasive check.

The driver will definitely fail to load on a laptop without the
requisite hardware.

	Robert Love
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 14:45 -0400, Dave Jones wrote:
> A little difficult for people to submit dmi patches, unless they
> have hardware this driver runs on. Surely as you've tested this,
> you're in the best position to write such patches :-)

Surely one of the millions of people with a ThinkPad can feel free to
try a DMI-based probe() out, if they want a probe() routine, was my
point.

	Robert Love
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 11:18 -0700, Andrew Morton wrote:
> > +config SENSORS_HDAPS
> > + tristate "IBM Hard Drive Active Protection System (hdaps)"
> > + depends on HWMON
> > + default n
> > + help
>
> How does this get along with CONFIG_INPUT=n, CONFIG_INPUT_MOUSEDEV=n, etc?

Probably a question you should have asked before merging the patch. ;-)

We just need CONFIG_INPUT.

Thanks,

	Robert Love

Depend on CONFIG_INPUT.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

diff -u linux/drivers/hwmon/Kconfig linux/drivers/hwmon/Kconfig
--- linux/drivers/hwmon/Kconfig 2005-08-26 11:07:53.0 -0400
+++ linux/drivers/hwmon/Kconfig 2005-08-26 14:28:09.0 -0400
@@ -413,7 +413,7 @@
 config SENSORS_HDAPS
 	tristate "IBM Hard Drive Active Protection System (hdaps)"
-	depends on HWMON
+	depends on HWMON && INPUT
 	default n
 	help
 	  This driver provides support for the IBM Hard Drive Active Protection
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 20:01 +0200, Arjan van de Ven wrote:

> > Not that we've been able to tell. It is a legacy platform device.
> >
> > So, unfortunately, no probe() routine.
>
> dmi surely

Patches accepted.

	Robert Love
Re: Inotify problem [was Re: 2.6.13-rc6-mm1]
On Fri, 2005-08-26 at 13:52 -0400, John McCutchan wrote:

> Thanks for your suggestion, it has fixed the inotify problem. But where
> to put the fix is turning into a bit of a mess. Some callers like
> drivers/md/dm.c:682 call idr_get_new_above as if it will return >=
> starting_id. The comment says that it will return > starting_id, and the
> function name leads people to believe the same thing. In the patch below
> I change inotify to add one to the value we pass into idr. I also
> change the comment to more accurately reflect what the function does.
> The function name doesn't fit, but it never did.
>
> Signed-off-by: John McCutchan <[EMAIL PROTECTED]>

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

Keeping the current behavior is probably the best way to go.

	Robert Love

> Index: linux/fs/inotify.c
> ===
> --- linux.orig/fs/inotify.c	2005-08-26 13:38:29.0 -0400
> +++ linux/fs/inotify.c	2005-08-26 13:38:55.0 -0400
> @@ -353,7 +353,7 @@
>  	do {
>  		if (unlikely(!idr_pre_get(&dev->idr, GFP_KERNEL)))
>  			return -ENOSPC;
> -		ret = idr_get_new_above(&dev->idr, watch, dev->last_wd, &watch->wd);
> +		ret = idr_get_new_above(&dev->idr, watch, dev->last_wd+1, &watch->wd);
>  	} while (ret == -EAGAIN);
>
>  	return ret;
> Index: linux/lib/idr.c
> ===
> --- linux.orig/lib/idr.c	2005-08-26 13:38:22.0 -0400
> +++ linux/lib/idr.c	2005-08-26 13:39:08.0 -0400
> @@ -207,7 +207,7 @@
>  }
>
>  /**
> - * idr_get_new_above - allocate new idr entry above a start id
> + * idr_get_new_above - allocate new idr entry above or equal to a start id
>   * @idp: idr handle
>   * @ptr: pointer you want associated with the ide
>   * @start_id: id to start search at
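The ">= start" versus "> start" distinction discussed in this message can be illustrated with a toy allocator. What follows is a minimal user-space sketch, not the kernel idr code: toy_idmap, toy_get_new_above() and toy_remove() are hypothetical names invented for illustration, and the only property borrowed from the thread is that the search hands back the first free id at or above the start value.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_IDS 64	/* hypothetical capacity, for illustration only */

/* Hypothetical stand-in for an id map; this is not the kernel's struct idr. */
struct toy_idmap {
	bool used[TOY_MAX_IDS];
};

/*
 * Allocate the lowest free id that is greater than OR EQUAL TO start.
 * Returns 0 and stores the id in *id on success, -1 if no id is free.
 */
static int toy_get_new_above(struct toy_idmap *map, int start, int *id)
{
	int i;

	assert(start >= 0);
	for (i = start; i < TOY_MAX_IDS; i++) {
		if (!map->used[i]) {
			map->used[i] = true;
			*id = i;
			return 0;
		}
	}
	return -1;
}

/* Release an id so that it may be handed out again. */
static void toy_remove(struct toy_idmap *map, int id)
{
	assert(id >= 0 && id < TOY_MAX_IDS);
	map->used[id] = false;
}
```

With these semantics, a caller that frees id N and then asks for a new id starting at N gets N back, while a caller that passes N + 1 never sees N again. That appears to be the reason the quoted patch passes dev->last_wd+1.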
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 13:33 -0400, Brian Gerst wrote:

> Is there any way to detect that this device is present (PCI, ACPI, etc.)
> without poking at ports?

Not that we've been able to tell. It is a legacy platform device.

So, unfortunately, no probe() routine.

	Robert Love
Re: [patch] IBM HDAPS accelerometer driver.
On Fri, 2005-08-26 at 13:05 -0400, Bill Nottingham wrote:

> How does this relate to the hdaps driver hosted at sourceforge since
> June?

This driver is what is hosted there. I thought it was time to push on.

	Robert Love
[patch] IBM HDAPS accelerometer driver.
Andrew,

Of late I have been working on a driver for the IBM Hard Drive Active
Protection System (HDAPS), which provides a two-axis accelerometer and
some other misc. data. The hardware is found on recent IBM ThinkPad
laptops.

The following patch adds the driver to 2.6.13-rc6-mm2. It is
self-contained and fairly simple.

Please, apply.

	Robert Love


Driver for the IBM HDAPS, an accelerometer

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 MAINTAINERS            |    7
 drivers/hwmon/Kconfig  |   17 +
 drivers/hwmon/Makefile |    1
 drivers/hwmon/hdaps.c  |  594 +
 4 files changed, 619 insertions(+)

diff -urN linux-2.6.13-rc6-mm2/drivers/hwmon/hdaps.c linux/drivers/hwmon/hdaps.c
--- linux-2.6.13-rc6-mm2/drivers/hwmon/hdaps.c	1969-12-31 19:00:00.0 -0500
+++ linux/drivers/hwmon/hdaps.c	2005-08-26 11:07:53.0 -0400
@@ -0,0 +1,594 @@
+/*
+ * drivers/hwmon/hdaps.c - driver for IBM's Hard Drive Active Protection System
+ *
+ * Copyright (C) 2005 Robert Love <[EMAIL PROTECTED]>
+ * Copyright (C) 2005 Jesper Juhl <[EMAIL PROTECTED]>
+ *
+ * The HardDisk Active Protection System (hdaps) is present in the IBM ThinkPad
+ * T41, T42, T43, and R51, at least. It provides a basic two-axis
+ * accelerometer and other misc. data.
+ *
+ * Based on the document by Mark A. Smith available at
+ * http://www.almaden.ibm.com/cs/people/marksmith/tpaps.html and a lot of trial
+ * and error.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define HDAPS_LOW_PORT		0x1600	/* first port used by accelerometer */
+#define HDAPS_NR_PORTS		0x30	/* number of ports (0x1600 - 0x162f) */
+
+#define STATE_FRESH		0x50	/* accelerometer data is fresh */
+
+#define REFRESH_ASYNC		0x00	/* do asynchronous refresh */
+#define REFRESH_SYNC		0x01	/* do synchronous refresh */
+
+#define HDAPS_PORT_STATE	0x1611	/* device state */
+#define HDAPS_PORT_XPOS		0x1612	/* x-axis position */
+#define HDAPS_PORT_YPOS		0x1614	/* y-axis position */
+#define HDAPS_PORT_TEMP		0x1616	/* device temperature, in celcius */
+#define HDAPS_PORT_XVAR		0x1617	/* x-axis variance (what is this?) */
+#define HDAPS_PORT_YVAR		0x1619	/* y-axis variance (what is this?) */
+#define HDAPS_PORT_TEMP2	0x161b	/* device temperature (again?) */
+#define HDAPS_PORT_UNKNOWN	0x161c	/* what is this? */
+#define HDAPS_PORT_KMACT	0x161d	/* keyboard or mouse activity */
+
+#define HDAPS_READ_MASK		0xff	/* some reads have the low 8 bits set */
+
+#define KEYBD_MASK		0x20	/* set if keyboard activity */
+#define MOUSE_MASK		0x40	/* set if mouse activity */
+
+#define KEYBD_ISSET(n)		(!! (n & KEYBD_MASK))
+#define MOUSE_ISSET(n)		(!! (n & MOUSE_MASK))
+
+static spinlock_t hdaps_lock = SPIN_LOCK_UNLOCKED;
+
+
+/*
+ * __get_latch - Get the value from a given port latch. Callers must hold
+ * hdaps_lock.
+ */
+static inline unsigned short __get_latch(unsigned short port)
+{
+	return inb(port) & HDAPS_READ_MASK;
+}
+
+/*
+ * __check_latch - Check a port latch for a given value. Callers must hold
+ * hdaps_lock.
+ */
+static inline unsigned int __check_latch(unsigned short port, unsigned char val)
+{
+	if (__get_latch(port) == val)
+		return 1;
+	return 0;
+}
+
+/*
+ * __wait_latch - Wait up to 100us for a port latch to get a certain value,
+ * returning nonzero if the value is obtained and zero otherwise. Callers
+ * must hold hdaps_lock.
+ */
+static unsigned int __wait_latch(unsigned short port, unsigned char val)
+{
+	unsigned int i;
+
+	for (i = 0; i < 20; i++) {
+		if (__check_latch(port, val))
+			return 1;
+		udelay(5);
+	}
+
+#if 0
+	printk(KERN_WARNING "hdaps: wait on %04x returned %02x, not %02x!\n",
+	       port, __check_latch(port, val), val);
+#endif
+
+	return 0;
+}
+
+/*
+ * __request_refresh - Request a refresh from the accelerometer.
+ *
+ * If sync is REFRESH_SYNC, we perform
Re: Inotify problem [was Re: 2.6.13-rc6-mm1]
On Thu, 2005-08-25 at 09:33 -0400, John McCutchan wrote: > On Thu, 2005-08-25 at 22:07 +1200, Reuben Farrelly wrote: > > Hi, > > > > I have also observed another problem with inotify with dovecot - so I spoke > > with Johannes Berg who wrote the inotify code in dovecot. He suggested I > > post > > here to LKML since his opinion is that this to be a kernel bug. > > > > The problem I am observing is this, logged by dovecot after a period of > > time > > when a client is connected: > > > > dovecot: Aug 22 14:31:23 Error: IMAP(gilly): inotify_rm_watch() failed: > > Invalid argument > > dovecot: Aug 22 14:31:23 Error: IMAP(gilly): inotify_rm_watch() failed: > > Invalid argument > > dovecot: Aug 22 14:31:23 Error: IMAP(gilly): inotify_rm_watch() failed: > > Invalid argument > > > > Multiply that by about 1000 ;-) > > > > Some debugging shows this: > > dovecot: Aug 25 19:31:22 Warning: IMAP(gilly): removing wd 1019 from > > inotify fd 4 > > dovecot: Aug 25 19:31:22 Warning: IMAP(gilly): removing wd 1018 from > > inotify fd 4 > > dovecot: Aug 25 19:31:22 Warning: IMAP(gilly): inotify_add_watch returned > > 1019 > > dovecot: Aug 25 19:31:22 Warning: IMAP(gilly): inotify_add_watch returned > > 1020 > > dovecot: Aug 25 19:31:23 Warning: IMAP(gilly): removing wd 1020 from > > inotify fd 4 > > dovecot: Aug 25 19:31:23 Warning: IMAP(gilly): removing wd 1019 from > > inotify fd 4 > > dovecot: Aug 25 19:31:24 Warning: IMAP(gilly): inotify_add_watch returned > > 1020 > > > > > dovecot: Aug 25 19:31:24 Warning: IMAP(gilly): inotify_add_watch returned > > 1021 > > dovecot: Aug 25 19:31:24 Warning: IMAP(gilly): removing wd 1021 from > > inotify fd 4 > > dovecot: Aug 25 19:31:24 Warning: IMAP(gilly): removing wd 1020 from > > inotify fd 4 > > dovecot: Aug 25 19:31:25 Warning: IMAP(gilly): inotify_add_watch returned > > 1021 > > dovecot: Aug 25 19:31:25 Warning: IMAP(gilly): inotify_add_watch returned > > 1022 > > dovecot: Aug 25 19:31:25 Warning: IMAP(gilly): removing wd 1022 from > > 
inotify fd 4 > > dovecot: Aug 25 19:31:25 Warning: IMAP(gilly): removing wd 1021 from > > inotify fd 4 > > dovecot: Aug 25 19:31:26 Warning: IMAP(gilly): inotify_add_watch returned > > 1022 > > dovecot: Aug 25 19:31:26 Warning: IMAP(gilly): inotify_add_watch returned > > 1023 > > dovecot: Aug 25 19:31:26 Warning: IMAP(gilly): removing wd 1023 from > > inotify fd 4 > > dovecot: Aug 25 19:31:26 Warning: IMAP(gilly): removing wd 1022 from > > inotify fd 4 > > dovecot: Aug 25 19:31:27 Warning: IMAP(gilly): inotify_add_watch returned > > 1023 > > dovecot: Aug 25 19:31:27 Warning: IMAP(gilly): inotify_add_watch returned > > 1024 > > dovecot: Aug 25 19:31:27 Warning: IMAP(gilly): removing wd 1024 from > > inotify fd 4 > > dovecot: Aug 25 19:31:27 Error: IMAP(gilly): inotify_rm_watch() failed: > > Invalid argument > > dovecot: Aug 25 19:31:27 Warning: IMAP(gilly): removing wd 1023 from > > inotify fd 4 > > dovecot: Aug 25 19:31:28 Warning: IMAP(gilly): inotify_add_watch returned > > 1024 > > dovecot: Aug 25 19:31:28 Warning: IMAP(gilly): inotify_add_watch returned > > 1024 > > > > Note the incrementing wd value even though we are removing them as we go.. > > > > What kernel are you running? The wd's should ALWAYS be incrementing, you > should never get the same wd as you did before. From your log, you are > getting the same wd (after you inotify_rm_watch it). I can reproduce > this bug on 2.6.13-rc7. > > idr_get_new_above > > isn't returning something above. > > Also, the idr layer seems to be breaking when we pass in 1024. I can > reproduce that on my 2.6.13-rc7 system as well. > > > This is using latest CVS of dovecot code and with 2.6.12-rc6-mm(1|2) kernel. > > > > Robert, John, what do you think? Is this possibly related to the oops > > seen > > in the log that I reported earlier? (Which is still showing up 2-3 times > > per > > day, btw) > > There is definitely something broken here. Jim, George- We are seeing a problem in the idr layer. 
If we do idr_find(1024) when, say, a low-valued id such as zero is
unallocated, NULL is returned.

This readily manifests itself in inotify, where we recently switched to
using idr_get_new_above() with our last allocated token.

	Robert Love
Re: Inotify problem [was Re: 2.6.13-rc6-mm1]
On Thu, 2005-08-25 at 09:40 -0400, John McCutchan wrote:

> I get that message a lot. I know I have said this before (and was wrong)
> but I think the idr layer is busted.

This time I think I agree with you. ;-)

Let's just pass zero for the "above" parameter in idr_get_new_above(),
which I believe is the behavior of the other interface, and see if the
1024-multiple problem goes away. We definitely did not have that before.

If it does, and we don't have another solution, let's run with that for
2.6.13. I don't want this bug released.

	Robert Love
[patch] inotify: idr_get_new_above not working?
On Mon, 2005-08-15 at 10:16 -0400, John McCutchan wrote:

> Inotify is using idr_get_new_above to make sure that the next watch
> descriptor is larger/different than any of the previous watch
> descriptors. We keep track of the largest wd that we get out of
> idr_get_new_above, and pass that to idr_get_new_above. I have noticed
> though, that idr_get_new_above always returns the first available id.
> This causes a serious problem for inotify, because user space will get an
> IGNORE event for a wd K that might refer to the last holder of the K.

Turns out that the problem was in our court and not the idr layer.
idr_get_new_above() seems to work fine.

One-line patch is attached. Please merge before 2.6.13.

	Robert Love


We are saving the wrong thing in ->last_wd. We want the wd, not the return
value.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/inotify.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -urN linux-2.6.13-rc6-git2/fs/inotify.c linux/fs/inotify.c
--- linux-2.6.13-rc6-git2/fs/inotify.c	2005-08-09 16:52:16.0 -0400
+++ linux/fs/inotify.c	2005-08-15 12:21:18.0 -0400
@@ -402,7 +402,7 @@
 		return ERR_PTR(ret);
 	}
 
-	dev->last_wd = ret;
+	dev->last_wd = watch->wd;
 	watch->mask = mask;
 	atomic_set(&watch->count, 0);
 	INIT_LIST_HEAD(&watch->d_list);
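The one-line fix in this message comes down to a common C pitfall: the allocator reports a status code through its return value and the new id through an out-parameter, and the caller stored the former. The following is a minimal user-space sketch of that pattern. toy_dev, toy_alloc_id(), add_watch_buggy() and add_watch_fixed() are hypothetical names, and the toy allocator simply hands out last + 1; none of this is the actual fs/inotify.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state; only the field relevant to the fix is modelled. */
struct toy_dev {
	int last_wd;	/* most recently allocated watch descriptor */
};

/*
 * Hypothetical allocator using the calling convention seen in the patch:
 * the return value is a status code (0 on success) and the new id comes
 * back through *id. For simplicity it always hands out last + 1.
 */
static int toy_alloc_id(int last, int *id)
{
	assert(id != NULL);
	*id = last + 1;
	return 0;
}

/* Buggy variant: records the status code, so last_wd never advances. */
static int add_watch_buggy(struct toy_dev *dev)
{
	int wd;
	int ret = toy_alloc_id(dev->last_wd, &wd);

	if (ret)
		return ret;
	dev->last_wd = ret;	/* wrong: ret is 0 here, not the descriptor */
	return wd;
}

/* Fixed variant: records the descriptor returned via the out-parameter. */
static int add_watch_fixed(struct toy_dev *dev)
{
	int wd;
	int ret = toy_alloc_id(dev->last_wd, &wd);

	if (ret)
		return ret;
	dev->last_wd = wd;
	return wd;
}
```

In this sketch the buggy variant hands out the same descriptor on every call, because the stored value is always the success code, while the fixed variant advances.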
Re: 2.6.13-rc6 Oops with Software RAID, LVM, JFS, NFS
On Sun, 2005-08-14 at 20:40 -0600, Zwane Mwaikambo wrote:

> I'm new here, if the inode isn't being watched, what's to stop d_delete
> from removing the inode before fsnotify_unlink proceeds to use it?

Nothing. But check out

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=7a91bf7f5c22c8407a9991cbd9ce5bb87caa6b4a

Should solve this problem?

	Robert Love
[patch] SH64: inotify and ioprio syscalls
Add inotify and ioprio syscall stubs to SH64.

	Robert Love


Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/sh64/kernel/syscalls.S |    5 +
 include/asm-sh64/unistd.h   |    7 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff -urN linux-2.6.13-rc6-git2/arch/sh64/kernel/syscalls.S linux/arch/sh64/kernel/syscalls.S
--- linux-2.6.13-rc6-git2/arch/sh64/kernel/syscalls.S	2005-06-17 15:48:29.0 -0400
+++ linux/arch/sh64/kernel/syscalls.S	2005-08-10 16:12:24.0 -0400
@@ -342,4 +342,9 @@
 	.long sys_add_key
 	.long sys_request_key
 	.long sys_keyctl		/* 315 */
+	.long sys_ioprio_set
+	.long sys_ioprio_get
+	.long sys_inotify_init
+	.long sys_inotify_add_watch
+	.long sys_inotify_rm_watch	/* 320 */
 
diff -urN linux-2.6.13-rc6-git2/include/asm-sh64/unistd.h linux/include/asm-sh64/unistd.h
--- linux-2.6.13-rc6-git2/include/asm-sh64/unistd.h	2005-06-17 15:48:29.0 -0400
+++ linux/include/asm-sh64/unistd.h	2005-08-10 16:12:10.0 -0400
@@ -338,8 +338,13 @@
 #define __NR_add_key		313
 #define __NR_request_key	314
 #define __NR_keyctl		315
+#define __NR_ioprio_set		316
+#define __NR_ioprio_get		317
+#define __NR_inotify_init	318
+#define __NR_inotify_add_watch	319
+#define __NR_inotify_rm_watch	320
 
-#define NR_syscalls 316
+#define NR_syscalls 321
 
 /* user-visible error numbers are in the range -1 - -125: see */
[patch] SH: inotify and ioprio syscalls
Add inotify and ioprio syscall stubs to SH.

	Robert Love


Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/sh/kernel/entry.S  |    5 +
 include/asm-sh/unistd.h |    8 +++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff -urN linux-2.6.13-rc6-git2/arch/sh/kernel/entry.S linux/arch/sh/kernel/entry.S
--- linux-2.6.13-rc6-git2/arch/sh/kernel/entry.S	2005-06-17 15:48:29.0 -0400
+++ linux/arch/sh/kernel/entry.S	2005-08-10 15:54:44.0 -0400
@@ -1145,5 +1145,10 @@
 	.long sys_add_key		/* 285 */
 	.long sys_request_key
 	.long sys_keyctl
+	.long sys_ioprio_set
+	.long sys_ioprio_get
+	.long sys_inotify_init		/* 290 */
+	.long sys_inotify_add_watch
+	.long sys_inotify_rm_watch
 
 /* End of entry.S */
diff -urN linux-2.6.13-rc6-git2/include/asm-sh/unistd.h linux/include/asm-sh/unistd.h
--- linux-2.6.13-rc6-git2/include/asm-sh/unistd.h	2005-06-17 15:48:29.0 -0400
+++ linux/include/asm-sh/unistd.h	2005-08-10 15:55:41.0 -0400
@@ -295,8 +295,14 @@
 #define __NR_add_key		285
 #define __NR_request_key	286
 #define __NR_keyctl		287
+#define __NR_ioprio_set		288
+#define __NR_ioprio_get		289
+#define __NR_inotify_init	290
+#define __NR_inotify_add_watch	291
+#define __NR_inotify_rm_watch	292
 
-#define NR_syscalls 288
+
+#define NR_syscalls 293
 
 /* user-visible error numbers are in the range -1 - -124: see */
[patch] add inotify & ioprio syscalls to ARM
Russell,

Hey. Attached patch adds the syscall stubs for the inotify and ioprio
system calls to ARM.

	Robert Love


Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/arm/kernel/calls.S  |    6 ++
 include/asm-arm/unistd.h |    5 +
 2 files changed, 11 insertions(+)

diff -urN linux-2.6.13-rc6/arch/arm/kernel/calls.S linux/arch/arm/kernel/calls.S
--- linux-2.6.13-rc6/arch/arm/kernel/calls.S	2005-06-17 15:48:29.0 -0400
+++ linux/arch/arm/kernel/calls.S	2005-08-10 15:26:10.0 -0400
@@ -327,6 +327,12 @@
 /* 310 */	.long	sys_request_key
 		.long	sys_keyctl
 		.long	sys_semtimedop
+/* vserver */	.long	sys_ni_syscall
+		.long	sys_ioprio_set
+/* 315 */	.long	sys_ioprio_get
+		.long	sys_inotify_init
+		.long	sys_inotify_add_watch
+		.long	sys_inotify_rm_watch
 __syscall_end:
 
 		.rept	NR_syscalls - (__syscall_end - __syscall_start) / 4
diff -urN linux-2.6.13-rc6/include/asm-arm/unistd.h linux/include/asm-arm/unistd.h
--- linux-2.6.13-rc6/include/asm-arm/unistd.h	2005-06-17 15:48:29.0 -0400
+++ linux/include/asm-arm/unistd.h	2005-08-10 15:26:08.0 -0400
@@ -350,6 +350,11 @@
 #endif
 #define __NR_vserver			(__NR_SYSCALL_BASE+313)
+#define __NR_ioprio_set			(__NR_SYSCALL_BASE+314)
+#define __NR_ioprio_get			(__NR_SYSCALL_BASE+315)
+#define __NR_inotify_init		(__NR_SYSCALL_BASE+316)
+#define __NR_inotify_add_watch		(__NR_SYSCALL_BASE+317)
+#define __NR_inotify_rm_watch		(__NR_SYSCALL_BASE+318)
 
 /*
  * The following SWIs are ARM private.
[patch] fsnotify: hook on removexattr, too
On Fri, 2005-08-05 at 19:07 +0100, marijn ros wrote:

> I got wondering, why does fs_notify_xattr get called from setxattr in
> fs/xattr.c, but not from removexattr that is below it in the same file?
> Both seem to make changes to xattrs and both are exported as system calls.

We should.

	Robert Love


Add fsnotify_xattr() hook to removexattr().

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/xattr.c |    2 ++
 1 files changed, 2 insertions(+)

diff -urN linux-2.6.13-rc5/fs/xattr.c linux/fs/xattr.c
--- linux-2.6.13-rc5/fs/xattr.c	2005-08-05 15:49:17.0 -0400
+++ linux/fs/xattr.c	2005-08-05 15:53:45.0 -0400
@@ -307,6 +307,8 @@
 		down(&d->d_inode->i_sem);
 		error = d->d_inode->i_op->removexattr(d, kname);
 		up(&d->d_inode->i_sem);
+		if (!error)
+			fsnotify_xattr(d);
 	}
 out:
 	return error;
[patch] inotify: update help text
The inotify help text still refers to the character device. Update it.

Fixes kernel bug #4993.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/Kconfig |   11 +++
 1 files changed, 7 insertions(+), 4 deletions(-)

--- linux-2.6.13-rc3-git8/fs/Kconfig	2005-07-27 10:59:32.0 -0400
+++ linux/fs/Kconfig	2005-08-04 09:26:46.0 -0400
@@ -363,12 +363,15 @@
 	bool "Inotify file change notification support"
 	default y
 	---help---
-	  Say Y here to enable inotify support and the /dev/inotify character
-	  device. Inotify is a file change notification system and a
+	  Say Y here to enable inotify support and the associated system
+	  calls. Inotify is a file change notification system and a
 	  replacement for dnotify. Inotify fixes numerous shortcomings in
 	  dnotify and introduces several new features. It allows monitoring
-	  of both files and directories via a single open fd. Multiple file
-	  events are supported.
+	  of both files and directories via a single open fd. Other features
+	  include multiple file events, one-shot support, and unmount
+	  notification.
+
+	  For more information, see Documentation/filesystems/inotify.txt
 
 	  If unsure, say Y.
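For readers unfamiliar with the system-call interface that the help text now refers to, here is a minimal user-space sketch of one round trip through it. It uses the standard glibc wrappers from <sys/inotify.h>. The function name inotify_create_roundtrip(), the file name "probe" and the /tmp location are assumptions made purely for illustration, and error handling is kept deliberately simple.

```c
#define _DEFAULT_SOURCE		/* for mkdtemp() under strict -std= modes */
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Watch a fresh temporary directory, create a file named "probe" in it, and
 * report whether an IN_CREATE event carrying that name is delivered.
 * Returns 1 if the event was seen, 0 if not, and -1 on any error.
 */
static int inotify_create_roundtrip(void)
{
	_Alignas(struct inotify_event) char buf[4096];
	char dir[] = "/tmp/inotify-sketch-XXXXXX";	/* hypothetical location */
	char path[PATH_MAX];
	const struct inotify_event *ev;
	ssize_t len, off;
	int fd = -1, wd = -1, file, result = -1;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(path, sizeof(path), "%s/probe", dir);

	fd = inotify_init();
	if (fd < 0)
		goto out;
	wd = inotify_add_watch(fd, dir, IN_CREATE);
	if (wd < 0)
		goto out;

	/* Trigger an event by creating a file inside the watched directory. */
	file = open(path, O_CREAT | O_WRONLY, 0600);
	if (file < 0)
		goto out;
	close(file);

	/* A single read() may return several variable-length events. */
	len = read(fd, buf, sizeof(buf));
	if (len < 0)
		goto out;

	result = 0;
	for (off = 0; off < len; off += sizeof(*ev) + ev->len) {
		ev = (const struct inotify_event *)(buf + off);
		if ((ev->mask & IN_CREATE) && ev->len > 0 &&
		    strcmp(ev->name, "probe") == 0)
			result = 1;
	}
out:
	if (wd >= 0)
		inotify_rm_watch(fd, wd);	/* takes the watch descriptor */
	if (fd >= 0)
		close(fd);			/* also drops remaining watches */
	unlink(path);
	rmdir(dir);
	return result;
}
```

The single file descriptor returned by inotify_init() carries events for every watch added to it, which is the "single open fd" property the help text mentions.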
Re: [patch] inotify: ppc64 syscalls.
On Wed, 2005-07-27 at 13:27 -0700, David S. Miller wrote:

> You'll notice that sys_ppc32.c has a ton of shims which purely
> exist to sign extend "int" system call arguments. Sparc64 does
> something similarly, but in assembler so that we don't eat the
> overhead of a full stack frame just to sign extend arguments.

Yah, but it looked like they did the sign extend thing for every int but
file descriptors, and fd's are the only int's we have.

	Robert Love
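The sign-extension concern in this exchange can be shown with plain integer conversions. This is a sketch under stated assumptions, not a description of any particular architecture's entry code: it models a 32-bit argument arriving in a 64-bit register with the upper half cleared, and as_raw_register() and sign_extend_arg() are hypothetical helper names.

```c
#include <stdint.h>

/*
 * Model of a 32-bit argument arriving in a 64-bit register with the upper
 * half cleared (an assumption of this sketch).
 */
static uint64_t as_raw_register(int32_t arg)
{
	return (uint64_t)(uint32_t)arg;
}

/*
 * What a sign-extending shim does: reinterpret the low 32 bits as a signed
 * value and widen it. The unsigned-to-signed narrowing relies on the
 * wrapping behaviour that GCC and Clang document.
 */
static int64_t sign_extend_arg(uint64_t reg)
{
	return (int64_t)(int32_t)(uint32_t)reg;
}
```

Under this model a negative int such as -1 reads back as 4294967295 unless it is sign extended, whereas a small non-negative value such as a valid file descriptor is identical either way. That is one way to read the remark that file descriptors were left without shims.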
[patch] inotify: ppc64 syscalls.
On Wed, 2005-07-27 at 09:55 -0700, Andrew Morton wrote:

> ppc64 likes to keep its 32-bit-syscall table in sync with ppc32 so it'd be
> best to do ppc64 while we're at it (both sys_call_table and
> sys_call_table32)

Sure thing.

Attached find inotify system call support for PPC64.

[ I don't think we need sys32 compatibility versions--and if we do, I
failed in life. ]

	Robert Love


Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/ppc64/kernel/misc.S   |    6 ++
 include/asm-ppc64/unistd.h |    5 -
 2 files changed, 10 insertions(+), 1 deletion(-)

diff -urN linux-2.6.13-rc3-git8/arch/ppc64/kernel/misc.S linux/arch/ppc64/kernel/misc.S
--- linux-2.6.13-rc3-git8/arch/ppc64/kernel/misc.S	2005-07-27 10:59:31.0 -0400
+++ linux/arch/ppc64/kernel/misc.S	2005-07-27 13:26:36.0 -0400
@@ -1129,6 +1129,9 @@
 	.llong .compat_sys_waitid
 	.llong .sys32_ioprio_set
 	.llong .sys32_ioprio_get
+	.llong .sys_inotify_init	/* 275 */
+	.llong .sys_inotify_add_watch
+	.llong .sys_inotify_rm_watch
 
 	.balign 8
 _GLOBAL(sys_call_table)
@@ -1407,3 +1410,6 @@
 	.llong .sys_waitid
 	.llong .sys_ioprio_set
 	.llong .sys_ioprio_get
+	.llong .sys_inotify_init	/* 275 */
+	.llong .sys_inotify_add_watch
+	.llong .sys_inotify_rm_watch
diff -urN linux-2.6.13-rc3-git8/include/asm-ppc64/unistd.h linux/include/asm-ppc64/unistd.h
--- linux-2.6.13-rc3-git8/include/asm-ppc64/unistd.h	2005-07-27 10:59:32.0 -0400
+++ linux/include/asm-ppc64/unistd.h	2005-07-27 13:27:24.0 -0400
@@ -285,8 +285,11 @@
 #define __NR_waitid		272
 #define __NR_ioprio_set		273
 #define __NR_ioprio_get		274
+#define __NR_inotify_init	275
+#define __NR_inotify_add_watch	276
+#define __NR_inotify_rm_watch	277
 
-#define __NR_syscalls		275
+#define __NR_syscalls		278
 #ifdef __KERNEL__
 #define NR_syscalls	__NR_syscalls
 #endif
[patch] inotify: ia64 syscalls.
Hi, Tony.

Attached patch adds the inotify syscalls to ia64.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/ia64/kernel/entry.S  |    6 +++---
 include/asm-ia64/unistd.h |    3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff -urN linux-2.6.13-rc3-git8/arch/ia64/kernel/entry.S linux/arch/ia64/kernel/entry.S
--- linux-2.6.13-rc3-git8/arch/ia64/kernel/entry.S	2005-07-27 10:59:31.0 -0400
+++ linux/arch/ia64/kernel/entry.S	2005-07-27 11:51:20.0 -0400
@@ -1574,8 +1574,8 @@
 	data8 sys_ioprio_set
 	data8 sys_ioprio_get			// 1275
 	data8 sys_set_zone_reclaim
-	data8 sys_ni_syscall
-	data8 sys_ni_syscall
-	data8 sys_ni_syscall
+	data8 sys_inotify_init
+	data8 sys_inotify_add_watch
+	data8 sys_inotify_rm_watch
 
 	.org sys_call_table + 8*NR_syscalls	// guard against failures to increase NR_syscalls
diff -urN linux-2.6.13-rc3-git8/include/asm-ia64/unistd.h linux/include/asm-ia64/unistd.h
--- linux-2.6.13-rc3-git8/include/asm-ia64/unistd.h	2005-07-27 10:59:32.0 -0400
+++ linux/include/asm-ia64/unistd.h	2005-07-27 11:56:43.0 -0400
@@ -266,6 +266,9 @@
 #define __NR_ioprio_set			1274
 #define __NR_ioprio_get			1275
 #define __NR_set_zone_reclaim		1276
+#define __NR_inotify_init		1277
+#define __NR_inotify_add_watch		1278
+#define __NR_inotify_rm_watch		1279
 
 #ifdef __KERNEL__
[patch] inotify: ppc32 syscalls.
Hey, Paulus,

Add inotify system call stubs to PPC32.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/ppc/kernel/misc.S   |    3 +++
 include/asm-ppc/unistd.h |    5 -
 2 files changed, 7 insertions(+), 1 deletion(-)

diff -urN linux-2.6.13-rc3-git8/arch/ppc/kernel/misc.S linux/arch/ppc/kernel/misc.S
--- linux-2.6.13-rc3-git8/arch/ppc/kernel/misc.S	2005-07-27 10:59:31.0 -0400
+++ linux/arch/ppc/kernel/misc.S	2005-07-27 11:25:43.0 -0400
@@ -1451,3 +1451,6 @@
 	.long sys_waitid
 	.long sys_ioprio_set
 	.long sys_ioprio_get
+	.long sys_inotify_init		/* 275 */
+	.long sys_inotify_add_watch
+	.long sys_inotify_rm_watch
diff -urN linux-2.6.13-rc3-git8/include/asm-ppc/unistd.h linux/include/asm-ppc/unistd.h
--- linux-2.6.13-rc3-git8/include/asm-ppc/unistd.h	2005-07-27 10:59:32.0 -0400
+++ linux/include/asm-ppc/unistd.h	2005-07-27 11:25:26.0 -0400
@@ -279,8 +279,11 @@
 #define __NR_waitid		272
 #define __NR_ioprio_set		273
 #define __NR_ioprio_get		274
+#define __NR_inotify_init	275
+#define __NR_inotify_add_watch	276
+#define __NR_inotify_rm_watch	277
 
-#define __NR_syscalls		275
+#define __NR_syscalls		278
 
 #define __NR(n)	#n
Re: [patch] inotify: add x86-64 syscall numbers
On Fri, 2005-07-15 at 22:01 +0200, Andi Kleen wrote:

> It won't work anyways because you forgot to patch the compat
> sys32_open.

Well, "won't work" is a bit harsh, it's just one hook. But that was next.
I usually leave per-arch stuff to the arch folks.

	Robert Love


Add fsnotify_open() hook to sys32_open() on x86-64.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/x86_64/ia32/sys_ia32.c |    5 -
 1 files changed, 4 insertions(+), 1 deletion(-)

diff -urN linux-2.6.13-rc3/arch/x86_64/ia32/sys_ia32.c linux/arch/x86_64/ia32/sys_ia32.c
--- linux-2.6.13-rc3/arch/x86_64/ia32/sys_ia32.c	2005-07-15 16:08:27.0 -0400
+++ linux/arch/x86_64/ia32/sys_ia32.c	2005-07-15 16:07:21.0 -0400
@@ -61,6 +61,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -984,8 +985,10 @@
 			if (IS_ERR(f)) {
 				put_unused_fd(fd);
 				fd = error;
-			} else
+			} else {
+				fsnotify_open(f->f_dentry);
 				fd_install(fd, f);
+			}
 		}
 		putname(tmp);
 	}
[patch] inotify: add x86-64 syscall numbers
Andi,

Attached patch adds the inotify syscall numbers to x86-64.

Also adds the new ioprio_get() and ioprio_set() calls to the IA32 layer.

	Robert Love


Add the inotify syscalls to x86-64

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/x86_64/ia32/ia32entry.S     |    8 ++--
 include/asm-x86_64/ia32_unistd.h |    7 ++-
 include/asm-x86_64/unistd.h      |    8 +++-
 3 files changed, 19 insertions(+), 4 deletions(-)

diff -urN linux-2.6.13-rc3/arch/x86_64/ia32/ia32entry.S linux/arch/x86_64/ia32/ia32entry.S
--- linux-2.6.13-rc3/arch/x86_64/ia32/ia32entry.S	2005-07-13 10:51:10.0 -0400
+++ linux/arch/x86_64/ia32/ia32entry.S	2005-07-15 15:47:59.0 -0400
@@ -591,11 +591,15 @@
 	.quad compat_sys_mq_getsetattr
 	.quad compat_sys_kexec_load	/* reserved for kexec */
 	.quad compat_sys_waitid
-	.quad quiet_ni_syscall		/* sys_altroot */
+	.quad quiet_ni_syscall		/* 285: sys_altroot */
 	.quad sys_add_key
 	.quad sys_request_key
 	.quad sys_keyctl
-	/* don't forget to change IA32_NR_syscalls */
+	.quad sys_ioprio_set
+	.quad sys_ioprio_get		/* 290 */
+	.quad sys_inotify_init
+	.quad sys_inotify_add_watch
+	.quad sys_inotify_rm_watch
 ia32_syscall_end:
 	.rept IA32_NR_syscalls-(ia32_syscall_end-ia32_sys_call_table)/8
 		.quad ni_syscall
diff -urN linux-2.6.13-rc3/include/asm-x86_64/ia32_unistd.h linux/include/asm-x86_64/ia32_unistd.h
--- linux-2.6.13-rc3/include/asm-x86_64/ia32_unistd.h	2005-07-13 10:51:00.0 -0400
+++ linux/include/asm-x86_64/ia32_unistd.h	2005-07-15 15:48:50.0 -0400
@@ -294,7 +294,12 @@
 #define __NR_ia32_add_key		286
 #define __NR_ia32_request_key		287
 #define __NR_ia32_keyctl		288
+#define __NR_ia32_ioprio_set		289
+#define __NR_ia32_ioprio_get		290
+#define __NR_ia32_inotify_init		291
+#define __NR_ia32_inotify_add_watch	292
+#define __NR_ia32_inotify_rm_watch	293
 
-#define IA32_NR_syscalls 290	/* must be > than biggest syscall! */
+#define IA32_NR_syscalls 294	/* must be > than biggest syscall! */
 
 #endif /* _ASM_X86_64_IA32_UNISTD_H_ */
diff -urN linux-2.6.13-rc3/include/asm-x86_64/unistd.h linux/include/asm-x86_64/unistd.h
--- linux-2.6.13-rc3/include/asm-x86_64/unistd.h	2005-07-13 10:51:14.0 -0400
+++ linux/include/asm-x86_64/unistd.h	2005-07-15 15:49:37.0 -0400
@@ -565,8 +565,14 @@
 __SYSCALL(__NR_ioprio_set, sys_ioprio_set)
 #define __NR_ioprio_get		252
 __SYSCALL(__NR_ioprio_get, sys_ioprio_get)
+#define __NR_inotify_init	253
+__SYSCALL(__NR_inotify_init, sys_inotify_init)
+#define __NR_inotify_add_watch	254
+__SYSCALL(__NR_inotify_add_watch, sys_inotify_add_watch)
+#define __NR_inotify_rm_watch	255
+__SYSCALL(__NR_inotify_rm_watch, sys_inotify_rm_watch)
 
-#define __NR_syscall_max __NR_ioprio_get
+#define __NR_syscall_max __NR_inotify_rm_watch
 #ifndef __NO_STUBS
 
 /* user-visible error numbers are in the range -1 - -4095 */
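The "must be > than biggest syscall!" comment in this patch reflects a general invariant of number-indexed dispatch tables: the table size has to exceed the largest valid index, and anything outside it has to be rejected. Below is a minimal user-space sketch of that invariant. Every name in it (TOY_NR_*, toy_table, toy_dispatch()) is hypothetical and none of the numbers come from a real ABI.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical syscall numbers for a toy table; not taken from any real ABI. */
enum {
	TOY_NR_GETANSWER = 0,
	TOY_NR_ADD_ONE = 1,
	TOY_NR_SYSCALLS	/* stays one greater than the largest number above */
};

typedef long (*toy_syscall_fn)(long arg);

static long toy_getanswer(long arg)
{
	(void)arg;
	return 42;
}

static long toy_add_one(long arg)
{
	return arg + 1;
}

/* The table is sized by the count, so adding a number grows it too. */
static const toy_syscall_fn toy_table[TOY_NR_SYSCALLS] = {
	[TOY_NR_GETANSWER] = toy_getanswer,
	[TOY_NR_ADD_ONE] = toy_add_one,
};

/* Reject numbers outside the table, and unfilled slots, with -ENOSYS. */
static long toy_dispatch(long nr, long arg)
{
	if (nr < 0 || nr >= TOY_NR_SYSCALLS || toy_table[nr] == NULL)
		return -ENOSYS;
	return toy_table[nr](arg);
}
```

Deriving the count from the enum avoids the manual bump that the patch has to make by hand (290 to 294). The kernel headers of that era kept the count as a separate #define, which is why each of these patches touches it.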
[patch] inotify: documentation update
Linus,

Trivial documentation update for inotify. Please, apply.

	Robert Love


Clean up and expand some of the inotify documentation.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 Documentation/filesystems/inotify.txt |   77 +++---
 1 files changed, 45 insertions(+), 32 deletions(-)

diff -urN linux-2.6.13-rc3/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt
--- linux-2.6.13-rc3/Documentation/filesystems/inotify.txt	2005-07-13 10:51:09.0 -0400
+++ linux/Documentation/filesystems/inotify.txt	2005-07-14 15:17:59.0 -0400
@@ -1,18 +1,22 @@
-				    inotify
-     a powerful yet simple file change notification system
+				   inotify
+	    a powerful yet simple file change notification system
 
 
 
 Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]>
 
+
 (i) User Interface
 
-Inotify is controlled by a set of three sys calls
+Inotify is controlled by a set of three system calls and normal file I/O on a
+returned file descriptor.
 
-First step in using inotify is to initialise an inotify instance
+First step in using inotify is to initialise an inotify instance:
 
 	int fd = inotify_init ();
 
+Each instance is associated with a unique, ordered queue.
+
 Change events are managed by "watches". A watch is an (object,mask) pair where
 the object is a file or directory and the mask is a bit mask of one or more
 inotify events that the application wishes to receive. See
@@ -22,43 +26,52 @@
 
 Watches on a directory will return events on any files inside of the directory.
 
-Adding a watch is simple,
+Adding a watch is simple:
 
 	int wd = inotify_add_watch (fd, path, mask);
 
-You can add a large number of files via something like
-
-	for each file to watch {
-		int wd = inotify_add_watch (fd, file, mask);
-	}
+Where "fd" is the return value from inotify_init(), path is the path to the
+object to watch, and mask is the watch mask (see ).
 
 You can update an existing watch in the same manner, by passing in a new mask.
 
-An existing watch is removed via the INOTIFY_IGNORE ioctl, for example
+An existing watch is removed via
 
-	inotify_rm_watch (fd, wd);
+	int ret = inotify_rm_watch (fd, wd);
 
 Events are provided in the form of an inotify_event structure that is read(2)
-from a inotify instance fd. The filename is of dynamic length and follows the
-struct. It is of size len. The filename is padded with null bytes to ensure
-proper alignment. This padding is reflected in len.
+from a given inotify instance. The filename is of dynamic length and follows
+the struct. It is of size len. The filename is padded with null bytes to
+ensure proper alignment. This padding is reflected in len.
 
 You can slurp multiple events by passing a large buffer, for example
 
 	size_t len = read (fd, buf, BUF_LEN);
 
-Will return as many events as are available and fit in BUF_LEN.
+Where "buf" is a pointer to an array of "inotify_event" structures at least
+BUF_LEN bytes in size. The above example will return as many events as are
+available and fit in BUF_LEN.
 
-each inotify instance fd is also select()- and poll()-able.
+Each inotify instance fd is also select()- and poll()-able.
 
-You can find the size of the current event queue via the FIONREAD ioctl.
+You can find the size of the current event queue via the standard FIONREAD
+ioctl on the fd returned by inotify_init().
 
 All watches are destroyed and cleaned up on close.
 
 
-(ii) Internal Kernel Implementation
+(ii)
+
+Prototypes:
+
+	int inotify_init (void);
+	int inotify_add_watch (int fd, const char *path, __u32 mask);
+	int inotify_rm_watch (int fd, __u32 mask);
+
 
-Each open inotify instance is associated with an inotify_device structure.
+(iii) Internal Kernel Implementation
+
+Each inotify instance is associated with an inotify_device structure.
 
 Each watch is associated with an inotify_watch structure. Watches are chained
 off of each associated device and each associated inode.
@@ -66,7 +79,7 @@
 See fs/inotify.c for the locking and lifetime rules.
 
 
-(iii) Rationale
+(iv) Rationale
 
 Q: What is the design decision behind not tying the watch to the open fd of
    the watched object?
@@ -75,9 +88,9 @@
    This solves the primary problem with dnotify: keeping the file open pins
    the file and thus, worse, pins the mount. Dnotify is therefore infeasible
    for use on a desktop system with removable media as the media cannot be
-   unmounted.
+   unmounted. Watching a file should not require that it be open.
 
-Q: What is the design decision behind using an-fd-per-device as opposed to
+Q: What is the design decision behind using an-fd-per-instance as opposed to
    an fd-per-watch?
 
 A: An fd-per-watch quickly consumes more file descriptors than a
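For readers who want to try the three-call interface described in the
documentation patch above, here is a minimal user-space sketch. It is
not part of the patch. It assumes the glibc wrappers declared in
<sys/inotify.h>, which postdate this thread, and the helper names
walk_events() and watch_once() are made up for illustration. The part
worth noting is the stride in walk_events(): each record is the fixed
struct followed by ev->len bytes of null-padded name, as the
documentation text says.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

#define BUF_LEN 4096

/*
 * Walk a buffer filled by read(2) on an inotify fd and return the number
 * of complete events found.  A record that does not fit in len is ignored.
 */
int walk_events(const char *buf, size_t len)
{
	size_t off = 0;
	int count = 0;

	while (off + sizeof(struct inotify_event) <= len) {
		const struct inotify_event *ev =
			(const struct inotify_event *)(buf + off);

		if (off + sizeof(*ev) + ev->len > len)
			break;	/* truncated record */

		printf("wd=%d mask=0x%x name=%s\n", ev->wd,
		       (unsigned int)ev->mask, ev->len ? ev->name : "(none)");

		/* the padding is already counted in ev->len */
		off += sizeof(*ev) + ev->len;
		count++;
	}
	return count;
}

/*
 * Hypothetical helper: watch one path, block for one batch of events,
 * then clean up.  Returns the number of events read, or -1 on error.
 */
int watch_once(const char *path, uint32_t mask)
{
	char buf[BUF_LEN]
		__attribute__((aligned(__alignof__(struct inotify_event))));
	ssize_t len;
	int fd, wd, n = -1;

	fd = inotify_init();
	if (fd < 0)
		return -1;

	wd = inotify_add_watch(fd, path, mask);
	if (wd >= 0) {
		len = read(fd, buf, sizeof(buf));	/* blocks until an event */
		if (len > 0)
			n = walk_events(buf, (size_t)len);
		inotify_rm_watch(fd, wd);	/* second argument is the wd */
	}
	close(fd);	/* closing the instance drops any remaining watches */
	return n;
}
```

A caller might use watch_once("/tmp", IN_CREATE | IN_DELETE) and then
create a file in /tmp from another shell to see one batch printed.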
[patch 3/3] inotify: misc cleanup
Linus,

Real simple, basic cleanup. Please, apply.

	Robert Love


Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/inotify.c          |    9 +++--
 include/linux/sched.h |    2 +-
 kernel/user.c         |    2 +-
 3 files changed, 5 insertions(+), 8 deletions(-)

diff -urN linux-inotify/fs/inotify.c linux/fs/inotify.c
--- linux-inotify/fs/inotify.c	2005-07-13 11:26:02.0 -0400
+++ linux/fs/inotify.c	2005-07-13 11:41:25.0 -0400
@@ -29,8 +29,6 @@
 #include
 #include
 #include
-#include
-#include
 #include
 #include
 #include
@@ -936,7 +934,7 @@
 
 	dev = filp->private_data;
 
-	ret = find_inode ((const char __user*)path, &nd);
+	ret = find_inode((const char __user*) path, &nd);
 	if (ret)
 		goto fput_and_out;
 
@@ -993,8 +991,9 @@
 	if (!filp)
 		return -EBADF;
 	dev = filp->private_data;
-	ret = inotify_ignore (dev, wd);
+	ret = inotify_ignore(dev, wd);
 	fput(filp);
+
 	return ret;
 }
 
@@ -1034,8 +1033,6 @@
 					sizeof(struct inotify_kernel_event),
 					0, SLAB_PANIC, NULL, NULL);
 
-	printk(KERN_INFO "inotify syscall\n");
-
 	return 0;
 }
 
Re: supporting functions missing from inotify patch
On Wed, 2005-07-13 at 15:21 -0500, Steve French wrote:

> I did not think that inotify_add_watch called dir_notify. I don't see a
> path in which calls to add a new inotify watch end up in a call to
> fcntl_dirnotify or file->dir_notify
> This is for the case in which an app only calls inotify ioctl - ie does
> not [also] do a call to dnotify.

No, you are right, they do not, right now.

Is dir_notify suitable for inotify and your uses? In the 10 months of
inotify development, I had hoped that a remote filesystem developer would
add support so we could test it. But there is no rush to get this hook
added, so it's okay.

The problem with dir_notify is that the args parameter is dnotify flags.
Those don't map directly to inotify flags.

What I'd like is

	(a) a patch adding the requisite inotify hook (really, 4 lines)
	(b) a filesystem successfully using the hook

> Without such a call - an app that does your new ioctl to add a watch on
> a file or directory will not cause the network/cluster fs to turn on
> notification on the server since the watch will be not seen by the
> client filesystem.

It is a system call, now. ;-)

> OK - you exported a common underlying function
>	inotify_inode_queue_event
> under the inline functions which the network/cluster fs would call to
> notify of remote changes. That makes sense. I had missed that.

Nod.

	Robert Love
Re: supporting functions missing from inotify patch
On Wed, 2005-07-13 at 13:43 -0500, Steve French wrote:

> I don't see an inode operation for registering inotify events in the fs
> (there is a file operation for dir_notify to register its events). In
> create_watch in fs/inotify.c I expected to see something like:

Why not use the existing dir_notify method? No point in adding another.

Add inotify hooks as needed to your filesystem's dir_notify.

> I also don't see exports for
>	fsnotify_access
>	fsnotify_modify
>
> Without these exports, network and cluster filesystems can't notify the
> local system about changes.

Eh? They are in .

	Robert Love
[patch 2/3] inotify: event ordering
Linus,

Attached patch rearranges the event ordering for "open" to be consistent
with the ordering of the other events.

Patch is against current git tree. Please, apply.

	Robert Love


Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 include/linux/fsnotify.h |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -urN linux-inotify/include/linux/fsnotify.h linux/include/linux/fsnotify.h
--- linux-inotify/include/linux/fsnotify.h	2005-07-13 11:25:31.0 -0400
+++ linux/include/linux/fsnotify.h	2005-07-13 11:24:27.0 -0400
@@ -125,8 +125,8 @@
 	if (S_ISDIR(inode->i_mode))
 		mask |= IN_ISDIR;
 
-	inotify_inode_queue_event(inode, mask, 0, NULL);
 	inotify_dentry_parent_queue_event(dentry, mask, 0, dentry->d_name.name);
+	inotify_inode_queue_event(inode, mask, 0, NULL);
 }
 
 /*
[patch 1/3] inotify: move sysctl
Linus,

Attached patch moves the inotify sysctl knobs to "/proc/sys/fs/inotify"
from "/proc/sys/fs". Also some related cleanup.

Patch is against current git tree. Please, apply.

	Robert Love


Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/inotify.c           |   49 +++
 include/linux/sysctl.h |   12 +--
 kernel/sysctl.c        |   51 ++---
 3 files changed, 62 insertions(+), 50 deletions(-)

diff -urN linux-2.6.13-rc3/fs/inotify.c linux/fs/inotify.c
--- linux-2.6.13-rc3/fs/inotify.c	2005-07-13 10:51:12.0 -0400
+++ linux/fs/inotify.c	2005-07-13 12:36:12.0 -0400
@@ -45,8 +45,8 @@
 
 static struct vfsmount *inotify_mnt;
 
-/* These are configurable via /proc/sys/inotify */
-int inotify_max_user_devices;
+/* these are configurable via /proc/sys/fs/inotify/ */
+int inotify_max_user_instances;
 int inotify_max_user_watches;
 int inotify_max_queued_events;
 
@@ -125,6 +125,47 @@
 	u32			mask;	/* event mask for this watch */
 };
 
+#ifdef CONFIG_SYSCTL
+
+#include
+
+static int zero;
+
+ctl_table inotify_table[] = {
+	{
+		.ctl_name	= INOTIFY_MAX_USER_INSTANCES,
+		.procname	= "max_user_instances",
+		.data		= &inotify_max_user_instances,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec_minmax,
+		.strategy	= &sysctl_intvec,
+		.extra1		= &zero,
+	},
+	{
+		.ctl_name	= INOTIFY_MAX_USER_WATCHES,
+		.procname	= "max_user_watches",
+		.data		= &inotify_max_user_watches,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec_minmax,
+		.strategy	= &sysctl_intvec,
+		.extra1		= &zero,
+	},
+	{
+		.ctl_name	= INOTIFY_MAX_QUEUED_EVENTS,
+		.procname	= "max_queued_events",
+		.data		= &inotify_max_queued_events,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec_minmax,
+		.strategy	= &sysctl_intvec,
+		.extra1		= &zero
+	},
+	{ .ctl_name = 0 }
+};
+#endif /* CONFIG_SYSCTL */
+
 static inline void get_inotify_dev(struct inotify_device *dev)
 {
 	atomic_inc(&dev->count);
@@ -842,7 +883,7 @@
 
 	user = get_uid(current->user);
 
-	if (unlikely(atomic_read(&user->inotify_devs) >= inotify_max_user_devices)) {
+	if (unlikely(atomic_read(&user->inotify_devs) >= inotify_max_user_instances)) {
 		ret = -EMFILE;
 		goto out_err;
 	}
@@ -979,7 +1020,7 @@
 	inotify_mnt = kern_mount(&inotify_fs_type);
 
 	inotify_max_queued_events = 8192;
-	inotify_max_user_devices = 128;
+	inotify_max_user_instances = 8;
 	inotify_max_user_watches = 8192;
 
 	atomic_set(&inotify_cookie, 0);
diff -urN linux-2.6.13-rc3/include/linux/sysctl.h linux/include/linux/sysctl.h
--- linux-2.6.13-rc3/include/linux/sysctl.h	2005-07-13 10:51:15.0 -0400
+++ linux/include/linux/sysctl.h	2005-07-13 11:11:24.0 -0400
@@ -61,8 +61,7 @@
 	CTL_DEV=7,		/* Devices */
 	CTL_BUS=8,		/* Busses */
 	CTL_ABI=9,		/* Binary emulation */
-	CTL_CPU=10,		/* CPU stuff (speed scaling, etc) */
-	CTL_INOTIFY=11		/* Inotify */
+	CTL_CPU=10		/* CPU stuff (speed scaling, etc) */
 };
 
 /* CTL_BUS names: */
@@ -71,12 +70,12 @@
 	CTL_BUS_ISA=1		/* ISA */
 };
 
-/* CTL_INOTIFY names: */
+/* /proc/sys/fs/inotify/ */
 enum
 {
-	INOTIFY_MAX_USER_DEVICES=1,	/* max number of inotify device instances per user */
-	INOTIFY_MAX_USER_WATCHES=2,	/* max number of inotify watches per user */
-	INOTIFY_MAX_QUEUED_EVENTS=3	/* Max number of queued events per inotify device instance */
+	INOTIFY_MAX_USER_INSTANCES=1,	/* max instances per user */
+	INOTIFY_MAX_USER_WATCHES=2,	/* max watches per user */
+	INOTIFY_MAX_QUEUED_EVENTS=3	/* max queued events per instance */
 };
 
 /* CTL_KERN names: */
@@ -685,6 +684,7 @@
 	FS_XFS=17,	/* struct: control xfs parameters */
 	FS_AIO_NR=18,	/* current system-wide number of aio requests */
 	FS_AIO_MAX_NR=19,	/* system-wide maximum number of aio requests */
+	FS_INOTIFY=20,	/* inotify submenu */
 };
 
 /* /proc/sys/fs/quota/ */
diff -urN linux-2.6.13-r
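With the knobs moved as the patch above describes, user space would
find them as plain integer files named after the procname fields:
max_user_instances, max_user_watches and max_queued_events under
/proc/sys/fs/inotify/. The following is a small sketch of reading one
of them. The helper name read_int_knob() is made up for illustration
and is not part of the patch.

```c
#include <stdio.h>

/*
 * Read a single integer from a sysctl-style file, for example
 * "/proc/sys/fs/inotify/max_user_watches".
 * Returns 0 on success and stores the value in *value; returns -1 if
 * the file cannot be opened or does not start with an integer.
 */
int read_int_knob(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);

	return ok ? 0 : -1;
}
```

On a kernel without inotify, or one that still uses the old location,
the call simply returns -1, so a caller can fall back to a default.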
Re: [RFC/PATCH 1/2] fsnotify
On Mon, 2005-07-11 at 13:52 +0100, David Woodhouse wrote:

> To be honest, I don't really see that this is in any way better than
> what we had before. Yes, two different pieces of code actually use hooks
> in similar places in the VFS code. But this 'infrastructure' just to
> share those hooks is overkill as far as I can tell. It really isn't any
> better than having both inotify and audit hooks side by side where we
> can actually see what's going on at a glance. In fact, it's worse.

I think what makes this patch look superfluous is that Chris added a set
of wrappers for dnotify, too.

In the inotify patch, the fsnotify wrappers call directly into the inotify
and dnotify interfaces and they do consolidate code and clean things up. I
added fsnotify at hch's request.

Now that audit is coming along, fsnotify makes even more sense.

I would like to share some more code at a lower level, though, as you
pointed out. I planned to look at redoing dnotify entirely on top of
inotify, once inotify is in the kernel proper, for example.

	Robert Love
Re: 2.6.13-rc2-mm1
On Thu, 2005-07-07 at 04:00 -0700, Andrew Morton wrote:

> - Anything which you think needs to go into 2.6.13, please let me know.

Inotify?

	Robert Love
Re: [-mm patch] Fix inotify umount hangs.
On Mon, 2005-07-04 at 20:28 +0100, Anton Altaparmakov wrote:

> The below patch against 2.6.13-rc1-mm1 fixes the umount hangs caused by
> inotify.

Thank you very much, Anton, for hacking on this over the weekend.

It's definitely not the prettiest thing, but there may be no easier
approach. One thing: the messy code is working around the list changing.
Doesn't invalidate_inodes() have the same problem? If so, it solves it
differently.

I'm also curious whether the I_WILL_FREE or i_count check fixed the bug. I
suspect the other fix did, but we probably still want this. Or at least
the I_WILL_FREE check.

Anyhow... I'll send out an updated inotify patch after some testing.

Thanks again.

	Robert Love
Re: [patch] updated inotify for 2.6.12-rc3.
On Fri, 2005-04-22 at 22:13 +0100, Al Viro wrote:

> Or it would, if remove_watch() had been called only once. In the scenario
> above that will not be true.

Thanks.

	Robert Love


Double check that we don't race.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/inotify.c |    9 +++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff -urN linux-2.6.12-rc3-inotify/fs/inotify.c linux/fs/inotify.c
--- linux-2.6.12-rc3-inotify/fs/inotify.c	2005-04-22 19:20:14.0 -0400
+++ linux/fs/inotify.c	2005-04-22 19:25:44.0 -0400
@@ -861,12 +861,17 @@
 		return -EINVAL;
 	}
 	get_inotify_watch(watch);
+	inode = watch->inode;
 	up(&dev->sem);
 
-	inode = watch->inode;
 	down(&inode->inotify_sem);
 	down(&dev->sem);
-	remove_watch(watch, dev);
+
+	/* make sure we did not race */
+	watch = idr_find(&dev->idr, wd);
+	if (likely(watch))
+		remove_watch(watch, dev);
+
 	up(&dev->sem);
 	up(&inode->inotify_sem);
 	put_inotify_watch(watch);
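The shape of this fix (look up under one lock, drop it to respect the
lock ordering, retake both locks, then look up again before acting) can
be sketched in user space. The code below is only an illustration under
stated assumptions: pthread mutexes stand in for the kernel semaphores,
a small array stands in for the idr, every name is hypothetical, and
the reference counting done by get_inotify_watch() and
put_inotify_watch() is left out.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WATCHES 8

struct watch {
	int wd;
};

struct device {
	pthread_mutex_t lock;			/* stands in for dev->sem */
	struct watch *table[MAX_WATCHES];	/* stands in for dev->idr */
};

/* stands in for inode->inotify_sem; must be taken before dev->lock */
static pthread_mutex_t inode_lock = PTHREAD_MUTEX_INITIALIZER;

/* Look up a watch by descriptor.  Caller must hold dev->lock. */
static struct watch *find_watch(struct device *dev, int wd)
{
	if (wd < 0 || wd >= MAX_WATCHES)
		return NULL;
	return dev->table[wd];
}

/* Remove watch wd.  Returns 0 if this call removed it, -1 otherwise. */
int ignore_watch(struct device *dev, int wd)
{
	int removed = 0;

	/* first lookup, under the device lock only */
	pthread_mutex_lock(&dev->lock);
	if (!find_watch(dev, wd)) {
		pthread_mutex_unlock(&dev->lock);
		return -1;
	}
	pthread_mutex_unlock(&dev->lock);

	/*
	 * The device lock was dropped so the locks can be retaken in the
	 * required order.  Another thread may have removed the watch in
	 * that window, so look it up again before touching it.
	 */
	pthread_mutex_lock(&inode_lock);
	pthread_mutex_lock(&dev->lock);
	if (find_watch(dev, wd)) {
		dev->table[wd] = NULL;
		removed = 1;
	}
	pthread_mutex_unlock(&dev->lock);
	pthread_mutex_unlock(&inode_lock);

	return removed ? 0 : -1;
}
```

The second lookup is what keeps the removal from running twice when two
callers race on the same descriptor: only the caller that still finds
the entry performs it.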
[patch] updated inotify for 2.6.12-rc3.
On Thu, 2005-04-21 at 01:13 -0400, Robert Love wrote:

> Live from linux.conf.au, below is inotify against 2.6.12-rc3.

Here is an updated rediff for 2.6.12-rc3, with the changes from the last
day or so added:

	- Add oneshot support for Tridge and Jeremy.
	- Send IN_ATTRIB event on xattr change.
	- Mark the open device as nonseekable.

Andrew, can you please replace the current inotify patch in 2.6-mm with
this?

Best,

	Robert Love


inotify!

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

	* dnotify requires the opening of one fd per each directory
	  that you intend to watch. This quickly results in too many
	  open files and pins removable media, preventing unmount.
	* dnotify is directory-based. You only learn about changes to
	  directories. Sure, a change to a file in a directory affects
	  the directory, but you are then forced to keep a cache of
	  stat structures.
	* dnotify's interface to user-space is awful. Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

	* inotify's interface is a device node, not SIGIO. You open a
	  single fd to the device node, which is select()-able.
	* inotify has an event that says "the filesystem that the item
	  you were watching is on was unmounted."
	* inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure),
Gamin (a FAM replacement), and other projects.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 Documentation/filesystems/inotify.txt |   81 ++
 fs/Kconfig                            |   13
 fs/Makefile                           |    1
 fs/attr.c                             |   33 -
 fs/compat.c                           |   12
 fs/file_table.c                       |    3
 fs/inode.c                            |    6
 fs/inotify.c                          |  965 ++
 fs/namei.c                            |   30 -
 fs/open.c                             |    4
 fs/read_write.c                       |   15
 fs/xattr.c                            |    5
 include/linux/fs.h                    |    6
 include/linux/fsnotify.h              |  230
 include/linux/inotify.h               |  112 +++
 include/linux/sched.h                 |    4
 kernel/user.c                         |    4
 17 files changed, 1468 insertions(+), 56 deletions(-)

diff -urN linux-2.6.12-rc3/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt
--- linux-2.6.12-rc3/Documentation/filesystems/inotify.txt	1969-12-31 19:00:00.0 -0500
+++ linux/Documentation/filesystems/inotify.txt	2005-04-22 00:48:39.0 -0400
@@ -0,0 +1,81 @@
+				    inotify
+     a powerful yet simple file change notification system
+
+
+
+Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]>
+
+(i) User Interface
+
+Inotify is controlled by a device node, /dev/inotify. If you do not use udev,
+this device may need to be created manually. First step, open it
+
+	int dev_fd = open ("/dev/inotify", O_RDONLY);
+
+Change events are managed by "watches". A watch is an (object,mask) pair where
+the object is a file or directory and the mask is a bitmask of one or more
+inotify events that the application wishes to receive. See
+for valid events. A watch is referenced by a watch descriptor, or wd.
+
+Watches are added via a file descriptor.
+
+Watches on a directory will return events on any files inside of the directory.
+
+Adding a watch is simple,
+
+	/* 'wd' represents the watch on fd with mask */
+	struct inotify_request req = { fd, mask };
+	int wd = ioctl (dev_fd, INOTIFY_WATCH, &req);
+
+You can add a large number of files via something like
+
+	for each file to watch {
+		struct inotify_request req;
+		int file_fd;
+
+		file_fd = open (file, O_RDONLY);
+		if (fd < 0) {
+			perror ("open");
+			break;
+		}
+
+		req.fd = file_fd;
+		req.mask = mask;
+
+		wd = ioctl (dev_fd, INOTIFY_WATCH, &req);
+
+		close (fd);
+	}
+
+You can update an existing watch in the same manner, by passing in a new mask.
+
+An existing watch is removed via the INOTIFY_IGNORE ioctl, for example
+
+	ioctl (dev_fd, INOTIFY_IGNORE, wd);
+
+Events are provided in the form of an inotify_event structure that is read(2)
+from /dev/inotify. The filename is of dynamic length and follows the struct.
+It is of size len. The filename is padded with null bytes to ensure proper
+alignment. This padding is reflected in len.
+
+You
Re: [patch] inotify for 2.6.12-rc3.
On Thu, 2005-04-21 at 01:13 -0400, Robert Love wrote:

> Live from linux.conf.au, below is inotify against 2.6.12-rc3.

Mark the open inotify device as nonseekable, so lseek() and such do not
work.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/inotify.c |    2 ++
 1 files changed, 2 insertions(+)

diff -urN linux-2.6.12-rc3-inotify/fs/inotify.c linux/fs/inotify.c
--- linux-2.6.12-rc3-inotify/fs/inotify.c	2005-04-22 00:54:25.0 -0400
+++ linux/fs/inotify.c	2005-04-22 00:50:36.0 -0400
@@ -743,6 +743,8 @@
 
 	file->private_data = dev;
 
+	nonseekable_open(inode, file);
+
 	return 0;
 out_err:
 	free_uid(user);
Re: [patch 1/2] kstrdup: implementation
On Fri, 2005-04-22 at 05:51 +0200, Adrian Bunk wrote:

> This is a good example why development against Linus' tree is often
> pointless:

Seriously.

> A similar patch is already in -mm.

But...woohoo!

	Robert Love
[patch 2/2] kstrdup: replace a few
> Rusty and I's LCA kernel tutorial again brought up kstrdup(). Let's
> close this never ending saga and provide a standard kernel
> implementation.

Convert a few existing implementations, with a nice net loss of 50 lines.

Still way more to go.

Best,

	Robert Love


Convert a bunch of strdup() implementations and their callers to the new
kstrdup(). A few remain, for example see sound/core, and there are tons of
open coded strdup()'s around. Sigh. But this is a start.

By: Robert Love and Rusty Russell.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 arch/um/kernel/process_kern.c |    8 ++--
 drivers/md/dm-ioctl.c         |   17 +++--
 drivers/parport/probe.c       |   18 +-
 include/linux/netdevice.h     |    4
 net/core/neighbour.c          |    2 +-
 net/core/sysctl_net_core.c    |   15 ---
 net/ipv4/devinet.c            |    2 +-
 net/ipv6/addrconf.c           |    2 +-
 net/sunrpc/svcauth_unix.c     |   11 ++-
 9 files changed, 15 insertions(+), 64 deletions(-)

diff -urN linux-2.6.12-rc3/arch/um/kernel/process_kern.c linux/arch/um/kernel/process_kern.c
--- linux-2.6.12-rc3/arch/um/kernel/process_kern.c	2005-04-20 22:47:00.0 -0400
+++ linux/arch/um/kernel/process_kern.c	2005-04-21 21:51:38.0 -0400
@@ -8,6 +8,7 @@
 #include "linux/kernel.h"
 #include "linux/sched.h"
 #include "linux/interrupt.h"
+#include "linux/string.h"
 #include "linux/mm.h"
 #include "linux/slab.h"
 #include "linux/utsname.h"
@@ -356,12 +357,7 @@
 
 char *uml_strdup(char *string)
 {
-	char *new;
-
-	new = kmalloc(strlen(string) + 1, GFP_KERNEL);
-	if(new == NULL) return(NULL);
-	strcpy(new, string);
-	return(new);
+	return kstrdup(string, GFP_KERNEL);
 }
 
 void *get_init_task(void)
diff -urN linux-2.6.12-rc3/drivers/md/dm-ioctl.c linux/drivers/md/dm-ioctl.c
--- linux-2.6.12-rc3/drivers/md/dm-ioctl.c	2005-03-02 02:37:51.0 -0500
+++ linux/drivers/md/dm-ioctl.c	2005-04-21 21:46:15.0 -0400
@@ -119,17 +119,6 @@
 	return NULL;
 }
 
-/*-
- * Inserting, removing and renaming a device.
- *---*/
-static inline char *kstrdup(const char *str)
-{
-	char *r = kmalloc(strlen(str) + 1, GFP_KERNEL);
-	if (r)
-		strcpy(r, str);
-	return r;
-}
-
 static struct hash_cell *alloc_cell(const char *name, const char *uuid,
 				    struct mapped_device *md)
 {
@@ -139,7 +128,7 @@
 	if (!hc)
 		return NULL;
 
-	hc->name = kstrdup(name);
+	hc->name = kstrdup(name, GFP_KERNEL);
 	if (!hc->name) {
 		kfree(hc);
 		return NULL;
@@ -149,7 +138,7 @@
 		hc->uuid = NULL;
 
 	else {
-		hc->uuid = kstrdup(uuid);
+		hc->uuid = kstrdup(uuid, GFP_KERNEL);
 		if (!hc->uuid) {
 			kfree(hc->name);
 			kfree(hc);
@@ -273,7 +262,7 @@
 	/*
 	 * duplicate new.
 	 */
-	new_name = kstrdup(new);
+	new_name = kstrdup(new, GFP_KERNEL);
 	if (!new_name)
 		return -ENOMEM;
 
diff -urN linux-2.6.12-rc3/drivers/parport/probe.c linux/drivers/parport/probe.c
--- linux-2.6.12-rc3/drivers/parport/probe.c	2005-04-20 22:47:04.0 -0400
+++ linux/drivers/parport/probe.c	2005-04-21 21:45:39.0 -0400
@@ -48,14 +48,6 @@
 	printk("\n");
 }
 
-static char *strdup(char *str)
-{
-	int n = strlen(str)+1;
-	char *s = kmalloc(n, GFP_KERNEL);
-	if (!s) return NULL;
-	return strcpy(s, str);
-}
-
 static void parse_data(struct parport *port, int device, char *str)
 {
 	char *txt = kmalloc(strlen(str)+1, GFP_KERNEL);
@@ -88,16 +80,16 @@
 			if (!strcmp(p, "MFG") || !strcmp(p, "MANUFACTURER")) {
 				if (info->mfr)
 					kfree (info->mfr);
-				info->mfr = strdup(sep);
+				info->mfr = kstrdup(sep, GFP_KERNEL);
 			} else if (!strcmp(p, "MDL") || !strcmp(p, "MODEL")) {
 				if (info->model)
 					kfree (info->model);
-				info->model = strdup(sep);
+				info->model = kstrdup(sep, GFP_KERNEL);
 			} else if (!strcmp(p, "CLS") || !strcmp(p, "CLASS")) {
 				int i;
 				if (info->class_name)
[patch 1/2] kstrdup: implementation
Rusty and I's LCA kernel tutorial again brought up kstrdup(). Let's close
this never ending saga and provide a standard kernel implementation.

As an example of the savings from such a patch, there are a handful of
existing strdup() implementations and what looks like 100s of open coded
strdup() uses. Some of which are surely buggy or less optimal than our
version, and all of which bloat the kernel.

Andrew, patch is against 2.6.12-rc3.

Best,

	Robert Love


The world continually reimplements kstrdup(). Implement an optimal version
and export it to the world.

By: Robert Love and Rusty Russell.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 include/linux/string.h |    1 +
 lib/string.c           |   20
 2 files changed, 21 insertions(+)

diff -urN linux-2.6.12-rc3/include/linux/string.h linux/include/linux/string.h
--- linux-2.6.12-rc3/include/linux/string.h	2005-03-02 02:38:07.0 -0500
+++ linux/include/linux/string.h	2005-04-21 21:23:04.0 -0400
@@ -17,6 +17,7 @@
 extern char * strsep(char **,const char *);
 extern __kernel_size_t strspn(const char *,const char *);
 extern __kernel_size_t strcspn(const char *,const char *);
+extern char * kstrdup(const char *,unsigned int __nocast);
 
 /*
  * Include machine specific inline routines
diff -urN linux-2.6.12-rc3/lib/string.c linux/lib/string.c
--- linux-2.6.12-rc3/lib/string.c	2005-03-02 02:38:25.0 -0500
+++ linux/lib/string.c	2005-04-21 21:31:11.0 -0400
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #ifndef __HAVE_ARCH_STRNICMP
@@ -76,6 +77,25 @@
 EXPORT_SYMBOL(strcpy);
 #endif
 
+/*
+ * kstrdup - allocate space for and then copy an existing string
+ *
+ * @str: the string to duplicate
+ * @gfp: the GFP mask used to allocate the storage for the duplicated string
+ */
+char * kstrdup(const char *str, unsigned int __nocast flags)
+{
+	size_t len;
+	char *buf;
+
+	len = strlen(str) + 1;
+	buf = kmalloc(len, flags);
+	if (likely(buf))
+		memcpy(buf, str, len);
+	return buf;
+}
+EXPORT_SYMBOL(kstrdup);
+
 #ifndef __HAVE_ARCH_STRNCPY
 /**
  * strncpy - Copy a length-limited, %NUL-terminated string
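The technique in the kstrdup() patch above (measure once, allocate, then
memcpy() the string including its terminator) can be tried outside the
kernel. This is a user-space analogue only, not the kernel function:
malloc() stands in for kmalloc(), there is no GFP argument, and the
name dup_string() is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Duplicate a NUL-terminated string into freshly allocated storage.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
char *dup_string(const char *str)
{
	size_t len = strlen(str) + 1;	/* include the terminating NUL */
	char *buf = malloc(len);

	if (buf)
		memcpy(buf, str, len);	/* length is known, no second scan */
	return buf;
}
```

Compared with the open-coded strlen() plus strcpy() variants the
follow-up patch removes, the string is only scanned once, since the
copy uses the length already computed.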
[patch] oneshot for inotify.
The Samba guys want dnotify-like oneshot/multishot support. That is not
hard to add, so the following patch adds "oneshot" support to inotify.

If IN_ONESHOT is set on a watch, the watch is automatically removed after
the first event.

Default behavior remains "multishot."

Best,

	Robert Love


Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/inotify.c            |    2 ++
 include/linux/inotify.h |    7 ---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff -urN linux-2.6.12-rc3-inotify/fs/inotify.c linux/fs/inotify.c
--- linux-2.6.12-rc3-inotify/fs/inotify.c	2005-04-21 19:40:44.0 -0400
+++ linux/fs/inotify.c	2005-04-21 19:37:24.0 -0400
@@ -509,6 +509,8 @@
 			struct inotify_device *dev = watch->dev;
 			down(&dev->sem);
 			inotify_dev_queue_event(dev, watch, mask, cookie, name);
+			if (watch->mask & IN_ONESHOT)
+				remove_watch_no_event(watch, dev);
 			up(&dev->sem);
 		}
 	}
diff -urN linux-2.6.12-rc3-inotify/include/linux/inotify.h linux/include/linux/inotify.h
--- linux-2.6.12-rc3-inotify/include/linux/inotify.h	2005-04-21 19:40:44.0 -0400
+++ linux/include/linux/inotify.h	2005-04-21 19:37:25.0 -0400
@@ -49,11 +49,12 @@
 #define IN_DELETE_SELF		0x00001000	/* Self was deleted */
 #define IN_UNMOUNT		0x00002000	/* Backing fs was unmounted */
 #define IN_Q_OVERFLOW		0x00004000	/* Event queued overflowed */
-#define IN_IGNORED		0x00008000	/* File was ignored */
 
 /* special flags */
-#define IN_ALL_EVENTS		0xffffffff	/* All the events */
-#define IN_CLOSE		(IN_CLOSE_WRITE | IN_CLOSE_NOWRITE)
+#define IN_IGNORED		0x00008000	/* File was ignored */
+#define IN_ONESHOT		0x80000000	/* only send event once */
+#define IN_ALL_EVENTS		(0xffffffff & ~IN_ONESHOT) /* All the events */
+#define IN_CLOSE		(IN_CLOSE_WRITE | IN_CLOSE_NOWRITE) /* close */
 
 #define INOTIFY_IOCTL_MAGIC	'Q'
 #define INOTIFY_IOCTL_MAXNR	2
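To make the oneshot/multishot distinction concrete, here is a small
stand-alone simulation of the behavior the message describes: deliver
the event, then retire the watch if its mask carries the oneshot flag.
Nothing here is kernel code. The struct, the function name
deliver_event() and the SKETCH_ONESHOT bit are all hypothetical
stand-ins chosen for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* hypothetical stand-in for the IN_ONESHOT flag bit */
#define SKETCH_ONESHOT (1u << 31)

struct sketch_watch {
	uint32_t mask;	/* events of interest, plus optional flag bits */
	int active;	/* cleared once the watch has been removed */
};

/*
 * Deliver one event to a watch.  Returns 1 if the event was delivered,
 * 0 if the watch is gone or is not interested in this event.
 */
int deliver_event(struct sketch_watch *w, uint32_t event)
{
	if (!w->active || !(w->mask & event))
		return 0;

	printf("event 0x%x delivered\n", (unsigned int)event);

	/* oneshot: the first delivered event also removes the watch */
	if (w->mask & SKETCH_ONESHOT)
		w->active = 0;

	return 1;
}
```

A watch created without the flag keeps receiving events on every call,
which corresponds to the default "multishot" behavior.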
Re: [patch] inotify for 2.6.12-rc3.
On Thu, 2005-04-21 at 01:13 -0400, Robert Love wrote:

> Live from linux.conf.au, below is inotify against 2.6.12-rc3.

G'day mates!

By popular request!

Cheers,

	Robert Love


Send an event on xattr change. Just use the existing metadata change
event, IN_ATTRIB, instead of adding a new event.

While here, do not wrap dnotify_flush(), no need.

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 fs/open.c                |    2 +-
 fs/xattr.c               |    5 -
 include/linux/fsnotify.h |   18 ++
 3 files changed, 15 insertions(+), 10 deletions(-)

diff -urN linux-2.6.12-rc3-inotify-0.22-2/fs/open.c linux/fs/open.c
--- linux-2.6.12-rc3-inotify-0.22-2/fs/open.c	2005-04-21 01:13:17.0 -0400
+++ linux/fs/open.c	2005-04-21 01:17:27.0 -0400
@@ -1000,7 +1000,7 @@
 		retval = err;
 	}
 
-	fsnotify_flush(filp, id);
+	dnotify_flush(filp, id);
 	locks_remove_posix(filp, id);
 	fput(filp);
 	return retval;
diff -urN linux-2.6.12-rc3-inotify-0.22-2/fs/xattr.c linux/fs/xattr.c
--- linux-2.6.12-rc3-inotify-0.22-2/fs/xattr.c	2005-04-21 01:13:17.0 -0400
+++ linux/fs/xattr.c	2005-04-21 01:24:24.0 -0400
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include
 
 /*
@@ -57,8 +58,10 @@
 		if (error)
 			goto out;
 		error = d->d_inode->i_op->setxattr(d, kname, kvalue, size, flags);
-		if (!error)
+		if (!error) {
+			fsnotify_xattr(d);
 			security_inode_post_setxattr(d, kname, kvalue, size, flags);
+		}
 out:
 		up(&d->d_inode->i_sem);
 	}
diff -urN linux-2.6.12-rc3-inotify-0.22-2/include/linux/fsnotify.h linux/include/linux/fsnotify.h
--- linux-2.6.12-rc3-inotify-0.22-2/include/linux/fsnotify.h	2005-04-21 01:13:20.0 -0400
+++ linux/include/linux/fsnotify.h	2005-04-21 01:23:06.0 -0400
@@ -131,6 +131,16 @@
 }
 
 /*
+ * fsnotify_xattr - extended attributes were changed
+ */
+static inline void fsnotify_xattr(struct dentry *dentry)
+{
+	inotify_dentry_parent_queue_event(dentry, IN_ATTRIB, 0,
+					  dentry->d_name.name);
+	inotify_inode_queue_event(dentry->d_inode, IN_ATTRIB, 0, NULL);
+}
+
+/*
  * fsnotify_change - notify_change event. file was modified and/or metadata
  * was changed.
  */
@@ -177,14 +187,6 @@
 	}
 }
 
-/*
- * fsnotify_flush - flush time!
- */
-static inline void fsnotify_flush(struct file *filp, fl_owner_t id)
-{
-	dnotify_flush(filp, id);
-}
-
 #ifdef CONFIG_INOTIFY	/* inotify helpers */
 
 /*
[patch] inotify for 2.6.12-rc3.
Live from linux.conf.au, below is inotify against 2.6.12-rc3.

Cheers,

	Robert Love


inotify!

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
      that you intend to watch.  This quickly results in too many
      open files and pins removable media, preventing unmount.
    * dnotify is directory-based.  You only learn about changes to
      directories.  Sure, a change to a file in a directory affects
      the directory, but you are then forced to keep a cache of
      stat structures.
    * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

    * inotify's interface is a device node, not SIGIO.  You open a
      single fd to the device node, which is select()-able.
    * inotify has an event that says "the filesystem that the item
      you were watching is on was unmounted."
    * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure)
and Gamin (a FAM replacement).

Signed-off-by: Robert Love <[EMAIL PROTECTED]>

 Documentation/filesystems/inotify.txt |   81 ++
 fs/Kconfig                            |   13
 fs/Makefile                           |    1
 fs/attr.c                             |   33 -
 fs/compat.c                           |   12
 fs/file_table.c                       |    3
 fs/inode.c                            |    6
 fs/inotify.c                          |  961 ++
 fs/namei.c                            |   30 -
 fs/open.c                             |    6
 fs/read_write.c                       |   15
 include/linux/fs.h                    |    6
 include/linux/fsnotify.h              |  228
 include/linux/inotify.h               |  111 +++
 include/linux/sched.h                 |    4
 kernel/user.c                         |    4
 16 files changed, 1458 insertions(+), 56 deletions(-)

diff -urN linux-2.6.12-rc3/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt
--- linux-2.6.12-rc3/Documentation/filesystems/inotify.txt	1969-12-31 19:00:00.0 -0500
+++ linux/Documentation/filesystems/inotify.txt	2005-04-21 00:56:28.0 -0400
@@ -0,0 +1,81 @@
+				    inotify
+	     a powerful yet simple file change notification system
+
+
+
+Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]>
+
+(i) User Interface
+
+Inotify is controlled by a device node, /dev/inotify.  If you do not use udev,
+this device may need to be created manually.  First step, open it
+
+	int dev_fd = open ("/dev/inotify", O_RDONLY);
+
+Change events are managed by "watches".  A watch is an (object,mask) pair where
+the object is a file or directory and the mask is a bitmask of one or more
+inotify events that the application wishes to receive.  See
+for valid events.  A watch is referenced by a watch descriptor, or wd.
+
+Watches are added via a file descriptor.
+
+Watches on a directory will return events on any files inside of the directory.
+
+Adding a watch is simple,
+
+	/* 'wd' represents the watch on fd with mask */
+	struct inotify_request req = { fd, mask };
+	int wd = ioctl (dev_fd, INOTIFY_WATCH, &req);
+
+You can add a large number of files via something like
+
+	for each file to watch {
+		struct inotify_request req;
+		int file_fd;
+
+		file_fd = open (file, O_RDONLY);
+		if (file_fd < 0) {
+			perror ("open");
+			break;
+		}
+
+		req.fd = file_fd;
+		req.mask = mask;
+
+		wd = ioctl (dev_fd, INOTIFY_WATCH, &req);
+
+		close (file_fd);
+	}
+
+You can update an existing watch in the same manner, by passing in a new mask.
+
+An existing watch is removed via the INOTIFY_IGNORE ioctl, for example
+
+	ioctl (dev_fd, INOTIFY_IGNORE, wd);
+
+Events are provided in the form of an inotify_event structure that is read(2)
+from /dev/inotify.  The filename is of dynamic length and follows the struct.
+It is of size len.  The filename is padded with null bytes to ensure proper
+alignment.  This padding is reflected in len.
+
+You can slurp multiple events by passing a large buffer, for example
+
+	size_t len = read (fd, buf, BUF_LEN);
+
+Will return as many events as are available and fit in BUF_LEN.
+
+/dev/inotify is also select() and poll() able.
+
+You can find the size of the current event queue via the FIONREAD ioctl.
+
+All watches are destroyed and cleaned up on close.
+
+
+(ii) Internal Kernel Implementation
+
+Each open inotify device
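
The documentation text above says that events arrive from read(2) as a fixed structure followed by a padded filename of size len. A minimal sketch of walking such a buffer might look like the following. It is not code from this patch: it assumes the struct inotify_event declared in <sys/inotify.h> (the later system-call interface), which is assumed to have the same "header plus len bytes of name" layout, and for_each_event() and count_named() are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <sys/inotify.h>

/* Callback type (hypothetical): 'name' is NULL when the event carries no filename. */
typedef void (*event_cb)(const struct inotify_event *hdr, const char *name,
			 void *data);

/*
 * Walk 'nbytes' of data previously filled in by read(2) and call 'cb'
 * once per complete event.  Returns the number of complete events found.
 * A trailing partial event is ignored.
 */
static size_t for_each_event(const char *buf, size_t nbytes, event_cb cb,
			     void *data)
{
	size_t off = 0, count = 0;

	while (nbytes - off >= sizeof(struct inotify_event)) {
		struct inotify_event hdr;
		const char *name;

		/* Copy the header out: 'buf' is not assumed to be aligned. */
		memcpy(&hdr, buf + off, sizeof(hdr));

		/* 'len' includes the null padding after the name. */
		if (hdr.len > nbytes - off - sizeof(hdr))
			break;

		name = hdr.len ? buf + off + sizeof(hdr) : NULL;
		if (cb)
			cb(&hdr, name, data);

		off += sizeof(hdr) + hdr.len;
		count++;
	}

	return count;
}

/* Example callback: counts the events that carry a filename. */
static void count_named(const struct inotify_event *hdr, const char *name,
			void *data)
{
	(void)hdr;
	if (name != NULL)
		++*(size_t *)data;
}
```

A caller would read into a buffer of BUF_LEN bytes and pass the buffer and the byte count returned by read(2) to for_each_event(). Stepping by the header size plus len is what makes the padding mentioned in the documentation matter: the next header starts right after the padded name.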
Re: [patch] inotify for 2.6.11
On Thu, 2005-04-07 at 18:37 -0700, Rusty Lynch wrote:

> Looking into this a little more I realized that the lack of /proc
> notifications (for processes coming and going) is a common problem anytime
> a file is modified without going through the VFS.  Other examples are
> remote file changes on a mounted NFS partition, remote file changes on a
> mounted cluster filesystem (like ocfs or gfs), and just about any virtual
> file system where the kernel is adding/deleting/modifying files from below
> the VFS.

Indeed it is.  But none of those are anything that we care about (except
maybe /proc).

The problem of changes on remote filesystems is solved by FAM.

	Robert Love

Re: 2.6.12-rc2-mm1: inotify and directory removal
On Wed, 2005-04-06 at 14:21 +0100, Sean Neakums wrote:

> Using your glib sample thingy from
> http://www.kernel.org/pub/linux/kernel/people/rml/inotify/glib/

Thanks.  It was a bug in the glib utility, not inotify itself.

I fixed it in inotify-glib-0.0.2, which should appear at the above URL
as soon as the mirrors sync.

Thanks again!

	Robert Love

Re: [patch] inotify for 2.6.11
On Tue, 2005-04-05 at 17:53 -0700, Rusty Lynch wrote:

> From just a casual look, it seems like this could be used to monitor the
> comings and goings of processes by monitoring /proc.  Unfortunately
> inotify doesn't seem to be getting all the events on the proc filesystem
> like it does on a real filesystem because I am not seeing new events every
> time a new process is added or removed.  The same is true if you attempt
> to monitor something like /sys/bus/usb/devices/ and add/remove a usb
> device.
>
> On a side note, it's still rather interesting to monitor /proc and watch
> all the traffic.

Yah, I agree.

I looked into doing this a while back, when I noticed inotify did not
generate events for /proc.  We just need to add calls to the fsnotify
hooks to the proc_create() and proc_delete_foo() stuff.

The interfaces are capable, i.e. we can add support at any time, even
after inotify is merged.

I'd be for it.

	Robert Love

Re: [patch] inotify for 2.6.11
On Tue, 2005-04-05 at 19:20 +0200, Prakash Punnoor wrote:

> BTW, what else could I use to make use of inotify? I know fam, which afaik
> only uses dnotify.

Here is a little sample glib application that shows the ease-yet-power
of inotify.

	http://www.kernel.org/pub/linux/kernel/people/rml/inotify/glib/

It integrates inotify watches into the glib mainloop via GIOChannel.
Everything is abstracted behind simple interfaces, so this might prove a
nice start for curious inotify developers.

	Robert Love

Re: [patch] inotify for 2.6.11
On Tue, 2005-04-05 at 19:20 +0200, Prakash Punnoor wrote:

> BTW, what else could I use to make use of inotify? I know fam, which afaik
> only uses dnotify.

Beagle, a desktop search infrastructure.  Check out

	http://www.gnome.org/projects/beagle

Some other little projects, too.  If anyone else is using it, please let
us know!

The main problem is that dnotify sucks so bad now that no one uses it.
So we don't have any existing applications to convert, besides FAM, and
we did that (via Gamin).

I've been meaning to write some sample GNOME code to show how easy it is
to use Inotify, even directly.  I'll get on that.

> > Anyhow, this should fix it.  Confirm?
>
> So far no problems. Interesting enough the previous patch worked w/o problem
> the last hours...

It might have been caused by a bug in Gamin, so it took a while to
expose.  It should only happen when the user asks to remove a watch on a
wd that does not exist (I just forgot to check that error case in a bug
fix I added).

Keep pounding.  It ought to be fixed, but please let me know if not!

	Robert Love

[patch] updated inotify 0.22 for 2.6-mm
On Tue, 2005-04-05 at 12:56 -0400, Robert Love wrote:

Mr Morton,

> Below is an updated inotify 0.22 patch, with various small clean ups and
> a fix for the oops reported by Prakash Punnoor.  The oops was unrelated
> to the semaphore change, which seems to have been the right thing.

Below is an updated replacement patch, against 2.6.12-rc2-mm1.

Other than misc. cleanup, the only change is the trivial fix for
Prakash's reported oops on watch ignore, which I just introduced.

The semaphore conversion seems right.  It stays.

Best,

	Robert Love


inotify!

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
      that you intend to watch.  This quickly results in too many
      open files and pins removable media, preventing unmount.
    * dnotify is directory-based.  You only learn about changes to
      directories.  Sure, a change to a file in a directory affects
      the directory, but you are then forced to keep a cache of
      stat structures.
    * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

    * inotify's interface is a device node, not SIGIO.  You open a
      single fd to the device node, which is select()-able.
    * inotify has an event that says "the filesystem that the item
      you were watching is on was unmounted."
    * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure)
and Gamin (a FAM replacement).

Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|6 fs/inotify.c | 961 ++ fs/namei.c| 30 - fs/open.c |6 fs/read_write.c | 15 include/linux/fs.h|6 include/linux/fsnotify.h | 228 include/linux/inotify.h | 111 +++ include/linux/sched.h |4 kernel/user.c |4 16 files changed, 1458 insertions(+), 56 deletions(-) diff -urN linux-2.6.12-rc2-mm1/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.12-rc2-mm1/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-04-05 12:41:51.0 -0400 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (fd); + } + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via the INOTIFY_IGNORE ioctl, for example + + ioctl (dev_fd, INOTIFY_IGNORE, wd); + +Events are provided in the form of an inotify_event structure that is read(2) +from /dev/inotify. The filename is of dynamic length and follows the struct. +It is of size len. The filename is padded with null bytes to ensure proper +alignme
[patch] updated inotify 0.22
Below is an updated inotify 0.22 patch, with various small clean ups and
a fix for the oops reported by Prakash Punnoor.  The oops was unrelated
to the semaphore change, which seems to have been the right thing.

Patch is against 2.6.12-rc2.

	Robert Love


inotify!

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
      that you intend to watch.  This quickly results in too many
      open files and pins removable media, preventing unmount.
    * dnotify is directory-based.  You only learn about changes to
      directories.  Sure, a change to a file in a directory affects
      the directory, but you are then forced to keep a cache of
      stat structures.
    * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

    * inotify's interface is a device node, not SIGIO.  You open a
      single fd to the device node, which is select()-able.
    * inotify has an event that says "the filesystem that the item
      you were watching is on was unmounted."
    * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure)
and Gamin (a FAM replacement).

Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|6 fs/inotify.c | 961 ++ fs/namei.c| 30 - fs/open.c |6 fs/read_write.c | 15 include/linux/fs.h|6 include/linux/fsnotify.h | 228 include/linux/inotify.h | 111 +++ include/linux/sched.h |4 kernel/user.c |4 16 files changed, 1458 insertions(+), 56 deletions(-) diff -urN linux-2.6.12-rc2/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.12-rc2/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-04-05 12:40:41.0 -0400 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (fd); + } + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via the INOTIFY_IGNORE ioctl, for example + + ioctl (dev_fd, INOTIFY_IGNORE, wd); + +Events are provided in the form of an inotify_event structure that is read(2) +from /dev/inotify. The filename is of dynamic length and follows the struct. +It is of size len. The filename is padded with null bytes to ensure proper +alignment. This padding is reflected in len. + +You can slurp multiple events by passing a large buffer, for example + + size_t len = read (fd, buf, BUF_LEN); + +Will return as many events as are available and fit in BUF_LEN. + +/dev/inotify is also select() and poll() able. + +You can find the size of the current
Re: [patch] inotify for 2.6.11
On Tue, 2005-04-05 at 09:58 +0200, Prakash Punnoor wrote:

> I am having a little trouble with inotify 0.22. Previous version worked w/o
> trouble (even with nvidia and nvsound loaded) with 2.6.12-rc1-kb2 and gamin
>
> Now I use 2.6.12-rc2 with inotify 0.22 and got this after a few minutes of
> uptime (compiling some stuff):

Ah, thanks.

That was not even related to the semaphore rewrite, but a small bug fix
I slipped in.  But of course.

Gamin is an interesting test case for us because it does so many
ignores.

Anyhow, this should fix it.  Confirm?

Thanks,

	Robert Love


inotify!

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
      that you intend to watch.  This quickly results in too many
      open files and pins removable media, preventing unmount.
    * dnotify is directory-based.  You only learn about changes to
      directories.  Sure, a change to a file in a directory affects
      the directory, but you are then forced to keep a cache of
      stat structures.
    * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

    * inotify's interface is a device node, not SIGIO.  You open a
      single fd to the device node, which is select()-able.
    * inotify has an event that says "the filesystem that the item
      you were watching is on was unmounted."
    * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure)
and Gamin (a FAM replacement).

Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|6 fs/inotify.c | 961 ++ fs/namei.c| 30 - fs/open.c |6 fs/read_write.c | 15 include/linux/fs.h|6 include/linux/fsnotify.h | 228 include/linux/inotify.h | 111 +++ include/linux/sched.h |4 kernel/user.c |4 16 files changed, 1458 insertions(+), 56 deletions(-) diff -urN linux-2.6.12-rc1/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.12-rc1/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-04-04 16:26:15.0 -0400 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (fd); + } + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via the INOTIFY_IGNORE ioctl, for example + + ioctl (dev_fd, INOTIFY_IGNORE, wd); + +Events are provided in the form of an inotify_event structure that is read(2) +from /dev/inotify. The filename is of dynamic length and follows the struct. +It is of size len. The filename is padded with null byte
Re: [patch] inotify 0.22
On Mon, 2005-04-04 at 16:50 -0400, Dale Blount wrote:

Hi, Dale.

> Will inotify watch directories recursively?  A quick browse through the
> source doesn't look like it, but I very well could be wrong.  Last I
> checked, dnotify did not either.  I am looking for a way to synchronize
> files in as-real-as-possible-time when they are modified.

No, inotify does not support watching directories recursively.  I would
love to add it, but it would be a mess to do inside of the kernel.

Making it easy and efficient to watch a full tree, however, was a goal
of inotify.  Beagle, a personal indexing infrastructure, watches the
user's entire home directory.  You could never do this in dnotify
because you would run out of file descriptors and pin every file.

In inotify, it is not hard to write a simple recursive loop to add a
watch to each directory starting at a given path.  It can even be done
in an atomic fashion.  See

	http://mail.gnome.org/archives/dashboard-hackers/2004-October/msg00022.html

wherein I publish such an algorithm.

Hope this helps,

	Robert Love

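
For readers who want to see the shape of such a recursive loop, here is a minimal sketch. It is not the algorithm from the linked post, which is not reproduced here. It assumes the inotify_init()/inotify_add_watch() system-call interface from <sys/inotify.h>, which later replaced the /dev/inotify ioctl interface discussed in this thread, and watch_tree() is a hypothetical helper name. Each directory is watched before it is scanned, on the assumption that a subdirectory created during the scan then shows up as an IN_CREATE event on its parent instead of being missed.

```c
#define _XOPEN_SOURCE 700	/* for lstat() and PATH_MAX */

#include <dirent.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: add a watch on 'path' and on every directory
 * below it.  Returns the number of watches added, or -1 on the first
 * error.
 */
static int watch_tree(int ifd, const char *path, uint32_t mask)
{
	char child[PATH_MAX];
	struct dirent *de;
	struct stat st;
	int added = 1, n;
	DIR *dir;

	/*
	 * Watch first, scan second.  IN_CREATE is forced on so that the
	 * caller hears about directories that appear after the scan has
	 * passed them; IN_ONLYDIR rejects anything that is not a directory.
	 */
	if (inotify_add_watch(ifd, path, mask | IN_CREATE | IN_ONLYDIR) < 0)
		return -1;

	dir = opendir(path);
	if (dir == NULL)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;

		n = snprintf(child, sizeof(child), "%s/%s", path, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(child))
			continue;	/* path too long: skip this entry */

		/* lstat() so that symlinks to directories are not followed */
		if (lstat(child, &st) < 0 || !S_ISDIR(st.st_mode))
			continue;

		n = watch_tree(ifd, child, mask);
		if (n < 0) {
			closedir(dir);
			return -1;
		}
		added += n;
	}

	closedir(dir);
	return added;
}
```

A caller would create the descriptor once with inotify_init(), call watch_tree() on the root of the tree, and then call it again on any directory reported by an IN_CREATE event. One watch per directory on a single descriptor is the property that makes watching a whole home directory practical here, in contrast to one open fd per directory with dnotify.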
[patch] inotify 0.22 for 2.6.12-rc1-mm4
On Mon, 2005-04-04 at 16:02 -0400, Robert Love wrote:

Greetings, Mr Morton.

> Below, find inotify 0.22, against 2.6.12-rc1.
>
> This release introduces a conversion in our primary locking from
> spinlocks to semaphores.  Semaphores are a more natural fit for our
> code, which synchronizes with user-space, thus we clean up a bit of code
> with a net reduction of 63 lines.  Also, I was able to remove the
> GFP_ATOMIC allocation.
>
> I did this as a bit of an experiment, not to fix any specific problem,
> and I now think it is the right way to go.
>
> This release also fixes a small bug in the coalescing code, which could
> have mistakenly dropped a move event.  We now verify that the cookies
> match before coalescing.

And a patch for 2.6.12-rc1-mm4.

	Robert Love


inotify!

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
      that you intend to watch.  This quickly results in too many
      open files and pins removable media, preventing unmount.
    * dnotify is directory-based.  You only learn about changes to
      directories.  Sure, a change to a file in a directory affects
      the directory, but you are then forced to keep a cache of
      stat structures.
    * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

    * inotify's interface is a device node, not SIGIO.  You open a
      single fd to the device node, which is select()-able.
    * inotify has an event that says "the filesystem that the item
      you were watching is on was unmounted."
    * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure)
and Gamin (a FAM replacement).

Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|6 fs/inotify.c | 979 ++ fs/namei.c| 30 - fs/open.c |6 fs/read_write.c | 15 include/linux/fs.h|6 include/linux/fsnotify.h | 228 +++ include/linux/inotify.h | 113 +++ include/linux/sched.h |4 kernel/user.c |4 16 files changed, 1478 insertions(+), 56 deletions(-) diff -urN linux-2.6.12-rc1-mm4/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.12-rc1-mm4/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-04-04 13:32:02.0 -0400 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (fd); + } + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via the INOTIFY_IGNORE ioctl, for example + + ioctl (dev_fd, INOTIFY_IGNORE,
[patch] inotify 0.22
Below, find inotify 0.22, against 2.6.12-rc1.

This release introduces a conversion in our primary locking from
spinlocks to semaphores.  Semaphores are a more natural fit for our
code, which synchronizes with user-space, thus we clean up a bit of code
with a net reduction of 63 lines.  Also, I was able to remove the
GFP_ATOMIC allocation.

I did this as a bit of an experiment, not to fix any specific problem,
and I now think it is the right way to go.

This release also fixes a small bug in the coalescing code, which could
have mistakenly dropped a move event.  We now verify that the cookies
match before coalescing.

Comments are welcome.

	Robert Love


inotify!

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
      that you intend to watch.  This quickly results in too many
      open files and pins removable media, preventing unmount.
    * dnotify is directory-based.  You only learn about changes to
      directories.  Sure, a change to a file in a directory affects
      the directory, but you are then forced to keep a cache of
      stat structures.
    * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

    * inotify's interface is a device node, not SIGIO.  You open a
      single fd to the device node, which is select()-able.
    * inotify has an event that says "the filesystem that the item
      you were watching is on was unmounted."
    * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure)
and Gamin (a FAM replacement).

Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|6 fs/inotify.c | 979 ++ fs/namei.c| 30 - fs/open.c |6 fs/read_write.c | 15 include/linux/fs.h|6 include/linux/fsnotify.h | 228 +++ include/linux/inotify.h | 113 +++ include/linux/sched.h |4 kernel/user.c |4 16 files changed, 1478 insertions(+), 56 deletions(-) diff -urN linux-2.6.11/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.11/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-04-04 13:34:03.0 -0400 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (fd); + } + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via the INOTIFY_IGNORE ioctl, for example + + ioctl (dev_fd, INOTIFY_IGNORE, wd); + +Events are provided in the form of an inotify_event structure that is read(2) +from /dev/inotify. The filename is of dynamic length and follows the struct. +It
[patch] Re: inotify issue: iput called atomically
On Sun, 2005-03-27 at 15:52 +0200, Christophe Saout wrote:

Hi, Christophe.

> it looks like you shouldn't call iput with spinlocks held. iput might
> call down into the filesystem to delete the inode and this can sleep.

We've been working on this for a couple days now.  I finally finished it
up today.

Below is an updated inotify, against 2.6.12-rc1, that no longer calls
iput() while atomic.

	Robert Love


inotify!

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

    * dnotify requires the opening of one fd per each directory
      that you intend to watch.  This quickly results in too many
      open files and pins removable media, preventing unmount.
    * dnotify is directory-based.  You only learn about changes to
      directories.  Sure, a change to a file in a directory affects
      the directory, but you are then forced to keep a cache of
      stat structures.
    * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

    * inotify's interface is a device node, not SIGIO.  You open a
      single fd to the device node, which is select()-able.
    * inotify has an event that says "the filesystem that the item
      you were watching is on was unmounted."
    * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure)
and Gamin (a FAM replacement).

See Documentation/filesystems/inotify.txt for a description of the
user-space API.

Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|6 fs/inotify.c | 1043 ++ fs/namei.c| 30 fs/open.c |6 fs/read_write.c | 15 fs/super.c|1 include/linux/fs.h|7 include/linux/fsnotify.h | 228 +++ include/linux/inotify.h | 113 +++ include/linux/sched.h |4 kernel/user.c |4 17 files changed, 1544 insertions(+), 56 deletions(-) diff -urN linux-2.6.12-rc1/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.12-rc1/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-03-29 11:36:08.406948954 -0500 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (file_fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (file_fd); + } + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via the INOTIFY_IGNORE ioctl, for example + + ioctl (dev_fd, INOTIFY_IGNORE, wd); + +Events are provided in the form of an inotify_event structure that is read(2) +from /dev/inotify. The filename is of dynamic length and follows the struct. +It is of size len. The filename is padded with null bytes to ensure proper +alignment. This p
Re: linux: detect application crash
On Thu, 2005-03-17 at 15:27 -0500, Allison wrote: > Several times when I worked with Windows, I have had a scenario when I > am editing a file and saved some time ago and then the application > crashes and I lose all recent data. > > Can the operating system detect all application crashes ? If so, why > can't the OS save the user data to disk before the application quits ? > > How does this work in Linux. I was curious if such a functionality > already exists in Linux. If not, what are the issues involved in > implementing this functionality. It is hard to just wholesale "save the user's data" because the application is crashing, things are inconsistent, something is broken, etc. But it is possible to dump all memory (a core dump). Linux does this now. It is also possible to catch a segfault and handle it. Various GUI libraries do this. For example, GNOME handles segfaults, presenting the user with various options (send bug report, restart application, etc). The best bet, from an application developer's standpoint, is to just not crash. Second best, save early and save often. Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
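For the curious, here is a rough sketch of what "catch a segfault and handle it" can look like in user-space C. It is not how GNOME or any particular library does it; the names run_guarded, simulated_crash and well_behaved are made up for illustration, and the "crash" is simulated with raise() rather than a real bad pointer. Treat it as an assumption-laden demo: after a genuine SIGSEGV the process state may be inconsistent, so a real handler should do as little as possible (offer a bug report, restart) rather than carry on.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Where the handler jumps back to after a fault. */
static sigjmp_buf recovery_point;

static void on_segv(int signo)
{
    (void)signo;
    /* Unwind to run_guarded(); sigsetjmp() will appear to return 1. */
    siglongjmp(recovery_point, 1);
}

/* Illustrative stand-ins for application code. */
void well_behaved(void) { }
void simulated_crash(void) { raise(SIGSEGV); }

/*
 * Run fn() with a temporary SIGSEGV handler installed.
 * Returns 0 if fn() returned normally, 1 if SIGSEGV was caught,
 * and -1 if the handler could not be installed.
 */
int run_guarded(void (*fn)(void))
{
    struct sigaction sa, old;
    int crashed;

    assert(fn != NULL);

    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGSEGV, &sa, &old) < 0)
        return -1;

    /* Second argument 1: save the signal mask so it is restored on jump. */
    if (sigsetjmp(recovery_point, 1) == 0) {
        fn();
        crashed = 0;
    } else {
        crashed = 1;    /* this is where "send bug report" would go */
    }

    sigaction(SIGSEGV, &old, NULL);    /* put the previous handler back */
    return crashed;
}
```

Calling run_guarded(simulated_crash) takes the recovery branch instead of killing the process; run_guarded(well_behaved) just returns 0.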
[patch] updateder inotify for 2.6.11-mm4
On Wed, 2005-03-16 at 16:38 -0500, Robert Love wrote: > Andrew, here is an updated inotify for 2.6.11-mm4 (replacing the current > two patches), implementing your API suggestion. It is no different from > the patch I sent you in private, , except it is the full patch and not > an interdiff, and now it is tested. > > I wrote up the API description as you asked. See the aforementioned > file. > > Please, apply. Thanks! Updated patch. Three changes: - use unlocked_ioctl (Juergen Kreileder) - compat_ioctl support (Juergen Kreileder) - remove trailing whitespace (Andrew Morton's strip script) Otherwise the same. Against 2.6.11-mm4. Robert Love inotify! inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement). See Documentation/filesystems/inotify.txt for a description of the user-space API. 
Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|4 fs/inotify.c | 1009 ++ fs/namei.c| 30 - fs/open.c |6 fs/read_write.c | 15 fs/super.c|2 include/linux/fs.h|8 include/linux/fsnotify.h | 236 +++ include/linux/inotify.h | 113 +++ include/linux/sched.h |4 kernel/user.c |4 17 files changed, 1518 insertions(+), 56 deletions(-) diff -urN linux-2.6.11-mm4/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.11-mm4/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-03-17 14:29:46.326301298 -0500 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (file_fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (file_fd); + } + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via the INOTIFY_IGNORE ioctl, for example + + ioctl (
[patch] updated inotify for 2.6.11-mm4
On Wed, 2005-03-16 at 16:34 -0500, Robert Love wrote: > Below is an updated inotify patch for 2.6.11. The only change over the > previous release is incorporating an API suggestion of Mr. Andrew > Morton's: We now add watches via the file's file descriptor, not its > pathname. > > I also wrote an API description, documenting the user interface. It is > located in Documentation/filesystems/inotify.txt. Andrew, here is an updated inotify for 2.6.11-mm4 (replacing the current two patches), implementing your API suggestion. It is no different from the patch I sent you in private, , except it is the full patch and not an interdiff, and now it is tested. I wrote up the API description as you asked. See the aforementioned file. Please, apply. Thanks! Robert Love inotify! inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement). See Documentation/filesystems/inotify.txt for a description of the user-space API. 
Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|4 fs/inotify.c | 1008 ++ fs/namei.c| 30 - fs/open.c |6 fs/read_write.c | 15 fs/super.c|2 include/linux/fs.h|8 include/linux/fsnotify.h | 236 +++ include/linux/inotify.h | 113 +++ include/linux/sched.h |4 kernel/user.c |4 17 files changed, 1517 insertions(+), 56 deletions(-) diff -urN linux-2.6.11-mm4/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.11-mm4/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-03-16 16:12:12.001110514 -0500 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (file_fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (file_fd); + } + +You can update an existing watch in the same manner, by passing in a new mask.
[patch] updated inotify for 2.6.11
Below is an updated inotify patch for 2.6.11. The only change over the previous release is incorporating an API suggestion of Mr. Andrew Morton's: We now add watches via the file's file descriptor, not its pathname. I also wrote an API description, documenting the user interface. It is located in Documentation/filesystems/inotify.txt. Enjoy. Robert Love inotify! inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement). See Documentation/filesystems/inotify.txt for a description of the user-space API. 
Signed-off-by: Robert Love <[EMAIL PROTECTED]> Documentation/filesystems/inotify.txt | 81 ++ fs/Kconfig| 13 fs/Makefile |1 fs/attr.c | 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c|4 fs/inotify.c | 1008 ++ fs/namei.c| 30 - fs/open.c |6 fs/read_write.c | 15 fs/super.c|2 include/linux/fs.h|8 include/linux/fsnotify.h | 236 +++ include/linux/inotify.h | 113 +++ include/linux/sched.h |4 kernel/user.c |4 17 files changed, 1517 insertions(+), 56 deletions(-) diff -urN linux-2.6.11/Documentation/filesystems/inotify.txt linux/Documentation/filesystems/inotify.txt --- linux-2.6.11/Documentation/filesystems/inotify.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/filesystems/inotify.txt 2005-03-16 16:05:06.152873318 -0500 @@ -0,0 +1,81 @@ + inotify +a powerful yet simple file change notification system + + + +Document started 15 Mar 2005 by Robert Love <[EMAIL PROTECTED]> + +(i) User Interface + +Inotify is controlled by a device node, /dev/inotify. If you do not use udev, +this device may need to be created manually. First step, open it + + int dev_fd = open ("/dev/inotify", O_RDONLY); + +Change events are managed by "watches". A watch is an (object,mask) pair where +the object is a file or directory and the mask is a bitmask of one or more +inotify events that the application wishes to receive. See +for valid events. A watch is referenced by a watch descriptor, or wd. + +Watches are added via a file descriptor. + +Watches on a directory will return events on any files inside of the directory. 
+ +Adding a watch is simple, + + /* 'wd' represents the watch on fd with mask */ + struct inotify_request req = { fd, mask }; + int wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + +You can add a large number of files via something like + + for each file to watch { + struct inotify_request req; + int file_fd; + + file_fd = open (file, O_RDONLY); + if (file_fd < 0) { + perror ("open"); + break; + } + + req.fd = file_fd; + req.mask = mask; + + wd = ioctl (dev_fd, INOTIFY_WATCH, &req); + + close (file_fd); + } + +You can update an existing watch in the same manner, by passing in a new mask. + +An existing watch is removed via the INOTIFY_IGNORE ioctl, for example + + ioctl (dev_fd, INOTIFY_IGNORE, wd); + +Events are provided in the form of an inotify_event structure that is read(2) +from /dev/inotify. The filename is of dynamic length and follows the struct. +It is of size len. The filename is padded with null bytes to ensure proper +alignment. This padding is reflected in len. + +You can slurp multiple events by passing
Re: sched_setscheduler and pids/threads
On Thu, 2005-03-10 at 15:12 +1100, Dave Airlie wrote: > In 2.6 all my threads appear as a single PID, if I use chrt -p > will it set the scheduling priority for my main thread or for all > threads in the application? For just the main thread (or the thread of whatever PID you give). You need to set the priority of each thread individually, by that thread's own PID. The "everything appears as a single PID" is just an elaborate parlor trick. Wool pulled over your eyes. > Can I use the thread IDs from /proc//task/ to chrt the other > threads in my app to different priorities? You can use the PIDs in /proc//task/, yes. Or you can just set the priority of the main thread before it starts other threads, or use chrt to launch the program, or use chrt on the PID of a shell script that starts the application: Scheduler properties are inherited. Best, Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
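If you would rather do the per-thread walk from C than from a shell loop around chrt, something like the following should work. This is only a sketch: set_policy_all_threads is a name invented here, it assumes /proc is mounted, and it can miss threads that are created while the directory is being read. Real-time policies such as SCHED_FIFO also need the appropriate privilege.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Apply the given policy and priority to every thread of 'pid' by
 * walking the per-thread entries under /proc/<pid>/task.
 * Returns the number of threads updated, or -1 on the first failure.
 */
int set_policy_all_threads(pid_t pid, int policy, int priority)
{
    char path[64];
    struct sched_param sp = { .sched_priority = priority };
    struct dirent *de;
    DIR *dir;
    int updated = 0;

    assert(pid > 0);

    snprintf(path, sizeof(path), "/proc/%ld/task", (long)pid);
    dir = opendir(path);
    if (!dir)
        return -1;

    while ((de = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(de->d_name, &end, 10);

        /* Skip "." and ".." (anything that is not purely numeric). */
        if (end == de->d_name || *end != '\0')
            continue;

        /* Each entry is a thread ID; the syscall acts on that one thread. */
        if (sched_setscheduler((pid_t)tid, policy, &sp) < 0) {
            closedir(dir);
            return -1;
        }
        updated++;
    }

    closedir(dir);
    return updated;
}
```

For example, set_policy_all_threads(getpid(), SCHED_OTHER, 0) touches every thread of the calling process and needs no special privilege.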
Re: 2.6.11-mm2
On Tue, 2005-03-08 at 23:51 +, J.A. Magallon wrote: > Ahh, damn, that explains it. I use a main thread that does nothing but > wait for the worker threads. So it sure gets moved to CPU0, but as it > does not waste CPU time, I do not see it... > > Thanks. Will see what can I do with my threads. cpusets, perhaps... Affinity is inherited. Start the threads in a shell script that runs taskset on itself. Or just modify this program to have the main thread do sched_setaffinity() on itself. Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
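A small sketch of the second suggestion, in case it helps: the main thread pins itself first, and a worker created afterwards starts out with the same mask. The function name worker_inherits_cpu is made up for this example, and it assumes glibc's CPU_* macros (hence _GNU_SOURCE) and a pthreads build (link with -pthread on older toolchains).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Worker: record the affinity mask this thread started with. */
static void *read_affinity(void *arg)
{
    cpu_set_t *seen = arg;

    /* A pid of 0 means "the calling thread". */
    sched_getaffinity(0, sizeof(*seen), seen);
    return NULL;
}

/*
 * Pin the calling thread to 'cpu', then create a worker thread.
 * Returns 1 if the worker's mask is exactly { cpu }, 0 if it is not,
 * and -1 if a system call failed.
 */
int worker_inherits_cpu(int cpu)
{
    cpu_set_t want, seen;
    pthread_t th;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);

    CPU_ZERO(&want);
    CPU_SET(cpu, &want);
    if (sched_setaffinity(0, sizeof(want), &want) < 0)
        return -1;

    CPU_ZERO(&seen);
    if (pthread_create(&th, NULL, read_affinity, &seen) != 0)
        return -1;
    pthread_join(th, NULL);

    return CPU_EQUAL(&want, &seen) ? 1 : 0;
}
```

Note the calling thread stays pinned afterwards, which is the point: every thread it creates from then on lands on the same CPU unless told otherwise.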
Re: 2.6.11-mm2
On Tue, 2005-03-08 at 23:36 +, J.A. Magallon wrote: > Can cpu affinity really be changed for a running process ? Yes. > Does it need something like io or yielding to take effect ? No. > I am playing with Robert Love's taskset (symlinked to runon, it is easier > to type and I'm more used to it), because I want to play with hyperthreading > and wanted a method to force two threads on the same physical package. > It works fine to bind a new process to a cpu set, but it does not change > anything for a running process. > > I try runon -c -p 0 for my numbercruncher and it does nothing, top > shows it is in the same cpus where it started: > > werewolf:~# runon -c -p 0 8277 > pid 8277's current affinity list: 0-3 > pid 8277's new affinity list: 0 > werewolf:~# runon -c -p 8277 > pid 8277's current affinity list: 0 This looks fine. As expected. Although, you have the syntax wrong. It should be taskset -c 0 -p 8277 and taskset -p 8277 > The program uses posix threads, 2 in this case. The two threads change from > cpu sometimes (not too often), but do not go into the same processor > immediately as when I start the program directly with runon/taskset. You have to bind all of the threads individually. Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
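To convince yourself that a new mask takes effect right away, without any I/O or yield, a quick test along these lines may help. It is a sketch with an invented function name (pin_takes_effect_immediately); it assumes glibc's sched_getcpu() and the CPU_* macros, and it only checks the calling thread, which is exactly the per-thread behaviour described above.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Pin the calling thread to the lowest-numbered CPU it is currently
 * allowed to use, then ask which CPU it is running on.
 * Returns 1 if it is already on that CPU when sched_setaffinity()
 * returns, 0 if not, and -1 on error. The old mask is restored.
 */
int pin_takes_effect_immediately(void)
{
    cpu_set_t old, one;
    int cpu, now;

    if (sched_getaffinity(0, sizeof(old), &old) < 0)
        return -1;

    /* Find the first CPU in the current mask. */
    for (cpu = 0; cpu < CPU_SETSIZE && !CPU_ISSET(cpu, &old); cpu++)
        ;
    if (cpu == CPU_SETSIZE)
        return -1;

    CPU_ZERO(&one);
    CPU_SET(cpu, &one);
    if (sched_setaffinity(0, sizeof(one), &one) < 0)
        return -1;

    /* No sleep, no yield: the migration has already happened. */
    now = sched_getcpu();

    sched_setaffinity(0, sizeof(old), &old);    /* undo the pinning */
    return now == cpu;
}
```

The other threads of the process are untouched by this call; each one needs its own sched_setaffinity() on its own thread ID.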
Re: Question regarding thread_struct
On Wed, 2005-03-09 at 02:14 +0800, Coywolf Qi Hunt wrote: > CONFIG_IRQSTACKS seems only on ppc64. Is it good to add for other archs too? Some architectures (x86) control per-IRQ stacks via CONFIG_4KSTACKS, so enabling that directive turns on 4K stacks and gives interrupts their own stack. Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Question regarding thread_struct
On Tue, 2005-03-08 at 23:25 +0530, Imanpreet Arora wrote: > Thanks again, but if the whole of the kernel is restricted to couple of pages. NO. I did not say this. EACH PROCESS'S KERNEL STACK IS A PAGE OR TWO. That is all I said. The kernel can consume hundreds of megabytes of data if it wants. And it does. > Does this mean > > a) the whole of the kernel including drivers is restricted to couple of pages. No. Each process's stack is a page or two. The rest of the kernel is free to use a lot of memory. > b) Or with a more probability, I think what you actually mean is that > whenever there is an interrupt by any driver it runs in either context > of the current process or depending upon CONFIG_IRQSTACKS. Yes, the interrupt runs in the stack of the current process or (given CONFIG_IRQSTACKS) its own stack. Dynamic memory is free to come from all over. > If you could just quote the chapter, in your book which contains > information about this, that would be more than sufficient. That explains what, exactly? Kernel stacks are in Ch2 (1ed) and Ch3 (2ed), I think. > > > b)Or does it mean that a particular stack for a particular > > > process, can't be resized? Yes, a process's kernel stack cannot be resized. > Actually what I asked above was "how exactly does one define and > differentiate kernel stack", as against "user-stack". I think I always > knew it but couple of clouds were coming over after reading your first > mail. Also if each thread has a kernel stack how is it allocated at > first place (alloc_thread_info)(?) The user-space stack is handled by user-space. It is tracked by mm_struct->start_stack. The kernel stack is handled by the kernel. Its stack pointer is in esp, obviously, while inside of the kernel. And, yes, alloc_thread_info() allocates the stack. 
Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Question regarding thread_struct
On Tue, 2005-03-08 at 22:57 +0530, Imanpreet Arora wrote: > This has been a doubt for a couple of days, and I am wondering if this > one could also be cleared. When you say kernel stack, can't be resized > > > a) Does it mean that the _whole_ of the kernel is restricted to > that 8K or 16K of memory? Actually, 4K or 8K these days for x86. But, no, it means that EACH PROCESS is constrained to the kernel stack. The stacks are per-process. The kernel never "runs on its own" -- it is always in the context of a process (which has its own kernel stack) or an interrupt handler (which either shares the previous process's stack or has its own stack, depending on CONFIG_IRQSTACKS). > b)Or does it mean that a particular stack for a particular > process, can't be resized? Yes, I just said that in the previous email. The kernel stack cannot be resized. It is fixed. It is one or two pages, depending on configure option. That is, 4 or 8K. The _user-space_ stack, what the application actually uses, is dynamically resizable. But we are not talking about that. > c) And for that matter how exactly do we define a kernel stack? I don't know what you mean. alloc_thread_info() creates the thread_info structure and stack. Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Question regarding thread_struct
On Tue, 2005-03-08 at 22:34 +0530, Imanpreet Arora wrote: > I am wondering if someone could provide information as to how > thread_struct is kept in memory. Robert Love mentions that it is kept > at the "lowest" kernel address in case of x86 based platform. Could > anyone answer these questions. Kernel _stack_ address for the given process. > a)When a stack is resized, is the thread_struct structure copied onto > a new place? This is the kernel stack, not any potential user-space stack. Kernel stacks are not resized. > b)What is the advantage of this scheme as against a fixed > "virtual-address"? This is inside of the kernel, not in user-space. > c)Also could you kindly point the relevant files which do all this > stuff "shed.c"(?) See kernel/fork.c and alloc_thread_info() and friends in . Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
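Since the "lowest kernel stack address" point tends to confuse people, here is a user-space imitation of the arithmetic. It is not kernel code: fake_thread_info, info_from_sp and lookup_finds_base are names invented for this sketch, and STACK_SIZE stands in for the kernel's fixed stack size (8K here; 4096 would model 4K stacks). The idea it illustrates is that, because the stack is a fixed-size, size-aligned block with the structure at its bottom, masking off the low bits of any stack pointer value yields the structure's address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define STACK_SIZE 8192u    /* two 4K pages; must be a power of two */

/* Hypothetical stand-in for the structure kept at the stack's base. */
struct fake_thread_info {
    int cpu;
    int preempt_count;
};

/* Round any address inside the stack down to the stack's lowest address. */
static struct fake_thread_info *info_from_sp(uintptr_t sp)
{
    return (struct fake_thread_info *)(sp & ~((uintptr_t)STACK_SIZE - 1));
}

/*
 * Allocate a size-aligned "stack", pick an address near its top as a
 * pretend stack pointer, and check that masking recovers the base.
 * Returns 1 on success, 0 on mismatch, -1 if allocation failed.
 */
int lookup_finds_base(void)
{
    unsigned char *stack = aligned_alloc(STACK_SIZE, STACK_SIZE);
    uintptr_t sp;
    int ok;

    if (!stack)
        return -1;

    /* The stack grows down from the top; pretend 64 bytes are in use. */
    sp = (uintptr_t)(stack + STACK_SIZE - 64);
    assert(sp > (uintptr_t)stack);

    ok = (info_from_sp(sp) == (struct fake_thread_info *)stack);
    free(stack);
    return ok;
}
```

This also shows why the kernel stack cannot simply be resized: the lookup depends on the size and alignment being fixed.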
Re: [patch] inotify for 2.6.11-mm1, updated
On Mon, 2005-03-07 at 23:50 -0500, Robert Love wrote: > Yah, I just missed it. It is fixed in my tree. Following patch, against 2.6.11-mm1, fixes the hooks in fs/compat.c. Otherwise unchanged from the previous patch. Robert Love inotify! inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement). 
Signed-off-by: Robert Love <[EMAIL PROTECTED]> fs/Kconfig | 13 fs/Makefile |1 fs/attr.c| 33 - fs/compat.c | 12 fs/file_table.c |3 fs/inode.c |4 fs/inotify.c | 1014 +++ fs/namei.c | 30 - fs/open.c|6 fs/read_write.c | 15 fs/super.c |2 include/linux/fs.h |8 include/linux/fsnotify.h | 236 ++ include/linux/inotify.h | 113 + include/linux/sched.h|4 kernel/user.c|4 16 files changed, 1442 insertions(+), 56 deletions(-) diff -urN linux-2.6.11-mm1/fs/attr.c linux/fs/attr.c --- linux-2.6.11-mm1/fs/attr.c 2005-03-04 14:06:21.0 -0500 +++ linux/fs/attr.c 2005-03-08 12:02:28.216810448 -0500 @@ -10,7 +10,7 @@ #include #include #include -#include +#include #include #include #include @@ -107,31 +107,8 @@ out: return error; } - EXPORT_SYMBOL(inode_setattr); -int setattr_mask(unsigned int ia_valid) -{ - unsigned long dn_mask = 0; - - if (ia_valid & ATTR_UID) - dn_mask |= DN_ATTRIB; - if (ia_valid & ATTR_GID) - dn_mask |= DN_ATTRIB; - if (ia_valid & ATTR_SIZE) - dn_mask |= DN_MODIFY; - /* both times implies a utime(s) call */ - if ((ia_valid & (ATTR_ATIME|ATTR_MTIME)) == (ATTR_ATIME|ATTR_MTIME)) - dn_mask |= DN_ATTRIB; - else if (ia_valid & ATTR_ATIME) - dn_mask |= DN_ACCESS; - else if (ia_valid & ATTR_MTIME) - dn_mask |= DN_MODIFY; - if (ia_valid & ATTR_MODE) - dn_mask |= DN_ATTRIB; - return dn_mask; -} - int notify_change(struct dentry * dentry, struct iattr * attr) { struct inode *inode = dentry->d_inode; @@ -194,11 +171,9 @@ if (ia_valid & ATTR_SIZE) up_write(&dentry->d_inode->i_alloc_sem); - if (!error) { - unsigned long dn_mask = setattr_mask(ia_valid); - if (dn_mask) - dnotify_parent(dentry, dn_mask); - } + if (!error) + fsnotify_change(dentry, ia_valid); + return error; } diff -urN linux-2.6.11-mm1/fs/compat.c linux/fs/compat.c --- linux-2.6.11-mm1/fs/compat.c2005-03-04 14:06:21.0 -0500 +++ linux/fs/compat.c 2005-03-08 12:02:30.518460544 -0500 @@ -36,7 +36,7 @@ #include #include #include -#include +#include #include #include #include @@ -1233,9 +1233,13 @@ out: if (iov != 
iovstack) kfree(iov); - if ((ret + (type == READ)) > 0) - dnotify_parent(file->f_dentry, - (type == READ) ? DN_ACCESS : DN_MODIFY); + if ((ret + (type == READ)) > 0) { + struct dentry *dentry = file->f_dentry; + if (type == READ) + fsnotify_access(dentry); + else + fsnotify_modify(dentry); + } return ret; } diff -urN linux-2.6.11-mm1/fs/file_table.c linux/fs/file_table.c --- linux-2.6.11-mm1/fs/file_table.c2005-03-04 14:06:21.0 -0500 +++ linux/fs/file_table.c 2005-03-08 12:02:28.219809992 -0500 @@ -16,6 +16,7 @@ #include #include #include +#include /* sysctl tunables... */ struct files_stat_struct files_stat = { @@ -123,6 +124,8 @@ struct
Re: [patch] inotify for 2.6.11-mm1, updated
On Tue, 2005-03-08 at 04:40 +, Christoph Hellwig wrote: > Why do you need the classdevice? I'm really not too eager about adding > tons of new miscdevices now that we can route directly to individual majors > with cdev_add & stuff. Especially when you're actually relying on class > device you should have your own one instead of relying on an obsolete > layer. We have sysfs knobs and /sys/class/misc/inotify makes sense. > Actually, you fixed that in read_write.c, just compat.c is still missing. > Looks like you forgot to fix that one and didn't have a chance to compile-test > the 32bit compat layer? Yah, I just missed it. It is fixed in my tree. Thanks, Robert Love - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch] inotify for 2.6.11-mm1, updated
On Mon, 2005-03-07 at 01:19 +, Christoph Hellwig wrote: Hi, hch. I went ahead and implemented all of your suggestions, save for the ones below where I have comments or disagree (see below). Most of your comments were straightforward and I made the changes as you suggested. See the following patch, against 2.6.11-mm1. We will try out a write()-based interface. John is working on that. I'd like Andrew and others to chime in on whether they really prefer that to an ioctl? > > might_sleep(); > > this one seems totally unrelated. Eh? We did not add that. ;) > > + /* XXX: optimally, we should use GFP_KERNEL */ > > + kevent = kmem_cache_alloc(event_cachep, GFP_ATOMIC); > > indeed. having a new atomic memory allocation in every filesystem operation > sounds like a really bad idea. Obviously we know that--the FIXME is there to signify as much. Anyhow, the allocation is not on every operation, just every event. > > +static struct miscdevice inotify_device = { > > + .minor = MISC_DYNAMIC_MINOR, > > + .name = "inotify", > > + .fops = &inotify_fops, > > +}; > > Should probably use the /dev/mem major. Hrm, should we? Also, the memory class stuff is all local to mem.c. For example, I cannot get at /sys/class/mem. The misc. device stuff is exported. > > + default y > > please don't default a new and experimental facility to y. In fact > default is totally overused. I'd agree when we go to mainline, but for 2.6-mm more testing is welcome. Besides, they don't have to use inotify. This just gets the hooks compiled in. I will definitely remove 'default' altogether before we go to mainline. > > +#ifdef CONFIG_INOTIFY > > + struct list_head inotify_watches; /* watches on this inode */ > > + spinlock_t inotify_lock; /* protects the watches list */ > > +#endif > > do you really need a spinlock of your own in every inode? Inode memory > usage is a quite big problem. Yah, we do. For a couple of reasons. 
First, by introducing our own lock, we never need touch i_lock, and avoid that scalability mess altogether. Second, and most importantly, i_lock is an outermost lock. We need our lock to be nestable, because we walk inode -> inotify_watch -> inotify_device. I've tried various rewrites to not need our own lock. None are pretty. I can offer to the "inode memory worries me" people that they can always disable CONFIG_INOTIFY. > > +/* > > + * fsnotify_change - notify_change event. file was modified and/or > > metadata > > + * was changed. > > + */ > > +static inline void fsnotify_change(struct dentry *dentry, unsigned int > > ia_valid) > > this one is far too large to be inlined. I'd agree, but it is only called from one place. And this way everything stays in fsnotify.h. Best, Robert Love inotify! inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement). 
Signed-off-by: Robert Love <[EMAIL PROTECTED]> fs/Kconfig | 13 fs/Makefile |1 fs/attr.c| 33 - fs/compat.c | 14 fs/file_table.c |3 fs/inode.c |4 fs/inotify.c | 1014 +++ fs/namei.c | 30 - fs/open.c|6 fs/read_write.c | 15 fs/super.c |2 include/linux/fs.h |8 include/linux/fsnotify.h | 236 ++ include/linux/inotify.h | 113 + include/linux/sched.h|4 kernel/user.c|4 16 files changed, 1444 insertions(+), 56 deletions(-) diff -urN linux-2.6.11-mm1/fs/attr.c linux/fs/attr.
[patch] inotify for 2.6.11, updated
On Fri, 2005-03-04 at 13:37 -0500, Robert Love wrote:

> I greatly reworked much of the data structures and their interactions,
> to lay the groundwork for sanitizing the locking. I then, I hope,
> sanitized the locking. It looks right, I am happy. Comments welcome.
> I surely could have missed something. Maybe even something big.
>
> But, regardless, this release is a huge jump from the previous, fixing
> all known issues and greatly improving the locking.

Updated inotify, against 2.6.11, addressing hch's concerns (see other thread).

Love,

Robert Love

inotify!

inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface:

* dnotify requires the opening of one fd per directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount.
* dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures.
* dnotify's interface to user-space is awful. Signals?

inotify provides a more usable, simple, powerful solution to file change notification:

* inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able.
* inotify has an event that says "the filesystem that the item you were watching is on was unmounted."
* inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement).
Signed-off-by: Robert Love <[EMAIL PROTECTED]> fs/Kconfig | 13 fs/Makefile |1 fs/attr.c| 33 - fs/compat.c | 14 fs/file_table.c |3 fs/inode.c |4 fs/inotify.c | 1014 +++ fs/namei.c | 30 - fs/open.c|6 fs/read_write.c | 15 fs/super.c |2 include/linux/fs.h |8 include/linux/fsnotify.h | 236 ++ include/linux/inotify.h | 113 + include/linux/sched.h|4 kernel/user.c|4 16 files changed, 1444 insertions(+), 56 deletions(-) diff -urN linux-2.6.11/fs/attr.c linux/fs/attr.c --- linux-2.6.11/fs/attr.c 2005-03-02 02:37:48.0 -0500 +++ linux/fs/attr.c 2005-03-07 16:10:46.854669712 -0500 @@ -10,7 +10,7 @@ #include #include #include -#include +#include #include #include #include @@ -107,31 +107,8 @@ out: return error; } - EXPORT_SYMBOL(inode_setattr); -int setattr_mask(unsigned int ia_valid) -{ - unsigned long dn_mask = 0; - - if (ia_valid & ATTR_UID) - dn_mask |= DN_ATTRIB; - if (ia_valid & ATTR_GID) - dn_mask |= DN_ATTRIB; - if (ia_valid & ATTR_SIZE) - dn_mask |= DN_MODIFY; - /* both times implies a utime(s) call */ - if ((ia_valid & (ATTR_ATIME|ATTR_MTIME)) == (ATTR_ATIME|ATTR_MTIME)) - dn_mask |= DN_ATTRIB; - else if (ia_valid & ATTR_ATIME) - dn_mask |= DN_ACCESS; - else if (ia_valid & ATTR_MTIME) - dn_mask |= DN_MODIFY; - if (ia_valid & ATTR_MODE) - dn_mask |= DN_ATTRIB; - return dn_mask; -} - int notify_change(struct dentry * dentry, struct iattr * attr) { struct inode *inode = dentry->d_inode; @@ -194,11 +171,9 @@ if (ia_valid & ATTR_SIZE) up_write(&dentry->d_inode->i_alloc_sem); - if (!error) { - unsigned long dn_mask = setattr_mask(ia_valid); - if (dn_mask) - dnotify_parent(dentry, dn_mask); - } + if (!error) + fsnotify_change(dentry, ia_valid); + return error; } diff -urN linux-2.6.11/fs/compat.c linux/fs/compat.c --- linux-2.6.11/fs/compat.c2005-03-02 02:38:08.0 -0500 +++ linux/fs/compat.c 2005-03-07 16:10:14.152641176 -0500 @@ -36,7 +36,7 @@ #include #include #include -#include +#include #include #include #include @@ -1233,9 +1233,15 @@ out: if (iov != iovstack) kfree(iov); 
- if ((ret + (type == READ)) > 0) - dnotify_parent(file->f_dentry, - (type == READ) ? DN_ACCESS : DN_MODIFY); + if ((ret + (type == READ)) > 0) { + struct dentry *dentry = file->f_dentry; + if (type == READ) + fsnotify_access(dentry, dentry->d_inode, + dentry->d_name.name); + else + fsnotify_modify(
Re: [patch] inotify for 2.6.11
On Mon, 2005-03-07 at 01:23 +, Christoph Hellwig wrote:

> It means that every revision of inotify so far has been buggy in some
> respect and it got dropped from -mm again and again. It should get some
> more testing there and not sent directly for mainline.

It was dropped from 2.6-mm once.

Robert Love

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] inotify for 2.6.11
On Sun, 2005-03-06 at 00:04 +, Christoph Hellwig wrote:

> The user interface is still bogus.

I presume you are talking about the ioctl. I have tried to engage you and others on what exactly you prefer instead. I have said that moving to a write interface is fine but I don't see how it is _any_ better than the ioctl. Write is less typed, in fact, since we lose the command versus argument delineation. But if it is a unanimous decision, I'll switch it. Or take patches. ;-) It isn't a big deal.

> Also now version of it has stayed in -mm long enough because bad
> bugs pop up almost weekly.

I don't follow this sentence.

Best,

Robert Love
Re: [patch] inotify for 2.6.11
On Fri, 2005-03-04 at 15:38 -0600, Timothy R. Chavez wrote: Hi, Mr. Chavez. > Are there plans of reworking the "generic" hooking infrastructure > (fsnotify.h) to be more like the security hooking framework (+ > stacking)? I think it'd be nice to be able to have a fs_notify struct > of function pointers, point at the one's I've chosen to implement, and > then register / unregister with the framework. Maybe this is an > overly complicated approach, but these don't seem like they're generic > hooks in anyway. Personally, I think it is overkill. I don't think we are going to have the myriad of file notification systems that we have for security layers (indeed, the goal is to have just inotify). That said, we could always make the layer more pluggable once inotify is in. I would not fight that. But, personally I don't see any real benefit, just additional complexity and overhead. > + * include/linux/fs_notify.h - >generic< hooks for filesystem notification, > to > + * reduce in-source duplication from both >dnotify and inotify<. > > I guess I don't fully understand that comment. Just quickly glancing > at it, all you've done is added a level of indirection and shifted the > same redundant code from the VFS to fs_notify.h -- Please correct me > if I'm wrong (not at all uncommon). No, you are right. The "generic" part is supposed to be what is in the VFS. E.g., the fsnotify_foo() calls are supposed to be the generic interface. The body of these calls, as you can see, is static code, a simple copy and cleanup of the inotify + dnotify hooks. The idea, spurred by Christoph Hellwig's suggestion, was to keep the VFS clean. Not make a super neat pluggable notification system. I think the layers ARE generic, though, in the sense that foonotify could probably drop some static code into fsnotify.h and work. > As you already know, there's work being done on the audit subsystem > that also needs notifications from the filesystem and would require > yet another set of hooks. 
However, where we get notified might differ > from where inotify and dnotify get notified and it seems like > fs_notify is tailored specifically for inotify (and accommodates > dnotify out of obligation) and openly implements the "generic" hooks > it requires. > > Regardless, if this is the way it's going to be done. We'll expand > fs_notify.h to meet our needs as well. If we end up duplicating stuff and making a big mess, then the audit layer and the notification layer should DEFINITELY look at merging and consolidating. But I think that we need to wait until one or the other gets more traction and into the mainline kernel. > Also, FYI: > I just purchased the 2nd edition of your book, looking forward to reading it. Great. Hope you enjoy it! ;-) Best, Robert Love
[patch] inotify for 2.6.11-mm1
On Fri, 2005-03-04 at 13:37 -0500, Robert Love wrote: Hey, Andrew. > I greatly reworked much of the data structures and their interactions, > to lay the groundwork for sanitizing the locking. I then, I hope, > sanitized the locking. It looks right, I am happy. Comments welcome. > I surely could of missed something. Maybe even something big. > > But, regardless, this release is a huge jump from the previous, fixing > all known issues and greatly improving the locking. Attached is inotify, replacing the current version of inotify in 2.6-mm. The patch is diffed against 2.6.11-mm1, modulo the two inotify patches already in-tree. I'd like to start moving forward on this, the locking is greatly improved, resolve any new issues, etc. Please, apply. Your humble servant, Robert Love inotify! inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. * inotify supports much finer grained events. Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement). 
Signed-off-by: Robert Love <[EMAIL PROTECTED]> fs/Kconfig | 13 fs/Makefile|1 fs/attr.c | 33 - fs/compat.c| 14 fs/file_table.c|4 fs/inode.c |4 fs/inotify.c | 1013 + fs/namei.c | 38 - fs/open.c |9 fs/read_write.c| 24 - fs/super.c |2 include/linux/fs.h |8 include/linux/fsnotify.h | 235 ++ include/linux/inotify.h| 113 + include/linux/sched.h |2 kernel/user.c |2 18 files changed, 1453 insertions(+), 62 deletions(-) diff -urN linux-2.6.11-mm1/fs/attr.c linux/fs/attr.c --- linux-2.6.11-mm1/fs/attr.c 2005-03-04 14:06:21.732297568 -0500 +++ linux/fs/attr.c 2005-03-04 13:27:05.560490128 -0500 @@ -10,7 +10,7 @@ #include #include #include -#include +#include #include #include #include @@ -107,31 +107,8 @@ out: return error; } - EXPORT_SYMBOL(inode_setattr); -int setattr_mask(unsigned int ia_valid) -{ - unsigned long dn_mask = 0; - - if (ia_valid & ATTR_UID) - dn_mask |= DN_ATTRIB; - if (ia_valid & ATTR_GID) - dn_mask |= DN_ATTRIB; - if (ia_valid & ATTR_SIZE) - dn_mask |= DN_MODIFY; - /* both times implies a utime(s) call */ - if ((ia_valid & (ATTR_ATIME|ATTR_MTIME)) == (ATTR_ATIME|ATTR_MTIME)) - dn_mask |= DN_ATTRIB; - else if (ia_valid & ATTR_ATIME) - dn_mask |= DN_ACCESS; - else if (ia_valid & ATTR_MTIME) - dn_mask |= DN_MODIFY; - if (ia_valid & ATTR_MODE) - dn_mask |= DN_ATTRIB; - return dn_mask; -} - int notify_change(struct dentry * dentry, struct iattr * attr) { struct inode *inode = dentry->d_inode; @@ -194,11 +171,9 @@ if (ia_valid & ATTR_SIZE) up_write(&dentry->d_inode->i_alloc_sem); - if (!error) { - unsigned long dn_mask = setattr_mask(ia_valid); - if (dn_mask) - dnotify_parent(dentry, dn_mask); - } + if (!error) + fsnotify_change(dentry, ia_valid); + return error; } diff -urN linux-2.6.11-mm1/fs/compat.c linux/fs/compat.c --- linux-2.6.11-mm1/fs/compat.c2005-03-04 14:06:21.734297264 -0500 +++ linux/fs/compat.c 2005-03-04 13:27:05.562489824 -0500 @@ -36,7 +36,7 @@ #include #include #include -#include +#include #include #include #include @@ -1233,9 +1233,15 @@ out: 
if (iov != iovstack) kfree(iov); - if ((ret + (type == READ)) > 0) - dnotify_parent(file->f_dentry, - (type == READ) ? DN_ACCESS : DN_MO
[patch] inotify for 2.6.11
Below is inotify, diffed against 2.6.11.

I greatly reworked much of the data structures and their interactions, to lay the groundwork for sanitizing the locking. I then, I hope, sanitized the locking. It looks right, I am happy. Comments welcome. I surely could have missed something. Maybe even something big.

But, regardless, this release is a huge jump from the previous, fixing all known issues and greatly improving the locking.

Best,

Robert Love

inotify!

inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface:

* dnotify requires the opening of one fd per directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount.
* dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures.
* dnotify's interface to user-space is awful. Signals?

inotify provides a more usable, simple, powerful solution to file change notification:

* inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able.
* inotify has an event that says "the filesystem that the item you were watching is on was unmounted."
* inotify can watch directories or files.
* inotify provides finer-grained event control.

Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement).
Signed-off-by: Robert Love <[EMAIL PROTECTED]> drivers/char/Makefile |1 fs/Kconfig | 13 fs/Makefile|1 fs/attr.c | 33 - fs/compat.c| 14 fs/file_table.c|4 fs/inode.c |4 fs/inotify.c | 1013 + fs/namei.c | 38 - fs/open.c |9 fs/read_write.c| 24 - fs/super.c |2 include/linux/fs.h |8 include/linux/fsnotify.h | 235 ++ include/linux/inotify.h| 113 + include/linux/miscdevice.h |1 include/linux/sched.h |2 kernel/user.c |2 18 files changed, 1455 insertions(+), 62 deletions(-) diff -urN linux-2.6.11/drivers/char/Makefile linux/drivers/char/Makefile --- linux-2.6.11/drivers/char/Makefile 2005-03-02 02:38:26.0 -0500 +++ linux/drivers/char/Makefile 2005-03-04 13:11:27.414110056 -0500 @@ -9,6 +9,7 @@ obj-y += mem.o random.o tty_io.o n_tty.o tty_ioctl.o + obj-$(CONFIG_LEGACY_PTYS) += pty.o obj-$(CONFIG_UNIX98_PTYS) += pty.o obj-y += misc.o diff -urN linux-2.6.11/fs/attr.c linux/fs/attr.c --- linux-2.6.11/fs/attr.c 2005-03-02 02:37:48.0 -0500 +++ linux/fs/attr.c 2005-03-04 13:13:00.689929976 -0500 @@ -10,7 +10,7 @@ #include #include #include -#include +#include #include #include #include @@ -107,31 +107,8 @@ out: return error; } - EXPORT_SYMBOL(inode_setattr); -int setattr_mask(unsigned int ia_valid) -{ - unsigned long dn_mask = 0; - - if (ia_valid & ATTR_UID) - dn_mask |= DN_ATTRIB; - if (ia_valid & ATTR_GID) - dn_mask |= DN_ATTRIB; - if (ia_valid & ATTR_SIZE) - dn_mask |= DN_MODIFY; - /* both times implies a utime(s) call */ - if ((ia_valid & (ATTR_ATIME|ATTR_MTIME)) == (ATTR_ATIME|ATTR_MTIME)) - dn_mask |= DN_ATTRIB; - else if (ia_valid & ATTR_ATIME) - dn_mask |= DN_ACCESS; - else if (ia_valid & ATTR_MTIME) - dn_mask |= DN_MODIFY; - if (ia_valid & ATTR_MODE) - dn_mask |= DN_ATTRIB; - return dn_mask; -} - int notify_change(struct dentry * dentry, struct iattr * attr) { struct inode *inode = dentry->d_inode; @@ -194,11 +171,9 @@ if (ia_valid & ATTR_SIZE) up_write(&dentry->d_inode->i_alloc_sem); - if (!error) { - unsigned long dn_mask = setattr_mask(ia_valid); - if (dn_mask) - 
dnotify_parent(dentry, dn_mask); - } + if (!error) + fsnotify_change(dentry, ia_valid); + return error; } diff -urN linux-2.6.11/fs/compat.c linux/fs/compat.c --- linux-2.6.11/fs/compat.c2005-03-02 02:38:08.0 -0500 +++ linux/fs/compat.c 2005-03-04 13:11:31.336513760 -0500 @@ -36,7 +36,7 @@ #include #include #include -#include +#include #include #include #include @@ -1233,9 +1233,15 @@ out: if (iov != iovstack) kfree(iov); - if ((r
Re: init process and task_struct
On Fri, 2005-02-25 at 23:26 +0100, Josef E. Galea wrote: > Does the init process have a task_struct associated with it, and if yes > where is this structure created? Of course. Stored directly in init_task, declared in , defined in arch-specific code (arch/i386/kernel/init_task.c on x86). Robert Love
Re: 2.6.11-rc4-mm1
On Wed, 2005-02-23 at 12:03 +0100, Mathieu Segaud wrote: > it is the latest Robert Love posted against -mm kernels, but in > inotify_ignore(): I posted an updated patch last Friday, which fixed this. Anyhow, this is the correct fix. Signed-off-by: Robert Love <[EMAIL PROTECTED]> Thanks, Robert Love
Re: [patch] inotify for 2.6.11-rc3-mm2
On Fri, 2005-02-18 at 17:24 +, Al Viro wrote: > Fix the damn locking, already. Fast as I can. Robert Love
Re: [patch] inotify for 2.6.11-rc3-mm2
On Thu, 2005-02-10 at 13:47 -0500, Robert Love wrote: > Attached, find a patch against 2.6.11-rc3-mm2 of the latest inotify. Updated patch, fixes a bug. Robert Love inotify, bitches Signed-off-by: Robert Love <[EMAIL PROTECTED]> arch/sparc64/Kconfig | 13 drivers/char/Kconfig | 13 drivers/char/Makefile |2 drivers/char/inotify.c | 1053 + drivers/char/misc.c| 14 fs/attr.c | 34 - fs/compat.c| 14 fs/file_table.c|4 fs/inode.c |3 fs/namei.c | 38 - fs/open.c |9 fs/read_write.c| 28 - fs/super.c |3 include/linux/fs.h |7 include/linux/fsnotify.h | 235 ++ include/linux/inotify.h| 118 + include/linux/miscdevice.h |5 include/linux/sched.h |2 kernel/user.c |2 19 files changed, 1522 insertions(+), 75 deletions(-) diff -urN linux-2.6.10/arch/sparc64/Kconfig linux/arch/sparc64/Kconfig --- linux-2.6.10/arch/sparc64/Kconfig 2004-12-24 16:35:25.0 -0500 +++ linux/arch/sparc64/Kconfig 2005-02-01 12:24:26.0 -0500 @@ -88,6 +88,19 @@ bool default y +config INOTIFY + bool "Inotify file change notification support" + default y + ---help--- + Say Y here to enable inotify support and the /dev/inotify character + device. Inotify is a file change notification system and a + replacement for dnotify. Inotify fixes numerous shortcomings in + dnotify and introduces several new features. It allows monitoring + of both files and directories via a single open fd. Multiple file + events are supported. + + If unsure, say Y. 
+ config SMP bool "Symmetric multi-processing support" ---help--- diff -urN linux-2.6.10/drivers/char/inotify.c linux/drivers/char/inotify.c --- linux-2.6.10/drivers/char/inotify.c 1969-12-31 19:00:00.0 -0500 +++ linux/drivers/char/inotify.c2005-02-09 16:05:07.959265648 -0500 @@ -0,0 +1,1053 @@ +/* + * drivers/char/inotify.c - inode-based file event notifications + * + * Authors: + * John McCutchan <[EMAIL PROTECTED]> + * Robert Love <[EMAIL PROTECTED]> + * + * Copyright (C) 2005 John McCutchan + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2, or (at your option) any + * later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +static atomic_t inotify_cookie; +static kmem_cache_t *watch_cachep; +static kmem_cache_t *event_cachep; +static kmem_cache_t *inode_data_cachep; + +static int sysfs_attrib_max_user_devices; +static int sysfs_attrib_max_user_watches; +static unsigned int sysfs_attrib_max_queued_events; + +/* + * struct inotify_device - represents an open instance of an inotify device + * + * For each inotify device, we need to keep track of the events queued on it, + * a list of the inodes that we are watching, and so on. + * + * This structure is protected by 'lock'. 
Lock ordering: + * + * dev->lock (protects dev) + * inode_lock (used to safely walk inode_in_use list) + * inode->i_lock (only needed for getting ref on inode_data) + */ +struct inotify_device { + wait_queue_head_t wait; + struct idr idr; + struct list_headevents; + struct list_headwatches; + spinlock_t lock; + unsigned intqueue_size; + unsigned intevent_count; + unsigned intmax_events; + struct user_struct *user; +}; + +struct inotify_watch { + s32 wd; /* watch descriptor */ + u32 mask; /* event mask for this watch */ + struct inode*inode; /* associated inode */ + struct inotify_device *dev; /* associated device */ + struct list_headd_list; /* entry in device's list */ + struct list_headi_list; /* entry in inotify_data's list */ +}; + +/* + * A list of these is attached to each instance of the driver. In read(), this + * this list is walked and all events that can fit in the buffer are returned. + */ +struct inotify_kernel_event { + struct inotify_eventevent; + stru
[patch] inotify for 2.6.11-rc3-mm2
On Thu, 2005-02-10 at 02:35 -0800, Andrew Morton wrote: > -inotify.patch > -inotify-fix_find_inode.patch > > I think my version is old, and it oopses. It is old. I have sent you multiple updates. ;-) Attached, find a patch against 2.6.11-rc3-mm2 of the latest inotify. This version has numerous optimizations, bug fixes, and clean ups. It introduces a generic notification layer to cleanly wrap both dnotify and inotify hooks in fs/. Pending is a data structure reorganization, to untangle some of the locking. Andrew, please apply! Robert Love inotify! inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a device node, not SIGIO. You open a single fd to the device node, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure) and Gamin (a FAM replacement). 
Signed-off-by: Robert Love <[EMAIL PROTECTED]> arch/sparc64/Kconfig | 13 drivers/char/Kconfig | 13 drivers/char/Makefile |2 drivers/char/inotify.c | 1053 + fs/attr.c | 33 - fs/compat.c| 14 fs/file_table.c|4 fs/inode.c |3 fs/namei.c | 38 - fs/open.c |9 fs/read_write.c| 24 - fs/super.c |3 include/linux/fs.h |7 include/linux/fsnotify.h | 235 ++ include/linux/inotify.h| 118 + include/linux/miscdevice.h |1 include/linux/sched.h |2 kernel/user.c |2 18 files changed, 1511 insertions(+), 63 deletions(-) diff -urN linux-2.6.11-rc3-mm2/arch/sparc64/Kconfig linux-mm-inotify/arch/sparc64/Kconfig --- linux-2.6.11-rc3-mm2/arch/sparc64/Kconfig 2005-02-10 13:17:32.212175080 -0500 +++ linux-mm-inotify/arch/sparc64/Kconfig 2005-02-10 13:18:40.358815216 -0500 @@ -88,6 +88,19 @@ bool default y +config INOTIFY + bool "Inotify file change notification support" + default y + ---help--- + Say Y here to enable inotify support and the /dev/inotify character + device. Inotify is a file change notification system and a + replacement for dnotify. Inotify fixes numerous shortcomings in + dnotify and introduces several new features. It allows monitoring + of both files and directories via a single open fd. Multiple file + events are supported. + + If unsure, say Y. 
+ config SMP bool "Symmetric multi-processing support" ---help--- diff -urN linux-2.6.11-rc3-mm2/drivers/char/inotify.c linux-mm-inotify/drivers/char/inotify.c --- linux-2.6.11-rc3-mm2/drivers/char/inotify.c 1969-12-31 19:00:00.0 -0500 +++ linux-mm-inotify/drivers/char/inotify.c 2005-02-10 13:18:40.360814912 -0500 @@ -0,0 +1,1053 @@ +/* + * drivers/char/inotify.c - inode-based file event notifications + * + * Authors: + * John McCutchan <[EMAIL PROTECTED]> + * Robert Love <[EMAIL PROTECTED]> + * + * Copyright (C) 2005 John McCutchan + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2, or (at your option) any + * later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +static atomic_t inotify_cookie; +static kmem_cache_t *watch_cachep; +static kmem_cache_t *event_cachep; +static kmem_cache_t *inode_data_cachep; + +static int sysfs_attrib_max_user_devices; +static int sysfs_attrib_max_user_watches; +static unsigned int sysfs_attrib_m
Re: VM disk cache behavior.
On Tue, 2005-02-08 at 12:06 -0500, jon ross wrote:

> I have an app with a small fixed memory footprint that does a lot of
> random reads from a large file. I thought if I added more memory to
> the machine the VM would do more caching of the disk, but added memory
> does not seem to make any difference. I played with some of the params
> in /proc/sys/vm and none of them seem to have any effect.
>
> I tried both a 2.4.20 & 2.6.10 kernels with no difference.
>
> The machine is a Dell 2560. I tried memory configs of 512M, 1G, 4G and
> the average read-times do not change.
>
> Do I need to set/compile anything to allow the VM to use the memory?
> If there was a way to tell how much memory the VM is using for a drive
> cache I could at least tell if my kernel is misconfigured or my app
> sucks.

More memory will allow the kernel to keep more cache in memory. You can see how much memory the kernel is using for cache with free(1).

That does not sound like your problem, though. It sounds like you want the kernel to do more _read-ahead_, e.g. cache things _before_ you even need them (and then you might want more memory to actually keep all of the stuff alive in the cache, but that is a secondary problem).

Unfortunately, since you are doing random reads, it is very hard for the kernel to do intelligent read-ahead.

What you can do is pre-fault the entire file into memory. This is not a bad idea if you know you are going to ultimately read much of the file. You can prefault the file automatically and asynchronously using posix_fadvise(). Example:

	/* posix_fadvise() returns the error number; it does not set errno */
	int err = posix_fadvise (fd, 0, 0, POSIX_FADV_WILLNEED);
	if (err)
		fprintf (stderr, "posix_fadvise: %s\n", strerror (err));

See posix_fadvise(2) for more information.

It might also be faster to use mmap(2) over read(2). Then you can use madvise().
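For the mmap() route, a minimal sketch might look like the following. It assumes MADV_WILLNEED is the madvise() hint you want (the counterpart to POSIX_FADV_WILLNEED); map_whole_file() is a hypothetical helper name, not anything from an existing library.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): map a whole file read-only and
 * ask the kernel to start reading it in. Returns the mapping and stores
 * its length in *len, or returns NULL on failure.
 */
void *map_whole_file(const char *path, size_t *len)
{
	struct stat st;
	void *map;
	int fd;

	assert(path != NULL && len != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;

	/* mmap() rejects zero-length mappings, so bail out on empty files. */
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return NULL;
	}

	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */
	if (map == MAP_FAILED)
		return NULL;

	/* Advisory only: a failure here leaves the mapping usable. */
	if (madvise(map, (size_t)st.st_size, MADV_WILLNEED) < 0)
		perror("madvise");

	*len = (size_t)st.st_size;
	return map;
}
```

The caller then indexes into the returned pointer for its random reads and calls munmap() with the same length when done.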
Best, Robert Love
Re: 2.6.11-rc2-mm1
On Mon, 2005-02-07 at 12:57 +0100, Ingo Molnar wrote:

Hello, Ingo.

> > Also ioctl is not an acceptable interface for adding new core
> > functionality.
>
> seconded. Robert?

Well, I don't share the hatred for ioctl, at least compared to another type-unsafe interface like write(). But John and I are open to doing whatever is the consensus. If there is an agreed alternative, and that is the requirement for merging, I'll do it.

I'd like to keep the user-space interface simple, and absolutely want to keep the single file descriptor approach. How the fd is obtained is up for discussion.

Ingo, what do you prefer?

Best,

Robert Love
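To make the "single select()-able file descriptor" point concrete, here is a minimal user-space sketch. It does not use the /dev/inotify ioctl interface debated in this thread; it assumes the system-call API that later shipped in mainline (inotify_init() and inotify_add_watch() from <sys/inotify.h>). wait_one_event() is a hypothetical helper name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): wait up to timeout_sec for an
 * event on an inotify fd. Returns 1 and stores the first event's mask in
 * *mask, 0 on timeout, or -1 on error.
 */
int wait_one_event(int ifd, int timeout_sec, uint32_t *mask)
{
	/* Room for one event plus the longest possible file name. */
	char buf[sizeof(struct inotify_event) + NAME_MAX + 1]
		__attribute__((aligned(__alignof__(struct inotify_event))));
	struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
	fd_set rfds;
	ssize_t n;
	int ready;

	assert(mask != NULL);

	FD_ZERO(&rfds);
	FD_SET(ifd, &rfds);

	/* One fd covers every watch, so one select() covers them all. */
	ready = select(ifd + 1, &rfds, NULL, NULL, &tv);
	if (ready < 0)
		return -1;
	if (ready == 0)
		return 0;

	n = read(ifd, buf, sizeof(buf));
	if (n < (ssize_t)sizeof(struct inotify_event))
		return -1;

	*mask = ((const struct inotify_event *)buf)->mask;
	return 1;
}
```

A caller would obtain the fd with inotify_init(), register paths with inotify_add_watch(fd, path, IN_CREATE | IN_MODIFY), and then loop on wait_one_event(); adding more watches never requires more file descriptors.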
Re: 2.6.11-rc3-mm1
On Sun, 2005-02-06 at 22:22 +0100, Peter Osterlund wrote:

> > > > EIP is at strncpy_from_user+0x33/0x47
> > > > ...
> > > > Call Trace:
> > > >  getname+0x69/0xa5
> > > >  sys_open+0x12/0xc6
> > > >  sysenter_past_esp+0x52/0x75
> > > > ...
> > > > Kernel panic - not syncing: Attempted to kill init!
>
> I found that if I disable CONFIG_INOTIFY, the problem goes away.

Weird. While we touch sys_open() with an inotify hook, we do so after the call to getname(), and we don't touch getname() or strncpy_from_user() at all.

I wonder if there is another bug and inotify is just affecting the timing?

Robert Love