Re: If not readdir() then what?
On Thursday April 12, [EMAIL PROTECTED] wrote:
> On Thu, 12 April 2007 11:46:41 +1000, Neil Brown wrote:
> >
> > I could argue that nfs came before ext3+dirindex, so ext3 should have
> > been designed to work properly with NFS. You could argue that fixing
> > it in nfsd fixes it for all filesystems. But I'm not sure either of
> > those arguments are likely to be at all convincing...
>
> Caring about a non-ext3 filesystem, I sure would like an nfs solution as
> well. :)

I have a non-ext3 filesystem I care about too. But my perspective is
that a solution in nfsd is at best a work-around. Caching the whole
'struct file' when there is just a small bit that we might want seems
like a heavy hammer. The filesystem is in the best place to know what
needs to be cached, and it should be the one doing the caching.

>
> > Hmmm. I wonder. Which is more likely?
> > - That two 64bit hashes from some set are the same
> > - or that 65536 48bit hashes from a set of equal size are the same.
>
> The former. Each bit going from hash strength to collision chain length
> reduces the likelihood of an overflow. In the extreme case of a 0bit
> hash and 64bit collision chain, you need 2^64 entries compared to 2^32
> for the other extreme.
>
> However, the collision chain gives me quite a bit of headache. One
> would have to store each entry's position on the chain, deal with older
> entries getting deleted, newer entries getting removed, etc. All this
> requires a lot of complicated code that basically never gets tested in
> the wild.

This is a simple consequence of the design decision to use hashes as
the search key. They aren't dense and they will collide. So the
solution will be a bit fuzzy around the edges. And maybe that is an
acceptable tradeoff. But the filesystem should take full
responsibility for it, whether in performance or correctness :-)

>
> Just settling for a 64bit hash and returning -EEXIST when someone causes
> a collision on creat() sounds more appealing. Directories with 4
> billion entries will cause problems, but that is hardly news to anyone.
>

I think you want -EFBIG or -ENOSPC. -EEXIST sounds just wrong.

But there are alternatives, e.g. internal chaining.

Insist on a unique 64bit hash for every file. If the hash is in use,
increment and try again.

On lookup, if the hash leads you to a file with the wrong name,
increment and try again until you find a hole (a hash value that is not
stored).

When you delete an entry, leave a place holder if the next hash is in
use. Conversely, if the next hash is not in use, delete the entry and
delete the previous one if it is a place holder.

Then you get 100% correct semantics and a performance hit in the face
of hash collisions that is probably no worse than that which ext3
currently gets.

It probably does cost you a bit of storage to store those 64bit
hashes, though I suspect some clever compression can help out there
(you only need one bit more than the filename when there is no
chaining).

You have to require 64bit cookies/fpos, but I think that today, that is
a reasonable thing to require (5 years ago it might not have been).

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
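The internal chaining scheme described in the message above can be sketched in userspace C. This is only an illustration under loud assumptions: the hash space is shrunk from 2^64 values to 16 slots so collisions are easy to provoke, and `toy_hash`, `struct dir`, `dir_insert`, `dir_lookup` and `dir_remove` are hypothetical names, not anything from ext3 or nfsd. The slot index plays the role of the unique hash (the readdir cookie).

```c
#include <assert.h>
#include <string.h>

#define HASH_SPACE 16u      /* toy hash space; the mail assumes 2^64 values */
#define DIR_NAME_LEN 32     /* hypothetical name limit for the sketch */

enum slot_state { SLOT_HOLE = 0, SLOT_USED, SLOT_PLACEHOLDER };

struct slot {
    enum slot_state state;
    char name[DIR_NAME_LEN];
};

struct dir {
    struct slot slots[HASH_SPACE];  /* index == unique hash == cookie */
};

/* Deliberately weak hash (byte sum) so that names collide easily. */
unsigned toy_hash(const char *name)
{
    unsigned sum = 0;
    assert(name != NULL);
    while (*name)
        sum += (unsigned char)*name++;
    return sum % HASH_SPACE;
}

/* Walk the chain from the name's hash until a hole; return cookie or -1. */
int dir_lookup(const struct dir *d, const char *name)
{
    unsigned h = toy_hash(name);
    for (unsigned i = 0; i < HASH_SPACE; i++) {
        unsigned idx = (h + i) % HASH_SPACE;
        const struct slot *s = &d->slots[idx];
        if (s->state == SLOT_HOLE)
            return -1;                      /* hole ends the chain */
        if (s->state == SLOT_USED && strcmp(s->name, name) == 0)
            return (int)idx;
    }
    return -1;
}

/* Take the first free hash value on the chain; -1 if present or full. */
int dir_insert(struct dir *d, const char *name)
{
    unsigned h = toy_hash(name);
    int free_idx = -1;
    if (strlen(name) >= DIR_NAME_LEN)
        return -1;
    for (unsigned i = 0; i < HASH_SPACE; i++) {
        unsigned idx = (h + i) % HASH_SPACE;
        const struct slot *s = &d->slots[idx];
        if (s->state == SLOT_USED) {
            if (strcmp(s->name, name) == 0)
                return -1;                  /* name already exists */
            continue;                       /* collision: increment, retry */
        }
        if (free_idx < 0)
            free_idx = (int)idx;            /* place holders may be reused */
        if (s->state == SLOT_HOLE)
            break;
    }
    if (free_idx < 0)
        return -1;                          /* toy table is full */
    d->slots[free_idx].state = SLOT_USED;
    strcpy(d->slots[free_idx].name, name);
    return free_idx;
}

/* Delete: keep a place holder if the next hash is in use, else free the
 * slot and any place holders immediately before it. */
int dir_remove(struct dir *d, const char *name)
{
    int idx = dir_lookup(d, name);
    if (idx < 0)
        return -1;
    unsigned next = ((unsigned)idx + 1) % HASH_SPACE;
    if (d->slots[next].state != SLOT_HOLE) {
        d->slots[idx].state = SLOT_PLACEHOLDER;
        return 0;
    }
    d->slots[idx].state = SLOT_HOLE;
    unsigned prev = ((unsigned)idx + HASH_SPACE - 1) % HASH_SPACE;
    while (d->slots[prev].state == SLOT_PLACEHOLDER) {
        d->slots[prev].state = SLOT_HOLE;
        prev = (prev + HASH_SPACE - 1) % HASH_SPACE;
    }
    return 0;
}
```

With this toy hash the names "ab", "ba" and "c" all hash to the same value, so inserting them in that order exercises the increment-and-retry path, and removing "ba" before "c" exercises the place holder clean-up. A real filesystem would of course keep the entries in its on-disk index rather than a flat array.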
Re: [PATCH 6/13] maps#2: Move the page walker code to lib/
Matt Mackall wrote:
> On Wed, Apr 11, 2007 at 04:35:44PM +1000, Nick Piggin wrote:
> > Matt Mackall wrote:
> > > Move the page walker code to lib/
> > >
> > > This lets it get shared outside of proc/ and linked in only when
> > > needed.
> >
> > Still should go into mm/
> >
> > If it had, you might have also noticed your pagetable walking code is
> > completely different from how everyone else does it, and fixed that too.
>
> I actually did notice that, when I compared it to jsgf's page walking
> code for Xen.

Can you fix it then, since you are doing the big reorganisation?

> > BTW. Is it the case that unused and unexported symbols don't get pruned
> > by the linker except inside lib/?
>
> Yes, that's been my point all along. It also currently only happens at
> the granularity of an object file, not a symbol, FYI.

You have the config symbols, don't you? Please use them in the makefile
to prevent compiling and linking.

My point all along is that it belongs in mm/. If you don't want to do it
then I can't make you, but I'll submit a patch to use the correct page
table walking conventions and move it to mm/.

--
SUSE Labs, Novell Inc.
Re: Why kmem_cache_free occupy CPU for more than 10 seconds?
On 4/11/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> On Wed, 2007-04-11 at 17:53 +0800, Zhao Forrest wrote:
> > I got some new information:
> > Before soft lockup message is out, we have:
> > [EMAIL PROTECTED] home]# cat /proc/slabinfo |grep buffer_head
> > buffer_head 10927942 10942560120 321 : tunables 32
> > 168 : slabdata 341955 341955 6 : globalstat 37602996 11589379
> > 11743736 01 6918 12166031 1013708
> > : cpustat 35254590 2350698 13610965 907286
> >
> > Then after buffer_head is freed, we have:
> > [EMAIL PROTECTED] home]# cat /proc/slabinfo |grep buffer_head
> > buffer_head 9542 36384120 321 : tunables 32 16
> > 8 : slabdata 1137 1137245 : globalstat 37602996 11589379
> > 11743736 01 6983 20507478
> > 1708818 : cpustat 35254625 2350704 16027174 1068367
> >
> > Does this huge number of buffer_head cause the soft lockup?
>
> __blkdev_put() takes the BKL and bd_mutex
> invalidate_mapping_pages() tries to take the PageLock
>
> But no other locks seem held while free_buffer_head() is called
>
> All these locks are preemptible (CONFIG_PREEMPT_BKL?=y) and should not
> hog the cpu like that, what preemption mode have you got selected?
> (CONFIG_PREEMPT_VOLUNTARY?=y)

These 2 kernel options are turned on by default in my kernel. Here's a
snip from .config:

# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
CONFIG_NUMA=y
CONFIG_K8_NUMA=y

> Does this fix it?
>
> --- fs/buffer.c~	2007-02-01 12:00:34.0 +0100
> +++ fs/buffer.c	2007-04-11 12:35:48.0 +0200
> @@ -3029,6 +3029,8 @@ out:
>  			struct buffer_head *next = bh->b_this_page;
>  			free_buffer_head(bh);
>  			bh = next;
> +
> +			cond_resched();
>  		} while (bh != buffers_to_free);
>  	}
>  	return ret;

So far I have run the test with the patched kernel for 6 rounds and
didn't see the soft lockup. I think this patch should fix the problem.

What still confuses me is why we need to invoke cond_resched()
explicitly when CONFIG_PREEMPT_VOLUNTARY and CONFIG_PREEMPT_BKL are both
turned on. From my understanding, these 2 options should make scheduling
happen even if the CPU is under heavy load.

Thanks,
Forrest
help me on 2.6.15 kernel on MPC850
Hey Guys,

The CPU was spinning when executing in head_8xx.S::initial_mmu. I have
located the instruction that causes the spinning using the on-board
LEDs. Do you have any ideas?

#ifdef CONFIG_8xx_COPYBACK
	mtspr	SPRN_DC_CST, r8		<<< The spinning was caused by this instruction
#endif
#else
	/*
	 * For a debug option, I left this here to easily enable
	 * the write through cache mode
	 */
	lis	r8, [EMAIL PROTECTED]
	mtspr	SPRN_DC_CST, r8
	lis	r8, [EMAIL PROTECTED]
	mtspr	SPRN_DC_CST, r8
#endif
	blr

-
The output from u-boot:

=> tftp 0x10 uImage
Using SCC ETHERNET device
TFTP from server 90.0.0.3; our IP address is 90.0.0.50
Filename 'uImage'.
Load address: 0x10
Loading: # # ###
done
Bytes transferred = 844738 (ce3c2 hex)
=> bootm
## Booting image at 0010 ...
   Image Name: Linux-2.6.15
   Created: 2007-04-07 17:25:29 UTC
   Image Type: PowerPC Linux Kernel Image (gzip compressed)
   Data Size: 844674 Bytes = 824.9 kB
   Load Address:
   Entry Point:
   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
<<< The CPU is spinning then

-
The hardware, bootloader and Linux I'm using:

1) MPC850, 1KB DCache + 2KB ICache
2) u-boot 1.1.4 (data cache is turned off all the way, but instruction
   cache is always on)
3) ppclinux 2.6.15

Thanks,
Gavin
Re: [patch 00/31] [00/@[EMAIL PROTECTED] -stable review
(hm, quilt messed up that subject, sorry...)

On Wed, Apr 11, 2007 at 03:51:00PM -0700, Greg KH wrote:
> This is the start of the stable review cycle for the 2.6.20.7 release.

And here's the rolled up patch (note the new subdirectory):
	kernel.org/pub/linux/kernel/v2.6/stable-review/patch-2.6.20.7-rc1.gz

thanks,

greg k-h
Re: [PATCH] make MADV_FREE lazily free memory
Nick Piggin wrote:
> Eric Dumazet wrote:
> > > Two things can happen here.
> > >
> > > If this program used the pages before the kernel needed
> > > them, the program will be reusing its old pages.
> >
> > ah ok, this is because accessed/dirty bits are set by hardware and not
> > a page fault.
>
> No it isn't.

That is to say, it isn't required for correctness. But if the
question was about avoiding a fault, then yes ;)

But as Linus recently said, even hardware handled faults still
take expensive microarchitectural traps.

> > Is it true for all architectures ?
>
> No.

--
SUSE Labs, Novell Inc.
Re: [PATCH] make MADV_FREE lazily free memory
Eric Dumazet wrote:
> Rik van Riel wrote:
> > Eric Dumazet wrote:
> > > Rik van Riel wrote:
> > > > Make it possible for applications to have the kernel free memory
> > > > lazily. This reduces a repeated free/malloc cycle from freeing
> > > > pages and allocating them, to just marking them freeable. If the
> > > > application wants to reuse them before the kernel needs the memory,
> > > > not even a page fault will happen.
> > >
> > > I dont understand this last sentence. If not even a page fault
> > > happens, how the kernel knows that the page was eventually reused by
> > > the application, and should not be freed in case of memory pressure ?
> >
> > Before maybe freeing the page, the kernel checks the referenced
> > and dirty bits of the page table entries mapping that page.
> >
> > > ptr = mmap(some space);
> > > madvise(ptr, length, MADV_FREE);
> > > /* kernel may free the pages */
> >
> > All this call does is:
> > - clear the accessed and dirty bits
> > - move the page to the far end of the inactive list,
> >   where it will be the first to be reclaimed
> >
> > > sleep(10);
> > >
> > > /* what the application must do know before reusing space ? */
> > > memset(ptr, data, 1);
> > > /* kernel should not free ptr[0..1] now */
> >
> > Two things can happen here.
> >
> > If this program used the pages before the kernel needed
> > them, the program will be reusing its old pages.
>
> ah ok, this is because accessed/dirty bits are set by hardware and not a
> page fault.

No it isn't.

> Is it true for all architectures ?

No.

--
SUSE Labs, Novell Inc.
Re: [PATCH] make MADV_FREE lazily free memory
Rik van Riel wrote:
> Eric Dumazet wrote:
> > Rik van Riel wrote:
> > > Make it possible for applications to have the kernel free memory
> > > lazily. This reduces a repeated free/malloc cycle from freeing
> > > pages and allocating them, to just marking them freeable. If the
> > > application wants to reuse them before the kernel needs the memory,
> > > not even a page fault will happen.
> >
> > I dont understand this last sentence. If not even a page fault happens,
> > how the kernel knows that the page was eventually reused by the
> > application, and should not be freed in case of memory pressure ?
>
> Before maybe freeing the page, the kernel checks the referenced
> and dirty bits of the page table entries mapping that page.
>
> > ptr = mmap(some space);
> > madvise(ptr, length, MADV_FREE);
> > /* kernel may free the pages */
>
> All this call does is:
> - clear the accessed and dirty bits
> - move the page to the far end of the inactive list,
>   where it will be the first to be reclaimed
>
> > sleep(10);
> >
> > /* what the application must do know before reusing space ? */
> > memset(ptr, data, 1);
> > /* kernel should not free ptr[0..1] now */
>
> Two things can happen here.
>
> If this program used the pages before the kernel needed
> them, the program will be reusing its old pages.

ah ok, this is because accessed/dirty bits are set by hardware and not a
page fault. Is it true for all architectures ?

> If the kernel got there first, you will get page faults
> and the kernel will fill in the memory with new pages.

perfect

> Both of these alternatives are transparent to userspace.

Thanks a lot for these clarifications. This will fly :)
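The usage pattern discussed in this thread can be written out as a small userspace C sketch. It assumes a Linux system whose headers define MADV_FREE (the thread is about a proposed patch, so behaviour on any given kernel is an assumption here); if the flag is missing the hint is simply skipped. `lazy_free_demo` is a hypothetical helper name. The point it illustrates is the one made above: after the hint, the application just writes to the range again, and both outcomes (old pages reused, or fresh pages faulted in) look the same from userspace.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Map anonymous memory, dirty it, hint that it may be freed lazily,
 * then reuse it. Returns 0 if the reused range holds the new data. */
int lazy_free_demo(size_t len, unsigned char fill)
{
    assert(len > 0);

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAA, len);           /* first use: pages become dirty */

#ifdef MADV_FREE
    /* Hint only: the kernel may reclaim these pages under pressure.
     * A failure (for example EINVAL on an older kernel) is not fatal. */
    (void)madvise(p, len, MADV_FREE);
#endif

    /* Reuse without any special call. Either the old pages are still
     * there, or the kernel reclaimed them and supplies new ones on fault. */
    memset(p, fill, len);

    int ok = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != fill) {
            ok = 0;
            break;
        }
    }

    munmap(p, len);
    return ok ? 0 : -1;
}
```

Note that the contents written before the hint must be treated as lost: the sketch overwrites the whole range before reading anything back.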
Re: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6
Bartek wrote:
> Hopefully, this time it my bug report should be ok :):
>
> Apr 11 23:53:38 localhost pppd[31289]: rcvd [proto=0x7689] e1 cd 33 f6
> fd f7 52 e6 58 c9 73 98 bc ff ad d5 b5 a3 e5 d9 1e 77 76 0a 1c 87 59
> bf 44 cc ac 3b ...
> Apr 11 23:53:38 localhost pppd[31289]: Unsupported protocol 0x7689 received
> Apr 11 23:53:38 localhost pppd[31289]: sent [LCP ProtRej id=0x9 76 89
> e1 cd 33 f6 fd f7 52 e6 58 c9 73 98 bc ff ad d5 b5 a3 e5 d9 1e 77 76
> 0a 1c 87 59 bf 44 cc ...]
> Apr 11 23:53:38 localhost pppd[31289]: rcvd [proto=0xda7d] 15 19 45 3c
> e0 ac 44 92 3b c4 8e 75 6b b8 4a 9f 4a 3a 22 63 d3 a1 56 98 47 62 bc
> cd a6 8e d5 77 ...
> Apr 11 23:53:38 localhost pppd[31289]: Unsupported protocol 0xda7d received
> Apr 11 23:53:38 localhost pppd[31289]: sent [LCP ProtRej id=0xa da 7d
> 15 19 45 3c e0 ac 44 92 3b c4 8e 75 6b b8 4a 9f 4a 3a 22 63 d3 a1 56
> 98 47 62 bc cd a6 8e ...]
> Apr 11 23:53:40 localhost kernel: skb_under_panic: text:f8c62c0e
> len:291 put:1 head:ddc94800 data:ddc947ff tail:ddc94922 end:ddc94e00
> dev:

It seems we fail to reserve enough headroom for the case
buf[0] == PPP_ALLSTATIONS and buf[1] != PPP_UI.

Can you try this patch please?

diff --git a/drivers/net/ppp_async.c b/drivers/net/ppp_async.c
index 933e2f3..c68e37f 100644
--- a/drivers/net/ppp_async.c
+++ b/drivers/net/ppp_async.c
@@ -890,6 +890,8 @@ ppp_async_input(struct asyncppp *ap, const unsigned char *buf,
 				ap->rpkt = skb;
 			}
 			if (skb->len == 0) {
+				int headroom = 0;
+
 				/* Try to get the payload 4-byte aligned.
 				 * This should match the
 				 * PPP_ALLSTATIONS/PPP_UI/compressed tests in
@@ -897,7 +899,10 @@ ppp_async_input(struct asyncppp *ap, const unsigned char *buf,
 				 * enough chars here to test buf[1] and buf[2].
 				 */
 				if (buf[0] != PPP_ALLSTATIONS)
-					skb_reserve(skb, 2 + (buf[0] & 1));
+					headroom += 2;
+				if (buf[0] & 1)
+					headroom += 1;
+				skb_reserve(skb, headroom);
 			}
 			if (n > skb_tailroom(skb)) {
 				/* packet overflowed MRU */
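To see what the patch above changes, the two headroom computations can be compared in plain userspace C. This is a sketch, not driver code: `headroom_before` and `headroom_after` are hypothetical helpers that mirror the removed and added lines of the diff, and PPP_ALLSTATIONS is taken to be 0xff as in the kernel's PPP definitions.

```c
#include <assert.h>

#define PPP_ALLSTATIONS 0xff    /* PPP all-stations address byte */

/* Headroom reserved by the old code: nothing at all when the first
 * byte is PPP_ALLSTATIONS. */
int headroom_before(unsigned char first)
{
    if (first != PPP_ALLSTATIONS)
        return 2 + (first & 1);
    return 0;
}

/* Headroom reserved by the patched code: the odd-byte adjustment is
 * applied independently of the PPP_ALLSTATIONS test. */
int headroom_after(unsigned char first)
{
    int headroom = 0;

    if (first != PPP_ALLSTATIONS)
        headroom += 2;
    if (first & 1)
        headroom += 1;

    assert(headroom >= 0 && headroom <= 3);
    return headroom;
}
```

For every first byte other than 0xff the two functions agree. For 0xff the old code reserves 0 bytes and the patched code reserves 1, which matches the oops in the report: `put:1` with `data` exactly one byte below `head`.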
Re: tmpfs and the OOM killer
On Thursday 12 April 2007 02:04, Al Boldi wrote:
>
> Pedro wrote:
> > 2) How should an application be written to not be killed by OOM?
>
> Try this:
>
> # echo -17 > /proc//oom_adj

I should have known that running a fail-safe application is a superuser
privilege. Sorry for wasting your time.
[PATCH 4/4] Pte simplify ops.patch
Add comment and condense code to make use of native_local_ptep_get_and_clear
function. Also, it turns out the 2-level and 3-level paging definitions were
identical, so move the common definition into pgtable.h

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r b3bbc1b5e085 include/asm-i386/pgtable-2level.h
--- a/include/asm-i386/pgtable-2level.h	Wed Apr 11 18:23:44 2007 -0700
+++ b/include/asm-i386/pgtable-2level.h	Wed Apr 11 18:24:07 2007 -0700
@@ -39,16 +39,6 @@ static inline void native_pte_clear(stru
 static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *xp)
 {
 	*xp = __pte(0);
-}
-
-/* local pte updates need not use xchg for locking */
-static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
-{
-	pte_t res;
-
-	res = *ptep;
-	native_pte_clear(NULL, 0, ptep);
-	return res;
 }
 
 #ifdef CONFIG_SMP
diff -r b3bbc1b5e085 include/asm-i386/pgtable-3level.h
--- a/include/asm-i386/pgtable-3level.h	Wed Apr 11 18:23:44 2007 -0700
+++ b/include/asm-i386/pgtable-3level.h	Wed Apr 11 18:23:49 2007 -0700
@@ -139,16 +139,6 @@ static inline void pud_clear (pud_t * pu
 #define pmd_offset(pud, address) ((pmd_t *) pud_page(*(pud)) + \
 			pmd_index(address))
 
-/* local pte updates need not use xchg for locking */
-static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
-{
-	pte_t res;
-
-	res = *ptep;
-	native_pte_clear(NULL, 0, ptep);
-	return res;
-}
-
 #ifdef CONFIG_SMP
 static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
 {
diff -r b3bbc1b5e085 include/asm-i386/pgtable.h
--- a/include/asm-i386/pgtable.h	Wed Apr 11 18:23:44 2007 -0700
+++ b/include/asm-i386/pgtable.h	Wed Apr 11 18:23:49 2007 -0700
@@ -267,6 +267,16 @@ static inline pte_t pte_mkhuge(pte_t pte
 #define pte_update_defer(mm, addr, ptep)	do { } while (0)
 #endif
 
+/* local pte updates need not use xchg for locking */
+static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
+{
+	pte_t res = *ptep;
+
+	/* Pure native function needs no input for mm, addr */
+	native_pte_clear(NULL, 0, ptep);
+	return res;
+}
+
 /*
  * We only update the dirty/accessed state if we set
  * the dirty bit by hand in the kernel, since the hardware
@@ -348,8 +358,11 @@ static inline pte_t ptep_get_and_clear_f
 {
 	pte_t pte;
 	if (full) {
-		pte = *ptep;
-		native_pte_clear(mm, addr, ptep);
+		/*
+		 * Full address destruction in progress; paravirt does not
+		 * care about updates and native needs no locking
+		 */
+		pte = native_local_ptep_get_and_clear(ptep);
 	} else {
 		pte = ptep_get_and_clear(mm, addr, ptep);
 	}
[PATCH 3/4] Pte xchg optimization.patch
In situations where page table updates need only be made locally, and there
are no cross-processor A/D bit races involved, we need not use the
heavyweight xchg instruction to atomically fetch and clear page table
entries. Instead, we can just read and clear them directly.

This introduces a neat optimization for non-SMP kernels; drop the atomic
xchg operations from page table updates.

Thanks to Michel Lespinasse for noting this potential optimization.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r 47495b2532b3 include/asm-i386/pgtable-2level.h
--- a/include/asm-i386/pgtable-2level.h	Wed Apr 11 18:23:01 2007 -0700
+++ b/include/asm-i386/pgtable-2level.h	Wed Apr 11 18:23:39 2007 -0700
@@ -41,10 +41,24 @@ static inline void native_pte_clear(stru
 	*xp = __pte(0);
 }
 
+/* local pte updates need not use xchg for locking */
+static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
+{
+	pte_t res;
+
+	res = *ptep;
+	native_pte_clear(NULL, 0, ptep);
+	return res;
+}
+
+#ifdef CONFIG_SMP
 static inline pte_t native_ptep_get_and_clear(pte_t *xp)
 {
 	return __pte(xchg(&xp->pte_low, 0));
 }
+#else
+#define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp)
+#endif
 
 #define pte_page(x)	pfn_to_page(pte_pfn(x))
 #define pte_none(x)	(!(x).pte_low)
diff -r 47495b2532b3 include/asm-i386/pgtable-3level.h
--- a/include/asm-i386/pgtable-3level.h	Wed Apr 11 18:23:01 2007 -0700
+++ b/include/asm-i386/pgtable-3level.h	Wed Apr 11 18:23:05 2007 -0700
@@ -139,6 +139,17 @@ static inline void pud_clear (pud_t * pu
 #define pmd_offset(pud, address) ((pmd_t *) pud_page(*(pud)) + \
 			pmd_index(address))
 
+/* local pte updates need not use xchg for locking */
+static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
+{
+	pte_t res;
+
+	res = *ptep;
+	native_pte_clear(NULL, 0, ptep);
+	return res;
+}
+
+#ifdef CONFIG_SMP
 static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
 {
 	pte_t res;
@@ -150,6 +161,9 @@ static inline pte_t native_ptep_get_and_
 
 	return res;
 }
+#else
+#define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp)
+#endif
 
 #define __HAVE_ARCH_PTE_SAME
 static inline int pte_same(pte_t a, pte_t b)
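The difference between the two fetch-and-clear variants in this patch can be illustrated outside the kernel with C11 atomics. This is a loose userspace analogy under stated assumptions: `toy_pte_t` and the `toy_*` functions are hypothetical, `atomic_exchange` stands in for the kernel's xchg(), and relaxed load/store stand in for the plain read and clear of the local variant.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 2-level pte: a single 32-bit word. */
typedef struct {
    _Atomic uint32_t pte_low;
} toy_pte_t;

void toy_pte_set(toy_pte_t *p, uint32_t val)
{
    assert(p != NULL);
    atomic_store(&p->pte_low, val);
}

/* SMP-style variant: one indivisible exchange, so a concurrent update
 * of the entry by another agent cannot be lost between read and clear. */
uint32_t toy_get_and_clear_atomic(toy_pte_t *p)
{
    assert(p != NULL);
    return atomic_exchange(&p->pte_low, 0);
}

/* Local variant: separate read and clear. Only safe when nothing else
 * can modify the entry in between, which is the precondition the patch
 * description states for local updates. */
uint32_t toy_get_and_clear_local(toy_pte_t *p)
{
    assert(p != NULL);
    uint32_t old = atomic_load_explicit(&p->pte_low, memory_order_relaxed);
    atomic_store_explicit(&p->pte_low, 0, memory_order_relaxed);
    return old;
}
```

In a single-threaded run both variants return the same old value and leave the entry zero; the cheaper one is chosen precisely when that single-writer assumption holds.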
[PATCH 1/4] Pte drop ptep_get_and_clear paravirt op.patch
In shadow mode hypervisors, ptep_get_and_clear achieves the desired
purpose of keeping the shadows in sync by issuing a native_get_and_clear,
followed by a call to pte_update, which indicates the PTE has been
modified.

Direct mode hypervisors (Xen) have no need for this anyway, and will trap
the update using writable pagetables.

This means no hypervisor makes use of ptep_get_and_clear; there is no
reason to have it in the paravirt-ops structure. Change confusing
terminology about raw vs. native functions into consistent use of
native_pte_xxx for operations which do not invoke paravirt-ops.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r c02c6f5e882c arch/i386/kernel/paravirt.c
--- a/arch/i386/kernel/paravirt.c	Wed Apr 11 16:25:09 2007 -0700
+++ b/arch/i386/kernel/paravirt.c	Wed Apr 11 17:09:55 2007 -0700
@@ -315,8 +315,6 @@ struct paravirt_ops paravirt_ops = {
 	.pte_update = paravirt_nop,
 	.pte_update_defer = paravirt_nop,
 
-	.ptep_get_and_clear = native_ptep_get_and_clear,
-
 #ifdef CONFIG_HIGHPTE
 	.kmap_atomic_pte = kmap_atomic,
 #endif
diff -r c02c6f5e882c include/asm-i386/paravirt.h
--- a/include/asm-i386/paravirt.h	Wed Apr 11 16:25:09 2007 -0700
+++ b/include/asm-i386/paravirt.h	Wed Apr 11 17:12:03 2007 -0700
@@ -187,8 +187,6 @@ struct paravirt_ops
 	void (*pte_update)(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
 	void (*pte_update_defer)(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
-
-	pte_t (*ptep_get_and_clear)(pte_t *ptep);
 
 #ifdef CONFIG_HIGHPTE
 	void *(*kmap_atomic_pte)(struct page *page, enum km_type type);
@@ -859,12 +857,8 @@ static inline void pmd_clear(pmd_t *pmdp
 	PVOP_VCALL1(pmd_clear, pmdp);
 }
 
-static inline pte_t raw_ptep_get_and_clear(pte_t *p)
-{
-	unsigned long long val = PVOP_CALL1(unsigned long long, ptep_get_and_clear, p);
-	return (pte_t) { val, val >> 32 };
-}
 #else  /* !CONFIG_X86_PAE */
+
 static inline pte_t __pte(unsigned long val)
 {
 	return (pte_t) { PVOP_CALL1(unsigned long, make_pte, val) };
@@ -899,11 +893,6 @@ static inline void set_pmd(pmd_t *pmdp,
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmdval)
 {
 	PVOP_VCALL2(set_pmd, pmdp, pmdval.pud.pgd.pgd);
-}
-
-static inline pte_t raw_ptep_get_and_clear(pte_t *p)
-{
-	return (pte_t) { PVOP_CALL1(unsigned long, ptep_get_and_clear, p) };
 }
 #endif	/* CONFIG_X86_PAE */
 
diff -r c02c6f5e882c include/asm-i386/pgtable.h
--- a/include/asm-i386/pgtable.h	Wed Apr 11 16:25:09 2007 -0700
+++ b/include/asm-i386/pgtable.h	Wed Apr 11 17:11:22 2007 -0700
@@ -265,8 +265,6 @@ static inline pte_t pte_mkhuge(pte_t pte
  */
 #define pte_update(mm, addr, ptep)		do { } while (0)
 #define pte_update_defer(mm, addr, ptep)	do { } while (0)
-
-#define raw_ptep_get_and_clear(xp)	native_ptep_get_and_clear(xp)
 #endif
 
 /*
@@ -340,7 +338,7 @@ do {									\
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-	pte_t pte = raw_ptep_get_and_clear(ptep);
+	pte_t pte = native_ptep_get_and_clear(ptep);
 	pte_update(mm, addr, ptep);
 	return pte;
 }
[PATCH 2/4] Pte clear optimization.patch
When exiting from an address space, no special hypervisor notification of
page table updates needs to occur; direct page table hypervisors, such as
Xen, switch to another address space first (init_mm) and unprotect the
page tables to avoid the cost of trapping to the hypervisor for each
pte_clear. Shadow mode hypervisors, such as VMI and lhype, don't need to do
the extra work of calling through paravirt-ops, and can just directly
clear the page table entries without notifying the hypervisor, since all
the page tables are about to be freed.

So introduce native_pte_clear functions which bypass any paravirt-ops
notification. This results in a significant performance win for VMI
and removes some indirect calls from zap_pte_range.

Note the 3-level paging already had a native_pte_clear function, thus
demanding argument conformance and extra args for the 2-level definition.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

diff -r 1478ce4ec9e3 include/asm-i386/pgtable-2level.h
--- a/include/asm-i386/pgtable-2level.h	Wed Apr 11 17:13:10 2007 -0700
+++ b/include/asm-i386/pgtable-2level.h	Wed Apr 11 18:22:51 2007 -0700
@@ -35,6 +35,11 @@ static inline void native_set_pmd(pmd_t
 
 #define pte_clear(mm,addr,xp)	do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
 #define pmd_clear(xp)	do { set_pmd(xp, __pmd(0)); } while (0)
+
+static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *xp)
+{
+	*xp = __pte(0);
+}
 
 static inline pte_t native_ptep_get_and_clear(pte_t *xp)
 {
diff -r 1478ce4ec9e3 include/asm-i386/pgtable.h
--- a/include/asm-i386/pgtable.h	Wed Apr 11 17:13:10 2007 -0700
+++ b/include/asm-i386/pgtable.h	Wed Apr 11 18:21:43 2007 -0700
@@ -349,7 +349,7 @@ static inline pte_t ptep_get_and_clear_f
 	pte_t pte;
 	if (full) {
 		pte = *ptep;
-		pte_clear(mm, addr, ptep);
+		native_pte_clear(mm, addr, ptep);
 	} else {
 		pte = ptep_get_and_clear(mm, addr, ptep);
 	}
[PATCH 0/4] i386 - pte update optimizations
Some PTE optimizations for native and paravirt-ops kernels; this provides a
huge win for shadow mode hypervisors and gets rid of some unnecessary
atomic instructions in native kernels, saving even more on UP by getting
rid of implicit LOCK on xchg instruction.

Zach
Re: [PATCH] NET: [UPDATED] Multiqueue network device support implementation.
On Wed, 2007-04-11 at 19:03 +0200, Patrick McHardy wrote:
> > You bring up a good point, it would be good to hear the opinion from
> > one of the wireless people on this since they have their own
> > multiqueue scheduler in the wireless-dev tree.

The one in wireless-dev is pretty much like this one. It existed only
because there was no such multiqueue-aware qdisc available at that
time.

The requirement for wireless is the same as for the strict PRIO, with
the addition that the NIC hardware queue corresponding to the dequeued
SKB must be active (this is also true for other devices, I think;
otherwise the SKB has to be requeued, which leads to a busy or dead loop
in the end). In other words, the dequeue method should select the SKB
with the highest priority from all the ACTIVE hardware queues (not all
queues). The wireless hardware then schedules all the packets from its 4
hardware TX queues based on the priority and network environment.

Thanks,
-yi
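The dequeue rule described in the message above (strict priority, but only among hardware queues that are currently active) can be sketched in userspace C. Everything here is a hypothetical model, not the qdisc API: `struct toy_sched`, `toy_enqueue` and `toy_dequeue` are made-up names, packets are plain ints, and band 0 is taken to be the highest priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_BANDS 4     /* e.g. the 4 hardware TX queues mentioned above */
#define BAND_DEPTH 8    /* toy per-band queue depth */

struct toy_band {
    int pkts[BAND_DEPTH];   /* ring buffer of queued "packets" */
    size_t head;
    size_t len;
    bool hw_active;         /* is the matching hardware queue running? */
};

struct toy_sched {
    struct toy_band band[NUM_BANDS];    /* band 0 = highest priority */
};

/* Queue a packet on a band; false if the band is invalid or full. */
bool toy_enqueue(struct toy_sched *s, size_t band, int pkt)
{
    assert(s != NULL);
    if (band >= NUM_BANDS)
        return false;
    struct toy_band *b = &s->band[band];
    if (b->len == BAND_DEPTH)
        return false;
    b->pkts[(b->head + b->len) % BAND_DEPTH] = pkt;
    b->len++;
    return true;
}

/* Strict priority over ACTIVE bands only: a stopped hardware queue is
 * skipped, so its packets are never dequeued just to be requeued. */
bool toy_dequeue(struct toy_sched *s, int *pkt_out)
{
    assert(s != NULL && pkt_out != NULL);
    for (size_t i = 0; i < NUM_BANDS; i++) {
        struct toy_band *b = &s->band[i];
        if (!b->hw_active || b->len == 0)
            continue;
        *pkt_out = b->pkts[b->head];
        b->head = (b->head + 1) % BAND_DEPTH;
        b->len--;
        return true;
    }
    return false;
}
```

If band 0 holds a packet but its hardware queue is stopped, `toy_dequeue` hands out the packet from the next active band instead, and reports nothing to send once only inactive bands have packets left.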
Re: tmpfs and the OOM killer
On Wednesday 11 April 2007 19:39, Alan Cox wrote:
> > 2) How should an application be written to not be killed by OOM?
>
> OOM isn't an application matter. The kernel has to choose between
> allowing overcommit on the basis it might run out of memory and have to
> kill stuff, or that it won't in which case an application which correctly
> handles malloc() and similar failures will not be killed (unless it is
> out of space on a stack grow which is a C language flaw as you can't
> catch that event in C)
>
> It's configured by /proc/sys/vm/overcommit_memory
>
> 0 - try and spot obviously dumb allocations
> 1 - anything goes
> 2 - strictly control resource commit

I deduce that a fail-safe application must scanf overcommit_memory, warn
the user and waitpid.
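The "scanf overcommit_memory" step mentioned above could look roughly like the following C sketch. It assumes a Linux /proc layout with the path quoted in the message; `parse_overcommit`, `read_overcommit` and `describe_overcommit` are hypothetical helper names, and the descriptions are the three listed above. Warning the user and supervising a child with waitpid() are left out.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the file contents; returns 0, 1 or 2, or -1 if unrecognised. */
int parse_overcommit(const char *text)
{
    int mode;
    assert(text != NULL);
    if (sscanf(text, "%d", &mode) != 1 || mode < 0 || mode > 2)
        return -1;
    return mode;
}

/* Read the current policy; -1 if the file is missing or unreadable. */
int read_overcommit(void)
{
    char buf[16];
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_overcommit(buf);
}

/* Short description of each mode, using the wording from the message. */
const char *describe_overcommit(int mode)
{
    switch (mode) {
    case 0: return "try and spot obviously dumb allocations";
    case 1: return "anything goes";
    case 2: return "strictly control resource commit";
    default: return "unknown";
    }
}
```

A cautious program might call `read_overcommit()` at start-up and print a warning unless the result is 2, since only in that mode can it expect malloc() failures instead of being killed.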
Re: kernel BUG at net/core/skbuff.c in linux-2.6.21-rc6
> It is not enough to unload proprietary modules. As long as they have
> ever been loaded at all the kernel is tainted. You need to ensure that
> the proprietary modules never get loaded at all. I guess you probably
> already worked that out, just wanted to point it out just in case :-)

Hopefully, this time my bug report should be OK :):

Apr 11 23:53:38 localhost pppd[31289]: rcvd [proto=0x7689] e1 cd 33 f6 fd f7 52 e6 58 c9 73 98 bc ff ad d5 b5 a3 e5 d9 1e 77 76 0a 1c 87 59 bf 44 cc ac 3b ...
Apr 11 23:53:38 localhost pppd[31289]: Unsupported protocol 0x7689 received
Apr 11 23:53:38 localhost pppd[31289]: sent [LCP ProtRej id=0x9 76 89 e1 cd 33 f6 fd f7 52 e6 58 c9 73 98 bc ff ad d5 b5 a3 e5 d9 1e 77 76 0a 1c 87 59 bf 44 cc ...]
Apr 11 23:53:38 localhost pppd[31289]: rcvd [proto=0xda7d] 15 19 45 3c e0 ac 44 92 3b c4 8e 75 6b b8 4a 9f 4a 3a 22 63 d3 a1 56 98 47 62 bc cd a6 8e d5 77 ...
Apr 11 23:53:38 localhost pppd[31289]: Unsupported protocol 0xda7d received
Apr 11 23:53:38 localhost pppd[31289]: sent [LCP ProtRej id=0xa da 7d 15 19 45 3c e0 ac 44 92 3b c4 8e 75 6b b8 4a 9f 4a 3a 22 63 d3 a1 56 98 47 62 bc cd a6 8e ...]
Apr 11 23:53:40 localhost kernel: skb_under_panic: text:f8c62c0e len:291 put:1 head:ddc94800 data:ddc947ff tail:ddc94922 end:ddc94e00 dev:
Apr 11 23:53:40 localhost kernel: [ cut here ]
Apr 11 23:53:40 localhost kernel: kernel BUG at net/core/skbuff.c:111!
Apr 11 23:53:40 localhost kernel: invalid opcode: [#1]
Apr 11 23:53:40 localhost kernel: Modules linked in: nfs nfsd exportfs lockd nfs_acl sunrpc button xt_TCPMSS xt_limit xt_tcpudp nf_nat_irc nf_nat_ftp iptable_nat iptable_mangle ipt_LOG ipt_MASQUERADE nf_nat ipt_TOS ipt_REJECT nf_conntrack_irc nf_conntrack_ftp nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables x_tables ppp_async ipv6 ppp_generic slhc xfs fuse eeprom w83781d w83627hf hwmon_vid i2c_isa ide_generic parport_pc parport i2c_viapro floppy i2c_core serio_raw snd_via82xx snd_ac97_codec ac97_bus snd_pcm snd_timer snd_page_alloc snd_mpu401_uart via_ircc snd_rawmidi snd_seq_device irda rtc psmouse via_agp agpgart snd soundcore pcspkr crc_ccitt evdev ext3 jbd mbcache usbhid ide_cd cdrom ide_disk generic uhci_hcd usbcore via82cxxx ide_core e100 mii thermal processor fan
Apr 11 23:53:40 localhost kernel: CPU:0
Apr 11 23:53:40 localhost kernel: EIP:0060:[]Not tainted VLI
Apr 11 23:53:40 localhost kernel: EFLAGS: 00010096 (2.6.21-rc6 #3)
Apr 11 23:53:40 localhost kernel: EIP is at skb_under_panic+0x59/0x5d
Apr 11 23:53:40 localhost kernel: eax: 0072 ebx: ddc94800 ecx: edx:
Apr 11 23:53:40 localhost kernel: esi: edi: ddc94924 ebp: ddc9491e esp: c1ce5ed8
Apr 11 23:53:40 localhost kernel: ds: 007b es: 007b fs: 00d8 gs: ss: 0068
Apr 11 23:53:40 localhost kernel: Process events/0 (pid: 3, ti=c1ce4000 task=dfd02030 task.ti=c1ce4000)
Apr 11 23:53:40 localhost kernel: Stack: c02c47d0 f8c62c0e 0123 0001 ddc94800 ddc947ff ddc94922 ddc94e00
Apr 11 23:53:40 localhost kernel:c02b7ed8 dfd23a60 00ff f8c62c13 0282 dfff5c20 f7e67c00 0208
Apr 11 23:53:40 localhost kernel:f0e45d34 f0e45c34 f7e67c00 0202 e0f7d600 0006 f0e45c00 f7e67c0c
Apr 11 23:53:40 localhost kernel: Call Trace:
Apr 11 23:53:40 localhost kernel: [] ppp_asynctty_receive+0x3b0/0x584 [ppp_async]
Apr 11 23:53:40 localhost kernel: [] ppp_asynctty_receive+0x3b5/0x584 [ppp_async]
Apr 11 23:53:40 localhost kernel: [] flush_to_ldisc+0xe6/0x124
Apr 11 23:53:40 localhost kernel: [] flush_to_ldisc+0x0/0x124
Apr 11 23:53:40 localhost kernel: [] run_workqueue+0x70/0x101
Apr 11 23:53:40 localhost kernel: [] worker_thread+0x105/0x12e
Apr 11 23:53:40 localhost kernel: [] default_wake_function+0x0/0xc
Apr 11 23:53:40 localhost kernel: [] worker_thread+0x0/0x12e
Apr 11 23:53:40 localhost kernel: [] kthread+0xa0/0xc8
Apr 11 23:53:40 localhost kernel: [] kthread+0x0/0xc8
Apr 11 23:53:40 localhost kernel: [] kernel_thread_helper+0x7/0x10
Apr 11 23:53:40 localhost kernel: ===
Apr 11 23:53:40 localhost kernel: Code: 00 00 89 5c 24 14 8b 98 a0 00 00 00 89 54 24 0c 89 5c 24 10 8b 40 60 89 4c 24 04 c7 04 24 d0 47 2c c0 89 44 24 08 e8 af c5 ef ff <0f> 0b eb fe 56 53 bb d8 7e 2b c0 83 ec 24 8b 70 14 85 f6 0f 45
Apr 11 23:53:40 localhost kernel: EIP: [] skb_under_panic+0x59/0x5d SS:ESP 0068:c1ce5ed8
Apr 11 23:54:01 localhost /USR/SBIN/CRON[32147]: (root) CMD (/usr/local/bin/pppd_test.sh)
Apr 11 23:54:31 localhost pppd[31289]: No response to 5 echo-requests
Apr 11 23:54:31 localhost pppd[31289]: Serial link appears to be disconnected.
Apr 11 23:54:31 localhost pppd[31289]: Connect time 34.0 minutes.
Apr 11 23:54:31 localhost pppd[31289]: Sent 6451377 bytes, received 21004296 bytes.
Apr 11 23:54:31 localhost pppd[31289]: Script /etc/ppp/ip-down started (pid 32149)
Apr 11 23:54:31 localhost pppd[31289]: sent [LCP TermReq id=0xb "Peer not responding"]
Apr 11 23:54:31 localhos
Re: tmpfs and the OOM killer
Pedro wrote: > On Wednesday 11 April 2007 16:48, Willy Tarreau wrote: > > On Wed, Apr 11, 2007 at 02:23:31AM -0300, Pedro wrote: > > > > > > As the OOM killer is not Posix, > > > > If you cannot control your application's memory usage, you'll have to > > finely tune the overcommit_ratio. > > 2) How should an application be written to not be killed by OOM? Try this: # echo -17 > /proc//oom_adj Or this: # echo 2 > /proc/sys/vm/overcommit_memory # echo 95 > /proc/sys/vm/overcommit_ratio Or this: # ulimit -v [max vm] Thanks, and good luck with the OOM killer! -- Al - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
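The `ulimit -v` suggestion also has a programmatic form. A minimal sketch, assuming a Linux/POSIX system: `setrlimit(RLIMIT_AS)` caps the address space of the calling process, so an oversized allocation fails with an error the application can handle rather than growing until the OOM killer steps in. The helper name `limit_address_space` is hypothetical. Note that `RLIMIT_AS` is expressed in bytes, whereas the shell's `ulimit -v` takes units of 1024 bytes.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Cap the soft address-space limit of the calling process.
 * Returns 0 on success, -1 on failure (errno set by the libc call).
 * The hard limit is left untouched; the request is clamped to it.
 */
int limit_address_space(rlim_t bytes)
{
	struct rlimit rl;

	assert(bytes > 0);
	if (getrlimit(RLIMIT_AS, &rl) != 0)
		return -1;
	if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
		bytes = rl.rlim_max;	/* cannot exceed the hard limit */
	rl.rlim_cur = bytes;
	return setrlimit(RLIMIT_AS, &rl);
}
```

An application would call this once at startup, for example `limit_address_space((rlim_t)512 << 20)`, and then treat a NULL return from `malloc()` as the signal to shed load.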
[KJ][PATCH] ROUND_UP macro cleanup in arch/sh64/kernel/pci_sh5.c
ROUND_UP macro cleanup, use ALIGN wherever appropriate. Signed-off-by: Milind Arun Choudhary <[EMAIL PROTECTED]> --- pci_sh5.c | 12 1 files changed, 4 insertions(+), 8 deletions(-) diff --git a/arch/sh64/kernel/pci_sh5.c b/arch/sh64/kernel/pci_sh5.c index 9dae689..11d1fef 100644 --- a/arch/sh64/kernel/pci_sh5.c +++ b/arch/sh64/kernel/pci_sh5.c @@ -376,8 +376,6 @@ irqreturn_t pcish5_serr_irq(int irq, void *dev_id, struct pt_regs *regs) return IRQ_NONE; } -#define ROUND_UP(x, a) (((x) + (a) - 1) & ~((a) - 1)) - static void __init pcibios_size_bridge(struct pci_bus *bus, struct resource *ior, struct resource *memr) @@ -434,8 +432,8 @@ pcibios_size_bridge(struct pci_bus *bus, struct resource *ior, mem_res.end -= mem_res.start; /* Align the sizes up by bridge rules */ - io_res.end = ROUND_UP(io_res.end, 4*1024) - 1; - mem_res.end = ROUND_UP(mem_res.end, 1*1024*1024) - 1; + io_res.end = ALIGN(io_res.end, 4*1024) - 1; + mem_res.end = ALIGN(mem_res.end, 1*1024*1024) - 1; /* Adjust the bridge's allocation requirements */ bridge->resource[0].end = bridge->resource[0].start + io_res.end; @@ -448,18 +446,16 @@ pcibios_size_bridge(struct pci_bus *bus, struct resource *ior, /* adjust parent's resource requirements */ if (ior) { - ior->end = ROUND_UP(ior->end, 4*1024); + ior->end = ALIGN(ior->end, 4*1024); ior->end += io_res.end; } if (memr) { - memr->end = ROUND_UP(memr->end, 1*1024*1024); + memr->end = ALIGN(memr->end, 1*1024*1024); memr->end += mem_res.end; } } -#undef ROUND_UP - static void __init pcibios_size_bridges(void) { struct resource io_res, mem_res; -- Milind Arun Choudhary
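A minimal user-space sketch of the rounding involved, assuming the alignment is a power of two. `align_up` is a hypothetical stand-in that uses the same arithmetic as the `ROUND_UP` definition removed by the patch; the kernel's own `ALIGN` macro is not reproduced here.

```c
#include <assert.h>

/* Round x up to the next multiple of a; a must be a power of two. */
unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);	/* power-of-two check */
	return (x + a - 1) & ~(a - 1);
}
```

For instance, `align_up(4097, 4*1024)` yields 8192, and subtracting 1 as the bridge-sizing code does turns that rounded size into an inclusive end offset (8191 here).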
Re: Linux 2.6.21-rc6
On 4/10/07, Linus Torvalds <[EMAIL PROTECTED]> wrote: On Tue, 10 Apr 2007, Jeff Chua wrote: > I couldn't get suspend-to-disk to work with 2.6.21-rc6. I've tried > set/unset CONFIG_NO_HZ/CONFIG_HPET_TIMER, but nothing worked. Do you think you could bisect it? You'd have to apply maxim's patch by hand at each bisection step (up until the point where it's already applied in the git tree, of course), so it's not a totally mindless bisection, but it should still be fairly painless, since there are only 277 commits between -rc5 and -rc6 (so bisection should rather quickly narrow it down) Linus, I did that last night and realized that I could suspend to disk/ram with 2.6.21-rc6 CONFIG_NO_HZ unset. I must have done something wrong before. Thank you, Jeff.
Re: [PATCH 3/7] [RFC] Battery monitoring class
On Thu, Apr 12, 2007 at 03:25:03AM +0400, Anton Vorontsov wrote: > Here is battery monitor class. According to first copyright string, we're > maintaining it since 2003. I've took few days and cleaned it up to be > more suitable for mainline inclusion. > > It differs from battery class at git://git.infradead.org/battery-2.6.git: Why fork from David's work? Does he not like these changes for some reason? > +static int battery_create_attrs(struct battery *bat) > +{ > + int rc; > + > + #define create_bat_attr_conditional(name)\ > + if(bat->get_##name) {\ > + rc = device_create_file(bat->dev, &dev_attr_##name); \ > + if (rc) goto name##_failed; \ > + } > + > + create_bat_attr_conditional(status); > + create_bat_attr_conditional(min_voltage); > + create_bat_attr_conditional(min_current); > + create_bat_attr_conditional(min_capacity); > + create_bat_attr_conditional(max_voltage); > + create_bat_attr_conditional(max_current); > + create_bat_attr_conditional(max_capacity); > + create_bat_attr_conditional(temp); > + create_bat_attr_conditional(voltage); > + create_bat_attr_conditional(current); > + create_bat_attr_conditional(capacity); Use an attribute group please. It's much simpler and will be created at the proper time so your userspace tools don't have to sit and spin in order to properly wait for them to show up. Ok, yes, you want a conditional type of attribute group, like the new firewire code does. I have no problem adding that if you like. thanks, greg k-h
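To illustrate why a table is simpler than the macro-and-label chain quoted above, here is a rough user-space model, not the sysfs attribute group API. Every name in it (`attr_slot`, `create_attrs`, the stub callbacks) is hypothetical, and the `created` flag merely stands in for a sysfs file existing. The point is only that a loop over a table gives conditional creation and reverse-order rollback without one label per attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one optional attribute. */
struct attr_slot {
	const char *name;
	int (*get)(void);	/* NULL means the driver does not provide it */
	int created;		/* stands in for the sysfs file existing */
};

/* Stub getter and create callbacks, used only for illustration. */
int stub_get(void) { return 0; }
int create_ok(const char *name) { (void)name; return 0; }
int create_fail_on_temp(const char *name)
{
	return strcmp(name, "temp") == 0 ? -1 : 0;
}

/*
 * Create every attribute that has a getter. On failure, undo the ones
 * already created, in reverse order, and return the error code.
 */
int create_attrs(struct attr_slot *slots, size_t n,
		 int (*create)(const char *name))
{
	size_t i;
	int rc;

	assert(slots != NULL && create != NULL);
	for (i = 0; i < n; i++) {
		if (!slots[i].get)
			continue;
		rc = create(slots[i].name);
		if (rc) {
			while (i-- > 0)
				slots[i].created = 0;
			return rc;
		}
		slots[i].created = 1;
	}
	return 0;
}
```

Adding a twelfth attribute in this model means adding one table row; the creation and cleanup paths stay unchanged.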
Re: [PATCH 3/7] [RFC] Battery monitoring class
On Thu, 12 Apr 2007 03:25:03 +0400 Anton Vorontsov wrote: > Here is battery monitor class. According to first copyright string, we're > maintaining it since 2003. I've took few days and cleaned it up to be > more suitable for mainline inclusion. > > --- > drivers/Kconfig |2 + > drivers/Makefile |1 + > drivers/battery/Kconfig | 11 ++ > drivers/battery/Makefile |1 + > drivers/battery/battery.c | 303 > + > include/linux/battery.h | 98 +++ > 6 files changed, 416 insertions(+), 0 deletions(-) > create mode 100644 drivers/battery/Kconfig > create mode 100644 drivers/battery/Makefile > create mode 100644 drivers/battery/battery.c > create mode 100644 include/linux/battery.h > > diff --git a/drivers/battery/battery.c b/drivers/battery/battery.c > new file mode 100644 > index 000..32b8288 > --- /dev/null > +++ b/drivers/battery/battery.c > @@ -0,0 +1,303 @@ > + > +void battery_status_changed(struct battery *bat) > +{ > + pr_debug("%s\n", __FUNCTION__); > + #ifdef CONFIG_LEDS_TRIGGERS Please don't indent preprocessor controls (ifdef/endif etc.). > + switch(bat->get_status(bat)) > + { > + case BATTERY_STATUS_FULL: > + led_trigger_event(bat->charging_trig, LED_OFF); > + led_trigger_event(bat->full_trig, LED_FULL); > + break; > + case BATTERY_STATUS_CHARGING: > + led_trigger_event(bat->charging_trig, LED_FULL); > + led_trigger_event(bat->full_trig, LED_OFF); > + break; > + default: > + led_trigger_event(bat->charging_trig, LED_OFF); > + led_trigger_event(bat->full_trig, LED_OFF); > + break; Place 'switch' and 'case' at the same indent level. This prevents the "double-indent" for the code statements. 
> + } > + #endif /* CONFIG_LEDS_TRIGGERS */ > + return; > +} > + > +static char *status_text[] = { > + "Unknown", "Charging", "Discharging", "Not charging", "Full" > +}; > + > +static ssize_t battery_show_status(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + struct battery *bat = dev_get_drvdata(dev); > + int status = 0; We usually try to place a blank line between local data and code. > + if (bat->get_status) { > + status = bat->get_status(bat); > + if (status > 4) > + status = 0; > + return sprintf(buf, "%s\n", status_text[status]); > + } > + return 0; > +} > + > +static int battery_create_attrs(struct battery *bat) > +{ > + int rc; > + > + #define create_bat_attr_conditional(name)\ > + if(bat->get_##name) {\ space after "if" > + rc = device_create_file(bat->dev, &dev_attr_##name); \ > + if (rc) goto name##_failed; \ > + } > + > + create_bat_attr_conditional(status); > + create_bat_attr_conditional(min_voltage); > + create_bat_attr_conditional(min_current); > + create_bat_attr_conditional(min_capacity); > + create_bat_attr_conditional(max_voltage); > + create_bat_attr_conditional(max_current); > + create_bat_attr_conditional(max_capacity); > + create_bat_attr_conditional(temp); > + create_bat_attr_conditional(voltage); > + create_bat_attr_conditional(current); > + create_bat_attr_conditional(capacity); > + > + #define remove_bat_attr_conditional(name) \ > + if(bat->get_##name) \ ditto. 
> + device_remove_file(bat->dev, &dev_attr_##name); > + > + goto success; > + > +capacity_failed: remove_bat_attr_conditional(current); > +current_failed: remove_bat_attr_conditional(voltage); > +voltage_failed: remove_bat_attr_conditional(temp); > +temp_failed: remove_bat_attr_conditional(max_capacity); > +max_capacity_failed: remove_bat_attr_conditional(max_current); > +max_current_failed: remove_bat_attr_conditional(max_voltage); > +max_voltage_failed: remove_bat_attr_conditional(min_capacity); > +min_capacity_failed: remove_bat_attr_conditional(min_current); > +min_current_failed: remove_bat_attr_conditional(min_voltage); > +min_voltage_failed: remove_bat_attr_conditional(status); I thought there was a class_remove() or something like that? but I'm not sure of it. > +status_failed: > +success: > + return rc; > +} > + > +static void battery_remove_attrs(struct battery *bat) > +{ > + remove_bat_attr_conditional(capacity); > + remove_bat_attr_conditional(current); > + remove_bat_attr_conditional(voltage); > + remove_bat_attr_conditional(temp); > + remove_bat_attr_conditional(max_capacity); > + remove_bat_attr_conditional(max_current);
Re: [PATCH 0/12] Pass MAP_FIXED down to get_unmapped_area
> Is there any support consideration for nommu arch such as blackfin which > is in the -mm tree now? > > It is very kind of you to point out some idea about MAP_FIXED for > Blackfin arch, I will do some help for this. Right now, my understanding is that nommu archs just reject MAP_FIXED outright... we might be able to be smarter, especially if we bring a better infrastructure which I'm still thinking about. Ben.
[PATCH 6/17] hfs: remove redundant read_mapping_page error check
Now that read_mapping_page() does error checking internally, there is no need to check PageError here. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/hfs/bnode.c linux-2.6.21-rc6-mm1-test/fs/hfs/bnode.c --- linux-2.6.21-rc6-mm1/fs/hfs/bnode.c 2007-04-09 17:20:13.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/hfs/bnode.c2007-04-10 21:28:03.0 -0700 @@ -282,10 +282,6 @@ static struct hfs_bnode *__hfs_bnode_cre page = read_mapping_page(mapping, block++, NULL); if (IS_ERR(page)) goto fail; - if (PageError(page)) { - page_cache_release(page); - goto fail; - } page_cache_release(page); node->page[i] = page; }
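After this patch the caller relies on `IS_ERR()` alone. A user-space sketch of that convention, assuming the usual trick of encoding small negative errno values at the very top of the pointer range. The lower-case helpers and `fake_read_page` are hypothetical stand-ins written for this example, not the kernel's definitions; the reader function folds the page-error check into itself, which is the property the patch description depends on.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer, and decode it again. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }

/* True if ptr falls in the last MAX_ERRNO addresses, i.e. holds an errno. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_page { int io_error; };

/*
 * Hypothetical reader that checks the page's error state itself, so a
 * caller needs only is_err() and no separate per-page error test.
 */
static struct fake_page *fake_read_page(struct fake_page *pg)
{
	assert(pg != NULL);
	if (pg->io_error)
		return err_ptr(-EIO);
	return pg;
}
```

A caller then writes `pg = fake_read_page(src); if (is_err(pg)) return ptr_err(pg);` and can use `pg` directly on the success path.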
[PATCH 7/17] jffs2: convert jffs2_gc_fetch_page to read_cache_page
Replace jffs2_gc_fetch_page() and jffs2_gc_release_page() using the read_cache_page() and put_kmapped_page() calls, and update the call site accordingly. Explicit calls to kmap()/kunmap() make the code more clear. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/jffs2/fs.c linux-2.6.21-rc5-mm4-test/fs/jffs2/fs.c --- linux-2.6.21-rc5-mm4/fs/jffs2/fs.c 2007-04-05 17:14:25.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/jffs2/fs.c 2007-04-06 01:59:19.0 -0700 @@ -621,33 +621,6 @@ struct jffs2_inode_info *jffs2_gc_fetch_ return JFFS2_INODE_INFO(inode); } -unsigned char *jffs2_gc_fetch_page(struct jffs2_sb_info *c, - struct jffs2_inode_info *f, - unsigned long offset, - unsigned long *priv) -{ - struct inode *inode = OFNI_EDONI_2SFFJ(f); - struct page *pg; - - pg = read_cache_page(inode->i_mapping, offset >> PAGE_CACHE_SHIFT, -(void *)jffs2_do_readpage_unlock, inode); - if (IS_ERR(pg)) - return (void *)pg; - - *priv = (unsigned long)pg; - return kmap(pg); -} - -void jffs2_gc_release_page(struct jffs2_sb_info *c, - unsigned char *ptr, - unsigned long *priv) -{ - struct page *pg = (void *)*priv; - - kunmap(pg); - page_cache_release(pg); -} - static int jffs2_flash_setup(struct jffs2_sb_info *c) { int ret = 0; diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/jffs2/gc.c linux-2.6.21-rc5-mm4-test/fs/jffs2/gc.c --- linux-2.6.21-rc5-mm4/fs/jffs2/gc.c 2007-04-05 17:13:10.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/jffs2/gc.c 2007-04-06 01:59:19.0 -0700 @@ -1078,7 +1078,7 @@ static int jffs2_garbage_collect_dnode(s uint32_t alloclen, offset, orig_end, orig_start; int ret = 0; unsigned char *comprbuf = NULL, *writebuf; - unsigned long pg; + struct page *page; unsigned char *pg_ptr; memset(&ri, 0, sizeof(ri)); @@ -1219,12 +1219,16 @@ static int jffs2_garbage_collect_dnode(s *page OK. We'll actually write it out again in commit_write, which is a little *suboptimal, but at least we're correct. 
*/ - pg_ptr = jffs2_gc_fetch_page(c, f, start, &pg); + page = read_cache_page(OFNI_EDONI_2SFFJ(f)->i_mapping, + start >> PAGE_CACHE_SHIFT, + (void *)jffs2_do_readpage_unlock, + OFNI_EDONI_2SFFJ(f)); - if (IS_ERR(pg_ptr)) { + if (IS_ERR(page)) { printk(KERN_WARNING "read_cache_page() returned error: %ld\n", PTR_ERR(pg_ptr)); - return PTR_ERR(pg_ptr); + return PTR_ERR(page); } + pg_ptr = kmap(page); offset = start; while(offset < orig_end) { @@ -1287,6 +1291,7 @@ static int jffs2_garbage_collect_dnode(s } } - jffs2_gc_release_page(c, pg_ptr, &pg); + kunmap(page); + page_cache_release(page); return ret; }
[PATCH 11/17] ntfs: convert ntfs_map_page to read_kmap_page
Replace ntfs_map_page() and ntfs_unmap_page() using the new read_kmap_page() and put_kmapped_page() calls, and their locking variants, and remove unneeded PageError checking. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ntfs/aops.h linux-2.6.21-rc5-mm4-test/fs/ntfs/aops.h --- linux-2.6.21-rc5-mm4/fs/ntfs/aops.h 2007-04-05 17:14:25.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/ntfs/aops.h2007-04-06 01:59:19.0 -0700 @@ -31,73 +31,6 @@ #include "inode.h" -/** - * ntfs_unmap_page - release a page that was mapped using ntfs_map_page() - * @page: the page to release - * - * Unpin, unmap and release a page that was obtained from ntfs_map_page(). - */ -static inline void ntfs_unmap_page(struct page *page) -{ - kunmap(page); - page_cache_release(page); -} - -/** - * ntfs_map_page - map a page into accessible memory, reading it if necessary - * @mapping: address space for which to obtain the page - * @index: index into the page cache for @mapping of the page to map - * - * Read a page from the page cache of the address space @mapping at position - * @index, where @index is in units of PAGE_CACHE_SIZE, and not in bytes. - * - * If the page is not in memory it is loaded from disk first using the readpage - * method defined in the address space operations of @mapping and the page is - * added to the page cache of @mapping in the process. - * - * If the page belongs to an mst protected attribute and it is marked as such - * in its ntfs inode (NInoMstProtected()) the mst fixups are applied but no - * error checking is performed. This means the caller has to verify whether - * the ntfs record(s) contained in the page are valid or not using one of the - * ntfs_is__record{,p}() macros, where is the record type you are - * expecting to see. (For details of the macros, see fs/ntfs/layout.h.) - * - * If the page is in high memory it is mapped into memory directly addressible - * by the kernel. 
- * - * Finally the page count is incremented, thus pinning the page into place. - * - * The above means that page_address(page) can be used on all pages obtained - * with ntfs_map_page() to get the kernel virtual address of the page. - * - * When finished with the page, the caller has to call ntfs_unmap_page() to - * unpin, unmap and release the page. - * - * Note this does not grant exclusive access. If such is desired, the caller - * must provide it independently of the ntfs_{un}map_page() calls by using - * a {rw_}semaphore or other means of serialization. A spin lock cannot be - * used as ntfs_map_page() can block. - * - * The unlocked and uptodate page is returned on success or an encoded error - * on failure. Caller has to test for error using the IS_ERR() macro on the - * return value. If that evaluates to 'true', the negative error code can be - * obtained using PTR_ERR() on the return value of ntfs_map_page(). - */ -static inline struct page *ntfs_map_page(struct address_space *mapping, - unsigned long index) -{ - struct page *page = read_mapping_page(mapping, index, NULL); - - if (!IS_ERR(page)) { - kmap(page); - if (!PageError(page)) - return page; - ntfs_unmap_page(page); - return ERR_PTR(-EIO); - } - return page; -} - #ifdef NTFS_RW extern void mark_ntfs_record_dirty(struct page *page, const unsigned int ofs); diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ntfs/bitmap.c linux-2.6.21-rc5-mm4-test/fs/ntfs/bitmap.c --- linux-2.6.21-rc5-mm4/fs/ntfs/bitmap.c 2006-11-29 13:57:37.0 -0800 +++ linux-2.6.21-rc5-mm4-test/fs/ntfs/bitmap.c 2007-04-06 12:40:53.0 -0700 @@ -72,7 +72,7 @@ int __ntfs_bitmap_set_bits_in_run(struct /* Get the page containing the first bit (@start_bit). 
*/ mapping = vi->i_mapping; - page = ntfs_map_page(mapping, index); + page = read_kmap_page(mapping, index); if (IS_ERR(page)) { if (!is_rollback) ntfs_error(vi->i_sb, "Failed to map first page (error " @@ -123,8 +123,8 @@ int __ntfs_bitmap_set_bits_in_run(struct /* Update @index and get the next page. */ flush_dcache_page(page); set_page_dirty(page); - ntfs_unmap_page(page); - page = ntfs_map_page(mapping, ++index); + put_kmapped_page(page); + page = read_kmap_page(mapping, ++index); if (IS_ERR(page)) goto rollback; kaddr = page_address(page); @@ -159,7 +159,7 @@ done: /* We are done. Unmap the page and return success. */ flush_dcache_page(page); set_page_dirty(page); - ntfs_unmap_page(page); + put_kmapped_page(page); ntfs_debug("Done."); return 0; rollback: diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ntfs/dir.c linux
[PATCH 1/17] cramfs: use read_mapping_page
read_mapping_page_async() is going away, so convert its only user to read_mapping_page(). This change has not been benchmarked; however, to get real parallelism this wants something completely different, like __do_page_cache_readahead(), which is not currently exported. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/cramfs/inode.c linux-2.6.21-rc6-mm1-test/fs/cramfs/inode.c --- linux-2.6.21-rc6-mm1/fs/cramfs/inode.c 2007-04-09 17:24:03.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/cramfs/inode.c 2007-04-09 21:37:09.0 -0700 @@ -180,8 +180,7 @@ static void *cramfs_read(struct super_bl struct page *page = NULL; if (blocknr + i < devsize) { - page = read_mapping_page_async(mapping, blocknr + i, - NULL); + page = read_mapping_page(mapping, blocknr + i, NULL); /* synchronous error? */ if (IS_ERR(page)) page = NULL;
[PATCH 5/17] hfsplus: remove redundant read_mapping_page error check
Now that read_mapping_page() does error checking internally, there is no need to check PageError here. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/hfsplus/bnode.c linux-2.6.21-rc6-mm1-test/fs/hfsplus/bnode.c --- linux-2.6.21-rc6-mm1/fs/hfsplus/bnode.c 2007-04-09 17:20:13.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/hfsplus/bnode.c2007-04-10 21:28:45.0 -0700 @@ -442,10 +442,6 @@ static struct hfs_bnode *__hfs_bnode_cre page = read_mapping_page(mapping, block, NULL); if (IS_ERR(page)) goto fail; - if (PageError(page)) { - page_cache_release(page); - goto fail; - } page_cache_release(page); node->page[i] = page; }
[PATCH 14/17] reiserfs: convert reiserfs_get_page to read_kmap_page
Replace reiserfs_get_page() and reiserfs_put_page() using the new read_kmap_page() and put_kmapped_page() calls and their locking variants. Also, propagate the gfp_mask() deadlock comment to callsites. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/reiserfs/xattr.c linux-2.6.21-rc5-mm4-test/fs/reiserfs/xattr.c --- linux-2.6.21-rc5-mm4/fs/reiserfs/xattr.c2007-04-05 17:14:25.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/reiserfs/xattr.c 2007-04-06 14:41:34.0 -0700 @@ -438,33 +438,6 @@ int xattr_readdir(struct file *file, fil return res; } -/* Internal operations on file data */ -static inline void reiserfs_put_page(struct page *page) -{ - kunmap(page); - page_cache_release(page); -} - -static struct page *reiserfs_get_page(struct inode *dir, unsigned long n) -{ - struct address_space *mapping = dir->i_mapping; - struct page *page; - /* We can deadlock if we try to free dentries, - and an unlink/rmdir has just occured - GFP_NOFS avoids this */ - mapping_set_gfp_mask(mapping, GFP_NOFS); - page = read_mapping_page(mapping, n, NULL); - if (!IS_ERR(page)) { - kmap(page); - if (PageError(page)) - goto fail; - } - return page; - - fail: - reiserfs_put_page(page); - return ERR_PTR(-EIO); -} - static inline __u32 xattr_hash(const char *msg, int len) { return csum_partial(msg, len, 0); @@ -537,13 +510,15 @@ reiserfs_xattr_set(struct inode *inode, else chunk = buffer_size - buffer_pos; - page = reiserfs_get_page(xinode, file_pos >> PAGE_CACHE_SHIFT); + /* We can deadlock if we try to free dentries, + and an unlink/rmdir has just occured - GFP_NOFS avoids this */ + mapping_set_gfp_mask(mapping, GFP_NOFS); + page = __read_kmap_page(mapping, file_pos >> PAGE_CACHE_SHIFT); if (IS_ERR(page)) { err = PTR_ERR(page); goto out_filp; } - lock_page(page); data = page_address(page); if (file_pos == 0) { @@ -566,8 +541,7 @@ reiserfs_xattr_set(struct inode *inode, page_offset + chunk + skip); } - unlock_page(page); - reiserfs_put_page(page); + 
put_locked_page(page); buffer_pos += chunk; file_pos += chunk; skip = 0; @@ -646,13 +620,15 @@ reiserfs_xattr_get(const struct inode *i else chunk = isize - file_pos; - page = reiserfs_get_page(xinode, file_pos >> PAGE_CACHE_SHIFT); + /* We can deadlock if we try to free dentries, + and an unlink/rmdir has just occured - GFP_NOFS avoids this */ + mapping_set_gfp_mask(xinode->i_mapping, GFP_NOFS); + page = __read_kmap_page(xinode->i_mapping, file_pos >> PAGE_CACHE_SHIFT); if (IS_ERR(page)) { err = PTR_ERR(page); goto out_dput; } - lock_page(page); data = page_address(page); if (file_pos == 0) { struct reiserfs_xattr_header *rxh = @@ -661,8 +637,7 @@ reiserfs_xattr_get(const struct inode *i chunk -= skip; /* Magic doesn't match up.. */ if (rxh->h_magic != cpu_to_le32(REISERFS_XATTR_MAGIC)) { - unlock_page(page); - reiserfs_put_page(page); + put_locked_page(page); reiserfs_warning(inode->i_sb, "Invalid magic for xattr (%s) " "associated with %k", name, @@ -673,8 +648,7 @@ reiserfs_xattr_get(const struct inode *i hash = le32_to_cpu(rxh->h_hash); } memcpy(buffer + buffer_pos, data + skip, chunk); - unlock_page(page); - reiserfs_put_page(page); + put_locked_page(page); file_pos += chunk; buffer_pos += chunk; skip = 0;
[PATCH 17/17] vxfs: convert vxfs_get_page to read_kmap_page
Replace vxfs_get_page() with the new read_kmap_page(). Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_extern.h linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_extern.h --- linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_extern.h 2007-04-05 17:13:29.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_extern.h 2007-04-06 01:59:19.0 -0700 @@ -69,7 +69,6 @@ extern const struct file_operations vxfs extern int vxfs_read_olt(struct super_block *, u_long); /* vxfs_subr.c */ -extern struct page * vxfs_get_page(struct address_space *, u_long); extern voidvxfs_put_page(struct page *); extern struct buffer_head *vxfs_bread(struct inode *, int); diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_inode.c linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_inode.c --- linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_inode.c 2007-04-05 17:14:25.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_inode.c 2007-04-06 01:59:19.0 -0700 @@ -138,7 +138,7 @@ __vxfs_iget(ino_t ino, struct inode *ili u_long offset; offset = (ino % (PAGE_SIZE / VXFS_ISIZE)) * VXFS_ISIZE; - pp = vxfs_get_page(ilistp->i_mapping, ino * VXFS_ISIZE / PAGE_SIZE); + pp = read_kmap_page(ilistp->i_mapping, ino * VXFS_ISIZE / PAGE_SIZE); if (!IS_ERR(pp)) { struct vxfs_inode_info *vip; diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_lookup.c linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_lookup.c --- linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_lookup.c 2007-04-05 17:13:29.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_lookup.c 2007-04-06 01:59:19.0 -0700 @@ -125,7 +125,7 @@ vxfs_find_entry(struct inode *ip, struct caddr_t kaddr; struct page *pp; - pp = vxfs_get_page(ip->i_mapping, page); + pp = read_kmap_page(ip->i_mapping, page); if (IS_ERR(pp)) continue; kaddr = (caddr_t)page_address(pp); @@ -280,7 +280,7 @@ vxfs_readdir(struct file *fp, void *retp caddr_t kaddr; struct page *pp; - pp = vxfs_get_page(ip->i_mapping, page); + pp = read_kmap_page(ip->i_mapping, 
page); if (IS_ERR(pp)) continue; kaddr = (caddr_t)page_address(pp); diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_subr.c linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_subr.c --- linux-2.6.21-rc5-mm4/fs/freevxfs/vxfs_subr.c2007-04-05 17:14:25.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/freevxfs/vxfs_subr.c 2007-04-06 01:59:19.0 -0700 @@ -56,39 +56,6 @@ vxfs_put_page(struct page *pp) } /** - * vxfs_get_page - read a page into memory. - * @ip:inode to read from - * @n: page number - * - * Description: - * vxfs_get_page reads the @n th page of @ip into the pagecache. - * - * Returns: - * The wanted page on success, else a NULL pointer. - */ -struct page * -vxfs_get_page(struct address_space *mapping, u_long n) -{ - struct page * pp; - - pp = read_mapping_page(mapping, n, NULL); - - if (!IS_ERR(pp)) { - kmap(pp); - /** if (!PageChecked(pp)) **/ - /** vxfs_check_page(pp); **/ - if (PageError(pp)) - goto fail; - } - - return (pp); - -fail: - vxfs_put_page(pp); - return ERR_PTR(-EIO); -} - -/** * vxfs_bread - read buffer for a give inode,block tuple * @ip:inode * @block: logical block
[PATCH 15/17] sysv: convert dir_get_page to read_kmap_page
Replace sysv dir_get_page() with the new read_kmap_page(). Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/sysv/dir.c linux-2.6.21-rc5-mm4-test/fs/sysv/dir.c --- linux-2.6.21-rc5-mm4/fs/sysv/dir.c 2007-04-05 17:14:25.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/sysv/dir.c 2007-04-06 01:59:19.0 -0700 @@ -50,15 +50,6 @@ static int dir_commit_chunk(struct page return err; } -static struct page * dir_get_page(struct inode *dir, unsigned long n) -{ - struct address_space *mapping = dir->i_mapping; - struct page *page = read_mapping_page(mapping, n, NULL); - if (!IS_ERR(page)) - kmap(page); - return page; -} - static int sysv_readdir(struct file * filp, void * dirent, filldir_t filldir) { unsigned long pos = filp->f_pos; @@ -77,7 +68,7 @@ static int sysv_readdir(struct file * fi for ( ; n < npages; n++, offset = 0) { char *kaddr, *limit; struct sysv_dir_entry *de; - struct page *page = dir_get_page(inode, n); + struct page *page = read_kmap_page(inode->i_mapping, n); if (IS_ERR(page)) continue; @@ -149,7 +140,7 @@ struct sysv_dir_entry *sysv_find_entry(s do { char *kaddr; - page = dir_get_page(dir, n); + page = read_kmap_page(dir->i_mapping, n); if (!IS_ERR(page)) { kaddr = (char*)page_address(page); de = (struct sysv_dir_entry *) kaddr; @@ -191,7 +182,7 @@ int sysv_add_link(struct dentry *dentry, /* We take care of directory expansion in the same loop */ for (n = 0; n <= npages; n++) { - page = dir_get_page(dir, n); + page = read_kmap_page(dir->i_mapping, n); err = PTR_ERR(page); if (IS_ERR(page)) goto out; @@ -299,7 +290,7 @@ int sysv_empty_dir(struct inode * inode) for (i = 0; i < npages; i++) { char *kaddr; struct sysv_dir_entry * de; - page = dir_get_page(inode, i); + page = read_kmap_page(inode->i_mapping, i); if (IS_ERR(page)) continue; @@ -353,7 +344,7 @@ void sysv_set_link(struct sysv_dir_entry struct sysv_dir_entry * sysv_dotdot (struct inode *dir, struct page **p) { - struct page *page = dir_get_page(dir, 0); + 
struct page *page = read_kmap_page(dir->i_mapping, 0); struct sysv_dir_entry *de = NULL; if (!IS_ERR(page)) {
Re: [PATCH 0/12] Pass MAP_FIXED down to get_unmapped_area
On Thu, 2007-04-12 at 12:20 +1000, Benjamin Herrenschmidt wrote: > This is a "first step" as there are still cleanups to be done in various > areas touched by that code but I think it's probably good to go as is and > at least enables me to implement what I need for PowerPC. > > (Andrew, this is also candidate for 2.6.22 since I haven't had any real > objection, mostly suggestion for improving further, which I'll try to > do later, and I have further powerpc patches that rely on this). > > The current get_unmapped_area code calls the f_ops->get_unmapped_area or > the arch one (via the mm) only when MAP_FIXED is not passed. That makes > it impossible for archs to impose proper constraints on regions of the > virtual address space. To work around that, get_unmapped_area() then > calls some hugetlbfs specific hacks. > > This causes several problems, among others: > > - It makes it impossible for a driver or filesystem to do the same thing > that hugetlbfs does (for example, to allow a driver to use larger page > sizes to map external hardware) if that requires applying a constraint > on the addresses (constraining that mapping in certain regions and other > mappings out of those regions). > > - Some archs like arm, mips, sparc, sparc64, sh and sh64 already want > MAP_FIXED to be passed down in order to deal with aliasing issues. > The code is there to handle it... but is never called. > Is there any support consideration for nommu arch such as blackfin which is in the -mm tree now? It is very kind of you to point out some idea about MAP_FIXED for Blackfin arch, I will do some help for this. Thanks -Bryan
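For readers less familiar with the flag under discussion, a small user-space sketch of what `MAP_FIXED` asks for, assuming Linux with anonymous mappings available. Without the flag the address argument to `mmap()` is only a hint; with it the mapping must land at exactly that address, replacing whatever was mapped there. The function name is hypothetical, and the example reserves its own region first so that the fixed mapping only overwrites memory it owns.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Reserve two pages, then remap the second one in place with MAP_FIXED.
 * Returns 1 if the kernel placed the mapping at exactly the requested
 * address, 0 if it did not, and -1 if the initial reservation failed.
 */
int map_fixed_lands_exactly(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	char *base, *want, *got;
	int ok;

	assert(page > 0);
	len = 2 * (size_t)page;

	/* Reservation: inaccessible, just claims address space. */
	base = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	/* Fixed mapping over the second page of our own reservation. */
	want = base + page;
	got = mmap(want, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	ok = (got == want);
	if (ok) {
		got[0] = 42;	/* the new mapping is writable */
		ok = (got[0] == 42);
	}

	munmap(base, len);
	return ok;
}
```

The thread's concern sits one level below this: the exact-address request is what the architecture or driver hook never gets to inspect in the current code, according to the description quoted above.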
[PATCH 16/17] ufs: convert ufs_get_page to read_kmap_page
Replace ufs_get_page()/ufs_get_locked_page() and ufs_put_page()/ufs_put_locked_page() using the new read_kmap_page() and put_kmapped_page() calls and their locking variants. Also, change the ufs_check_page() call to return the page's error status, and update the call sites accordingly. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ufs/balloc.c linux-2.6.21-rc5-mm4-test/fs/ufs/balloc.c --- linux-2.6.21-rc5-mm4/fs/ufs/balloc.c2007-04-05 17:13:29.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/ufs/balloc.c 2007-04-06 12:46:02.0 -0700 @@ -272,7 +272,7 @@ static void ufs_change_blocknr(struct in index = i >> (PAGE_CACHE_SHIFT - inode->i_blkbits); if (likely(cur_index != index)) { - page = ufs_get_locked_page(mapping, index); + page = __read_mapping_page(mapping, index, NULL); if (!page)/* it was truncated */ continue; if (IS_ERR(page)) {/* or EIO */ @@ -325,8 +325,10 @@ static void ufs_change_blocknr(struct in bh = bh->b_this_page; } while (bh != head); - if (likely(cur_index != index)) - ufs_put_locked_page(page); + if (likely(cur_index != index)) { + unlock_page(page); + page_cache_release(page); + } } UFSD("EXIT\n"); } diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ufs/truncate.c linux-2.6.21-rc5-mm4-test/fs/ufs/truncate.c --- linux-2.6.21-rc5-mm4/fs/ufs/truncate.c 2007-04-05 17:13:29.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/ufs/truncate.c 2007-04-06 12:46:14.0 -0700 @@ -395,8 +395,9 @@ static int ufs_alloc_lastblock(struct in lastfrag--; - lastpage = ufs_get_locked_page(mapping, lastfrag >> - (PAGE_CACHE_SHIFT - inode->i_blkbits)); + lastpage = __read_mapping_page(mapping, lastfrag >> + (PAGE_CACHE_SHIFT - inode->i_blkbits), + NULL); if (IS_ERR(lastpage)) { err = -EIO; goto out; @@ -441,7 +442,8 @@ static int ufs_alloc_lastblock(struct in } } out_unlock: - ufs_put_locked_page(lastpage); + unlock_page(lastpage); + page_cache_release(lastpage); out: return err; } diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ufs/util.c 
linux-2.6.21-rc5-mm4-test/fs/ufs/util.c --- linux-2.6.21-rc5-mm4/fs/ufs/util.c 2007-04-05 17:14:25.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/ufs/util.c 2007-04-06 12:40:53.0 -0700 @@ -232,55 +232,3 @@ ufs_set_inode_dev(struct super_block *sb ufsi->i_u1.i_data[0] = cpu_to_fs32(sb, fs32); } -/** - * ufs_get_locked_page() - locate, pin and lock a pagecache page, if not exist - * read it from disk. - * @mapping: the address_space to search - * @index: the page index - * - * Locates the desired pagecache page, if not exist we'll read it, - * locks it, increments its reference - * count and returns its address. - * - */ - -struct page *ufs_get_locked_page(struct address_space *mapping, -pgoff_t index) -{ - struct page *page; - - page = find_lock_page(mapping, index); - if (!page) { - page = read_mapping_page(mapping, index, NULL); - - if (IS_ERR(page)) { - printk(KERN_ERR "ufs_change_blocknr: " - "read_mapping_page error: ino %lu, index: %lu\n", - mapping->host->i_ino, index); - goto out; - } - - lock_page(page); - - if (unlikely(page->mapping == NULL)) { - /* Truncate got there first */ - unlock_page(page); - page_cache_release(page); - page = NULL; - goto out; - } - - if (!PageUptodate(page) || PageError(page)) { - unlock_page(page); - page_cache_release(page); - - printk(KERN_ERR "ufs_change_blocknr: " - "can not read page: ino %lu, index: %lu\n", - mapping->host->i_ino, index); - - page = ERR_PTR(-EIO); - } - } -out: - return page; -} diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ufs/util.h linux-2.6.21-rc5-mm4-test/fs/ufs/util.h --- linux-2.6.21-rc5-mm4/fs/ufs/util.h 2007-04-05 17:13:29.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/ufs/util.h 2007-04-06 12:46:36.0 -0700 @@ -251,16 +251,6 @@ extern void _ubh_ubhcpymem_(struct ufs_s #define ubh_memcpyubh(ubh,mem,size) _ubh_
[PATCH 13/17] reiser4: remove redundant read_mapping_page error checks
read_mapping_page() is now fully synchronous, so there's no need to wait for the page lock or check for I/O errors.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/tail_conversion.c linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/tail_conversion.c --- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/file/tail_conversion.c 2007-04-09 17:24:03.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/file/tail_conversion.c 2007-04-10 21:33:47.0 -0700 @@ -608,14 +608,6 @@ int extent2tail(unix_file_info_t *uf_inf break; } - wait_on_page_locked(page); - - if (!PageUptodate(page)) { - page_cache_release(page); - result = RETERR(-EIO); - break; - } - /* cut part of file we have read */ start_byte = (__u64) (i << PAGE_CACHE_SHIFT); set_key_offset(&from, start_byte); diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/reiser4/plugin/item/extent_file_ops.c linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/item/extent_file_ops.c --- linux-2.6.21-rc6-mm1/fs/reiser4/plugin/item/extent_file_ops.c 2007-04-10 19:41:14.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/reiser4/plugin/item/extent_file_ops.c 2007-04-10 21:38:41.0 -0700 @@ -1220,15 +1220,8 @@ int reiser4_read_extent(struct file *fil page = read_mapping_page(mapping, cur_page, file); if (IS_ERR(page)) return PTR_ERR(page); - lock_page(page); - if (!PageUptodate(page)) { - unlock_page(page); - page_cache_release(page); - warning("jmacd-97178", "extent_read: page is not up to date"); - return RETERR(-EIO); - } + mark_page_accessed(page); - unlock_page(page); /* If users can be writing to this page using arbitrary virtual addresses, take care about potential aliasing before reading
[PATCH 12/17] partition: remove redundant read_mapping_page error checks
Remove unneeded PageError checking in read_dev_sector(), and clean up the code a bit. Can anyone point out why it's OK to use page_address() here on a page which has not been kmapped? If it's not OK, then a good number of callers need to be fixed. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/partitions/check.c linux-2.6.21-rc6-mm1-test/fs/partitions/check.c --- linux-2.6.21-rc6-mm1/fs/partitions/check.c 2007-04-09 17:24:03.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/partitions/check.c 2007-04-10 21:59:01.0 -0700 @@ -568,16 +568,12 @@ unsigned char *read_dev_sector(struct bl page = read_mapping_page(mapping, (pgoff_t)(n >> (PAGE_CACHE_SHIFT-9)), NULL); - if (!IS_ERR(page)) { - if (PageError(page)) - goto fail; - p->v = page; - return (unsigned char *)page_address(page) + ((n & ((1 << (PAGE_CACHE_SHIFT - 9)) - 1)) << 9); -fail: - page_cache_release(page); + if (IS_ERR(page)) { + p->v = NULL; + return NULL; } - p->v = NULL; - return NULL; + p->v = page; + return (unsigned char *)page_address(page) + ((n & ((1 << (PAGE_CACHE_SHIFT - 9)) - 1)) << 9); } EXPORT_SYMBOL(read_dev_sector); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 9/17] minix: convert dir_get_page to read_kmap_page
Replace minix dir_get_page() and dir_put_page() using the new read_kmap_page() and put_kmapped_page()/put_locked_page() calls. Also, use __read_kmap_page() instead of re-taking the page_lock. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/minix/dir.c linux-2.6.21-rc5-mm4-test/fs/minix/dir.c --- linux-2.6.21-rc5-mm4/fs/minix/dir.c 2007-04-05 17:14:25.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/minix/dir.c2007-04-06 02:31:55.0 -0700 @@ -23,12 +23,6 @@ const struct file_operations minix_dir_o .fsync = minix_sync_file, }; -static inline void dir_put_page(struct page *page) -{ - kunmap(page); - page_cache_release(page); -} - /* * Return the offset into page `page_nr' of the last valid * byte in that page, plus one. @@ -60,22 +54,6 @@ static int dir_commit_chunk(struct page return err; } -static struct page * dir_get_page(struct inode *dir, unsigned long n) -{ - struct address_space *mapping = dir->i_mapping; - struct page *page = read_mapping_page(mapping, n, NULL); - if (!IS_ERR(page)) { - kmap(page); - if (!PageUptodate(page)) - goto fail; - } - return page; - -fail: - dir_put_page(page); - return ERR_PTR(-EIO); -} - static inline void *minix_next_entry(void *de, struct minix_sb_info *sbi) { return (void*)((char*)de + sbi->s_dirsize); @@ -102,7 +80,7 @@ static int minix_readdir(struct file * f for ( ; n < npages; n++, offset = 0) { char *p, *kaddr, *limit; - struct page *page = dir_get_page(inode, n); + struct page *page = read_kmap_page(inode->i_mapping, n); if (IS_ERR(page)) continue; @@ -128,12 +106,12 @@ static int minix_readdir(struct file * f (n << PAGE_CACHE_SHIFT) | offset, inumber, DT_UNKNOWN); if (over) { - dir_put_page(page); + put_kmapped_page(page); goto done; } } } - dir_put_page(page); + put_kmapped_page(page); } done: @@ -177,7 +155,7 @@ minix_dirent *minix_find_entry(struct de for (n = 0; n < npages; n++) { char *kaddr, *limit; - page = dir_get_page(dir, n); + page = read_kmap_page(dir->i_mapping, n); if 
(IS_ERR(page)) continue; @@ -198,7 +176,7 @@ minix_dirent *minix_find_entry(struct de if (namecompare(namelen, sbi->s_namelen, name, namx)) goto found; } - dir_put_page(page); + put_kmapped_page(page); } return NULL; @@ -233,11 +211,10 @@ int minix_add_link(struct dentry *dentry for (n = 0; n <= npages; n++) { char *limit, *dir_end; - page = dir_get_page(dir, n); + page = __read_kmap_page(dir->i_mapping, n); err = PTR_ERR(page); if (IS_ERR(page)) goto out; - lock_page(page); kaddr = (char*)page_address(page); dir_end = kaddr + minix_last_byte(dir, n); limit = kaddr + PAGE_CACHE_SIZE - sbi->s_dirsize; @@ -265,8 +242,7 @@ int minix_add_link(struct dentry *dentry if (namecompare(namelen, sbi->s_namelen, name, namx)) goto out_unlock; } - unlock_page(page); - dir_put_page(page); + put_locked_page(page); } BUG(); return -EINVAL; @@ -288,13 +264,12 @@ got_it: err = dir_commit_chunk(page, from, to); dir->i_mtime = dir->i_ctime = CURRENT_TIME_SEC; mark_inode_dirty(dir); -out_put: - dir_put_page(page); + put_kmapped_page(page); out: return err; out_unlock: - unlock_page(page); - goto out_put; + put_locked_page(page); + return err; } int minix_delete_entry(struct minix_dir_entry *de, struct page *page) @@ -314,7 +289,7 @@ int minix_delete_entry(struct minix_dir_ } else { unlock_page(page); } - dir_put_page(page); + put_kmapped_page(page); inode->i_ctime = inode->i_mtime = CURRENT_TIME_SEC; mark_inode_dirty(inode); return err; @@ -378,7 +353,7 @@ int minix_empty_dir(struct inode * inode for (i = 0; i < npages; i++) { char *p, *kaddr, *limit; - page = dir_get_page(inode, i); + page = read_kmap_page(inode->i_mapping, i); if (IS_ERR(page)
[PATCH 10/17] mtd: convert page_read to read_kmap_page
Replace page_read() with read_kmap_page()/__read_kmap_page(). This probably fixes behaviour on highmem systems, since page_address() was being used without kmap(). Also eliminate the need to re-take the page lock during writes to the page. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/drivers/mtd/devices/block2mtd.c linux-2.6.21-rc5-mm4-test/drivers/mtd/devices/block2mtd.c --- linux-2.6.21-rc5-mm4/drivers/mtd/devices/block2mtd.c2007-04-05 17:14:24.0 -0700 +++ linux-2.6.21-rc5-mm4-test/drivers/mtd/devices/block2mtd.c 2007-04-06 01:59:19.0 -0700 @@ -39,12 +39,6 @@ struct block2mtd_dev { /* Static info about the MTD, used in cleanup_module */ static LIST_HEAD(blkmtd_device_list); - -static struct page *page_read(struct address_space *mapping, int index) -{ - return read_mapping_page(mapping, index, NULL); -} - /* erase a specified part of the device */ static int _block2mtd_erase(struct block2mtd_dev *dev, loff_t to, size_t len) { @@ -56,23 +50,19 @@ static int _block2mtd_erase(struct block u_long *max; while (pages) { - page = page_read(mapping, index); - if (!page) - return -ENOMEM; + page = __read_kmap_page(mapping, index); if (IS_ERR(page)) return PTR_ERR(page); max = page_address(page) + PAGE_SIZE; for (p=page_address(page); pblkdev->bd_inode->i_mapping, index); - if (!page) - return -ENOMEM; + page = read_kmap_page(dev->blkdev->bd_inode->i_mapping, index); if (IS_ERR(page)) return PTR_ERR(page); memcpy(buf, page_address(page) + offset, cpylen); - page_cache_release(page); + put_kmapped_page(page); if (retlen) *retlen += cpylen; @@ -163,19 +151,15 @@ static int _block2mtd_write(struct block cpylen = len; // this page len = len - cpylen; - page = page_read(mapping, index); - if (!page) - return -ENOMEM; + page = __read_kmap_page(mapping, index); if (IS_ERR(page)) return PTR_ERR(page); if (memcmp(page_address(page)+offset, buf, cpylen)) { - lock_page(page); memcpy(page_address(page) + offset, buf, cpylen); 
set_page_dirty(page); - unlock_page(page); } - page_cache_release(page); + put_locked_page(page); if (retlen) *retlen += cpylen; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/17] ext2: convert ext2_get_page to read_kmap_page
Replace ext2_get_page() and ext2_put_page() using the new read_kmap_page() and put_kmapped_page() calls. Also, change the ext2_check_page() call to return the page's error status, and update the call sites accordingly. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/ext2/dir.c linux-2.6.21-rc5-mm4-test/fs/ext2/dir.c --- linux-2.6.21-rc5-mm4/fs/ext2/dir.c 2007-04-06 12:27:03.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/ext2/dir.c 2007-04-06 14:34:23.0 -0700 @@ -35,12 +35,6 @@ static inline unsigned ext2_chunk_size(s return inode->i_sb->s_blocksize; } -static inline void ext2_put_page(struct page *page) -{ - kunmap(page); - page_cache_release(page); -} - static inline unsigned long dir_pages(struct inode *inode) { return (inode->i_size+PAGE_CACHE_SIZE-1)>>PAGE_CACHE_SHIFT; @@ -74,7 +68,7 @@ static int ext2_commit_chunk(struct page return err; } -static void ext2_check_page(struct page *page) +static int ext2_check_page(struct page *page) { struct inode *dir = page->mapping->host; struct super_block *sb = dir->i_sb; @@ -86,6 +80,14 @@ static void ext2_check_page(struct page ext2_dirent *p; char *error; + if (likely(PageChecked(page))) { + if (likely(!PageError(page))) + return 0; + + put_kmapped_page(page); + return -EIO; + } + if ((dir->i_size >> PAGE_CACHE_SHIFT) == page->index) { limit = dir->i_size & ~PAGE_CACHE_MASK; if (limit & (chunk_size - 1)) @@ -112,7 +114,7 @@ static void ext2_check_page(struct page goto Eend; out: SetPageChecked(page); - return; + return 0; /* Too bad, we had an error */ @@ -153,24 +155,8 @@ Eend: fail: SetPageChecked(page); SetPageError(page); -} - -static struct page * ext2_get_page(struct inode *dir, unsigned long n) -{ - struct address_space *mapping = dir->i_mapping; - struct page *page = read_mapping_page(mapping, n, NULL); - if (!IS_ERR(page)) { - kmap(page); - if (!PageChecked(page)) - ext2_check_page(page); - if (PageError(page)) - goto fail; - } - return page; - -fail: - 
ext2_put_page(page); - return ERR_PTR(-EIO); + put_kmapped_page(page); + return -EIO; } /* @@ -262,9 +248,9 @@ ext2_readdir (struct file * filp, void * for ( ; n < npages; n++, offset = 0) { char *kaddr, *limit; ext2_dirent *de; - struct page *page = ext2_get_page(inode, n); + struct page *page = read_kmap_page(inode->i_mapping, n); - if (IS_ERR(page)) { + if (IS_ERR(page) || ext2_check_page(page)) { ext2_error(sb, __FUNCTION__, "bad page in #%lu", inode->i_ino); @@ -286,7 +272,7 @@ ext2_readdir (struct file * filp, void * if (de->rec_len == 0) { ext2_error(sb, __FUNCTION__, "zero-length directory entry"); - ext2_put_page(page); + put_kmapped_page(page); return -EIO; } if (de->inode) { @@ -301,13 +287,13 @@ ext2_readdir (struct file * filp, void * (nf_pos += le16_to_cpu(de->rec_len); } - ext2_put_page(page); + put_kmapped_page(page); } return 0; } @@ -344,8 +330,8 @@ struct ext2_dir_entry_2 * ext2_find_entr n = start; do { char *kaddr; - page = ext2_get_page(dir, n); - if (!IS_ERR(page)) { + page = read_kmap_page(dir->i_mapping, n); + if (!IS_ERR(page) && !ext2_check_page(page)) { kaddr = page_address(page); de = (ext2_dirent *) kaddr; kaddr += ext2_last_byte(dir, n) - reclen; @@ -353,14 +339,14 @@ struct ext2_dir_entry_2 * ext2_find_entr if (de->rec_len == 0) { ext2_error(dir->i_sb, __FUNCTION__, "zero-length directory entry"); - ext2_put_page(page); +
[PATCH 8/17] jfs: use locking read_mapping_page
Use the new locking variant of read_mapping_page to avoid doing extra work.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/jfs/jfs_metapage.c linux-2.6.21-rc6-mm1-test/fs/jfs/jfs_metapage.c
--- linux-2.6.21-rc6-mm1/fs/jfs/jfs_metapage.c	2007-04-09 17:23:48.0 -0700
+++ linux-2.6.21-rc6-mm1-test/fs/jfs/jfs_metapage.c	2007-04-09 21:37:09.0 -0700
@@ -632,12 +632,11 @@ struct metapage *__get_metapage(struct i
 			}
 			SetPageUptodate(page);
 		} else {
-			page = read_mapping_page(mapping, page_index, NULL);
-			if (IS_ERR(page) || !PageUptodate(page)) {
+			page = __read_mapping_page(mapping, page_index, NULL);
+			if (IS_ERR(page)) {
 				jfs_err("read_mapping_page failed!");
 				return NULL;
 			}
-			lock_page(page);
 		}
 
 		mp = page_to_mp(page, page_offset);
[PATCH 2/17] fs: introduce new read_cache_page interface
Export a single version of read_cache_page, which returns with a locked, Uptodate page or a synchronous error, and use inline helper functions to replicate the old behavior. Also, introduce new helper functions for the most common file system uses, which include kmapping the page, as well as needing to keep the page locked. These changes collectively eliminate a substantial amount of private fs logic in favor of generic code. It also simplifies filemap.c significantly, by assuming that callers want synchronous behavior, which is true for all callers anyway except one. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/include/linux/pagemap.h linux-2.6.21-rc6-mm1-test/include/linux/pagemap.h --- linux-2.6.21-rc6-mm1/include/linux/pagemap.h2007-04-11 14:22:19.0 -0700 +++ linux-2.6.21-rc6-mm1-test/include/linux/pagemap.h 2007-04-11 14:29:31.0 -0700 @@ -108,21 +108,30 @@ static inline struct page *grab_cache_pa extern struct page * grab_cache_page_nowait(struct address_space *mapping, unsigned long index); -extern struct page * read_cache_page_async(struct address_space *mapping, - unsigned long index, filler_t *filler, - void *data); -extern struct page * read_cache_page(struct address_space *mapping, +extern struct page *__read_cache_page(struct address_space *mapping, unsigned long index, filler_t *filler, void *data); extern int read_cache_pages(struct address_space *mapping, struct list_head *pages, filler_t *filler, void *data); -static inline struct page *read_mapping_page_async( - struct address_space *mapping, +void fastcall unlock_page(struct page *page); +static inline struct page *read_cache_page(struct address_space *mapping, + unsigned long index, filler_t *filler, + void *data) +{ + struct page *page; + + page = __read_cache_page(mapping, index, filler, data); + if (!IS_ERR(page)) + unlock_page(page); + return page; +} + +static inline struct page *__read_mapping_page(struct address_space *mapping, unsigned 
long index, void *data) { filler_t *filler = (filler_t *)mapping->a_ops->readpage; - return read_cache_page_async(mapping, index, filler, data); + return __read_cache_page(mapping, index, filler, data); } static inline struct page *read_mapping_page(struct address_space *mapping, @@ -132,6 +141,36 @@ static inline struct page *read_mapping_ return read_cache_page(mapping, index, filler, data); } +static inline struct page *__read_kmap_page(struct address_space *mapping, + unsigned long index) +{ + struct page *page = __read_mapping_page(mapping, index, NULL); + if (!IS_ERR(page)) + kmap(page); + return page; +} + +static inline struct page *read_kmap_page(struct address_space *mapping, + unsigned long index) +{ + struct page *page = read_mapping_page(mapping, index, NULL); + if (!IS_ERR(page)) + kmap(page); + return page; +} + +static inline void put_kmapped_page(struct page *page) +{ + kunmap(page); + page_cache_release(page); +} + +static inline void put_locked_page(struct page *page) +{ + unlock_page(page); + put_kmapped_page(page); +} + int add_to_page_cache(struct page *page, struct address_space *mapping, unsigned long index, gfp_t gfp_mask); int add_to_page_cache_lru(struct page *page, struct address_space *mapping, diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/mm/filemap.c linux-2.6.21-rc6-mm1-test/mm/filemap.c --- linux-2.6.21-rc6-mm1/mm/filemap.c 2007-04-11 14:26:42.0 -0700 +++ linux-2.6.21-rc6-mm1-test/mm/filemap.c 2007-04-10 21:46:03.0 -0700 @@ -1600,115 +1600,53 @@ int generic_file_readonly_mmap(struct fi EXPORT_SYMBOL(generic_file_mmap); EXPORT_SYMBOL(generic_file_readonly_mmap); -static struct page *__read_cache_page(struct address_space *mapping, - unsigned long index, - int (*filler)(void *,struct page*), - void *data) -{ - struct page *page, *cached_page = NULL; - int err; -repeat: - page = find_get_page(mapping, index); - if (!page) { - if (!cached_page) { - cached_page = page_cache_alloc_cold(mapping); - if (!cached_page) - return 
ERR_PTR(-ENOMEM); - } - err = add_to_page_cache_lru(cached_page, mapping, - index,
[PATCH 3/17] afs: convert afs_dir_get_page to read_kmap_page
Replace afs_dir_get_page() and afs_dir_put_page() using the new read_kmap_page() and put_kmapped_page() calls, and eliminate unnecessary PageError checks. Also, change the afs_dir_check_page() call to return the page's error status, and update the call site accordingly. Signed-off-by: Nate Diller <[EMAIL PROTECTED]> --- diff -urpN -X dontdiff linux-2.6.21-rc5-mm4/fs/afs/dir.c linux-2.6.21-rc5-mm4-test/fs/afs/dir.c --- linux-2.6.21-rc5-mm4/fs/afs/dir.c 2007-04-06 12:27:03.0 -0700 +++ linux-2.6.21-rc5-mm4-test/fs/afs/dir.c 2007-04-06 14:30:22.0 -0700 @@ -115,12 +115,15 @@ struct afs_dir_lookup_cookie { /* * check that a directory page is valid */ -static inline void afs_dir_check_page(struct inode *dir, struct page *page) +static inline int afs_dir_check_page(struct inode *dir, struct page *page) { struct afs_dir_page *dbuf; loff_t latter; int tmp, qty; + if (likely(PageChecked(page))) + return PageError(page); + #if 0 /* check the page count */ qty = desc.size / sizeof(dbuf->blocks[0]); @@ -154,52 +157,16 @@ static inline void afs_dir_check_page(st } SetPageChecked(page); - return; + return 0; error: SetPageChecked(page); SetPageError(page); - + return 1; } /* end afs_dir_check_page() */ /*/ /* - * discard a page cached in the pagecache - */ -static inline void afs_dir_put_page(struct page *page) -{ - kunmap(page); - page_cache_release(page); - -} /* end afs_dir_put_page() */ - -/*/ -/* - * get a page into the pagecache - */ -static struct page *afs_dir_get_page(struct inode *dir, unsigned long index) -{ - struct page *page; - - _enter("{%lu},%lu", dir->i_ino, index); - - page = read_mapping_page(dir->i_mapping, index, NULL); - if (!IS_ERR(page)) { - kmap(page); - if (!PageChecked(page)) - afs_dir_check_page(dir, page); - if (PageError(page)) - goto fail; - } - return page; - - fail: - afs_dir_put_page(page); - return ERR_PTR(-EIO); -} /* end afs_dir_get_page() */ - -/*/ -/* * open an AFS directory file */ static int afs_dir_open(struct inode *inode, struct file 
*file) @@ -344,11 +311,16 @@ static int afs_dir_iterate(struct inode blkoff = *fpos & ~(sizeof(union afs_dir_block) - 1); /* fetch the appropriate page from the directory */ - page = afs_dir_get_page(dir, blkoff / PAGE_SIZE); + page = read_kmap_page(dir->i_mapping, blkoff / PAGE_SIZE); if (IS_ERR(page)) { ret = PTR_ERR(page); break; } + if (afs_check_page(dir, page)) { + err = -EIO; + put_kmapped_page(page); + break; + } limit = blkoff & ~(PAGE_SIZE - 1); @@ -361,7 +333,7 @@ static int afs_dir_iterate(struct inode ret = afs_dir_iterate_block(fpos, dblock, blkoff, cookie, filldir); if (ret != 1) { - afs_dir_put_page(page); + put_kmapped_page(page); goto out; } @@ -369,7 +341,7 @@ static int afs_dir_iterate(struct inode } while (*fpos < dir->i_size && blkoff < limit); - afs_dir_put_page(page); + put_kmapped_page(page); ret = 0; } diff -urpN -X dontdiff linux-2.6.21-rc6-mm1/fs/afs/mntpt.c linux-2.6.21-rc6-mm1-test/fs/afs/mntpt.c --- linux-2.6.21-rc6-mm1/fs/afs/mntpt.c 2007-04-09 17:24:03.0 -0700 +++ linux-2.6.21-rc6-mm1-test/fs/afs/mntpt.c2007-04-10 21:22:07.0 -0700 @@ -74,11 +74,6 @@ int afs_mntpt_check_symlink(struct afs_v ret = PTR_ERR(page); goto out; } - - ret = -EIO; - if (PageError(page)) - goto out_free; - buf = kmap(page); /* examine the symlink's contents */ @@ -98,7 +93,6 @@ int afs_mntpt_check_symlink(struct afs_v ret = 0; kunmap(page); - out_free: page_cache_release(page); out: _leave(" = %d", ret); @@ -180,10 +174,6 @@ static struct vfsmount *afs_mntpt_do_aut goto error; } - ret = -EIO; - if (PageError(page)) - goto error; - buf = kmap(page); memcpy(devname, buf, size); kunmap(page); - To unsubscribe from this list: send the line "unsubscribe linux-kernel"
[PATCH 0/17] fs: cleanup single page synchronous read interface
Nick Piggin recently changed the read_cache_page interface to be
synchronous, which is pretty much what the file systems want anyway.
Turns out that they have more in common than that, though, and some of
them want to be able to get an uptodate *locked* page. Many of them want
a kmapped page, which is uptodate and unlocked, and they all have their
own individual helper functions to achieve this.

Since the helper functions are so similar, this patch just combines them
into a small number of simple library functions, which call
read_cache_page (renamed to __read_cache_page because it now returns a
locked page).

The immediate result is a vast reduction in the number of fs-specific
helper functions. The secondary goal is to reduce the number of places
the page lock is taken, and eliminate a lot of PageUptodate and PageError
checks.

The file systems that still use PageChecked now have checker functions
that return an error if the page is corrupted or has some other error.
This simplifies the logic since the checker function is not part of any
helper function anymore.

Compile tested on x86_64.

Signed-off-by: Nate Diller <[EMAIL PROTECTED]>

---

 drivers/mtd/devices/block2mtd.c          |  28 +--
 fs/afs/dir.c                             |  56 +++---
 fs/afs/mntpt.c                           |  10 --
 fs/cramfs/inode.c                        |   3
 fs/ext2/dir.c                            |  82 -
 fs/freevxfs/vxfs_extern.h                |   1
 fs/freevxfs/vxfs_inode.c                 |   2
 fs/freevxfs/vxfs_lookup.c                |   4 -
 fs/freevxfs/vxfs_subr.c                  |  33
 fs/hfs/bnode.c                           |   4 -
 fs/hfsplus/bnode.c                       |   4 -
 fs/jffs2/fs.c                            |  27 ---
 fs/jffs2/gc.c                            |  15 ++-
 fs/jfs/jfs_metapage.c                    |   5 -
 fs/minix/dir.c                           |  59 ---
 fs/ntfs/aops.h                           |  67 -
 fs/ntfs/bitmap.c                         |   8 +-
 fs/ntfs/dir.c                            |  65 ++---
 fs/ntfs/index.c                          |  12 +--
 fs/ntfs/lcnalloc.c                       |   6 -
 fs/ntfs/logfile.c                        |  12 +--
 fs/ntfs/mft.c                            |  53 +
 fs/ntfs/super.c                          |  38 -
 fs/ntfs/usnjrnl.c                        |   4 -
 fs/partitions/check.c                    |  14 +--
 fs/reiser4/plugin/file/tail_conversion.c |   8 --
 fs/reiser4/plugin/item/extent_file_ops.c |   9 --
 fs/reiserfs/xattr.c                      |  48 ++--
 fs/sysv/dir.c                            |  19 +---
 fs/ufs/balloc.c                          |   8 +-
 fs/ufs/dir.c                             |  90 +--
 fs/ufs/truncate.c                        |   8 +-
 fs/ufs/util.c                            |  52 -
 fs/ufs/util.h                            |  10 --
 include/linux/pagemap.h                  |  53 -
 mm/filemap.c                             | 118 +++
 36 files changed, 315 insertions(+), 720 deletions(-)
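The pairing between the new read helpers and their put counterparts can be sketched with a small userspace model. This is only an illustration, not kernel code: struct toy_page and every toy_* function are hypothetical stand-ins that track a reference count, a lock flag and a kmap count, and the error paths (IS_ERR) of the real helpers are left out because the toy read never fails.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page: tracks only the state the helpers touch. */
struct toy_page {
	int refcount;	/* page cache references held by callers */
	int locked;	/* 1 while the page lock is held */
	int kmapped;	/* outstanding kmap() calls */
};

/* Models __read_cache_page(): returns a referenced, locked page. */
static struct toy_page *toy_read_cache_page_locked(struct toy_page *pg)
{
	assert(pg != NULL && !pg->locked);
	pg->refcount++;
	pg->locked = 1;
	return pg;
}

/* Models read_cache_page(): same, but drops the lock before returning. */
static struct toy_page *toy_read_cache_page(struct toy_page *pg)
{
	struct toy_page *page = toy_read_cache_page_locked(pg);

	page->locked = 0;
	return page;
}

/* Models __read_kmap_page(): referenced, locked and kmapped. */
static struct toy_page *toy_read_kmap_page_locked(struct toy_page *pg)
{
	struct toy_page *page = toy_read_cache_page_locked(pg);

	page->kmapped++;
	return page;
}

/* Models read_kmap_page(): referenced and kmapped, but unlocked. */
static struct toy_page *toy_read_kmap_page(struct toy_page *pg)
{
	struct toy_page *page = toy_read_cache_page(pg);

	page->kmapped++;
	return page;
}

/* Models put_kmapped_page(): kunmap() then page_cache_release(). */
static void toy_put_kmapped_page(struct toy_page *page)
{
	assert(page->kmapped > 0 && page->refcount > 0);
	page->kmapped--;
	page->refcount--;
}

/* Models put_locked_page(): unlock_page() then put_kmapped_page(). */
static void toy_put_locked_page(struct toy_page *page)
{
	assert(page->locked);
	page->locked = 0;
	toy_put_kmapped_page(page);
}
```

In this model a page obtained from toy_read_kmap_page() is released with toy_put_kmapped_page(), while one obtained from toy_read_kmap_page_locked() is released with toy_put_locked_page(); mixing the two would leave the lock flag in the wrong state, which is the bookkeeping the series moves out of the individual file systems.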
Re: If not readdir() then what?
On Thu, 12 April 2007 11:46:41 +1000, Neil Brown wrote:
>
> I could argue that nfs came before ext3+dirindex, so ext3 should have
> been designed to work properly with NFS. You could argue that fixing
> it in nfsd fixes it for all filesystems. But I'm not sure either of
> those arguments are likely to be at all convincing...

Caring about a non-ext3 filesystem, I sure would like an nfs solution as
well. :)

> Hmmm. I wonder. Which is more likely?
> - That two 64bit hashes from some set are the same
> - or that 65536 48bit hashes from a set of equal size are the same.

The former. Each bit going from hash strength to collision chain length
reduces the likelihood of an overflow. In the extreme case of a 0bit
hash and 64bit collision chain, you need 2^64 entries compared to 2^32
for the other extreme.

However, the collision chain gives me quite a bit of headache. One
would have to store each entry's position on the chain, deal with older
entries getting deleted, newer entries getting removed, etc. All this
requires a lot of complicated code that basically never gets tested in
the wild.

Just settling for a 64bit hash and returning -EEXIST when someone causes
a collision on creat() sounds more appealing. Directories with 4 billion
entries will cause problems, but that is hardly news to anyone.

Jörn

-- 
Fantasy is more important than knowledge. Knowledge is limited,
while fantasy embraces the whole world.
-- Albert Einstein
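The "two 64bit hashes are the same" case in this exchange can be put into numbers with a rough estimate. The sketch below is an assumption-laden illustration, not anything from the thread: it treats hash values as uniformly random and uses the simple union bound (number of pairs times the per-pair collision chance), which overestimates slightly for large directories. It does not model the 65536-entry collision chain case. The function names are made up for the example.

```c
#include <assert.h>

/* 2^bits as a double, built by repeated doubling so no libm is needed. */
static double two_to_the(unsigned bits)
{
	double v = 1.0;

	while (bits--)
		v *= 2.0;
	return v;
}

/*
 * Union-bound estimate of the chance that at least two of n uniformly
 * random hash values of the given width coincide: there are n*(n-1)/2
 * pairs, and each pair collides with probability 2^-bits. The result
 * is capped at 1.0 because the bound is not tight for large n.
 */
static double pair_collision_bound(double n, unsigned bits)
{
	double p;

	assert(n >= 0.0);
	if (n < 2.0)
		return 0.0;
	p = n * (n - 1.0) / 2.0 / two_to_the(bits);
	return p > 1.0 ? 1.0 : p;
}
```

Under these assumptions, pair_collision_bound(4294967296.0, 64) comes out just under 0.5, which lines up with the remark that a 64bit hash starts to cause trouble at around 2^32 (4 billion) directory entries.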
Announce: New release of Linux-ready Firmware Dev Kit - Release 2
The Linux-ready Firmware Developer Kit team is pleased to announce
release R2 of the kit. This release consists mostly of bug fixes, an
infrastructure reorganization that makes it easier for outside developers
to write and contribute plugins, and of course, tons of documentation. A
few new tests and features have been added, including things you've asked
for such as ssh-upload and a text-based version of the results.

The Linux-ready Firmware Developer Kit is a tool to test how well Linux
works together with the firmware (BIOS or EFI) of your machine, and is
designed for use by both firmware development teams and Linux kernel
hackers to prevent and diagnose firmware bugs.

Summary
=======

Enhancements
============
 * ssh upload of results
 * globalized DSDT & SSDT lists for standalone and .so plugins
 * better logically implemented dmesg and e820 functions
 * text-based results
 * documentation of each plugin and the meaning of its results
   (Documentation/TestsInfo)
 * bug-fixes

New Tests
=========
 * ia64 error injection tool
 * fan test (now functional)
 * SUN test (now functional)
 * ebda test
 * cpufreq: added test for Ingo's _PSS bug
 * dmesg: added detection for bugzilla.kernel.org bug 6859

You can download this latest release of the kit from
http://www.linuxfirmwarekit.org

The Linux-ready Firmware Developer Kit team
Jacob Pan
Rolla Selbak
Arjan van de Ven

[Please cc my email on any replies or comments]

Thanks,
rs
[PATCH 7/12] get_unmapped_area handles MAP_FIXED on parisc
Handle MAP_FIXED in parisc arch_get_unmapped_area(), just return the
address. We might want to also check for possible cache aliasing issues
now that we get called in that case (like ARM or MIPS), leave a comment
for the maintainers to pick up.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/parisc/kernel/sys_parisc.c |    5 +++++
 1 file changed, 5 insertions(+)

Index: linux-cell/arch/parisc/kernel/sys_parisc.c
===================================================================
--- linux-cell.orig/arch/parisc/kernel/sys_parisc.c	2007-03-22 15:28:05.0 +1100
+++ linux-cell/arch/parisc/kernel/sys_parisc.c	2007-03-22 15:29:08.0 +1100
@@ -106,6 +106,11 @@ unsigned long arch_get_unmapped_area(str
 {
 	if (len > TASK_SIZE)
 		return -ENOMEM;
+	/* Might want to check for cache aliasing issues for MAP_FIXED case
+	 * like ARM or MIPS ??? --BenH.
+	 */
+	if (flags & MAP_FIXED)
+		return addr;
 	if (!addr)
 		addr = TASK_UNMAPPED_BASE;
 
[PATCH 6/12] get_unmapped_area handles MAP_FIXED on ia64
Handle MAP_FIXED in ia64 arch_get_unmapped_area() and hugetlb_get_unmapped_area(): just call prepare_hugepage_range() in the latter and is_hugepage_only_range() in the former.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/ia64/kernel/sys_ia64.c |    7 +++++++
 arch/ia64/mm/hugetlbpage.c  |    8 ++++++++
 2 files changed, 15 insertions(+)

Index: linux-cell/arch/ia64/kernel/sys_ia64.c
===================================================================
--- linux-cell.orig/arch/ia64/kernel/sys_ia64.c	2007-03-22 15:10:45.0 +1100
+++ linux-cell/arch/ia64/kernel/sys_ia64.c	2007-03-22 15:10:47.0 +1100
@@ -33,6 +33,13 @@ arch_get_unmapped_area (struct file *fil
 	if (len > RGN_MAP_LIMIT)
 		return -ENOMEM;
 
+	/* handle fixed mapping: prevent overlap with huge pages */
+	if (flags & MAP_FIXED) {
+		if (is_hugepage_only_range(mm, addr, len))
+			return -EINVAL;
+		return addr;
+	}
+
 #ifdef CONFIG_HUGETLB_PAGE
 	if (REGION_NUMBER(addr) == RGN_HPAGE)
 		addr = 0;
Index: linux-cell/arch/ia64/mm/hugetlbpage.c
===================================================================
--- linux-cell.orig/arch/ia64/mm/hugetlbpage.c	2007-03-22 15:12:32.0 +1100
+++ linux-cell/arch/ia64/mm/hugetlbpage.c	2007-03-22 15:12:39.0 +1100
@@ -148,6 +148,14 @@ unsigned long hugetlb_get_unmapped_area(
 		return -ENOMEM;
 	if (len & ~HPAGE_MASK)
 		return -EINVAL;
+
+	/* Handle MAP_FIXED */
+	if (flags & MAP_FIXED) {
+		if (prepare_hugepage_range(addr, len, pgoff))
+			return -EINVAL;
+		return addr;
+	}
+
 	/* This code assumes that RGN_HPAGE != 0. */
 	if ((REGION_NUMBER(addr) != RGN_HPAGE) || (addr & (HPAGE_SIZE - 1)))
 		addr = HPAGE_REGION_BASE;
[PATCH 5/12] get_unmapped_area handles MAP_FIXED on i386
Handle MAP_FIXED in i386 hugetlb_get_unmapped_area(), just call prepare_hugepage_range.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/i386/mm/hugetlbpage.c |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux-cell/arch/i386/mm/hugetlbpage.c
===================================================================
--- linux-cell.orig/arch/i386/mm/hugetlbpage.c	2007-03-22 16:08:12.0 +1100
+++ linux-cell/arch/i386/mm/hugetlbpage.c	2007-03-22 16:14:19.0 +1100
@@ -367,6 +367,12 @@ hugetlb_get_unmapped_area(struct file *f
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
+	if (flags & MAP_FIXED) {
+		if (prepare_hugepage_range(addr, len, pgoff))
+			return -EINVAL;
+		return addr;
+	}
+
 	if (addr) {
 		addr = ALIGN(addr, HPAGE_SIZE);
 		vma = find_vma(mm, addr);
[PATCH 4/12] get_unmapped_area handles MAP_FIXED on frv
Handle MAP_FIXED in arch_get_unmapped_area on frv. Trivial case, just return the address.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/frv/mm/elf-fdpic.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: linux-cell/arch/frv/mm/elf-fdpic.c
===================================================================
--- linux-cell.orig/arch/frv/mm/elf-fdpic.c	2007-03-22 15:00:50.0 +1100
+++ linux-cell/arch/frv/mm/elf-fdpic.c	2007-03-22 15:01:06.0 +1100
@@ -64,6 +64,10 @@ unsigned long arch_get_unmapped_area(str
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
+	/* handle MAP_FIXED */
+	if (flags & MAP_FIXED)
+		return addr;
+
 	/* only honour a hint if we're not going to clobber something doing so */
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
[PATCH 9/12] get_unmapped_area handles MAP_FIXED on x86_64
Handle MAP_FIXED in x86_64 arch_get_unmapped_area(), simple case, just return the address as passed in.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/x86_64/kernel/sys_x86_64.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-cell/arch/x86_64/kernel/sys_x86_64.c
===================================================================
--- linux-cell.orig/arch/x86_64/kernel/sys_x86_64.c	2007-03-22 16:10:10.0 +1100
+++ linux-cell/arch/x86_64/kernel/sys_x86_64.c	2007-03-22 16:11:06.0 +1100
@@ -93,6 +93,9 @@ arch_get_unmapped_area(struct file *filp
 	unsigned long start_addr;
 	unsigned long begin, end;
 
+	if (flags & MAP_FIXED)
+		return addr;
+
 	find_start_end(flags, &begin, &end);
 
 	if (len > end)
[PATCH 8/12] get_unmapped_area handles MAP_FIXED on sparc64
Handle MAP_FIXED in hugetlb_get_unmapped_area on sparc64 by just using prepare_hugepage_range().

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/sparc64/mm/hugetlbpage.c |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux-cell/arch/sparc64/mm/hugetlbpage.c
===================================================================
--- linux-cell.orig/arch/sparc64/mm/hugetlbpage.c	2007-03-22 16:12:57.0 +1100
+++ linux-cell/arch/sparc64/mm/hugetlbpage.c	2007-03-22 16:15:33.0 +1100
@@ -175,6 +175,12 @@ hugetlb_get_unmapped_area(struct file *f
 	if (len > task_size)
 		return -ENOMEM;
 
+	if (flags & MAP_FIXED) {
+		if (prepare_hugepage_range(addr, len, pgoff))
+			return -EINVAL;
+		return addr;
+	}
+
 	if (addr) {
 		addr = ALIGN(addr, HPAGE_SIZE);
 		vma = find_vma(mm, addr);
[PATCH 12/12] get_unmapped_area doesn't need hugetlbfs hacks anymore
Remove the hugetlbfs specific hacks in toplevel get_unmapped_area() now that all archs and hugetlbfs itself do the right thing for both cases.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 mm/mmap.c |   16 ----------------
 1 file changed, 16 deletions(-)

Index: linux-cell/mm/mmap.c
===================================================================
--- linux-cell.orig/mm/mmap.c	2007-04-12 12:14:46.0 +1000
+++ linux-cell/mm/mmap.c	2007-04-12 12:14:47.0 +1000
@@ -1381,22 +1381,6 @@ get_unmapped_area(struct file *file, uns
 	if (addr & ~PAGE_MASK)
 		return -EINVAL;
 
-	if (file && is_file_hugepages(file)) {
-		/*
-		 * Check if the given range is hugepage aligned, and
-		 * can be made suitable for hugepages.
-		 */
-		ret = prepare_hugepage_range(addr, len, pgoff);
-	} else {
-		/*
-		 * Ensure that a normal request is not falling in a
-		 * reserved hugepage range. For some archs like IA-64,
-		 * there is a separate region for hugepages.
-		 */
-		ret = is_hugepage_only_range(current->mm, addr, len);
-	}
-	if (ret)
-		return -EINVAL;
 	return addr;
 }
[PATCH 10/12] get_unmapped_area handles MAP_FIXED in hugetlbfs
Generic hugetlb_get_unmapped_area() now handles MAP_FIXED by just calling prepare_hugepage_range().

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 fs/hugetlbfs/inode.c |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux-cell/fs/hugetlbfs/inode.c
===================================================================
--- linux-cell.orig/fs/hugetlbfs/inode.c	2007-03-22 16:12:56.0 +1100
+++ linux-cell/fs/hugetlbfs/inode.c	2007-03-22 16:16:02.0 +1100
@@ -115,6 +115,12 @@ hugetlb_get_unmapped_area(struct file *f
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
+	if (flags & MAP_FIXED) {
+		if (prepare_hugepage_range(addr, len, pgoff))
+			return -EINVAL;
+		return addr;
+	}
+
 	if (addr) {
 		addr = ALIGN(addr, HPAGE_SIZE);
 		vma = find_vma(mm, addr);
[PATCH 11/12] get_unmapped_area handles MAP_FIXED in generic code
generic arch_get_unmapped_area() now handles MAP_FIXED. Now that all implementations have been fixed, change the toplevel get_unmapped_area() to call into arch or drivers for the MAP_FIXED case.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 mm/mmap.c |   25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

Index: linux-cell/mm/mmap.c
===================================================================
--- linux-cell.orig/mm/mmap.c	2007-03-22 16:29:22.0 +1100
+++ linux-cell/mm/mmap.c	2007-03-22 16:30:06.0 +1100
@@ -1199,6 +1199,9 @@ arch_get_unmapped_area(struct file *filp
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
+	if (flags & MAP_FIXED)
+		return addr;
+
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
@@ -1272,6 +1275,9 @@ arch_get_unmapped_area_topdown(struct fi
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
+	if (flags & MAP_FIXED)
+		return addr;
+
 	/* requesting a specific address */
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
@@ -1360,22 +1366,21 @@ get_unmapped_area(struct file *file, uns
 		unsigned long pgoff, unsigned long flags)
 {
 	unsigned long ret;
+	unsigned long (*get_area)(struct file *, unsigned long,
+				  unsigned long, unsigned long, unsigned long);
 
-	if (!(flags & MAP_FIXED)) {
-		unsigned long (*get_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
-
-		get_area = current->mm->get_unmapped_area;
-		if (file && file->f_op && file->f_op->get_unmapped_area)
-			get_area = file->f_op->get_unmapped_area;
-		addr = get_area(file, addr, len, pgoff, flags);
-		if (IS_ERR_VALUE(addr))
-			return addr;
-	}
+	get_area = current->mm->get_unmapped_area;
+	if (file && file->f_op && file->f_op->get_unmapped_area)
+		get_area = file->f_op->get_unmapped_area;
+	addr = get_area(file, addr, len, pgoff, flags);
+	if (IS_ERR_VALUE(addr))
+		return addr;
 
 	if (addr > TASK_SIZE - len)
 		return -ENOMEM;
 	if (addr & ~PAGE_MASK)
 		return -EINVAL;
+
 	if (file && is_file_hugepages(file)) {
 		/*
 		 * Check if the given range is hugepage aligned, and
[PATCH 2/12] get_unmapped_area handles MAP_FIXED on alpha
Handle MAP_FIXED in alpha's arch_get_unmapped_area(), simple case, just return the address as passed in.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/alpha/kernel/osf_sys.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-cell/arch/alpha/kernel/osf_sys.c
===================================================================
--- linux-cell.orig/arch/alpha/kernel/osf_sys.c	2007-03-22 14:58:33.0 +1100
+++ linux-cell/arch/alpha/kernel/osf_sys.c	2007-03-22 14:58:44.0 +1100
@@ -1267,6 +1267,9 @@ arch_get_unmapped_area(struct file *filp
 	if (len > limit)
 		return -ENOMEM;
 
+	if (flags & MAP_FIXED)
+		return addr;
+
 	/* First, see if the given suggestion fits.
 
 	   The OSF/1 loader (/sbin/loader) relies on us returning an
[PATCH 3/12] get_unmapped_area handles MAP_FIXED on arm
ARM already had a case for MAP_FIXED in arch_get_unmapped_area() though it was not called before. Fix the comment to reflect that it will now be called.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/arm/mm/mmap.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: linux-cell/arch/arm/mm/mmap.c
===================================================================
--- linux-cell.orig/arch/arm/mm/mmap.c	2007-03-22 14:59:51.0 +1100
+++ linux-cell/arch/arm/mm/mmap.c	2007-03-22 15:00:01.0 +1100
@@ -49,8 +49,7 @@ arch_get_unmapped_area(struct file *filp
 #endif
 
 	/*
-	 * We should enforce the MAP_FIXED case. However, currently
-	 * the generic kernel code doesn't allow us to handle this.
+	 * We enforce the MAP_FIXED case.
 	 */
 	if (flags & MAP_FIXED) {
 		if (aliasing && flags & MAP_SHARED && addr & (SHMLBA - 1))
[PATCH 1/12] get_unmapped_area handles MAP_FIXED on powerpc
Handle MAP_FIXED in powerpc's arch_get_unmapped_area() in all 3 implementations of it.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>

 arch/powerpc/mm/hugetlbpage.c |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Index: linux-cell/arch/powerpc/mm/hugetlbpage.c
===================================================================
--- linux-cell.orig/arch/powerpc/mm/hugetlbpage.c	2007-03-22 14:52:07.0 +1100
+++ linux-cell/arch/powerpc/mm/hugetlbpage.c	2007-03-22 14:57:40.0 +1100
@@ -572,6 +572,13 @@ unsigned long arch_get_unmapped_area(str
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
+	/* handle fixed mapping: prevent overlap with huge pages */
+	if (flags & MAP_FIXED) {
+		if (is_hugepage_only_range(mm, addr, len))
+			return -EINVAL;
+		return addr;
+	}
+
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
@@ -647,6 +654,13 @@ arch_get_unmapped_area_topdown(struct fi
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
+	/* handle fixed mapping: prevent overlap with huge pages */
+	if (flags & MAP_FIXED) {
+		if (is_hugepage_only_range(mm, addr, len))
+			return -EINVAL;
+		return addr;
+	}
+
 	/* dont allow allocations above current base */
 	if (mm->free_area_cache > base)
 		mm->free_area_cache = base;
@@ -829,6 +843,13 @@ unsigned long hugetlb_get_unmapped_area(
 	/* Paranoia, caller should have dealt with this */
 	BUG_ON((addr + len) < addr);
 
+	/* Handle MAP_FIXED */
+	if (flags & MAP_FIXED) {
+		if (prepare_hugepage_range(addr, len, pgoff))
+			return -EINVAL;
+		return addr;
+	}
+
 	if (test_thread_flag(TIF_32BIT)) {
 		curareas = current->mm->context.low_htlb_areas;
[PATCH 0/12] Pass MAP_FIXED down to get_unmapped_area
This is a "first step" as there are still cleanups to be done in various areas touched by that code but I think it's probably good to go as is and at least enables me to implement what I need for PowerPC.

(Andrew, this is also a candidate for 2.6.22 since I haven't had any real objections, mostly suggestions for improving further, which I'll try to do later, and I have further powerpc patches that rely on this).

The current get_unmapped_area code calls the f_ops->get_unmapped_area or the arch one (via the mm) only when MAP_FIXED is not passed. That makes it impossible for archs to impose proper constraints on regions of the virtual address space. To work around that, get_unmapped_area() then calls some hugetlbfs specific hacks.

This causes several problems, among others:

 - It makes it impossible for a driver or filesystem to do the same thing that hugetlbfs does (for example, to allow a driver to use larger page sizes to map external hardware) if that requires applying a constraint on the addresses (constraining that mapping in certain regions and other mappings out of those regions).

 - Some archs like arm, mips, sparc, sparc64, sh and sh64 already want MAP_FIXED to be passed down in order to deal with aliasing issues. The code is there to handle it... but is never called.

This series of patches moves the logic to handle MAP_FIXED down to the various arch/driver get_unmapped_area() implementations, and then changes the generic code to always call them. The hugetlbfs hacks then disappear from the generic code.

Since I need to do some special 64K pages mappings for SPEs on cell, I need to work around the first problem at least. I have further patches thus implementing a "slices" layer that handles multiple page sizes through slices of the address space for use by hugetlbfs, the SPE code, and possibly others, but it requires that series of patches first.

There is still a potential (but not practical) issue due to the fact that filesystems/drivers implementing g_u_a will effectively bypass all arch checks. This is not an issue in practice as the only filesystems/drivers using that hook are doing so for arch specific purposes in the first place.

There is also a problem with mremap that will completely bypass all arch checks. I'll try to address that separately, I'm not 100% certain yet how, possibly by making it not work when the vma has a file whose f_ops has a get_unmapped_area callback, and by making it use is_hugepage_only_range() before expanding into a new area.

Also, I want to turn is_hugepage_only_range() into a more generic is_normal_page_range() as that's really what it will end up meaning when used in stack grow, brk grow and mremap.

None of the above "issues" however are introduced by this patch, they are already there, so I think the patch can go in.

Cheers,
Ben.
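[Illustration, not part of the posted series.] The dispatch change described in this cover letter can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: every demo_* name, the flag value, the address-space limit and the reserved region are hypothetical stand-ins, not kernel definitions. It shows the hook being called unconditionally, so that an arch-style hook gets the chance to veto a fixed request that overlaps a region it wants to keep for itself.

```c
#include <assert.h>
#include <stddef.h>

/* All values below are hypothetical and exist only for this sketch. */
#define DEMO_MAP_FIXED      0x10UL
#define DEMO_TASK_SIZE      0x10000000UL
#define DEMO_PAGE_SIZE      0x1000UL
#define DEMO_RESERVED_START 0x08000000UL  /* region the "arch" reserves */
#define DEMO_RESERVED_END   0x09000000UL
#define DEMO_ENOMEM         12L
#define DEMO_EINVAL         22L

typedef unsigned long (*demo_get_area_fn)(unsigned long addr,
					  unsigned long len,
					  unsigned long flags);

/* Same idea as the kernel's IS_ERR_VALUE(): small negatives are errors. */
static int demo_is_err(unsigned long v)
{
	return v >= (unsigned long)-4095L;
}

/* Arch-style hook: it now sees fixed requests and may reject them. */
static unsigned long demo_arch_get_area(unsigned long addr,
					unsigned long len,
					unsigned long flags)
{
	if (len > DEMO_TASK_SIZE)
		return (unsigned long)-DEMO_ENOMEM;
	if (flags & DEMO_MAP_FIXED) {
		if (addr < DEMO_RESERVED_END &&
		    addr + len > DEMO_RESERVED_START)
			return (unsigned long)-DEMO_EINVAL;
		return addr;
	}
	/* Non-fixed case: a real hook would search for a free range. */
	return DEMO_PAGE_SIZE;
}

/* Top-level entry: the hook is called whether or not the flag is set. */
static unsigned long demo_get_unmapped_area(demo_get_area_fn hook,
					    unsigned long addr,
					    unsigned long len,
					    unsigned long flags)
{
	assert(hook != NULL);
	addr = hook(addr, len, flags);
	if (demo_is_err(addr))
		return addr;
	if (addr > DEMO_TASK_SIZE - len)
		return (unsigned long)-DEMO_ENOMEM;
	if (addr & (DEMO_PAGE_SIZE - 1))
		return (unsigned long)-DEMO_EINVAL;
	return addr;
}
```

With the old behaviour (hook skipped for fixed requests) the reserved-region check in the hook could never run, which is the constraint problem the cover letter describes.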
Re: [PATCH 7/8] Clean up workqueue.c with respect to the freezer based cpu-hotplug
On Tue, Apr 03, 2007 at 10:48:20PM +0530, Srivatsa Vaddagiri wrote:
> > Actually, we should do this before destroy_workqueue() calls
> > flush_workqueue().
> > Otherwise flush_cpu_workqueue() can hang forever in a similar manner.
>
> Yep. I guess these are a class of freezer deadlocks very similar to vfork
> parent waiting on child case. I get a feeling these should become common
> outside of kthread too (A waits on B for something, B gets frozen, which
> means A won't freeze causing freezer to fail). Can freezer detect this
> dependency somehow and thaw B automatically? Probably not that easy ..

I wonder if there is some value in "enforcing" an order in which processes get frozen, i.e. freeze A first before B. That may solve the deadlocks we have been discussing wrt kthread_stop and flush_workqueue as well.

The idea is similar to how deadlocks wrt multiple locks are solved - where an ordering is enforced. Take Lock A first before Lock B.

If process A waits on B (like in kthread_stop or flush_workqueue), then if we:

	1. Insert A and B in a list (freeze_me_first_list)

	2. Have freezer scan freeze_me_first_list before the master task-list, so that it:

		2a. "freezes A and waits for A to get frozen" first
		2b. "freezes B and waits for B to get frozen" next

then we would avoid the nastiness of "B getting frozen first and A doesn't freeze because of that" with fewer code changes?

--
Regards,
vatsa
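[Illustration only.] The ordering idea above can be sketched in plain C as a toy model. Nothing here is kernel code: the demo_* types and functions are hypothetical, "freezing" is just a flag, and there is no waiting or concurrency. The sketch only shows that scanning freeze_me_first_list before the master task list yields the order A, then B, regardless of where they sit in the master list.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_TASKS 8  /* arbitrary bound for this sketch */

struct demo_task {
	const char *name;
	int frozen;
};

struct demo_freezer {
	struct demo_task *first[DEMO_MAX_TASKS]; /* freeze_me_first_list */
	size_t nfirst;
	struct demo_task *order[DEMO_MAX_TASKS]; /* order tasks got frozen */
	size_t nfrozen;
};

static void demo_freezer_init(struct demo_freezer *f)
{
	f->nfirst = 0;
	f->nfrozen = 0;
}

/* Freeze one task once and record when it happened. */
static void demo_freeze_one(struct demo_freezer *f, struct demo_task *t)
{
	if (t->frozen)
		return;
	assert(f->nfrozen < DEMO_MAX_TASKS);
	t->frozen = 1;
	f->order[f->nfrozen++] = t;
}

/* Step 1: "waiter" is about to wait on "target": queue waiter, then target. */
static void demo_wait_on(struct demo_freezer *f, struct demo_task *waiter,
			 struct demo_task *target)
{
	assert(f->nfirst + 2 <= DEMO_MAX_TASKS);
	f->first[f->nfirst++] = waiter;
	f->first[f->nfirst++] = target;
}

/* Step 2: scan the priority list first, then the master task list. */
static void demo_freeze_all(struct demo_freezer *f,
			    struct demo_task **all, size_t n)
{
	size_t i;

	for (i = 0; i < f->nfirst; i++)
		demo_freeze_one(f, f->first[i]);
	for (i = 0; i < n; i++)
		demo_freeze_one(f, all[i]);
}
```

A real freezer would also have to remove entries once the wait finishes and cope with chains (A waits on B, B waits on C); the sketch leaves both out.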
[ANNOUNCE] GIT 1.5.1.1
The latest maintenance release GIT 1.5.1.1 is available at the usual places: http://www.kernel.org/pub/software/scm/git/ git-1.5.1.1.tar.{gz,bz2} (tarball) git-htmldocs-1.5.1.1.tar.{gz,bz2} (preformatted docs) git-manpages-1.5.1.1.tar.{gz,bz2} (preformatted docs) RPMS/$arch/git-*-1.5.1.1-1.$arch.rpm (RPM) GIT v1.5.1.1 Release Notes == Fixes since v1.5.1 -- * Documentation updates - The --left-right option of rev-list and friends is documented. - The documentation for cvsimport has been majorly improved. - "git-show-ref --exclude-existing" was documented. * Bugfixes - The implementation of -p option in "git cvsexportcommit" had the meaning of -C (context reduction) option wrong, and loosened the context requirements when it was told to be strict. - "git cvsserver" did not behave like the real cvsserver when client side removed a file from the working tree without doing anything else on the path. In such a case, it should restore it from the checked out revision. - "git fsck" issued an alarming error message on detached HEAD. It is not an error since at least 1.5.0. - "git send-email" produced of References header of unbounded length; fixed this with line-folding. - "git archive" to download from remote site should not require you to be in a git repository, but it incorrectly did. - "git apply" ignored -p for "diff --git" formatted patches. - "git rerere" recorded a conflict that had one side empty (the other side adds) incorrectly; this made merging in the other direction fail to use previously recorded resolution. - t4200 test was broken where "wc -l" pads its output with spaces. - "git branch -m old new" to rename branch did not work without a configuration file in ".git/config". - The sample hook for notification e-mail was misnamed. - gitweb did not show type-changing patch correctly in the blobdiff view. - git-svn did not error out with incorrect command line options. - git-svn fell into an infinite loop when insanely long commit message was found. 
- git-svn dcommit and rebase was confused by patches that were merged from another branch that is managed by git-svn. Changes since v1.5.1 are as follows: Arjen Laarhoven (4): usermanual.txt: some capitalization nits t3200-branch.sh: small language nit t5300-pack-object.sh: portability issue using /usr/bin/stat Makefile: iconv() on Darwin has the old interface Brian Gernhardt (3): Fix t4200-rerere for white-space from "wc -l" Document --left-right option to rev-list. Distinguish branches by more than case in tests. Dana How (1): Fix lseek(2) calls with args 2 and 3 swapped Eric Wong (3): git-svn: bail out on incorrect command-line options git-svn: dcommit/rebase confused by patches with git-svn-id: lines git-svn: fix log command to avoid infinite loop on long commit messages Frank Lichtenheld (7): cvsimport: sync usage lines with existing options cvsimport: Improve documentation of CVSROOT and CVS module determination cvsimport: Improve usage error reporting cvsimport: Reorder options in documentation for better understanding cvsimport: Improve formating consistency cvsserver: small corrections to asciidoc documentation cvsserver: Fix handling of diappeared files on update Geert Bosch (1): Fix renaming branch without config file Gerrit Pape (1): rename contrib/hooks/post-receieve-email to contrib/hooks/post-receive-email. Jakub Narebski (1): gitweb: Fix bug in "blobdiff" view for split (e.g. file to symlink) patches Jim Meyering (1): (encode_85, decode_85): Mark source buffer pointer as "const". Julian Phillips (1): Documentation: show-ref: document --exclude-existing Junio C Hamano (7): rerere: make sorting really stable. Fix dependency of common-cmds.h Documentation: tighten dependency for git.{html,txt} Prepare for 1.5.1.1 Add Documentation/cmd-list.made to .gitignore fsck: do not complain on detached HEAD. 
GIT 1.5.1.1 Lars Hjemli (2): rename_ref(): only print a warning when config-file update fails Make builtin-branch.c handle the git config file René Scharfe (1): Revert "builtin-archive: use RUN_SETUP" Shawn O. Pearce (1): Honor -p when applying git diffs Tomash Brechko (1): cvsexportcommit -p : fix the usage of git-apply -C. Ville Skyttä (1): DESTDIR support for git/contrib/emacs YOSHIFUJI Hideaki (1): Avoid composing too long "References" header. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info
Re: [PATCH, take4] FUTEX : new PRIVATE futexes
Eric Dumazet wrote: On Wed, 11 Apr 2007 19:23:26 +1000 Nick Piggin <[EMAIL PROTECTED]> wrote: As this external thing certainly is not doing the check itself, to be on the safe side we should enforce it in get_futex_key(). I agree with you : If we want to maximize performance, we could say : The check *must* be done by the caller. Well we _control_ the API, so let's make it as clean and performant as possible from the start. Take a look at do_futex(). Adding checks in callers just increase code size. I tried this got only bad results. This would speedup only the slow path (ie when some user code want to give us non aligned addrs) A single factorized check is cleaner and not slower, since we reduce icache pressure. 1 extra check versus all that additional argument passing? I don't think it is conclusive. -- SUSE Labs, Novell Inc. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: If not readdir() then what?
On Wednesday April 11, [EMAIL PROTECTED] wrote:
>
> Actually, no, we can't keep the collision chain count stable across a
> create/delete even while the tree is cached. At least, not without
> storing a huge amount of state associated with each page. (It would
> be a lot more work than simply having nfsd keep a fd cache for
> directory streams ;-).

Well, there's the rub, isn't it :-) You think it is easier to fix the problem in nfsd, and I think it is easier to fix the problem in ext3. Both positions are quite understandable and quite genuine. And I am quite sure that all the issues that have been raised can be solved with a bit of effort providing the motivation is there.

I could argue that nfs came before ext3+dirindex, so ext3 should have been designed to work properly with NFS. You could argue that fixing it in nfsd fixes it for all filesystems. But I'm not sure either of those arguments are likely to be at all convincing...

Maybe the best compromise is to both fix the 'problem' :-?

Let me explore some designs a bit more..

NFS:
All we have to do is cache the open files. This should be only a performance issue, not a correctness issue (once we get 64bit cookies from ext3). So the important thing is to cache them for a reasonable period of time.

We currently cache the read-ahead info from regular files (though I was hoping that would go away when the page-cache-based readahead became a reality). We can reasonably replace this with caching the open files if we are going to do it for directories anyway.

So the primary key is "struct inode * + loff_t". This is suitable both for file-readahead and for ext3-directory-caching. Might also be useful for filesystems that store pre-allocation information in the struct-file.

We keep these in an LRU list and a hash table. We register a callback with register_shrinker (or whatever it is called today) so that VM pressure can shrink the cache, and also arrange a timer to remove entries older than -- say -- 5 seconds.
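[Illustration only.] A rough userspace sketch of the cache just described: entries keyed by (inode, offset), least-recently-used eviction, and an age limit. All demo_* names are hypothetical, a small array stands in for the LRU list plus hash table, an int stands in for the cached struct file, and the caller passes the current time instead of a timer firing. It is not nfsd code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS   4  /* tiny fixed table, just for the sketch */
#define DEMO_MAX_AGE 5  /* seconds, following the "say 5 seconds" above */

struct demo_entry {
	const void *inode;    /* stands in for struct inode * */
	long long   pos;      /* stands in for loff_t */
	int         handle;   /* stands in for the cached open file */
	long        last_used;
	int         in_use;
};

struct demo_cache {
	struct demo_entry slot[DEMO_SLOTS];
};

static void demo_cache_init(struct demo_cache *c)
{
	size_t i;

	for (i = 0; i < DEMO_SLOTS; i++)
		c->slot[i].in_use = 0;
}

/* Return the cached handle for (inode, pos), or -1 on a miss. */
static int demo_lookup(struct demo_cache *c, const void *inode,
		       long long pos, long now)
{
	size_t i;

	for (i = 0; i < DEMO_SLOTS; i++) {
		struct demo_entry *e = &c->slot[i];

		if (e->in_use && e->inode == inode && e->pos == pos) {
			e->last_used = now;  /* a hit refreshes the entry */
			return e->handle;
		}
	}
	return -1;
}

/* Insert, reusing a free slot or evicting the least recently used one. */
static void demo_insert(struct demo_cache *c, const void *inode,
			long long pos, int handle, long now)
{
	size_t i, victim = 0;

	assert(handle >= 0);  /* -1 is reserved for "miss" */
	for (i = 0; i < DEMO_SLOTS; i++) {
		if (!c->slot[i].in_use) {
			victim = i;
			break;
		}
		if (c->slot[i].last_used < c->slot[victim].last_used)
			victim = i;
	}
	c->slot[victim].inode = inode;
	c->slot[victim].pos = pos;
	c->slot[victim].handle = handle;
	c->slot[victim].last_used = now;
	c->slot[victim].in_use = 1;
}

/* Timer-style pass: drop entries idle for more than DEMO_MAX_AGE. */
static int demo_expire(struct demo_cache *c, long now)
{
	size_t i;
	int dropped = 0;

	for (i = 0; i < DEMO_SLOTS; i++) {
		struct demo_entry *e = &c->slot[i];

		if (e->in_use && now - e->last_used > DEMO_MAX_AGE) {
			e->in_use = 0;
			dropped++;
		}
	}
	return dropped;
}
```

A shrinker callback under VM pressure would amount to calling something like demo_expire() with a smaller age limit.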
I think that placing a fixed size on the cache based on number of active clients would be a mistake, as it is virtually impossible to know how many active clients there are, and the number can change very dynamically.

When a filesystem is un-exported (rare event), we walk the whole list and discard entries for that filesystem.

Look into the possibility of a callback on unlink to drop the cached open when the link count hits zero. I wonder if inotify or leases can help with that.

To help with large NUMA machines, we probably don't want a single hash LRU chain, but rather a number of LRU chains. That way the action of moving an entry to the end of the chain is less likely to conflict with another processor trying to do the same thing to a different entry. This is the sort of consideration that is already handled in the page cache, and having to handle it in every other cache is troublesome because the next time a need like that arises, the page cache will get fixed but other little caches won't until someone like Greg Banks comes along with a big hammer...

EXT3:
Have an rbtree for storing directory entries. This is attached to a page via the ->private page field. Normally each page of a directory has its own rbtree, but when two pages contain entries with the same hash, the one rbtree is shared between the pages. Thus when you load a block you must also load other blocks under the same hash, but I think you do that already.

When you split a block (because it has become too big) the rbtree attached to that block is dismantled and each entry is inserted into the appropriate new rbtree, one for each of the two blocks. The entries are unchanged - they just get placed in a different tree - so cursors in the struct file will still be valid.

Each entry has a count of the number of cursors pointing to it, and when this is non-zero, a refcount on the page is held, thus making sure the page doesn't get freed and the rbtree lost. The entry should possibly also contain a pointer to the page.. not sure if that is needed.

Each entry in the rbtree contains (in minor_hash) a sequence number that is used when multiple entries hash to the same value. We store a 'current-seq-number' in the root of the rbtree and when an attempt to insert an entry finds a collision, we increase current-seq-number, set the minor_hash to that, and retry the insert. This minor_hash is combined with some bits of the major hash to form the fpos/cookie.

The releasepage address_space_operation will check that all pages which share the same major hash are treated as a unit, all released at the same time. So it will fail if any of the pages in the group are in use. If they can all be freed, it will free the rbtree for that group.

This not only benefits nfsd, which opens
[PATCH] fix bogon in /dev/mem mmap'ing on nommu
While digging through my MAP_FIXED changes, I found a rather obvious bug in the /dev/mem mmap implementation for nommu archs. get_unmapped_area() is expected to return an address, not a pfn.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
---

I suppose that can go in anytime, and probably in stable too, Dave ?

Index: linux-cell/drivers/char/mem.c
===================================================================
--- linux-cell.orig/drivers/char/mem.c	2007-02-12 10:36:14.0 +1100
+++ linux-cell/drivers/char/mem.c	2007-04-12 11:38:44.0 +1000
@@ -248,7 +248,7 @@ static unsigned long get_unmapped_area_m
 {
 	if (!valid_mmap_phys_addr_range(pgoff, len))
 		return (unsigned long) -EINVAL;
-	return pgoff;
+	return pgoff << PAGE_SHIFT;
 }
 
 /* can't do an in-place private mapping if there's no MMU */
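[Illustration only.] The one-line fix above is a unit conversion: pgoff counts pages, while the caller expects a byte address. A minimal sketch, assuming 4 KiB pages; DEMO_PAGE_SHIFT and the demo_* helpers are hypothetical stand-ins (the kernel's PAGE_SHIFT is architecture-dependent):

```c
#include <assert.h>

/* Hypothetical stand-in for PAGE_SHIFT; 12 corresponds to 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12

/* Convert a page frame number (the mmap pgoff) into a byte address. */
static unsigned long demo_pfn_to_addr(unsigned long pgoff)
{
	/* Reject values whose shifted result would not fit. */
	assert(pgoff <= (~0UL >> DEMO_PAGE_SHIFT));
	return pgoff << DEMO_PAGE_SHIFT;
}

/* Inverse conversion: byte address back to page frame number. */
static unsigned long demo_addr_to_pfn(unsigned long addr)
{
	return addr >> DEMO_PAGE_SHIFT;
}
```

Returning the raw pgoff, as the old code did, would hand back a value 4096 times too small under this assumption.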
Re: [PATCH 9/10] Vmi timer update.patch
Chris Wright wrote:
* Zachary Amsden ([EMAIL PROTECTED]) wrote:

+void __init vmi_time_init(void)
+{
+	/* Disable PIT: BIOSes start PIT CH0 with 18.2hz periodic. */
+	outb_p(0x3a, PIT_MODE); /* binary, mode 5, LSB/MSB, ch 0 */

That shouldn't be necessary using clockevents.

Actually, I'm not so sure. If clockevents simply masks the PIT when disabling it, we still have overhead of keeping the latch in sync, which requires a timer at the PIT frequency. I can instrument to see how exactly the PIT gets disabled.

It should switch from pit to vmi-timer, and the switch should do the state transitions on pit to go to unused mode.

Ok, here's why we need it: the reason is even more basic. PIT clockevents never get set up; the time_init paravirt-op makes it conditional whether the PIT or VMI timer gets invoked. But our BIOS still sets it up to run at 18.2 Hz, like any good BIOS would.

We need the disable hack, in fact it is actually a good thing to do for native hardware. Why leave the PIT enabled with junk programming from the BIOS once we are in the protected mode kernel? Eventually, on hardware that doesn't want to use the PIT at all, this might be wanted to conserve power (casually joking but potentially correct argument).

Zach
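[Illustration only.] The comment next to outb_p(0x3a, PIT_MODE) can be checked by decoding the byte using the standard 8253/8254 command-byte layout (bits 7-6 counter select, bits 5-4 access mode, bits 3-1 operating mode, bit 0 BCD). The struct and function names below are hypothetical and exist only for this sketch:

```c
#include <assert.h>

/* Fields of an 8253/8254 PIT command byte (illustrative names). */
struct demo_pit_cmd {
	unsigned channel;  /* bits 7-6: counter select */
	unsigned access;   /* bits 5-4: 3 = low byte then high byte */
	unsigned mode;     /* bits 3-1: operating mode */
	unsigned bcd;      /* bit 0: 0 = binary, 1 = BCD */
};

static struct demo_pit_cmd demo_pit_decode(unsigned value)
{
	struct demo_pit_cmd c;

	assert(value <= 0xff);  /* the command register is 8 bits wide */
	c.channel = (value >> 6) & 0x3;
	c.access  = (value >> 4) & 0x3;
	c.mode    = (value >> 1) & 0x7;
	c.bcd     = value & 0x1;
	return c;
}
```

Decoding 0x3a this way gives channel 0, LSB/MSB access, mode 5 and binary counting, which matches the comment in the quoted patch.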
Re: Security computation within Linux kernel
Carlo Florendo wrote:
> IIRC, The kernel does some encryption functions, involving TCP, NFS, and
> IPsec since all these are part of the kernel itself.

Yes, but key management is done in userspace.

	-hpa
Re: Security computation within Linux kernel
JanuGerman wrote:

Hi everyone, I have one question regarding security libraries already shipped with the Linux kernel. That is, are all the PKI and RSA libraries provided by OpenSSL already integrated within the Linux kernel source code? Or does one have to use OpenSSL separately in this regard?

IIRC, the kernel does some encryption functions, involving TCP, NFS, and IPsec, since all these are part of the kernel itself. If you intend to write your own apps that have to use encryption functions, it would be best to use the relevant encryption libraries, such as OpenSSL.

Thank you very much.

Best Regards,
Carlo
--
Carlo Florendo
Software Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, Diliman 1101, Quezon City Philippines
http://www.astra.ph
--
The Astra Group of Companies
5-3-11 Sekido, Tama City Tokyo 206-0011, Japan
http://www.astra.co.jp
Re: "menu" versus "menuconfig" -- they're *both* a bad idea
Robert P. J. Day wrote:

(in short, if i, the builder, explicitly choose *not* to add a certain feature to my build, i think i have every right to expect that some other part of my configuration isn't quietly going to put some sub-choice of that feature back in behind my back.)

I agree with this. However, if a feature actually depends on another feature that was explicitly unselected, there should at least be a warning prompt that such is the case. It would probably be hard to track all dependencies, though.

Best Regards,
Carlo
--
Carlo Florendo
Software Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, Diliman 1101, Quezon City Philippines
http://www.astra.ph
--
The Astra Group of Companies
5-3-11 Sekido, Tama City Tokyo 206-0011, Japan
http://www.astra.co.jp
Re: [PATCH 1/7] [RFC] External power framework
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index 050323f..c546de3 100644

I forgot to pass the -s flag to git-format-patch. :-/ Please count it for the whole x/7 patch set:

Signed-off-by: Anton Vorontsov <[EMAIL PROTECTED]>
RE: [PATCH] kernel-doc: fix plist.h comments
>From: Randy Dunlap <[EMAIL PROTECTED]>
>
>Make kernel-doc comments match macro names.
>Correct parameter names in a few places.
>Remove '#' from beginning of kernel-doc comment macro names.
>Remove extra (erroneous) blank lines in kernel-doc.
>
> ...
>
>cc: Inaky Perez-Gonzalez <[EMAIL PROTECTED]>
>cc: Daniel Walker <[EMAIL PROTECTED]>
>cc: Thomas Gleixner <[EMAIL PROTECTED]>
>cc: Oleg Nesterov <[EMAIL PROTECTED]>
>
>Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

Acked-by: Inaky Perez-Gonzalez <[EMAIL PROTECTED]>
Re: [PATCH] i386 tsc: remove xtime_lock'ing around cpufreq notifier
On Wed, 11 Apr 2007 14:33:57 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote:

> > Here is the call ordering,
> >
> > ktime_get()
> > ktime_get_ts() -> read_seqretry(&xtime_lock, seq)
> > getnstimeofday()
> > __get_realtime_clock_ts() -> read_seqretry(&xtime_lock, seq)
> >
> >
> > I wonder if there is a weird case which causes this to loop forever .. But
> > as said, it's just something I noticed so I don't know if it's
> > related.
> >
>
> hm.
>
> Bear in mind that printk calls sched_clock() for each line of output.
> (with the "time" kernel boot parameter).
>
> If we're doing a read_seqretry() in sched_clock() then basically any printk
> inside the write_seqlock() will cause a lockup.
>
> So in fact, this explains my hang: I was debugging it with printk and I
> noticed that the printk before the write_seqlock() came out and the one
> after it did not. Presumably if I wasn't using "time", that hang wouldn't
> have happened.
>
> Which means that I still don't have a clue why Andi's patch is locking up
> the Vaio.
>
> It's a bad idea to make sched_clock() this complex - we've gone and
> degraded kernel debuggability somewhat.
>
> We have provision for fixing this: the architecture can provide its own
> printk_clock(). We should do something quick-n-dirty in printk_clock()
> which doesn't require any locks.
>

OK, so I resurrected x86_64-mm-sched-clock-share.patch and x86_64-mm-sched-clock64.patch.

The x86_64 box hangs on boot when using netconsole and printk timestamps too. Removing "time" from the kernel boot command line prevents that.

This explains why the hang only happens with x86_64-mm-log-reason-why-tsc-was-marked-unstable.patch applied, too: that patch must be triggering a printk inside xtime_lock.

Does someone want to cook up a lockless printk_clock() for i386 and x86_64?
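The lockup described above can be modelled in user space. The sketch below is a single-threaded toy sequence counter, not the kernel's seqlock implementation; every demo_* name is hypothetical, and the reader gives up after max_tries where a real reader would keep spinning.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy sequence counter: an odd value means a writer is in progress. */
struct demo_seqlock {
	unsigned seq;
};

static void demo_write_begin(struct demo_seqlock *s)
{
	assert((s->seq & 1) == 0);	/* no nested writers in this model */
	s->seq++;
}

static void demo_write_end(struct demo_seqlock *s)
{
	assert(s->seq & 1);		/* must pair with demo_write_begin() */
	s->seq++;
}

/* Try to take a consistent snapshot of *clock. A real reader retries
 * without limit; the bound exists only so the demo can report failure
 * instead of hanging. */
static bool demo_read_clock(const struct demo_seqlock *s, const long *clock,
			    long *out, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		unsigned start = s->seq;

		if (start & 1)
			continue;	/* writer active: retry */
		long snapshot = *clock;
		if (s->seq == start) {	/* no writer ran in between */
			*out = snapshot;
			return true;
		}
	}
	return false;
}
```

If the read is attempted between demo_write_begin() and demo_write_end() on the same CPU, which is the position of a timestamped printk inside the write side of xtime_lock, the counter stays odd and the read can never succeed. That is the motivation for a printk_clock() that takes no locks.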
[PATCH] kernel-doc: fix plist.h comments
From: Randy Dunlap <[EMAIL PROTECTED]> Make kernel-doc comments match macro names. Correct parameter names in a few places. Remove '#' from beginning of kernel-doc comment macro names. Remove extra (erroneous) blank lines in kernel-doc. Warning(plist.h:100): Cannot understand * #PLIST_HEAD_INIT - static struct plist_head initializer on line 100 - I thought it was a doc line Warning(plist.h:112): Cannot understand * #PLIST_NODE_INIT - static struct plist_node initializer on line 112 - I thought it was a doc line Warning(plist.h:103): No description found for parameter '_lock' Warning(plist.h:129): No description found for parameter 'lock' Warning(plist.h:158): No description found for parameter 'pos' Warning(plist.h:169): No description found for parameter 'pos' Warning(plist.h:169): No description found for parameter 'n' Warning(plist.h:179): No description found for parameter 'mem' This still leaves one warning & one error that need attention: Error(plist.h:219): cannot understand prototype: '(' Warning(plist.h): no structured comments found cc: Inaky Perez-Gonzalez <[EMAIL PROTECTED]> cc: Daniel Walker <[EMAIL PROTECTED]> cc: Thomas Gleixner <[EMAIL PROTECTED]> cc: Oleg Nesterov <[EMAIL PROTECTED]> Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]> --- include/linux/plist.h | 54 +- 1 file changed, 23 insertions(+), 31 deletions(-) --- linux-2621-rc6.orig/include/linux/plist.h +++ linux-2621-rc6/include/linux/plist.h @@ -97,9 +97,9 @@ struct plist_node { #endif /** - * #PLIST_HEAD_INIT - static struct plist_head initializer - * + * PLIST_HEAD_INIT - static struct plist_head initializer * @head: struct plist_head variable name + * @_lock: lock to initialize for this list */ #define PLIST_HEAD_INIT(head, _lock) \ { \ @@ -109,8 +109,7 @@ struct plist_node { } /** - * #PLIST_NODE_INIT - static struct plist_node initializer - * + * PLIST_NODE_INIT - static struct plist_node initializer * @node: struct plist_node variable name * @__prio:initial node priority */ @@ -122,8 
+121,8 @@ struct plist_node { /** * plist_head_init - dynamic struct plist_head initializer - * * @head: &struct plist_head pointer + * @lock: list spinlock, remembered for debugging */ static inline void plist_head_init(struct plist_head *head, spinlock_t *lock) @@ -137,7 +136,6 @@ plist_head_init(struct plist_head *head, /** * plist_node_init - Dynamic struct plist_node initializer - * * @node: &struct plist_node pointer * @prio: initial node priority */ @@ -152,49 +150,46 @@ extern void plist_del(struct plist_node /** * plist_for_each - iterate over the plist - * - * @pos1: the type * to use as a loop counter. - * @head: the head for your list. + * @pos: the type * to use as a loop counter + * @head: the head for your list */ #define plist_for_each(pos, head) \ list_for_each_entry(pos, &(head)->node_list, plist.node_list) /** - * plist_for_each_entry_safe - iterate over a plist of given type safe - * against removal of list entry + * plist_for_each_safe - iterate safely over a plist of given type + * @pos: the type * to use as a loop counter + * @n: another type * to use as temporary storage + * @head: the head for your list * - * @pos1: the type * to use as a loop counter. - * @n1:another type * to use as temporary storage - * @head: the head for your list. + * Iterate over a plist of given type, safe against removal of list entry. */ #define plist_for_each_safe(pos, n, head) \ list_for_each_entry_safe(pos, n, &(head)->node_list, plist.node_list) /** * plist_for_each_entry- iterate over list of given type - * - * @pos: the type * to use as a loop counter. - * @head: the head for your list. - * @member:the name of the list_struct within the struct. 
+ * @pos: the type * to use as a loop counter + * @head: the head for your list + * @mem: the name of the list_struct within the struct */ #define plist_for_each_entry(pos, head, mem) \ list_for_each_entry(pos, &(head)->node_list, mem.plist.node_list) /** - * plist_for_each_entry_safe - iterate over list of given type safe against - * removal of list entry - * - * @pos: the type * to use as a loop counter. + * plist_for_each_entry_safe - iterate safely over list of given type + * @pos: the type * to use as a loop counter * @n: another type * to use as temporary storage - * @head: the head for your list. - * @m: the name of the list_struct within the struct. + * @head: the head for your list + * @m: the name of the list_struct within the struct + * + * Iterate over list of given type, safe against removal of list entry. */ #define plist_for_each_entry_safe(pos, n, head, m) \ l
RE: Help Understanding Linux memory management
> 1) When physical memory runs low, the memory manager will try to use
> memory currently allocated to the pagecache. Is this true?

Yes.

> 2) When vm.overcommit_memory = 2 (overcommit disabled), and memory runs
> low, it appears that the memory manager does not try to use memory
> currently allocated to pagecache. Is this true?

It does try; that doesn't mean it will succeed. If overcommit is disabled, the OS must have enough (RAM+swap) to handle the maximum memory consumption it has allowed to take place. Perhaps you are laboring under the incorrect assumption that the pagecache can always be shrunk to zero? Not all the data in the pagecache is discardable. For example, any page that has been modified from its disk copy cannot be discarded.

> 3) Is it possible to disable the pagecache?

No, because huge amounts of capability would become impossible. It is not even clear to me how you could execute self-modifying code without a pagecache.

DS
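For question 2, the kernel's overcommit-accounting documentation describes the strict-mode limit as swap plus a configurable percentage of physical RAM (vm.overcommit_ratio, 50 by default). A simplified sketch of that arithmetic follows; it ignores huge pages and other reservations, and demo_commit_limit_kb() is an illustrative name rather than a kernel function.

```c
#include <assert.h>

/* Simplified commit limit for vm.overcommit_memory = 2, in KiB:
 * swap plus ratio_percent of RAM. Huge pages and other reservations
 * are deliberately left out of this sketch. */
static unsigned long long demo_commit_limit_kb(unsigned long long ram_kb,
					       unsigned long long swap_kb,
					       unsigned ratio_percent)
{
	unsigned long long limit = swap_kb + ram_kb * ratio_percent / 100;

	assert(limit >= swap_kb);	/* catches wraparound on absurd inputs */
	return limit;
}
```

Under these assumptions a machine with 1 GiB of RAM, 512 MiB of swap and the default ratio may commit about 1 GiB in total, no matter how much of its RAM currently holds discardable pagecache.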
Re: [PATCH] markers-linker-generic
Jim Keniston wrote:

On Wed, 2007-04-11 at 15:21 -0400, Mathieu Desnoyers wrote:
* Andrew Morton ([EMAIL PROTECTED]) wrote:
On Wed, 11 Apr 2007 13:51:11 -0400 Mathieu Desnoyers <[EMAIL PROTECTED]> wrote:

What's this marker stuff about?

Hi Russel, Here is an overview :

I am told that the systemtap developers plan to (or are) using this infrastructure. Quoting Frank Ch. Eigler, from the SystemTAP team : "The LTTng user-space programs use it today. Systemtap used to support the earlier marker prototype and will be rapidly ported over to this new API upon acceptance."

If correct: what is their reason for preferring it over kprobes?

Markers are not a substitute or preference over kprobes, they augment kprobes by enabling additional functionality.

I will let them answer on this one..

I'll take a shot at this one. First of all, kprobes remains a vital foundation for SystemTap. But markers are attractive as an alternate source of trace/debug info. Here's why:

1. Markers will live in the kernel and presumably be kept up to date by the maintainers of the enclosing code. We have a growing set of tapsets (probe libraries), each of which "knows" the source code for a certain area of the kernel. Whenever the underlying kernel code changes (e.g., a function or one of its args disappears or is renamed), there's a chance that the tapset will become invalid until we bring it back in sync with the kernel. As you can imagine, maintaining tapsets separate from the kernel source is a maintenance headache. Markers could mitigate this.

Jim's above stated reason is not a consideration for markers. We don't plan to convert the current tapsets to use markers. We do need to augment tapsets with a few markers in the kernel code where it is not easy to put a kprobe in a maintainable fashion -- e.g., in the middle of a function.

2. Because the kernel code is highly optimized, the kernel's dwarf info doesn't always accurately reflect which variables have which values on which lines (sometimes even upon entry to a function). A marker is a way to ensure that values of interest are available to SystemTap at marked points.

Agreed

3. Sometimes the overhead of a kprobe probepoint is too much (either in terms of time or locking) for the particular hotspot we want to probe.

Agreed

Jim

bye,
Vara Prasad
One odd thing about Synaptics
Hi, Peter:

There's one thing I wanted to report, just in case... It does not affect anything, but it's odd. Every time I lift a finger from the pad, the driver sends an event with odd values (X is 1 and Y is 5855):

Event: time 1174695694.561806, type 1 (Key), code 330 (Touch), value 0
Event: time 1174695694.561809, type 3 (Absolute), code 0 (X), value 1425
Event: time 1174695694.561811, type 3 (Absolute), code 1 (Y), value 1223
Event: time 1174695694.561813, type 3 (Absolute), code 24 (Pressure), value 20
Event: time 1174695694.561816, -- Report Sync
Event: time 1174695694.573918, type 3 (Absolute), code 0 (X), value 1500
Event: time 1174695694.573921, type 3 (Absolute), code 1 (Y), value 1265
Event: time 1174695694.573922, type 3 (Absolute), code 24 (Pressure), value 5
Event: time 1174695694.573924, type 3 (Absolute), code 28 (Tool Width), value 5
Event: time 1174695694.573926, -- Report Sync
Event: time 1174695694.585575, type 3 (Absolute), code 0 (X), value 1
Event: time 1174695694.585578, type 3 (Absolute), code 1 (Y), value 5855
Event: time 1174695694.585580, type 3 (Absolute), code 24 (Pressure), value 2
Event: time 1174695694.585582, type 1 (Key), code 325 (ToolFinger), value 0
Event: time 1174695694.585584, type 1 (Key), code 333 (Tool Doubletap), value 1
Event: time 1174695694.585587, -- Report Sync
Event: time 1174695694.622685, type 3 (Absolute), code 24 (Pressure), value 1

This corresponds to hw.x=1, hw.y=0 in the driver. Looks like a bug somewhere.

Cheers,
-- Pete
Re: If not readdir() then what?
On Wednesday April 11, [EMAIL PROTECTED] wrote:
> On Thu, 12 Apr 2007, Neil Brown wrote:
>
> > For the second.
> > You say that you " would need at least 96 bits in order to make that
> > guarantee; 64 bits of hash, plus a 32-bit count value in the hash
> > collision chain". I think 96 is a bit greedy. Surely 48 bits of
> > hash and 16 bits of collision-chain-position would be plenty. You would
> > need 65537 entries before a collision was even possible, and
> > billions before it was at all likely. (How big does a set of 48bit
> > numbers have to get before the probability that "No subset of 65536
> > numbers are all the same" drops below 0.95?)
>
> Neil,
> you can get a hash collision with two entries.

You need at least 65537 entries before there is any possibility of collision between two "48-bit-hash ++ 16-bit-sequence-number" objects where the 16-bit-sequence-number is chosen to be different from all other 16 bit sequence numbers combined with the same 48 bit hash.

NeilBrown
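The "48-bit-hash ++ 16-bit-sequence-number" object described above can be sketched as a packed 64-bit value. This is only an illustration of the layout under discussion, not code from any filesystem; the demo_* names and the choice to put the hash in the high bits are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_HASH_BITS 48
#define DEMO_SEQ_BITS  16

/* Pack a 48-bit name hash and a 16-bit position on that hash's
 * collision chain into one 64-bit directory cookie. */
static uint64_t demo_make_cookie(uint64_t hash48, uint16_t seq)
{
	assert(hash48 < (UINT64_C(1) << DEMO_HASH_BITS));
	return (hash48 << DEMO_SEQ_BITS) | seq;
}

/* Recover the hash part of a cookie. */
static uint64_t demo_cookie_hash(uint64_t cookie)
{
	return cookie >> DEMO_SEQ_BITS;
}

/* Recover the chain position part of a cookie. */
static uint16_t demo_cookie_seq(uint64_t cookie)
{
	return (uint16_t)(cookie & 0xffff);
}
```

Two names that hash to the same 48-bit value still get distinct cookies as long as their chain positions differ, so a clash only becomes unavoidable once a single hash value has more than 65536 entries.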
[PATCH 7/7] [RFC] APM emulation driver for class batteries
It finds battery with "main_battery" flag set (or with max_capacity if no batteries marked as main), and converts battery values to APM form. --- drivers/battery/Kconfig |7 +++ drivers/battery/Makefile|1 + drivers/battery/apm_power.c | 121 +++ 3 files changed, 129 insertions(+), 0 deletions(-) create mode 100644 drivers/battery/apm_power.c diff --git a/drivers/battery/Kconfig b/drivers/battery/Kconfig index 0c14ae0..bbf8283 100644 --- a/drivers/battery/Kconfig +++ b/drivers/battery/Kconfig @@ -15,4 +15,11 @@ config BATTERY_DS2760 help Say Y here to enable support for batteries with ds2760 chip. +config APM_POWER + tristate "APM emulation" + depends on BATTERY && APM + help + Say Y here to enable support APM status emulation using + battery class devices. + endmenu diff --git a/drivers/battery/Makefile b/drivers/battery/Makefile index 9902513..cea5807 100644 --- a/drivers/battery/Makefile +++ b/drivers/battery/Makefile @@ -1,2 +1,3 @@ obj-$(CONFIG_BATTERY) += battery.o obj-$(CONFIG_BATTERY_DS2760) += ds2760_battery.o +obj-$(CONFIG_APM_POWER)+= apm_power.o diff --git a/drivers/battery/apm_power.c b/drivers/battery/apm_power.c new file mode 100644 index 000..5e741c1 --- /dev/null +++ b/drivers/battery/apm_power.c @@ -0,0 +1,121 @@ +/* + * Copyright (c) 2007 Eugeny Boger + * + * Use consistent with the GNU GPL is permitted, + * provided that this copyright notice is + * preserved in its entirety in all copies and derived works. + */ + +#include +#include +#include +#include + +#define BATTERY_PROPERTY(property) (main_battery->get_##property ? 
\ + main_battery->get_##property(main_battery) : 0) + +static struct battery *main_battery; + +static void (*old_apm_get_power_status)(struct apm_power_info*); + +static void apm_battery_find_main_battery(void) +{ + struct device *dev; + struct battery *bat, *batm; + int max_capacity = 0; + + main_battery = NULL; + batm = NULL; + list_for_each_entry(dev, &battery_class->devices, node) { + bat = dev_get_drvdata(dev); + /* If none of battery devices cantains 'main_battery' flag, + choice one with max capacity */ + if (bat->get_max_capacity) + if (bat->get_max_capacity(bat) > max_capacity) { + batm = bat; + max_capacity = bat->get_max_capacity(bat); + } + + if (bat->main_battery) + main_battery = bat; + } + if (!main_battery) + main_battery = batm; +} + +static void apm_battery_apm_get_power_status(struct apm_power_info *info) +{ + int bat_current; + + down(&battery_class->sem); + apm_battery_find_main_battery(); + if (!main_battery) { + up(&battery_class->sem); + return; + } + + if (BATTERY_PROPERTY(status) == BATTERY_STATUS_FULL) + info->battery_life = 100; + else + if (BATTERY_PROPERTY(max_capacity) - + BATTERY_PROPERTY(min_capacity)) + info->battery_life = ((BATTERY_PROPERTY(capacity) - +BATTERY_PROPERTY(min_capacity)) * 100) / +(BATTERY_PROPERTY(max_capacity) - + BATTERY_PROPERTY(min_capacity)); + else + info->battery_life = -1; + if ((BATTERY_PROPERTY(status) == BATTERY_STATUS_CHARGING) + || (BATTERY_PROPERTY(status) == BATTERY_STATUS_NOT_CHARGING) + || (BATTERY_PROPERTY(status) == BATTERY_STATUS_FULL)) + info->ac_line_status = APM_AC_ONLINE; + else + info->ac_line_status = APM_AC_OFFLINE; + + if (BATTERY_PROPERTY(status) == BATTERY_STATUS_CHARGING) + info->battery_status = APM_BATTERY_STATUS_CHARGING; + else { + if (info->battery_life > 50) + info->battery_status = APM_BATTERY_STATUS_HIGH; + else if (info->battery_life > 5) + info->battery_status = APM_BATTERY_STATUS_LOW; + else + info->battery_status = APM_BATTERY_STATUS_CRITICAL; + } + info->battery_flag = 
info->battery_status; + + bat_current = BATTERY_PROPERTY(current); + if (bat_current) + info->time = ((BATTERY_PROPERTY(capacity) - + BATTERY_PROPERTY(min_capacity)) * 60) / +bat_current; + else + info->time = -1; + + info->units = APM_UNITS_MINS; + + up(&battery_class->sem); + return; +} + +static int __init apm_battery_init(void) +{ + printk(KERN_
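The arithmetic in apm_battery_apm_get_power_status() above can be isolated into two small helpers. This is a hedged restatement, not the driver's code: the demo_* names are made up, the units (capacity in mAh, current in mA) are an assumption the patch does not state, and the empty-range check uses <= 0 where the patch only tests for a non-zero difference.

```c
#include <assert.h>

/* Percentage of usable charge remaining, or -1 when the usable range
 * is empty (the patch reports -1 for battery_life in that case). */
static int demo_battery_percent(int capacity, int min_capacity,
				int max_capacity)
{
	int range = max_capacity - min_capacity;

	if (range <= 0)
		return -1;
	/* Precondition of this sketch: reading lies inside the range. */
	assert(capacity >= min_capacity && capacity <= max_capacity);
	return ((capacity - min_capacity) * 100) / range;
}

/* Estimated minutes left at the present draw, or -1 when the current
 * is reported as zero. Assumes mAh and mA, so mAh / mA = hours. */
static int demo_minutes_left(int capacity, int min_capacity, int current_draw)
{
	if (current_draw == 0)
		return -1;
	return ((capacity - min_capacity) * 60) / current_draw;
}
```

For example, a reading of 600 in a 200..1000 range is 50 percent, which falls in the "low" band of the patch's thresholds (above 5 but not above 50).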
[PATCH 6/7] [RFC] ds2760 battery driver
This is driver for batteries with ds2760 chip inside. Such batteries used in almost every HP iPaq and HTC PDAs/phones. --- drivers/battery/Kconfig |7 + drivers/battery/Makefile |1 + drivers/battery/ds2760_battery.c | 466 ++ include/linux/ds2760_battery.h | 32 +++ 4 files changed, 506 insertions(+), 0 deletions(-) create mode 100644 drivers/battery/ds2760_battery.c create mode 100644 include/linux/ds2760_battery.h diff --git a/drivers/battery/Kconfig b/drivers/battery/Kconfig index c386593..0c14ae0 100644 --- a/drivers/battery/Kconfig +++ b/drivers/battery/Kconfig @@ -8,4 +8,11 @@ config BATTERY Say Y here to enable generic battery status reporting in the /sys filesystem. +config BATTERY_DS2760 + tristate "DS2760 battery driver (HP iPAQ & others)" + depends on BATTERY && W1 + select W1_SLAVE_DS2760 + help + Say Y here to enable support for batteries with ds2760 chip. + endmenu diff --git a/drivers/battery/Makefile b/drivers/battery/Makefile index a2239cb..9902513 100644 --- a/drivers/battery/Makefile +++ b/drivers/battery/Makefile @@ -1 +1,2 @@ obj-$(CONFIG_BATTERY) += battery.o +obj-$(CONFIG_BATTERY_DS2760) += ds2760_battery.o diff --git a/drivers/battery/ds2760_battery.c b/drivers/battery/ds2760_battery.c new file mode 100644 index 000..a686304 --- /dev/null +++ b/drivers/battery/ds2760_battery.c @@ -0,0 +1,466 @@ +/* + * Driver for batteries with DS2760 chips inside. + * + * Copyright (c) 2007 Anton Vorontsov + * 2004 Matt Reimer + * 2004 Szabolcs Gyurko + * + * Use consistent with the GNU GPL is permitted, + * provided that this copyright notice is + * preserved in its entirety in all copies and derived works. 
+ * + * Author: Anton Vorontsov <[EMAIL PROTECTED]> + * February 2007 + * + * Matt Reimer <[EMAIL PROTECTED]> + * April 2004, 2005 + * + * Szabolcs Gyurko <[EMAIL PROTECTED]> + * September 2004 + */ + +#include +#include +#include +#include +#include +#include +#include + +#include "../w1/w1.h" +#include "../w1/slaves/w1_ds2760.h" + +struct ds2760_device_info { + struct battery_info *bi; + + /* DS2760 data, valid after calling ds2760_battery_read_status() */ + unsigned long update_time; /* jiffies when data read */ + char raw[DS2760_DATA_SIZE]; /* raw DS2760 data */ + int voltage_raw;/* units of 4.88 mV */ + int voltage_mV; /* units of mV */ + int current_raw;/* units of 0.625 mA */ + int current_mA; /* units of mA */ + int accum_current_raw; /* units of 0.25 mAh */ + int accum_current_mAh; /* units of mAh */ + int temp_raw; /* units of 0.125 C */ + int temp_C; /* units of 0.1 C */ + int rated_capacity; /* units of mAh */ + int rem_capacity; /* percentage */ + int full_active_mAh;/* units of mAh */ + int empty_mAh; /* units of mAh */ + int life_min; /* units of minutes */ + int charge_status; /* BATTERY_STATUS_* */ + + int full_counter; + struct battery batt_cdev; + struct device *w1_dev; + struct workqueue_struct *monitor_wqueue; + struct delayed_work monitor_work; +}; + +static unsigned int cache_time = 1000; +module_param(cache_time, uint, 0644); +MODULE_PARM_DESC(cache_time, "cache time in milliseconds"); + +/* Some batteries have their rated capacity stored a N * 10 mAh, while + * others use an index into this table. 
*/ +static int rated_capacities[] = { + 0, + 920,/* Samsung */ + 920,/* BYD */ + 920,/* Lishen */ + 920,/* NEC */ + 1440, /* Samsung */ + 1440, /* BYD */ + 1440, /* Lishen */ + 1440, /* NEC */ + 2880, /* Samsung */ + 2880, /* BYD */ + 2880, /* Lishen */ + 2880/* NEC */ +}; + +/* array is level at temps 0C, 10C, 20C, 30C, 40C + * temp is in Celsius */ +static int battery_interpolate(int array[], int temp) +{ + int index, dt; + + if (temp <= 0) + return array[0]; + if (temp >= 40) + return array[4]; + + index = temp / 10; + dt= temp % 10; + + return array[index] + (((array[index + 1] - array[index]) * dt) / 10); +} + +static int ds2760_battery_read_status(struct ds2760_device_info *di) +{ + int ret, i, start, count, scale[5]; + + if (di->update_time && time_before(jiffies, di->update_time + + msecs_to_jiffies(cache_time))) + return 0; + + if (!di->w1_dev) + return 0; + + /* The first time w
[PATCH] [Trivial] [Doc] Add webpages' URL and summarize 3 lines.
Trivial patch, against -rc6. Please apply, thanks. --- CREDITS: - Summarize 3 lines into one. - Add webpage. MAINTAINERS: - Add auxdisplay drivers/tree webpages. CREDITS |7 +++ MAINTAINERS |4 2 files changed, 7 insertions(+), 4 deletions(-) Signed-off-by: Miguel Ojeda Sandonis <[EMAIL PROTECTED]> --- diff --git a/CREDITS b/CREDITS index 6bd8ab8..f990730 100644 --- a/CREDITS +++ b/CREDITS @@ -2573,10 +2573,9 @@ S: Australia N: Miguel Ojeda Sandonis E: [EMAIL PROTECTED] -D: Author: Auxiliary LCD Controller driver (ks0108) -D: Author: Auxiliary LCD driver (cfag12864b) -D: Author: Auxiliary LCD framebuffer driver (cfag12864bfb) -D: Maintainer: Auxiliary display drivers tree (drivers/auxdisplay/*) +W: http://maxextreme.googlepages.com/ +D: Author of the ks0108, cfag12864b and cfag12864bfb auxiliary display drivers. +D: Maintainer of the auxiliary display drivers tree (drivers/auxdisplay/*) S: C/ Mieses 20, 9-B S: Valladolid 47009 S: Spain diff --git a/MAINTAINERS b/MAINTAINERS index 829407f..2a658ef 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -672,6 +672,7 @@ AUXILIARY DISPLAY DRIVERS P: Miguel Ojeda Sandonis M: [EMAIL PROTECTED] L: linux-kernel@vger.kernel.org +W: http://auxdisplay.googlepages.com/ S: Maintained AVR32 ARCHITECTURE @@ -884,12 +885,14 @@ CFAG12864B LCD DRIVER P: Miguel Ojeda Sandonis M: [EMAIL PROTECTED] L: linux-kernel@vger.kernel.org +W: http://auxdisplay.googlepages.com/ S: Maintained CFAG12864BFB LCD FRAMEBUFFER DRIVER P: Miguel Ojeda Sandonis M: [EMAIL PROTECTED] L: linux-kernel@vger.kernel.org +W: http://auxdisplay.googlepages.com/ S: Maintained COMMON INTERNET FILE SYSTEM (CIFS) @@ -2020,6 +2023,7 @@ KS0108 LCD CONTROLLER DRIVER P: Miguel Ojeda Sandonis M: [EMAIL PROTECTED] L: linux-kernel@vger.kernel.org +W: http://auxdisplay.googlepages.com/ S: Maintained LAPB module -- Miguel Ojeda http://maxextreme.googlepages.com/index.htm
Re: If not readdir() then what?
On Wed, 11 April 2007 16:23:21 -0700, H. Peter Anvin wrote:
> David Lang wrote:
> >On Thu, 12 Apr 2007, Neil Brown wrote:
> >
> >>For the second.
> >> You say that you " would need at least 96 bits in order to make that
> >> guarantee; 64 bits of hash, plus a 32-bit count value in the hash
> >> collision chain". I think 96 is a bit greedy. Surely 48 bits of
> >> hash and 16 bits of collision-chain-position would be plenty. You would
> >> need 65537 entries before a collision was even possible, and
> >> billions before it was at all likely. (How big does a set of 48bit
> >> numbers have to get before the probability that "No subset of 65536
> >> numbers are all the same" drops below 0.95?)
> >
> > you can get a hash collision with two entries.
>
> Yes, but the probability is 2^-n for an n-bit hash, assuming it's
> uniformly distributed.
>
> The probability approaches 1/2 as the number of entries hashed
> approaches 2^(n/2) (birthday number.)

I believe you are both barking up the wrong tree. Neil proposed a 16bit collision chain. With that, it takes 65537 entries before a collision chain overflow is possible.

Calling a collision chain overflow "collision" is inviting confusion, of course. :)

Jörn

--
The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague.
-- Edsger W. Dijkstra
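The pairwise figure quoted above (2^-n for two entries, around one half near 2^(n/2)) is a separate quantity from chain overflow, and it can be checked numerically. The sketch below multiplies out the exact birthday product for an idealised uniform hash; it is illustrative arithmetic only, and demo_collision_probability() is a made-up name.

```c
#include <assert.h>

/* Probability that at least two of `entries` uniformly random
 * `bits`-bit hash values coincide (exact birthday product). */
static double demo_collision_probability(unsigned bits, unsigned long entries)
{
	double space = 1.0;
	double p_distinct = 1.0;

	assert(bits > 0 && bits < 53);	/* keep 2^bits exact in a double */
	for (unsigned i = 0; i < bits; i++)
		space *= 2.0;

	for (unsigned long i = 1; i < entries; i++) {
		if ((double)i >= space)
			return 1.0;	/* more entries than hash values */
		p_distinct *= 1.0 - (double)i / space;
	}
	return 1.0 - p_distinct;
}
```

By this calculation a 16-bit hash with two entries collides with probability 2^-16, and with 256 entries (2^(n/2)) the probability is roughly 0.4, passing one half a little beyond that point. Overflowing a 16-bit collision chain instead needs 65537 entries that all share one hash value, which is far less likely at any realistic directory size.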
[PATCH 5/7] [RFC] ds2760 W1 slave
This is W1 slave for ds2760 chip, found inside almost every HP iPaq and HTC PDAs/phones. --- drivers/w1/slaves/Kconfig | 13 +++ drivers/w1/slaves/Makefile|1 + drivers/w1/slaves/w1_ds2760.c | 162 + drivers/w1/slaves/w1_ds2760.h | 52 + drivers/w1/w1_family.h|1 + 5 files changed, 229 insertions(+), 0 deletions(-) create mode 100644 drivers/w1/slaves/w1_ds2760.c create mode 100644 drivers/w1/slaves/w1_ds2760.h diff --git a/drivers/w1/slaves/Kconfig b/drivers/w1/slaves/Kconfig index 904e5ae..df95d6c 100644 --- a/drivers/w1/slaves/Kconfig +++ b/drivers/w1/slaves/Kconfig @@ -35,4 +35,17 @@ config W1_SLAVE_DS2433_CRC Each block has 30 bytes of data and a two byte CRC16. Full block writes are only allowed if the CRC is valid. +config W1_SLAVE_DS2760 + tristate "Dallas 2760 battery monitor chip (HP iPAQ & others)" + depends on W1 + help + If you enable this you will have the DS2760 battery monitor + chip support. + + The battery monitor chip is used in many batteries/devices + as the one who is responsible for charging/discharging/monitoring + Li+ batteries. + + If you are unsure, say N. + endmenu diff --git a/drivers/w1/slaves/Makefile b/drivers/w1/slaves/Makefile index 725dcfd..a8eb752 100644 --- a/drivers/w1/slaves/Makefile +++ b/drivers/w1/slaves/Makefile @@ -5,4 +5,5 @@ obj-$(CONFIG_W1_SLAVE_THERM) += w1_therm.o obj-$(CONFIG_W1_SLAVE_SMEM)+= w1_smem.o obj-$(CONFIG_W1_SLAVE_DS2433) += w1_ds2433.o +obj-$(CONFIG_W1_SLAVE_DS2760) += w1_ds2760.o diff --git a/drivers/w1/slaves/w1_ds2760.c b/drivers/w1/slaves/w1_ds2760.c new file mode 100644 index 000..21b0ef6 --- /dev/null +++ b/drivers/w1/slaves/w1_ds2760.c @@ -0,0 +1,162 @@ +/* + * 1-Wire implementation for the ds2760 chip + * + * Copyright (c) 2004-2005, Szabolcs Gyurko <[EMAIL PROTECTED]> + * + * Use consistent with the GNU GPL is permitted, + * provided that this copyright notice is + * preserved in its entirety in all copies and derived works. 
+ * + */ + +#include + +#include +#include +#include +#include +#include +#include + +#include "../w1.h" +#include "../w1_int.h" +#include "../w1_family.h" +#include "w1_ds2760.h" + +static int w1_ds2760_io(struct device *dev, char *buf, int addr, size_t count, +int io) +{ + struct w1_slave *sl = container_of(dev, struct w1_slave, dev); + + if (!dev) + return 0; + + mutex_lock(&sl->master->mutex); + + if (addr > DS2760_DATA_SIZE || addr < 0) { + count = 0; + goto out; + } + if (addr + count > DS2760_DATA_SIZE) + count = DS2760_DATA_SIZE - addr; + + if (!w1_reset_select_slave(sl)) { + if (!io) { + w1_write_8(sl->master, W1_DS2760_READ_DATA); + w1_write_8(sl->master, addr); + count = w1_read_block(sl->master, buf, count); + } else { + w1_write_8(sl->master, W1_DS2760_WRITE_DATA); + w1_write_8(sl->master, addr); + w1_write_block(sl->master, buf, count); + /* XXX w1_write_block returns void, not n_written */ + } + } + +out: + mutex_unlock(&sl->master->mutex); + + return count; +} + +int w1_ds2760_read(struct device *dev, char *buf, int addr, size_t count) +{ + return w1_ds2760_io(dev, buf, addr, count, 0); +} + +int w1_ds2760_write(struct device *dev, char *buf, int addr, size_t count) +{ + return w1_ds2760_io(dev, buf, addr, count, 1); +} + +/* io = 0 means copy from EEPROM to SRAM, 1 means from SRAM to EEPROM */ +static int w1_ds2760_eeprom(struct device *dev, int addr, int io) +{ + struct w1_slave *sl = container_of(dev, struct w1_slave, dev); + int ret = 0; + + mutex_lock(&sl->master->mutex); + + if (!w1_reset_select_slave(sl)) { + if (!io) + w1_write_8(sl->master, W1_DS2760_RECALL_DATA); + else + w1_write_8(sl->master, W1_DS2760_COPY_DATA); + w1_write_8(sl->master, addr); + } + + mutex_unlock(&sl->master->mutex); + + return ret; +} + +int w1_ds2760_recall(struct device *dev, int addr) +{ + return w1_ds2760_eeprom(dev, addr, 0); +} + +int w1_ds2760_copy(struct device *dev, int addr) +{ + return w1_ds2760_eeprom(dev, addr, 1); +} + +static ssize_t 
w1_ds2760_read_bin(struct kobject *kobj, char *buf, loff_t off, + size_t count) +{ + struct device *dev = container_of(kobj, struct device, kobj); + return w1_ds2760_read(dev, buf, off, count); +} + +static struct bin_attribute w1_ds2760_bin_attr = { + .attr = { + .name = "w1_slave", + .mode = S_IRUGO, + .owner = THIS
[patch] x86_64: more fixes to node_possible_map runtime setup
On Mon, Apr 09, 2007 at 04:13:28PM -0700, Siddha, Suresh B wrote: > On Mon, Apr 09, 2007 at 03:05:01PM -0700, [EMAIL PROTECTED] wrote: > > Subject: x86_64-set-node_possible_map-at-runtime fix > > From: David Rientjes <[EMAIL PROTECTED]> > > > > Clear node_possible_map if numa_emulation() fails for some reason, such as > > a failed hash shift, but setup_node_range() has already set some fake nodes > > as online. > > David, Looking at your fix, I think we require more fixes in this area. > Please review the appended patch. Thanks. Andrew, Please apply the appended patch. Goes on top of the x86_64-set-node_possible_map-at-runtime-fix.patch thanks, suresh --- Subject: [patch] x86_64: more fixes to node_possible_map runtime setup From: Suresh Siddha <[EMAIL PROTECTED]> More fixes in the failure cases and a small cleanup in numa emulation case. Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]> Acked-by: David Rientjes <[EMAIL PROTECTED]> --- --- linux-2.6.21-rc6/arch/x86_64/mm/numa.c~ 2007-04-09 15:59:03.0 -0700 +++ linux-2.6.21-rc6/arch/x86_64/mm/numa.c 2007-04-09 17:44:38.0 -0700 @@ -298,7 +298,6 @@ static int __init setup_node_range(int n ret = -1; } nodes[nid].end = *addr; - node_set_online(nid); node_set(nid, node_possible_map); printk(KERN_INFO "Faking node %d at %016Lx-%016Lx (%LuMB)\n", nid, nodes[nid].start, nodes[nid].end, @@ -483,7 +482,7 @@ out: * SRAT. */ remove_all_active_ranges(); - for_each_online_node(i) { + for_each_node_mask(i, node_possible_map) { e820_register_active_regions(i, nodes[i].start >> PAGE_SHIFT, nodes[i].end >> PAGE_SHIFT); setup_node_bootmem(i, nodes[i].start, nodes[i].end); @@ -510,11 +509,13 @@ void __init numa_initmem_init(unsigned l if (!numa_off && !acpi_scan_nodes(start_pfn << PAGE_SHIFT, end_pfn << PAGE_SHIFT)) return; + nodes_clear(node_possible_map); #endif #ifdef CONFIG_K8_NUMA if (!numa_off && !k8_scan_nodes(start_pfn
[PATCH 4/7] [RFC] remove "#if 0" around find_bus function, export it.
This function was placed in "#if 0" because nobody was using it. We do
use it. See http://lwn.net/Articles/210610/
---
 drivers/base/bus.c     |    5 ++---
 include/linux/device.h |    2 ++
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 253868e..971efa2 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -667,14 +667,13 @@ void put_bus(struct bus_type * bus)
  *
  * Note that kset_find_obj increments bus' reference count.
  */
-#if 0
+
 struct bus_type * find_bus(char * name)
 {
 	struct kobject * k = kset_find_obj(&bus_subsys.kset, name);
 	return k ? to_bus(k) : NULL;
 }
-#endif /* 0 */
-
+EXPORT_SYMBOL_GPL(find_bus);
 
 /**
  * bus_add_attrs - Add default attributes for this bus.
diff --git a/include/linux/device.h b/include/linux/device.h
index 5cf30e9..4015b39 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -68,6 +68,8 @@ extern void bus_unregister(struct bus_type * bus);
 
 extern int __must_check bus_rescan_devices(struct bus_type * bus);
 
+extern struct bus_type *find_bus(char *name);
+
 /* iterator helpers for buses */
 
 int bus_for_each_dev(struct bus_type * bus, struct device * start, void * data,
-- 
1.5.0.5-dirty
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/7] [RFC] Battery monitoring class
Here is the battery monitor class. According to the first copyright
string, we have been maintaining it since 2003. I took a few days and
cleaned it up to be more suitable for mainline inclusion.

It differs from the battery class at
git://git.infradead.org/battery-2.6.git:

* It uses the external power kernel interface, i.e. it does not fake
  external power supplies as batteries. (This is the same thing David
  Woodhouse planned last year.)

* It has a predefined set of attributes, which eliminates code
  duplication in battery drivers. It also gives an opportunity to write
  emulation drivers for legacy stuff (an APM emulation driver follows).
  If a driver can't provide some attribute, it will not appear in sysfs.

* It insists on reusing its predefined attributes *and* their units, so
  userspace gets the expected values for any battery. Common units are
  also required for APM/ACPI emulation.

  Our battery class insists on this reuse but does not force it. If some
  battery driver can't convert its own raw values (I can't imagine why),
  the driver is free to implement its own attributes *and* an additional
  _units attribute. This scheme is discouraged, though.

* LED support. Each battery registers its triggers, and gadgets with
  LEDs can quickly bind to the battery-charging / battery-full triggers.
Here how it looks like from user space: # ls /sys/class/battery/main-battery/ capacity max_capacity max_voltage min_current power subsystem uevent current max_current min_capacity min_voltage status temp voltage # cat /sys/class/battery/main-battery/status Full # cat /sys/class/leds/h5400\:green-right/trigger none h5400-radio timer hwtimer main-battery-charging [main-battery-full] # cat /sys/class/leds/h5400\:green-right/brightness 255 --- drivers/Kconfig |2 + drivers/Makefile |1 + drivers/battery/Kconfig | 11 ++ drivers/battery/Makefile |1 + drivers/battery/battery.c | 303 + include/linux/battery.h | 98 +++ 6 files changed, 416 insertions(+), 0 deletions(-) create mode 100644 drivers/battery/Kconfig create mode 100644 drivers/battery/Makefile create mode 100644 drivers/battery/battery.c create mode 100644 include/linux/battery.h diff --git a/drivers/Kconfig b/drivers/Kconfig index c546de3..c3a0038 100644 --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -56,6 +56,8 @@ source "drivers/w1/Kconfig" source "drivers/power/Kconfig" +source "drivers/battery/Kconfig" + source "drivers/hwmon/Kconfig" source "drivers/mfd/Kconfig" diff --git a/drivers/Makefile b/drivers/Makefile index 2bdaae7..7cbfd37 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -61,6 +61,7 @@ obj-$(CONFIG_RTC_LIB) += rtc/ obj-$(CONFIG_I2C) += i2c/ obj-$(CONFIG_W1) += w1/ obj-$(CONFIG_EXTERNAL_POWER) += power/ +obj-$(CONFIG_BATTERY) += battery/ obj-$(CONFIG_HWMON)+= hwmon/ obj-$(CONFIG_PHONE)+= telephony/ obj-$(CONFIG_MD) += md/ diff --git a/drivers/battery/Kconfig b/drivers/battery/Kconfig new file mode 100644 index 000..c386593 --- /dev/null +++ b/drivers/battery/Kconfig @@ -0,0 +1,11 @@ + +menu "Battery support" + +config BATTERY + tristate "Battery monitoring support" + select EXTERNAL_POWER + help + Say Y here to enable generic battery status reporting in + the /sys filesystem. 
+ +endmenu diff --git a/drivers/battery/Makefile b/drivers/battery/Makefile new file mode 100644 index 000..a2239cb --- /dev/null +++ b/drivers/battery/Makefile @@ -0,0 +1 @@ +obj-$(CONFIG_BATTERY) += battery.o diff --git a/drivers/battery/battery.c b/drivers/battery/battery.c new file mode 100644 index 000..32b8288 --- /dev/null +++ b/drivers/battery/battery.c @@ -0,0 +1,303 @@ +/* + * Universal battery monitor class + * + * Copyright (c) 2007 Anton Vorontsov <[EMAIL PROTECTED]> + * Copyright (c) 2004 Szabolcs Gyurko + * Copyright (c) 2003 Ian Molton <[EMAIL PROTECTED]> + * + * Modified: 2004, Oct Szabolcs Gyurko + * + * You may use this code as per GPL version 2 + * + * All voltages, currents, capacities and temperatures in mV, mA, mAh and + * tenths of a degree unless otherwise stated. It's driver's job to convert + * its raw values to which this class operates. If for some reason driver + * can't afford this requirement, then it have to create its own attributes, + * plus additional "XYZ_units" for each of them. + */ + +#include +#include +#include +#include +#include +#include + +/* If we have hwtimer trigger, then use it to blink charging LED */ +#if defined(CONFIG_LEDS_TRIGGER_HWTIMER) || \ +defined(CONFIG_LEDS_TRIGGER_HWTIMER_MODULE) + #define led_trigger_register_charging led_trigger_register_hwtimer + #define led_trigger_unregister_charging led_trigger_unregister_hwtimer +#else + #define led_trigger_register_charging led_trigger_register_simple + #define led_trigger_unreg
[PATCH 1/7] [RFC] External power framework
External power framework - power supplies and power supplicants. Supplicants (batteries so far) may ask to notify they when power supply arrive/gone. This framework used by battery class (next patches). It's permitted for supply to be bound to several supplicants (think main and backup batteries). It's also permitted for supplicants to consume power from several external supplies (say AC and USB). Here is how it look like from userspace: # pwd /sys/class/power_supply # ls ac usb # cat ac/online usb/online 1 0 --- drivers/Kconfig|2 + drivers/Makefile |1 + drivers/power/Kconfig | 13 ++ drivers/power/Makefile |1 + drivers/power/external_power.c | 318 include/linux/external_power.h | 54 +++ 6 files changed, 389 insertions(+), 0 deletions(-) create mode 100644 drivers/power/Kconfig create mode 100644 drivers/power/Makefile create mode 100644 drivers/power/external_power.c create mode 100644 include/linux/external_power.h diff --git a/drivers/Kconfig b/drivers/Kconfig index 050323f..c546de3 100644 --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -54,6 +54,8 @@ source "drivers/spi/Kconfig" source "drivers/w1/Kconfig" +source "drivers/power/Kconfig" + source "drivers/hwmon/Kconfig" source "drivers/mfd/Kconfig" diff --git a/drivers/Makefile b/drivers/Makefile index 3a718f5..2bdaae7 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -60,6 +60,7 @@ obj-$(CONFIG_I2O) += message/ obj-$(CONFIG_RTC_LIB) += rtc/ obj-$(CONFIG_I2C) += i2c/ obj-$(CONFIG_W1) += w1/ +obj-$(CONFIG_EXTERNAL_POWER) += power/ obj-$(CONFIG_HWMON)+= hwmon/ obj-$(CONFIG_PHONE)+= telephony/ obj-$(CONFIG_MD) += md/ diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig new file mode 100644 index 000..17349c1 --- /dev/null +++ b/drivers/power/Kconfig @@ -0,0 +1,13 @@ + +menu "External power support" + +config EXTERNAL_POWER + tristate "External power kernel interface" + help + Say Y here to enable kernel external power detection interface, + like AC or USB. 
Information also will exported to userspace via + /sys/class/external_power/ directory. + + This interface is mandatory for battery class support. + +endmenu diff --git a/drivers/power/Makefile b/drivers/power/Makefile new file mode 100644 index 000..c303b45 --- /dev/null +++ b/drivers/power/Makefile @@ -0,0 +1 @@ +obj-$(CONFIG_EXTERNAL_POWER) += external_power.o diff --git a/drivers/power/external_power.c b/drivers/power/external_power.c new file mode 100644 index 000..21c25a4 --- /dev/null +++ b/drivers/power/external_power.c @@ -0,0 +1,318 @@ +/* + * Linux kernel interface for external power suppliers/supplicants + * + * Copyright (c) 2007 Anton Vorontsov <[EMAIL PROTECTED]> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include + +static struct class *power_supply_class; + +static LIST_HEAD(supplicants); +static struct rw_semaphore supplicants_sem; + +struct bound_supply { + struct power_supply *psy; + struct list_head node; +}; + +struct bound_supplicant { + struct power_supplicant *pst; + struct list_head node; +}; + +int power_supplicant_am_i_supplied(struct power_supplicant *pst) +{ + int ret = 0; + struct bound_supply *bpsy; + + pr_debug("%s\n", __FUNCTION__); + down(&power_supply_class->sem); + list_for_each_entry(bpsy, &pst->bound_supplies, node) { + if (bpsy->psy->is_online(bpsy->psy)) { + ret = 1; + goto out; + } + } +out: + up(&power_supply_class->sem); + + return ret; +} + +static void unbind_pst_from_psys(struct power_supplicant *pst) +{ + struct bound_supply *bpsy, *bpsy_tmp; + struct bound_supplicant *bpst, *bpst_tmp; + + list_for_each_entry_safe(bpsy, bpsy_tmp, &pst->bound_supplies, node) { + list_for_each_entry_safe(bpst, bpst_tmp, + &bpsy->psy->bound_supplicants, node) { + if (bpst->pst == pst) { + list_del(&bpst->node); + kfree(bpst); + break; + } + } 
+ list_del(&bpsy->node); + kfree(bpsy); + } + + return; +} + +static void unbind_psy_from_psts(struct power_supply *psy) +{ + struct bound_supply *bpsy, *bpsy_tmp; + struct bound_supplicant *bpst, *bpst_tmp; + + list_for_each_entry_safe(bpst, bpst_tmp, &psy->bound_supplicants, +
[PATCH 2/7] [RFC] Common power driver for Linux gadgets
This driver used to stop code/logic duplication through different machines we porting at handhelds.org. pda_power register machs' power supplies, and will take care about notifying batteries about power changes through external power interface. This driver should be suitable for almost every Linux gadget today. Here is brief example how we use it: static int h5000_is_ac_online(void) { return !!(samcop_get_gpio_a(&h5400_samcop.dev) & SAMCOP_GPIO_GPA_ADP_IN_STATUS); } static int h5000_is_usb_online(void) { return !!(samcop_get_gpio_a(&h5400_samcop.dev) & SAMCOP_GPIO_GPA_USB_DETECT); } static void h5000_set_charge(int flags) { SET_H5400_GPIO(CHG_EN, !!flags); SET_H5400_GPIO(EXT_CHG_RATE, !!(flags & PDA_POWER_CHARGE_AC)); SET_H5400_GPIO(USB_CHG_RATE, !!(flags & PDA_POWER_CHARGE_USB)); return; } static struct pda_power_pdata h5000_power_pdata = { .is_ac_online = h5000_is_ac_online, .is_usb_online = h5000_is_usb_online, .set_charge = h5000_set_charge, }; static struct resource h5000_power_resourses[] = { [0] = { .name = "ac", .flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHEDGE | IORESOURCE_IRQ_LOWEDGE, }, [1] = { .name = "usb", .flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHEDGE | IORESOURCE_IRQ_LOWEDGE, }, }; static struct platform_device h5000_power_pdev = { .name = "pda-power", .id = -1, .resource = h5000_power_resourses, .num_resources = ARRAY_SIZE(h5000_power_resourses), .dev = { .platform_data = &h5000_power_pdata, }, }; --- drivers/power/Kconfig |8 ++ drivers/power/Makefile|1 + drivers/power/pda_power.c | 218 + include/linux/pda_power.h | 27 ++ 4 files changed, 254 insertions(+), 0 deletions(-) create mode 100644 drivers/power/pda_power.c create mode 100644 include/linux/pda_power.h diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig index 17349c1..b87779e 100644 --- a/drivers/power/Kconfig +++ b/drivers/power/Kconfig @@ -10,4 +10,12 @@ config EXTERNAL_POWER This interface is mandatory for battery class support. 
+config PDA_POWER + tristate "Generic PDA/phone power driver" + depends on EXTERNAL_POWER + help + Say Y here to enable generic power driver for PDAs and phones with + one or two external power supplies (AC/USB) connected to main and + backup batteries, and optional builtin charger. + endmenu diff --git a/drivers/power/Makefile b/drivers/power/Makefile index c303b45..6f084e7 100644 --- a/drivers/power/Makefile +++ b/drivers/power/Makefile @@ -1 +1,2 @@ obj-$(CONFIG_EXTERNAL_POWER) += external_power.o +obj-$(CONFIG_PDA_POWER) += pda_power.o diff --git a/drivers/power/pda_power.c b/drivers/power/pda_power.c new file mode 100644 index 000..0256ee4 --- /dev/null +++ b/drivers/power/pda_power.c @@ -0,0 +1,218 @@ +/* + * Common power driver for PDAs and phones with one or two external + * power supplies (AC/USB) connected to main and backup batteries, + * and optional builtin charger. + * + * Copyright 2007 Anton Vorontsov <[EMAIL PROTECTED]> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include + +/* + * include/linux/ioport.h does not provide flags for generic IRQ trigger + * types. So, we're using "ISA PnP IRQ specific bits", and converting them. 
+ */ +static unsigned int get_irq_flags(struct resource *res) +{ + unsigned int flags = IRQF_DISABLED; + + if (res->flags & IORESOURCE_IRQ_HIGHEDGE) + flags |= IRQF_TRIGGER_RISING; + if (res->flags & IORESOURCE_IRQ_LOWEDGE) + flags |= IRQF_TRIGGER_FALLING; + if (res->flags & IORESOURCE_IRQ_HIGHLEVEL) + flags |= IRQF_TRIGGER_HIGH; + if (res->flags & IORESOURCE_IRQ_LOWLEVEL) + flags |= IRQF_TRIGGER_LOW; + if (res->flags & IORESOURCE_IRQ_SHAREABLE) + flags |= IRQF_SHARED; + + return flags; +} + +static struct resource *ac_irq, *usb_irq; +static struct pda_power_pdata *pdata; + +static int pda_power_is_ac_online(struct power_supply *psy) +{ + return pdata->is_ac_online ? pdata->is_ac_online() : 0; +} + +static int pda_power_is_usb_online(struct power_supply *psy) +{ + return pdata->is_usb_online ? pdata->is_usb_online() : 0; +} + +static char *pda_power_supplied_to[] = { + "main-battery", + "backup-battery", +}; + +static struct power_supply pda_power_supplies[] = { + { + .name = "ac", + .type = "ac", + .supplied_to = pda_power_s
Re: If not readdir() then what?
David Lang wrote:
> On Thu, 12 Apr 2007, Neil Brown wrote:
>
>> For the second.
>> You say that you " would need at least 96 bits in order to make that
>> guarantee; 64 bits of hash, plus a 32-bit count value in the hash
>> collision chain". I think 96 is a bit greedy. Surely 48 bits of
>> hash and 16 bits of collision-chain-position would plenty. You would
>> need 65537 entries before a collision was even possible, and
>> billions before it was at all likely. (How big does a set of 48bit
>> numbers have to get before the probability that "No subset of 65536
>> numbers are all the same" drops below 0.95?)
>
> Neil, you can get a hash collision with two entries.

Yes, but the probability is 2^-n for an n-bit hash, assuming it's
uniformly distributed.

The probability approaches 1/2 as the number of entries hashed
approaches 2^(n/2) (birthday number.)

	-hpa
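To put numbers on the two statements above, here is a minimal sketch that computes the exact collision probability for uniformly distributed hashes. The function name collision_probability is made up for this note, and the loop is linear in the number of entries, so it is only practical for small hash widths such as 16 bits. For two entries it gives 2^-n. At exactly 2^(n/2) entries the result is roughly 0.39, and it crosses 1/2 at about 1.18 * 2^(n/2) entries, so "approaches 1/2" is a fair order-of-magnitude reading.

```c
#include <stdint.h>

/*
 * Exact probability that at least two of `entries` uniformly distributed
 * `bits`-bit hashes collide:
 *
 *	1 - prod_{i=0}^{entries-1} (1 - i / 2^bits)
 *
 * Hypothetical helper for illustration only; cost is O(entries).
 */
double collision_probability(uint64_t entries, unsigned int bits)
{
	double space = 1.0;
	double no_collision = 1.0;
	uint64_t i;
	unsigned int b;

	/* space = 2^bits, computed by hand so no libm is needed */
	for (b = 0; b < bits; b++)
		space *= 2.0;

	for (i = 1; i < entries; i++) {
		/* more entries than hash values: a collision is certain */
		if ((double)i >= space)
			return 1.0;
		no_collision *= 1.0 - (double)i / space;
	}
	return 1.0 - no_collision;
}
```

Calling collision_probability(2, 16) gives 1/65536, and collision_probability(256, 16) evaluates the 2^(n/2) case for a 16-bit hash.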
Re: 2.6.21-rc6-mm1 USB related boot hang
On Thu, 12 Apr 2007 01:07:00 +0200 Helge Hafting <[EMAIL PROTECTED]> wrote:

> On Wed, Apr 11, 2007 at 01:43:46PM -0700, Andrew Morton wrote:
> >
> > OK. If you add initcall_debug to the kernel boot command line, what's the
> > last thing we call?
>
> The last messages (handwritten, somewhat shortened)
> calling hid_init+0x0/0x10()
> returned 0
> ran for 0 msec
> calling hid_init+0x0/0x50()
> usbcore registered new interface driver hiddev
>
> and then it hangs completely.
>

OK, thanks.

If it happens to be, I'll bisect it down. Chances are it won't, and it
gets merged, and we get to futz around with it for a week or two while
holding up 2.6.22.

I can only think we must enjoy doing it this way.
Re: If not readdir() then what?
On Thu, Apr 12, 2007 at 08:32:05AM +1000, Neil Brown wrote:
> For the first:
> You are storing an internal tree representation of part of the
> directory attached to the 'struct file'.
> Would it be possible to store this attached to the page via the
> ->private pointer? What would avoid the allocate/create/free
> overhead on every request.

The reason why we are storing it associated with the file pointer
instead of the page/block is because a filename insertion might cause
a node split, in which case half of the 4k block gets copied to
another block. We need a stable pointer to where we are in the tree
that can cope with hash collisions, and that's the reason for creating
the red/black tree in the first place, since it *doesn't* get split
and reorganized when the directory's hash tree gets reorg'ed. So
attaching the tree to the page breaks the reason why we have the
separate data structure in the first place.

> You suggest caching the open files in nfsd. While that is probably
> possible (I've thought of it a number of times) is would also be
> quite complex, e.g. requiring some sort of call-back to close all
> those files when the filesystem is unexported. And it is very easy
> to get caching heuristics wrong. Leveraging the page-cache which is
> a very mature cache seems to make a lot of sense.

Is it really that complex? The simplest way of handling it is simply
keeping an open directory fd cache in a per-filesystem rbtree index
which is indexed by file handle and contains the file pointer. When
you unexport the filesystem, you simply walk the rbtree and close all
of the file descriptors; no callback is required.

The caching heuristics are an issue; but a fixed-size cache with a
simple LFU replacement strategy isn't all that complex to implement.
If 95% of the time, the readdir's come in quick succession, even a
small cache will probably provide huge performance gains, and
increasing the cache size past some critical point will probably only
provide marginal improvements.

> For the second.
> You say that you " would need at least 96 bits in order to make that
> guarantee; 64 bits of hash, plus a 32-bit count value in the hash
> collision chain". I think 96 is a bit greedy. Surely 48 bits of
> hash and 16 bits of collision-chain-position would plenty. You would
> need 65537 entries before a collision was even possible, and
> billions before it was at all likely. (How big does a set of 48bit
> numbers have to get before the probability that "No subset of 65536
> numbers are all the same" drops below 0.95?)
>
> This would really require that the collision-chain-index was stable
> across create/delete. Doing that while you have the tree in the
> page cache is probably easy enough. Doing it across reboots is
> probably not possible without on-disk changes.

Actually, no, we can't keep the collision chain count stable across a
create/delete even while the tree is cached. At least, not without
storing a huge amount of state associated with each page. (It would
be a lot more work than simply having nfsd keep a fd cache for
directory streams ;-).

If we need create/delete stability, probably our only sane
implementation choice is to just stick with a 63-bit hash, and cross
our fingers and hope for the best.

If nfsd caches the last N used directory caches, where N is roughly
proportional to the number of active clients, and the clients all only
use the last cookie returned in the readdir entry (since it would be
stupid to use one of the earlier ones and request the server to
re-send something which the client already has), at least in the
absence of telldir/seekdir calls, then that might be quite sufficient,
even if we return multiple directory entries which contain hash
collisions to the client. As long as the directory fd is cached, and
the client just uses the last cookie to fetch the next batch of
dirents, we'll be fine.

						- Ted
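The fixed-size cache with LFU replacement described above can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: it uses a small linear array instead of the per-filesystem rbtree the mail proposes, a plain 64-bit integer stands in for the NFS file handle, a void pointer stands in for the cached struct file, and every name (dir_cache, dir_cache_lookup, dir_cache_insert, DIR_CACHE_SLOTS) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DIR_CACHE_SLOTS 4	/* hypothetical size, chosen only for the sketch */

struct dir_cache_slot {
	uint64_t handle;	/* stands in for the NFS file handle */
	void *dirp;		/* stands in for the cached struct file */
	unsigned long hits;	/* use count that drives LFU replacement */
	int in_use;
};

struct dir_cache {
	struct dir_cache_slot slot[DIR_CACHE_SLOTS];
};

/* Return the cached pointer for handle, or NULL on a miss. */
void *dir_cache_lookup(struct dir_cache *c, uint64_t handle)
{
	size_t i;

	for (i = 0; i < DIR_CACHE_SLOTS; i++) {
		if (c->slot[i].in_use && c->slot[i].handle == handle) {
			c->slot[i].hits++;
			return c->slot[i].dirp;
		}
	}
	return NULL;
}

/*
 * Insert an entry the caller has already failed to look up.  A free slot
 * is used if there is one; otherwise the least frequently used slot is
 * replaced and its old pointer is returned so the caller can close it.
 * Returns NULL when nothing was evicted.
 */
void *dir_cache_insert(struct dir_cache *c, uint64_t handle, void *dirp)
{
	size_t i, victim = 0;
	void *evicted = NULL;

	for (i = 0; i < DIR_CACHE_SLOTS; i++) {
		if (!c->slot[i].in_use) {
			victim = i;
			break;
		}
		if (c->slot[i].hits < c->slot[victim].hits)
			victim = i;
	}
	if (c->slot[victim].in_use)
		evicted = c->slot[victim].dirp;

	c->slot[victim].handle = handle;
	c->slot[victim].dirp = dirp;
	c->slot[victim].hits = 1;
	c->slot[victim].in_use = 1;
	return evicted;
}
```

Unexporting a filesystem would then amount to walking every slot and closing whatever is still in use, which matches the "no callback is required" point in the mail.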
[patch 10/31] HID: Do not discard truncated input reports
-stable review patch. If anyone has any objections, please let us know.

--
From: Adam Kropelin <[EMAIL PROTECTED]>

HID: Do not discard truncated input reports

Truncated reports should not be discarded since it prevents buggy
devices from communicating with userspace.

Prior to the regression introduced in 2.6.20, a shorter-than-expected
report in hid_input_report() was passed thru after having the missing
bytes cleared. This behavior was established over a few patches in the
2.6.early-teens days, including commit
cd6104572bca9e4afe0dcdb8ecd65ef90b01297b.

This patch restores the previous behavior and fixes the regression.

Signed-off-by: Adam Kropelin <[EMAIL PROTECTED]>
Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/hid/hid-core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -975,7 +975,7 @@ int hid_input_report(struct hid_device *
 
 	if (size < rsize) {
 		dbg("report %d is too short, (%d < %d)", report->id, size, rsize);
-		return -1;
+		memset(data + size, 0, rsize - size);
 	}
 
 	if ((hid->claimed & HID_CLAIMED_HIDDEV) && hid->hiddev_report_event)

--
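The behaviour this patch restores, zero-filling the missing tail of a short report instead of rejecting it, can be shown in isolation. The sketch below is a userspace stand-in, not the HID core code: pad_short_report is a hypothetical helper, and it assumes the buffer has room for the full expected length.

```c
#include <stddef.h>
#include <string.h>

/*
 * Zero-fill the tail of a short report so later parsing sees `expected`
 * bytes.  `buf` must have room for at least `expected` bytes.
 * Returns the number of bytes that were cleared (0 if nothing was short).
 */
size_t pad_short_report(unsigned char *buf, size_t received, size_t expected)
{
	if (received >= expected)
		return 0;
	memset(buf + received, 0, expected - received);
	return expected - received;
}
```

A device that sends 2 bytes of a 4-byte report therefore still reaches userspace, with the last two bytes reading as zero.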
Re: 2.6.21-rc6-mm1 USB related boot hang
On Wed, Apr 11, 2007 at 01:43:46PM -0700, Andrew Morton wrote:
>
> OK. If you add initcall_debug to the kernel boot command line, what's the
> last thing we call?

The last messages (handwritten, somewhat shortened)
calling hid_init+0x0/0x10()
returned 0
ran for 0 msec
calling hid_init+0x0/0x50()
usbcore registered new interface driver hiddev

and then it hangs completely.

Helge Hafting
[patch 11/31] Fix calculation for size of filemap_attr array in md/bitmap.
-stable review patch. If anyone has any objections, please let us know.

--
From: Neil Brown <[EMAIL PROTECTED]>

If 'num_pages' were ever 1 more than a multiple of 8 (32bit platforms)
or of 16 (64 bit platforms), filemap_attr would be allocated one
'unsigned long' shorter than required. We need a round-up in there.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/md/bitmap.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -863,9 +863,7 @@ static int bitmap_init_from_disk(struct
 
 	/* We need 4 bits per page, rounded up to a multiple of sizeof(unsigned long) */
 	bitmap->filemap_attr = kzalloc(
-		(((num_pages*4/8)+sizeof(unsigned long)-1)
-		 /sizeof(unsigned long))
-		*sizeof(unsigned long),
+		roundup( DIV_ROUND_UP(num_pages*4, 8), sizeof(unsigned long)),
 		GFP_KERNEL);
 	if (!bitmap->filemap_attr)
 		goto out;

--
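The off-by-one is easy to check by evaluating both expressions side by side. The sketch below re-creates them in userspace: the two macros are local stand-ins written for this note that mirror the kernel macros of the same name, the function names are hypothetical, and the word size is passed in as a parameter so both the 32-bit and 64-bit cases can be tried on any machine.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel macros of the same name. */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define roundup(x, y)		((((x) + ((y) - 1)) / (y)) * (y))

/* Size in bytes computed by the expression the patch removes. */
size_t attr_bytes_old(size_t num_pages, size_t word)
{
	/* num_pages*4/8 truncates before the round-up to a word multiple */
	return (((num_pages * 4 / 8) + word - 1) / word) * word;
}

/* Size in bytes computed by the expression the patch adds. */
size_t attr_bytes_new(size_t num_pages, size_t word)
{
	return roundup(DIV_ROUND_UP(num_pages * 4, 8), word);
}
```

With 17 pages and 8-byte words, 68 bits are needed, which is 9 bytes: the old expression yields 8 bytes and the new one 16. With 9 pages and 4-byte words the old expression yields 4 bytes and the new one 8.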
[patch 09/31] DVB: pluto2: fix incorrect TSCR register setting
-stable review patch. If anyone has any objections, please let us know.

--
From: Andreas Oberritter <[EMAIL PROTECTED]>

DVB: pluto2: fix incorrect TSCR register setting

The ADEF bits in the TSCR register have different meanings in read
and write mode. For this reason ADEF has to be reset on every
read-modify-write operation. This patch introduces a special
write function for this register, which takes care of it.

Thanks to Holger Magnussen for pointing my nose at this problem.

(cherry picked from commit 1489f90a49f0603a393e1800d729050f6e332bec)

Signed-off-by: Andreas Oberritter <[EMAIL PROTECTED]>
Signed-off-by: Mauro Carvalho Chehab <[EMAIL PROTECTED]>
Signed-off-by: Michael Krufky <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/media/dvb/pluto2/pluto2.c |   22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

--- a/drivers/media/dvb/pluto2/pluto2.c
+++ b/drivers/media/dvb/pluto2/pluto2.c
@@ -149,6 +149,15 @@ static inline void pluto_rw(struct pluto
 	writel(val, &pluto->io_mem[reg]);
 }
 
+static void pluto_write_tscr(struct pluto *pluto, u32 val)
+{
+	/* set the number of packets */
+	val &= ~TSCR_ADEF;
+	val |= TS_DMA_PACKETS / 2;
+
+	pluto_writereg(pluto, REG_TSCR, val);
+}
+
 static void pluto_setsda(void *data, int state)
 {
 	struct pluto *pluto = data;
@@ -213,11 +222,11 @@ static void pluto_reset_ts(struct pluto
 
 	if (val & TSCR_RSTN) {
 		val &= ~TSCR_RSTN;
-		pluto_writereg(pluto, REG_TSCR, val);
+		pluto_write_tscr(pluto, val);
 	}
 	if (reenable) {
 		val |= TSCR_RSTN;
-		pluto_writereg(pluto, REG_TSCR, val);
+		pluto_write_tscr(pluto, val);
 	}
 }
 
@@ -339,7 +348,7 @@ static irqreturn_t pluto_irq(int irq, vo
 	}
 
 	/* ACK the interrupt */
-	pluto_writereg(pluto, REG_TSCR, tscr | TSCR_IACK);
+	pluto_write_tscr(pluto, tscr | TSCR_IACK);
 
 	return IRQ_HANDLED;
 }
@@ -348,9 +357,6 @@ static void __devinit pluto_enable_irqs(
 {
 	u32 val = pluto_readreg(pluto, REG_TSCR);
 
-	/* set the number of packets */
-	val &= ~TSCR_ADEF;
-	val |= TS_DMA_PACKETS / 2;
 	/* disable AFUL and LOCK interrupts */
 	val |= (TSCR_MSKA | TSCR_MSKL);
 	/* enable DMA and OVERFLOW interrupts */
@@ -358,7 +364,7 @@ static void __devinit pluto_enable_irqs(
 	/* clear pending interrupts */
 	val |= TSCR_IACK;
 
-	pluto_writereg(pluto, REG_TSCR, val);
+	pluto_write_tscr(pluto, val);
 }
 
 static void pluto_disable_irqs(struct pluto *pluto)
@@ -370,7 +376,7 @@ static void pluto_disable_irqs(struct pl
 	/* clear pending interrupts */
 	val |= TSCR_IACK;
 
-	pluto_writereg(pluto, REG_TSCR, val);
+	pluto_write_tscr(pluto, val);
 }
 
 static int __devinit pluto_hw_init(struct pluto *pluto)

--
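The point of routing every TSCR write through one helper is that whatever was read back in the ADEF field never gets written out again. The masking step on its own looks like the sketch below. The numeric values of TSCR_ADEF and TS_DMA_PACKETS are placeholders assumed for this note, since the patch does not show the driver's definitions, and tscr_write_value is a hypothetical name for the pure part of pluto_write_tscr().

```c
#include <stdint.h>

/* Placeholder values for the sketch; the real ones live in the driver. */
#define TSCR_ADEF	0x000000ffu	/* assumed: field occupies the low byte */
#define TS_DMA_PACKETS	8u		/* assumed packet count */

/*
 * Value that should actually be written to TSCR: the ADEF field read
 * back from the register is discarded and replaced with the packet
 * count, while all other bits pass through unchanged.
 */
uint32_t tscr_write_value(uint32_t val)
{
	val &= ~TSCR_ADEF;
	val |= TS_DMA_PACKETS / 2;
	return val;
}
```

Under these assumed values, a read-back of 0x000100ff becomes 0x00010004 on write, so stale ADEF bits from the read side cannot leak into the write.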
[patch 07/31] sky2: phy workarounds for Yukon EC-U A1
-stable review patch. If anyone has any objections, please let us know.

--
From: Stephen Hemminger <[EMAIL PROTECTED]>

The workaround for Yukon EC-U wasn't comparing against the correct
version and wasn't doing the correct setup. Without it, 88e8056 throws
all sorts of errors.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/sky2.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -524,9 +524,9 @@ static void sky2_phy_init(struct sky2_hw
 		ledover &= ~PHY_M_LED_MO_RX;
 	}
 
-	if (hw->chip_id == CHIP_ID_YUKON_EC_U && hw->chip_rev == CHIP_REV_YU_EC_A1) {
+	if (hw->chip_id == CHIP_ID_YUKON_EC_U &&
+	    hw->chip_rev == CHIP_REV_YU_EC_U_A1) {
 		/* apply fixes in PHY AFE */
-		pg = gm_phy_read(hw, port, PHY_MARV_EXT_ADR);
 		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 255);
 
 		/* increase differential signal amplitude in 10BASE-T */
@@ -538,7 +538,7 @@ static void sky2_phy_init(struct sky2_hw
 		gm_phy_write(hw, port, 0x17, 0x2002);
 
 		/* set page register to 0 */
-		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, pg);
+		gm_phy_write(hw, port, PHY_MARV_EXT_ADR, 0);
 	} else {
 		gm_phy_write(hw, port, PHY_MARV_LED_CTRL, ledctrl);
 

--
[patch 14/31] Fix IFB net driver input device crashes
-stable review patch.  If anyone has any objections, please let us know.

--
From: Patrick McHardy <[EMAIL PROTECTED]>

[IFB]: Fix crash on input device removal

The input_device pointer is not refcounted, which means the device may
disappear while packets are queued, causing a crash when ifb passes packets
with a stale skb->dev pointer to netif_rx().

Fix by storing the interface index instead and doing a lookup where
necessary.

Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Acked-by: Jamal Hadi Salim <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/ifb.c | 35 +--
 include/linux/skbuff.h |5 +++--
 include/net/pkt_cls.h |7 +--
 net/core/dev.c |8
 net/core/skbuff.c |2 +-
 net/sched/act_mirred.c |2 +-
 6 files changed, 27 insertions(+), 32 deletions(-)

--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -96,17 +96,24 @@ static void ri_tasklet(unsigned long dev
 		skb->tc_verd = SET_TC_NCLS(skb->tc_verd);
 		stats->tx_packets++;
 		stats->tx_bytes +=skb->len;
+
+		skb->dev = __dev_get_by_index(skb->iif);
+		if (!skb->dev) {
+			dev_kfree_skb(skb);
+			stats->tx_dropped++;
+			break;
+		}
+		skb->iif = _dev->ifindex;
+
 		if (from & AT_EGRESS) {
 			dp->st_rx_frm_egr++;
 			dev_queue_xmit(skb);
 		} else if (from & AT_INGRESS) {
-
 			dp->st_rx_frm_ing++;
+			skb_pull(skb, skb->dev->hard_header_len);
 			netif_rx(skb);
-		} else {
-			dev_kfree_skb(skb);
-			stats->tx_dropped++;
-		}
+		} else
+			BUG();
 	}
 
 	if (netif_tx_trylock(_dev)) {
@@ -157,26 +164,10 @@ static int ifb_xmit(struct sk_buff *skb,
 	stats->rx_packets++;
 	stats->rx_bytes+=skb->len;
 
-	if (!from || !skb->input_dev) {
-dropped:
+	if (!(from & (AT_INGRESS|AT_EGRESS)) || !skb->iif) {
 		dev_kfree_skb(skb);
 		stats->rx_dropped++;
 		return ret;
-	} else {
-		/*
-		 * note we could be going
-		 * ingress -> egress or
-		 * egress -> ingress
-		 */
-		skb->dev = skb->input_dev;
-		skb->input_dev = dev;
-		if (from & AT_INGRESS) {
-			skb_pull(skb, skb->dev->hard_header_len);
-		} else {
-			if (!(from & AT_EGRESS)) {
-				goto dropped;
-			}
-		}
 	}
 
 	if (skb_queue_len(&dp->rq) >= dev->tx_queue_len) {
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -188,7 +188,7 @@ enum {
  *	@sk: Socket we are owned by
  *	@tstamp: Time we arrived
  *	@dev: Device we arrived on/are leaving by
- *	@input_dev: Device we arrived on
+ *	@iif: ifindex of device we arrived on
  *	@h: Transport layer header
  *	@nh: Network layer header
  *	@mac: Link layer header
@@ -235,7 +235,8 @@ struct sk_buff {
 	struct sock		*sk;
 	struct skb_timeval	tstamp;
 	struct net_device	*dev;
-	struct net_device	*input_dev;
+	int			iif;
+	/* 4 byte hole on 64 bit*/
 
 	union {
 		struct tcphdr	*th;
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -352,10 +352,13 @@ tcf_change_indev(struct tcf_proto *tp, c
 static inline int
 tcf_match_indev(struct sk_buff *skb, char *indev)
 {
+	struct net_device *dev;
+
 	if (indev[0]) {
-		if (!skb->input_dev)
+		if (!skb->iif)
 			return 0;
-		if (strcmp(indev, skb->input_dev->name))
+		dev = __dev_get_by_index(skb->iif);
+		if (!dev || strcmp(indev, dev->name))
 			return 0;
 	}
 
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1741,8 +1741,8 @@ static int ing_filter(struct sk_buff *sk
 	if (dev->qdisc_ingress) {
 		__u32 ttl = (__u32) G_TC_RTTL(skb->tc_verd);
 		if (MAX_RED_LOOP < ttl++) {
-			printk(KERN_WARNING "Redir loop detected Dropping packet (%s->%s)\n",
-				skb->input_dev->name, skb->dev->name);
+			printk(KERN_WARNING "Redir loop detected Dropping packet (%d->%d)\n",
+				skb->iif, skb->dev->ifindex);
 			return TC_ACT_SHOT;
 		}
 
@@ -1775,8 +1775,8 @@ int netif_receive_skb(struct sk_buff *sk
 	if (!skb->tstamp.off_sec)
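
The idea behind the fix above (carry an interface index in the packet and
look the device up again at the point of use, instead of holding a raw
pointer that may go stale) can be sketched in user space roughly as follows.
This is only an illustration under stated assumptions: toy_dev, toy_pkt,
dev_table and the toy_* helpers are hypothetical names, not kernel code, and
unlike real ifindex values this toy table reuses indices as soon as a slot is
freed.

```c
#include <stddef.h>

#define MAX_DEVS 8

/* Hypothetical stand-in for struct net_device; not the kernel type. */
struct toy_dev {
	int ifindex;		/* 0 means "slot unused" */
	int hard_header_len;
};

/* Hypothetical stand-in for the part of sk_buff that matters here. */
struct toy_pkt {
	int iif;		/* index of the device the packet arrived on */
};

static struct toy_dev dev_table[MAX_DEVS];

/* Register a device; returns its index (>= 1), or 0 if the table is full. */
static int toy_register(int hard_header_len)
{
	for (int i = 0; i < MAX_DEVS; i++) {
		if (dev_table[i].ifindex == 0) {
			dev_table[i].ifindex = i + 1;
			dev_table[i].hard_header_len = hard_header_len;
			return i + 1;
		}
	}
	return 0;
}

/* Remove a device; packets that still carry its index now fail lookup. */
static void toy_unregister(int ifindex)
{
	if (ifindex >= 1 && ifindex <= MAX_DEVS)
		dev_table[ifindex - 1].ifindex = 0;
}

/* Look the device up at use time; NULL if it has gone away. */
static struct toy_dev *toy_get_by_index(int ifindex)
{
	if (ifindex < 1 || ifindex > MAX_DEVS)
		return NULL;
	if (dev_table[ifindex - 1].ifindex != ifindex)
		return NULL;
	return &dev_table[ifindex - 1];
}

/* Returns 1 if the packet could be delivered, 0 if it had to be dropped. */
static int toy_deliver(const struct toy_pkt *pkt)
{
	struct toy_dev *dev = toy_get_by_index(pkt->iif);

	if (!dev)
		return 0;	/* device vanished while the packet was queued */
	return 1;
}
```

A queued packet whose device is removed in the meantime is dropped by
toy_deliver() rather than dereferencing freed memory, which is the same
trade the patch makes with __dev_get_by_index().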
[patch 12/31] 8139too: RTNL and flush_scheduled_work deadlock
-stable review patch.  If anyone has any objections, please let us know.

--
From: Francois Romieu <[EMAIL PROTECTED]>

Your usual dont-flush_scheduled_work-with-RTNL-held stuff.

It is a bit different here since the thread runs permanently
or is only occasionally kicked for recovery depending on the
hardware revision.

Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>
Cc: Ben Greear <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/8139too.c | 40 +---
 1 file changed, 17 insertions(+), 23 deletions(-)

--- a/drivers/net/8139too.c
+++ b/drivers/net/8139too.c
@@ -1109,6 +1109,8 @@ static void __devexit rtl8139_remove_one
 
 	assert (dev != NULL);
 
+	flush_scheduled_work();
+
 	unregister_netdev (dev);
 
 	__rtl8139_cleanup_dev (dev);
@@ -1603,18 +1605,21 @@ static void rtl8139_thread (struct work_
 	struct net_device *dev = tp->mii.dev;
 	unsigned long thr_delay = next_tick;
 
+	rtnl_lock();
+
+	if (!netif_running(dev))
+		goto out_unlock;
+
 	if (tp->watchdog_fired) {
 		tp->watchdog_fired = 0;
 		rtl8139_tx_timeout_task(work);
-	} else if (rtnl_trylock()) {
-		rtl8139_thread_iter (dev, tp, tp->mmio_addr);
-		rtnl_unlock ();
-	} else {
-		/* unlikely race.  mitigate with fast poll. */
-		thr_delay = HZ / 2;
-	}
+	} else
+		rtl8139_thread_iter(dev, tp, tp->mmio_addr);
 
-	schedule_delayed_work(&tp->thread, thr_delay);
+	if (tp->have_thread)
+		schedule_delayed_work(&tp->thread, thr_delay);
+out_unlock:
+	rtnl_unlock ();
 }
 
 static void rtl8139_start_thread(struct rtl8139_private *tp)
@@ -1626,19 +1631,11 @@ static void rtl8139_start_thread(struct
 		return;
 
 	tp->have_thread = 1;
+	tp->watchdog_fired = 0;
 
 	schedule_delayed_work(&tp->thread, next_tick);
 }
 
-static void rtl8139_stop_thread(struct rtl8139_private *tp)
-{
-	if (tp->have_thread) {
-		cancel_rearming_delayed_work(&tp->thread);
-		tp->have_thread = 0;
-	} else
-		flush_scheduled_work();
-}
-
 static inline void rtl8139_tx_clear (struct rtl8139_private *tp)
 {
 	tp->cur_tx = 0;
@@ -1696,12 +1693,11 @@ static void rtl8139_tx_timeout (struct n
 {
 	struct rtl8139_private *tp = netdev_priv(dev);
 
+	tp->watchdog_fired = 1;
 	if (!tp->have_thread) {
-		INIT_DELAYED_WORK(&tp->thread, rtl8139_tx_timeout_task);
+		INIT_DELAYED_WORK(&tp->thread, rtl8139_thread);
 		schedule_delayed_work(&tp->thread, next_tick);
-	} else
-		tp->watchdog_fired = 1;
-
+	}
 }
 
 static int rtl8139_start_xmit (struct sk_buff *skb, struct net_device *dev)
@@ -2233,8 +2229,6 @@ static int rtl8139_close (struct net_dev
 
 	netif_stop_queue (dev);
 
-	rtl8139_stop_thread(tp);
-
 	if (netif_msg_ifdown(tp))
 		printk(KERN_DEBUG "%s: Shutting down ethercard, status was 0x%4.4x.\n",
 			dev->name, RTL_R16 (IntrStatus));

--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[patch 06/31] sky2: turn on clocks when doing resume
-stable review patch.  If anyone has any objections, please let us know.

--
From: Stephen Hemminger <[EMAIL PROTECTED]>

Some of these chips are disabled until clock is enabled.
This fixes:
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=404107

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/sky2.c |7 +++
 1 file changed, 7 insertions(+)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -2421,6 +2421,10 @@ static int sky2_reset(struct sky2_hw *hw
 		return -EOPNOTSUPP;
 	}
 
+	/* Make sure and enable all clocks */
+	if (hw->chip_id == CHIP_ID_YUKON_EC_U)
+		sky2_pci_write32(hw, PCI_DEV_REG3, 0);
+
 	hw->chip_rev = (sky2_read8(hw, B2_MAC_CFG) & CFG_CHIP_R_MSK) >> 4;
 
 	/* This rev is really old, and requires untested workarounds */
@@ -3639,6 +3643,9 @@ static int sky2_resume(struct pci_dev *p
 
 	pci_restore_state(pdev);
 	pci_enable_wake(pdev, PCI_D0, 0);
+
+	if (hw->chip_id == CHIP_ID_YUKON_EC_U)
+		sky2_pci_write32(hw, PCI_DEV_REG3, 0);
 	sky2_set_power_state(hw, PCI_D0);
 
 	err = sky2_reset(hw);
[patch 05/31] sky2: turn carrier off when down
-stable review patch.  If anyone has any objections, please let us know.

--
From: Stephen Hemminger <[EMAIL PROTECTED]>

Driver needs to turn off carrier when down.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 drivers/net/sky2.c |1 +
 1 file changed, 1 insertion(+)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1506,6 +1506,7 @@ static int sky2_down(struct net_device *
 
 	/* Stop more packets from being queued */
 	netif_stop_queue(dev);
+	netif_carrier_off(dev);
 
 	/* Disable port IRQ */
 	imask = sky2_read32(hw, B0_IMSK);
[patch 20/31] Fix TCP slow_start_after_idle sysctl
-stable review patch.  If anyone has any objections, please let us know.

--
From: David Miller <[EMAIL PROTECTED]>

[TCP]: slow_start_after_idle should influence cwnd validation too

For the cases that slow_start_after_idle is meant to deal with,
it is almost a certainty that the congestion window tests will
think the connection is application limited and we'll thus
decrease the cwnd there too.  This defeats the whole point of
setting slow_start_after_idle to zero.

So test it there too.

We do not cancel out the entire tcp_cwnd_validate() function
so that if the sysctl is changed we still have the validation
state maintained.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 net/ipv4/tcp_output.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -943,7 +943,8 @@ static void tcp_cwnd_validate(struct soc
 		if (tp->packets_out > tp->snd_cwnd_used)
 			tp->snd_cwnd_used = tp->packets_out;
 
-		if ((s32)(tcp_time_stamp - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto)
+		if (sysctl_tcp_slow_start_after_idle &&
+		    (s32)(tcp_time_stamp - tp->snd_cwnd_stamp) >= inet_csk(sk)->icsk_rto)
 			tcp_cwnd_application_limited(sk);
 	}
 }
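
The gating logic described above can be sketched in isolation as a small
user-space predicate. This is a simplified illustration, not the kernel
function: should_shrink_cwnd() and idle_for_rto() are hypothetical names, the
tick counter is assumed to be a free-running 32-bit value, and elapsed time
is computed with unsigned subtraction so the result stays correct when the
counter wraps.

```c
#include <stdint.h>

/*
 * Has at least `rto` ticks elapsed since `stamp`?  Unsigned subtraction
 * is modulo 2^32, so this remains correct across counter wraparound.
 */
static int idle_for_rto(uint32_t now, uint32_t stamp, uint32_t rto)
{
	return (uint32_t)(now - stamp) >= rto;
}

/*
 * Only treat the connection as application limited (and so shrink the
 * congestion window) when the slow_start_after_idle knob is non-zero.
 * With the knob at zero the idle test is skipped entirely.
 */
static int should_shrink_cwnd(int slow_start_after_idle,
			      uint32_t now, uint32_t stamp, uint32_t rto)
{
	return slow_start_after_idle && idle_for_rto(now, stamp, rto);
}
```

With the knob set to 0 the predicate is always false, so the window is left
alone however long the connection has been idle; the surrounding bookkeeping
(the equivalent of snd_cwnd_used) would still be updated by the caller.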
[patch 22/31] knfsd: allow nfsd READDIR to return 64bit cookies
-stable review patch.  If anyone has any objections, please let us know.

--
From: Neil Brown <[EMAIL PROTECTED]>

[PATCH] knfsd: allow nfsd READDIR to return 64bit cookies

->readdir passes loff_t offsets (used as nfs cookies) to
nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it
becomes an 'off_t', which isn't good.

So filesystems that returned 64bit offsets would lose.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
Cc: Chuck Ebbert <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 fs/nfsd/nfs3xdr.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -844,8 +844,8 @@ compose_entry_fh(struct nfsd3_readdirres
 #define NFS3_ENTRY_BAGGAGE	(2 + 1 + 2 + 1)
 #define NFS3_ENTRYPLUS_BAGGAGE	(1 + 21 + 1 + (NFS3_FHSIZE >> 2))
 static int
-encode_entry(struct readdir_cd *ccd, const char *name,
-	     int namlen, off_t offset, ino_t ino, unsigned int d_type, int plus)
+encode_entry(struct readdir_cd *ccd, const char *name, int namlen,
+	     loff_t offset, ino_t ino, unsigned int d_type, int plus)
 {
 	struct nfsd3_readdirres *cd = container_of(ccd, struct nfsd3_readdirres,
 							common);
@@ -865,7 +865,7 @@ encode_entry(struct readdir_cd *ccd, con
 			*cd->offset1 = htonl(offset64 & 0x);
 			cd->offset1 = NULL;
 		} else {
-			xdr_encode_hyper(cd->offset, (u64) offset);
+			xdr_encode_hyper(cd->offset, offset64);
 		}
 	}
 
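
To see why passing the cookie through a narrower type "would lose", here is a
small user-space sketch. It assumes a 32-bit intermediate (modelled with
uint32_t for simplicity, whereas a real 32-bit off_t is signed), and the
helper names through_narrow(), encode_hyper() and decode_hyper() are
hypothetical, not the nfsd functions. The encoder writes the value as eight
big-endian bytes, which is the byte order XDR uses for a 64-bit hyper.

```c
#include <stdint.h>

/* Encode a 64-bit value as 8 big-endian bytes. */
static void encode_hyper(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Decode 8 big-endian bytes back into a 64-bit value. */
static uint64_t decode_hyper(const uint8_t in[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | in[i];
	return v;
}

/*
 * Model of the bug: the cookie is squeezed through a 32-bit parameter
 * before being widened again, so only the low 32 bits survive.
 */
static uint64_t through_narrow(uint64_t cookie)
{
	uint32_t narrow = (uint32_t)cookie;	/* conversion is modulo 2^32 */

	return narrow;
}
```

A cookie such as 0x123456789ABCDEF0 round-trips intact through
encode_hyper()/decode_hyper(), but comes back as 0x9ABCDEF0 from
through_narrow(), so a client resuming READDIR with it would land on the
wrong directory position.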
[patch 18/31] Fix IPSEC replay window handling
-stable review patch.  If anyone has any objections, please let us know.

--
From: Herbert Xu <[EMAIL PROTECTED]>

[IPSEC]: Reject packets within replay window but outside the bit mask

Up until this point we've accepted replay window settings greater than
32 but our bit mask can only accommodate 32 packets.  Thus any packet
with a sequence number within the window but outside the bit mask would
be accepted.

This patch causes those packets to be rejected instead.

Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>
Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]>

---
 net/xfrm/xfrm_state.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -1220,7 +1220,8 @@ int xfrm_replay_check(struct xfrm_state
 		return 0;
 
 	diff = x->replay.seq - seq;
-	if (diff >= x->props.replay_window) {
+	if (diff >= min_t(unsigned int, x->props.replay_window,
+			  sizeof(x->replay.bitmap) * 8)) {
 		x->stats.replay_window++;
 		return -EINVAL;
 	}
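
The clamped check can be sketched in user space as below. This is a
simplified model under stated assumptions, not the xfrm code: struct
toy_replay and toy_replay_check() are hypothetical, the bitmap is a single
32-bit word, and sequence-number wraparound is ignored.

```c
#include <stdint.h>

struct toy_replay {
	uint32_t seq;		/* highest sequence number seen so far */
	uint32_t bitmap;	/* bit i set => (seq - i) was already seen */
	uint32_t window;	/* configured window; may exceed 32 */
};

/* Returns 0 if the packet may be accepted, -1 if it must be rejected. */
static int toy_replay_check(const struct toy_replay *r, uint32_t seq)
{
	uint32_t bits = (uint32_t)(sizeof(r->bitmap) * 8);
	uint32_t limit = r->window < bits ? r->window : bits;
	uint32_t diff;

	if (seq == 0)
		return -1;
	if (seq > r->seq)
		return 0;	/* newer than anything seen so far */

	diff = r->seq - seq;
	if (diff >= limit)
		return -1;	/* older than the bitmap can track */
	if (r->bitmap & ((uint32_t)1 << diff))
		return -1;	/* already seen: a replay */
	return 0;
}
```

With window = 64 and seq = 100, a packet numbered 60 (diff 40) is inside the
configured window but beyond the 32 bits the bitmap can record; the clamp
rejects it, whereas comparing against the raw window would have let it
through unchecked.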