Re: r8169: NFG in 2.6.24-rc2
Am Wed, 07 Nov 2007 11:07:07 -0500 schrieb Mark Lord <[EMAIL PROTECTED]>: > My ASUS board has one of these: > > 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. > RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01) > Subsystem: ASUSTeK Computer Inc. Unknown device 81aa Control: I/O+ > Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- > FastB2B- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast > >TAbort- SERR- >IRQ 16 Region 0: I/O ports at 9800 [size=256] > Region 2: Memory at ff3ff000 (64-bit, non-prefetchable) > [size=4K] Expansion ROM at ff3c [disabled] [size=128K] > Capabilities: [40] Power Management version 2 > Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA > PME(D0-,D1+,D2+,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 > DScale=0 PME- Capabilities: [48] Vital Product Data > Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ > Queue=0/1 Enable- Address: Data: > Capabilities: [60] Express Endpoint IRQ 0 > Device: Supported: MaxPayload 1024 bytes, PhantFunc > 0, ExtTag+ Device: Latency L0s <1us, L1 unlimited > Device: AtnBtn+ AtnInd+ PwrInd+ > Device: Errors: Correctable- Non-Fatal- Fatal- > Unsupported- Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ > Device: MaxPayload 128 bytes, MaxReadReq 4096 bytes > Link: Supported Speed 2.5Gb/s, Width x4, ASPM L0s, > Port 0 Link: Latency L0s unlimited, L1 unlimited > Link: ASPM Disabled RCB 64 bytes CommClk+ ExtSynch- > Link: Speed 2.5Gb/s, Width x1 > Capabilities: [84] Vendor Specific Information > > It works perfectly in 2.6.23. > It does not work in 2.6.24-rc2. Dunno about -rc1 or earlier -git*. > > Without CONFIG_PCI_MSI, it works slightly, enough to ping it a couple > of times, but it then dies when used for anything real: > > r8169 Gigabit Ethernet driver 2.2LK loaded > r8169 :01:00.0: no MSI. Back to INTx. > ... > eth0: RTL8168b/8111b at 0xf884a000, 00:17:31:64:e0:bc, XID > 3000 IRQ 16 ... > r8169: eth0: link up > ... 
> kernel: NETDEV WATCHDOG: eth0: transmit timed out > r8169: eth0: link up > ... > Not usable from this point on. Same problem here with a MSI K9AGM2 board. The problem appeared in 2.6.24-rc1 (http://bugzilla.kernel.org/show_bug.cgi?id=9257). It seems to be better in -rc2, at least the chip is detected again. I can assign an IP to that interface and bring it up, but no data traffic is possible. After some tests in both directions, ifconfig reports 157.6 KiB RX bytes, but TX bytes is 0, so sent packets seem to disappear quite early. Thanks, Hans - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Massive slowdown when re-querying large nfs dir
Andrew Morton wrote: > > > I would suggest getting a 'tcpdump -s0' trace and seeing (with > > > wireshark) what is different between the various cases. > > > > Thanks Neil for looking into this. Your suggestion has already been > > answered in a previous post, where the difference has been attributed to > > "ls -l" inducing lookup for the first try, which is fast, and getattr > > for later tries, which is super-slow. > > > > Now it's easy to blame the userland rpc.nfs.V2 server for this, but > > what's not clear is how come 2.4.31 handles getattr faster than 2.6.23? > > We broke 2.6? It'd be interesting to run the ls in an infinite loop on > the client them start poking at the server. Is the 2.6 server doing > physical IO? Is the 2.6 server consuming more system time? etc. A basic > `vmstat 1' trace for both 2.4 and 2.6 would be a starting point. > > Could be that there's some additional latency caused by networking > changes, too. I expect the tcpdump/wireshark/etc traces would have > sufficient resolution for us to be able to see that. The problem turns out to be "tune2fs -O dir_index". Removing that feature resolves the big slowdown. Does 2.4.31 support this feature? Neil Brown wrote: > Maybe an "strace -tt" of the nfs server might show some significant > difference. 
### # ls -l <3K dir entry> (first try after mount inducing lookup) in ~3sec # strace -tt rpc.nfsd 08:28:14.668557 time([1194499694]) = 1194499694 08:28:14.669420 alarm(5)= 2 08:28:14.669667 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4]) 08:28:14.670142 recvfrom(4, "\275\3607{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\4"..., 8800, 0, {sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, [16]) = 116 08:28:14.670554 time(NULL) = 1194499694 08:28:14.670711 time([1194499694]) = 1194499694 08:28:14.670875 lstat("/a/x", {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0 08:28:14.671134 time([1194499694]) = 1194499694 08:28:14.671302 lstat("/a/x/3619", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 08:28:14.671530 time([1194499694]) = 1194499694 08:28:14.671701 alarm(2)= 5 08:28:14.671903 time([1194499694]) = 1194499694 08:28:14.672060 lstat("/a/x/3619", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 08:28:14.672305 time([1194499694]) = 1194499694 08:28:14.672508 sendto(4, "\275\3607{\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128, 0, {sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 16) = 128 08:28:14.672909 time([1194499694]) = 1194499694 08:28:14.673869 alarm(5)= 2 08:28:14.674145 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4]) 08:28:14.674589 recvfrom(4, "\276\3607{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\4"..., 8800, 0, {sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, [16]) = 116 08:28:14.675003 time(NULL) = 1194499694 08:28:14.675160 time([1194499694]) = 1194499694 08:28:14.675321 lstat("/a/x", {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0 08:28:14.675581 time([1194499694]) = 1194499694 08:28:14.675749 lstat("/a/x/3631", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 08:28:14.675979 time([1194499694]) = 1194499694 08:28:14.676150 alarm(2)= 5 08:28:14.676348 time([1194499694]) = 1194499694 08:28:14.676505 lstat("/a/x/3631", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 08:28:14.676746 
time([1194499694]) = 1194499694 08:28:14.676952 sendto(4, "\276\3607{\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128, 0, {sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 16) = 128 ## # ls -l <3K dir entry> (second try after mount inducing getattr) in ~11sec # strace -tt rpc.nfsd 08:28:40.963668 time([1194499720]) = 1194499720 08:28:40.964525 alarm(5)= 2 08:28:40.964772 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4]) 08:28:40.965215 recvfrom(4, ",\3747{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\1\0\0"..., 8800, 0, {sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, [16]) = 108 08:28:40.965609 time(NULL) = 1194499720 08:28:40.965763 time([1194499720]) = 1194499720 08:28:40.965941 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0 08:28:40.966176 setfsuid(0) = 0 08:28:40.966329 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0 08:28:40.966539 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0 08:28:40.966748 open("/", O_RDONLY|O_NONBLOCK) = 0 08:28:40.966919 fcntl(0, F_SETFD, FD_CLOEXEC) = 0 08:28:40.967084 lseek(0, 0, SEEK_CUR) = 0 08:28:40.967240 getdents(0, /* 71 entries */, 3933) = 1220 08:28:40.968195 close(0)= 0 08:28:40.968351 stat("/a/", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0 08:28:40.968583 stat("/a/",
Re: LTP ustat01 test fails on NFSROOT
On Nov 2, 2007, at 9:28 AM, Kumar Gala wrote:

> On Thu, 25 Oct 2007, Trond Myklebust wrote:
>
>> Could you please try the following patch?
>>
>> Cheers
>>   Trond
>
> It's a new month so I'll ping again about sending this fix upstream to
> Linus for 2.6.24 :) ?
>
> - k

Trond, any update on sending this to Linus for 2.6.24?

- k

- CUT HERE -
From: Trond Myklebust <[EMAIL PROTECTED]>
Date: Thu, 25 Oct 2007 13:56:10 -0400
NFS: Fix the ustat() regression

Since 2.6.18, the superblock sb->s_root has been a dummy dentry with a
dummy inode. This breaks ustat(), which actually uses sb->s_root in a
vfstat() call.

Fix this by making the s_root a dummy alias to the directory inode that was
used when creating the superblock.

Signed-off-by: Trond Myklebust <[EMAIL PROTECTED]>
---

 fs/nfs/getroot.c | 81
 1 files changed, 27 insertions(+), 54 deletions(-)

diff --git a/fs/nfs/getroot.c b/fs/nfs/getroot.c
index 522e5ad..0ee4384 100644
--- a/fs/nfs/getroot.c
+++ b/fs/nfs/getroot.c
@@ -43,6 +43,25 @@
 #define NFSDBG_FACILITY		NFSDBG_CLIENT
 
 /*
+ * Set the superblock root dentry.
+ * Note that this function frees the inode in case of error.
+ */
+static int nfs_superblock_set_dummy_root(struct super_block *sb, struct inode *inode)
+{
+	/* The mntroot acts as the dummy root dentry for this superblock */
+	if (sb->s_root == NULL) {
+		sb->s_root = d_alloc_root(inode);
+		if (sb->s_root == NULL) {
+			iput(inode);
+			return -ENOMEM;
+		}
+		/* Circumvent igrab(): we know the inode is not being freed */
+		atomic_inc(&inode->i_count);
+	}
+	return 0;
+}
+
+/*
  * get an NFS2/NFS3 root dentry from the root filehandle
  */
 struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh)
@@ -54,33 +73,6 @@ struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh)
 	struct inode *inode;
 	int error;
 
-	/* create a dummy root dentry with dummy inode for this superblock */
-	if (!sb->s_root) {
-		struct nfs_fh dummyfh;
-		struct dentry *root;
-		struct inode *iroot;
-
-		memset(&dummyfh, 0, sizeof(dummyfh));
-		memset(&fattr, 0, sizeof(fattr));
-		nfs_fattr_init(&fattr);
-		fattr.valid = NFS_ATTR_FATTR;
-		fattr.type = NFDIR;
-		fattr.mode = S_IFDIR | S_IRUSR | S_IWUSR;
-		fattr.nlink = 2;
-
-		iroot = nfs_fhget(sb, &dummyfh, &fattr);
-		if (IS_ERR(iroot))
-			return ERR_PTR(PTR_ERR(iroot));
-
-		root = d_alloc_root(iroot);
-		if (!root) {
-			iput(iroot);
-			return ERR_PTR(-ENOMEM);
-		}
-
-		sb->s_root = root;
-	}
-
 	/* get the actual root for this mount */
 	fsinfo.fattr = &fattr;
 
@@ -96,6 +88,10 @@ struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh)
 		return ERR_PTR(PTR_ERR(inode));
 	}
 
+	error = nfs_superblock_set_dummy_root(sb, inode);
+	if (error != 0)
+		return ERR_PTR(error);
+
 	/* root dentries normally start off anonymous and get spliced in later
 	 * if the dentry tree reaches them; however if the dentry already
 	 * exists, we'll pick it up at this point and use it as the root
@@ -241,33 +237,6 @@ struct dentry *nfs4_get_root(struct super_block *sb, struct nfs_fh *mntfh)
 
 	dprintk("--> nfs4_get_root()\n");
 
-	/* create a dummy root dentry with dummy inode for this superblock */
-	if (!sb->s_root) {
-		struct nfs_fh dummyfh;
-		struct dentry *root;
-		struct inode *iroot;
-
-		memset(&dummyfh, 0, sizeof(dummyfh));
-		memset(&fattr, 0, sizeof(fattr));
-		nfs_fattr_init(&fattr);
-		fattr.valid = NFS_ATTR_FATTR;
-		fattr.type = NFDIR;
-		fattr.mode = S_IFDIR | S_IRUSR | S_IWUSR;
-		fattr.nlink = 2;
-
-		iroot = nfs_fhget(sb, &dummyfh, &fattr);
-		if (IS_ERR(iroot))
-			return ERR_PTR(PTR_ERR(iroot));
-
-		root = d_alloc_root(iroot);
-		if (!root) {
-			iput(iroot);
-			return ERR_PTR(-ENOMEM);
-		}
-
-		sb->s_root = root;
-	}
-
 	/* get the info about the server and filesystem */
 	error = nfs4_server_capabilities(server, mntfh);
 	if (error < 0) {
@@ -289,6 +258,10 @@ struct dentry *nfs4_get_root(struct super_block *sb, struct nfs_fh *mntfh)
 		return ERR_PTR(PTR_ERR(inode));
 	}
 
+	error = nfs_superblock_set_d
Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10
Does it work as kernel parameter? I tried libata_dma_mask=0x4 and to set 0xf or 0xff - doesn't help. How to disable DMA in libata, if it is compiled in kernel? On Thu, 8 Nov 2007 01:30:53 +0100, Bartlomiej Zolnierkiewicz wrote > On Thursday 08 November 2007, Denys Fedoryshchenko wrote: > > 2.6.24-rc2 not working very well > > > > > > dmesg > > [ 12.386395] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 > > [ 12.405579] ide: Assuming 33MHz system bus speed for PIO modes; override > > with idebus=xx > > [ 12.430441] SC1200: IDE controller (0x100b:0x0502 rev 0x01) at PCI slot > > :00:12.2 > > [ 12.454070] SC1200: not 100% native mode: will probe irqs later > > [ 12.471947] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, > > hdb:pio > > [ 12.493873] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, > > hdd:pio > > [ 12.515810] Probing IDE interface ide0... > > [ 12.528810] Clocksource tsc unstable (delta = -497423729 ns) > > [ 12.545888] Time: pit clocksource has been installed. > > [ 12.563379] hda: SanDisk SDCFH-1024, CFA DISK drive > > [ 12.578340] hda: applying conservative PIO "downgrade" > > [ 12.593869] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO1 > > [ 12.594006] hda: MW DMA 2 mode selected > > [ 12.594297] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 > > [ 12.608778] Probing IDE interface ide1... > > [ 12.623192] hda: max request size: 128KiB > > [ 12.635322] hda: 2001888 sectors (1024 MB) w/1KiB Cache, CHS=1986/16/ 63, > > DMA > > [ 12.657134] hda:<4>hda: dma_timer_expiry: dma status == 0x21 > > [ 12.865846] hda: DMA timeout error > > [ 12.876092] ide_dma_end dma_stat=21 err=1 newerr=0 > > [ 12.890753] hda: dma timeout error: status=0x58 { DriveReady SeekComplete > > DataRequest } > > [ 12.914977] ide: failed opcode was: unknown > > [ 12.927743] hda: DMA disabled > > [ 12.937035] ide0: reset: success > > [ 12.948324] hda1 > > > > Mounting taking long time on 1GB card cause of DMA issues. 
In dmesg i am not
> > sure about timestamp showing few seconds, in real life it took about 2
> > minutes.
>
> Please try booting with "hda=nodma".
>
> It could be a hardware problem (CF adapter without DMA lines).
>
> Thanks,
> Bart

--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.
Re: compat_sys_times() bogus until jiffies >= 0.
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 23:09:16 -0800

> I don't think that's a big problem? This syscall can (oddly) return any
> 32-bit (64-bit) number and a smart application developer (after saying wtf)
> would realise that he just can't check for errors and have correctly
> working code.
>
> Then again, if he was smart he just wouldn't use times(2)'s return value
> for anything. But what is the alternative? I don't think there is one,
> apart from much saner things like gettimeofday().

You and I would say "wtf", but the manual states what it does:

	On error, (clock_t) -1 is returned, and errno is set appropriately.

And I think this (obviously bogus) convention is something we are
really stuck with.

Another awful aspect of this is that glibc is going to overwrite
'errno' for this return value range. That will likely cause more
application misbehavior than some of the other side effects we've been
discussing.

In short we have two problems:

1) glibc thinks -4096 < x < 0 is an error, and will write this value
   into errno and return -1 to the application

2) the manual states that -1 means error
Re: compat_sys_times() bogus until jiffies >= 0.
> On Wed, 07 Nov 2007 22:25:30 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2007 21:20:05 -0800
>
> > Yup. But userspace will already have a fit if either the start or end time
> > advanced into the glibc-thought-that-was-an-error range.
>
> On x86 only. We could use force_successful_syscall_return()
> to make sure the condition codes get set correctly on
> other platforms.
>
> But even in that case we'd still be broken when the return
> value is exactly -1 and that's what the application is going
> to compare against to test for errors.

I don't think that's a big problem? This syscall can (oddly) return any
32-bit (64-bit) number and a smart application developer (after saying wtf)
would realise that he just can't check for errors and have correctly
working code.

Then again, if he was smart he just wouldn't use times(2)'s return value
for anything. But what is the alternative? I don't think there is one,
apart from much saner things like gettimeofday().
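
As a rough illustration of the gettimeofday() alternative mentioned above, the sketch below measures elapsed time without ever looking at the return value of times(). The helper names elapsed_usec() and time_call_usec() are made up for this example. Note that gettimeofday() follows the wall clock, so the result can be off if the clock is adjusted during the measurement.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper: microseconds from *start to *end (may be negative). */
int64_t elapsed_usec(const struct timeval *start, const struct timeval *end)
{
	return ((int64_t)end->tv_sec - (int64_t)start->tv_sec) * 1000000
		+ ((int64_t)end->tv_usec - (int64_t)start->tv_usec);
}

/* Hypothetical helper: run fn() and return how long it took, in usec. */
int64_t time_call_usec(void (*fn)(void))
{
	struct timeval t0, t1;
	int rc;

	/* gettimeofday() returns 0 or -1, so its error check is unambiguous. */
	rc = gettimeofday(&t0, NULL);
	assert(rc == 0);
	fn();
	rc = gettimeofday(&t1, NULL);
	assert(rc == 0);

	return elapsed_usec(&t0, &t1);
}
```

The point of the design is only that the status (0 or -1) and the time value travel through separate channels, which is what times() cannot offer.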
Re: Fwd: same problem with 2.6.24-rc2
[adding linux-kernel again]

werner wrote:

The compilation is ready. By any reason that list as suggested by you wasn't generated. However, the 3 compiling/linking lists what my kernel-build-script normally generates, were. They are annexed here. It's the same, after booting the kernel crashes immediately with EIP error. And the building process reclaims a missing Makefile.o in //arch/x86.

OK, first show us (that is, the mailing list "linux-kernel@vger.kernel.org", not just me) what your "kernel-build-script" looks like. The beginning of the log files that you sent to me (at end of this email) is very suspicious looking. It looks like you are not using the expected kernel build procedures.

The crash problem (snippet below) is a fault in xor_sse_2() in the function that tries to choose the best (fastest) xor method. I would expect other people to be having a similar problem.

I don't suspect that it's related to the build problem (Makefile.o), but we need to have you building kernels correctly before we try to find out why they break when you boot them.

=

On 7/Nov/2007 22:06 Randy Dunlap wrote ..

On Wed, 07 Nov 2007 21:32:43 -0300 (GFT) werner wrote:

On 7/Nov/2007 20:10 werner wrote ..

With 2.6.23-rc2 is the same problem: it crashed at the beginning:

EIP 060 c03fdea4 EFLAGS 00010212
EIP is at xor_sse_2+0x34/0x200

Again during the compilation was reclaimed that /arch/x86/Makefile.o cannot be found and were certain dependencies on it not made, such a file isn't present in the source code (present are, f.ex. Makefile_32 , Makefile_64 ), nor was generated automatically during compilation, I think this is incorrect and the reason for the problems

Hi, Please provide the complete build log (with V=1 if possible) for the missing Makefile.o problem. E.g.:

make V=1 all >build.log 2>&1

Make sure that build.log contains the error message and then send the complete build.log file to us at linux-kernel@vger.kernel.org .

wl [EMAIL PROTECTED]

=

On 7/Nov/2007 16:14 Andrew Morton wrote ..

On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[EMAIL PROTECTED]> wrote:

I really don't know what's happening. I don't understand nothing about the kernel error reporting system. Because of this, always when there is a problem, I report it via e-mail to linux-kernel@vger.kernel.org . I don't know what people there do with my messages.

It went like this:

1: you sent an email to linux-kernel
2: I sent a reply to you and linux-kernel
3: you sent a reply to me, but NOT linux-kernel!

In other words, you did "reply", not "reply to all", thus you removed three thousand people from the discussion. One of those people is the person who created the bug which you're hitting, and that person no longer knows what's happening.

So please go back and resend all those emails, and retain ALL Cc:'s. Don't just send them only to me. Keep all individuals and all mailing lists on the email Cc: list.

gcc -m32 -m elf_i386 /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o -o /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile
gcc: /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o: No such file or directory
gcc: no input files
make: [/usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile] Error 1 (ignored)

--
~Randy
Re: 2.6.34-rc1 eat my photo SD card :-(
On Wed, Nov 07 2007, Roland Dreier wrote:
> > Well, I spent the last 36 hours (more or less) trying to bisect the SD
> > problem. The method I used was to insert the card, umount it, and make 8 dd
> > in a row; the kernel is "bad" if they differ, "good" if they are the same.
> >
> > I could not finish the bisect. The last pair good/bad were:
> >
> > bad: [7aeacf982203fb4dea2f3434eefdc268cfd5d6d9]
> >      [BLOCK] blk_rq_map_sg: force clear termination bit
> > good: [e38f981758118d829cd40cfe9c09e3fa81e422aa]
> >      exportfs: update documentation
>
> Thanks, that helps. I read over the mmc changes in between those two
> commits, and I think I found the problem... could you please try the
> patch below (on top of the latest kernel) and report back how it
> works? Unfortunately I am traveling and I don't have an SD card with
> me to test on my laptop...
>
> Pierre, assuming Romano tests this patch successfully, please apply!
>
> Thanks,
>   Roland
>
> <-- patch below -->
>
> mmc: Fix sg helper copy-and-paste error
>
> Commit 45711f1a ("[SG] Update drivers to use sg helpers") had the
> following bogus change in drivers/mmc/card/queue.c:
>
> > -		src_buf = page_address(src->page) + src->offset;
> > +		src_buf = sg_virt(dst);
>
> (Notice that "src" is converted to "dst"). Turn this "dst" back into
> the intended "src".
>
> Cc: Jens Axboe <[EMAIL PROTECTED]>
> Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
> ---
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index 9203a0b..1b9c9b6 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -310,7 +310,7 @@ static void copy_sg(struct scatterlist *dst, unsigned int dst_len,
>  		}
>
>  		if (src_size == 0) {
> -			src_buf = sg_virt(dst);
> +			src_buf = sg_virt(src);
>  			src_size = src->length;
>  		}
>

How embarrassing, sorry about that! Pierre, shall I shove this upstream
or will you?

--
Jens Axboe
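
To make the effect of the one-word fix easier to see, here is a small user-space sketch of copying between two segment lists. It is not the kernel's copy_sg(): struct seg, copy_segs() and copy_segs_demo() are hypothetical stand-ins for struct scatterlist and the sg helpers. The memcpy() line is the place where taking the source pointer from dst instead of src would silently copy the destination onto itself and lose the real data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for one scatterlist entry. */
struct seg {
	char *buf;
	size_t len;
};

/*
 * Copy up to len bytes from the src segments into the dst segments,
 * crossing segment boundaries on either side. Returns bytes copied.
 */
size_t copy_segs(struct seg *dst, size_t dst_n,
		 const struct seg *src, size_t src_n, size_t len)
{
	size_t di = 0, si = 0;		/* current segment on each side */
	size_t doff = 0, soff = 0;	/* offset inside the current segment */
	size_t copied = 0;

	assert(dst != NULL && src != NULL);

	while (copied < len && di < dst_n && si < src_n) {
		size_t dleft = dst[di].len - doff;
		size_t sleft = src[si].len - soff;
		size_t chunk = dleft < sleft ? dleft : sleft;

		if (chunk > len - copied)
			chunk = len - copied;

		/* The source pointer must come from src, never from dst. */
		memcpy(dst[di].buf + doff, src[si].buf + soff, chunk);

		copied += chunk;
		doff += chunk;
		soff += chunk;
		if (doff == dst[di].len) {	/* destination segment full */
			di++;
			doff = 0;
		}
		if (soff == src[si].len) {	/* source segment drained */
			si++;
			soff = 0;
		}
	}
	return copied;
}

/* Usage example: returns 1 if the bytes arrive complete and in order. */
int copy_segs_demo(void)
{
	char a[] = "abc", b[] = "defgh";
	char x[2], y[6];
	struct seg src[] = { { a, 3 }, { b, 5 } };
	struct seg dst[] = { { x, 2 }, { y, 6 } };

	if (copy_segs(dst, 2, src, 2, 8) != 8)
		return 0;
	return memcmp(x, "ab", 2) == 0 && memcmp(y, "cdefgh", 6) == 0;
}
```

Repeated reads of the same card returning different data, as in the bisection report, is consistent with this kind of bug: the buffer handed back depends on whatever happened to be in the destination pages.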
Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10
You are right, it seems the adapter has no DMA lines. hda=nodma helped, no
errors anymore. I will now also try libata_dma_mask and will mail the result.
Btw, there are no notes about it in Documentation/kernel-parameters.txt.

In any case it is a complete board, WRAP.2C, made by PCEngines in 2003. Kind
of popular and mass produced; before, it was widely used by StarOS, probably
a known GPL violator, who didn't bother to supply patches, but at the same
time used it in his projects.

If this is valid for all boards with this revision, maybe it is better to put
it in some kind of fixup/quirk/black list, or how it is called?

On Wed, 07 Nov 2007 19:41:15 -0600, Robert Hancock wrote
> Denys wrote:
> > Finally i got full DMESG with 1GB card till end. Seems not readable too.
> >
> > ...
>
> > ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> > ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 in
> >          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> > ata1.00: status: { DRDY }
> > ata1: soft resetting link
> > ata1.00: configured for MWDMA1
> > sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
> > sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
> > Descriptor sense data with sense descriptors (in hex):
> >         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> >         00 00 00 00
> > sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
> > end_request: I/O error, dev sda, sector 0
> > Buffer I/O error on device sda, logical block 0
> > ata1: EH complete
>
> I'm guessing that your CF-to-IDE adapter doesn't have the correct
> lines wired up for DMA to work properly, and the card indicates DMA
> support, which libata tries to use but which doesn't work. It looks
> like it never tried falling back to PIO after DMA failed. Seems like
> a deficiency in the speed-down logic?
>
> --
> Robert Hancock Saskatoon, SK, Canada
> To email, remove "nospam" from [EMAIL PROTECTED]
> Home Page: http://www.roberthancock.com/

--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.
Re: [PATCH] pktcdvd: fix BUG caused by sysfs module reference semantics change
On Thu, Nov 08 2007, Tejun Heo wrote:
> Greg KH wrote:
> > On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
> >> pkt_setup_dev() expects module reference to be held on invocation.
> >> This used to be true for sysfs callbacks but not anymore. Test and
> >> grab module reference around pkt_setup_dev() in
> >> class_pktcdvd_store_add().
> >>
> >> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
> >> Acked-by: Peter Osterlund <[EMAIL PROTECTED]>
> >> ---
> >> Greg, can you please push this patch through your tree?
> >> Thanks a lot.
> >>
> >>  drivers/block/pktcdvd.c |    9 +++++++++
> >>  1 file changed, 9 insertions(+)
> >
> > Why through my tree? I don't do block devices :)
>
> Because it's a regression introduced by changes in sysfs?
>
> > Shouldn't Jens or at least Andrew take it?
>
> That's fine too. Jens?

Sure, I'm pushing some stuff off today anyway.

--
Jens Axboe
Re: [PATCH] pktcdvd: fix BUG caused by sysfs module reference semantics change
Greg KH wrote:
> On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
>> pkt_setup_dev() expects module reference to be held on invocation.
>> This used to be true for sysfs callbacks but not anymore. Test and
>> grab module reference around pkt_setup_dev() in
>> class_pktcdvd_store_add().
>>
>> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
>> Acked-by: Peter Osterlund <[EMAIL PROTECTED]>
>> ---
>> Greg, can you please push this patch through your tree?
>> Thanks a lot.
>>
>>  drivers/block/pktcdvd.c |    9 +++++++++
>>  1 file changed, 9 insertions(+)
>
> Why through my tree? I don't do block devices :)

Because it's a regression introduced by changes in sysfs?

> Shouldn't Jens or at least Andrew take it?

That's fine too. Jens?

--
tejun
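
The "test and grab module reference around the call" pattern from the patch description can be modelled in user space roughly as follows. This is only a sketch and not the pktcdvd patch itself: struct fake_module, fake_try_get(), fake_put(), fake_setup() and guarded_setup() are invented names that stand in for the module state, try_module_get(), module_put() and pkt_setup_dev().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel module's lifetime state. */
struct fake_module {
	int live;	/* 0 once unloading has begun */
	int refs;	/* outstanding references */
};

/* Models try_module_get(): fails once the module is going away. */
int fake_try_get(struct fake_module *m)
{
	if (!m->live)
		return 0;
	m->refs++;
	return 1;
}

/* Models module_put(): drops one reference. */
void fake_put(struct fake_module *m)
{
	assert(m->refs > 0);
	m->refs--;
}

/* Hypothetical setup routine that expects the caller to hold a reference. */
int fake_setup(struct fake_module *m)
{
	return m->refs > 0 ? 0 : -1;
}

/* The pattern itself: test and grab, do the work, drop the reference. */
int guarded_setup(struct fake_module *m)
{
	int ret;

	if (!fake_try_get(m))
		return -1;	/* real kernel code would return an errno value */
	ret = fake_setup(m);
	fake_put(m);
	return ret;
}
```

Calling fake_setup() directly on a module with no references fails in this model, which mirrors the situation after the sysfs change: the callback no longer arrives with a reference already held, so the caller has to take one itself.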
Re: 2.6.24-rc2 breaks nVidia MCP51 High Definition Audio
At Wed, 7 Nov 2007 19:07:07 -0500 (EST), Gerhard Mack wrote: > > On Wed, 7 Nov 2007, Andrew Morton wrote: > > > Date: Wed, 7 Nov 2007 15:21:27 -0800 > > From: Andrew Morton <[EMAIL PROTECTED]> > > To: Gerhard Mack <[EMAIL PROTECTED]> > > Cc: linux-kernel@vger.kernel.org, Jaroslav Kysela <[EMAIL PROTECTED]>, > > Takashi Iwai <[EMAIL PROTECTED]>, Rafael J. Wysocki <[EMAIL PROTECTED]> > > Subject: Re: 2.6.24-rc2 breaks nVidia MCP51 High Definition Audio > > > > > On Wed, 7 Nov 2007 17:39:41 -0500 (EST) Gerhard Mack <[EMAIL PROTECTED]> > > > wrote: > > > hello, > > > > > > This worked fine in 2.6.23 but now the kernel no longer sees my audio > > > controller. > > > > > > 00:10.1 Audio device: nVidia Corporation MCP51 High Definition Audio (rev > > > a2) > > > 00:10.1 0403: 10de:026c (rev a2) > > > > > > Let me know if I can provide more info or test patches. > > > > > > > Please provide the output of `dmesg -s 100' for both 2.6.23 > > and 2.6.24-rc3, thanks. > > > > Are you sure that the driver is suitably configured? Sometimes > > we like to fiddle config options so that a `make oldconfig' will go and > > unconfigure drivers which you need. > > Found an option for generic HD audio and enabled that with only marginally > better results. Now instead of not detecting my card it's showing a > single volume control in the mixer and not providing any sound at all. > > 2.6.23: > Advanced Linux Sound Architecture Driver Version 1.0.14 (Fri Jul 20 > 09:12:58 2007 UTC). > ACPI: PCI Interrupt Link [AAZA] enabled at IRQ 22 > ACPI: PCI Interrupt :00:10.1[B] -> Link [AAZA] -> GSI 22 (level, low) > -> IRQ 22 > PCI: Setting latency timer of device :00:10.1 to 64 > ALSA device list: > #0: HDA NVidia at 0xfe024000 irq 22 > GACT probability on > > 2.6.24-rc2: > Advanced Linux Sound Architecture Driver Version 1.0.15 (Tue Oct 23 > 06:09:18 2007 UTC). 
> ACPI: PCI Interrupt Link [AAZA] enabled at IRQ 22
> ACPI: PCI Interrupt :00:10.1[B] -> Link [AAZA] -> GSI 22 (level, low)
> -> IRQ 22
> PCI: Setting latency timer of device :00:10.1 to 64
> ieee1394: Host added: ID:BUS[0-00:1023] GUID[0011d8000101f761]
> ALSA device list:
> #0: HDA NVidia at 0xfe024000 irq 22
> GACT probability on

Both look OK.

Please show your kernel config and /proc/asound/card0/codec#* contents.
Did you choose CONFIG_SND_HDA_CODEC_* properly?

Also, please be more specific about your hardware. The implementation
of HD-audio differs greatly among products. It's very important to
know what kind of machine you have (h/w vendor, product name, model, etc)
to identify whether the configuration is known or not (i.e. whether it was
really supported or it just happened to work).

Takashi
Re: [kvm-devel] [PATCH 2/3] Put the virtio under the virtualization menu
Anthony Liguori wrote:
> This patch moves virtio under the virtualization menu and changes virtio
> devices to not claim to only be for lguest.
>

Perhaps the virt menu needs to be split into a host-side support menu
and guest-side support menu.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: Use of virtio device IDs
Gregory Haskins wrote:
>
>> PCI means that you can reuse all of the platform's infrastructure for
>> irq allocation, discovery, device hotplug, and management.
>>
>
> Its tempting to use, yes. However, most of that infrastructure is
> completely inappropriate for a PV implementation, IMHO.

Why?

> You are
> probably better off designing something that is PV specific instead of
> shoehorning it in to fit a different model (at least for the things I
> have in mind).

Well, if we design our pv devices to look like hardware, they will fit
quite well. Both to the guest OS and to user's expectations.

> Its not a heck of a lot of code to write a pv-centric
> version of these facilities.
>
>

It is. Especially if you consider Windows and a gazillion versions of
deployed, non-pv-capable Linux systems.

For pv-friendly newer Linux, it's probably doable, but why? Look at the
mess Xen finds itself in.

>> You can write it for new guests but backporting it to older guests will be a
>> huge task.
>>
>> We will support non-pci for s390, but in order to support Windows and
>> older Linux PCI is necessary.
>>
>
> I don't know if I would agree with "necessary". "Easier" perhaps. ;) By
> definition once you are PV you are hypervisor aware. Now its just a
> matter of plugging in the appropriate plumbing to bridge the hypervisor
> to the guest-os. Some might be easier than others, sure. But all
> should be extensible to a degree.
>
>

It's "necessary" in a pragmatic sense: we want to deliver drivers that
provide features for a wide variety of guests in a reasonable timeframe.
And that means no rewriting guest OS infrastructure.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: compat_sys_times() bogus until jiffies >= 0.
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 21:20:05 -0800

> Yup. But userspace will already have a fit if either the start or end time
> advanced into the glibc-thought-that-was-an-error range.

On x86 only.

We could use force_successful_syscall_return() to make sure the condition codes get set correctly on other platforms.

But even in that case we'd still be broken when the return value is exactly -1 and that's what the application is going to compare against to test for errors.
Re: compat_sys_times() bogus until jiffies >= 0.
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 16:15:51 +1100

> David Miller writes:
>
> > I can't see where x86 is doing this though, so perhaps for x86
> > glibc does make the negative value check. But I doubt it is
> > checking the range 0x80000000-0xffffffff, otherwise mmap() would
> > be busted.
>
> At least for the INTERNAL_SYSCALL macro in glibc, the error check is:
>
> #define INTERNAL_SYSCALL_ERROR_P(val, err) \
>   ((unsigned int) (val) >= 0xfffff001u)
>
> in sysdeps/unix/sysv/linux/i386/sysdep.h. Similarly the PSEUDO macro
> in that file does a cmpl $-4095,%eax to test for error. (There is also
> a PSEUDO_NOERRNO which doesn't test for error.)
>
> So the convention on (32-bit) x86 is that -4095 .. -1 are error
> values, and other values are successful return values.

Thanks for figuring that out.

Really there is no way to fix sys_times() return values universally. Each proposed solution either doesn't fix the problem, or adds a new failure mode.
Re: compat_sys_times() bogus until jiffies >= 0.
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 15:59:12 +1100

> Not on powerpc. On powerpc the error indication is carried separately
> in a condition register bit. So a force_successful_syscall_return()
> call will make glibc automatically do the right thing without any
> glibc changes on powerpc.

It still won't fix the problem. When the return value is (clock_t) -1, all the force_successful_syscall_return() calls and glibc condition code checks in the world are not going to fix the application code which checks for error using -1.
Re: [poll] Is the megafreeze development model broken?
On Wed, 07 Nov 2007 23:56:57 +0100 ciol <[EMAIL PROTECTED]> wrote:

> Hi, I'd like to ask you a few questions:
>
> * Do you like the way linux distributions integrate the kernel?
>
> * Wouldn't you prefer they ship with the stable and still maintained
> 2.6.16.X, while providing optionally the latest kernel for those who
> want or just have a new hardware?
>
> * Do you think the megafreeze development model [1] and the "I don't
> trust in upstream" development model are broken? (And why)
>
> [1] http://www.modeemi.fi/~tuomov/b/archives/2007/03/03/T19_15_26/
>
> (I'm going to ask this for several projects, not only the kernel)

It's a free world, do what you want.

--
Stephen Hemminger <[EMAIL PROTECTED]>
Re: [kvm-devel] [PATCH 3/3] virtio PCI device
Anthony Liguori wrote:
> This is a PCI device that implements a transport for virtio. It allows virtio
> devices to be used by QEMU based VMMs like KVM or Xen.

Didn't see support for dma. I think that with Amit's pvdma patches you can support dma-capable devices as well without too much fuss.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: compat_sys_times() bogus until jiffies >= 0.
> On Thu, 8 Nov 2007 16:36:08 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
>
> > Yup. But userspace will already have a fit if either the start or end time
> > advanced into the glibc-thought-that-was-an-error range.
>
> Not nearly as much of a fit. The effect on x86 is that values between
> -4095 and -1 are reported as -1, so the end-start difference will be
> out by less than 41 seconds. That's not nearly as dramatic as a
> difference of 21 million seconds (over 16 years). :)
>
> I really think that wrapping at 0x7fffffff makes the situation worse,
> not better.

Sure. So we need to do what you say: never return an error from sys_times() and change glibc to not perform error-interpretation on sys_times() return values and recommend that people bypass libc and go direct to the syscall so they'll work correctly on older glibc.

Lovely.

I wonder what happens with things like F_GETOWN, shmat() and lseek(/dev/mem) on x86 (things which use force_successful_syscall_return()). According to the comment in include/linux/ptrace.h, glibc should be special-casing these.
Re: [PATCH] fix incorrect test in trident_ac97_set(); sound/oss/trident.c
On Wed, Nov 07, 2007 at 11:04:41AM -0800, Ray Lee wrote:
> On Nov 7, 2007 10:50 AM, Roel Kluin <[EMAIL PROTECTED]> wrote:
> > If count reaches zero, the loop ends, but the postfix decrement
> > still subtracts: testing for 'count == 0' will not work.
> >
> > Signed-off-by: Roel Kluin <[EMAIL PROTECTED]>
> > ---
> > diff --git a/sound/oss/trident.c b/sound/oss/trident.c
> > index 96adc47..6959ee1 100644
> > --- a/sound/oss/trident.c
> > +++ b/sound/oss/trident.c
> > @@ -2935,7 +2935,7 @@ trident_ac97_set(struct ac97_codec *codec, u8 reg, u16 val)
> >          do {
> >                  if ((inw(TRID_REG(card, address)) & busy) == 0)
> >                          break;
> > -        } while (count--);
> > +        } while (--count);
> >
> >          data |= (mask | (reg & AC97_REG_ADDR));
> >
> > @@ -2996,7 +2996,7 @@ trident_ac97_get(struct ac97_codec *codec, u8 reg)
> >                  data = inl(TRID_REG(card, address));
> >                  if ((data & busy) == 0)
> >                          break;
> > -        } while (count--);
> > +        } while (--count);
> >
> >          spin_unlock_irqrestore(&card->lock, flags);
> >
> >          if (count == 0) {
>
> Thanks, much better. In the future, please also CC: the appropriate
> maintainers, or Andrew Morton if you're at a loss...

Indeed.

> Reviewed-by: Ray Lee <[EMAIL PROTECTED]>

Acked-by: Muli Ben-Yehuda <[EMAIL PROTECTED]>

Andrew, can you please push to Linus?

Thanks,
Muli
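For readers following the patch above, here is a minimal user-space sketch of why the later `count == 0` test only works with the prefix form. It is not the driver code: `poll_postfix`, `poll_prefix` and the `ready_at` parameter are hypothetical names, and a signed `int` counter is assumed (the driver's counter type may differ, in which case the postfix form would wrap instead of reaching -1).

```c
#include <assert.h>

/* Poll up to `tries` times. `ready_at` is the (hypothetical) iteration on
 * which the device stops being busy, or 0 if it never does.
 * Returns the value left in the counter after the loop. */
int poll_postfix(int tries, int ready_at)
{
    int count = tries, iter = 0;

    do {
        if (++iter == ready_at)
            break;
    } while (count--);      /* on timeout the counter ends at -1, not 0 */
    return count;
}

int poll_prefix(int tries, int ready_at)
{
    int count = tries, iter = 0;

    do {
        if (++iter == ready_at)
            break;
    } while (--count);      /* on timeout the counter ends at exactly 0 */
    return count;
}
```

With the postfix loop a timeout leaves the counter one step past zero, so a following `if (count == 0)` never fires; with the prefix loop the same test detects the timeout.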
Re: compat_sys_times() bogus until jiffies >= 0.
On Wed, Nov 07, 2007 at 03:28:33PM -0800, Andrew Morton wrote:
On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:

will return '-1' to user space and set the negated clock_t value to errno.

At minimum, perhaps it should return a sane errno value.

RETURN VALUE
times() returns the number of clock ticks that have elapsed since an arbitrary point in the past. For Linux 2.4 and earlier this point is the moment the system was booted. Since Linux 2.6, this point is (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot time. The return value may overflow the possible range of type clock_t. On error, (clock_t) -1 is returned, and errno is set appropriately.

The strange -1 behavior is enshrined in history. I think a better answer is to tell people to use getrusage() if they want a return result without this problem.

Adding INITIAL_JIFFIES will fix the case where an embedded system is booted up to run a test and then shut down, and the mask, although it causes discontinuities periodically at least moves them away from the early boot.

INITIAL_JIFFIES was a good idea, but it is probably best to keep it inside of the kernel.

David Brown
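As a rough illustration of the getrusage() suggestion above: the sketch below reads the process's user and system CPU time through getrusage(), whose success or failure is reported separately from the data, so no data value can be mistaken for an error. The helper name `cpu_time_usec` is hypothetical. Note the assumption being made: this covers the per-process CPU times that times() stores in its buffer, not the elapsed tick count that times() returns.

```c
#include <assert.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: CPU time consumed by the calling process, in
 * microseconds. Returns 0 on success, -1 on failure (errno is set by
 * getrusage()). */
int cpu_time_usec(long long *user_usec, long long *sys_usec)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    *user_usec = (long long)ru.ru_utime.tv_sec * 1000000LL + ru.ru_utime.tv_usec;
    *sys_usec  = (long long)ru.ru_stime.tv_sec * 1000000LL + ru.ru_stime.tv_usec;
    return 0;
}
```

A caller that needs elapsed wall-clock time would still have to get it from another source.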
Re: 2.6.24-rc1 eat my photo SD card :-(
On Wed, 07 Nov 2007 15:37:46 -0800 Roland Dreier <[EMAIL PROTECTED]> wrote:

> mmc: Fix sg helper copy-and-paste error
>
> Commit 45711f1a ("[SG] Update drivers to use sg helpers") had the
> following bogus change in drivers/mmc/card/queue.c:
>
> > - src_buf = page_address(src->page) + src->offset;
> > + src_buf = sg_virt(dst);
>
> (Notice that "src" is converted to "dst"). Turn this "dst" back into
> the intended "src".
>
> Cc: Jens Axboe <[EMAIL PROTECTED]>
> Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>

Ouch! Well that was obviously a bug. I wonder how the hell it only explodes for Romano. I've been shuffling loads of data using -rc1 without an incident.

Rgds
--
-- Pierre Ossman

Linux kernel, MMC maintainer    http://www.kernel.org
PulseAudio, core developer      http://pulseaudio.org
rdesktop, core developer        http://www.rdesktop.org
Re: compat_sys_times() bogus until jiffies >= 0.
Andrew Morton writes:

> Yup. But userspace will already have a fit if either the start or end time
> advanced into the glibc-thought-that-was-an-error range.

Not nearly as much of a fit. The effect on x86 is that values between -4095 and -1 are reported as -1, so the end-start difference will be out by less than 41 seconds. That's not nearly as dramatic as a difference of 21 million seconds (over 16 years). :)

I really think that wrapping at 0x7fffffff makes the situation worse, not better.

Paul.
Re: [PATCH] pktcdvd: fix BUG caused by sysfs module reference semantics change
On Thu, Nov 08, 2007 at 11:27:16AM +0900, Tejun Heo wrote:
> pkt_setup_dev() expects module reference to be held on invocation.
> This used to be true for sysfs callbacks but not anymore. Test and
> grab module reference around pkt_setup_dev() in
> class_pktcdvd_store_add().
>
> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
> Acked-by: Peter Osterlund <[EMAIL PROTECTED]>
> ---
> Greg, can you please push this patch through your tree?
> Thanks a lot.
>
> drivers/block/pktcdvd.c |    9 +++++++++
> 1 file changed, 9 insertions(+)

Why through my tree? I don't do block devices :)

Shouldn't Jens or at least Andrew take it?

thanks,

greg k-h
Re: compat_sys_times() bogus until jiffies >= 0.
> On Thu, 8 Nov 2007 15:59:12 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
>
> > "the latter" is what my protopatch does isn't it? It wraps at 0x7fffffff.
> > It appears that glibc treats all of 0x80000000-0xffffffff as an error.
>
> Not on powerpc. On powerpc the error indication is carried separately
> in a condition register bit. So a force_successful_syscall_return()
> call will make glibc automatically do the right thing without any
> glibc changes on powerpc.

OK

> Wrapping at 0x7fffffff will cause programs to see large negative
> deltas between successive calls when the wrap occurs. I can see that
> giving userspace fits. :)

Yup. But userspace will already have a fit if either the start or end time advanced into the glibc-thought-that-was-an-error range.
Re: compat_sys_times() bogus until jiffies >= 0.
David Miller writes:

> I can't see where x86 is doing this though, so perhaps for x86
> glibc does make the negative value check. But I doubt it is
> checking the range 0x80000000-0xffffffff, otherwise mmap() would
> be busted.

At least for the INTERNAL_SYSCALL macro in glibc, the error check is:

#define INTERNAL_SYSCALL_ERROR_P(val, err) \
  ((unsigned int) (val) >= 0xfffff001u)

in sysdeps/unix/sysv/linux/i386/sysdep.h. Similarly the PSEUDO macro in that file does a cmpl $-4095,%eax to test for error. (There is also a PSEUDO_NOERRNO which doesn't test for error.)

So the convention on (32-bit) x86 is that -4095 .. -1 are error values, and other values are successful return values.

Paul.
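The convention described above can be checked with a small stand-alone sketch. It assumes a 32-bit return register, where -4095 is 0xfffff001 when viewed as unsigned. The function names are hypothetical, and `times_as_seen_by_caller` is only a model of the effect discussed in this thread, not glibc code.

```c
#include <assert.h>
#include <stdint.h>

/* Same comparison as the glibc macro quoted above, written as a function:
 * a raw 32-bit return value counts as an error iff it is in -4095 .. -1. */
int syscall_ret_is_error(int32_t val)
{
    return (uint32_t)val >= 0xfffff001u;
}

/* Hypothetical model of what a 32-bit x86 caller sees for a raw times()
 * return value: anything inside the error window is reported as -1. */
int32_t times_as_seen_by_caller(int32_t raw_ticks)
{
    return syscall_ret_is_error(raw_ticks) ? -1 : raw_ticks;
}
```

Values such as -4096 or INT32_MIN fall outside the window and pass through unchanged, which is consistent with the remark that treating the whole negative range as errors would break mmap().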
Re: [PATCH] mm/memory.c: remove warning from an uninitialized spinlock. was: Re: 2.6.21-rc7-mm2
On Wed, Nov 07, 2007 at 02:20:03PM -0500, Steven Rostedt wrote:
> >
> > Introduce a macro for suppressing gcc from generating a warning about a
> > probable uninitialized state of a variable.
> >
> > Example:
> >
> > - spinlock_t *ptl;
> > + spinlock_t *uninitialized_var(ptl);
> >
> > Not a happy solution, but those warnings are obnoxious.
> >
> > - Using the usual pointlessly-set-it-to-zero approach wastes several
> > bytes of text.
> >
> > - Using a macro means we can (hopefully) do something else if gcc changes
> > cause the `x = x' hack to stop working
> >
> > - Using a macro means that people who are worried about hiding true bugs
> > can easily turn it off.
> >
> > Signed-off-by: Borislav Petkov <[EMAIL PROTECTED]>
> > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
>
> I just stumbled across this being in the kernel. Well, I'm finally glad
> it made it in, even though it was suggested one year earlier ;-)
>
> http://lkml.org/lkml/2006/5/11/50

yeah, this was Andrew's idea. The version in the kernel, in contrast to yours, doesn't have a config option so you still have to make really sure you're not aiding any bugs with it.

--
Regards/Gruß,
Boris.
Re: compat_sys_times() bogus until jiffies >= 0.
Andrew Morton writes:

> "the latter" is what my protopatch does isn't it? It wraps at 0x7fffffff.
> It appears that glibc treats all of 0x80000000-0xffffffff as an error.

Not on powerpc. On powerpc the error indication is carried separately in a condition register bit. So a force_successful_syscall_return() call will make glibc automatically do the right thing without any glibc changes on powerpc.

Wrapping at 0x7fffffff will cause programs to see large negative deltas between successive calls when the wrap occurs. I can see that giving userspace fits. :)

Paul.
Re: [PATCH] sched: avoid large irq-latencies in smp-balancing
Peter Zijlstra wrote: > Bah, missed a hunk > > --- > Subject: sched: avoid large irq-latencies in smp-balancing > > SMP balancing is done with IRQs disabled and can iterate the full rq. When rqs > are large this can cause large irq-latencies. Limit the nr of iterations on > each run. > > Signed-off-by: Peter Zijlstra <[EMAIL PROTECTED]> > CC: Peter Williams <[EMAIL PROTECTED]> Tested-by: Gregory Haskins <[EMAIL PROTECTED]> (as part of 23.1-rt11) > --- > include/linux/sched.h |1 + > kernel/sched.c| 15 ++- > kernel/sysctl.c |8 > 3 files changed, 19 insertions(+), 5 deletions(-) > > Index: linux-2.6-2/kernel/sched.c > === > --- linux-2.6-2.orig/kernel/sched.c > +++ linux-2.6-2/kernel/sched.c > @@ -474,6 +474,12 @@ const_debug unsigned int sysctl_sched_fe > #define sched_feat(x) (sysctl_sched_features & SCHED_FEAT_##x) > > /* > + * Number of tasks to iterate in a single balance run. > + * Limited because this is done with IRQs disabled. > + */ > +const_debug unsigned int sysctl_sched_nr_migrate = 32; > + > +/* > * For kernel-internal use: high-speed (but slightly incorrect) per-cpu > * clock constructed from sched_clock(): > */ > @@ -2237,7 +2243,7 @@ balance_tasks(struct rq *this_rq, int th > enum cpu_idle_type idle, int *all_pinned, > int *this_best_prio, struct rq_iterator *iterator) > { > - int pulled = 0, pinned = 0, skip_for_load; > + int loops = 0, pulled = 0, pinned = 0, skip_for_load; > struct task_struct *p; > long rem_load_move = max_load_move; > > @@ -2251,10 +2257,10 @@ balance_tasks(struct rq *this_rq, int th >*/ > p = iterator->start(iterator->arg); > next: > - if (!p) > + if (!p || loops++ > sysctl_sched_nr_migrate) > goto out; > /* > - * To help distribute high priority tasks accross CPUs we don't > + * To help distribute high priority tasks across CPUs we don't >* skip a task if it will be the highest priority task (i.e. 
smallest >* prio value) on its new queue regardless of its load weight >*/ > @@ -2271,8 +2277,7 @@ next: > rem_load_move -= p->se.load.weight; > > /* > - * We only want to steal up to the prescribed number of tasks > - * and the prescribed amount of weighted load. > + * We only want to steal up to the prescribed amount of weighted load. >*/ > if (rem_load_move > 0) { > if (p->prio < *this_best_prio) > Index: linux-2.6-2/kernel/sysctl.c > === > --- linux-2.6-2.orig/kernel/sysctl.c > +++ linux-2.6-2/kernel/sysctl.c > @@ -298,6 +298,14 @@ static struct ctl_table kern_table[] = { > .mode = 0644, > .proc_handler = &proc_dointvec, > }, > + { > + .ctl_name = CTL_UNNUMBERED, > + .procname = "sched_nr_migrate", > + .data = &sysctl_sched_nr_migrate, > + .maxlen = sizeof(unsigned int), > + .mode = 644, > + .proc_handler = &proc_dointvec, > + }, > #endif > { > .ctl_name = CTL_UNNUMBERED, > Index: linux-2.6-2/include/linux/sched.h > === > --- linux-2.6-2.orig/include/linux/sched.h > +++ linux-2.6-2/include/linux/sched.h > @@ -1466,6 +1466,7 @@ extern unsigned int sysctl_sched_batch_w > extern unsigned int sysctl_sched_child_runs_first; > extern unsigned int sysctl_sched_features; > extern unsigned int sysctl_sched_migration_cost; > +extern unsigned int sysctl_sched_nr_migrate; > #endif > > extern unsigned int sysctl_sched_compat_yield; > > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
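The iteration cap in the quoted balance_tasks() hunk can be modelled in user space as follows. This is only a sketch under assumptions: `struct node` and `visit_bounded` are hypothetical stand-ins for the run-queue iterator, and the comparison simply mirrors the `loops++ > limit` test from the patch.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int weight;
    struct node *next;
};

/* Walk a list, stopping early once the loop counter exceeds `limit`,
 * using the same post-increment comparison as the quoted hunk.
 * Returns how many nodes were actually visited. */
unsigned int visit_bounded(const struct node *p, unsigned int limit)
{
    unsigned int loops = 0, visited = 0;

    for (; p; p = p->next) {
        if (loops++ > limit)
            break;
        visited++;
    }
    return visited;
}
```

Because the counter is compared before it is incremented and the test is a strict `>`, this form visits up to `limit + 1` entries before giving up.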
Re: [PATCH 0/2] MN10300: Add the MN10300 architecture to Linux kernel [try #3]
On Wed, Nov 07, 2007 at 05:43:23PM +0000, David Howells wrote:
>
> These patches add the MEI/Panasonic MN10300/AM33 architecture to the Linux
> kernel.
>
> The first patch suppresses AOUT support in the kernel if CONFIG_BINFMT_AOUT=n
> and CONFIG_IA32_AOUT=n. MN10300 does not support the AOUT binfmt, so the ELF
> binfmt should not be permitted to go looking for AOUT libraries to load, nor
> should random bits of the kernel depend on asm/a.out.h.
>
> The second patch adds the architecture itself, to be selected by ARCH=mn10300
> on the make command line.
>
> The patches can also be downloaded from:
>
> http://people.redhat.com/~dhowells/mn10300/mn10300-arch.tar.bz2

The patch to include/asm-generic/Kbuild.asm doesn't seem to be required.

+#elif defined(__mn10300__)

Please use a CONFIG_ variable in such cases.

The parts outside arch/mn10300/ and include/asm-mn10300/ (except for the trivial "&& {,!}MN10300" Kconfig changes) should go separately through the maintainers or get ACKs from the maintainers, even more since they also contain cleanups like

- .regions = {ERASEINFO(0x01000,64),
+ .regions= {
+ ERASEINFO(0x01000,64),
  }

--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
...
+extern void __kprobes arch_remove_kprobe(struct kprobe *p);

This looks as if it will break compilation on avr32 and sparc64.

> A suitable toolchain can be downloaded from:
>
> ftp://ftp.redhat.com/pub/redhat/gnupro/AM33/
>...

What is the status of support in upstream GNU binutils and GNU gcc?

> David

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':
On Wed, Nov 07, 2007 at 11:52:32PM +0100, Adrian Bunk wrote:
> On Wed, Nov 07, 2007 at 02:34:52PM -0800, David Brownell wrote:
> > > > But on the other hand, it seems that only the ASIX code will work
> > > > right; the DM9601 and MCS7830 Kconfig is different/wrong.
> > >
> > > I'm not seeing the problem.
> > >
> > > Which configuration will be handled wrongly?
> >
> > Notice how only the ASIX kconfig depended on NET_ETHERNET...
> > since MII depends on NET_ETHERNET, and (last I knew) the
> > reverse dependencies didn't capture the complete dependency
> > tree, selecting only MII would leave out some stuff.
>
> Except for one s390 net driver (I'll check why it's doing this) the
> NET_ETHERNET option does not influence what code is being generated -
> it's just a Kconfig-internal option allowing to disable a huge bunch
> of drivers at once.

Damn, I shouldn't have only grep'ed under drivers/.

@davem:

Please look at net/ipv4/arp.c:arp_process()

Am I right that CONFIG_NET_ETHERNET=n and CONFIG_NETDEV_1000=y or CONFIG_NETDEV_10000=y will not be handled correctly there?

And the best solution is to nuke all #ifdef's in this function and make the code unconditionally available?

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':
On Wed, Nov 07, 2007 at 06:53:48PM -0800, David Brownell wrote:
> On Wednesday 07 November 2007, Adrian Bunk wrote:
> > On Wed, Nov 07, 2007 at 02:34:52PM -0800, David Brownell wrote:
> > > > > But on the other hand, it seems that only the ASIX code will work
> > > > > right; the DM9601 and MCS7830 Kconfig is different/wrong.
> > > >
> > > > I'm not seeing the problem.
> > > >
> > > > Which configuration will be handled wrongly?
> > >
> > > Notice how only the ASIX kconfig depended on NET_ETHERNET...
> > > since MII depends on NET_ETHERNET, and (last I knew) the
> > > reverse dependencies didn't capture the complete dependency
> > > tree, selecting only MII would leave out some stuff.
> >
> > Except for one s390 net driver (I'll check why it's doing this) the
> > NET_ETHERNET option does not influence what code is being generated -
> > it's just a Kconfig-internal option allowing to disable a huge bunch
> > of drivers at once.
>
> Drivers like ... AX88xxx, DM9601, and MCS7830!! Except as
> it turns out, only the first one behaves as intended.
>
> You can tell it's a problem by the way it's inconsistent,
> regardless of the details of the problem. :)

I'm all for cleanups that make things consistent. :)

As long as we can agree that there's a difference between a problem like a compile or runtime error and an opportunity for making things consistent.

> - Dave

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: [PATCH] sysctl: Check length at deprecated_sysctl_warning.
> On Thu, 08 Nov 2007 11:57:26 +0900 Tetsuo Handa <[EMAIL PROTECTED]> wrote:
> Original patch assumed args->nlen < CTL_MAXNAME, but it can be false.
>
> Signed-off-by: Tetsuo Handa <[EMAIL PROTECTED]>
>
> --- linux-2.6.22-rc2.orig/kernel/sysctl.c 2007-11-08 10:38:17.000000000 +0900
> +++ linux-2.6.22-rc2/kernel/sysctl.c 2007-11-08 11:24:27.000000000 +0900
> @@ -2609,6 +2609,10 @@ static int deprecated_sysctl_warning(str
>          int name[CTL_MAXNAME];
>          int i;
>
> +        /* Check args->nlen. */
> +        if (args->nlen > CTL_MAXNAME)
> +                return -EFAULT;
> +
>          /* Read in the sysctl name for better debug message logging */
>          for (i = 0; i < args->nlen; i++)
>                  if (get_user(name[i], args->name + i))

Well that would have been a nice roothole for someone.

Thanks.
Re: compat_sys_times() bogus until jiffies >= 0.
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 7 Nov 2007 19:07:14 -0800

> It appears that glibc treats all of 0x80000000-0xffffffff as an
> error.

glibc treats it as an error if the system call returns with the carry condition code set. At least that's how I've understood it to work and at a minimum this is how it works on sparc, ppc, ia64, mips, etc.

The error indication is being created by the system call return path in the kernel. It tests for values between -512 and 0, and marks those as errors unless force_successful_syscall_return() has been called.

I can't see where x86 is doing this though, so perhaps for x86 glibc does make the negative value check. But I doubt it is checking the range 0x80000000-0xffffffff, otherwise mmap() would be busted.
Re: compat_sys_times() bogus until jiffies >= 0.
> On Thu, 8 Nov 2007 12:53:57 +1100 Paul Mackerras <[EMAIL PROTECTED]> wrote:
> Andrew Morton writes:
>
> > Given all this stuff, the return value from sys_times() doesn't seem a
> > particularly useful or reliable kernel interface.
>
> I think the best thing would be to ignore any error from copy_to_user
> and always return the number of clock ticks. We should call
> force_successful_syscall_return, and glibc on x86 should be taught not
> to interpret negative values as an error.

Changing glibc might be hard ;)

> POSIX doesn't require us to return an EFAULT error if the buf argument
> is bogus. If userspace does supply a bogus buf pointer, then either
> it will dereference it itself and get a segfault, or it won't
> dereference it, in which case it obviously didn't care about the
> values we tried to put there.
>
> If we try to return an error under some circumstances, then there is
> at least one 32-bit value for the number of ticks that will cause
> confusion. We can either change that value (or values) to some other
> value, which seems pretty bogus, or we can just decide not to return
> any errors. The latter seems to me to have no significant downside
> and to be the simplest solution to the problem.

"the latter" is what my protopatch does isn't it? It wraps at 0x7fffffff.

It appears that glibc treats all of 0x80000000-0xffffffff as an error.
[PATCH] sysctl: Check length at deprecated_sysctl_warning.
Original patch assumed args->nlen < CTL_MAXNAME, but it can be false.

Signed-off-by: Tetsuo Handa <[EMAIL PROTECTED]>

--- linux-2.6.22-rc2.orig/kernel/sysctl.c 2007-11-08 10:38:17.000000000 +0900
+++ linux-2.6.22-rc2/kernel/sysctl.c 2007-11-08 11:24:27.000000000 +0900
@@ -2609,6 +2609,10 @@ static int deprecated_sysctl_warning(str
         int name[CTL_MAXNAME];
         int i;

+        /* Check args->nlen. */
+        if (args->nlen > CTL_MAXNAME)
+                return -EFAULT;
+
         /* Read in the sysctl name for better debug message logging */
         for (i = 0; i < args->nlen; i++)
                 if (get_user(name[i], args->name + i))
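The idea behind the check can be tried outside the kernel with a small sketch: validate a caller-supplied length before copying into a fixed-size array. Everything here is an assumption for illustration: `MAX_NAME` stands in for CTL_MAXNAME, `copy_name` is a hypothetical helper, and memcpy() replaces the per-element get_user() loop. Unlike the patch, the sketch also rejects negative lengths, because a negative value passed to memcpy() would be converted to a huge size.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_NAME 10   /* stand-in for CTL_MAXNAME; the value is illustrative */

/* Hypothetical user-space model: copy a caller-supplied name of nlen
 * integers into a fixed-size buffer, rejecting lengths that do not fit. */
int copy_name(int dst[MAX_NAME], const int *src, int nlen)
{
    /* Reject negative lengths too, so the memcpy size cannot wrap. */
    if (nlen < 0 || nlen > MAX_NAME)
        return -EFAULT;

    memcpy(dst, src, (size_t)nlen * sizeof(int));
    return 0;
}
```

Without the length test, any `nlen` larger than the array would write past the end of the on-stack buffer.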
Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':
On Wednesday 07 November 2007, Adrian Bunk wrote:
> On Wed, Nov 07, 2007 at 02:34:52PM -0800, David Brownell wrote:
> > > > But on the other hand, it seems that only the ASIX code will work
> > > > right; the DM9601 and MCS7830 Kconfig is different/wrong.
> > >
> > > I'm not seeing the problem.
> > >
> > > Which configuration will be handled wrongly?
> >
> > Notice how only the ASIX kconfig depended on NET_ETHERNET...
> > since MII depends on NET_ETHERNET, and (last I knew) the
> > reverse dependencies didn't capture the complete dependency
> > tree, selecting only MII would leave out some stuff.
>
> Except for one s390 net driver (I'll check why it's doing this) the
> NET_ETHERNET option does not influence what code is being generated -
> it's just a Kconfig-internal option allowing to disable a huge bunch
> of drivers at once.

Drivers like ... AX88xxx, DM9601, and MCS7830!! Except as it turns out, only the first one behaves as intended.

You can tell it's a problem by the way it's inconsistent, regardless of the details of the problem. :)

- Dave
Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?
On Thu, Nov 08, 2007 at 02:20:58AM +0100, Andi Kleen wrote:
>
> > But I think we'd be best off stashing a single bit somewhere and
> > checking it at migrate time (relatively infrequent) rather than
> > copying and zeroing out a potentially enormous affinity mask every
> > time we disable migration (often, and in fast paths). Perhaps adding
> > TASK_PINNED to the task state flags would do it?
>
> It would need to be a count to be able to nest it.

Ahh, right. Suppose that means fattening the task struct until someone comes up with something more clever.

--
Mathematics is the supreme nostalgia of our time.
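The nesting point above can be illustrated with a tiny sketch. It is not kernel code: `struct task`, `pin`, `unpin` and `can_migrate` are hypothetical names, and no locking or per-CPU handling is modelled.

```c
#include <assert.h>

/* Hypothetical task with a nestable "do not migrate" counter. */
struct task {
    int pin_count;
};

void pin(struct task *t)
{
    t->pin_count++;
}

void unpin(struct task *t)
{
    assert(t->pin_count > 0);   /* unbalanced unpin is a caller bug */
    t->pin_count--;
}

/* Checked at migrate time: only an unpinned task may be moved. */
int can_migrate(const struct task *t)
{
    return t->pin_count == 0;
}
```

With a single flag bit, the inner unpin of a nested pair would clear the flag while the outer section still expects to stay put; a counter only reaches zero after the outermost unpin.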
[PATCH 3/3] virtio PCI device
This is a PCI device that implements a transport for virtio.  It allows virtio
devices to be used by QEMU based VMMs like KVM or Xen.

Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 9e33fc4..c81e0f3 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -6,3 +6,20 @@ config VIRTIO
 config VIRTIO_RING
 	bool
 	depends on VIRTIO
+
+config VIRTIO_PCI
+	tristate "PCI driver for virtio devices (EXPERIMENTAL)"
+	depends on PCI && EXPERIMENTAL
+	select VIRTIO
+	select VIRTIO_RING
+	---help---
+	  This drivers provides support for virtio based paravirtual device
+	  drivers over PCI.  This requires that your VMM has appropriate PCI
+	  virtio backends.  Most QEMU based VMMs should support these devices
+	  (like KVM or Xen).
+
+	  Currently, the ABI is not considered stable so there is no guarantee
+	  that this version of the driver will work with your VMM.
+
+	  If unsure, say M.
+
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index f70e409..cc84999 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,2 +1,3 @@
 obj-$(CONFIG_VIRTIO) += virtio.o
 obj-$(CONFIG_VIRTIO_RING) += virtio_ring.o
+obj-$(CONFIG_VIRTIO_PCI) += virtio_pci.o
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
new file mode 100644
index 000..85ae096
--- /dev/null
+++ b/drivers/virtio/virtio_pci.c
@@ -0,0 +1,469 @@
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+MODULE_AUTHOR("Anthony Liguori <[EMAIL PROTECTED]>");
+MODULE_DESCRIPTION("virtio-pci");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1");
+
+/* Our device structure */
+struct virtio_pci_device
+{
+	/* the virtio device */
+	struct virtio_device vdev;
+	/* the PCI device */
+	struct pci_dev *pci_dev;
+	/* the IO mapping for the PCI config space */
+	void *ioaddr;
+
+	spinlock_t lock;
+	struct list_head virtqueues;
+};
+
+struct virtio_pci_vq_info
+{
+	/* the number of entries in the queue */
+	int num;
+	/* the number of pages the device needs for the ring queue */
+	int n_pages;
+	/* the index of the queue */
+	int queue_index;
+	/* the struct page of the ring queue */
+	struct page *pages;
+	/* the virtual address of the ring queue */
+	void *queue;
+	/* a pointer to the virtqueue */
+	struct virtqueue *vq;
+	/* the node pointer */
+	struct list_head node;
+};
+
+/* We have to enumerate here all virtio PCI devices. */
+static struct pci_device_id virtio_pci_id_table[] = {
+	{ 0x5002, 0x2258, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, /* Dummy entry */
+	{ 0 },
+};
+
+MODULE_DEVICE_TABLE(pci, virtio_pci_id_table);
+
+/* A PCI device has it's own struct device and so does a virtio device so
+ * we create a place for the virtio devices to show up in sysfs.  I think it
+ * would make more sense for virtio to not insist on having it's own device. */
+static struct device virtio_pci_root = {
+	.parent		= NULL,
+	.bus_id		= "virtio-pci",
+};
+
+/* Unique numbering for devices under the kvm root */
+static unsigned int dev_index;
+
+/* Convert a generic virtio device to our structure */
+static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
+{
+	return container_of(vdev, struct virtio_pci_device, vdev);
+}
+
+/* virtio config->feature() implementation */
+static bool vp_feature(struct virtio_device *vdev, unsigned bit)
+{
+	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	u32 mask;
+
+	/* Since this function is supposed to have the side effect of
+	 * enabling a queried feature, we simulate that by doing a read
+	 * from the host feature bitmask and then writing to the guest
+	 * feature bitmask */
+	mask = ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
+	if (mask & (1 << bit)) {
+		mask |= (1 << bit);
+		iowrite32(mask, vp_dev->ioaddr + VIRTIO_PCI_GUEST_FEATURES);
+	}
+
+	return !!(mask & (1 << bit));
+}
+
+/* virtio config->get() implementation */
+static void vp_get(struct virtio_device *vdev, unsigned offset,
+		   void *buf, unsigned len)
+{
+	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+	void *ioaddr = vp_dev->ioaddr + VIRTIO_PCI_CONFIG + offset;
+
+	/* We translate appropriately sized get requests into more natural
+	 * IO operations.  These functions also take care of endianness
+	 * conversion. */
+	switch (len) {
+	case 1: {
+		u8 val;
+		val = ioread8(ioaddr);
+		memcpy(buf, &val, sizeof(val));
+		break;
+	}
+	case 2: {
+		u16 val;
+		val = ioread16(
[PATCH 2/3] Put the virtio under the virtualization menu
This patch moves virtio under the virtualization menu and changes virtio
devices to not claim to only be for lguest.

Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>

diff --git a/drivers/Kconfig b/drivers/Kconfig
index f4076d9..d945ffc 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -93,6 +93,4 @@ source "drivers/auxdisplay/Kconfig"
 source "drivers/kvm/Kconfig"

 source "drivers/uio/Kconfig"
-
-source "drivers/virtio/Kconfig"
 endmenu
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index 4d0119e..be4b224 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -429,6 +429,7 @@ config VIRTIO_BLK
 	tristate "Virtio block driver (EXPERIMENTAL)"
 	depends on EXPERIMENTAL && VIRTIO
 	---help---
-	  This is the virtual block driver for lguest.  Say Y or M.
+	  This is the virtual block driver for virtio.  It can be used with
+	  lguest or QEMU based VMMs (like KVM or Xen).  Say Y or M.

 endif # BLK_DEV
diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index 6569206..ac4bcdf 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -50,5 +50,6 @@ config KVM_AMD
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/lguest/Kconfig
+source drivers/virtio/Kconfig

 endif # VIRTUALIZATION
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 86b8641..e66aec4 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -3107,6 +3107,7 @@ config VIRTIO_NET
 	tristate "Virtio network driver (EXPERIMENTAL)"
 	depends on EXPERIMENTAL && VIRTIO
 	---help---
-	  This is the virtual network driver for lguest.  Say Y or M.
+	  This is the virtual network driver for virtio.  It can be used with
+	  lguest or QEMU based VMMs (like KVM or Xen).  Say Y or M.

 endif # NETDEVICES
[PATCH 1/3] Export vring functions for modules to use
This is needed for the virtio PCI device to be compiled as a module.

Signed-off-by: Anthony Liguori <[EMAIL PROTECTED]>

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 0e1bf05..3f28b47 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -260,6 +260,8 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
 	return IRQ_HANDLED;
 }

+EXPORT_SYMBOL_GPL(vring_interrupt);
+
 static struct virtqueue_ops vring_vq_ops = {
 	.add_buf = vring_add_buf,
 	.get_buf = vring_get_buf,
@@ -306,8 +308,12 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
 	return &vq->vq;
 }

+EXPORT_SYMBOL_GPL(vring_new_virtqueue);
+
 void vring_del_virtqueue(struct virtqueue *vq)
 {
 	kfree(to_vvq(vq));
 }

+EXPORT_SYMBOL_GPL(vring_del_virtqueue);
+
[PATCH 0/3] virtio PCI driver
This patch series implements a PCI driver for virtio. This allows virtio devices (like block and network) to be used in QEMU/KVM. I'll post a very early KVM userspace backend in kvm-devel for those that are interested. This series depends on the two virtio fixes I've posted and Rusty's config_ops refactoring. I've tested with these patches on Rusty's experimental virtio tree.
Re: [PATCH] virtio config_ops refactoring
Rusty Russell wrote:
> On Thursday 08 November 2007 04:30:50 Anthony Liguori wrote:
>> I would prefer that the virtio API not expose a little endian standard.
>> I'm currently converting config->get() ops to ioreadXX depending on the
>> size which already does the endianness conversion for me so this just
>> messes things up.  I think it's better to let the backend deal with
>> endianness since it's trivial to handle for both the PCI backend and the
>> lguest backend (lguest doesn't need to do any endianness conversion).
>
> -ETOOMUCHMAGIC.  We should either expose all the XX interfaces (but this
> isn't a high-speed interface, so let's not) or not "sometimes" convert
> endianness.  Getting surprises because a field happens to be packed into 4
> bytes is counter-intuitive.

Then I think it's necessary to expose the XX interfaces.  Otherwise, the
backend has to deal with doing all register operations at a per-byte
granularity which adds a whole lot of complexity on a per-device basis (as
opposed to a little complexity once in the transport layer).

You really want to be able to rely on multi-byte atomic operations too when
setting values.  Otherwise, you need another register just to signal when
it's okay for the device to examine any given register.

Regards,

Anthony Liguori

> Since your most trivial implementation is to do a byte at a time, I don't
> think you have a good argument on that basis either.
>
> Cheers,
> Rusty.
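For readers following the byte-at-a-time versus sized-access argument, here is a small userspace sketch of what "a byte at a time" means for a 32-bit little-endian config field. Everything in it is an assumption made for illustration: `struct fake_config`, `cfg_read8()` and `cfg_read_le32_bytewise()` are hypothetical names, and plain memory stands in for the device, where a real transport would use the kernel's ioread8()/ioread32() family mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device config window. */
struct fake_config {
	uint8_t bytes[16];
};

/* Byte-granular accessor: the only primitive a byte-at-a-time
 * transport would need to provide. */
static uint8_t cfg_read8(const struct fake_config *cfg, size_t off)
{
	assert(off < sizeof(cfg->bytes));
	return cfg->bytes[off];
}

/* Assemble a little-endian 32-bit field from four byte reads.  The
 * result is the same on big- and little-endian hosts, but the four
 * reads are not atomic with respect to a device updating the field. */
static uint32_t cfg_read_le32_bytewise(const struct fake_config *cfg,
				       size_t off)
{
	uint32_t val = 0;

	for (size_t i = 0; i < 4; i++)
		val |= (uint32_t)cfg_read8(cfg, off + i) << (8 * i);
	return val;
}
```

The bytewise version gets endianness right without any per-size magic, which is Rusty's point; the lack of atomicity across the four reads is the cost Anthony is pointing at.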
[PATCH RFC 4/7] x86: unify pgtable*.h
All x86 modes and architectures have very similar pagetable
structures: the page flags, the accessors for testing/setting them,
and the combinations of page flags used for kernel and usermode
mappings are all the same.  The main difference is between 32 and
64-bit pagetable entries, with the latter supporting the NX bit.

The most significant difference between the modes/architectures is the
number of levels in the pagetable (4 for 64-bit, 3 for 32-bit/PAE, 2
for non-PAE 32-bit).  This accounts for the remaining code in the
various mode-specific headers.

I've tried to avoid changing formatting as much as possible, so that
the code motion is more obvious.  A subsequent patch will clean things
up in place.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 include/asm-x86/pgtable-2level.h |   21
 include/asm-x86/pgtable-3level.h |   40
 include/asm-x86/pgtable.h        |  318
 include/asm-x86/pgtable_32.h     |  204
 include/asm-x86/pgtable_64.h     |  225
 5 files changed, 331 insertions(+), 477 deletions(-)

===
--- a/include/asm-x86/pgtable-2level.h
+++ b/include/asm-x86/pgtable-2level.h
@@ -24,16 +24,13 @@ static inline void native_set_pmd(pmd_t
 {
 	*pmdp = pmd;
 }
-#ifndef CONFIG_PARAVIRT
-#define set_pte(pteptr, pteval)		native_set_pte(pteptr, pteval)
-#define set_pte_at(mm,addr,ptep,pteval) native_set_pte_at(mm, addr, ptep, pteval)
-#define set_pmd(pmdptr, pmdval)		native_set_pmd(pmdptr, pmdval)
-#endif

+#undef set_pte_atomic
 #define set_pte_atomic(pteptr, pteval) set_pte(pteptr,pteval)
 #define set_pte_present(mm,addr,ptep,pteval) set_pte_at(mm,addr,ptep,pteval)

 #define pte_clear(mm,addr,xp)	do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
+#undef pmd_clear
 #define pmd_clear(xp)	do { set_pmd(xp, __pmd(0)); } while (0)

 static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *xp)
@@ -50,12 +47,6 @@ static inline pte_t native_ptep_get_and_
 #define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp)
 #endif

-#define pte_page(x)		pfn_to_page(pte_pfn(x))
-#define pte_none(x)		(!(x).pte_low)
-#define pte_pfn(x)		(pte_val(x) >> PAGE_SHIFT)
-#define pfn_pte(pfn, prot)	__pte(((pfn) << PAGE_SHIFT) | pgprot_val(prot))
-#define pfn_pmd(pfn, prot)	__pmd(((pfn) << PAGE_SHIFT) | pgprot_val(prot))
-
 /*
  * All present pages are kernel-executable:
  */
@@ -64,17 +55,13 @@ static inline int pte_exec_kernel(pte_t
 	return 1;
 }

+#define __supported_pte_mask (~0ul)
+
 /*
  * Bits 0, 6 and 7 are taken, split up the 29 bits of offset
  * into this range:
  */
 #define PTE_FILE_MAX_BITS	29
-
-#define pte_to_pgoff(pte) \
-	((((pte).pte_low >> 1) & 0x1f ) + (((pte).pte_low >> 8) << 5 ))
-
-#define pgoff_to_pte(off) \
-	((pte_t) { (((off) & 0x1f) << 1) + (((off) >> 5) << 8) + _PAGE_FILE })

 /* Encode and de-code a swap entry */
 #define __swp_type(x)			(((x).val >> 1) & 0x1f)
===
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -94,17 +94,6 @@ static inline void native_pmd_clear(pmd_
 	*(tmp + 1) = 0;
 }

-#ifndef CONFIG_PARAVIRT
-#define set_pte(ptep, pte)			native_set_pte(ptep, pte)
-#define set_pte_at(mm, addr, ptep, pte)		native_set_pte_at(mm, addr, ptep, pte)
-#define set_pte_present(mm, addr, ptep, pte)	native_set_pte_present(mm, addr, ptep, pte)
-#define set_pte_atomic(ptep, pte)		native_set_pte_atomic(ptep, pte)
-#define set_pmd(pmdp, pmd)			native_set_pmd(pmdp, pmd)
-#define set_pud(pudp, pud)			native_set_pud(pudp, pud)
-#define pte_clear(mm, addr, ptep)		native_pte_clear(mm, addr, ptep)
-#define pmd_clear(pmd)				native_pmd_clear(pmd)
-#endif
-
 /*
  * Pentium-II erratum A13: in PAE mode we explicitly have to flush
  * the TLB via cr3 if the top-level pgd is changed...
@@ -119,10 +108,6 @@ static inline void pud_clear (pud_t * pu
 #define pud_page_vaddr(pud) \
 ((unsigned long) __va(pud_val(pud) & PAGE_MASK))

-
-/* Find an entry in the second-level page table.. */
-#define pmd_offset(pud, address) ((pmd_t *) pud_page(*(pud)) + \
-			pmd_index(address))

 #ifdef CONFIG_SMP
 static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
@@ -146,38 +131,13 @@ static inline int pte_same(pte_t a, pte_
 	return a.pte_low == b.pte_low && a.pte_high == b.pte_high;
 }

-#define pte_page(x)	pfn_to_page(pte_pfn(x))
-
-static inline int pte_none(pte_t pte)
-{
-	return !pte.pte_low && !pte.pte_high;
-}
-
-static inline unsigned long pte_pfn(pte_t pte)
-{
-	return pte_val(pte) >> PAGE_SHIFT;
-}
-
 extern unsigned long lo
[PATCH RFC 2/7] x86: clean up mm/init_32.c
Some code reformatting in init_32.c.  No functional change.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 arch/x86/mm/init_32.c |   31
 1 file changed, 21 insertions(+), 10 deletions(-)

===
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -165,16 +165,25 @@ static void __init kernel_physical_mappi
 		pmd = one_md_table_init(pgd);
 		if (pfn >= max_low_pfn)
 			continue;
-		for (pmd_idx = 0; pmd_idx < PTRS_PER_PMD && pfn < max_low_pfn; pmd++, pmd_idx++) {
+		for (pmd_idx = 0;
+		     pmd_idx < PTRS_PER_PMD && pfn < max_low_pfn;
+		     pmd++, pmd_idx++) {
 			unsigned int address = pfn * PAGE_SIZE + PAGE_OFFSET;

-			/* Map with big pages if possible, otherwise create normal page tables. */
+			/* Map with big pages if possible, otherwise
+			   create normal page tables. */
 			if (cpu_has_pse) {
-				unsigned int address2 = (pfn + PTRS_PER_PTE - 1) * PAGE_SIZE + PAGE_OFFSET + PAGE_SIZE-1;
-				if (is_kernel_text(address) || is_kernel_text(address2))
-					set_pmd(pmd, pfn_pmd(pfn, PAGE_KERNEL_LARGE_EXEC));
-				else
-					set_pmd(pmd, pfn_pmd(pfn, PAGE_KERNEL_LARGE));
+				unsigned int address2;
+				pgprot_t prot = PAGE_KERNEL_LARGE;
+
+				address2 = (pfn + PTRS_PER_PTE - 1) * PAGE_SIZE +
+					PAGE_OFFSET + PAGE_SIZE-1;
+
+				if (is_kernel_text(address) ||
+				    is_kernel_text(address2))
+					prot = PAGE_KERNEL_LARGE_EXEC;
+
+				set_pmd(pmd, pfn_pmd(pfn, prot));

 				pfn += PTRS_PER_PTE;
 			} else {
@@ -183,10 +192,12 @@ static void __init kernel_physical_mappi
 				for (pte_ofs = 0;
 				     pte_ofs < PTRS_PER_PTE && pfn < max_low_pfn;
 				     pte++, pfn++, pte_ofs++, address += PAGE_SIZE) {
+					pgprot_t prot = PAGE_KERNEL;
+
 					if (is_kernel_text(address))
-						set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
-					else
-						set_pte(pte, pfn_pte(pfn, PAGE_KERNEL));
+						prot = PAGE_KERNEL_EXEC;
+
+					set_pte(pte, pfn_pte(pfn, prot));
 				}
 			}
 		}
--
[PATCH RFC 0/7] Unify asm-x86/pgtable.h and page.h
NB: RFC ONLY.  DO NOT APPLY.

This series unifies many definitions in asm-x86/pgtable.h and page.h.
Later in the series, I take advantage of some of the earlier
infrastructure to simplify paravirt.h and bits of the Xen code.

This patch applies on top of Glauber's 64-bit pvops unification, so it
won't apply directly to the current tree.

I've tested all the 32-bit combinations
(paravirt/non-paravirt/PAE/non-PAE), but haven't set up a 64-bit test
box yet.

The diffstat of the pure unification bits is nice:

 arch/x86/mm/init_32.c            |   31
 arch/x86/mm/init_64.c            |    3
 arch/x86/xen/enlighten.c         |    8
 arch/x86/xen/mmu.c               |   67
 arch/x86/xen/mmu.h               |   26
 include/asm-x86/page.h           |   49
 include/asm-x86/page_32.h        |   77
 include/asm-x86/page_64.h        |   37
 include/asm-x86/paravirt.h       |  324
 include/asm-x86/pgtable-2level.h |   21
 include/asm-x86/pgtable-3level.h |   40
 include/asm-x86/pgtable.h        |  318
 include/asm-x86/pgtable_32.h     |  204
 include/asm-x86/pgtable_64.h     |  234
 14 files changed, 630 insertions(+), 809 deletions(-)

(The code formatting patch adds a pile of lines because it splits long
single lines into multiline code.)

	J

--
[PATCH RFC 7/7] x86: fix up formatting in pgtable*.h
Fix up various pieces of unconventional formatting in
asm-x86/pgtable*.h.  In some cases, the old formatting was arguably
clearer with a wide enough terminal, but this patch gives the option
of using a more standard form.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 include/asm-x86/pgtable-2level.h |   24
 include/asm-x86/pgtable-3level.h |   17
 include/asm-x86/pgtable.h        |   91
 include/asm-x86/pgtable_64.h     |   20
 4 files changed, 118 insertions(+), 34 deletions(-)

===
--- a/include/asm-x86/pgtable-2level.h
+++ b/include/asm-x86/pgtable-2level.h
@@ -15,25 +15,36 @@ static inline void native_set_pte(pte_t
 {
 	*ptep = pte;
 }
+
 static inline void native_set_pte_at(struct mm_struct *mm, unsigned long addr,
 				     pte_t *ptep , pte_t pte)
 {
 	native_set_pte(ptep, pte);
 }
+
 static inline void native_set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
 	*pmdp = pmd;
 }

 #undef set_pte_atomic
-#define set_pte_atomic(pteptr, pteval) set_pte(pteptr,pteval)
-#define set_pte_present(mm,addr,ptep,pteval) set_pte_at(mm,addr,ptep,pteval)
+#define set_pte_atomic(pteptr, pteval)	set_pte(pteptr,pteval)

-#define pte_clear(mm,addr,xp)	do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
+#define set_pte_present(mm,addr,ptep,pteval)	set_pte_at(mm,addr,ptep,pteval)
+
+#define pte_clear(mm,addr,xp)				\
+	do {						\
+		set_pte_at(mm, addr, xp, __pte(0));	\
+	} while (0)
+
 #undef pmd_clear
-#define pmd_clear(xp)	do { set_pmd(xp, __pmd(0)); } while (0)
+#define pmd_clear(xp)				\
+	do {					\
+		set_pmd(xp, __pmd(0));		\
+	} while (0)

-static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *xp)
+static inline void native_pte_clear(struct mm_struct *mm,
+				    unsigned long addr, pte_t *xp)
 {
 	*xp = __pte(0);
 }
@@ -66,7 +77,8 @@ static inline int pte_exec_kernel(pte_t
 /* Encode and de-code a swap entry */
 #define __swp_type(x)			(((x).val >> 1) & 0x1f)
 #define __swp_offset(x)			((x).val >> 8)
-#define __swp_entry(type, offset)	((swp_entry_t) { ((type) << 1) | ((offset) << 8) })
+#define __swp_entry(type, offset)				\
+	((swp_entry_t) { ((type) << 1) | ((offset) << 8) })
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { (pte).pte_low })
 #define __swp_entry_to_pte(x)		((pte_t) { (x).val })

===
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -9,7 +9,8 @@
  */

 #define pte_ERROR(e) \
-	printk("%s:%d: bad pte %p(%08lx%08lx).\n", __FILE__, __LINE__, &(e), (e).pte_high, (e).pte_low)
+	printk("%s:%d: bad pte %p(%08lx%08lx).\n", __FILE__, __LINE__, \
+	       &(e), (e).pte_high, (e).pte_low)
 #define pmd_ERROR(e) \
 	printk("%s:%d: bad pmd %p(%016Lx).\n", __FILE__, __LINE__, &(e), pmd_val(e))
 #define pgd_ERROR(e) \
@@ -39,6 +40,7 @@ static inline void native_set_pte(pte_t
 	smp_wmb();
 	ptep->pte_low = pte.pte_low;
 }
+
 static inline void native_set_pte_at(struct mm_struct *mm, unsigned long addr,
 				     pte_t *ptep , pte_t pte)
 {
@@ -65,10 +67,12 @@ static inline void native_set_pte_atomic
 {
 	set_64bit((unsigned long long *)(ptep),native_pte_val(pte));
 }
+
 static inline void native_set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
 	set_64bit((unsigned long long *)(pmdp),native_pmd_val(pmd));
 }
+
 static inline void native_set_pud(pud_t *pudp, pud_t pud)
 {
 	*pudp = pud;
@@ -79,7 +83,8 @@ static inline void native_set_pud(pud_t
  * entry, so clear the bottom half first and enforce ordering with a compiler
  * barrier.
  */
-static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void native_pte_clear(struct mm_struct *mm,
+				    unsigned long addr, pte_t *ptep)
 {
 	ptep->pte_low = 0;
 	smp_wmb();
@@ -102,11 +107,11 @@ static inline void native_pmd_clear(pmd_
  */
 static inline void pud_clear (pud_t * pud) { }

-#define pud_page(pud) \
-((struct page *) __va(pud_val(pud) & PAGE_MASK))
+#define pud_page(pud)						\
+	((struct page *) __va(pud_val(pud) & PAGE_MASK))

-#define pud_page_vaddr(pud) \
-((unsigned long) __va(pud_val(pud) & PAGE_MASK))
+#define pud_page_vaddr(pud)					\
+	((unsigned long) __va(pud_val(pud) & PAGE_MASK))


 #ifdef CONFIG_SMP
===
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/
[PATCH RFC 5/7] x86: simplify pagetable-related operations in paravirt.h
Simplify paravirt.h using the unified page/pgtable.h infrastructure.

This removes a fair amount of duplication of the ops function pointers
themselves, but also of PVOP_*CALL* wrappers.

The wrappers are complicated by the fact that on a 32-bit PAE system,
literal 64-bit values are passed in two arguments, and so a different
form of the call must be used compared to 64-bit or 32-bit non-PAE,
where all the arguments are less than or equal to the native register
size.  The code chooses the appropriate form to use by using the
compile-time comparison of sizeof(pteval_t) and sizeof(unsigned long).
This does not need to be done for calls which are either PAE or 64-bit
specific.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 include/asm-x86/paravirt.h |  324
 1 file changed, 141 insertions(+), 183 deletions(-)

===
--- a/include/asm-x86/paravirt.h
+++ b/include/asm-x86/paravirt.h
@@ -240,40 +240,38 @@ struct pv_mmu_ops {
 	void (*pte_update_defer)(struct mm_struct *mm,
 				 unsigned long addr, pte_t *ptep);

+	pteval_t (*pte_val)(pte_t);
+	pgdval_t (*pgd_val)(pgd_t);
+
+	pte_t (*make_pte)(pteval_t pte);
+	pgd_t (*make_pgd)(pgdval_t pgd);
+
+#if PAGETABLE_LEVELS >= 3
 #ifdef CONFIG_X86_PAE
 	void (*set_pte_atomic)(pte_t *ptep, pte_t pteval);
 	void (*set_pte_present)(struct mm_struct *mm, unsigned long addr,
 				pte_t *ptep, pte_t pte);
 #endif
-#if defined(CONFIG_X86_PAE) || defined(CONFIG_X86_64)
-	void (*set_pud)(pud_t *pudp, pud_t pudval);
+
 	void (*pte_clear)(struct mm_struct *mm, unsigned long addr, pte_t *ptep);
 	void (*pmd_clear)(pmd_t *pmdp);

-	unsigned long long (*pte_val)(pte_t);
-	unsigned long long (*pmd_val)(pmd_t);
-	unsigned long long (*pgd_val)(pgd_t);
+	pmdval_t (*pmd_val)(pmd_t);
+	pmd_t (*make_pmd)(pmdval_t pmd);

-	pte_t (*make_pte)(unsigned long long pte);
-	pmd_t (*make_pmd)(unsigned long long pmd);
-	pgd_t (*make_pgd)(unsigned long long pgd);
-
-#ifdef CONFIG_X86_64
+	void (*set_pud)(pud_t *pudp, pud_t pudval);
+
+#if PAGETABLE_LEVELS == 4
 	void (*set_pgd)(pgd_t *pgdp, pgd_t pgdval);
 	void (*pud_clear)(pud_t *pudp);
 	void (*pgd_clear)(pgd_t *pgdp);

-	unsigned long long (*pud_val)(pud_t);
+	pudval_t (*pud_val)(pud_t);

-	pud_t (*make_pud)(unsigned long long pud);
-#endif
-#else
-	unsigned long (*pte_val)(pte_t);
-	unsigned long (*pgd_val)(pgd_t);
-
-	pte_t (*make_pte)(unsigned long pte);
-	pgd_t (*make_pgd)(unsigned long pgd);
-#endif
+	pud_t (*make_pud)(pudval_t pud);
+#endif	/* PAGETABLE_LEVELS == 4 */
+#endif	/* PAGETABLE_LEVELS >= 3 */

 #ifdef CONFIG_HIGHPTE
 	void *(*kmap_atomic_pte)(struct page *page, enum km_type type);
@@ -958,85 +956,137 @@ static inline void pte_update_defer(stru
 	PVOP_VCALL3(pv_mmu_ops.pte_update_defer, mm, addr, ptep);
 }

-#ifdef CONFIG_X86_PAE
-static inline pte_t __pte(unsigned long long val)
+/*
+ * Pagetable manipulators
+ *
+ * There are three cases to deal with:
+ *  32-bit processor, non-PAE: 2-level pagetable with 32-bit entries
+ *  32-bit processor, PAE: 3-level pagetable with 64-bit entries
+ *  64-bit processor: 4-level pagetable with 64-bit entries
+ *
+ * In 32-bit mode, passing 64-bit parameters must be done in two
+ * 32-bit chunks, so we need to use a separate PVOP_CALLx macro from
+ * either 64-bit mode or 32-bit/non-PAE.
+ *
+ * We rely on the predefined native_make_X/native_X_val to do
+ * packing/unpacking of the current pagetable type.
+ */
+static inline pte_t __pte(pteval_t val)
 {
-	unsigned long long ret = PVOP_CALL2(unsigned long long,
-					    pv_mmu_ops.make_pte,
-					    val, val >> 32);
-	return (pte_t) { ret, ret >> 32 };
+	pteval_t ret;
+
+	if (sizeof(val) > sizeof(unsigned long))
+		ret = PVOP_CALL2(pteval_t, pv_mmu_ops.make_pte,
+				 val, (u64)val>>32);
+	else
+		ret = PVOP_CALL1(pteval_t, pv_mmu_ops.make_pte, val);
+
+	return native_make_pte(ret);
 }

-static inline pmd_t __pmd(unsigned long long val)
+static inline pteval_t pte_val(pte_t x)
 {
-	return (pmd_t) { PVOP_CALL2(unsigned long long, pv_mmu_ops.make_pmd,
-				    val, val >> 32) };
+	pteval_t val = native_pte_val(x);
+	if (sizeof(pteval_t) > sizeof(unsigned long))
+		return PVOP_CALL2(pteval_t, pv_mmu_ops.pte_val,
+				  val, (u64)val>>32);
+	else
+		return PVOP_CALL1(pteval_t, pv_mmu_ops.pte_val, val);
 }

-static inline pgd_t __pgd(unsigned long long val)
+static inline pgd_t __pgd(pgdval_t val)
 {
-
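The compile-time sizeof() dispatch described in the changelog can be shown in isolation with a small userspace sketch. All names here are hypothetical stand-ins (`entry_val_t`, `reg_t`, `op_one_reg()`, `op_two_regs()`, `call_op()`); the real code goes through the PVOP_CALL1/PVOP_CALL2 macros, which are not reproduced. A 32-bit `reg_t` is assumed to play the role of the native register on a PAE system.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: a 64-bit entry value and a 32-bit "register". */
typedef uint64_t entry_val_t;
typedef uint32_t reg_t;

/* Callee that receives the value as two register-sized halves. */
static entry_val_t op_two_regs(reg_t lo, reg_t hi)
{
	return ((entry_val_t)hi << 32) | lo;
}

/* Callee that receives the value in a single register. */
static entry_val_t op_one_reg(reg_t v)
{
	return v;
}

static entry_val_t call_op(entry_val_t val)
{
	/* Both operands of the comparison are compile-time constants, so
	 * the compiler can drop the branch that does not apply. */
	if (sizeof(val) > sizeof(reg_t))
		return op_two_regs((reg_t)val, (reg_t)(val >> 32));
	else
		return op_one_reg((reg_t)val);
}
```

Note that the patch writes the shift as `(u64)val>>32`; widening first keeps the expression well defined even in configurations where the value type is only 32 bits wide and the wide branch is dead code.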
[PATCH RFC 3/7] x86: clean up asm-x86/page*.h
Unify common definitions in page*.h.

To simplify other code, I added typedefs for the value of
pte/pmd/pud/pgd values, so they can be used symbolically elsewhere
without needing to have lots of 32/64/PAE tests.

Also, add PAGETABLE_LEVELS define so that other definitions can test
for it directly rather than using indirect 32/64/PAE tests.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 include/asm-x86/page.h    |   49
 include/asm-x86/page_32.h |   77
 include/asm-x86/page_64.h |   37
 3 files changed, 95 insertions(+), 68 deletions(-)

===
--- a/include/asm-x86/page.h
+++ b/include/asm-x86/page.h
@@ -1,13 +1,42 @@
+#ifndef _ASM_X86_PAGE_H
+#define _ASM_X86_PAGE_H
+
+#include
+
+/* PAGE_SHIFT determines the page size */
+#define PAGE_SHIFT	12
+#define PAGE_SIZE	(_AC(1,UL) << PAGE_SHIFT)
+#define PAGE_MASK	(~(PAGE_SIZE-1))
+#define PHYSICAL_PAGE_MASK	(~(PAGE_SIZE-1) & __PHYSICAL_MASK)
+
+#define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1))
+#define LARGE_PAGE_SIZE (_AC(1,UL) << PMD_SHIFT)
+
 #ifdef __KERNEL__
-# ifdef CONFIG_X86_32
-# include "page_32.h"
-# else
-# include "page_64.h"
-# endif
+
+#ifdef CONFIG_X86_32
+# include "page_32.h"
 #else
-# ifdef __i386__
-# include "page_32.h"
-# else
-# include "page_64.h"
-# endif
+# include "page_64.h"
 #endif
+
+#ifndef CONFIG_PARAVIRT
+#define pgd_val(x)	native_pgd_val(x)
+#define __pgd(x)	native_make_pgd(x)
+
+#ifndef __PAGETABLE_PUD_FOLDED
+#define pud_val(x)	native_pud_val(x)
+#define __pud(x)	native_make_pud(x)
+#endif
+
+#ifndef __PAGETABLE_PMD_FOLDED
+#define pmd_val(x)	native_pmd_val(x)
+#define __pmd(x)	native_make_pmd(x)
+#endif
+
+#define pte_val(x)	native_pte_val(x)
+#define __pte(x)	native_make_pte(x)
+#endif	/* CONFIG_PARAVIRT */
+
+#endif	/* __KERNEL__ */
+#endif	/* _ASM_X86_PAGE_H */
===
--- a/include/asm-x86/page_32.h
+++ b/include/asm-x86/page_32.h
@@ -1,16 +1,13 @@
 #ifndef _I386_PAGE_H
 #define _I386_PAGE_H

-/* PAGE_SHIFT determines the page size */
-#define PAGE_SHIFT	12
-#define PAGE_SIZE	(1UL << PAGE_SHIFT)
-#define PAGE_MASK	(~(PAGE_SIZE-1))
+#ifndef _ASM_X86_PAGE_H
+#error Include asm/page.h
+#endif

-#define LARGE_PAGE_MASK (~(LARGE_PAGE_SIZE-1))
-#define LARGE_PAGE_SIZE (1UL << PMD_SHIFT)
+#ifndef __ASSEMBLY__

-#ifdef __KERNEL__
-#ifndef __ASSEMBLY__
+#include

 #ifdef CONFIG_X86_USE_3DNOW

@@ -43,71 +40,86 @@
  */
 extern int nx_enabled;

+/* macro to avoid #include hell */
+#define native_pud_val(pud)	native_pgd_val((pud).pgd)
+
 #ifdef CONFIG_X86_PAE
+#define PAGETABLE_LEVELS	3
+
+typedef u64	pteval_t;
+typedef u64	pmdval_t;
+typedef u64	pudval_t;
+typedef u64	pgdval_t;
+
 typedef struct { unsigned long pte_low, pte_high; } pte_t;
-typedef struct { unsigned long long pmd; } pmd_t;
-typedef struct { unsigned long long pgd; } pgd_t;
+typedef struct { pmdval_t pmd; } pmd_t;
+typedef struct { pgdval_t pgd; } pgd_t;
 typedef struct { unsigned long long pgprot; } pgprot_t;

-static inline unsigned long long native_pgd_val(pgd_t pgd)
+static inline pgdval_t native_pgd_val(pgd_t pgd)
 {
 	return pgd.pgd;
 }

-static inline unsigned long long native_pmd_val(pmd_t pmd)
+static inline pmdval_t native_pmd_val(pmd_t pmd)
 {
 	return pmd.pmd;
 }

-static inline unsigned long long native_pte_val(pte_t pte)
+static inline pteval_t native_pte_val(pte_t pte)
 {
 	return pte.pte_low | ((unsigned long long)pte.pte_high << 32);
 }

-static inline pgd_t native_make_pgd(unsigned long long val)
+static inline pgd_t native_make_pgd(pgdval_t val)
 {
 	return (pgd_t) { val };
 }

-static inline pmd_t native_make_pmd(unsigned long long val)
+static inline pmd_t native_make_pmd(pmdval_t val)
 {
 	return (pmd_t) { val };
 }

-static inline pte_t native_make_pte(unsigned long long val)
+static inline pte_t native_make_pte(pteval_t val)
 {
 	return (pte_t) { .pte_low = val, .pte_high = (val >> 32) } ;
 }

-#ifndef CONFIG_PARAVIRT
-#define pmd_val(x)	native_pmd_val(x)
-#define __pmd(x)	native_make_pmd(x)
-#endif
-
 #define HPAGE_SHIFT	21
 #include
 #else  /* !CONFIG_X86_PAE */
+
+#define PAGETABLE_LEVELS	2
+
+typedef u32	pteval_t;
+typedef u32	pmdval_t;
+typedef u32	pgdval_t;
+
 typedef struct { unsigned long pte_low; } pte_t;
 typedef struct { unsigned long pgd; } pgd_t;
 typedef struct { unsigned long pgprot; } pgprot_t;
 #define boot_pte_t pte_t /* or would you rather have a typedef */

-static inline unsigned long native_pgd_val(pgd_t pgd)
+static inline pgdval_t native_pgd_val(pgd_t pgd)
 {
 	return pgd.pgd;
 }

-static inline unsigned long native_pte_val(pte_t pte)
+static inline pteval_t native_pte_val(pte_t pte)
 {
 	return pte.pte_low;
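As a standalone illustration of the PAE packing that native_pte_val() and native_make_pte() perform in the hunk above, here is a userspace sketch. It reuses the patch's names but is not the kernel code: fixed-width `uint32_t` halves are an assumption made here so the sketch behaves the same on 64-bit hosts, where `unsigned long` is wider than 32 bits.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* PAE-style entry: a 64-bit value stored as two 32-bit halves. */
typedef struct { uint32_t pte_low, pte_high; } pte_t;

/* Unpack: recombine the halves into one 64-bit value. */
static inline pteval_t native_pte_val(pte_t pte)
{
	return pte.pte_low | ((pteval_t)pte.pte_high << 32);
}

/* Pack: split a 64-bit value into its low and high halves. */
static inline pte_t native_make_pte(pteval_t val)
{
	return (pte_t) { .pte_low = (uint32_t)val,
			 .pte_high = (uint32_t)(val >> 32) };
}
```

Having a single `pteval_t` for the unpacked form is what lets callers elsewhere stay free of 32/64/PAE tests: only this pair of helpers knows how the entry is laid out.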
[PATCH RFC 6/7] x86/xen: simplify Xen mmu operations
Take advantage of the unified page/pgtable.h definitions to reduce
the number of duplicate definitions of the various Xen mmu_ops
functions.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

---
 arch/x86/xen/enlighten.c |    8
 arch/x86/xen/mmu.c       |   67
 arch/x86/xen/mmu.h       |   26
 3 files changed, 41 insertions(+), 60 deletions(-)

===
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1038,16 +1038,18 @@ static const struct pv_mmu_ops xen_mmu_o
 	.make_pte = xen_make_pte,
 	.make_pgd = xen_make_pgd,

+#if PAGETABLE_LEVELS >= 3
 #ifdef CONFIG_X86_PAE
 	.set_pte_atomic = xen_set_pte_atomic,
 	.set_pte_present = xen_set_pte_at,
+#endif	/* PAE */
 	.set_pud = xen_set_pud,
 	.pte_clear = xen_pte_clear,
 	.pmd_clear = xen_pmd_clear,

 	.make_pmd = xen_make_pmd,
 	.pmd_val = xen_pmd_val,
-#endif	/* PAE */
+#endif	/* PAGETABLE_LEVELS >= 3 */

 	.activate_mm = xen_activate_mm,
 	.dup_mmap = xen_dup_mmap,
@@ -1175,6 +1177,10 @@ asmlinkage void __init xen_start_kernel(
 	xen_setup_vcpu_info_placement();
 #endif

+#ifdef CONFIG_X86_PAE
+	__supported_pte_mask &= ~_PAGE_PCD;
+#endif
+
 	pv_info.kernel_rpl = 1;
 	if (xen_feature(XENFEAT_supervisor_mode_kernel))
 		pv_info.kernel_rpl = 0;
===
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -211,7 +211,7 @@ void xen_pmd_clear(pmd_t *pmdp)
 	xen_set_pmd(pmdp, __pmd(0));
 }

-unsigned long long xen_pte_val(pte_t pte)
+pteval_t xen_pte_val(pte_t pte)
 {
 	unsigned long long ret = 0;

@@ -223,23 +223,7 @@ unsigned long long xen_pte_val(pte_t pte
 	return ret;
 }

-unsigned long long xen_pmd_val(pmd_t pmd)
-{
-	unsigned long long ret = pmd.pmd;
-	if (ret)
-		ret = machine_to_phys(XMADDR(ret)).paddr | 1;
-	return ret;
-}
-
-unsigned long long xen_pgd_val(pgd_t pgd)
-{
-	unsigned long long ret = pgd.pgd;
-	if (ret)
-		ret = machine_to_phys(XMADDR(ret)).paddr | 1;
-	return ret;
-}
-
-pte_t xen_make_pte(unsigned long long pte)
+pte_t xen_make_pte(pteval_t pte)
 {
 	if (pte & 1)
 		pte = phys_to_machine(XPADDR(pte)).maddr;
@@ -247,20 +231,13 @@ pte_t xen_make_pte(unsigned long long pt
 	return (pte_t){ pte, pte >> 32 };
 }

-pmd_t xen_make_pmd(unsigned long long pmd)
+
+pmd_t xen_make_pmd(pmdval_t pmd)
 {
 	if (pmd & 1)
 		pmd = phys_to_machine(XPADDR(pmd)).maddr;

-	return (pmd_t){ pmd };
-}
-
-pgd_t xen_make_pgd(unsigned long long pgd)
-{
-	if (pgd & _PAGE_PRESENT)
-		pgd = phys_to_machine(XPADDR(pgd)).maddr;
-
-	return (pgd_t){ pgd };
+	return native_make_pmd(pmd);
 }
 #else  /* !PAE */
 void xen_set_pte(pte_t *ptep, pte_t pte)
@@ -268,7 +245,7 @@ void xen_set_pte(pte_t *ptep, pte_t pte)
 	*ptep = pte;
 }

-unsigned long xen_pte_val(pte_t pte)
+pteval_t xen_pte_val(pte_t pte)
 {
 	unsigned long ret = pte.pte_low;

@@ -278,30 +255,38 @@ unsigned long xen_pte_val(pte_t pte)
 	return ret;
 }

-unsigned long xen_pgd_val(pgd_t pgd)
-{
-	unsigned long ret = pgd.pgd;
-	if (ret)
-		ret = machine_to_phys(XMADDR(ret)).paddr | 1;
-	return ret;
-}
-
-pte_t xen_make_pte(unsigned long pte)
+pte_t xen_make_pte(pteval_t pte)
 {
 	if (pte & _PAGE_PRESENT)
 		pte = phys_to_machine(XPADDR(pte)).maddr;

 	return (pte_t){ pte };
 }
+#endif	/* CONFIG_X86_PAE */

-pgd_t xen_make_pgd(unsigned long pgd)
+pmdval_t xen_pmd_val(pmd_t pmd)
+{
+	pmdval_t ret = native_pmd_val(pmd);
+	if (ret)
+		ret = machine_to_phys(XMADDR(ret)).paddr | 1;
+	return ret;
+}
+
+pgdval_t xen_pgd_val(pgd_t pgd)
+{
+	pgdval_t ret = native_pgd_val(pgd);
+	if (ret)
+		ret = machine_to_phys(XMADDR(ret)).paddr | 1;
+	return ret;
+}
+
+pgd_t xen_make_pgd(pgdval_t pgd)
 {
 	if (pgd & _PAGE_PRESENT)
 		pgd = phys_to_machine(XPADDR(pgd)).maddr;

-	return (pgd_t){ pgd };
+	return native_make_pgd(pgd);
 }
-#endif	/* CONFIG_X86_PAE */

 enum pt_level {
 	PT_PGD,
===
--- a/arch/x86/xen/mmu.h
+++ b/arch/x86/xen/mmu.h
@@ -30,31 +30,21 @@ void xen_pgd_pin(pgd_t *pgd);
 void xen_pgd_pin(pgd_t *pgd);
 //void xen_pgd_unpin(pgd_t *pgd);

+pteval_t xen_pte_val(pte_t);
+pmdval_t xen_pmd_val(pmd_t);
+pgdval_t xen_pgd_val(pgd_t);
+
+pte_t xen_make_pte(pteval_t);
+pmd_t xen_make_pmd(pmdval_t);
+pgd_t xen_make_pgd(pgdval_t);
+
 #ifdef CONFIG_X86_PAE
-unsigned long long xen_pte_val(pte_t);
-unsigned long long xen_pmd_val(pmd_t);
-unsigned long long xen_pgd_val(pgd_t);
-
-pte_t xen_make_pte(unsigned long long);
-pm
[PATCH RFC 1/7] x86: kill mk_pte_huge
It only has a single use, which can be trivially replaced.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
---
 arch/x86/mm/init_64.c        |3 +--
 include/asm-x86/pgtable_64.h |9 -
 2 files changed, 1 insertion(+), 11 deletions(-)

===
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -768,8 +768,7 @@ int __meminit vmemmap_populate(struct pa
 			if (!p)
 				return -ENOMEM;
 
-			entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
-			mk_pte_huge(entry);
+			entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL_LARGE);
 			set_pmd(pmd, __pmd(pte_val(entry)));
 
 			printk(KERN_DEBUG " [%lx-%lx] PMD ->%p on node %d\n",
===
--- a/include/asm-x86/pgtable_64.h
+++ b/include/asm-x86/pgtable_64.h
@@ -378,15 +378,6 @@ static inline pte_t pte_clrhuge(pte_t pt
 /* page, protection -> pte */
 #define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
 
-static inline pte_t __mk_pte_huge(pte_t entry)
-{
-	unsigned long pte;
-	pte = pte_val(entry);
-	pte |= _PAGE_PRESENT | _PAGE_PSE;
-	return __pte(pte);
-}
-#define mk_pte_huge(entry) ((entry) = __mk_pte_huge(entry))
-
 #include
 static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
 					    unsigned long addr, pte_t *ptep)
--
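For readers following along, here is a minimal userspace sketch of why the replacement is equivalent. It is not kernel code: the DEMO_ names are made up for illustration, the bit positions are the x86 hardware page-table flag bits, and the composition of PAGE_KERNEL and PAGE_KERNEL_LARGE shown here is an assumption about the 64-bit definitions rather than something stated in the patch.

```c
#include <stdint.h>

/* Stand-ins for the x86 page-table flag bits. The DEMO_ prefix marks
 * these as illustration only, not the kernel's own macros. */
#define DEMO_PAGE_PRESENT  (UINT64_C(1) << 0)
#define DEMO_PAGE_RW       (UINT64_C(1) << 1)
#define DEMO_PAGE_ACCESSED (UINT64_C(1) << 5)
#define DEMO_PAGE_DIRTY    (UINT64_C(1) << 6)
#define DEMO_PAGE_PSE      (UINT64_C(1) << 7)	/* large page when set in a PMD entry */
#define DEMO_PAGE_NX       (UINT64_C(1) << 63)

/* Assumed composition of PAGE_KERNEL and PAGE_KERNEL_LARGE. */
#define DEMO_PAGE_KERNEL \
	(DEMO_PAGE_PRESENT | DEMO_PAGE_RW | DEMO_PAGE_ACCESSED | \
	 DEMO_PAGE_DIRTY | DEMO_PAGE_NX)
#define DEMO_PAGE_KERNEL_LARGE (DEMO_PAGE_KERNEL | DEMO_PAGE_PSE)

/* Models what the removed __mk_pte_huge() did to a raw entry value. */
uint64_t demo_mk_pte_huge(uint64_t pte)
{
	return pte | DEMO_PAGE_PRESENT | DEMO_PAGE_PSE;
}
```

Under those assumptions, applying demo_mk_pte_huge() to DEMO_PAGE_KERNEL yields exactly DEMO_PAGE_KERNEL_LARGE, which is why the two-step construction in vmemmap_populate() can collapse into a single pfn_pte() call.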
Re: mm snapshot broken-out-2007-11-06-02-32 build failure - !CONFIG_PPC_ISERIES
On Thu, Nov 08, 2007 at 02:27:07AM +0530, Kamalesh Babulal wrote:
> Hi Andrew,
>
> The kernel build fails with randconfig, with following error
>
> CC arch/powerpc/platforms/celleb/setup.o
> arch/powerpc/platforms/celleb/setup.c:151: error: ‘generic_calibrate_decr’ undeclared here (not in a function)
> make[2]: *** [arch/powerpc/platforms/celleb/setup.o] Error 1
> make[1]: *** [arch/powerpc/platforms/celleb] Error 2
> make: *** [arch/powerpc/platforms] Error 2

I think you need this patch:
http://patchwork.ozlabs.org/linuxppc/patch?q=Tony%20Breeds&id=14462

Yours Tony

linux.conf.au         http://linux.conf.au/ || http://lca2008.linux.org.au/
Jan 28 - Feb 02 2008  The Australian Linux Technical Conference!
[PATCH] pktcdvd: fix BUG caused by sysfs module reference semantics change
pkt_setup_dev() expects module reference to be held on invocation. This used to be true for sysfs callbacks but not anymore. Test and grab module reference around pkt_setup_dev() in class_pktcdvd_store_add().

Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
Acked-by: Peter Osterlund <[EMAIL PROTECTED]>
---
Greg, can you please push this patch through your tree? Thanks a lot.

 drivers/block/pktcdvd.c |9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index a8130a4..a5ee213 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -358,10 +358,19 @@ static ssize_t class_pktcdvd_store_add(struct class *c, const char *buf,
 					size_t count)
 {
 	unsigned int major, minor;
+
 	if (sscanf(buf, "%u:%u", &major, &minor) == 2) {
+		/* pkt_setup_dev() expects caller to hold reference to self */
+		if (!try_module_get(THIS_MODULE))
+			return -ENODEV;
+
 		pkt_setup_dev(MKDEV(major, minor), NULL);
+
+		module_put(THIS_MODULE);
+
 		return count;
 	}
+
 	return -EINVAL;
 }
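The get/call/put pattern in the patch can be tried out in userspace with a rough model. Everything prefixed demo_ below is hypothetical and only stands in for the kernel pieces: demo_try_module_get() imitates try_module_get() refusing once the module is on its way out, and the "seen" value stands in for the work pkt_setup_dev() would do while the reference is held.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a module and its reference count. */
struct demo_module {
	int refcnt;
	bool going;	/* true once unloading has started */
};

/* Imitates try_module_get(): refuse a new reference while unloading. */
static bool demo_try_module_get(struct demo_module *m)
{
	if (m->going)
		return false;
	m->refcnt++;
	return true;
}

/* Imitates module_put(). */
static void demo_module_put(struct demo_module *m)
{
	m->refcnt--;
}

/*
 * Models class_pktcdvd_store_add(): parse "major:minor", take a
 * reference, do the work, drop the reference again.  *seen_refcnt
 * records the count observed while the work would be running.
 */
int demo_store_add(struct demo_module *m, const char *buf, int *seen_refcnt)
{
	unsigned int major, minor;

	if (sscanf(buf, "%u:%u", &major, &minor) != 2)
		return -EINVAL;

	if (!demo_try_module_get(m))
		return -ENODEV;

	*seen_refcnt = m->refcnt;	/* stand-in for pkt_setup_dev() */

	demo_module_put(m);
	return 0;
}
```

The point of the model is only that the callee always runs with the count raised, and that the count is balanced again on every path that took the reference.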
Re: [PATCH][VIRTIO] Fix vring_init() ring computations
On Thursday 08 November 2007 12:06:07 Anthony Liguori wrote:
> Rusty Russell wrote:
> > On Wednesday 07 November 2007 13:52:29 Anthony Liguori wrote:
> >> This patch fixes a typo in vring_init().
> >
> > Thanks, applied.
> >
> > I've put it in the new, experimental virtio git tree on git.kernel.org.
>
> Hrm, perhaps you forgot to push? I don't see it in the tree although I
> see the config ops refactoring.

It should be in the patches/1 branch. I've pushed again...

Thanks,
Rusty.
Re: [PATCH] virtio config_ops refactoring
On Thursday 08 November 2007 04:30:50 Anthony Liguori wrote:
> I would prefer that the virtio API not expose a little endian standard.
> I'm currently converting config->get() ops to ioreadXX depending on the
> size which already does the endianness conversion for me so this just
> messes things up. I think it's better to let the backend deal with
> endianness since it's trivial to handle for both the PCI backend and the
> lguest backend (lguest doesn't need to do any endianness conversion).

-ETOOMUCHMAGIC. We should either expose all the XX interfaces (but this isn't a high-speed interface, so let's not) or not "sometimes" convert endianness. Getting surprises because a field happens to be packed into 4 bytes is counter-intuitive.

Since your most trivial implementation is to do a byte at a time, I don't think you have a good argument on that basis either.

Cheers,
Rusty.
[PATCH 2/2 take #2] libata: pata_platform: Support polling-mode configuration.
Some SH boards (old R2D-1 boards) have generally not had working CF under libata, due to both buswidth issues (handled by Aoi Shinkai in 43f4b8c7578b928892b6f01d374346ae14e5eb70), and buggy interrupt controllers. For these sorts of boards simply disabling the IRQ and polling ends up working fine.

This conditionalizes the IRQ resource for pata_platform and lets platforms that want to use polling mode simply omit the resource entirely.

Signed-off-by: Paul Mundt <[EMAIL PROTECTED]>

---

 drivers/ata/pata_platform.c | 35 ---
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/drivers/ata/pata_platform.c b/drivers/ata/pata_platform.c
index fc72a96..ac03a90 100644
--- a/drivers/ata/pata_platform.c
+++ b/drivers/ata/pata_platform.c
@@ -1,7 +1,7 @@
 /*
  * Generic platform device PATA driver
  *
- * Copyright (C) 2006 Paul Mundt
+ * Copyright (C) 2006 - 2007 Paul Mundt
  *
  * Based on pata_pcmcia:
  *
@@ -22,7 +22,7 @@
 #include
 
 #define DRV_NAME "pata_platform"
-#define DRV_VERSION "1.1"
+#define DRV_VERSION "1.2"
 
 static int pio_mask = 1;
 
@@ -120,15 +120,20 @@ static void pata_platform_setup_port(struct ata_ioports *ioaddr,
  * Register a platform bus IDE interface. Such interfaces are PIO and we
  * assume do not support IRQ sharing.
  *
- * Platform devices are expected to contain 3 resources per port:
+ * Platform devices are expected to contain at least 2 resources per port:
  *
  *	- I/O Base (IORESOURCE_IO or IORESOURCE_MEM)
  *	- CTL Base (IORESOURCE_IO or IORESOURCE_MEM)
+ *
+ * and optionally:
+ *
  *	- IRQ (IORESOURCE_IRQ)
  *
  * If the base resources are both mem types, the ioremap() is handled
  * here. For IORESOURCE_IO, it's assumed that there's no remapping
  * necessary.
+ *
+ * If no IRQ resource is present, PIO polling mode is used instead.
  */
 static int __devinit pata_platform_probe(struct platform_device *pdev)
 {
@@ -137,11 +142,12 @@ static int __devinit pata_platform_probe(struct platform_device *pdev)
 	struct ata_port *ap;
 	struct pata_platform_info *pp_info;
 	unsigned int mmio;
+	int irq;
 
 	/*
 	 * Simple resource validation ..
 	 */
-	if (unlikely(pdev->num_resources != 3)) {
+	if ((pdev->num_resources != 3) && (pdev->num_resources != 2)) {
 		dev_err(&pdev->dev, "invalid number of resources\n");
 		return -EINVAL;
 	}
@@ -173,6 +179,13 @@ static int __devinit pata_platform_probe(struct platform_device *pdev)
 		(ctl_res->flags == IORESOURCE_MEM));
 
 	/*
+	 * And the IRQ
+	 */
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0)
+		irq = 0;	/* no irq */
+
+	/*
 	 * Now that that's out of the way, wire up the port..
 	 */
 	host = ata_host_alloc(&pdev->dev, 1);
@@ -185,6 +198,14 @@ static int __devinit pata_platform_probe(struct platform_device *pdev)
 	ap->flags |= ATA_FLAG_SLAVE_POSS;
 
 	/*
+	 * Use polling mode if there's no IRQ
+	 */
+	if (!irq) {
+		ap->flags |= ATA_FLAG_PIO_POLLING;
+		ata_port_desc(ap, "no IRQ, using PIO polling");
+	}
+
+	/*
 	 * Handle the MMIO case
 	 */
 	if (mmio) {
@@ -213,9 +234,9 @@ static int __devinit pata_platform_probe(struct platform_device *pdev)
 		      (unsigned long long)ctl_res->start);
 
 	/* activate */
-	return ata_host_activate(host, platform_get_irq(pdev, 0),
-				 ata_interrupt, pp_info ? pp_info->irq_flags
-				 : 0, &pata_platform_sht);
+	return ata_host_activate(host, irq, irq ? ata_interrupt : NULL,
+				 pp_info ? pp_info->irq_flags : 0,
+				 &pata_platform_sht);
 }
 
 /**
[PATCH 1/2 take #2] libata: Support PIO polling-only hosts.
By default ata_host_activate() expects a valid IRQ in order to successfully register the host. This patch enables a special case for registering polling-only hosts that either don't have IRQs or have buggy IRQ generation (either in terms of handling or sensing), which otherwise work fine.

Hosts that want to use polling mode can simply set ATA_FLAG_PIO_POLLING and pass in an invalid IRQ.

Signed-off-by: Paul Mundt <[EMAIL PROTECTED]>

---

 drivers/ata/libata-core.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index ec3ce12..89fd0e9 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -7178,6 +7178,10 @@ int ata_host_register(struct ata_host *host, struct scsi_host_template *sht)
  *	request IRQ and register it. This helper takes necessasry
  *	arguments and performs the three steps in one go.
  *
+ *	An invalid IRQ skips the IRQ registration and expects the host to
+ *	have set polling mode on the port. In this case, @irq_handler
+ *	should be NULL.
+ *
  *	LOCKING:
  *	Inherited from calling layer (may sleep).
  *
@@ -7194,6 +7198,12 @@ int ata_host_activate(struct ata_host *host, int irq,
 	if (rc)
 		return rc;
 
+	/* Special case for polling mode */
+	if (!irq) {
+		WARN_ON(irq_handler);
+		return ata_host_register(host, sht);
+	}
+
 	rc = devm_request_irq(host->dev, irq, irq_handler, irq_flags,
 			      dev_driver_string(host->dev), host);
 	if (rc)
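As a rough illustration of how the two patches in this series fit together, the decision can be modelled in plain C. This is a sketch only: struct demo_activation and demo_plan_activation() are invented for the example, and platform_irq merely stands in for whatever platform_get_irq() returns (assumed negative when the device has no IRQ resource).

```c
#include <stdbool.h>

/* Hypothetical summary of how a port ends up being driven. */
struct demo_activation {
	int irq;		/* 0 means "no usable IRQ" */
	bool use_polling;	/* models setting ATA_FLAG_PIO_POLLING */
	bool has_handler;	/* models passing ata_interrupt vs. NULL */
};

/*
 * platform_irq models the value returned by platform_get_irq():
 * negative when the platform device carries no IRQ resource.
 */
struct demo_activation demo_plan_activation(int platform_irq)
{
	struct demo_activation a;

	/* Driver side: normalise "no IRQ" to 0. */
	a.irq = platform_irq < 0 ? 0 : platform_irq;

	/* Driver side: no IRQ means the port is flagged for polling. */
	a.use_polling = (a.irq == 0);

	/* Core side: a handler is only meaningful with a real IRQ. */
	a.has_handler = (a.irq != 0);

	return a;
}
```

In this model the polling flag and the NULL handler always go together, which matches the WARN_ON(irq_handler) check in the special case above.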
Re: Module init call vs symbols exporting race?
On Wednesday 07 November 2007 21:01:30 Jan Glauber wrote:
> Hi Rusty,
>
> I've seen a symbol-resolving race on s390. The qeth module uses symbols
> from qdio and although the loading order seems correct and the qdio
> symbols should be available the following error appears:
>
> qdio: loading QDIO base support version 2
> qeth: Unknown symbol qdio_synchronize

Looks like qdio does something which triggers qeth to load, but of course qdio isn't finished initializing yet so its symbols aren't available.

It's not obvious what's triggering the load, but you could probably find it by using printk's through qdio.c's init_QDIO().

Cheers,
Rusty.
Re: compat_sys_times() bogus until jiffies >= 0.
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 12:53:57 +1100

> Andrew Morton writes:
>
> > Given all this stuff, the return value from sys_times() doesn't seem a
> > particularly useful or reliable kernel interface.
>
> I think the best thing would be to ignore any error from copy_to_user
> and always return the number of clock ticks. We should call
> force_successful_syscall_return, and glibc on x86 should be taught not
> to interpret negative values as an error.
>
> POSIX doesn't require us to return an EFAULT error if the buf argument
> is bogus. If userspace does supply a bogus buf pointer, then either
> it will dereference it itself and get a segfault, or it won't
> dereference it, in which case it obviously didn't care about the
> values we tried to put there.
>
> If we try to return an error under some circumstances, then there is
> at least one 32-bit value for the number of ticks that will cause
> confusion. We can either change that value (or values) to some other
> value, which seems pretty bogus, or we can just decide not to return
> any errors. The latter seems to me to have no significant downside
> and to be the simplest solution to the problem.

I agree with this analysis.

The Linux man page for times() explicitly lists (clock_t) -1 as a return value meaning error.

So even if we did make some effort to return errors "properly" (via force_successful_syscall_return() et al.) userspace would still be screwed because (clock_t) -1 would be interpreted as an error.

Actually I think this basically proves we cannot return (clock_t) -1 ever because all existing userland (I'm not talking about inside glibc, I'm talking about inside of applications) will see this as an error.

User applications have no other way to check for error. This API is definitely very poorly designed, no matter which way we "fix" this some case will remain broken.
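To make the ambiguity concrete, here is a small userspace sketch. It assumes a 32-bit signed clock_t, modelled as demo_clock32_t, and it assumes the x86 convention that syscall return values in the range [-4095, -1] are read as a negated errno. All demo_ names are made up for this example and are not kernel or glibc interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit clock_t as seen by 32-bit userspace (an assumption). */
typedef int32_t demo_clock32_t;

/* Reinterpret an unsigned 32-bit tick counter as a signed clock_t
 * without relying on implementation-defined narrowing. */
demo_clock32_t demo_ticks_to_clock32(uint32_t ticks)
{
	int64_t wide = ticks;

	if (wide > INT32_MAX)
		wide -= INT64_C(4294967296);
	return (demo_clock32_t)wide;
}

/* How an application following the man page checks the result. */
bool demo_app_sees_error(demo_clock32_t ret)
{
	return ret == (demo_clock32_t)-1;
}

/* How a syscall wrapper using the [-4095, -1] errno window reads it. */
bool demo_wrapper_sees_error(demo_clock32_t ret)
{
	return ret < 0 && ret >= -4095;
}
```

With these definitions a perfectly valid tick count of 0xffffffff is indistinguishable from the documented error value, and any count in the top 4095 values trips the wrapper-level check, which is the confusion the thread is about.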
Re: compat_sys_times() bogus until jiffies >= 0.
Andrew Morton writes:

> Given all this stuff, the return value from sys_times() doesn't seem a
> particularly useful or reliable kernel interface.

I think the best thing would be to ignore any error from copy_to_user and always return the number of clock ticks. We should call force_successful_syscall_return, and glibc on x86 should be taught not to interpret negative values as an error.

POSIX doesn't require us to return an EFAULT error if the buf argument is bogus. If userspace does supply a bogus buf pointer, then either it will dereference it itself and get a segfault, or it won't dereference it, in which case it obviously didn't care about the values we tried to put there.

If we try to return an error under some circumstances, then there is at least one 32-bit value for the number of ticks that will cause confusion. We can either change that value (or values) to some other value, which seems pretty bogus, or we can just decide not to return any errors. The latter seems to me to have no significant downside and to be the simplest solution to the problem.

Paul.
Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10
Denys wrote:
> Finally i got full DMESG with 1GB card till end. Seems not readable too.

..

> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata1.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 in
>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata1.00: status: { DRDY }
> ata1: soft resetting link
> ata1.00: configured for MWDMA1
> sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x08
> sd 0:0:0:0: [sda] Sense Key : 0xb [current] [descriptor]
> Descriptor sense data with sense descriptors (in hex):
>         72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
>         00 00 00 00
> sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x0
> end_request: I/O error, dev sda, sector 0
> Buffer I/O error on device sda, logical block 0
> ata1: EH complete

I'm guessing that your CF-to-IDE adapter doesn't have the correct lines wired up for DMA to work properly, and the card indicates DMA support, which libata tries to use but which doesn't work.

It looks like it never tried falling back to PIO after DMA failed. Seems like a deficiency in the speed-down logic?

--
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: [poll] Is the megafreeze development model broken?
On Wed, Nov 07, 2007 at 11:56:57PM +0100, ciol wrote:
> Hi, I'd like to ask you a few questions:
>
> * Do you like the way linux distributions integrate the kernel?
>
> * Wouldn't you prefer they ship with the stable and still maintained
> 2.6.16.X, while providing optionally the latest kernel for those who want
> or just have a new hardware?

No.

With 2.6.16 "new hardware" roughly equals "sold during the last 2-3 years", so most users would be forced to use this "option".

"providing optionally the latest kernel" would be a horror to support for a distribution.

From all I hear all big distributions spend 3-6 months of QA work between pushing a kernel into the development branch of their distribution and putting it into a release. They can't do this work for 4-6 different upstream kernels each year. And if they'd omit it, their customers would both blame them for shipping such a buggy distribution and swamp their support with bug reports.

> * Do you think the megafreeze development model [1] and the "I don't trust
> in upstream" development model are broken? (And why)
>...

Definitely not.

If your "stable base system" contains the kernel you lose the hardware support for recent hardware. What should be more important for users than having their hardware supported?

And although it's off-topic for linux-kernel, your suggested "well-maintained additional package collections" also sound horrific:

As an example, consider the following:
- a new version of GNOME might require a new version of GTK+
- recently GTK+ 2.12 entered Debian testing, and this new version exposed a serious bug in the xfwm4 package that was at that time in testing

There are at least two obvious problems with what you propose:
- for avoiding breakages for users a huge amount of coordination work between the "additional package collections" would be required
- most users want their software to work correctly, not crash, etc.; when a distribution has a 2-3 month freeze before a release that's not lost time, that's time where _all_ software that will be shipped gets tested and bugs fixed

There's one important thing you must have in mind:

Geeks (like you and me) can get the latest software versions from the development versions of their distribution, but for most users - for whom a computer is a tool that should simply work (no matter whether it's a server or a desktop) and not a toy - the QA work done during a freeze has a _huge_ value.

Fedora, openSUSE and Ubuntu all offer new releases every 6 months, which results in the software in the latest release always being less than 1 year old plus the user getting the QA work and the resulting stability of a freeze. This seems to be a good solution for desktop users.

cu
Adrian

(2.6.16 maintainer)

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?
> But I think we'd be best off stashing a single bit somewhere and
> checking it at migrate time (relatively infrequent) rather than
> copying and zeroing out a potentially enormous affinity mask every
> time we disable migration (often, and in fast paths). Perhaps adding
> TASK_PINNED to the task state flags would do it?

It would need to be a count to be able to nest it.

> > get_cpu() etc. could be changed to use this then too.
>
> Some users of get_cpu might be relying on it to avoid actual
> preemption. In other words, we should have introduced a
> migrate_disable() when we first discovered the preempt/per_cpu
> conflict.

Ok perhaps it would make sense to migrate it step by step :- define a replacement for get_cpu and migrate over as users are getting audited and eventually deprecate old one.

-Andi
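The "it would need to be a count" remark can be shown with a tiny userspace model. Nothing below exists in the kernel under these names: struct demo_task and the demo_ functions are hypothetical, and the sketch only illustrates why a nesting depth survives a nested disable/enable pair where a single flag bit would not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state: a nesting depth instead of one bit. */
struct demo_task {
	unsigned int migrate_disable_depth;
};

/* Each disable pushes one level; calls may nest freely. */
void demo_migrate_disable(struct demo_task *t)
{
	t->migrate_disable_depth++;
}

/* Each enable pops one level; unbalanced calls are a bug. */
void demo_migrate_enable(struct demo_task *t)
{
	assert(t->migrate_disable_depth > 0);
	t->migrate_disable_depth--;
}

/* The check a migration path would make (relatively infrequent). */
bool demo_can_migrate(const struct demo_task *t)
{
	return t->migrate_disable_depth == 0;
}
```

With a plain flag, the inner enable of a nested pair would clear the bit and let the task migrate while the outer section still expects to be pinned. With the counter the task only becomes migratable again once the outermost enable has run.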
Re: build #337 failed for 2.6.24-rc1-gb1d08ac In function `usbnet_set_settings':
On Wed, Nov 07, 2007 at 02:34:52PM -0800, David Brownell wrote:
> > > But on the other hand, it seems that only the ASIX code will work
> > > right; the DM9601 and MCS7830 Kconfig is different/wrong.
> >
> > I'm not seeing the problem.
> >
> > Which configuration will be handled wrongly?
>
> Notice how only the ASIX kconfig depended on NET_ETHERNET...
> since MII depends on NET_ETHERNET, and (last I knew) the
> reverse dependencies didn't capture the complete dependency
> tree, selecting only MII would leave out some stuff.

Except for one s390 net driver (I'll check why it's doing this) the NET_ETHERNET option does not influence what code is being generated - it's just a Kconfig-internal option allowing to disable a huge bunch of drivers at once.

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: compat_sys_times() bogus until jiffies >= 0.
> On Thu, 08 Nov 2007 01:54:40 +0100 Andreas Schwab <[EMAIL PROTECTED]> wrote:
> Andrew Morton <[EMAIL PROTECTED]> writes:
>
> > diff -puN kernel/compat.c~a kernel/compat.c
> > --- a/kernel/compat.c~a
> > +++ a/kernel/compat.c
> > @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct
> >  		if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
> >  			return -EFAULT;
> >  	}
> > -	return compat_jiffies_to_clock_t(jiffies);
> > +	return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
> > +						LONG_MAX);
>
> Are you sure you want LONG_MAX here, not 0x7fff?
>

I'm not sure of anything - I'm just trolling ;)

That's 0x7fff for architectures which implement this function. I think that lines up correctly with jiffies and the return value from compat_sys_times().

Perhaps formally it should be USERSPACE_CLOCK_T_MAX, but we don't have that.
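The effect of the mask width can be tried out in userspace. The sketch below is only an illustration under stated assumptions: a 64-bit tick counter, a 32-bit signed clock_t on the userspace side, and demo_ helper names invented for the example. It compares masking with the 64-bit maximum against masking with the 32-bit maximum (INT32_MAX) before the value is truncated to 32 bits.

```c
#include <stdint.h>

/* Take the low 32 bits of v and read them as a signed 32-bit value,
 * without relying on implementation-defined narrowing. */
static int32_t demo_low32_as_signed(uint64_t v)
{
	int64_t low = (int64_t)(v & UINT64_C(0xffffffff));

	if (low > INT32_MAX)
		low -= INT64_C(4294967296);
	return (int32_t)low;
}

/* Mask with the 64-bit signed maximum: bit 31 is left untouched. */
int32_t demo_ticks_masked_64(uint64_t ticks)
{
	return demo_low32_as_signed(ticks & (uint64_t)INT64_MAX);
}

/* Mask with the 32-bit signed maximum: bit 31 is always cleared. */
int32_t demo_ticks_masked_32(uint64_t ticks)
{
	return demo_low32_as_signed(ticks & (uint64_t)INT32_MAX);
}
```

Feeding in a counter whose low 32 bits have the top bit set shows the difference: the 64-bit mask still lets a negative value through to a 32-bit caller, while the 32-bit mask cannot.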
Re: compat_sys_times() bogus until jiffies >= 0.
> On Wed, 07 Nov 2007 16:50:22 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 7 Nov 2007 15:28:33 -0800
>
> > Perhaps this is a bug in glibc: it is interpreting the times() return value
> > in the same way as other syscalls.
>
> The problem is more likely that we are failing to
> invoke force_successful_syscall_return() here.
>
> Otherwise the syscall return path interprets negative
> values as errors, and sets the cpu condition codes.
>
> And that is what userspace is actually checking for
> to determine if there is an error or not.

hm, I'd forgotten about that. It seems to be a no-op on lots of architectures?
Re: [PATCH 1/2] Suppress A.OUT library support in ELF binfmt if !CONFIG_BINFMT_AOUT [try #3]
On Wed, Nov 07, 2007 at 05:43:28PM +, David Howells wrote:
> Suppress A.OUT library support in ELF binfmt if CONFIG_BINFMT_AOUT is not set.
>
> Not all architectures support the A.OUT binfmt, so the ELF binfmt should not
> be permitted to go looking for A.OUT libraries to load in such a case.
>...

The a.out interpreter support for ELF executables is already scheduled for being completely removed in 2.6.25.

cu
Adrian

--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Re: [Bugme-new] [Bug 9319] New: National characters are not displayed under console.
This isn't a regression. It's an intentional default change. The default console mode changed from 8-bit legacy to UTF-8 in 2.6.24.

Apparently this user is using a legacy character set (note that it's a Slackware machine), and isn't explicitly setting the character set via the appropriate escape sequence.

The new default can be overridden via /sys/module/vt/parameters/default_utf8 or something like that...

-hpa

Andrew Morton wrote:
> On Wed, 7 Nov 2007 13:19:16 -0800 (PST) [EMAIL PROTECTED] wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=9319
>>
>> Summary: National characters are not displayed under console.
>> Product: Drivers
>> Version: 2.5
>> KernelVersion: 2.6.24-rcX
>> Platform: All
>> OS/Version: Linux
>> Tree: Mainline
>> Status: NEW
>> Severity: normal
>> Priority: P1
>> Component: Console/Framebuffers
>> AssignedTo: [EMAIL PROTECTED]
>> ReportedBy: [EMAIL PROTECTED]
>>
>> Most recent kernel where this bug did not occur: 2.6.23
>> Distribution: Slackware
>> Hardware Environment: Toshiba Tecra M1 Pentium M 1.6 512MB RAM, ICH4-M
>> chipset, Trident CyberBlade XP4 video card
>> Software Environment: Slackware-current (kbd-1.12, glibs 2.5)
>> Problem Description: The national characters like "ą", "ł" or "ż" are not
>> displayed corectlly under console (no matter vesa framebuffer, or standard
>> vga). Instead of them "?" on strange background is displayed. Problem begins
>> on 2.6.24-rc1 and continues on 2.6.24-rc2. On 2.6.23 everything is OK.
>>
>> Steps to reproduce: Run 2.6.24-rcX kernel and set national console font by
>> setfont.
>
> Another post-2.6.23 regression. Possible culprits cc'ed?
2.6.24-rc2: Reported regressions from 2.6.23
This message contains a list of some regressions from 2.6.23 which have been reported since 2.6.24-rc1 was released and for which there are no fixes in the mainline that I know of. If any of them have been fixed already, please let me know.

If you know of any other unresolved regressions from 2.6.23, please let me know as well and I'll add them to the list.


Subject    : On 2.6.24-rc1-gc9927c2b BUG: unable to handle kernel paging request at virtual address 3d15b925
Submitter  : Giacomo Catenazzi <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/10/24/487
             http://bugzilla.kernel.org/show_bug.cgi?id=9246
Handled-By :
Patch      :

Subject    : Potential regression in -git15: can't resume stopped root shell?
Submitter  : Theodore Tso <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/10/20/114
             http://bugzilla.kernel.org/show_bug.cgi?id=9247
Handled-By : Serge Hallyn <[EMAIL PROTECTED]>
Patch      : http://bugzilla.kernel.org/attachment.cgi?id=13361&action=view
             http://bugzilla.kernel.org/attachment.cgi?id=13375&action=view

Subject    : irq 21: nobody cared 2.6.24-rc1
Submitter  : Bongani Hlope <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/10/25/90
             http://bugzilla.kernel.org/show_bug.cgi?id=9249
Handled-By :
Patch      :

Subject    : [BUG] panic after umount (biscted)
Submitter  : Sebastian Siewior <[EMAIL PROTECTED]>
References : http://marc.info/?l=linux-kernel&m=119338387030335&w=2
             http://bugzilla.kernel.org/show_bug.cgi?id=9250
Handled-By : Jens Axboe <[EMAIL PROTECTED]>
Patch      : http://marc.info/?l=linux-kernel&m=119348520210349&w=2

Subject    : 2.6.24-rc1 sysctl table check failed on PowerMac
Submitter  : Mikael Pettersson <[EMAIL PROTECTED]>
References : http://marc.info/?l=linux-kernel&m=119350802331857&w=2
             http://bugzilla.kernel.org/show_bug.cgi?id=9251
Handled-By : Alexey Dobriyan <[EMAIL PROTECTED]>
Patch      : http://marc.info/?l=linux-kernel&m=119351015801660&w=2

Subject    : 2.6.24-rc1: pata_acpi fails to activate DMA for DVD-ROM on ALi M5229 secondary channel
Submitter  : Andrey Borzenkov <[EMAIL PROTECTED]>
References : http://marc.info/?l=linux-kernel&m=119342005216716&w=2
             http://bugzilla.kernel.org/show_bug.cgi?id=9252
Handled-By : Alan Cox <[EMAIL PROTECTED]>
Patch      :
Note       : pata_acpi was not present in 2.6.23

Subject    : 2.6.24-rc1 freezes on powerbook at first boot stage
Submitter  : Elimar Riesebieter <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/10/24/205
             http://bugzilla.kernel.org/show_bug.cgi?id=9254
Handled-By :
Patch      :

Subject    : build #286 failed for 2.6.24-rc1-gea45d15 in linux/arch/x86/kernel/setup_32.c
Submitter  : Toralf Förster <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/10/28/110
             http://bugzilla.kernel.org/show_bug.cgi?id=9256
Handled-By : "H. Peter Anvin" <[EMAIL PROTECTED]>
Patch      : http://marc.info/[EMAIL PROTECTED]

Subject    : 2.6.24-rc1 kills onboard r8169 (rtl8111b) NIC
Submitter  : "Sergey S. Kostyliov" <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/10/28/144
             http://bugzilla.kernel.org/show_bug.cgi?id=9257
Handled-By : Francois Romieu <[EMAIL PROTECTED]>
Patch      : http://bugzilla.kernel.org/attachment.cgi?id=13441&action=view

Subject    : Commit "Hibernation: Enter platform hibernation state in a consistent way)" makes my system to resume instantly from S4
Submitter  : Maxim Levitsky <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/10/27/66
             http://bugzilla.kernel.org/show_bug.cgi?id=9258
Handled-By : "Rafael J. Wysocki" <[EMAIL PROTECTED]>
Patch      :
Note       : $subject commit apparently exposes a problem that existed previously

Subject    : leds: ledtrig-timer calls sleeping function from invalid context
Submitter  : Márton Németh <[EMAIL PROTECTED]>
References : http://bugzilla.kernel.org/show_bug.cgi?id=9264
Handled-By :
Patch      :

Subject    : Device mapper regression 2.6.23 vs. v2.6.23-6597-gcfa76f0
Submitter  : Thomas Meyer <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/10/21/153
             http://bugzilla.kernel.org/show_bug.cgi?id=9280
Handled-By :
Patch      :

Subject    : [2.6.24-rc1][BUG] Oops on battery removal
Submitter  : Rolf Eike Beer <[EMAIL PROTECTED]>
References : http://lkml.org/lkml/2007/11/2/23
             http://bugzilla.kernel.org/show_bug.cgi?id=9283
Handled-By : Alexey Starikovskiy <[EMAIL PROTECTED]>
Patch      : http://lkml.org/lkml/2007/1
Re: Fwd: same problem with 2.6.24-rc2
On Wed, 07 Nov 2007 21:32:43 -0300 (GFT) werner wrote: > On 7/Nov/2007 20:10 werner wrote .. > > With 2.6.23-rc2 is the same problem: it crashed at the beginning: EIP 060 > > c03fdea4 > > EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200 > > Again during the compilation was reclaimed that > > /arch/x86/Makefile.o > > cannot be found and were certain dependencies on it not made, such a file > > isn't > > present in the source code (present are, f.ex. Makefile_32 , Makefile_64 ), > > nor > > was generated automaticaly during compilation, I think this is incorrect > > and the > > reason for the problems Hi, Please provide the complete build log (with V=1 if possible) for the missing Makefile.o problem. E.g.: make V=1 all >build.log 2>&1 Make sure that build.log contains the error message and then send the complete build.log file to us at linux-kernel@vger.kernel.org . > > wl > > [EMAIL PROTECTED] > > = > > On 7/Nov/2007 16:14 Andrew Morton wrote .. > > > > On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[EMAIL PROTECTED]> > > > > wrote: > > > > I really don't know what's happening. I don't understand nothing about > > > > the > > kernel > > > error reporting system. Because of this, always when there is a > > > problem, I > > report > > > it via e-mail to linux-kernel@vger.kernel.org . I don't know what > > > people there > > > do with my messages. > > > > > > > > > It went like this: > > > > > > 1: you sent an email to linux-kernel > > > > > > 2: I sent a reply to you and linux-kernel > > > > > > 3: you sent a reply to me, but NOT linux-kernel! > > > > > > In other words, you did "reply", not "reply to all", thus you removed > > > three > > > thousand people from the discussion. One of those people is the person > > > who > > > created the bug which you're hitting, and that person no longer knows > > > what's happening. > > > > > > > > > So please go back and resend all those emails, and retain ALL Cc:'s. > > > Don't > > > just send them only to me. 
Keep all indivisuals and all mailing lists on > > > the email Cc: list. --- ~Randy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?
On Thu, Nov 08, 2007 at 01:31:00AM +0100, Andi Kleen wrote: > On Thursday 08 November 2007 01:20, Matt Mackall wrote: > > On Wed, Nov 07, 2007 at 12:30:45PM -0800, Andrew Morton wrote: > > > Ow. Yes, from my reading delay_tsc() can return early (or after > > > heat-death-of-the-universe) if the TSCs are offset and if preemption > > > migrates the calling task between CPUs. > > > > > > I suppose a lameo fix would be to disable preemption in delay_tsc(). > > > > preempt_disable is lousy documentation here. This and other cases > > (lots of per_cpu users, IIRC) actually want a migrate_disable() which > > is a proper subset. We can simply implement migrate_disable() as > > preempt_disable() for now and come back later and implement a proper > > migrate_disable() that still allows preemption (and thus avoids the > > latency). > > We could actually do this right now. migrate_disable() can be just changing > the cpu affinity of the current thread to current cpu and then restoring it > afterwards. That should even work from interrupt context. Yes, that's one way. But we need somewhere to stash the old flags. Expanding the task struct sucks. Jamming another bit in the preempt count sucks. But I think we'd be best off stashing a single bit somewhere and checking it at migrate time (relatively infrequent) rather than copying and zeroing out a potentially enormous affinity mask every time we disable migration (often, and in fast paths). Perhaps adding TASK_PINNED to the task state flags would do it? > get_cpu() etc. could be changed to use this then too. Some users of get_cpu might be relying on it to avoid actual preemption. In other words, we should have introduced a migrate_disable() when we first discovered the preempt/per_cpu conflict. -- Mathematics is the supreme nostalgia of our time. 
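The trade-off described above (stash a single bit and check it at migrate time, rather than saving and restoring a whole affinity mask) can be sketched in user space. This is only a rough model under stated assumptions: `task_model`, `TASK_PINNED_MODEL`, `migrate_disable_model()`, `migrate_enable_model()` and `try_migrate()` are hypothetical names, not existing kernel interfaces, and nesting is not handled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit for illustration; not a real kernel task flag. */
#define TASK_PINNED_MODEL 0x1u

struct task_model {
    unsigned int flags; /* state bits, including the pinned bit */
    int cpu;            /* CPU the task currently runs on */
};

/* Fast path: setting one bit, no affinity mask is copied or cleared. */
static void migrate_disable_model(struct task_model *t)
{
    assert(!(t->flags & TASK_PINNED_MODEL)); /* this sketch does not nest */
    t->flags |= TASK_PINNED_MODEL;
}

static void migrate_enable_model(struct task_model *t)
{
    assert(t->flags & TASK_PINNED_MODEL);
    t->flags &= ~TASK_PINNED_MODEL;
}

/* Slow path (relatively infrequent): the migration code pays for the check. */
static bool try_migrate(struct task_model *t, int dest_cpu)
{
    if (t->flags & TASK_PINNED_MODEL)
        return false; /* task is pinned, leave it where it is */
    t->cpu = dest_cpu;
    return true;
}
```

The point of the design is that the cost moves from the frequent disable/enable pair to the comparatively rare migration attempt. Preemption is not modeled at all here.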
Re: [PATCH][VIRTIO] Fix vring_init() ring computations
Rusty Russell wrote:
> On Wednesday 07 November 2007 13:52:29 Anthony Liguori wrote:
> > This patch fixes a typo in vring_init().
>
> Thanks, applied. I've put it in the new, experimental virtio git tree on
> git.kernel.org.

Hrm, perhaps you forgot to push? I don't see it in the tree although I see the config ops refactoring.

Regards,
Anthony Liguori

> Cheers,
> Rusty.
[Patch] Allocate sparse vmemmap block above 4G
Resend the patch for more people to review On some single node x64 system with huge amount of physical memory e.g > 64G. the memmap size maybe very big. If the memmap is allocated from low pages, it may occupies too much memory below 4G. then swiotlb could fail to reserve bounce buffer under 4G which will lead to boot failure. This patch will first try to allocate memmap memory above 4G in sparse vmemmap code. If it failed, it will allocate memmap above MAX_DMA_ADDRESS. This patch is against 2.6.24-rc1-git14 Signed-off-by: Zou Nan hai <[EMAIL PROTECTED]> Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]> diff -Nraup a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c --- a/arch/x86/mm/init_64.c 2007-11-06 15:16:12.0 +0800 +++ b/arch/x86/mm/init_64.c 2007-11-06 15:55:50.0 +0800 @@ -448,6 +448,13 @@ void online_page(struct page *page) num_physpages++; } +void * __meminit alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size, +unsigned long align) +{ +return __alloc_bootmem_core(pgdat->bdata, size, +align, (4UL*1024*1024*1024), 0, 1); +} + #ifdef CONFIG_MEMORY_HOTPLUG /* * Memory is added always to NORMAL zone. 
This means you will never get diff -Nraup a/include/linux/bootmem.h b/include/linux/bootmem.h --- a/include/linux/bootmem.h 2007-11-06 16:06:31.0 +0800 +++ b/include/linux/bootmem.h 2007-11-06 15:50:36.0 +0800 @@ -61,6 +61,10 @@ extern void *__alloc_bootmem_core(struct unsigned long limit, int strict_goal); +extern void *alloc_bootmem_high_node(pg_data_t *pgdat, +unsigned long size, +unsigned long align); + #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE extern void reserve_bootmem(unsigned long addr, unsigned long size); #define alloc_bootmem(x) \ diff -Nraup a/mm/bootmem.c b/mm/bootmem.c --- a/mm/bootmem.c 2007-11-06 16:06:31.0 +0800 +++ b/mm/bootmem.c 2007-11-06 15:49:20.0 +0800 @@ -492,3 +492,11 @@ void * __init __alloc_bootmem_low_node(p return __alloc_bootmem_core(pgdat->bdata, size, align, goal, ARCH_LOW_ADDRESS_LIMIT, 0); } + +__attribute__((weak)) __meminit +void *alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size, +unsigned long align) +{ +return NULL; +} + diff -Nraup a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c --- a/mm/sparse-vmemmap.c 2007-11-06 15:16:12.0 +0800 +++ b/mm/sparse-vmemmap.c 2007-11-06 16:08:52.0 +0800 @@ -43,9 +43,13 @@ void * __meminit vmemmap_alloc_block(uns if (page) return page_address(page); return NULL; - } else + } else { + void *p = alloc_bootmem_high_node(NODE_DATA(node), size, size); + if (p) + return p; return __alloc_bootmem_node(NODE_DATA(node), size, size, __pa(MAX_DMA_ADDRESS)); + } } void __meminit vmemmap_verify(pte_t *pte, int node, - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[Patch] Add strict_goal parameter to __alloc_bootmem_core
Resend the patch for more people to review. If __alloc_bootmem_core was given a goal, it will first try to allocate memory above that goal. If failed, it will try from the low pages. Sometimes we don't want this behavior, we want the goal to be strict. This patch introduce a strict_goal parameter to __alloc_bootmem_core, If strict_goal is set, __alloc_bootmem_core will return NULL to indicate it can't allocate memory above that goal. Note we do not scan from last_success if strict_goal is set, it will scan from the beginning of the goal instead We skip this optimization to keep the code simple because strict_goal is not supposed to be used in hot path. Signed-off-by: Zou Nan hai <[EMAIL PROTECTED]> Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]> diff -Nraup a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c --- a/arch/x86/mm/numa_64.c 2007-10-24 11:50:57.0 +0800 +++ b/arch/x86/mm/numa_64.c 2007-11-07 13:06:50.0 +0800 @@ -247,7 +247,7 @@ void __init setup_node_zones(int nodeid) __alloc_bootmem_core(NODE_DATA(nodeid)->bdata, memmapsize, SMP_CACHE_BYTES, round_down(limit - memmapsize, PAGE_SIZE), - limit); + limit, 1); #endif } diff -Nraup a/include/linux/bootmem.h b/include/linux/bootmem.h --- a/include/linux/bootmem.h 2007-11-07 13:06:35.0 +0800 +++ b/include/linux/bootmem.h 2007-11-07 13:06:04.0 +0800 @@ -58,7 +58,8 @@ extern void *__alloc_bootmem_core(struct unsigned long size, unsigned long align, unsigned long goal, - unsigned long limit); + unsigned long limit, + int strict_goal); #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE extern void reserve_bootmem(unsigned long addr, unsigned long size); diff -Nraup a/mm/bootmem.c b/mm/bootmem.c --- a/mm/bootmem.c 2007-11-07 13:06:35.0 +0800 +++ b/mm/bootmem.c 2007-11-07 13:06:18.0 +0800 @@ -179,7 +179,7 @@ static void __init free_bootmem_core(boo */ void * __init __alloc_bootmem_core(struct bootmem_data *bdata, unsigned long size, - unsigned long align, unsigned long goal, unsigned long limit) + unsigned long align, unsigned long 
goal, unsigned long limit, int strict_goal) { unsigned long offset, remaining_size, areasize, preferred; unsigned long i, start = 0, incr, eidx, end_pfn; @@ -212,15 +212,20 @@ __alloc_bootmem_core(struct bootmem_data /* * We try to allocate bootmem pages above 'goal' * first, then we try to allocate lower pages. -*/ - if (goal && goal >= bdata->node_boot_start && PFN_DOWN(goal) < end_pfn) { - preferred = goal - bdata->node_boot_start; +* if the goal is not strict. + */ + + preferred = 0; + if (goal) { + if (goal >= bdata->node_boot_start && PFN_DOWN(goal) < end_pfn) { + preferred = goal - bdata->node_boot_start; if (bdata->last_success >= preferred) - if (!limit || (limit && limit > bdata->last_success)) + if (!strict_goal && (!limit || (limit && limit > bdata->last_success))) preferred = bdata->last_success; - } else - preferred = 0; + } else if (strict_goal) +return NULL; + } preferred = PFN_DOWN(ALIGN(preferred, align)) + offset; areasize = (size + PAGE_SIZE-1) / PAGE_SIZE; @@ -247,7 +252,7 @@ restart_scan: i = ALIGN(j, incr); } - if (preferred > offset) { + if (preferred > offset && !strict_goal) { preferred = offset; goto restart_scan; } @@ -421,7 +426,7 @@ void * __init __alloc_bootmem_nopanic(un void *ptr; list_for_each_entry(bdata, &bdata_list, list) { - ptr = __alloc_bootmem_core(bdata, size, align, goal, 0); + ptr = __alloc_bootmem_core(bdata, size, align, goal, 0, 0); if (ptr) return ptr; } @@ -449,7 +454,7 @@ void * __init __alloc_bootmem_node(pg_da { void *ptr; - ptr = __alloc_bootmem_core(pgdat->bdata, size, align, goal, 0); + ptr = __alloc_bootmem_core(pgdat->bdata, size, align, goal, 0, 0); if (ptr) return ptr; @@ -468,7 +473,7 @@ void * __init __alloc_bootmem_low(unsign list_for_each_entry(bdata, &bdata_list, list) { ptr = __alloc_bootmem_core(bdata, size, align, goal, - ARCH_LOW_ADDRESS_LIMIT); + ARCH_LOW_ADDRESS_LIMIT, 0);
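The behaviour the patch description asks for can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the bootmem code: memory is a plain array of "used" bytes, one per page, and `scan_from()` and `alloc_model()` are hypothetical helpers. With `strict_goal` clear the allocator falls back to low pages; with it set, failure above the goal is reported as -1 (standing in for NULL).

```c
#include <assert.h>
#include <stddef.h>

/* Find and claim 'count' consecutive free pages at or after 'start'.
 * Returns the first page index, or -1 if no such run exists. */
static long scan_from(unsigned char *used, size_t npages,
                      size_t start, size_t count)
{
    for (size_t i = start; i + count <= npages; i++) {
        size_t j = 0;

        while (j < count && !used[i + j])
            j++;
        if (j == count) {
            for (j = 0; j < count; j++)
                used[i + j] = 1;
            return (long)i;
        }
    }
    return -1;
}

/* Model of an allocation with a goal and an optional strict_goal. */
static long alloc_model(unsigned char *used, size_t npages, size_t count,
                        size_t goal, int strict_goal)
{
    long page;

    assert(count > 0);
    if (goal >= npages) {
        /* Goal lies outside this node: a strict caller gets nothing. */
        if (strict_goal)
            return -1;
        goal = 0;
    }
    page = scan_from(used, npages, goal, count);
    /* Non-strict callers retry from the low pages, as before the patch. */
    if (page < 0 && goal > 0 && !strict_goal)
        page = scan_from(used, npages, 0, count);
    return page;
}
```

The model ignores alignment, limits and the last_success shortcut; it only shows where the strict flag suppresses the fallback scan.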
Re: compat_sys_times() bogus until jiffies >= 0.
Andrew Morton <[EMAIL PROTECTED]> writes: > diff -puN kernel/compat.c~a kernel/compat.c > --- a/kernel/compat.c~a > +++ a/kernel/compat.c > @@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct > if (copy_to_user(tbuf, &tmp, sizeof(tmp))) > return -EFAULT; > } > - return compat_jiffies_to_clock_t(jiffies); > + return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) & > + LONG_MAX); Are you sure you want LONG_MAX here, not 0x7fff? Andreas. -- Andreas Schwab, SuSE Labs, [EMAIL PROTECTED] SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
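The concern behind the question above can be shown with plain integer arithmetic. A minimal sketch, assuming a 64-bit kernel serving a 32-bit caller: `INT64_MAX` stands in for the kernel's LONG_MAX, and the literal `0x7fffffff` is an assumption about what a 32-bit-safe mask would look like (the constant in the archived message appears shortened). All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask as a native 64-bit LONG_MAX would apply it (modeled by INT64_MAX). */
static uint64_t mask_native(uint64_t ticks)
{
    return ticks & (uint64_t)INT64_MAX;
}

/* Mask that keeps the result within a positive 32-bit signed range. */
static uint64_t mask_compat(uint64_t ticks)
{
    return ticks & UINT64_C(0x7fffffff);
}

/* A 32-bit caller only sees the low 32 bits; bit 31 is its sign bit. */
static int negative_for_compat(uint64_t v)
{
    return (int)((v >> 31) & 1);
}

/* Example: a "negative" tick count such as jiffies shortly after boot. */
static void mask_demo(void)
{
    uint64_t ticks = UINT64_C(0xffffffffffff0000);

    assert(negative_for_compat(mask_native(ticks)));  /* still negative */
    assert(!negative_for_compat(mask_compat(ticks))); /* now positive */
}
```

In this model the 64-bit mask clears only bit 63, so the value truncated for the 32-bit caller can still have its sign bit set.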
Re: compat_sys_times() bogus until jiffies >= 0.
From: Andrew Morton <[EMAIL PROTECTED]> Date: Wed, 7 Nov 2007 15:28:33 -0800 > Perhaps this is a bug in glibc: it is interpreting the times() return value > in the same way as other syscalls. The problem is more likely that we are failing to invoke force_successful_syscall_return() here. Otherwise the syscall return path interprets negative values as errors, and sets the cpu condition codes. And that is what userspace is actually checking for to determine if there is an error or not.
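The effect described above can be modeled in a few lines. This is a sketch under stated assumptions, not any architecture's real return path: a return value in the range [-4095, -1] is flagged as an error unless the handler forced success. The bound mirrors the kernel's MAX_ERRNO, but the exact range checked differs between architectures, and `sysret`, `syscall_return_model()` and the `force_success` parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ERRNO_MODEL 4095L /* assumed error range, see note above */

struct sysret {
    long value; /* what user space receives */
    bool error; /* models the condition code / error indication */
};

/* Model of the syscall return path deciding whether 'ret' is an error. */
static struct sysret syscall_return_model(long ret, bool force_success)
{
    struct sysret r;

    r.error = !force_success && ret < 0 && ret >= -MAX_ERRNO_MODEL;
    r.value = r.error ? -ret : ret; /* errors are reported as a positive errno */
    return r;
}

/* Example: a small negative tick count returned by a times()-like call. */
static void sysret_demo(void)
{
    assert(syscall_return_model(-300, false).error); /* misread as an error */
    assert(!syscall_return_model(-300, true).error); /* passed through */
}
```

With the flag set, the negative tick count reaches the caller unchanged instead of being turned into an errno.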
Re: writeout stalls in current -git
On Wed, Nov 07, 2007 at 08:15:06AM +0100, Torsten Kaiser wrote: > On 11/7/07, David Chinner <[EMAIL PROTECTED]> wrote: > > Ok, so it's not synchronous writes that we are doing - we're just > > submitting bio's tagged as WRITE_SYNC to get the I/O issued quickly. > > The "synchronous" nature appears to be coming from higher level > > locking when reclaiming inodes (on the flush lock). It appears that > > inode write clustering is failing completely so we are writing the > > same block multiple times i.e. once for each inode in the cluster we > > have to write. > > Works for me. The only remaining stalls are sub second and look > completely valid, considering the amount of files being removed. > Tested-by: Torsten Kaiser <[EMAIL PROTECTED]> Great - thanks for reporting the problem and testing the fix. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
Re: 2.6.34-rc1 eat my photo SD card :-(
On Wednesday, 7 of November 2007, Romano Giannetti wrote: > > On Tue, 2007-11-06 at 23:17 +0100, Romano Giannetti wrote: > > Well, I started bisecting it. It will be a long shot, I suspect... > > Well, I spent the last 36 hours (more or less) trying to bisect the SD > problem. The method I used was to insert the card, umount it, and make 8 dd > in a row; the kernel is "bad" if they differs, "good" if they are the same. > > I could not finish the bisect. The last pair good/bad were: > > bad: [7aeacf982203fb4dea2f3434eefdc268cfd5d6d9] >[BLOCK] blk_rq_map_sg: force clear termination bit > good: [e38f981758118d829cd40cfe9c09e3fa81e422aa] >exportfs: update documentation > > The problem to conclude the bisect is that there is a whole series of > commits, named [SG] something, that seems to matter; but my three try of a > commit between the previous two ended with a MMC layer not working with this > oops: Can you please update the Bugzilla entry at http://bugzilla.kernel.org/show_bug.cgi?id=9286 with this information? 
> [ 81.738991] BUG: unable to handle kernel NULL pointer dereference at > virtual address > [ 81.739003] printing eip: c01db437 *pde = > [ 81.739010] Oops: [#1] SMP > [ 81.739016] Modules linked in: mmc_block binfmt_misc rfcomm l2cap > bluetooth ppdev i915 drm acpi_cpufreq cpufreq_conservative cpufreq_stats > cpufreq_ondemand freq_table cpufreq_userspace cpufreq_powersave dock > container sbs sbshc af_packet nls_iso8859_1 nls_cp437 vfat fat nls_utf8 ntfs > dm_crypt dm_mod sbp2 parport_pc lp parport fuse snd_hda_intel snd_pcm_oss > snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss iTCO_wdt iTCO_vendor_support > serio_raw sdhci snd_seq_midi snd_rawmidi snd_seq_midi_event psmouse pcspkr > mmc_core snd_seq snd_timer snd_seq_device snd soundcore video output battery > snd_page_alloc ac button intel_agp agpgart evdev ext3 jbd mbcache sg sr_mod > cdrom sd_mod ata_piix ehci_hcd ata_generic ohci1394 uhci_hcd ieee1394 libata > scsi_mod generic usbcore r8169 thermal processor fan > [ 81.739122] > [ 81.739127] Pid: 6075, comm: mmcqd Not tainted (2.6.23-bisect #19) > [ 81.739132] EIP: 0060:[] EFLAGS: 00010246 CPU: 0 > [ 81.739141] EIP is at blk_rq_map_sg+0xd7/0x190 > [ 81.739145] EAX: 03619000 EBX: ECX: c3464198 EDX: c3464698 > [ 81.739150] ESI: 0361a000 EDI: 1000 EBP: cb82fe24 ESP: cb82fdec > [ 81.739154] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068 > [ 81.739159] Process mmcqd (pid: 6075, ti=cb82e000 task=cb2a5550 > task.ti=cb82e000) > [ 81.739163] Stack: 0292 c366c530 cb839a70 2000 0361b000 c3464698 > 0001 0001 > [ 81.739176] c34e0848 01ae4698 c33ef2b0 c33ef2b0 cb2ec870 > cb82fe3c f8e81e6c > [ 81.739188]00200200 c3342580 c33ef2b0 cb2ec870 cb82ffb8 f8e816f9 > 7898775f 5f6f5965 > [ 81.739200] Call Trace: > [ 81.739204] [] show_trace_log_lvl+0x1a/0x30 > [ 81.739213] [] show_stack_log_lvl+0xb1/0xe0 > [ 81.739220] [] show_registers+0xc1/0x1d0 > [ 81.739226] [] die+0x11a/0x230 > [ 81.739232] [] do_page_fault+0x269/0x5f0 > [ 81.739239] [] error_code+0x72/0x78 > [ 81.739247] [] 
mmc_queue_map_sg+0x2c/0xe0 [mmc_block] > [ 81.739258] [] mmc_blk_issue_rq+0x199/0x750 [mmc_block] > [ 81.739267] [] mmc_queue_thread+0x80/0xf0 [mmc_block] > [ 81.739275] [] kthread+0x42/0x70 > [ 81.739282] [] kernel_thread_helper+0x7/0x10 > [ 81.739289] === > [ 81.739292] Code: f0 89 45 d8 8b 01 2b 05 80 aa 67 c0 c1 f8 02 69 c0 c5 4e > ec c4 c1 e0 0c 03 41 08 39 45 d8 0f 84 8e 00 00 00 f6 03 02 74 52 31 db <8b> > 03 c7 43 0c 00 00 00 00 c7 43 08 00 00 00 00 83 e0 03 0b 01 > [ 81.739358] EIP: [] blk_rq_map_sg+0xd7/0x190 SS:ESP 0068:cb82fdec > > It seems to me that the two commits: > > [BLOCK] blk_rq_map_sg: force clear termination bit > [BLOCK] Don't clear sg_dma_len/addr() in blk_rq_map_sg() > > have the potential to fix the aforementioned oops, but in a way that create > for the mmc layer the problem reported. It's just gut feeling, I have not > the knowledge of the kernel needed to debug this, but this comment: > > + * If the driver previously mapped a shorter > + * list, we could see a termination bit > + * prematurely unless it fully inits the sg > + * table on each mapping. We KNOW that there > + * must be more entries here or the driver > + * would be buggy, so force clear the > + * termination bit to avoid doing a full > + * sg_init_table() in drivers for each command. > + */ > > rang a bell. When the bug occurs, it seems that some random page is mapped > into the device, so that... maybe the list was not supposed to continue in > this case? > > Well, I hope it can helps someone to find the bug. I am available to > test/try whatever patches you send me. > >Romano > > Complete git bisect log: > > git-bisect start > # bad: [2655e2cee2d77459fcb7e10228259e4ee
Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?
On Thursday 08 November 2007 01:20, Matt Mackall wrote: > On Wed, Nov 07, 2007 at 12:30:45PM -0800, Andrew Morton wrote: > > Ow. Yes, from my reading delay_tsc() can return early (or after > > heat-death-of-the-universe) if the TSCs are offset and if preemption > > migrates the calling task between CPUs. > > > > I suppose a lameo fix would be to disable preemption in delay_tsc(). > > preempt_disable is lousy documentation here. This and other cases > (lots of per_cpu users, IIRC) actually want a migrate_disable() which > is a proper subset. We can simply implement migrate_disable() as > preempt_disable() for now and come back later and implement a proper > migrate_disable() that still allows preemption (and thus avoids the > latency). We could actually do this right now. migrate_disable() can be just changing the cpu affinity of the current thread to current cpu and then restoring it afterwards. That should even work from interrupt context. get_cpu() etc. could be changed to use this then too. -Andi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Fwd: same problem with 2.6.24-rc2
On 7/Nov/2007 20:10 werner wrote .. > With 2.6.23-rc2 is the same problem: it crashed at the beginning: EIP 060 > c03fdea4 > EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200 > Again during the compilation was reclaimed that > /arch/x86/Makefile.o > cannot be found and were certain dependencies on it not made, such a file > isn't > present in the source code (present are, f.ex. Makefile_32 , Makefile_64 ), > nor > was generated automaticaly during compilation, I think this is incorrect and > the > reason for the problems > > wl > [EMAIL PROTECTED] > = > On 7/Nov/2007 16:14 Andrew Morton wrote .. > > > On Wed, 07 Nov 2007 15:55:12 -0300 (GFT) "werner" <[EMAIL PROTECTED]> > > > wrote: > > > I really don't know what's happening. I don't understand nothing about > > > the > kernel > > error reporting system. Because of this, always when there is a problem, I > report > > it via e-mail to linux-kernel@vger.kernel.org . I don't know what people > > there > > do with my messages. > > > > > > It went like this: > > > > 1: you sent an email to linux-kernel > > > > 2: I sent a reply to you and linux-kernel > > > > 3: you sent a reply to me, but NOT linux-kernel! > > > > In other words, you did "reply", not "reply to all", thus you removed three > > thousand people from the discussion. One of those people is the person who > > created the bug which you're hitting, and that person no longer knows > > what's happening. > > > > > > So please go back and resend all those emails, and retain ALL Cc:'s. Don't > > just send them only to me. Keep all indivisuals and all mailing lists on > > the email Cc: list. > == > *** www.copaya.yi.org / www.monkey.is-a-geek.net *** > O único servidor comunitário na Guiana-Francesa. Situado no local, rápido, > imuno > contra guerras / desastres na Europa. Serviço não-comercial e gratuito de: > http > (forum, página web), irc (chat), ftp (download), name (subdomain) . 
Re: [PATCH] create /sys/.../power when CONFIG_PM is set
On Wed, Nov 07, 2007 at 11:24:55PM +0100, Rafael J. Wysocki wrote: > On Wednesday, 7 of November 2007, Daniel Drake wrote: > > The CONFIG_SUSPEND changes in 2.6.23 caused a regression under certain > > configuration conditions (SUSPEND=n, USB_AUTOSUSPEND=y) where all USB device > > attributes in sysfs (idVendor, idProduct, ...) silently disappeared, causing > > udev breakage and more. > > > > The cause of this is that the /sys/.../power subdirectory is now only > > created > > when CONFIG_PM_SLEEP is set, however, it should be created whenever > > CONFIG_PM > > is set to handle the above situation. The following patch fixes the > > regression. > > > > Signed-off-by: Daniel Drake <[EMAIL PROTECTED]> > > Acked-by: Rafael J. Wysocki <[EMAIL PROTECTED]> > > Greg, I think this patch should go through your tree? Yes, I'll take it. I'm at a conference until Friday, but will take it then and then get it to Linus before 2.6.24 is out. thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched: avoid large irq-latencies in smp-balancing
On Wed, 2007-11-07 at 17:10 -0500, Steven Rostedt wrote: > > > > It would be nice if sched_nr_migrate didn't exist, really. It's hard to > > imagine anyone wanting to tweak it, apart from developers. > > I'm not so sure about that. It is a tunable for RT. That is we can tweak > this value to be smaller if we don't like the latencies it gives us. > > This is one of those things that sacrifices performance for latency. > The higher the number, the better it can spread tasks around, but it > also causes large latencies. > > I've just included this patch into 2.6.23.1-rt11 and it brought down an > unbounded latency to just 42us. (previously we got into the > milliseconds!). > > Perhaps when this feature matures, we can come to a good defined value > that would be good for all. But until then, I recommend keeping this a > tunable. Why not use the latency-expectation infrastructure? Iterate under lock until (or before...) the system global latency is respected. - Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SATA eating my disk, port reset, destroying unrelated data
Norbert Preining wrote: Dear all! (please Cc me for answers) Since about 5 days I am having serious problems with my SATA drive: kernel 2.6.22 (from Debian/sid) hardware nv Sometimes at boot time, often/always at disk io intense stuff: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x40 action 0x2 Serror 0x40 means a handshake error. Usually Serror indications are due to a hardware problem (bad SATA cable, power or drive problem). ata1.00: (BMDMA stat 0x25) ata1.00: cmd 35/00:00:2a:6f:c0/00:04:0c:00:00/e0 tag 0 cdb 0x0 data 524288 out res 51/84:10:1a:72:c0/84:01:0c:00:00/e0 Emask 0x10 (ATA bus error) ata1: soft resetting port ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) ata1.00: configured for UDMA/133 ata1: EH complete Even worse, sometimes the reset does not work ... ata1: device not ready (errno=-16), forcing hardreset ata1: hard resetting port ata1 SRST failed (errno=-19) ata1: reset failed (errno=-19), retrying in 10 secs .. (typed from a digital photo, nothing remains in the logs) After this I need to do a cold boot otherwise the drive is really in a bad state and not even the bios gets it right. If even the BIOS cannot reset properly then that also really points to a hardware problem.. Interestingly the whole stuff DID work for a long time until I did too many things at the same time: 2 x svn up, copying 40G from the SATA drive to an USB drive, aptitude upgrade. Before I did regularly the same stuff (like svn up etc), but this time it was too much, it seems. Apropos data hosing: After the first incident some data on my windows partitions (/dev/sda1) was hosed, programs missing, chkdisk necessary etc. I attach dmesg (from the current boot with a succeeding soft reset, I interrupted the svn process before the SATA drives goes into hard reset failures), .config, lspci -v output. Are there any chances that using 2.6.23 will improve/fix this? Any other suggestions? 
I would consider it a hardware problem, but since it started with one big I/O burst and has persisted since then, I am a bit sceptical. -- Robert Hancock Saskatoon, SK, Canada To email, remove "nospam" from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/
Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10
On Thursday 08 November 2007, Denys Fedoryshchenko wrote: > 2.6.24-rc2 not working very well > > > dmesg > [ 12.386395] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 > [ 12.405579] ide: Assuming 33MHz system bus speed for PIO modes; override > with idebus=xx > [ 12.430441] SC1200: IDE controller (0x100b:0x0502 rev 0x01) at PCI slot > :00:12.2 > [ 12.454070] SC1200: not 100% native mode: will probe irqs later > [ 12.471947] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, > hdb:pio > [ 12.493873] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, > hdd:pio > [ 12.515810] Probing IDE interface ide0... > [ 12.528810] Clocksource tsc unstable (delta = -497423729 ns) > [ 12.545888] Time: pit clocksource has been installed. > [ 12.563379] hda: SanDisk SDCFH-1024, CFA DISK drive > [ 12.578340] hda: applying conservative PIO "downgrade" > [ 12.593869] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO1 > [ 12.594006] hda: MW DMA 2 mode selected > [ 12.594297] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 > [ 12.608778] Probing IDE interface ide1... > [ 12.623192] hda: max request size: 128KiB > [ 12.635322] hda: 2001888 sectors (1024 MB) w/1KiB Cache, CHS=1986/16/63, > DMA > [ 12.657134] hda:<4>hda: dma_timer_expiry: dma status == 0x21 > [ 12.865846] hda: DMA timeout error > [ 12.876092] ide_dma_end dma_stat=21 err=1 newerr=0 > [ 12.890753] hda: dma timeout error: status=0x58 { DriveReady SeekComplete > DataRequest } > [ 12.914977] ide: failed opcode was: unknown > [ 12.927743] hda: DMA disabled > [ 12.937035] ide0: reset: success > [ 12.948324] hda1 > > Mounting taking long time on 1GB card cause of DMA issues. In dmesg i am not > sure about timestamp showing few seconds, in real life it took about 2 > minutes. Please try booting with "hda=nodma". It could be a hardware problem (CF adapter without DMA lines). 
Thanks, Bart
Re: is minimum udelay() not respected in preemptible SMP kernel-2.6.23?
On Wed, Nov 07, 2007 at 12:30:45PM -0800, Andrew Morton wrote: > Ow. Yes, from my reading delay_tsc() can return early (or after > heat-death-of-the-universe) if the TSCs are offset and if preemption > migrates the calling task between CPUs. > > I suppose a lameo fix would be to disable preemption in delay_tsc(). preempt_disable is lousy documentation here. This and other cases (lots of per_cpu users, IIRC) actually want a migrate_disable() which is a proper subset. We can simply implement migrate_disable() as preempt_disable() for now and come back later and implement a proper migrate_disable() that still allows preemption (and thus avoids the latency). -- Mathematics is the supreme nostalgia of our time. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
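The early-return case discussed in this thread can be reproduced with a toy model. This is only a sketch under stated assumptions: two simulated CPUs whose TSCs differ by a fixed offset, a task that is moved to CPU 1 at a chosen moment, and a `pinned` field that stands in for "migration disabled". `struct sim`, `read_tsc()` and `delay_tsc_model()` are hypothetical and do not reflect the real delay_tsc() code.

```c
#include <assert.h>
#include <stdint.h>

struct sim {
    uint64_t now;        /* true elapsed cycles */
    int64_t offset[2];   /* per-CPU TSC offset relative to true time */
    int cpu;             /* CPU the task currently runs on */
    uint64_t migrate_at; /* true time at which the task moves to CPU 1 */
    int pinned;          /* nonzero: the task may not be migrated */
};

/* Read the TSC of whichever CPU the task is running on right now. */
static uint64_t read_tsc(const struct sim *s)
{
    return (uint64_t)((int64_t)s->now + s->offset[s->cpu]);
}

/* Busy-wait for 'loops' TSC ticks; returns the true cycles actually waited. */
static uint64_t delay_tsc_model(struct sim *s, uint64_t loops)
{
    uint64_t start_true = s->now;
    uint64_t bclock = read_tsc(s);

    assert(loops > 0);
    for (;;) {
        if (!s->pinned && s->now >= s->migrate_at)
            s->cpu = 1; /* preemption migrated the task */
        if (read_tsc(s) - bclock >= loops)
            break;
        s->now++;
    }
    return s->now - start_true;
}
```

With CPU 1 running 900 ticks ahead and a migration 100 cycles into a 1000-tick delay, the unpinned task in this model returns after only 100 true cycles, while the pinned one waits the full 1000.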
Re: [RFC] [PATCH 3/3] Recursive mtime for ext3
On Wed, Nov 07, 2007 at 03:36:05PM +0100, Jan Kara wrote:
> > What if more than one application wants to use this facility?
>
> That should be fine - let's see: Each application keeps somewhere a time when
> it started a scan of a subtree (or it can actually remember a time when it
> set the flag for each directory), during the scan, it sets the flag on
> each directory. When it wakes up to recheck the subtree it just compares
> the rtime against the stored time - if rtime is greater, subtree has been
> modified since the last scan and we recurse in it and when we are finished
> with it we set the flag. Now notice that we don't care about the flag when
> we check for changes - we care only for rtime - so if there are several
> applications interested in the same subtree, the flag just gets set more
> often and thus the update of rtime happens more often but the same scheme
> still works fine.

OK, so in this case you don't need to set rtime on every single file
inode, but only on the directory inodes, right? Because you're only
checking the rtime at the directory level, and not the flag. And it's
just as easy for you to check the rtime flag for the file's containing
directory (modulo magic vis-a-vis hard links) as the file's inode.

I'm just really wishing that rtime and the rtime flag didn't have to
live on disk, but could rather be in memory. If you only needed to save
the directory flags and rtimes, that might actually be doable. Note by
the way that since you need to own the file/directory to set flags, this
means that only programs that are running as root or running as the uid
who owns the entire subtree will be able to use this scheme. One
advantage of doing it in kernel memory is that you might be able to
support watching a tree that is not owned by the watcher.

> I don't get it here - you need to scan the whole subtree and set the flag
> only during the initial scan. Later, you need to scan and set the flag only
> for directories in whose subtree something changed. Similarly rtime needs
> to be updated for each inode at most once after the scan.

OK, so in the worst case every single file in a kernel source tree might
change after doing an extreme git checkout. That means around 36k of
files get updated. So if you have to set/clear the rtime flag during the
checkout process, 36k file inodes would have to have their rtime flag
cleared, plus 2k worth of directory inodes; but those would probably be
folded into other changes made to the inodes anyway. But then when
trackerd goes back and scans the subtree, if you are actually setting
rtime flags for every single file inode, then that's 38k inodes that
need updating. If you only need to set the rtime flags for directories,
that's only 2k worth of extra gratuitous inode updates.

						- Ted
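As a rough illustration of the scheme discussed above, the following sketch shows the scanner-side check (rtime compared against the stored scan time) and the inode-update arithmetic from the git checkout example. The struct and function names are hypothetical and are not part of the proposed ext3 patch; the 36k/2k figures are the approximate counts quoted in the message.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-directory record kept by a scanner; not an ext3 structure. */
struct watched_dir {
	time_t rtime;		/* recursive mtime reported for the directory */
	time_t last_scan;	/* when this scanner last started scanning it */
};

/* A subtree needs a rescan only if its rtime is newer than the stored time. */
static bool subtree_changed(const struct watched_dir *d)
{
	return d->rtime > d->last_scan;
}

/*
 * Number of inodes whose flag must be set again after a full rescan:
 * files plus directories if every file inode carries the flag,
 * directories only otherwise.
 */
static unsigned long flag_updates(unsigned long files, unsigned long dirs,
				  bool flag_files_too)
{
	return flag_files_too ? files + dirs : dirs;
}
```

With the numbers from the message, flag_updates(36000, 2000, true) gives 38000 updates, against 2000 when only directories carry the flag.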
Re: [PATCH] r8169 fix regression on ASUS motherboards (updated)
Mark Lord <[EMAIL PROTECTED]> :
[...]
> I've now received a couple of private emails from people reporting
> full success with this patch.

Ok, I have pushed the patch below for Jeff to pull at korg.

>From 1dd7681bc2ff171341ea5cae957f8ecb5c0c102e Mon Sep 17 00:00:00 2001
From: Mark Lord <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 01:03:04 +0100
Subject: [PATCH] r8169: revert 7da97ec96a0934319c7fbedd3d38baf533e20640 (partly)

Various symptoms depending on the .config options:
- the card stops working after some (short) time
- the card does not work at all
- the card disappears (nothing in lspci/dmesg)

A real power-off is needed to recover the card.

Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>
---
 drivers/net/r8169.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 9dbab3f..a37cf82 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1328,6 +1328,7 @@ static void rtl_hw_phy_config(struct net_device *dev)
 		break;
 	case RTL_GIGA_MAC_VER_11:
 	case RTL_GIGA_MAC_VER_12:
+		break;
 	case RTL_GIGA_MAC_VER_17:
 		rtl8168b_hw_phy_config(ioaddr);
 		break;
--
1.5.3.3

--
Ueimor
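The one-line change above works through switch fall-through: without the added break, versions 11 and 12 share the PHY configuration call with version 17. A minimal userspace sketch of the before/after behaviour follows. The enum is a local stand-in that only borrows the case label names from the patch, and the return value 1 stands in for "rtl8168b_hw_phy_config() was called"; none of this is the driver's actual code.

```c
/* Local stand-ins for the driver's MAC version constants. */
enum mac_version {
	RTL_GIGA_MAC_VER_11 = 11,
	RTL_GIGA_MAC_VER_12 = 12,
	RTL_GIGA_MAC_VER_17 = 17,
};

/* Before the patch: 11 and 12 fall through into the 17 case. */
static int phy_configured_before(enum mac_version v)
{
	int configured = 0;

	switch (v) {
	case RTL_GIGA_MAC_VER_11:
	case RTL_GIGA_MAC_VER_12:
	case RTL_GIGA_MAC_VER_17:
		configured = 1;	/* stands in for rtl8168b_hw_phy_config(ioaddr) */
		break;
	}
	return configured;
}

/* After the patch: the added break stops 11 and 12 before the call. */
static int phy_configured_after(enum mac_version v)
{
	int configured = 0;

	switch (v) {
	case RTL_GIGA_MAC_VER_11:
	case RTL_GIGA_MAC_VER_12:
		break;
	case RTL_GIGA_MAC_VER_17:
		configured = 1;	/* only version 17 is still configured */
		break;
	}
	return configured;
}
```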
Re: compat_sys_times() bogus until jiffies >= 0.
> On Wed, 7 Nov 2007 15:28:33 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:
> > compat_sys_times() has bogus return until jiffies is >= 0. I discovered
> > this running LTP within 5 minutes of booting.
> >
> > The return result
> >
> >	return compat_jiffies_to_clock_t(jiffies);
> >
> > will return '-1' to user space and set the negated clock_t value to errno.
> >
> > I'm not sure what the correct fix for this is. I can come up with a patch
> > if anyone has ideas on how to fix it.
> >
> > At minimum, perhaps it should return a sane errno value.
>
> RETURN VALUE
>        times() returns the number of clock ticks that have elapsed since an
>        arbitrary point in the past. For Linux 2.4 and earlier this point is
>        the moment the system was booted. Since Linux 2.6, this point is
>        (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot
>        time. The return value may overflow the possible range of type
>        clock_t. On error, (clock_t) -1 is returned, and errno is set
>        appropriately.
>
>
> Perhaps this is a bug in glibc: it is interpreting the times() return value
> in the same way as other syscalls.
>
> It would have been sensible for us to add INITIAL_JIFFIES to the value
> instead of exposing this kernel-only detail to the world, although the
> problem will of course reoccur once jiffies hits 0x80000000. Unfortunately
> we've even gone and enshrined this bogon in the manpage.
>
> Proposed fix:
>
> -	return compat_jiffies_to_clock_t(jiffies);
> +	return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
> +					0x7fffffff);
>
> ?

Like this? It gets messy.


From: Andrew Morton <[EMAIL PROTECTED]>

David Brown points out that compat_sys_times() (and sys_times()) can return
arbitrary 32-bit (or 64-bit) values. If these happen to be negative (jiffy
wrap, or before INITIAL_JIFFIES) then libc will interpret this as an error
and will return -1 to the libc user and will set errno.

The manpage for times(2) says:

       times() returns the number of clock ticks that have elapsed since an
       arbitrary point in the past. For Linux 2.4 and earlier this point is
       the moment the system was booted. Since Linux 2.6, this point is
       (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot
       time. The return value may overflow the possible range of type
       clock_t. On error, (clock_t) -1 is returned, and errno is set
       appropriately.

We can fix this by masking the return value down to a 31-bit (63-bit) value.

Also, let's correct for INITIAL_JIFFIES - this isn't a detail which should be
exposed to userspace.

Unfortunately this change can break userspace. If a program was (correctly)
doing:

	unsigned long start = times(...);
	...
	unsigned long end = times(...);
	unsigned long delta = end - start;

then `delta' can be grossly wrong if we wrapped in the interval. Instead
userspace will need to mask `delta' by 0x7fffffff (0x7fffffffffffffff) to get
the correct number.

But userspace was already busted in the presence of wraparound, due to
glibc's convert-to-negative-one behaviour.

Given all this stuff, the return value from sys_times() doesn't seem a
particularly useful or reliable kernel interface.

Cc: David Brown <[EMAIL PROTECTED]>
Cc: Ulrich Drepper <[EMAIL PROTECTED]>
Cc: Michael Kerrisk <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 kernel/compat.c |    3 ++-
 kernel/sys.c    |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff -puN kernel/sys.c~a kernel/sys.c
--- a/kernel/sys.c~a
+++ a/kernel/sys.c
@@ -897,7 +897,8 @@ asmlinkage long sys_times(struct tms __u
 		if (copy_to_user(tbuf, &tmp, sizeof(struct tms)))
 			return -EFAULT;
 	}
-	return (long) jiffies_64_to_clock_t(get_jiffies_64());
+	return jiffies_64_to_clock_t((get_jiffies_64() + INITIAL_JIFFIES) &
+					LONG_MAX);
 }

 /*
diff -puN kernel/compat.c~a kernel/compat.c
--- a/kernel/compat.c~a
+++ a/kernel/compat.c
@@ -162,7 +162,8 @@ asmlinkage long compat_sys_times(struct
 		if (copy_to_user(tbuf, &tmp, sizeof(tmp)))
 			return -EFAULT;
 	}
-	return compat_jiffies_to_clock_t(jiffies);
+	return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
+					LONG_MAX);
 }

 /*
_
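The userspace consequence described in the changelog can be sketched as below for the 32-bit (compat) case: once the kernel masks the tick count to 31 bits, a plain `end - start` is wrong across a wrap, while masking the difference recovers the elapsed ticks. The helper names and TICK_MASK_31 are invented for this illustration, and the INITIAL_JIFFIES offset is deliberately left out, so this models only the masking and not the kernel patch itself.

```c
#include <stdint.h>

/* 31-bit mask, as discussed for 32-bit callers (name is hypothetical). */
#define TICK_MASK_31 0x7fffffffu

/* Model of a tick count masked so it is never negative as a signed 32-bit value. */
static uint32_t masked_ticks(uint32_t raw_ticks)
{
	return raw_ticks & TICK_MASK_31;
}

/*
 * Elapsed ticks between two masked samples. Masking the difference keeps
 * the result correct across one wrap of the 31-bit counter.
 */
static uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
	return (end - start) & TICK_MASK_31;
}
```

For example, raw counts 0x7ffffff0 and 0x80000010 are 0x20 ticks apart; after masking, the samples are 0x7ffffff0 and 0x10, the unmasked difference is 0x80000020, and the masked difference is 0x20 again.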
Re: Problem with accessing namespace_sem from LSM.
Hello.

Christoph Hellwig wrote:
> Same argument as with the AA folks: it does not have any business looking
> at the vfsmount. If you create a file it can and in many setups will
> show up in multiple vfsmounts, so making decisions based on the particular
> one this creat happens through is wrong and actually dangerous.

Thus TOMOYO 1.x doesn't use LSM hooks, and AppArmor for OpenSuSE 10.3 added
a "struct vfsmount" parameter to the VFS helper functions and LSM hooks.

Not all systems use bind mounts.
There is likely only one vfsmount which corresponds with a given dentry.

What does "dangerous" mean? Does it cause a crash?

Regards.
Re: [PATCH] x86 - 32-bit ptrace emulation mishandles 6th arg
On 11/07/2007 04:12 PM, Roland McGrath wrote:
> Sure has my ACK.
> I never really understood why my old patch was not taken 2.5 years ago.
>

I forget the details, but I had to make some kind of trivial change to
make it work in some corner cases.
Re: sata NCQ blacklist entry
Tejun Heo wrote:
> Florian La Roche wrote:
>> Hello all,
>>
>> I've taken email addresses from the last NCQ blacklist changes
>> going into the kernel.
>>
>> This Fujitsu drive also gives me spurious command completions.
>> Detailed output also available at
>> https://bugzilla.redhat.com/show_bug.cgi?id=366181.
>>
>> Let me know if you need more info or anything else.
>>
>> --- drivers/ata/libata-core.c
>> +++ drivers/ata/libata-core.c
>> @@ -4222,6 +4222,7 @@
>>  	{ "WDC WD740ADFD-00NLR1", NULL, ATA_HORKAGE_NONCQ, },
>>  	{ "WDC WD3200AAJS-00RYA0", "12.01B01", ATA_HORKAGE_NONCQ, },
>>  	{ "FUJITSU MHV2080BH","00840028", ATA_HORKAGE_NONCQ, },
>> +	{ "FUJITSU MHW2160BJ G2", NULL, ATA_HORKAGE_NONCQ },
>>  	{ "ST9120822AS", "3.CLF", ATA_HORKAGE_NONCQ, },
>>  	{ "ST9160821AS", "3.CLF", ATA_HORKAGE_NONCQ, },
>>  	{ "ST9160821AS", "3.ALD", ATA_HORKAGE_NONCQ, },
>
> Thanks. We're currently trying to find out what's actually going on
> with all these drives. At first, drives which got blacklisted aren't
> many and made sense (had other problems with NCQ, etc..) but with new
> generation drives from many vendors showing the same symptom, we aren't
> too sure now. I'll keep your email in my todo list and add the drive to
> the blacklist once the problem is verified.

I agree that something seems fishy with this. It seems unlikely that
this many drives from multiple vendors would have the exact same,
relatively obscure problem..

--
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
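For readers unfamiliar with the table in the quoted patch, each entry pairs a drive model string with an optional firmware revision and a quirk flag. The sketch below shows one plausible lookup over such a table in plain userspace C. It assumes that a NULL revision matches any firmware and that strings are compared exactly; the struct, the HORKAGE_NONCQ value, and lookup_horkage() are invented for illustration and are not the libata implementation.

```c
#include <stddef.h>
#include <string.h>

#define HORKAGE_NONCQ (1u << 2)	/* hypothetical flag value for this sketch */

struct blacklist_entry {
	const char *model;
	const char *fw_rev;	/* assumed: NULL matches any firmware revision */
	unsigned int horkage;
};

/* A few entries copied from the quoted patch. */
static const struct blacklist_entry blacklist[] = {
	{ "FUJITSU MHV2080BH",    "00840028", HORKAGE_NONCQ },
	{ "FUJITSU MHW2160BJ G2", NULL,       HORKAGE_NONCQ },
	{ "ST9120822AS",          "3.CLF",    HORKAGE_NONCQ },
};

/* Return the quirk flags for a drive, or 0 if it is not listed. */
static unsigned int lookup_horkage(const char *model, const char *fw_rev)
{
	size_t i;

	for (i = 0; i < sizeof(blacklist) / sizeof(blacklist[0]); i++) {
		if (strcmp(blacklist[i].model, model) != 0)
			continue;
		if (blacklist[i].fw_rev == NULL ||
		    strcmp(blacklist[i].fw_rev, fw_rev) == 0)
			return blacklist[i].horkage;
	}
	return 0;
}
```

Under these assumptions the proposed Fujitsu entry would disable NCQ for every firmware revision of that model, whereas the Seagate entries only apply to the listed revisions.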
Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10
2.6.24-rc2 is not working very well either.

dmesg:
[ 12.386395] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
[ 12.405579] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[ 12.430441] SC1200: IDE controller (0x100b:0x0502 rev 0x01) at PCI slot :00:12.2
[ 12.454070] SC1200: not 100% native mode: will probe irqs later
[ 12.471947] ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, hdb:pio
[ 12.493873] ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, hdd:pio
[ 12.515810] Probing IDE interface ide0...
[ 12.528810] Clocksource tsc unstable (delta = -497423729 ns)
[ 12.545888] Time: pit clocksource has been installed.
[ 12.563379] hda: SanDisk SDCFH-1024, CFA DISK drive
[ 12.578340] hda: applying conservative PIO "downgrade"
[ 12.593869] hda: host max PIO4 wanted PIO255(auto-tune) selected PIO1
[ 12.594006] hda: MW DMA 2 mode selected
[ 12.594297] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[ 12.608778] Probing IDE interface ide1...
[ 12.623192] hda: max request size: 128KiB
[ 12.635322] hda: 2001888 sectors (1024 MB) w/1KiB Cache, CHS=1986/16/63, DMA
[ 12.657134] hda:<4>hda: dma_timer_expiry: dma status == 0x21
[ 12.865846] hda: DMA timeout error
[ 12.876092] ide_dma_end dma_stat=21 err=1 newerr=0
[ 12.890753] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
[ 12.914977] ide: failed opcode was: unknown
[ 12.927743] hda: DMA disabled
[ 12.937035] ide0: reset: success
[ 12.948324] hda1

Mounting takes a long time on the 1GB card because of the DMA issues. I am
not sure about the dmesg timestamps, which show only a few seconds; in real
life it took about 2 minutes.

After that, dmesg shows:
[ 14.965070] hda: dma_timer_expiry: dma status == 0x21
[ 15.107909] hda: DMA timeout error
[ 15.118149] ide_dma_end dma_stat=21 err=1 newerr=0
[ 15.132809] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
[ 15.157035] ide: failed opcode was: unknown
[ 15.169799] hda: DMA disabled
[ 15.178797] ide0: reset: success
[ 15.312698] hda: dma_timer_expiry: dma status == 0x21
[ 15.650705] hda: DMA timeout error
[ 15.660952] ide_dma_end dma_stat=21 err=1 newerr=0
[ 15.675614] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
[ 15.699836] ide: failed opcode was: unknown
[ 15.712601] hda: DMA disabled
[ 15.721603] ide0: reset: success
[ 16.325999] hda: dma_timer_expiry: dma status == 0x21
[ 16.565756] hda: DMA timeout error
[ 16.576001] ide_dma_end dma_stat=21 err=1 newerr=0
[ 16.590661] hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
[ 16.614886] ide: failed opcode was: unknown
[ 16.627651] hda: DMA disabled
[ 16.636659] ide0: reset: success
[ 16.650061] EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended

On Wed, 7 Nov 2007 18:20:45 -0500, Jeff Garzik wrote
> On Wed, Nov 07, 2007 at 02:12:55PM -0500, Mark Lord wrote:
> > That cannot be correct (??). Is this with hdparm-7.7 (latest sourceforge) ??
> > Can you show us the "hdparm --Istdout" output as well, please.
>
> If this is applicable... FWIW hdparm was only recently (in past <72
> hours) updated from 6.9 to 7.7 in Fedora...
>
> Jeff

--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.
Re: [PATCH] x86 - 32-bit ptrace emulation mishandles 6th arg
FYI, http://sourceware.org/systemtap/wiki/utrace/tests has details on the
ptrace-tests suite we're collecting. A test I added there is how I noticed
the PTRACE_GET_THREAD_AREA regression. A regression test for the ebp bug
should be easy to add too.

Thanks,
Roland
Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10
I am using Gentoo (a custom Linux build, actually only busybox + kernel +
uclibc and a few other tools); hdparm is vanilla 7.7.

I will compile -rc2 now to see if there are any changes.
With the 16MB card 2.6.24-rc1 works fine; the 1GB card also works, with
some errors in dmesg.

That is, if all of this matters at all: it is relatively old hardware, and
if this is only a hardware-specific bug, a workaround that just makes it
usable is probably enough. I don't think many people use these now, but
IMHO things that are in the kernel must work.

On Wed, 7 Nov 2007 18:20:45 -0500, Jeff Garzik wrote
> On Wed, Nov 07, 2007 at 02:12:55PM -0500, Mark Lord wrote:
> > That cannot be correct (??). Is this with hdparm-7.7 (latest sourceforge) ??
> > Can you show us the "hdparm --Istdout" output as well, please.
>
> If this is applicable... FWIW hdparm was only recently (in past <72
> hours) updated from 6.9 to 7.7 in Fedora...
>
> Jeff

--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.
Re: 2.6.24-rc1-gb4f5550 oops
On Monday, 5 of November 2007, Grant Wilson wrote:
> Hi,
> I got this oops on 2.6.24-rc1-641-gb4f5550:

(1) Is this reproducible?
(2) Did it happen previously on your system?

> [18073.371126] Unable to handle kernel NULL pointer dereference at
> 0120 RIP:
> [18073.371134] [] check_preempt_wakeup+0x6e/0x110
> [18073.371144] PGD 81f9067 PUD 81c8067 PMD 0
> [18073.371151] Oops: [1] PREEMPT SMP
> [18073.371157] CPU 2
> [18073.371161] Modules linked in: vfat fat
> [18073.371168] Pid: 4639, comm: kwin Not tainted 2.6.24-rc1 #1
> [18073.371171] RIP: 0010:[] []
> check_preempt_wakeup+0x6e/0x110
> [18073.371177] RSP: 0018:810008531a78 EFLAGS: 00010006
> [18073.371179] RAX: RBX: RCX:
>
> [18073.371183] RDX: 810004441bf0 RSI: 81000801e860 RDI:
> 81000444ab80
> [18073.371186] RBP: 810008531aa8 R08: 00d0d47a4a90 R09:
>
> [18073.371188] R10: 810004441bf0 R11: 0001 R12:
> 810006520400
> [18073.371190] R13: 81000801e860 R14: 81000a63a000 R15:
> 81000443d8e0
> [18073.371193] FS: 2b7d646a86f0() GS:810004c11780()
> knlGS:
> [18073.371196] CS: 0010 DS: ES: CR0: 8005003b
> [18073.371199] CR2: 0120 CR3: 08495000 CR4:
> 06e0
> [18073.371202] DR0: DR1: DR2:
>
> [18073.371211] DR3: DR6: 0ff0 DR7:
> 0400
> [18073.371214] Process kwin (pid: 4639, threadinfo 81000853, task
> 81000840a860)
> [18073.371216] Stack: 81000444ab80 0001 81000801e860
> 81000444ab80
> [18073.371231] 0002 81000443d8e0 810008531b38
> 8023061e
> [18073.371238] 810004441b80 0002
> 0001
> [18073.371245] Call Trace:
> [18073.371250] [] try_to_wake_up+0x2fe/0x3a0
> [18073.371253] [] default_wake_function+0xd/0x10
> [18073.371257] [] __wake_up_common+0x5a/0x90
> [18073.371260] [] __wake_up_sync+0x4a/0x70
> [18073.371264] [] unix_write_space+0x8f/0xa0
> [18073.371269] [] sock_wfree+0x49/0x50
> [18073.371272] [] __kfree_skb+0x69/0xe0
> [18073.371275] [] kfree_skb+0x17/0x30
> [18073.371278] [] unix_stream_recvmsg+0x267/0x610
> [18073.371283] [] sock_aio_read+0x107/0x110
> [18073.371287] [] do_sync_read+0xf1/0x130
> [18073.371291] [] sock_ioctl+0x0/0x260
> [18073.371295] [] autoremove_wake_function+0x0/0x40
> [18073.371299] [] unix_ioctl+0xb2/0xf0
> [18073.371302] [] sock_ioctl+0xd1/0x260
> [18073.371305] [] do_ioctl+0x31/0x90
> [18073.371308] [] vfs_read+0x156/0x160
> [18073.371311] [] sys_read+0x50/0x90
> [18073.371315] [] system_call+0x7e/0x83
> [18073.371317]
> [18073.371319]
> [18073.371319] Code: 48 8b 90 20 01 00 00 48 39 93 20 01 00 00 75 e2 48 81 3b
> 00
> [18073.371346] RIP [] check_preempt_wakeup+0x6e/0x110
> [18073.371351] RSP
> [18073.371354] CR2: 0120
> [18073.371358] note: kwin[4639] exited with preempt_count 3
Re: 2.6.34-rc1 eat my photo SD card :-(
> Well, I spent the last 36 hours (more or less) trying to bisect the SD
> problem. The method I used was to insert the card, umount it, and make 8 dd
> in a row; the kernel is "bad" if they differ, "good" if they are the same.
>
> I could not finish the bisect. The last pair good/bad were:
>
> bad: [7aeacf982203fb4dea2f3434eefdc268cfd5d6d9]
>      [BLOCK] blk_rq_map_sg: force clear termination bit
> good: [e38f981758118d829cd40cfe9c09e3fa81e422aa]
>      exportfs: update documentation

Thanks, that helps. I read over the mmc changes in between those two
commits, and I think I found the problem... could you please try the
patch below (on top of the latest kernel) and report back how it
works? Unfortunately I am traveling and I don't have an SD card with
me to test on my laptop...

Pierre, assuming Romano tests this patch successfully, please apply!

Thanks,
  Roland

<-- patch below -->

mmc: Fix sg helper copy-and-paste error

Commit 45711f1a ("[SG] Update drivers to use sg helpers") had the
following bogus change in drivers/mmc/card/queue.c:

> -	src_buf = page_address(src->page) + src->offset;
> +	src_buf = sg_virt(dst);

(Notice that "src" is converted to "dst"). Turn this "dst" back into
the intended "src".

Cc: Jens Axboe <[EMAIL PROTECTED]>
Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 9203a0b..1b9c9b6 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -310,7 +310,7 @@ static void copy_sg(struct scatterlist *dst, unsigned int dst_len,
 		}

 		if (src_size == 0) {
-			src_buf = sg_virt(dst);
+			src_buf = sg_virt(src);
 			src_size = src->length;
 		}

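To see why the one-word typo corrupts data, here is a simplified userspace model of a segment-to-segment copy in the spirit of copy_sg(). `struct seg` is a hypothetical stand-in for `struct scatterlist`, and copy_segs() is written for this illustration only; it is not the driver's function. The commented line is where the kernel code took the buffer address from the destination entry instead of the source entry.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct scatterlist. */
struct seg {
	unsigned char *buf;
	size_t length;
};

/* Copy until either list is exhausted; returns the number of bytes copied. */
static size_t copy_segs(const struct seg *dst, size_t dst_n,
			const struct seg *src, size_t src_n)
{
	unsigned char *dst_buf = NULL;
	const unsigned char *src_buf = NULL;
	size_t dst_size = 0, src_size = 0, total = 0;

	for (;;) {
		size_t chunk;

		if (dst_size == 0) {
			if (dst_n == 0)
				break;
			dst_buf = dst->buf;
			dst_size = dst->length;
			dst++;
			dst_n--;
		}
		if (src_size == 0) {
			if (src_n == 0)
				break;
			src_buf = src->buf;	/* must be the source segment, not dst */
			src_size = src->length;
			src++;
			src_n--;
		}

		chunk = dst_size < src_size ? dst_size : src_size;
		if (chunk == 0)
			continue;	/* empty segment: move on to the next one */

		memcpy(dst_buf, src_buf, chunk);
		dst_buf += chunk;
		dst_size -= chunk;
		src_buf += chunk;
		src_size -= chunk;
		total += chunk;
	}
	return total;
}
```

If the marked line read from the destination entry, the copy would read from the output buffer rather than the input, which would be consistent with the differing dd results reported above.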
Re: [PATCH] x86 - 32-bit ptrace emulation mishandles 6th arg
On Wed, Nov 07, 2007 at 01:12:22PM -0800, Roland McGrath wrote:
> Sure has my ACK.
> I never really understood why my old patch was not taken 2.5 years ago.

Nor I. It's needed. As is your PTRACE_SET_THREAD_INFO patch from
yesterday - with these two fixes, I can boot a 32-bit UML on a 64-bit
host.

Jeff

--
Work email - jdike at linux dot intel dot com
Re: [PATCH 1/2] Suppress A.OUT library support in ELF binfmt if !CONFIG_BINFMT_AOUT [try #3]
David Woodhouse <[EMAIL PROTECTED]> wrote:

> Ew, no. This is horridly broken. You should never use CONFIG_xxx_MODULE
> in the static kernel at all -- and you should _especially_ not be using
> it in header files which are exported to userspace.

AOUT support can be mostly built into a module, but a small part of it that
is arch-specific still gets built into the main kernel. *That* is the main
thing that is wrong. I suppose it might be possible to move those bits of
the main kernel into inline functions in asm/a.out.h and thus include them
directly in binfmt_aout.ko.

> This abomination certainly doesn't seem to have any direct relation to
> mn10300 support -- I think all you really need there is not to attempt
> to export {asm,linux}/a.out.h if asm/a.out.h doesn't exist, which is
> something you haven't attempted here anyway.

No, it's not that simple. If asm/a.out.h doesn't exist, then various bits
of the kernel break that shouldn't. fs/binfmt_elf.c for example. fs/exec.c
for another. They *expect* bits of the asm/a.out.h and linux/a.out.h to
exist - which they shouldn't.

Not exporting them isn't by itself sufficient. The required constants
themselves are not defined for an arch that doesn't have the support, and so
the core code must not depend on them. This patch fixes that.

Furthermore, STACK_TOP and STACK_TOP_MAX don't belong in asm/a.out.h as far
as I can tell. They should probably be wherever TASK_SIZE resides (ie:
asm/processor.h).

David
Re: compat_sys_times() bogus until jiffies >= 0.
> On Wed, 7 Nov 2007 14:47:22 -0800 David Brown <[EMAIL PROTECTED]> wrote:
> compat_sys_times() has bogus return until jiffies is >= 0. I discovered
> this running LTP within 5 minutes of booting.
>
> The return result
>
>	return compat_jiffies_to_clock_t(jiffies);
>
> will return '-1' to user space and set the negated clock_t value to errno.
>
> I'm not sure what the correct fix for this is. I can come up with a patch
> if anyone has ideas on how to fix it.
>
> At minimum, perhaps it should return a sane errno value.

RETURN VALUE
       times() returns the number of clock ticks that have elapsed since an
       arbitrary point in the past. For Linux 2.4 and earlier this point is
       the moment the system was booted. Since Linux 2.6, this point is
       (2^32/HZ) - 300 (i.e., about 429 million) seconds before system boot
       time. The return value may overflow the possible range of type
       clock_t. On error, (clock_t) -1 is returned, and errno is set
       appropriately.


Perhaps this is a bug in glibc: it is interpreting the times() return value
in the same way as other syscalls.

It would have been sensible for us to add INITIAL_JIFFIES to the value
instead of exposing this kernel-only detail to the world, although the
problem will of course reoccur once jiffies hits 0x80000000. Unfortunately
we've even gone and enshrined this bogon in the manpage.

Proposed fix:

-	return compat_jiffies_to_clock_t(jiffies);
+	return compat_jiffies_to_clock_t((jiffies + INITIAL_JIFFIES) &
+					0x7fffffff);

?