Re: Bind mount bug?
Hi Frans!

Let's see whether I can explain this (I'm not a guru...)

On 11 Nov 2007, at 11:06, Frans Pop wrote:
> I'm not sure whether this is a bug or expected behavior.
>
> Suppose I create a "looped" bind mount situation as follows.
> # mkdir test
> # touch test/foo
> # mkdir bindtest
> # touch bindtest/bar
> # mkdir bindtest/test
> # mount --bind test/ bindtest/test/
> # ls bindtest/test/
> foo
> # mount --bind bindtest/ test/

This mounts the bindtest/ tree on test/ _without_ copying the mount points which are found on subtrees. This is necessary to avoid loops in the filesystem (bind mounts are somewhat like hardlinks on directories, just without the headaches).

> # ls test/
> bar test
> # ls test/test/
> #

This lists the contents of the original bindtest/test/ directory which you created above. Creating e.g. a file in there stores that file physically in bindtest/test/bla, where "test" does _not_ mean the bind mount but the underlying directory here.

> I'd expected the last command to list "foo", but it shows an empty dir. Shouldn't it also show the original contents of test (as they were before the first bind mount)?
>
> # mount | grep test
> /root/test on /root/bindtest/test type none (rw,bind)
> /root/bindtest on /root/test type none (rw,bind)

You see, the bindtest/test/ mount was not propagated to test/test/. This is very much by design. You can e.g. do

# mkdir -p test/test
# mount --bind test test/test
# ls test/test
test
# ls test/test/test
#

so there is no loop (`find test` would actually say that it terminates because it has detected a loop, so it cannot be used to test this).

# touch test/test/test/a
# ls test/test/test
a
# ls test/test
# umount test/test
# ls test/test
a
#

So, you see, test/test/test/a was (as it should be) physically created in test/test, where it is shadowed by the bind mount as long as that is not removed. Nothing vanishes into "thin air" ;-)

Ciao,
Roland

--
TU Muenchen, Physik-Department E18, James-Franck-Str., 85748 Garching
Telefon 089/289-12575; Telefax 089/289-12570
--
CERN office: 892-1-D23 phone: +41 22 7676540 mobile: +41 76 487 4482
--
Any society that would give up a little liberty to gain a little security will deserve neither and lose both. - Benjamin Franklin
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GS/CS/M/MU d-(++) s:+ a-> C+++ UL P+++ L+++ E(+) W+ !N K- w--- M + !V Y+ PGP++ t+(++) 5 R+ tv-- b+ DI++ e+++> h y+++
------END GEEK CODE BLOCK------
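The behaviour described in this message can be tried out without root privileges in a small model. What follows is a minimal Python sketch, not kernel code: the names Dir, Mount, walk, bind and ls are invented for illustration, and the one assumption built in is that a bind mount attaches the source directory as a new mount without copying any mounts beneath it.

```python
# Toy model of non-recursive bind mounts; NOT how the kernel implements them.
# Dir, Mount, walk, bind and ls are hypothetical names made up for this sketch.

class Dir:
    """A directory (or file) node; entries maps a name to a child node."""
    def __init__(self, **entries):
        self.entries = dict(entries)

class Mount:
    """One mounted tree: its root node plus the mounts attached inside it."""
    def __init__(self, root):
        self.root = root
        self.children = {}  # id(mount point node) -> Mount sitting on top of it

def walk(mnt, path):
    """Resolve path from the root of mnt, crossing mount points on the way."""
    node = mnt.root
    for name in filter(None, path.split("/")):
        node = node.entries[name]
        # A node only counts as a mount point within the mount being walked.
        while id(node) in mnt.children:
            mnt = mnt.children[id(node)]
            node = mnt.root
    return mnt, node

def bind(root_mnt, src, dst):
    """Model of `mount --bind src dst`: a new mount at dst, no submounts copied."""
    _, src_node = walk(root_mnt, src)
    dst_mnt, dst_node = walk(root_mnt, dst)
    dst_mnt.children[id(dst_node)] = Mount(src_node)

def ls(root_mnt, path):
    return sorted(walk(root_mnt, path)[1].entries)

# The "looped" setup from the quoted message.
root = Mount(Dir(test=Dir(foo=Dir()), bindtest=Dir(bar=Dir(), test=Dir())))
bind(root, "test", "bindtest/test")
print(ls(root, "bindtest/test"))   # ['foo']
bind(root, "bindtest", "test")
print(ls(root, "test"))            # ['bar', 'test']
print(ls(root, "test/test"))       # []  (the first bind mount was not copied)

# Shadowing: a file created through the bind mount lands in the underlying dir.
root2 = Mount(Dir(test=Dir(test=Dir())))
bind(root2, "test", "test/test")
walk(root2, "test/test/test")[1].entries["a"] = Dir()  # touch test/test/test/a
print(ls(root2, "test/test/test"))  # ['a']
root2.children.clear()              # umount test/test (the only mount here)
print(ls(root2, "test/test"))       # ['a']
```

The point of keying the children table per mount is that a directory is a mount point only when reached through the mount it was attached in, which is why test/test comes out empty in the first scenario.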
Re: Please release a stable kernel Linux 3.0
Hi Zoltan!

On 26 Jun 2007, at 16:37, Zoltán HUBERT wrote:
>> If your vendor don't want to support you anymore, try getting the source.
>
> I was asking for a stable kernel, like 2.4, 2.2, 2.0 were before. 2.6 is not. It's a great kernel, better than that of MacOS X, I never said you were doing a bad job, quite the contrary. I wouldn't be using Linux since 10 years if I thought it stinks. I never asked support for closed source drivers, only a stable kernel.

Whatever "stable" means. What you mean by "stable" pretty much excludes any serious development, without which the Linux kernel would very soon be obsolete. If you want a stable system, then don't change it. If you update to a kernel which is 2.5 years newer, you simply cannot have stability, because that would mean stagnation, aka "death".

Ciao,
Roland
Re: Long file names in VFAT broken with iocharset=utf8
Hi!

On 7 May 2007, at 20:27, OGAWA Hirofumi wrote:
> Roland Kuhn <[EMAIL PROTECTED]> writes:
>
>> PATH_MAX specifically counts _bytes_ not characters, so UTF-8 does not matter. ISTR that PATH_MAX was 256 at some point, but I just quickly grepped /usr/include and found various mention of 4096, so where's the central repository for this configuration item? A hard-coded value of 256 somewhere inside the kernel smells like a bug.
>
> There is a nasty issue here. FAT is limited by 255 unicode chars or so. So, we would need to count number of unicode chars of filename.

No, we don't. At least not when looking at the POSIX spec, which explicitly mentions _bytes_ and _not_ unicode characters. So, to be on the safe side, FAT filesystems would need to support a NAME_MAX of roughly 6*255+3=1533 bytes (not to mention the hassles of forbidden sequences, etc.; do we need to count zero-width characters?) and report it through pathconf() to userspace, then userspace could do with that whatever it liked.

What happened to: "file names are just sequences of octets, excluding '/' and NUL"? Adding unicode parsing to the kernel is completely useless _and_ a big troublemaker.

Ciao,
Roland
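The NAME_MAX estimate and the pathconf() route mentioned above can be sketched from the userspace side. This is only an illustration in Python: the helper names name_max and fits are made up here, the "+3" is taken over from the message as is, and the byte count assumes names are encoded as UTF-8.

```python
import os

# Worst case used in the message: 255 characters at up to 6 bytes each, plus 3.
FAT_MAX_CHARS = 255
MAX_UTF8_BYTES_PER_CHAR = 6   # upper bound of the original UTF-8 design
worst_case_name_max = MAX_UTF8_BYTES_PER_CHAR * FAT_MAX_CHARS + 3
print(worst_case_name_max)    # 1533

def name_max(directory="."):
    """Ask the filesystem holding `directory` for its NAME_MAX, in bytes."""
    return os.pathconf(directory, "PC_NAME_MAX")

def fits(name, directory="."):
    """POSIX limits count bytes, so measure the encoded name, not len(name)."""
    return len(name.encode("utf-8")) <= name_max(directory)
```

A caller would use something like fits(candidate, "/mnt/usbstick") before creating a file; the mount point in that call is a placeholder, not a path from this thread.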
Re: Long file names in VFAT broken with iocharset=utf8
Hi Andrey!

On 7 May 2007, at 19:51, Andrey Borzenkov wrote:
> This was posted in one of Russian forums. It was not possible to archive (under Linux, using tar) vfat directory where files had long Russian names (really long - over 150 - 170 characters) - tar returned stat failure. When looking with plain ls, file names appeared truncated.
>
> Now looking at current (2.6.21) fat driver, __fat_readdir allocates large enough buffer (PAGE_SIZE-522) for UTF-8 name; but for iocharset=utf8 it calls uni16_to_x8() which artificially limits length of UTF-8 name to 256 ... which is obviously not enough for long UTF-8 Russian string (2 bytes per character) not to mention the - theoretical - general case of 6 bytes UTF-8 characters.
>
> Similar problem has apparently vfat_lookup()->...->fat_search_long() call chain. Except this appears to be broken even in case of "utf8", because fat_search_long allocates fixed 256 bytes buffer for UTF-8 name.
>
> Am I off track here?

PATH_MAX specifically counts _bytes_ not characters, so UTF-8 does not matter. ISTR that PATH_MAX was 256 at some point, but I just quickly grepped /usr/include and found various mention of 4096, so where's the central repository for this configuration item? A hard-coded value of 256 somewhere inside the kernel smells like a bug.

Ciao,
Roland
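The arithmetic behind the truncation report is easy to check in userspace. The following Python lines are a sketch only: BUF_SIZE mirrors the 256-byte limit mentioned in the quoted text, the file name is made up, and any terminator byte the kernel buffer may need is ignored.

```python
# Illustrative only: BUF_SIZE mirrors the 256-byte limit discussed above,
# the file name itself is invented for the example.
BUF_SIZE = 256

name = "я" * 170                   # 170 Cyrillic characters
encoded = name.encode("utf-8")     # each of them takes 2 bytes in UTF-8

print(len(name), len(encoded))     # 170 340
print(len(encoded) <= BUF_SIZE)    # False

def max_chars(buf_size, bytes_per_char):
    """How many characters of a fixed encoded width fit into buf_size bytes."""
    return buf_size // bytes_per_char

print(max_chars(BUF_SIZE, 2))      # 128
```

So a name of 150 to 170 Cyrillic characters needs 300 to 340 bytes, while a 256-byte buffer holds at most 128 such characters.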
Re: [OOPS] 2.6.21-rc6-git5 in cfq_dispatch_insert
Hi Jens!

On 25 Apr 2007, at 12:18, Jens Axboe wrote:
> On Wed, Apr 25 2007, Brad Campbell wrote:
>> Jens Axboe wrote:
>>> It looks to be extremely rare. Aliases are extremely rare, front merges are rare. And you need both to happen with the details you outlined. But it's a large user base, and we've had 3-4 reports on this in the past months. So it obviously does happen. I could not make it trigger without doctoring the unplug code when I used aio.
>>
>> Well, not that rare on this particular machine (I had a case where I could reproduce it in less than an hour of normal use previously on this box), and I've had it occur a number of times on my servers, I just never reported it before as I never took the time to set up a serial console and capture the oops.
>
> Extremely rare in the sense that it takes md and some certain conditions to happen for it to trigger. So for most people it'll be extremely rare, and for others (such as yourself) that hit it, it won't be so rare :-)
>
>>> Here's a fix for it, confirmed.
>>
>> Shall I leave the other debugging in, apply this and run it for a few hard days?
>
> Yes, that would be perfect!

Okay, I left all debugging patches in, disabled all kernel debugging .config stuff and gave it a spin with our usual "killer" workload (as far as batch systems are repeatable anyway) and so far there was not a single glitch or message, so I preliminarily conclude that the bug is squashed. The final word will come once my 1800 batch jobs are processed and I have my particle physics result ;-)

Ciao,
Roland
Re: [OOPS] 2.6.21-rc6-git5 in cfq_dispatch_insert
Hi Jens!

On 24 Apr 2007, at 14:32, Jens Axboe wrote:
> On Tue, Apr 24 2007, Roland Kuhn wrote:
>> Hi Jens!
>>
>> [I made a typo in the Cc: list so that lkml is only included as of now. Actually I copied the typo from you ;-) ]
>
> Well no, you started the typo, I merely propagated it and forgot to fix it up :-)

Actually, I copied it from your printk() ;-) (thinking helps...)

>>>>> Sure. You might want to include NFS file access into your tests, since we've not triggered this with locally accessing the disks.
>>>>
>>>> BTW: How are you exporting the directory (what exports options) - how is it mounted by the client(s)? What chunksize is your raid6 using?
>>>
>>> And what are the nature of the files on the raid (huge, small, ?) and what are the client(s) doing? Just approximately, I know these things can be hard/difficult/impossible to specify.
>>
>> The files are 100-400MB in size and the client is merging them into a new file in the same directory using the ROOT library, which does in essence alternating sequences of
>>
>> _llseek(somewhere)
>> read(n bytes)
>> _llseek(somewhere+n)
>> read(m bytes)
>> ...
>>
>> and then
>>
>> _llseek(somewhere)
>> rt_sigaction(ignore INT)
>> write(n bytes)
>> rt_sigaction(INT->DFL)
>> time()
>> _llseek(somewhere+n)
>> ...
>>
>> where n is of the order of 30kB. The input files are treated sequentially, not randomly.
>
> Ok, I'll see if I can reproduce it. No luck so far, I'm afraid.

Too bad.

>> BTW: the machine just stopped dead, no sign whatsoever on console or netconsole, so I rebooted with elevator=deadline (need to get some work done besides ;-) )
>
> Unfortunately expected, if we can race and lose an update to ->next_rq, we can race and corrupt some of the internal data structures as well. If you have the time and inclination, it would be interesting to see if you can reproduce with some debugging options enabled:
>
> - Enable all preempt, spinlock and lockdep debugging measures
> - Possibly slab poisoning, although that may slow you down somewhat

Kernel compilation under way, will report back.

> Are you using 4kb stacks?

No idea, 'grep -i stack .config' gives no indication, but ISTR that 4k was made the default some time back?

Ciao,
Roland
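The 'grep -i stack .config' check can be mimicked in a few lines of Python. This is a sketch under assumptions: grep_config is a made-up helper, and the sample text is invented for illustration rather than taken from the configuration discussed in this thread.

```python
import re

def grep_config(config_text, pattern):
    """Case-insensitive search over kernel .config lines, like `grep -i`."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [line for line in config_text.splitlines() if rx.search(line)]

# Made-up sample; not the configuration of the machine discussed above.
sample = """\
CONFIG_X86_64=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
CONFIG_IOSCHED_CFQ=y
"""

print(grep_config(sample, "stack"))
# ['# CONFIG_DEBUG_STACKOVERFLOW is not set']
```

Disabled options normally still show up as "# ... is not set" lines, so a search that matches nothing at all suggests the option is not offered for that architecture or configuration, rather than merely being switched off.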
Re: [OOPS] 2.6.21-rc6-git5 in cfq_dispatch_insert
Hi Jens!

[I made a typo in the Cc: list so that lkml is only included as of now. Actually I copied the typo from you ;-) ]

On 24 Apr 2007, at 11:40, Jens Axboe wrote:
> On Tue, Apr 24 2007, Jens Axboe wrote:
>> On Tue, Apr 24 2007, Roland Kuhn wrote:
>>> Hi Jens!
>>>
>>> On 24 Apr 2007, at 11:18, Jens Axboe wrote:
>>>> On Tue, Apr 24 2007, Roland Kuhn wrote:
>>>>> Hi Jens!
>>>>>
>>>>> We're using a custom built fileserver (dual core Athlon64, using x86_64 arch) with 22 disks in a RAID6 and while resyncing /dev/md2 (9.1GB ext3) after a hardware incident (cable pulled on one disk) the machine would reliably oops while serving some large files over NFSv3. The oops message scrolled partly off the screen, but the IP was in cfq_dispatch_insert, so I tried your debug patch from yesterday with 2.6.21-rc7. I used netconsole for capturing the output (which works nicely, thanks Matt!) and as usual the condition triggered after about half a minute, this time with the following printout instead of crashing (still works fine):
>>>>>
>>>>> cfq: rbroot not empty, but ->next_rq == NULL! Fixing up, report the issue to [EMAIL PROTECTED]
>>>>> cfq: busy=1,drv=1,timer=0
>>>>> cfq rr_list:
>>>>> cfq busy_list:
>>>>> 4272: sort=0,next=,q=0/1,a=2/0,d=0/1,f=221
>>>>> cfq idle_list:
>>>>> cfq cur_rr:
>>>>> cfq: rbroot not empty, but ->next_rq == NULL! Fixing up, report the issue to [EMAIL PROTECTED]
>>>>> cfq: busy=1,drv=1,timer=0
>>>>> cfq rr_list:
>>>>> cfq busy_list:
>>>>> 4276: sort=0,next=,q=0/1,a=2/0,d=0/1,f=221
>>>>> cfq idle_list:
>>>>> cfq cur_rr:
>>>>>
>>>>> There was no backtrace, so the only thing I can tell is that for the previous crashes some nfs threads were always involved, only once did it happen inside an interrupt handler (with the "aieee" kind of message).
>>>>>
>>>>> If you want me to try something else, don't hesitate to ask!
>>>>
>>>> Nifty, great that you can reproduce so quickly. I'll try a 3-drive raid6 here and see if read activity along with a resync will trigger anything. If that doesn't work for me, I'll provide you with a more extensive debug patch (if you don't mind).
>>>
>>> Sure. You might want to include NFS file access into your tests, since we've not triggered this with locally accessing the disks.
>>
>> BTW: How are you exporting the directory (what exports options) - how is it mounted by the client(s)? What chunksize is your raid6 using?
>
> And what are the nature of the files on the raid (huge, small, ?) and what are the client(s) doing? Just approximately, I know these things can be hard/difficult/impossible to specify.

The files are 100-400MB in size and the client is merging them into a new file in the same directory using the ROOT library, which does in essence alternating sequences of

_llseek(somewhere)
read(n bytes)
_llseek(somewhere+n)
read(m bytes)
...

and then

_llseek(somewhere)
rt_sigaction(ignore INT)
write(n bytes)
rt_sigaction(INT->DFL)
time()
_llseek(somewhere+n)
...

where n is of the order of 30kB. The input files are treated sequentially, not randomly.

BTW: the machine just stopped dead, no sign whatsoever on console or netconsole, so I rebooted with elevator=deadline (need to get some work done besides ;-) )

Ciao,
Roland
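For anyone wanting to generate a similar load, the access pattern described above can be approximated with plain seek/read/write calls. This is a rough Python sketch and not what the ROOT library actually does: merge and CHUNK are names invented here, one read is followed directly by one write instead of batching several of each, and the SIGINT toggling and time() calls around each write are left out.

```python
import os
import tempfile

CHUNK = 30 * 1024  # "n is of the order of 30kB"

def merge(inputs, output, chunk=CHUNK):
    """Append each input to output in turn, using explicit seek + read/write."""
    out_fd = os.open(output, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    out_pos = 0
    try:
        for path in inputs:                  # inputs are treated sequentially
            in_fd = os.open(path, os.O_RDONLY)
            in_pos = 0
            try:
                while True:
                    os.lseek(in_fd, in_pos, os.SEEK_SET)
                    data = os.read(in_fd, chunk)
                    if not data:             # end of this input file
                        break
                    in_pos += len(data)
                    view = memoryview(data)
                    while view:              # cope with short writes
                        os.lseek(out_fd, out_pos, os.SEEK_SET)
                        written = os.write(out_fd, view)
                        out_pos += written
                        view = view[written:]
            finally:
                os.close(in_fd)
    finally:
        os.close(out_fd)
    return out_pos

# Small self-check with two throwaway input files.
with tempfile.TemporaryDirectory() as tmp:
    parts = []
    for i in range(2):
        part = os.path.join(tmp, f"in{i}.dat")
        with open(part, "wb") as f:
            f.write(bytes([i]) * (2 * CHUNK + 123))
        parts.append(part)
    merged = os.path.join(tmp, "merged.dat")
    total = merge(parts, merged)
    merged_size = os.path.getsize(merged)
    print(total)  # 123126
```

To resemble the reported setup, the input and output paths would point into an NFS-mounted directory and the inputs would be in the 100-400MB range.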
Re: Two questions regarding Opening files within Kernel!
Hi!

On 7 Apr 2007, at 08:58, JanuGerman wrote:
> 1) I have just a file path with me, an absolute path, but no dentry, no inode, no vfsmount object. Which function can I call to get a "file" object associated with the absolute file path? I have surfed around the source code, especially fs/open.c and some other files, but each function requires a parameter "mode" and "fd" besides the file path. Actually, I was confused about the "mode" parameter (and its difference from "flag"), like what to send, and secondly for "fd", I am not sure what value to send as there is no file in fact and only the file path exists. Any idea?

No, but I'm no guru either.

> 2) Any functionality within the Linux kernel source code to read one line per file? Or some indirect way to set the buffer size for one read? That is, any existing header file for doing text I/O rather than binary within the kernel source code?

Do you have a compelling reason for not letting userspace feed the file to your driver? That would be the natural and much easier way, I suppose...

Ciao, Roland
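The userspace route suggested above could look roughly like the following. This is a minimal sketch under stated assumptions, not anything taken from the thread: the helper names `write_all` and `feed_lines` are made up here, and the driver's device node is never named in the discussion, so the sketch takes an already opened descriptor instead of inventing a path.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying after short writes.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read text from `in` with fgets() and pass it on to `out_fd`
 * (hypothetically the driver's device node) one chunk per line.
 * Lines longer than the buffer are forwarded in several pieces.
 * Returns the number of newline-terminated lines sent, or -1 on error. */
long feed_lines(FILE *in, int out_fd)
{
    char line[4096];
    long count = 0;

    while (fgets(line, sizeof line, in)) {
        size_t len = strlen(line);
        if (write_all(out_fd, line, len) < 0)
            return -1;
        if (len > 0 && line[len - 1] == '\n')
            count++;
    }
    return ferror(in) ? -1 : count;
}
```

All text handling stays in userspace this way; the driver only sees plain write() calls and never has to open or parse a file itself.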
Re: [RFC] div64_64 support
Hi Andi!

On 6 Mar 2007, at 15:45, Andi Kleen wrote:
>> <rant>
>> Let me see... You throw code like that and expect someone to actually understand it in one year, and be able to correct a bug?
>
> To be honest I don't expect any bugs in this function.
>
>> </rant>
>> Please add something, a URL or even better a nice explanation, per favor...
>
> It's straight out of Hacker's Delight which is referenced in the commit log.

And it's pretty neat, too. Hint: (y+1)**3 = y**3 + 3*y**2 + 3*y + 1. The algorithm is exactly the same as for calculating the cube root on paper, digit by digit. I found that algo in the school notebook of my grandpa (late 1920s), a pity that it's not taught anymore... pocket calculators _do_ have downsides ;-)

Ciao, Roland
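The hint above can be turned into a short routine. What follows is a minimal sketch of the binary digit-by-digit cube root in the style of the Hacker's Delight routine the thread refers to; it is not the code from the patch under discussion. The name `icbrt32` and the restriction to 32-bit input are assumptions made for illustration. Each round doubles the partial root y and tests whether one more unit fits, using (y+1)**3 - y**3 = 3*y*(y+1) + 1.

```c
#include <stdint.h>

/*
 * Integer cube root, one binary digit per round (illustrative sketch,
 * 32-bit input only; the name icbrt32 is hypothetical).
 *
 * After each round, y is the root found so far in units of 2**(s/3),
 * and x holds what is left of the input once the cube of that partial
 * root has been subtracted.
 */
uint32_t icbrt32(uint32_t x)
{
    uint32_t y = 0;

    for (int s = 30; s >= 0; s -= 3) {
        y *= 2;                                   /* make room for the next binary digit */
        uint32_t b = (3 * y * (y + 1) + 1) << s;  /* (y+1)**3 - y**3, scaled by 2**s */
        if (x >= b) {                             /* the next digit is 1 */
            x -= b;
            y += 1;
        }
    }
    return y;
}
```

With this sketch, icbrt32(27) gives 3 and icbrt32(26) gives 2. A 64-bit variant would need extra care, since the shifted term can overflow for the highest digits.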
Re: O_NONBLOCK setting "leak" outside of a process??
Hi Guillaume!

On 2 Feb 2007, at 14:48, Guillaume Chazarain wrote:
> 2007/2/2, Roland Kuhn <[EMAIL PROTECTED]>:
>> That's a bug, right?
>
> No, if you want something like:
> (echo toto; date; echo titi) > file
> to work in your shell, you'll be happy to have the seek position shared in the processes.

As a naive user I'd probably expect that each of the above adds to the output, which perfectly fits the O_APPEND flag (to be set by the shell, of course). The immediate point was about the flags, though, and having O_NONBLOCK on or off certainly is a _design_ choice when writing a program. If I remove O_NONBLOCK, I have a right to expect that I/O functions do not return EAGAIN!

Ciao, Roland
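The flag sharing argued about above can be observed directly. A minimal sketch, assuming a POSIX system: a child sets O_NONBLOCK on a descriptor it inherited across fork(), and the parent then reads the flags back through its own copy. The function name is made up for this illustration. File status flags belong to the open file description, which both descriptors refer to, so the parent should see the change.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if O_NONBLOCK set by a child shows up in the parent,
 * 0 if it does not, -1 on error. (Hypothetical helper name.) */
int nonblock_visible_in_parent(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: switch the inherited read end to non-blocking mode. */
        int fl = fcntl(fds[0], F_GETFL);
        if (fl < 0 || fcntl(fds[0], F_SETFL, fl | O_NONBLOCK) < 0)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the child, then inspect our own descriptor. */
    int status;
    int ok = waitpid(pid, &status, 0) == pid &&
             WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int fl = fcntl(fds[0], F_GETFL);
    close(fds[0]);
    close(fds[1]);
    if (!ok || fl < 0)
        return -1;
    return (fl & O_NONBLOCK) != 0;
}
```

On Linux this should return 1. That is the situation complained about above: a program that cleared O_NONBLOCK itself can still get EAGAIN if another process sharing the same open file description turns the flag back on.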
Re: O_NONBLOCK setting "leak" outside of a process??
Hi Philippe!

On 2 Feb 2007, at 00:15, Philippe Troin wrote:
> Denis Vlasenko <[EMAIL PROTECTED]> writes:
>>>> What shares the same file descriptor? MC and programs started from it?
>>>
>>> All the processes started from your shell share at least fds 0, 1 and 2.
>>
>> I thought after exec() fds are either closed (if CLOEXEC) or become independent from the parent process (i.e. if you seek, close, etc. your fd, the parent would not notice that). Am I wrong?
>
> I'm afraid so. Seek position and flags are still shared after an exec.

That's a bug, right? I couldn't find anything to that effect in IEEE Std. 1003.1, 2004 Edition...

Ciao, Roland
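The shared seek position can be checked with a small experiment. A minimal sketch, assuming a POSIX system with /bin/sh and a writable /tmp; the function name and the temporary file template are made up for this illustration. The child redirects its stdout to the parent's file and execs a shell that prints "toto"; the parent never writes anything itself and afterwards asks where its own descriptor points.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's file offset after a forked and exec'ed child
 * wrote through the shared descriptor, or -1 on error.
 * (Hypothetical helper name.) */
long offset_after_child_exec(void)
{
    char path[] = "/tmp/fdshareXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                      /* the open descriptor keeps the file alive */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: make the file our stdout, then replace the process image. */
        if (dup2(fd, STDOUT_FILENO) < 0)
            _exit(127);
        execl("/bin/sh", "sh", "-c", "echo toto", (char *)NULL);
        _exit(127);                    /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    off_t pos = lseek(fd, 0, SEEK_CUR);   /* the parent itself wrote nothing */
    close(fd);
    return (long)pos;
}
```

The parent sees an offset of 5, the length of "toto\n", even though only the exec'ed child wrote to the file. The same sharing is what lets successive commands in a shell redirection append to one another's output.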
Re: PROBLEM: KB->KiB, MB -> MiB, ... (IEC 60027-2)
Hi Jan!

On 21 Jan 2007, at 22:12, Jan Engelhardt wrote:
>> How fast is your Ethernet port? 100Mbps or 95.37Mbps?
>
> Same lie like with harddrives. It's around 80, not 100. But it depends on how you look at it. 80 for Layer3, possibly a little more for Layer2/1.

Nope, I consistently get 12e6 bytes/sec, which is 96e6 bits/sec, across 100Mbps ethernet, fitting nicely with the frame overhead (some 50 bytes out of 1500, without TCP options). So no lie here. With gigabit I'm not completely sure yet, I still have to see the advertised 125e6 symbols/sec (got only as far as 115e6 up to now).

Ciao, Roland
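The arithmetic behind these numbers is short enough to write down. A minimal sketch using only the round figures from the message (about 50 overhead bytes per 1500-byte frame); the function name is made up, and an exact accounting of preamble, inter-frame gap and individual headers is deliberately left out.

```c
/* Payload rate in bytes/sec on a link of `link_bits_per_sec`, when
 * `overhead` out of every `frame` bytes on the wire is not payload.
 * (Hypothetical helper; the inputs are the message's round numbers.) */
double payload_bytes_per_sec(double link_bits_per_sec, double frame, double overhead)
{
    return link_bits_per_sec / 8.0 * (frame - overhead) / frame;
}
```

For 100 Mbps this gives 12.5e6 * 1450 / 1500, roughly 12.08e6 bytes/sec, in line with the 12e6 reported above. The same estimate for gigabit is roughly 120.8e6 bytes/sec, somewhat above the 115e6 reached so far.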
Re: UPDATE: out of vmalloc space - but vmalloc parameter does not allow boot
Hi Ranko!

On Apr 4, 2005, at 4:36 PM, Ranko Zivojnovic wrote:
> (please do CC replies as I am still not on the list)
>
> As I am kind of pressured to resolve this issue, I've set up a test environment using VMware in order to reproduce the problem, and (un)fortunately the attempt was successful. I have noticed a few points that relate to the size of the physical RAM and the behavior of vmalloc. I am not sure whether this is by design or a bug, so please someone enlighten me:
>
> The strange thing I have seen is that with the increase of the physical RAM, the VmallocTotal in /proc/meminfo gets smaller! Is this how it is supposed to be?

Well, I'm by no means a VM expert (not even a regular kernel hacker), but it seems to me that the sum of LowTotal and VmallocTotal is rather constant for the different settings. Alas, I cannot offer an explanation why this should be, so hopefully a knowledgeable person will shed some light on this issue.

Ciao, Roland

--
TU Muenchen, Physik-Department E18, James-Franck-Str. 85747 Garching
Telefon 089/289-12592; Telefax 089/289-12570
--
When I am working on a problem I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong.
-- R. Buckminster Fuller

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GS/CS/M/MU d-(++) s:+ a-> C+++ UL P-(+) L+++ E(+) W+ !N K- w--- M+ !V Y+ PGP++ t+(++) 5 R+ tv-- b+ DI++ e+++> h x+++
------END GEEK CODE BLOCK------
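The observation about the sum is easy to check mechanically for each RAM setting. A minimal sketch, assuming the usual "Key:   value kB" layout of /proc/meminfo; the function name is made up, and the LowTotal field is only present on kernels built with highmem support. The parser works on a text buffer so it can be tried on saved copies of the file.

```c
#include <stdio.h>
#include <string.h>

/* Return the value in kB that follows "key:" at the start of a line
 * in meminfo-style text, or -1 if the key is absent or malformed.
 * (Hypothetical helper name.) */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');     /* advance to the next line, if any */
        if (line)
            line++;
    }
    return -1;
}
```

Reading /proc/meminfo into a buffer and adding meminfo_kb(buf, "LowTotal") to meminfo_kb(buf, "VmallocTotal") for each memory size would show whether the sum really stays put.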
Re: [BK] upgrade will be needed
Hi Clemens!

On Feb 17, 2005, at 9:09 AM, Clemens Schwaighofer wrote:
> On 02/17/2005 04:55 PM, Roland Kuhn wrote:
>> That said, it would of course be possible to improve the internal workflow of our emperor penguin if he used subversion, but the collaboration with others could not benefit the way it does with a changeset-based approach.
>
> Question is then, what about keeping a main trunk with the vanilla release, and each dev has its own branch. Now at a certain point you have to merge them. Now where is the difference between a central repository and a decentralized one? At day X, patches from Andrew's tree have to go to Linus' tree and from his tree into the new vanilla kernel, right? Somehow I can't see the difference here.

The difference comes after the merge. Suppose Andrew didn't push everything to Linus. Then new patches come in, both trees change. In this situation it is very time consuming with subversion to work out the changes which still have to go from Andrew's tree to Linus' tree.

Ciao, Roland

--
TU Muenchen, Physik-Department E18, James-Franck-Str. 85747 Garching
Telefon 089/289-12592; Telefax 089/289-12570
--
A mouse is a device used to point at the xterm you want to type in.
Kim Alm on a.s.r.
Re: [BK] upgrade will be needed
Hi Clemens!

On Feb 17, 2005, at 1:11 AM, Clemens Schwaighofer wrote:

first. what kind of advantages does bk have over svn? Seriously. If Apache can use it, and gcc might use it (again two very large projects), what makes linux so different that it can't. And I don't want _anything_ from Larry. I am just pointing out, that this kind of legal clause is more ridiculous than understandable.

Well, I'm obviously not Larry, so here are my 2ct: Subversion is superior to CVS in all respects, but that is not an overly strong statement. The main problem is that it is centralized in a way that hinders the parallel existence of development branches, because it does not properly support the shuffling of changes back and forth between trees. It all works fine until you want to _partially_ synchronize two trees and keep the ability to continue development on both of them. (Been there, done that, it was a major PITA even in a rather small project. Works fine for my PhD thesis, though ;-) )

That said, it would of course be possible to improve the internal workflow of our emperor penguin if he used subversion, but the collaboration with others could not benefit the way it does with a changeset-based approach.

Last, why can you compare cvs to bk? and not subversion, or arch? arch and subversion are way superior to cvs ...

SCM is hard and not sexy, I'm afraid.

yes it's hard, so we have to use bk with a very strange license? better close the eyes and not change. What do you think is kernel coding? Walk in the park? Do you think all those developers say, nah I better use Windows or Mac OS X, because it's hard and not sexy ... pah ... BS!

Linux kernel development is hard _and_ sexy :-)

Ciao, Roland
2.4.4 Oops in ext2 and strange /proc/ksyms
Hi folks!

After seeing the Oops below (and rebooting), I looked into /proc/ksyms (because ksymoops complained about mismatches), and I could not find system_call, do_page_fault, etc. Shouldn't they be there? When doing ksymoops with /proc/ksyms I found recursive calling of do_brk, which for sure is not the right thing. The machine is a dual PII-450, kernel is 2.4.4 vanilla with Neil Brown's knfsd-patch.

--
ksymoops 2.4.0 on i686 2.4.4. Options used
     -V (default)
     -K (specified)
     -l /proc/modules (default)
     -o /lib/modules/2.4.4/ (default)
     -m /usr/src/linux/System.map (specified)

No modules in ksyms, skipping objects
No ksyms, skipping lsmod
Oops: 0002
CPU:    0
EIP:    0010:[<c0124b96>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax:   ebx: c16d32f8   ecx: c16d32f8   edx: c023b0c0
esi: c16d32f8   edi: c5b556e4   ebp:   esp: c2dbded4
ds: 0018   es: 0018   ss: 0018
Process g++ (pid: 24364, stackpage=c2dbd000)
Stack: c16d32f8 c0124c06 c16d32f8 c16d32f8 c0124e6b c16d32f8 c2dbdf18 d38f71c0
       dd7c5ba0 c015af38 dd7c5ba0 c2dbdf18 c5b556e4 c63dc140 c0124f27 c5b55640
       c028ab40 df56fa20 c0149cf5 c5b556e4
Call Trace: [<c0124c06>] [<c0124e6b>] [<c015af38>] [<c0124f27>] [<c0149cf5>] [<c0147e46>]
       [<c013e7d9>] [<c0140c9a>] [<c013f9aa>] [<c0140d6a>] [<c0111ef0>] [<c0106edb>]
Code: ff 48 18 8b 53 04 8b 03 89 50 04 89 02 8b 43 10 c7 43 08 00

>>EIP; c0124b96 <__remove_inode_page+26/60>   <=====
Trace; c0124c06 <remove_inode_page+36/40>
Trace; c0124e6b <truncate_list_pages+12b/1a0>
Trace; c015af38 <ext2_delete_entry+98/100>
Trace; c0124f27 <truncate_inode_pages+47/80>
Trace; c0149cf5 <iput+a5/170>
Trace; c0147e46 <d_delete+66/b0>
Trace; c013e7d9 <vfs_permission+89/120>
Trace; c0140c9a <vfs_unlink+17a/1b0>
Trace; c013f9aa <lookup_hash+4a/e0>
Trace; c0140d6a <sys_unlink+9a/110>
Trace; c0111ef0 <do_page_fault+0/470>
Trace; c0106edb <system_call+33/38>
Code;  c0124b96 <__remove_inode_page+26/60>
<_EIP>:
Code;  c0124b96 <__remove_inode_page+26/60>   <=====
   0:   ff 48 18                  decl   0x18(%eax)   <=====
Code;  c0124b99 <__remove_inode_page+29/60>
   3:   8b 53 04                  mov    0x4(%ebx),%edx
Code;  c0124b9c <__remove_inode_page+2c/60>
   6:   8b 03                     mov    (%ebx),%eax
Code;  c0124b9e <__remove_inode_page+2e/60>
   8:   89 50 04                  mov    %edx,0x4(%eax)
Code;  c0124ba1 <__remove_inode_page+31/60>
   b:   89 02                     mov    %eax,(%edx)
Code;  c0124ba3 <__remove_inode_page+33/60>
   d:   8b 43 10                  mov    0x10(%ebx),%eax
Code;  c0124ba6 <__remove_inode_page+36/60>
  10:   c7 43 08 00 00 00 00      movl   $0x0,0x8(%ebx)

Ciao, Roland

+---------------------------+---------------------------+
|    Tel.: 089/32649332     |        0561/873744        |
|    in Radeberger Weg 8    |     Am Fasanenhof 16      |
|      85748 Garching       |       34125 Kassel        |
+---------------------------+---------------------------+
|   Physik-Department E18   |         Raum 3558         |
|     James-Franck-Str.     |   Telefon 089/289-12592   |
|      85747 Garching       |                           |
+---------------------------+---------------------------+
|              May the Source be with you!              |
+-------------------------------------------------------+

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
iptables port remapping problem (was: [newbie] NFS client:port-unreachable)
On Sun, 3 Jun 2001, Trond Myklebust wrote:
> Are /home and /compass on the same mount point on the client though?
> If not, then they won't share the same port.
>
> IOW: they will only share the same port if you have '/' as the NFS
> mountpoint.

When I mount via NFS, each mount gets its own local port to communicate with the server. Looking at /proc/net/ip_conntrack I see that one such port (797) got remapped to 772, so I see packets emerging from 772 and replies coming back from the server, but the mapping is not applied on receive, so the reply does not reach port 797 (where the request originally came from) but port 772, which has no process attached. This results in an ICMP_PORT_UNREACH to the server and an NFS client not getting an answer. The problem can be cured by 'rmmod ip_conntrack' and restarting the firewall, which is not a good solution.

My conclusion: Either iptables has a problem when remapping ports under moderate load (several RPCs masqueraded per second), or the NFS client does not properly reserve the local port when mounting.

BTW: I use util-linux-2.11d but still get 'nfs warning: mount version older than kernel'.

DETAILS: I have a DECstation acting as NIS domain server and NFS server for /home, /compass, /usr/local and some other things (all different directories on the server; I have given the mount points for the clients). There are a dozen clients being served without problems, mostly running 2.2.14 (RedHat 6.2), some 2.4.2 (SuSE 7.1). Besides, I have another server (RedHat 7.1, kernel 2.4.4 with knfsd-reiserfs-patch from namesys.com), which also mounts /home and /compass from the DEC and serves some internal disk space to a Linux cluster (RedHat 6.2). This server has IP 217, but masquerades (via iptables -j SNAT) the cluster as having IPs 218 or 219 (roughly half of the 32 machines on each address), since the cluster machines have no other connection to the internet because we ran out of IPs.

Ciao, Roland
iptables port remapping problem (was: [newbie] NFS client:port-unreachable)
On Sun, 3 Jun 2001, Trond Myklebust wrote: Are /home and /compass on the same mount point on the client though? If not, then they won't share the same port. IOW: they will only share the same port if you have '/' as the NFS mountpoint. When I mount via nfs each mount gets its own local port to communicate with the server. Looking at /proc/net/ip_conntrack I see that one such port (797) got remapped to 772, so I see packets emerging from 772 and getting back from the server, but the mapping is not done upon receive, so that it does not reach port 797 (where it originally came from) but port 772 which has no process attached. This results in an ICMP_PORT_UNREACH to the server and an nfs client not getting an answer. This problem can be cured by 'rmmod ip_conntrack' and restarting the firewall, which is not a good solution. My conclusion: Either iptables has a problem when remapping ports under moderate load (several RPCs masqueraded per second) or the nfs-client does not properly reserve the local port when mounting. BTW: I use util-linux-2.11d but still get 'nfs warning: mount version older than kernel'. DETAILS: I have a DECstation being nis domain server and nfs server for /home, /compass, /usr/local and some other things (all different directories on the server, I have given the mount points for the clients). There are a dozen clients being served without problems, mostly running 2.2.14 (RedHat 6.2), some 2.4.2 (SuSE 7.1). Besides I have another server (RedHat 7.1, kernel 2.4.4 with knfsd-reiserfs-patch from namesys.com), which also mounts /home and /compass from the DEC and serves some internal disk space to a linux cluster (RedHat 6.2). This server has IP 217, but masquerades (via iptables -j SNAT) the cluster as having IPs 218 or 219 (roughly half of the 32 machines on each address), since the cluster machines have no other connection to the internet because we ran out of IPs. 
Ciao, Roland
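The check described above (spotting a local port that conntrack has remapped) can be sketched in Python. This is only a rough illustration, not a diagnostic tool: the function name `remapped_ports` and the sample lines are made up, the addresses are documentation placeholders, and the line layout is an assumption about how /proc/net/ip_conntrack prints an original tuple followed by a reply tuple.

```python
import re

# One key=value field of a conntrack tuple, e.g. "sport=797".
FIELD = re.compile(r"\b(src|dst|sport|dport)=(\S+)")

# Hypothetical sample entries (made-up addresses); the first one mimics
# the 797 -> 772 remapping described in the message above.
SAMPLE = [
    "udp      17 25 src=192.0.2.217 dst=192.0.2.10 sport=797 dport=2049 "
    "src=192.0.2.10 dst=192.0.2.217 sport=2049 dport=772 use=1",
    "udp      17 25 src=192.0.2.217 dst=192.0.2.10 sport=798 dport=2049 "
    "src=192.0.2.10 dst=192.0.2.217 sport=2049 dport=798 use=1",
]


def remapped_ports(lines):
    """Return (original_sport, translated_port) pairs for remapped entries."""
    result = []
    for line in lines:
        fields = FIELD.findall(line)
        if len(fields) < 8:
            continue  # no full original + reply tuple with ports (e.g. ICMP)
        original = dict(fields[:4])   # first tuple: packet as sent
        reply = dict(fields[4:8])     # second tuple: expected reply
        # Without port translation the reply is addressed to the original
        # source port; a mismatch means the source port was remapped.
        if original["sport"] != reply["dport"]:
            result.append((int(original["sport"]), int(reply["dport"])))
    return result


if __name__ == "__main__":
    for old, new in remapped_ports(SAMPLE):
        print(f"local port {old} is remapped to {new}")
```

On a live system one would pass the lines of /proc/net/ip_conntrack instead of `SAMPLE`; whether that file is present depends on the kernel and on ip_conntrack being loaded.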
Re: [newbie] NFS client: port-unreachable
On 1 Jun 2001, Trond Myklebust wrote:

> > (port-unreachable) goes out to the server. This is annoying
> > since it blocks all access to that directory. The request in
> > question is sent and received at port 772.
> >
> > I'm using kernel 2.4.4.
>
> You probably have set ipchains or ipfilter to block port 772 on your
> client.

No, I have no port specific rules in the firewall (iptables), but this machine does SNAT for 32 other linux boxes which also get some directories from the same server (including YP). I had some trouble with the YPSERV-calls until I bound two more IPs to the network card and masqueraded the 32 boxes via these additional addresses.

What might happen is that the specific port gets allocated by some port remapping in iptables during the request, but I don't see why this should happen only for specific directories (e.g. /home works and /compass doesn't while both are from the same server).

Ciao, Roland
[newbie] NFS client: port-unreachable
Hi folks!

When I lstat64 a directory on an NFS mount, the answer to GETATTR is received by the network interface but dropped afterwards (the client never sees it). Only 50 µs after the answer is received, an icmp-destination-unreachable (port-unreachable) goes out to the server. This is annoying since it blocks all access to that directory. The request in question is sent and received at port 772.

I'm using kernel 2.4.4.

Please help, Roland
[newbie] NFS broken in 2.4.4?
Hi folks!

When a process tries to lstat64 a file on NFS and the reply is not received, it gets blocked forever. Should it be that way?

Please help, Roland
Oopses with 2.4.4
Hi folks!

All of a sudden I experienced at least two Oopses looking like the attached one, which is from process X (the other was bash, but the message had nearly scrolled away). Since I can't find this exact code sequence in arch/i386/entry.S (it appeared exactly the same in the bash Oops) I am asking myself whether it is a hardware failure... How else can code be changed?

I am running 2.4.4 with the reiserfs-knfsd-patch on a dual PII 450/440BX/512MB machine.

TIA, Roland

ksymoops 2.4.0 on i686 2.4.4.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.4/ (default)
     -m /boot/System.map-2.4.4-knfsdpatch (specified)

Warning (compare_maps): ksyms_base symbol __VERSIONED_SYMBOL(shmem_file_setup) not found in System.map.  Ignoring ksyms_base entry
Oops: 0002
CPU:    0
EIP:    0010:[c0106ee6]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010256
eax:    ebx: dcd5c000   ecx: dfffc5d0   edx: 00ef
esi:    edi:    ebp: b888   esp: dcd5dfc4
ds: 0018   es: 0018   ss: 0018
Stack: 0100 081d76c0 b888 0001 002b 002b 008e 4014790e 0023 0292 b5e0 002b
Code: 00 00 05 20 df 2e c0 85 00 00 24 df 2e c0 0f 85 00 00 00 00

>>EIP; c0106ee6 <ret_from_sys_call+6/1a>   <=====
Code;  c0106ee6 <ret_from_sys_call+6/1a>
<_EIP>:
Code;  c0106ee6 <ret_from_sys_call+6/1a>   <=====
   0:   00 00                     add    %al,(%eax)   <=====
Code;  c0106ee8 <ret_from_sys_call+8/1a>
   2:   05 20 df 2e c0            add    $0xc02edf20,%eax
Code;  c0106eed <ret_from_sys_call+d/1a>
   7:   85 00                     test   %eax,(%eax)
Code;  c0106eef <ret_from_sys_call+f/1a>
   9:   00 24 df                  add    %ah,(%edi,%ebx,8)
Code;  c0106ef2 <ret_from_sys_call+12/1a>
   c:   2e c0 0f 85               rorb   $0x85,%cs:(%edi)

1 warning issued.  Results may not be reliable.
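One crude way to approach the "has the code been changed?" question is to take the bytes from the Oops "Code:" line and look for them in the bytes of the kernel image that was booted. The Python sketch below shows only that idea. The helper names `oops_code_bytes` and `find_in_image` are made up for illustration, and the "image" in the usage part is synthetic filler, not a real kernel. A miss would suggest that memory differs from the image, but it is not proof of a hardware fault.

```python
def oops_code_bytes(code_line):
    """Turn an Oops 'Code: 00 00 05 ...' line into a bytes object."""
    _, _, hexpart = code_line.partition("Code:")
    # Some Oops formats wrap the byte at EIP in <>; strip that if present.
    return bytes(int(token.strip("<>"), 16) for token in hexpart.split())


def find_in_image(image, code):
    """Return every offset at which the byte string `code` occurs in `image`."""
    offsets = []
    start = image.find(code)
    while start != -1:
        offsets.append(start)
        start = image.find(code, start + 1)
    return offsets


if __name__ == "__main__":
    # Code line copied from the Oops report above.
    line = "Code: 00 00 05 20 df 2e c0 85 00 00 24 df 2e c0 0f 85 00 00 00 00"
    code = oops_code_bytes(line)

    # Synthetic stand-in for a kernel image: filler bytes around the code.
    fake_image = b"\x90" * 16 + code + b"\x90" * 8
    print(find_in_image(fake_image, code))      # offsets of any matches
    print(find_in_image(b"\x90" * 64, code))    # no match in pure filler
```

With a real system, `image` would be the contents of the uncompressed kernel image read in binary mode. File offsets found this way are not virtual addresses, so they only say whether the sequence exists, not where it runs.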