Re: fsck_ufs dumps core
> On 12 Aug 2016, at 08:51, Konstantin Belousov wrote:
>
> On Wed, Aug 10, 2016 at 06:11:39PM +0300, Dmitry Sivachenko wrote:
>>
>>> On 10 Aug 2016, at 17:55, Konstantin Belousov wrote:
>>>
>>> On Wed, Aug 10, 2016 at 05:29:31PM +0300, Dmitry Sivachenko wrote:
>>>> Hello,
>>>>
>>>> I am running FreeBSD 10.3-STABLE #0 r299261M
>>>>
>>>> After unclean reboot I am unable to fsck my UFS filesystem:
>>>>
>>>> # fsck /dev/mfid0p1
>>>> ** /dev/mfid0p1
>>>> ** Last Mounted on /opt
>>>> ** Phase 1 - Check Blocks and Sizes
>>>> fsck: /dev/mfid0p1: Segmentation fault
>>>>
>>>> pid 482 (fsck_ufs), uid 0: exited on signal 11 (core dumped)
>>>>
>>>> # gdb -c fsck_ufs.482 /sbin/fsck_ufs
>>>> GNU gdb 6.1.1 [FreeBSD]
>>>> Copyright 2004 Free Software Foundation, Inc.
>>>> GDB is free software, covered by the GNU General Public License, and you are
>>>> welcome to change it and/or distribute copies of it under certain conditions.
>>>> Type "show copying" to see the conditions.
>>>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>>>> This GDB was configured as "amd64-marcel-freebsd"...
>>>> Core was generated by `fsck_ufs'.
>>>> Program terminated with signal 11, Segmentation fault.
>>>> Reading symbols from /lib/libufs.so.6...done.
>>>> Loaded symbols for /lib/libufs.so.6
>>>> Reading symbols from /lib/libc.so.7...done.
>>>> Loaded symbols for /lib/libc.so.7
>>>> Reading symbols from /libexec/ld-elf.so.1...done.
>>>> Loaded symbols for /libexec/ld-elf.so.1
>>>> #0 0x00409a8b in pass1 () at /place/WRK/src/sbin/fsck_ffs/pass1.c:83
>>>> 83 setbmap(i);
>>>> (gdb) bt
>>>> #0 0x00409a8b in pass1 () at /place/WRK/src/sbin/fsck_ffs/pass1.c:83
>>>> #1 0x00409050 in main (argc=, argv=) at /place/WRK/src/sbin/fsck_ffs/main.c:447
>>>> Current language: auto; currently minimal
>>>> (gdb)
>>>>
>>>
>>> Try to use alternative superblock (-b switch). You can get the list of
>>> the possible values for -b by 'newfs -N' invocation, but you have to know
>>> the parameters which were used for formatting.
>>
>>
>> Yes, I tried several different backup superblocks, with the same result. (I
>> created this FS few years ago so I can't be 100% sure about the parameters,
>> but I usually only use larger -i NN for big filesystems, and I can guess the
>> exact value examining df -ik).
>>
>>
>> BTW I just noticed that when I use larger values for backup superblock, it
>> reports an error which looks like overflow:
>>
>> # fsck_ufs -b 7437746112 /dev/mfid0p1
>> Alternate super block location: -1152188480
>> ** /dev/mfid0p1
>>
>> CANNOT SEEK BLK: -1152188480
>> CONTINUE? [yn]
>
> Well, it seems that your beginning of the volume got obliterated.
> Fsck_ffs cannot convert random sequence of bytes into the valid FFS
> volume.
>
> The only other way to try is to restore content of the cylinder groups
> which are farther away from the start. Create a scratch volume of the
> same size, newfs it with the same parameters. Then dd from the broken
> volume to the new one, with some offset. Offset should be large enough
> to not include initial superblock, and if the zero cg is damaged, skip
> it as well. You should use seek=n skip=n (i.e. the same initial offsets
> both for input and output).

Okay, then it was simpler for me to back up vital data from this volume and do
newfs on it (rather than dd 145TB of data).

But fsck_ufs -b still does not work (after a fresh newfs):

# fsck_ufs -b 343748128704 /dev/mfid0p1
Alternate super block location: 150745024
** /dev/mfid0p1
150745024 is not a file system superblock

343748128704 was taken from the output of the freshly made newfs.

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
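Editorial aside on the two "Alternate super block location" values above: both are what you get if the -b argument is cut down to a signed 32-bit integer. This is only an inference from the numbers in the messages, not a confirmed reading of the fsck_ffs source. The helper name wrap_int32 is made up for this illustration.

```python
def wrap_int32(value: int) -> int:
    """Keep the low 32 bits of value and reinterpret them as a signed integer."""
    low = value & 0xFFFFFFFF
    # Values with the top bit set become negative in two's complement.
    return low - (1 << 32) if low >= (1 << 31) else low


# The two -b arguments reported in the thread.
for requested in (7437746112, 343748128704):
    print(requested, "->", wrap_int32(requested))
```

If the assumption holds, any backup superblock located at or beyond 2^31 would be unreachable through -b on this version, which would fit the fact that the freshly made filesystem also fails.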
Re: fsck_ufs dumps core
> On 10 Aug 2016, at 17:55, Konstantin Belousov wrote:
>
> On Wed, Aug 10, 2016 at 05:29:31PM +0300, Dmitry Sivachenko wrote:
>> Hello,
>>
>> I am running FreeBSD 10.3-STABLE #0 r299261M
>>
>> After unclean reboot I am unable to fsck my UFS filesystem:
>>
>> # fsck /dev/mfid0p1
>> ** /dev/mfid0p1
>> ** Last Mounted on /opt
>> ** Phase 1 - Check Blocks and Sizes
>> fsck: /dev/mfid0p1: Segmentation fault
>>
>> pid 482 (fsck_ufs), uid 0: exited on signal 11 (core dumped)
>>
>> # gdb -c fsck_ufs.482 /sbin/fsck_ufs
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd"...
>> Core was generated by `fsck_ufs'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /lib/libufs.so.6...done.
>> Loaded symbols for /lib/libufs.so.6
>> Reading symbols from /lib/libc.so.7...done.
>> Loaded symbols for /lib/libc.so.7
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0 0x00409a8b in pass1 () at /place/WRK/src/sbin/fsck_ffs/pass1.c:83
>> 83 setbmap(i);
>> (gdb) bt
>> #0 0x00409a8b in pass1 () at /place/WRK/src/sbin/fsck_ffs/pass1.c:83
>> #1 0x00409050 in main (argc=, argv=) at /place/WRK/src/sbin/fsck_ffs/main.c:447
>> Current language: auto; currently minimal
>> (gdb)
>>
>
> Try to use alternative superblock (-b switch). You can get the list of
> the possible values for -b by 'newfs -N' invocation, but you have to know
> the parameters which were used for formatting.

Yes, I tried several different backup superblocks, with the same result. (I
created this FS few years ago so I can't be 100% sure about the parameters,
but I usually only use larger -i NN for big filesystems, and I can guess the
exact value examining df -ik).

BTW I just noticed that when I use larger values for backup superblock, it
reports an error which looks like overflow:

# fsck_ufs -b 7437746112 /dev/mfid0p1
Alternate super block location: -1152188480
** /dev/mfid0p1

CANNOT SEEK BLK: -1152188480
CONTINUE? [yn]
fsck_ufs dumps core
Hello,

I am running FreeBSD 10.3-STABLE #0 r299261M

After unclean reboot I am unable to fsck my UFS filesystem:

# fsck /dev/mfid0p1
** /dev/mfid0p1
** Last Mounted on /opt
** Phase 1 - Check Blocks and Sizes
fsck: /dev/mfid0p1: Segmentation fault

pid 482 (fsck_ufs), uid 0: exited on signal 11 (core dumped)

# gdb -c fsck_ufs.482 /sbin/fsck_ufs
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `fsck_ufs'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/libufs.so.6...done.
Loaded symbols for /lib/libufs.so.6
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x00409a8b in pass1 () at /place/WRK/src/sbin/fsck_ffs/pass1.c:83
83 setbmap(i);
(gdb) bt
#0 0x00409a8b in pass1 () at /place/WRK/src/sbin/fsck_ffs/pass1.c:83
#1 0x00409050 in main (argc=, argv=) at /place/WRK/src/sbin/fsck_ffs/main.c:447
Current language: auto; currently minimal
(gdb)
Re: Failed to write core file (error 14)
> On 19 May 2016, at 19:42, Erik wrote:
>
>
> On 05/19/2016 06:29 PM, Dmitry Sivachenko wrote:
>>
>>
>> gdb does not show stack:
>>
>> (gdb) bt
>> #0 0x000800bffb9b in ?? ()
>> Cannot access memory at address 0x7fffd588
>>
>> It started several months ago after OS update to fresh 10/stable (but I do
>> not remember details: which version was running before and from which
>> version it started).
>>
>> Does anyone observe something similar?
>
>
> This sounds like:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204426
> and
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204764
>
>
> This problem exists since 10.2.
> 10.1 is fine.

Oh, yes, thanks, somehow Google missed that for me.
Failed to write core file (error 14)
Hello,

On our 10-stable boxes processes sometimes crash with the following errors:

Failed to write core file for process check_ssh (error 14)
pid 81441 (check_ssh), uid 181: exited on signal 11
Failed to write core file for process nagios (error 14)
pid 30255 (nagios), uid 181: exited on signal 11
Failed to write core file for process sh (error 14)
pid 59267 (sh), uid 181: exited on signal 11
Failed to write core file for process nagios (error 14)
pid 99102 (nagios), uid 181: exited on signal 11

gdb does not show stack:

(gdb) bt
#0 0x000800bffb9b in ?? ()
Cannot access memory at address 0x7fffd588

It started several months ago after OS update to fresh 10/stable (but I do
not remember details: which version was running before and from which
version it started).

Does anyone observe something similar?

Thanks.
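Editorial aside: assuming the number in "error 14" is an errno value (the thread does not say so explicitly), it can be turned into a symbolic name with a few lines of Python. The function name describe_errno is made up for this sketch, and the description text comes from the local C library, so it may differ between systems.

```python
import errno
import os


def describe_errno(code: int) -> tuple:
    """Return (symbolic name, human-readable text) for an errno value."""
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


# Error number taken from the kernel messages quoted above.
print(14, describe_errno(14))
```

On FreeBSD and Linux, 14 maps to EFAULT ("Bad address").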
Re: nfs_getpages: error 4
> On 05 Mar 2016, at 23:10, Dmitry Sivachenko wrote:
>
>
>> On 05 Mar 2016, at 21:35, Konstantin Belousov wrote:
>>
>> But I suspect that you do have enough free or reclaimable pages for OOM
>> to not trigger, e.g. because you demonstrated commands output from the
>> live system after the situation occurred. It more likely was a temporary
>> free page shortage, after which the system recovered.
>>
>> I more believe in a bug in the handling of killed process in vm_fault().
>> Could you get the p_flag value for the hung process ? Like
>> ps -o flags
>
>
> Unfortunately I already rebooted this machine because our developers needed
> it and processes did not stop after kill -9.
>
> When this repeats, I will try to keep this server up for longer time and
> provide any necessary information.

So far I got the same error:

Mar 8 07:13:08 skazka4 kernel: nfs_getpages: error 4
Mar 8 07:13:08 skazka4 kernel: vm_fault: pager read error, pid 58483 (decodcmd)

But the process in question finished successfully.
Re: nfs_getpages: error 4
> On 05 Mar 2016, at 21:35, Konstantin Belousov wrote:
>
> But I suspect that you do have enough free or reclaimable pages for OOM
> to not trigger, e.g. because you demonstrated commands output from the
> live system after the situation occurred. It more likely was a temporary
> free page shortage, after which the system recovered.
>
> I more believe in a bug in the handling of killed process in vm_fault().
> Could you get the p_flag value for the hung process ? Like
> ps -o flags

Unfortunately I already rebooted this machine because our developers needed
it and processes did not stop after kill -9.

When this repeats, I will try to keep this server up for longer time and
provide any necessary information.
Re: nfs_getpages: error 4
> On 05 Mar 2016, at 19:27, Konstantin Belousov wrote:
>
> On Sat, Mar 05, 2016 at 05:24:26PM +0300, Dmitry Sivachenko wrote:
>>>
>>> Again, error 4 is EINTR so you could disable both "soft" and "intr" options
>>> for test.
>>
>>
>> "soft" is meaningless in such setup, because "file system calls will fail
>> after retrycnt round trip timeout intervals" but "The default is a retry
>> count of zero, which means to keep retrying forever".
>>
>> If I understand "intr" correctly, it matters only when server becomes
>> unresponsive, that is "server is not responding" message should be in my
>> logs. But I have no such a message.
>>
>>
>
> The intr NFS mount option allows signals to interrupt NFS waits for the
> RPC responses. This is almost certainly the reason for the EINTR error
> you get from the pager.
>
> You should at least get the
> vm_fault: pager read error, pid ...
> messages as well. Is this true ?

That is true, see my initial post.

> The end result would be SIGSEGV
> delivered to the process.
>
> OTOH, I do not quite understand why did your threads requesting page-in
> fall into the wait for a free page. I assume that there is enough free
> pages in the system ?
>

I have no swap configured, but it is possible that running processes eat all
RAM (I expect them to be killed with OOM rather than stuck?)
Re: nfs_getpages: error 4
> On 05 Mar 2016, at 17:01, Eugene Grosbein wrote:
>
> 05.03.2016 20:42, Dmitry Sivachenko wrote:
>
>>> and to discover what version is broken. And show full mount command/option
>>> set.
>> I already included mount flags from fstab in my original e-mail:
>>
>> rw,bg,intr,soft
>
> If that's only options you use, there is another workaround: add options
> rsize=1024,wsize=1024 to avoid possible packet reassembly/defragmentation
> related bugs.

I wonder how rsize=wsize=1024 will affect performance? I have a 10GBit network
and I expect to achieve comparable throughput.

>
> Again, error 4 is EINTR so you could disable both "soft" and "intr" options
> for test.

"soft" is meaningless in such setup, because "file system calls will fail
after retrycnt round trip timeout intervals" but "The default is a retry
count of zero, which means to keep retrying forever".

If I understand "intr" correctly, it matters only when server becomes
unresponsive, that is "server is not responding" message should be in my
logs. But I have no such a message.

> Anyway, re-read mount_nfs(8) manual page, section BUGS before switch to NFSv4.

That is why I chose to use NFSv3, I thought it is the more mature and stable
implementation.
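Editorial aside on the performance question above: a rough back-of-the-envelope sketch of how many read RPCs per second would be needed to fill a 10 Gbit/s link at a given rsize. It ignores protocol overhead, latency, and pipelining, so it only shows the order of magnitude. The 65536-byte comparison value is a hypothetical larger transfer size chosen for contrast, not a value taken from the thread, and rpcs_per_second is a made-up name.

```python
def rpcs_per_second(link_bits_per_second: int, transfer_size_bytes: int) -> int:
    """Whole RPCs per second needed to move a link's worth of payload."""
    bytes_per_second = link_bits_per_second // 8
    return bytes_per_second // transfer_size_bytes


LINK = 10_000_000_000  # 10 Gbit/s, as mentioned in the message

# 1024 is the suggested workaround value; 65536 is a hypothetical comparison.
for size in (1024, 65536):
    print(f"rsize={size}: ~{rpcs_per_second(LINK, size)} RPCs/s to saturate the link")
```

Under these simplifications the 1024-byte setting needs about 64 times as many requests for the same throughput, which is why the question about performance is a fair one.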
Re: nfs_getpages: error 4
> On 05 Mar 2016, at 16:33, Eugene Grosbein wrote:
>
> 05.03.2016 19:32, Dmitry Sivachenko wrote:
>
>>>> I am running a number of machines with /home mounted via nfs (FreeBSD
>>>> 10.3-PRERELEASE #0 r294799, rw,bg,intr,soft).
>>>>
>>>> Sometimes I get the following messages in syslog:
>>>>
>>>> nfs_getpages: error 4
>>>> vm_fault: pager read error, pid NNN (myprog)
>>>>
>>>> After that I see a lot of processes stuck in "pfault" state (these are
>>>> computational processes which use some files from NFS mount), they use 0%
>>>> of CPU after that.
>>>>
>>>> On NFS server machine I see nothing strange in logs. procstat -kk for
>>>> such stuck processes shows:
>>>> PID TID COMM TDNAME KSTACK
>>>> 85274 102056 myprog - mi_switch+0xbe
>>>> sleepq_wait+0x3a _sleep+0x287 vm_waitpfault+0x8a vm_fault_hold+0xdd0
>>>> vm_fault+0x77 trap_pfault+0x180 trap+0x52c calltrap+0x8
>>>>
>>>>
>>>> What can be the reason of this?
>>>
>>> For example, if some processes running on NFS server box modify some files
>>> "in-place"
>>> and these files are opened by processes running on NFS client, that could
>>> be the reason.
>>> If so, change this so processes updating such files create new temporary
>>> versions of them first
>>> and then rename them atomically.
>>>
>>
>> This should not be the case: users are working only on NFS clients.
>> Moreover, the nature of computations is so that each process uses its own
>> set of files.
>>
>> (Forgot to mention in my previous e-mail that these processes can't be
>> stopped even with kill -9)
>
> Make sure you use TCP mounts and TSO is disabled.

I do use TCP mount (this is the default). I will try to disable TSO.

> Try switching between NFSv3/NFSv4 to avoid this bug

As far as I understand, the default is NFSv3 (which should be more stable?).
I can try to switch to NFSv4.

> and to discover what version is broken. And show full mount command/option
> set.

I already included mount flags from fstab in my original e-mail:

rw,bg,intr,soft
Re: nfs_getpages: error 4
> On 05 Mar 2016, at 15:13, Eugene Grosbein wrote:
>
> 05.03.2016 18:21, Dmitry Sivachenko wrote:
>> Hello,
>>
>> I am running a number of machines with /home mounted via nfs (FreeBSD
>> 10.3-PRERELEASE #0 r294799, rw,bg,intr,soft).
>>
>> Sometimes I get the following messages in syslog:
>>
>> nfs_getpages: error 4
>> vm_fault: pager read error, pid NNN (myprog)
>>
>> After that I see a lot of processes stuck in "pfault" state (these are
>> computational processes which use some files from NFS mount), they use 0% of
>> CPU after that.
>>
>> On NFS server machine I see nothing strange in logs. procstat -kk for such
>> stuck processes shows:
>> PID TID COMM TDNAME KSTACK
>> 85274 102056 myprog - mi_switch+0xbe
>> sleepq_wait+0x3a _sleep+0x287 vm_waitpfault+0x8a vm_fault_hold+0xdd0
>> vm_fault+0x77 trap_pfault+0x180 trap+0x52c calltrap+0x8
>>
>>
>> What can be the reason of this?
>
> For example, if some processes running on NFS server box modify some files
> "in-place"
> and these files are opened by processes running on NFS client, that could be
> the reason.
> If so, change this so processes updating such files create new temporary
> versions of them first
> and then rename them atomically.
>

This should not be the case: users are working only on NFS clients.
Moreover, the nature of computations is so that each process uses its own
set of files.

(Forgot to mention in my previous e-mail that these processes can't be
stopped even with kill -9)
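Editorial aside: the "create a temporary version, then rename atomically" advice quoted above can be sketched as follows. This is a generic illustration, not code from the thread; the function name replace_atomically and the ".tmp-" prefix are made up. It relies on rename within one directory being atomic on POSIX filesystems, so readers holding the old file open keep seeing the old contents.

```python
import os
import tempfile


def replace_atomically(path: str, data: bytes) -> None:
    """Write data to a temporary file next to path, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The contrast is with opening the existing file, truncating it, and writing new data, which changes the bytes under any process that already has the file open or mapped.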
nfs_getpages: error 4
Hello,

I am running a number of machines with /home mounted via nfs (FreeBSD
10.3-PRERELEASE #0 r294799, rw,bg,intr,soft).

Sometimes I get the following messages in syslog:

nfs_getpages: error 4
vm_fault: pager read error, pid NNN (myprog)

After that I see a lot of processes stuck in "pfault" state (these are
computational processes which use some files from NFS mount), they use 0% of
CPU after that.

On NFS server machine I see nothing strange in logs. procstat -kk for such
stuck processes shows:

PID TID COMM TDNAME KSTACK
85274 102056 myprog - mi_switch+0xbe sleepq_wait+0x3a _sleep+0x287 vm_waitpfault+0x8a vm_fault_hold+0xdd0 vm_fault+0x77 trap_pfault+0x180 trap+0x52c calltrap+0x8

What can be the reason of this?

Thanks.
Re: Regression on 10/STABLE (Was: SOL_TCP def?)
> On 29 Dec 2015, at 23:48, Jonathan Chen wrote:
>
> On 30 December 2015 at 07:50, Dmitry Sivachenko wrote:
>>> Patch gsoap to use IPPROTO_IP instead of SOL_TCP.
>> I meant IPPROTO_TCP, sorry.
>
> Thanks for the quick-fix. However, IMHO this should be classed as a
> regression on 10/STABLE.
>

I think it is non-portable code in gsoap, which was revealed by introduction
of TCP_FASTOPEN to 10/stable.
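Editorial aside: the portable way to address TCP-level socket options is to pass IPPROTO_TCP as the level, as suggested in the thread. gsoap itself is C/C++; the sketch below only illustrates the same setsockopt level argument through Python's socket module. TCP_NODELAY is used as a stand-in option chosen for this example, not the option gsoap sets, and enable_nodelay is a made-up name.

```python
import socket


def enable_nodelay(sock: socket.socket) -> bool:
    """Set TCP_NODELAY using the portable IPPROTO_TCP level and read it back."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


# No connection is needed to set or query the option.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print("TCP_NODELAY enabled:", enable_nodelay(s))
```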
Re: SOL_TCP def?
> On 29 Dec 2015, at 21:49, Dmitry Sivachenko wrote:
>
>
>> On 29 Dec 2015, at 21:38, Jonathan Chen wrote:
>>
>> On 30 December 2015 at 07:28, Jonathan Chen wrote:
>> [...]
>>> devel/gsoap will build on older version. My installed devel/gsoap was
>>> last installed on 6-Dec-2015.
>>
>> Rephrasing: devel/gsoap will build on an older snapshot of 10/STABLE.
>> My currently installed devel/gsoap was last installed on 6-Dec-2015.
>> --
>
>
> Patch gsoap to use IPPROTO_IP instead of SOL_TCP.
>

I meant IPPROTO_TCP, sorry.
Re: SOL_TCP def?
> On 29 Dec 2015, at 21:38, Jonathan Chen wrote:
>
> On 30 December 2015 at 07:28, Jonathan Chen wrote:
> [...]
>> devel/gsoap will build on older version. My installed devel/gsoap was
>> last installed on 6-Dec-2015.
>
> Rephrasing: devel/gsoap will build on an older snapshot of 10/STABLE.
> My currently installed devel/gsoap was last installed on 6-Dec-2015.
> --

Patch gsoap to use IPPROTO_IP instead of SOL_TCP.
Re: process scheduling and cpuset
> On 13 Sep 2015, at 19:40, Slawa Olhovchenkov wrote:
>
> On Sun, Sep 13, 2015 at 04:44:40PM +0300, Dmitry Sivachenko wrote:
>
>>
>>> On 13 Sep 2015, at 16:09, Slawa Olhovchenkov wrote:
>>>
>>> On Sun, Sep 13, 2015 at 02:52:08PM +0300, Dmitry Sivachenko wrote:
>>>
>>>> Hello,
>>>>
>>>> I have 32 processor machine (2x CPU E5-2650) running several CPU-bound
>>>> processes (ULE scheduler).
>>>> 3 processes are 32-threaded, and 8 are single threaded.
>>>>
>>>> I bind all 3 32-threaded processes to CPUs 0-24 (cpuset -C -l 0-24 -p XXX).
>>>>
>>>> I expect that the remaining 8 single-threaded processes will (mostly) run
>>>> on the remaining 25-31 CPU cores and use (almost) 100% cpu each.
>>>>
>>>> But this is not the case (according to top(1)): they spend a lot of time
>>>> on 0-24 CPUs and CPU Idle time is about 10%.
>>>>
>>>> These are all purely computational programs, in idle system
>>>> single-threaded programs steadily consume 100% of a core, and 32-threaded
>>>> programs consume all 32 cores and idle time is zero.
>>>>
>>>> Is it an ULE scheduler feature or am I doing something wrong?
>>>>
>>>> The goal is to give a single-threaded program a chance to run when
>>>> somebody started several 32-threaded processes.
>>>
>>> You don't have a 32 processor machine, you have only a 16 processor
>>> machine.
>>> SMT/hyperthreading doesn't give you a real processor: an SMT "CPU" has
>>> unpredictable power, and its load depends on the load of its parent CPU.
>>>
>>> For example, in my case I see this picture (simplified) on CPU 0 and 1
>>> (SMT of one real core) with rising load:
>>>
>>> load 0.1 0.1
>>> load 0.2 0.2
>>> load 0.3 0.3
>>> load 0.4 0.4
>>> load 0.45 0.45
>>> load 0.48 0.48
>>> load 1.00 1.00\
>>
>>
>> Yes I know about HT. But how does this explain why I have 10% of CPU idle?
>>
>> If I explicitly bind my single-threaded processes to the remaining CPU cores
>> (25-31), they start to receive expected 100% of CPU and overall Idle
>> decreases.
>>
>> I just expect scheduler to do the same for me.
>>
>
> Idle is not the goal; the goal is lessening task execution time.

Thanks for the explanation.

In my example SMT pairs are numbered with sequential numbers, so 0+1 is one
SMT group, 2+3 is the second SMT group, and so on.

So in the 25-31 range there are several real CPU cores which remain idle
while processes are fighting for the overloaded 0-24.

When I explicitly pin my single-threaded processes to the 25-31 range, they
start to receive 100% of CPU (and finish faster, to be clear).
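Editorial aside: under the numbering described above (SMT siblings are logical CPUs 2k and 2k+1, an assumption taken from the message and not verified against the actual topology), the two cpuset ranges can be mapped to physical cores with a short sketch. The names core_of and cores_by_coverage are made up for this illustration.

```python
def core_of(cpu: int) -> int:
    """Physical core index, assuming siblings are numbered 2k and 2k+1."""
    return cpu // 2


def cores_by_coverage(cpus):
    """Split the cores touched by cpus into fully and partially covered ones."""
    counts = {}
    for cpu in cpus:
        core = core_of(cpu)
        counts[core] = counts.get(core, 0) + 1
    full = sorted(core for core, n in counts.items() if n == 2)
    partial = sorted(core for core, n in counts.items() if n == 1)
    return full, partial


print("CPUs 0-24: ", cores_by_coverage(range(0, 25)))
print("CPUs 25-31:", cores_by_coverage(range(25, 32)))
```

With this numbering the 0-24 range covers twelve whole cores plus one thread of core 12, and 25-31 holds three whole cores (13 to 15) plus the other thread of core 12, so the boundary between the two sets splits one SMT pair.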
Re: process scheduling and cpuset
> On 13 Sep 2015, at 16:09, Slawa Olhovchenkov wrote:
>
> On Sun, Sep 13, 2015 at 02:52:08PM +0300, Dmitry Sivachenko wrote:
>
>> Hello,
>>
>> I have 32 processor machine (2x CPU E5-2650) running several CPU-bound
>> processes (ULE scheduler).
>> 3 processes are 32-threaded, and 8 are single threaded.
>>
>> I bind all 3 32-threaded processes to CPUs 0-24 (cpuset -C -l 0-24 -p XXX).
>>
>> I expect that the remaining 8 single-threaded processes will (mostly) run on
>> the remaining 25-31 CPU cores and use (almost) 100% cpu each.
>>
>> But this is not the case (according to top(1)): they spend a lot of time on
>> 0-24 CPUs and CPU Idle time is about 10%.
>>
>> These are all purely computational programs, in idle system single-threaded
>> programs steadily consume 100% of a core, and 32-threaded programs consume
>> all 32 cores and idle time is zero.
>>
>> Is it an ULE scheduler feature or am I doing something wrong?
>>
>> The goal is to give a single-threaded program a chance to run when somebody
>> started several 32-threaded processes.
>
> You don't have a 32 processor machine, you have only a 16 processor
> machine.
> SMT/hyperthreading doesn't give you a real processor: an SMT "CPU" has
> unpredictable power, and its load depends on the load of its parent CPU.
>
> For example, in my case I see this picture (simplified) on CPU 0 and 1
> (SMT of one real core) with rising load:
>
> load 0.1 0.1
> load 0.2 0.2
> load 0.3 0.3
> load 0.4 0.4
> load 0.45 0.45
> load 0.48 0.48
> load 1.00 1.00\

Yes I know about HT. But how does this explain why I have 10% of CPU idle?

If I explicitly bind my single-threaded processes to the remaining CPU cores
(25-31), they start to receive the expected 100% of CPU and overall Idle
decreases.

I just expect the scheduler to do the same for me.
process scheduling and cpuset
Hello,

I have 32 processor machine (2x CPU E5-2650) running several CPU-bound
processes (ULE scheduler).
3 processes are 32-threaded, and 8 are single threaded.

I bind all 3 32-threaded processes to CPUs 0-24 (cpuset -C -l 0-24 -p XXX).

I expect that the remaining 8 single-threaded processes will (mostly) run on
the remaining 25-31 CPU cores and use (almost) 100% cpu each.

But this is not the case (according to top(1)): they spend a lot of time on
0-24 CPUs and CPU Idle time is about 10%.

These are all purely computational programs, in idle system single-threaded
programs steadily consume 100% of a core, and 32-threaded programs consume
all 32 cores and idle time is zero.

Is it an ULE scheduler feature or am I doing something wrong?

The goal is to give a single-threaded program a chance to run when somebody
started several 32-threaded processes.

Thanks!
Re: panic: wm_page_unwire
> On 20 Jun 2015, at 13:01, Konstantin Belousov wrote:
>
>
> I was able to reproduce something related, this may be very well your
> problem. Take the attached program. Select a scratch file on UFS mount
> point, say x. Run the following commands:
> mlock_modify x&
> dd if=/dev/zero of=x bs=1 count=1
> fg
> ^C <- system might panic at this point, if buffers are in short supply
> dd if=/dev/zero of=x bs=1 count=1 <- at this point, the system must panic

Yes, those are exactly the two cases in which I was able to reproduce a
panic, so it is apparently my issue.

I tried your patch and I can confirm that it does fix the problem.

Thanks!
Re: panic: wm_page_unwire
> On 19 Jun 2015, at 22:57, Dmitry Sivachenko wrote:
>
> Hello,
>
> got this panic today on my 10.1-STABLE #0 r279956 box:
>
>

Well, I tracked this down a bit.

There is a rather easy way to panic a -stable box (mine is r279956), but I
can't reliably reproduce it.

It happens when there is a process running which mmap()+mlock() some file,
and while it is running this file is modified on disk (not rm+mv, but open
the same file, truncate and write some other data into it).

After the process exits, the system will panic with high probability.

So far I got 2 cases:

1) run process which mlock()'s a file; modify that file; stop process and
system panics
2) run process which mlock()'s a file; modify that file; stop process [no
panic so far]; modify that file again and system panics.

Panic message is the same:

panic: vm_page_unwire: page 's wire count is zero
panic: wm_page_unwire
Hello,

got this panic today on my 10.1-STABLE #0 r279956 box:
Re: pkg 1.5.0 is out
> On 14 Apr 2015, at 23:05, Baptiste Daroussin wrote:
>
> Final pkg 1.5.0 has been released.
>

Thanks a lot for working on pkg!

>
> For pkg 1.6.0 among other things and depending on the time, here is what we do
> plan to work on:
> -
>

What I really miss is support for package "profiles": an ability to build
the same port with different combinations of OPTIONS.

For example:
- minimal nginx version;
- nginx version with passenger module (for puppet server);
- nginx version with some other rare options turned on for a custom
  application.

Right now I achieve this by manually renaming /var/db/ports/*/options files
and some manipulations in /usr/ports/packages/All.

But a framework to handle this automatically would be very useful.

Thanks.
Re: dev.cpu.0.freq disapeared
> On 23 Mar 2015, at 9:03, Ian Smith wrote:
>
>
> Do you have Enhanced Speedstep (EST), disabled in your BIOS settings?
> If so, just turn it on. Then you should also be able to set running
> frequency to 'MAX performance' or similar there.
>
> If not disabled, ie you have EST enabled in BIOS, that points to a real
> issue of EST detection. And it still seems strange that enabling p4tcc
> is enough to have cpufreq(4) include OIDs for freq and freq_levels?
>

Thanks to all who replied.

This is called Intel SpeedStep Tech in that BIOS and it was indeed disabled.

I enabled it and now I have in dmesg

est0: on cpu0

even with hint.p4tcc.0.disabled="1" for each CPU, and dev.cpu.0.freq has
appeared again.

Thanks for your help.
Re: dev.cpu.0.freq disapeared
> On 22 March 2015, at 17:11, Ian Smith wrote:
>
> Dmitry Sivachenko wrote:
>>> On 22 March 2015, at 8:53, Peter Jeremy wrote:
>>> On 2015-Mar-22 00:58:55 +0300, Dmitry Sivachenko wrote:
>>>> I have a machine with the following processor:
>>>> CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2400.14-MHz K8-class CPU)
>>>> Origin="GenuineIntel" Id=0x206c2 Family=0x6 Model=0x2c Stepping=2
>>> ...
>>>> After I upgraded to 10.1-STABLE #0 r279956, this sysctl disappeared.
>>>> % sysctl dev.cpu.0.freq
>>>> sysctl: unknown oid 'dev.cpu.0.freq': No such file or directory
>>>> %
>
>>> What OIDs do you have? Does dev.cpu.0 exist? How about dev.cpu?
>> dev.cpu.0 does exist.
>
> It could be helpful to show all of:
>
> % sysctl dev.cpu
> % sysctl dev.est # if you have that?
> % sysctl -a | grep freq | grep -v time
>
> both before and after re-enabling p4tcc.

Hello,

With #hint.p4tcc.0.disabled="1" commented out:

% sysctl dev.cpu
dev.cpu.%parent:
dev.cpu.0.%desc: ACPI CPU
dev.cpu.0.%driver: cpu
dev.cpu.0.%location: handle=\_PR_.P001
dev.cpu.0.%pnpinfo: _HID=none _UID=0
dev.cpu.0.%parent: acpi0
dev.cpu.0.coretemp.delta: 67
dev.cpu.0.coretemp.resolution: 1
dev.cpu.0.coretemp.tjmax: 95.0C
dev.cpu.0.coretemp.throttle_log: 0
dev.cpu.0.temperature: 28.0C
dev.cpu.0.freq: 2400
dev.cpu.0.freq_levels: 2400/-1 2100/-1 1800/-1 1500/-1 1200/-1 900/-1 600/-1 300/-1
dev.cpu.0.cx_supported: C1/1/32 C2/3/96 C3/3/128
dev.cpu.0.cx_lowest: C1
dev.cpu.0.cx_usage: 100.00% 0.00% 0.00% last 261us
dev.cpu.1.%desc: ACPI CPU
dev.cpu.1.%driver: cpu
dev.cpu.1.%location: handle=\_PR_.P002
dev.cpu.1.%pnpinfo: _HID=none _UID=0
dev.cpu.1.%parent: acpi0
dev.cpu.1.coretemp.delta: 67
dev.cpu.1.coretemp.resolution: 1
dev.cpu.1.coretemp.tjmax: 95.0C
dev.cpu.1.coretemp.throttle_log: 0
dev.cpu.1.temperature: 28.0C
dev.cpu.1.cx_supported: C1/1/32 C2/3/96 C3/3/128
dev.cpu.1.cx_lowest: C1
dev.cpu.1.cx_usage: 100.00% 0.00% 0.00% last 71201us
dev.cpu.2.%desc: ACPI CPU
dev.cpu.2.%driver: cpu
dev.cpu.2.%location: handle=\_PR_.P003
dev.cpu.2.%pnpinfo: _HID=none _UID=0
dev.cpu.2.%parent: acpi0
dev.cpu.2.coretemp.delta: 62
dev.cpu.2.coretemp.resolution: 1
dev.cpu.2.coretemp.tjmax: 95.0C
dev.cpu.2.coretemp.throttle_log: 0
dev.cpu.2.temperature: 33.0C
dev.cpu.2.cx_supported: C1/1/32 C2/3/96 C3/3/128
dev.cpu.2.cx_lowest: C1
dev.cpu.2.cx_usage: 100.00% 0.00% 0.00% last 124614us
dev.cpu.3.%desc: ACPI CPU
dev.cpu.3.%driver: cpu
dev.cpu.3.%location: handle=\_PR_.P004
dev.cpu.3.%pnpinfo: _HID=none _UID=0
dev.cpu.3.%parent: acpi0
dev.cpu.3.coretemp.delta: 62
dev.cpu.3.coretemp.resolution: 1
dev.cpu.3.coretemp.tjmax: 95.0C
dev.cpu.3.coretemp.throttle_log: 0
dev.cpu.3.temperature: 33.0C
dev.cpu.3.cx_supported: C1/1/32 C2/3/96 C3/3/128
dev.cpu.3.cx_lowest: C1
dev.cpu.3.cx_usage: 100.00% 0.00% 0.00% last 101864us
dev.cpu.4.%desc: ACPI CPU
dev.cpu.4.%driver: cpu
dev.cpu.4.%location: handle=\_PR_.P005
dev.cpu.4.%pnpinfo: _HID=none _UID=0
dev.cpu.4.%parent: acpi0
dev.cpu.4.coretemp.delta: 62
dev.cpu.4.coretemp.resolution: 1
dev.cpu.4.coretemp.tjmax: 95.0C
dev.cpu.4.coretemp.throttle_log: 0
dev.cpu.4.temperature: 33.0C
dev.cpu.4.cx_supported: C1/1/32 C2/3/96 C3/3/128
dev.cpu.4.cx_lowest: C1
dev.cpu.4.cx_usage: 100.00% 0.00% 0.00% last 127376us
dev.cpu.5.%desc: ACPI CPU
dev.cpu.5.%driver: cpu
dev.cpu.5.%location: handle=\_PR_.P006
dev.cpu.5.%pnpinfo: _HID=none _UID=0
dev.cpu.5.%parent: acpi0
dev.cpu.5.coretemp.delta: 62
dev.cpu.5.coretemp.resolution: 1
dev.cpu.5.coretemp.tjmax: 95.0C
dev.cpu.5.coretemp.throttle_log: 0
dev.cpu.5.temperature: 33.0C
dev.cpu.5.cx_supported: C1/1/32 C2/3/96 C3/3/128
dev.cpu.5.cx_lowest: C1
dev.cpu.5.cx_usage: 100.00% 0.00% 0.00% last 107493us
dev.cpu.6.%desc: ACPI CPU
dev.cpu.6.%driver: cpu
dev.cpu.6.%location: handle=\_PR_.P007
dev.cpu.6.%pnpinfo: _HID=none _UID=0
dev.cpu.6.%parent: acpi0
dev.cpu.6.coretemp.delta: 63
dev.cpu.6.coretemp.resolution: 1
dev.cpu.6.coretemp.tjmax: 95.0C
dev.cpu.6.coretemp.throttle_log: 0
dev.cpu.6.temperature: 32.0C
dev.cpu.6.cx_supported: C1/1/32 C2/3/96 C3/3/128
dev.cpu.6.cx_lowest: C1
dev.cpu.6.cx_usage: 100.00% 0.00% 0.00% last 155573us
dev.cpu.7.%desc: ACPI CPU
dev.cpu.7.%driver: cpu
dev.cpu.7.%location: handle=\_PR_.P008
dev.cpu.7.%pnpinfo: _HID=none _UID=0
dev.cpu.7.%parent: acpi0
dev.cpu.7.coretemp.delta: 63
dev.cpu.7.coretemp.resolution: 1
dev.cpu.7.coretemp.tjmax: 95.0C
dev.cpu.7.coretemp.throttle_log: 0
dev.cpu.7.temperature: 32.0C
dev.cpu.7.cx_supported: C1/1/32 C2/3/96 C3/3/128
dev.cpu.7.cx_lowest: C1
dev.cpu.7.cx_usage: 100.00% 0.00% 0.00% last 32278us
dev.cpu.8.%desc: ACPI CPU
dev.cpu.8.%driver: cpu
dev.cpu.8.%location: handle=\_PR_.P009
dev.cpu.8.%pnpinfo: _HID=none _UID=0
dev.cpu.8.%parent: acpi0
dev.cpu.8.coretemp.delta: 72
dev.cpu.8.coretemp.resolution: 1
dev.cpu.8.co
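In a dump like the one above, the freq and freq_levels OIDs show up under dev.cpu.0 only, which is easy to miss among the other entries. A small sketch for pulling just those lines out of the output, assuming a POSIX sh and one OID per line as sysctl prints it; list_freq_oids is a made-up helper name, not a FreeBSD tool:

```shell
#!/bin/sh
# list_freq_oids: read `sysctl dev.cpu` output on stdin and print only
# the per-CPU frequency OIDs (dev.cpu.N.freq and dev.cpu.N.freq_levels).
# The function name is hypothetical; only grep is used.
list_freq_oids() {
    grep -E '^dev\.cpu\.[0-9]+\.freq(_levels)?:'
}
```

It could be used as `sysctl dev.cpu | list_freq_oids`, once with the p4tcc hint in place and once without, to compare the two states.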
Re: dev.cpu.0.freq disapeared
> On 22 March 2015, at 8:53, Peter Jeremy wrote:
>
> On 2015-Mar-22 00:58:55 +0300, Dmitry Sivachenko wrote:
>> I have a machine with the following processor:
>>
>> CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2400.14-MHz K8-class CPU)
>> Origin="GenuineIntel" Id=0x206c2 Family=0x6 Model=0x2c Stepping=2
> ...
>> After I upgraded to 10.1-STABLE #0 r279956, this sysctl disappeared.
>> % sysctl dev.cpu.0.freq
>> sysctl: unknown oid 'dev.cpu.0.freq': No such file or directory
>> %
>
> What OIDs do you have? Does dev.cpu.0 exist? How about dev.cpu?

dev.cpu.0 does exist.

I found the problematic change:

Author: nwhitehorn
Date: Sun Jan 11 17:10:07 2015
New Revision: 276986
URL: https://svnweb.freebsd.org/changeset/base/276986

Log:
  MFC r265329:
  Disable ACPI and P4TCC throttling by default, following discussion on
  freebsd-current. These CPU speed control techniques are usually unhelpful
  at best. For now, continue building the relevant code into GENERIC so that
  it can trivially be re-enabled at runtime if anyone wants it.

Modified:
  stable/10/sys/amd64/conf/GENERIC.hints

==============================================================================
--- stable/10/sys/amd64/conf/GENERIC.hints Sun Jan 11 17:00:24 2015 (r276985)
+++ stable/10/sys/amd64/conf/GENERIC.hints Sun Jan 11 17:10:07 2015 (r276986)
@@ -31,3 +31,5 @@ hint.attimer.0.at="isa"
 hint.attimer.0.port="0x40"
 hint.attimer.0.irq="0"
 hint.wbwd.0.at="isa"
+hint.acpi_throttle.0.disabled="1"
+hint.p4tcc.0.disabled="1"

If I remove hint.p4tcc.0.disabled="1" from device.hints, dev.cpu.0.freq reappears.

I use dev.cpu.0.freq to verify that the processor is running at the expected frequency (with some buggy BIOSes, or buggy combinations of BIOS options, a machine can end up running at half frequency).

Does it really hurt to have this sysctl available? Why was it disabled by default?
(I am not discussing hint.acpi_throttle.0.disabled here, only hint.p4tcc.0.disabled.)

Thanks.
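The frequency check described above could be scripted roughly as follows. This is a minimal sketch, assuming a POSIX sh and that dev.cpu.0.freq reports MHz as in the output quoted in this thread; the helper name freq_ok and the 50 MHz default tolerance are illustrative assumptions, not anything FreeBSD provides.

```shell
#!/bin/sh
# freq_ok MEASURED EXPECTED [TOLERANCE]
# Exit status 0 when MEASURED is within TOLERANCE MHz of EXPECTED,
# non-zero otherwise. All three arguments are integers in MHz.
# freq_ok is a hypothetical helper; the tolerance default is arbitrary.
freq_ok() {
    measured=$1
    expected=$2
    tolerance=${3:-50}
    diff=$((measured - expected))
    # Take the absolute value of the difference.
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$tolerance" ]
}
```

On the E5620 machine it might be called as `freq_ok "$(sysctl -n dev.cpu.0.freq)" 2400 || echo "CPU is not at the expected frequency"`. A machine stuck at half frequency (1200 against an expected 2400) would fail the check.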
Re: dev.cpu.0.freq disapeared
> On 22 March 2015, at 3:27, Kevin Oberman wrote:
>
> # uname -a
> FreeBSD rogue 10.1-STABLE FreeBSD 10.1-STABLE #0 r280293: Fri Mar 20 11:28:08 PDT 2015 root@rogue:/usr/obj/usr/src/sys/GENERIC amd64
> # sysctl dev.cpu.0.freq
> dev.cpu.0.freq: 2501
> #
> No idea why it is not working for you. I'm guessing that something is not
> starting up properly, but I have no idea what.

This problem seems to be processor-specific: I have a lot of E5-2660 machines which do not suffer from this issue.
dev.cpu.0.freq disapeared
Hello!

I have a machine with the following processor:

CPU: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz (2400.14-MHz K8-class CPU)
Origin="GenuineIntel" Id=0x206c2 Family=0x6 Model=0x2c Stepping=2

When running 10.1-STABLE #5 r276908 I have:

% sysctl dev.cpu.0.freq
dev.cpu.0.freq: 2400
%

After I upgraded to 10.1-STABLE #0 r279956, this sysctl disappeared.

% sysctl dev.cpu.0.freq
sysctl: unknown oid 'dev.cpu.0.freq': No such file or directory
%

I did not change the kernel config file. What can be the cause of this problem?

Thanks.
Re: 9-STABLE panic on intensive fork
On 29.08.2013, at 22:45, Konstantin Belousov wrote:

> On Wed, Aug 28, 2013 at 06:20:29PM +0400, Dmitry Sivachenko wrote:
>> Hello!
>>
>> I am using very recent FreeBSD-9-STABLE snapshot:
>> 9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #0 r254986: Wed Aug 28 17:18:57 MSK 2013
>>
>> I run uwsgi program (ports/www/uwsgi) on that machine.
>>
>> When uwsgi starts, it forks pre-configured number of worker processes.
>> If I raise workers parameter high enough (128), I get kernel panic (100%
>> reproducible):
>>
>> Fatal trap 12: page fault while in kernel mode
>>
>> If I compile kernel with KDB enabled, I get the following stack:
>>
>> pmap_demote_pde_locked()
>> pmap_copy()
>> vmspace_fork()
>> fork1()
>> sys_fork()
>>
>> I have only remote console for that machine, so I made 2 screenshots:
>>
>> 1) http://people.freebsd.org/~demon/screen1.jpg
>> Panic screen when kernel has no KDB support compiled in
>>
>> 2) http://people.freebsd.org/~demon/screen2.jpg
>> Panic screen (2nd part) with the above stack shown.
>
> Look up the source line for the pmap_demote_pde_locked()+0x471 for your
> kernel. Dump the core from the panic.

A kernel dump is not generated (although it is configured at boot); there is no "Dumping" message on the console. These screenshots show everything I see on the console.

I investigated this further: I have several (14) identically configured machines running exactly the same software. The hardware differs a bit, though. I tried to analyze the motherboard differences but failed to find anything common to the affected machines. Under the conditions described in my initial e-mail, some of them crash (in exactly the same way) and some do not.

I am confident there are no hardware problems: these machines run for months without a reboot, and so far this is the only way I have found to crash them.

I updated one of the affected servers to 10-CURRENT, and it no longer crashes under the same usage scenario.
9-STABLE panic on intensive fork
Hello!

I am using a very recent FreeBSD-9-STABLE snapshot:
9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #0 r254986: Wed Aug 28 17:18:57 MSK 2013

I run the uwsgi program (ports/www/uwsgi) on that machine.

When uwsgi starts, it forks a pre-configured number of worker processes.
If I raise the workers parameter high enough (128), I get a kernel panic (100% reproducible):

Fatal trap 12: page fault while in kernel mode

If I compile the kernel with KDB enabled, I get the following stack:

pmap_demote_pde_locked()
pmap_copy()
vmspace_fork()
fork1()
sys_fork()

I have only a remote console for that machine, so I made 2 screenshots:

1) http://people.freebsd.org/~demon/screen1.jpg
Panic screen when the kernel has no KDB support compiled in

2) http://people.freebsd.org/~demon/screen2.jpg
Panic screen (2nd part) with the above stack shown.

I can provide any additional information if needed.

Thanks!
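To exercise the fork path without uwsgi, something like the following could be tried. It is only a rough sketch under stated assumptions: spawn_workers is a made-up helper, it forks trivial subshells rather than real uwsgi workers with their memory footprint, and nothing in this thread shows that it triggers the panic described above.

```shell
#!/bin/sh
# spawn_workers N
# Start N background subshells (the shell forks once per subshell),
# wait for each of them, and print how many exited with status 0.
# Hypothetical helper for illustration; it is not part of uwsgi.
spawn_workers() {
    n=$1
    i=0
    pids=""
    while [ "$i" -lt "$n" ]; do
        ( sleep 1 ) &            # keep the child alive briefly, like an idle worker
        pids="$pids $!"
        i=$((i + 1))
    done
    ok=0
    for pid in $pids; do
        if wait "$pid"; then
            ok=$((ok + 1))
        fi
    done
    echo "$ok"
}
```

Calling `spawn_workers 128` mirrors the worker count from the report; the printed number is how many children finished cleanly.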
Re: RELENG_7 if_nve panic
On Tue, Jan 26, 2010 at 09:49:45AM -0500, John Baldwin wrote:
> On Tuesday 26 January 2010 4:29:05 am Dmitry Sivachenko wrote:
> > Hello!
> >
> > I recompiled recent RELENG_7 and I get the following panic after
> > trying to kldload if_nve (interesting stack frames are 12, 13, 14 I guess).
> > Previous version of RELENG_7 (compiled in the middle of December)
> > worked fine. Last few days I was trying to re-cvsup and always get the
> > same panic. I get FreeBSD sources via cvsup (cvsup5.freebsd.org).
> >
> > Any suggestions?
> >
> > Thanks in advance!
>
> The bug is perhaps in e1000phy in that it expects all callers to have called
> if_initname() before the miibus is probed. Try this patch:

That patch solves the problem, thanks!

>
> Index: if_nve.c
> ===================================================================
> --- if_nve.c (revision 202705)
> +++ if_nve.c (working copy)
> @@ -526,14 +526,6 @@
>          goto fail;
>      }
>
> -    /* Probe device for MII interface to PHY */
> -    DEBUGOUT(NVE_DEBUG_INIT, "nve: do mii_phy_probe\n");
> -    if (mii_phy_probe(dev, &sc->miibus, nve_ifmedia_upd, nve_ifmedia_sts)) {
> -        device_printf(dev, "MII without any phy!\n");
> -        error = ENXIO;
> -        goto fail;
> -    }
> -
>      /* Setup interface parameters */
>      ifp->if_softc = sc;
>      if_initname(ifp, device_get_name(dev), device_get_unit(dev));
> @@ -549,6 +541,14 @@
>      ifp->if_capabilities |= IFCAP_VLAN_MTU;
>      ifp->if_capenable |= IFCAP_VLAN_MTU;
>
> +    /* Probe device for MII interface to PHY */
> +    DEBUGOUT(NVE_DEBUG_INIT, "nve: do mii_phy_probe\n");
> +    if (mii_phy_probe(dev, &sc->miibus, nve_ifmedia_upd, nve_ifmedia_sts)) {
> +        device_printf(dev, "MII without any phy!\n");
> +        error = ENXIO;
> +        goto fail;
> +    }
> +
>      /* Attach to OS's managers. */
>      ether_ifattach(ifp, eaddr);
>
>
> --
> John Baldwin
Re: RELENG_7 if_nve panic
On Tue, Jan 26, 2010 at 01:00:29PM +0100, Bartosz Stec wrote:
> On 2010-01-26 10:29, Dmitry Sivachenko wrote:
> > Hello!
> >
> > I recompiled recent RELENG_7 and I get the following panic after
> > trying to kldload if_nve (interesting stack frames are 12, 13, 14 I guess).
> > Previous version of RELENG_7 (compiled in the middle of December)
> > worked fine. Last few days I was trying to re-cvsup and always get the
> > same panic. I get FreeBSD sources via cvsup (cvsup5.freebsd.org).
> >
> > Any suggestions?
> >
> >
> As far as I know, the nve driver is based on nvidia binaries (and it's
> buggy), and that's why it was replaced by the nfe driver as default for
> nvidia based NICs as soon as it was ported from OpenBSD.
> So my suggestion - if you just need NIC working, use nfe not nve.
>

Thanks for reminding me about nfe. I just tried it and it does work.
I tried nfe sometime in the summer and it did not work on my hardware.
That is why I was sticking to nve.

Now it seems I can switch to nfe (but nve is still broken, if anyone cares).
RELENG_7 if_nve panic
Hello!

I recompiled a recent RELENG_7 and now get the following panic after
trying to kldload if_nve (the interesting stack frames are 12, 13 and 14, I guess).
The previous version of RELENG_7 (compiled in the middle of December)
worked fine. For the last few days I have been re-running cvsup and always get
the same panic. I get the FreeBSD sources via cvsup (cvsup5.freebsd.org).

Any suggestions?

Thanks in advance!

nve0: port 0xc800-0xc807 mem 0xfe02b000-0xfe02bfff irq 22 at device 20.0 on pci0
nve0: Ethernet address 00:18:f3:f4:73:1c
miibus0: on nve0
e1000phy0: PHY 1 on miibus0

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x8:0x803259cd
stack pointer = 0x10:0xff80210ed3e0
frame pointer = 0x10:0xff80210ed3f0
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 845 (kldload)
panic: from debugger
cpuid = 0
KDB: stack backtrace:
Uptime: 33s

(kgdb) bt
#0 doadump () at pcpu.h:195
#1 0x8028b1d8 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0x8028b63c in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3 0x80183567 in db_panic (addr=Variable "addr" is not available.
) at /usr/src/sys/ddb/db_command.c:446
#4 0x80183bcf in db_command (last_cmdp=0x806414e8, cmd_table=0x0, dopager=1) at /usr/src/sys/ddb/db_command.c:413
#5 0x80183de0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:466
#6 0x801859c9 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:228
#7 0x802bb235 in kdb_trap (type=12, code=0, tf=0xff80210ed330) at /usr/src/sys/kern/subr_kdb.c:524
#8 0x8044a3f0 in trap_fatal (frame=0xff80210ed330, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:772
#9 0x8044a7c4 in trap_pfault (frame=0xff80210ed330, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:693
#10 0x8044b0da in trap (frame=0xff80210ed330) at /usr/src/sys/amd64/amd64/trap.c:464
#11 0x804335fe in calltrap () at /usr/src/sys/amd64/amd64/exception.S:218
#12 0x803259cd in strcmp (s1=0x0, s2=0x80496f30 "msk") at /usr/src/sys/libkern/strcmp.c:45
#13 0x801caa3d in e1000phy_attach (dev=0xff0001532900) at /usr/src/sys/dev/mii/e1000phy.c:153
#14 0x802b54e9 in device_attach (dev=0xff0001532900) at device_if.h:178
#15 0x802b6bca in bus_generic_attach (dev=Variable "dev" is not available.
) at /usr/src/sys/kern/subr_bus.c:2923
#16 0x801ce1ee in miibus_attach (dev=0xff00016a6900) at /usr/src/sys/dev/mii/mii.c:186
#17 0x802b54e9 in device_attach (dev=0xff00016a6900) at device_if.h:178
#18 0x802b6bca in bus_generic_attach (dev=Variable "dev" is not available.
) at /usr/src/sys/kern/subr_bus.c:2923
Re: RELENG_7_1: bce driver change generating too much interrupts ?
On Tue, Dec 02, 2008 at 04:44:46PM -0800, Xin LI wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi guys,
>
> I think I got a real fix.
>

I tried that patch with a very recent 7-STABLE.
It does fix the problem for me.

Thanks a lot!

> Cheers,
> - --
> Xin LI <[EMAIL PROTECTED]> http://www.delphij.net/
> FreeBSD - The Power to Serve!
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.9 (FreeBSD)
>
> iEYEARECAAYFAkk11n0ACgkQi+vbBBjt66Dy6wCfSl3eLRhj5TVs24Q+8ao5Mcz0
> FNQAoK8KvziiXFoanhSlWv636o+HfYIj
> =AixC
> -----END PGP SIGNATURE-----

> Index: if_bce.c
> ===================================================================
> --- if_bce.c (revision 185565)
> +++ if_bce.c (working copy)
> @@ -7030,13 +7030,14 @@
>
>      /* Was it a link change interrupt? */
>      if ((status_attn_bits & STATUS_ATTN_BITS_LINK_STATE) !=
> -        (sc->status_block->status_attn_bits_ack & STATUS_ATTN_BITS_LINK_STATE))
> +        (sc->status_block->status_attn_bits_ack & STATUS_ATTN_BITS_LINK_STATE)) {
>          bce_phy_intr(sc);
>
> -    /* Clear any transient status updates during link state change. */
> -    REG_WR(sc, BCE_HC_COMMAND,
> -        sc->hc_command | BCE_HC_COMMAND_COAL_NOW_WO_INT);
> -    REG_RD(sc, BCE_HC_COMMAND);
> +        /* Clear any transient status updates during link state change. */
> +        REG_WR(sc, BCE_HC_COMMAND,
> +            sc->hc_command | BCE_HC_COMMAND_COAL_NOW_WO_INT);
> +        REG_RD(sc, BCE_HC_COMMAND);
> +    }
>
>      /* If any other attention is asserted then the chip is toast. */
>      if (((status_attn_bits & ~STATUS_ATTN_BITS_LINK_STATE) !=
SVR4 problem
Hello!

On my recent 4-STABLE:

# kldload svr4
kldload: can't load svr4: Exec format error

from dmesg:
link_elf: symbol svr4_stream_get undefined

Is it a known problem? Or what am I doing wrong?

Thank you in advance,
Dima.

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-stable" in the body of the message