Re: enable ECC in OS code?
In <4a954a35.4030...@icyb.net.ua>, a...@icyb.net.ua wrote:
>
>Here is a question that I am afraid I know an answer for.
>I have some ECC capable hardware:
>1) Athlon II with embedded memory controller that can do ECC
>2) DRAM modules with ECC
>Assuming that ECC data lanes are connected between the two on motherboard, and
>given that BIOS doesn't perform any ECC setup (nor there is any option to
>control that) - would it be possible to turn on ECC from OS code?
>Or is it too late in the game already?

It's about 100 times easier to have the BIOS do this.

First off, it's usually quite specific to the chip set exactly how to
do it. Next, if ECC wasn't enabled previously, the ECC bytes will all
be wrong, which means that you'll have to rewrite all of memory after
you've turned it on. Oh, and you have to fetch the code that rewrites
the ECC from the memory with incorrect ECC to do that.

If the BIOS is broken to the extent that it doesn't enable ECC on a
system where it should be available, whine at the vendor.

-- 
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.5" / 37N 20' 15.3"
Internet: steve @ Watt.COM Whois: SW32-ARIN
Free time? There's no such thing. It just comes in varying prices...
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Re: Intermittent system hangs on 7.2-RELEASE-p1
On Wed, 26 Aug 2009, Linda Messerschmidt wrote:

> I'm trying to troubleshoot an intermittent Apache performance problem,
> and I've narrowed it down to what appears to be a brief
> whole-system hang that lasts from 0.5 - 3 seconds. They occur every
> few minutes.

One thought would be to use "ps" to try to determine which process, if
any, is charged with CPU time during the hang.

If you could afford a little downtime, it would be worth seeing if the
hang occurs in single-user mode (perhaps with a simple program that loops
calling gettimeofday() and warns when the time between successive
iterations is large).

I once had a problem like this that I eventually traced to a power
management problem. (Specifically, the machine had a modem, and would
hang for a few seconds whenever the line would ring. It was apparently
related to the Wake-On-Ring feature.) If I remember correctly, disabling
ACPI made it go away. So that might be something to try, if rebooting is
an option.

What are the similarities and differences in hardware and software among
the affected machines (you mentioned there were several)?

-- 
Nate Eldredge
neldre...@math.ucsd.edu
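A minimal sketch of the kind of test program suggested above, assuming a POSIX system. The names `elapsed_usec` and `watch_for_stalls`, and the choice of threshold, are illustrative and do not come from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Microseconds elapsed from *start to *end. */
static long long
elapsed_usec(const struct timeval *start, const struct timeval *end)
{
	return (long long)(end->tv_sec - start->tv_sec) * 1000000LL +
	    (long long)(end->tv_usec - start->tv_usec);
}

/*
 * Call gettimeofday() in a tight loop for `iterations` rounds and print a
 * warning whenever two successive readings are more than `threshold_usec`
 * apart.  Returns the number of such stalls seen.
 */
static long
watch_for_stalls(long long threshold_usec, long iterations)
{
	struct timeval prev, now;
	long stalls = 0;

	assert(threshold_usec > 0);
	gettimeofday(&prev, NULL);
	for (long i = 0; i < iterations; i++) {
		gettimeofday(&now, NULL);
		long long gap = elapsed_usec(&prev, &now);
		if (gap > threshold_usec) {
			printf("stall: %lld us ending at %lld.%06ld\n", gap,
			    (long long)now.tv_sec, (long)now.tv_usec);
			stalls++;
		}
		prev = now;
	}
	return stalls;
}
```

Calling `watch_for_stalls(100000, ...)` from a small `main()` with a large iteration count would flag any pause longer than 0.1 seconds. Ordinary scheduling can also delay the loop on a busy machine, which is one reason to try it in single-user mode.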
MBR hack for serial console
I am hoping for input on a patch I want to apply to the MBR of a FreeBSD
8-BETA3 AMD64 server.

I need a serial console on this server. The ASUS motherboard (amibios) has
PCI and PCI-e expansion slots, and a Moschip MCS9820 UART (serial board) is
installed at pci0:3:5:0. The amibios can be configured to do the
plug-and-play enumeration, or this feature can be turned off, but there is
no way to assign a particular i/o port to a PCI device in the BIOS, and I
cannot get source for the BIOS to change this behavior.

The serial board has a single Base Address Register at 10h in its pci
configuration space. Whether the PCI bus is probed by the BIOS or FreeBSD,
the UART BAR is assigned the i386 I/O port address of 0xe800. It must be at
one of the COM1-COM4 addresses (e.g. 0x3F8) to work in the boot sequence. I
need access to the serial console before loader. I do not expect the
hardware configuration to change so a hack is ok.

My plan is to patch the MBR to override the serial card's BAR with 0x3F8. My
reasoning is that the CPU is still in Real mode (allowing direct hardware
access) until loader executes, and the serial console would work for the
boot0 and boot2 calls to the terminal.

I have experimented with using pciconf to change the BAR from a command
line; curiously the command:

    pciconf -w pci0:3:5:0 16 1016

loads 0x3F9 into the serial card's PCI configuration space instead of 0x3F8,
and I don't understand why.
I've worked up this patch and hope someone can tell me why this would or
wouldn't work:

/usr/src/sys/boot/i386/mbr/mbr.s
41,57d40
< # Patch to reconfigure PCI UART's Base Address to COM1
< # I count 40 bytes in opcode
< #
< startcon: .set PCIADD_PORT,0xcf8 # Load pci config port addy
< .set PCIDATA_PORT,0xcfc # Load pci data port addy
< .set PCIADD,0x8003e810 # Load pci register identifier
< .set PCIDATA,0x3f8 # Load pci register data
<
< pushad # save double registers
< mov %ax,$PCIADD # put pci reg to access in ax
< mov %dx,$PCIADD_PORT # put pci config port in dx
< out %dx,%ax # send to cpu i/o space
< mov %ax,$PCIDATA # put pci data in ax
< mov %dx,$PCIDATA_PORT # put pci data port in dx
< out %dx,%ax # send data to cpu i/o space
< popad # pop saved registers
< #
166,171c149,151
< #
< # Instruction messages reduced to numbers, saves 60 bytes
< #
< msg_pt: .asciz "1" # "Invalid partition table"
< msg_rd: .asciz "2" # "Error loading operating system"
< msg_os: .asciz "3" # "Missing operating system"
---
> msg_pt: .asciz "Invalid partition table"
> msg_rd: .asciz "Error loading operating system"
> msg_os: .asciz "Missing operating system"

Thanks in advance for any help. I am not an assembly coder so am really
uncertain about my patch.
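For reference, here is a hedged sketch of how the two values discussed in this message are formed under PCI configuration mechanism #1. It assumes the selector pci0:3:5:0 means domain 0, bus 3, slot 5, function 0, and that the card's BAR is an I/O-space BAR. The function names are illustrative only. Under those assumptions the address for register 10h works out to 0x80032810, which differs from the 0x8003e810 constant in the patch, so that value may be worth rechecking. The readback model also matches the 0x3F9 seen from pciconf. Both quantities are 32 bits wide, so 16-bit registers cannot carry them.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build the 32-bit value written to CONFIG_ADDRESS (I/O port 0xCF8) for
 * PCI configuration mechanism #1:
 *   bit 31     enable
 *   bits 23-16 bus
 *   bits 15-11 device (slot)
 *   bits 10-8  function
 *   bits 7-2   dword-aligned register offset
 */
static uint32_t
pci_config_address(uint32_t bus, uint32_t dev, uint32_t func, uint32_t reg)
{
	assert(bus < 256 && dev < 32 && func < 8 && reg < 256);
	return 0x80000000u | (bus << 16) | (dev << 11) | (func << 8) |
	    (reg & 0xFCu);
}

/*
 * Simplified model of the value an I/O-space BAR reads back after
 * `written` is stored in it: bit 0 is hardwired to 1 (it marks the BAR
 * as I/O space) and bit 1 is reserved and reads as 0.  Real devices also
 * hardwire further low bits according to the size of the decoded range;
 * that is ignored here.
 */
static uint32_t
io_bar_readback(uint32_t written)
{
	return (written & ~0x3u) | 0x1u;
}
```

With these helpers, `pci_config_address(3, 5, 0, 0x10)` gives the value to write to port 0xCF8, and `io_bar_readback(0x3F8)` gives 0x3F9.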
Re: Intermittent system hangs on 7.2-RELEASE-p1
On Wednesday 26 August 2009 3:03:13 pm Linda Messerschmidt wrote:
> I'm trying to troubleshoot an intermittent Apache performance problem,
> and I've narrowed it down to what appears to be a brief
> whole-system hang that lasts from 0.5 - 3 seconds. They occur every
> few minutes.

One thing to note is that ktrace only logs voluntary context switches (i.e.
call to tsleep or waiting on a condition variable). It specifically does not
log preemptions or blocking on a mutex, so in theory if your machine was
livelocked temporarily that might explain this.

-- 
John Baldwin
Intermittent system hangs on 7.2-RELEASE-p1
I'm trying to troubleshoot an intermittent Apache performance problem,
and I've narrowed it down to what appears to be a brief
whole-system hang that lasts from 0.5 - 3 seconds. They occur every
few minutes.

I took the rather extreme step of doing "ktrace -t cnisuwt -i -d -p 1"
and then I waited for the hang. This is what I got:

54937 httpd 1251302859.375313 CALL shutdown(0x3,)
54937 httpd 1251302859.375333 RET shutdown 0
54937 httpd 1251302859.375348 CALL select(0x4,0xbfbfe92c,0,0,0xbfbfe9ac)
54937 httpd 1251302859.375363 CSW stop kernel
54937 httpd 1251302859.376402 CSW resume kernel
54937 httpd 1251302859.376439 RET select 1
54937 httpd 1251302859.376453 CALL read(0x3,0xbfbfe9b4,0x200)
54937 httpd 1251302859.376470 GIO fd 3 read 0 bytes
54937 httpd 1251302859.376482 RET read 0
54937 httpd 1251302859.376495 CALL close(0x3)
54937 httpd 1251302859.376511 RET close 0
54937 httpd 1251302859.376525 CALL sigaction(SIGUSR1,0xbfbfebb0,0xbfbfeb98)
54937 httpd 1251302859.376538 RET sigaction 0
54937 httpd 1251302859.376552 CALL munmap(0x282ff000,0x11)
54937 httpd 1251302859.376607 RET munmap 0
54937 httpd 1251302859.376633 CALL accept(0x11,0xbfbfebf0,0xbfbfec10)
54937 httpd 1251302859.376649 CSW stop kernel
796 svscan 1251302859.481064 CSW resume kernel
54937 httpd 1251302859.489374 CSW resume kernel
54937 httpd 1251302859.489391 STRU struct sockaddr { AF_INET, 172.17.0.143:61610 }
98229 httpd 1251302859.601850 CSW resume kernel
46517 httpd 1251302859.601900 CSW resume kernel
98202 httpd 1251302859.611661 CSW resume kernel
837 nrpe2 1251302859.622681 CSW resume kernel
54454 httpd 1251302859.655422 CSW resume kernel
54454 httpd 1251302859.655443 STRU struct sockaddr { AF_INET, 172.17.0.131:59011 }
7182 httpd 1251302859.722381 CSW resume kernel
98178 httpd 1251302859.722438 CSW resume kernel
858 gmond 1251302859.794996 CSW resume kernel
858 gmond 1251302859.794998 GIO fd 5 wrote 0 bytes
770 ntpd 1251302860.076501 CSW resume kernel
98346 httpd 1251302860.086261 CSW resume kernel
65277 httpd 1251302860.086300 CSW resume kernel
98514 httpd 1251302860.106849 CSW resume kernel
7191 httpd 1251302860.106894 CSW resume kernel
796 svscan 1251302861.403335 RET nanosleep 0
796 svscan 1251302861.403370 CALL wait4(0x,0xbfbfee18,WNOHANG,0)
796 svscan 1251302861.403405 RET wait4 0
54454 httpd 1251302861.403481 RET accept 3
98229 httpd 1251302861.403532 RET select 0
796 svscan 1251302861.403553 CALL stat(0x804a3bb,0xbfbfed6c)
858 gmond 1251302861.403601 GIO fd 5 read 20 bytes
54454 httpd 1251302861.403619 CSW stop user
46517 httpd 1251302861.403647 RET select 0
858 gmond 1251302861.403674 RET kevent 1
858 gmond 1251302861.403710 CALL socket(PF_INET,SOCK_DGRAM,IPPROTO_IP)
98202 httpd 1251302861.403714 RET select 0
858 gmond 1251302861.403752 RET socket 9
837 nrpe2 1251302861.403756 RET select 0

There is a gap between 1251302860.106894 and 1251302861.403335 of over
one second, and the "effective gap" starts around 1251302859.376649
and thus lasts for about two seconds.

This machine runs Apache and during this sample it was being hit every
0.1 seconds with a test request for a simple static file (in addition
to production traffic). It is a 2-processor machine that is 85-95%
idle; there's nothing in userspace that runs that long without
yielding. According to systat, it handles 5000+ syscalls every
second. But according to ktrace, nothing happens at all during the
hang.

This matches user experience. (The static file request, which usually
completes in <0.01s suddenly takes 2 seconds as observed from the
remote machine issuing the requests.)
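The gap arithmetic above can be checked mechanically. The following is a small sketch, not part of the original post, that keeps timestamps as integer microseconds (to avoid floating-point rounding on values this large) and finds the largest gap between neighbouring records. The helper names `to_usec` and `largest_gap` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a "seconds.microseconds" pair into a single microsecond count. */
static int64_t
to_usec(int64_t sec, int64_t usec)
{
	return sec * 1000000 + usec;
}

/*
 * Scan n timestamps (microseconds, in trace order) and return the largest
 * gap between neighbours.  If gap_index is not NULL it receives the index
 * of the record that ends the gap.  Returns 0 when n < 2.
 */
static int64_t
largest_gap(const int64_t *ts, size_t n, size_t *gap_index)
{
	int64_t best = 0;
	size_t where = 0;

	assert(ts != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		int64_t gap = ts[i] - ts[i - 1];
		if (gap > best) {
			best = gap;
			where = i;
		}
	}
	if (gap_index != NULL)
		*gap_index = where;
	return best;
}
```

Fed the timestamps 1251302860.106894 and 1251302861.403335 from the trace, the difference is 1296441 microseconds, and the span from 1251302859.376649 to 1251302861.403335 is 2026686 microseconds, consistent with the "over one second" and "about two seconds" figures.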
Here's the relevant snip from the httpd process handling that static
file at the time of the hang:

54937 httpd 1251302859.376633 CALL accept(0x11,0xbfbfebf0,0xbfbfec10)
54937 httpd 1251302859.376649 CSW stop kernel
54937 httpd 1251302859.489374 CSW resume kernel
54937 httpd 1251302859.489391 STRU struct sockaddr { AF_INET, 172.17.0.143:61610 }
54937 httpd 1251302861.403862 RET accept 3

It's stuck in accept, but does *not* get context-switched away from
during the delay. (The earlier context switch corresponds to the 0.1
seconds between requests; there is an Apache instance configured to
handle just the test requests with one child process; that process has
nothing else to do or block on.)

I'll include some other processes below. I think it's weird that all
these processes get context-switched-into before/during the hang, and
I wonder if it's a clue.

The kernel is obviously still running, since it wakes these processes
up, but nothing is happening. That and the fact that it happens on
multiple machines (though we've only tested this one)
Re: Deprecating ps(1)s -w switch
,--- You/Dag-Erling (Wed, 26 Aug 2009 16:20:59 +0200) *
| Tim Kientzle writes:
| > The difference between "ps", "ps -w", and "ps -ww" is pretty
| > significant for Java, in particular. Java command lines
| > are typically enormous (thank you, CLASSPATH) which makes
| > "ps -ww" often more annoying than it's worth.
|
| Java command lines aren't necessarily enormous. If they are, it is
| because whoever invoked Java didn't know that it respects the CLASSPATH
| environment variable, and that setting -classpath on the command line
| f*s up the user's preferences (e.g. the user may want to replace a
| particular set of classes with an alternative implementation).

Using either the `-classpath' option to `java' or the `CLASSPATH'
environment variable is a pretty obsolete practice (whoever does either
these days should stop and re-think, IMHO.)

The deficiency of the above, in either variation, is the need to list
every `jar' file used, which gets ugly with more than a few files.

A person who keeps up with modern Java will call it with one or
several of the options:

  -Djava.ext.dirs
  -Djava.library.path
  -Djava.endorsed.dirs

The Java Virtual Machine will internally list the files in each of the
directories (specified on the command line or default ones), saving
the user the effort of mentioning them explicitly in `CLASSPATH'.

This cuts down on the length of the command line dramatically, but `java'
processes' command lines are still typically enormously long: even the
lists of the directories, with their absolute paths, are significant;
on top of that, `java' is usually invoked with a gazillion options
modifying the JVM's runtime behaviour.

It's a fact of life that for real-life applications, `java' command
lines are *long* -- you can't change that by moving from `-classpath'
to `CLASSPATH'.

(This said, I am not in favor of modifying `ps' in the manner proposed,
as my previous message indicated.)

-- Alex -- alex-goncha...@comcast.net --
enable ECC in OS code?
Here is a question that I am afraid I know an answer for.
I have some ECC capable hardware:
1) Athlon II with embedded memory controller that can do ECC
2) DRAM modules with ECC
Assuming that ECC data lanes are connected between the two on motherboard, and
given that BIOS doesn't perform any ECC setup (nor there is any option to control
that) - would it be possible to turn on ECC from OS code?
Or is it too late in the game already?

-- 
Andriy Gapon
Re: Need some help understanding a jail system call.
bert wiley writes:
> No where in the code do i ever see any access to the jail.h type systems
> calls

Because at that stage in the development process, the system calls in
jail.h belong to the old implementation.

> so does the syscall(375, JAIL_CREATE, argv[1]); actually access the
> jail subsystem and create a jail?

It calls the new system call, which at that stage hasn't been added to
libc yet, because it would conflict with the existing system calls.

> Here is the link i used to find this code
> http://www.watson.org/~robert/freebsd/jailng/

You realize that this is eight years old, right? And that the jail
infrastructure has been extensively modified since then, and is
currently being rewritten again?

DES
-- 
Dag-Erling Smørgrav - d...@des.no
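To illustrate the mechanism described above, here is a minimal sketch of calling a system call by number through syscall(2) instead of through a libc wrapper. It uses `SYS_getpid` as a harmless stand-in; the number 375 in the quoted code belonged to the experimental jailNG kernel and is not assumed to mean anything on a current system. The function name `getpid_by_number` is illustrative.

```c
#define _DEFAULT_SOURCE		/* expose syscall(2) on glibc; harmless elsewhere */
#include <assert.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Ask the kernel for our process ID without going through the libc
 * getpid() wrapper, by passing the system call number to syscall(2).
 */
static pid_t
getpid_by_number(void)
{
	long ret = syscall(SYS_getpid);

	assert(ret > 0);
	return (pid_t)ret;
}
```

The result should match what `getpid()` returns. A system call that has a number in the kernel but no wrapper in libc can be reached the same way, which is what the `syscall(375, ...)` line relies on.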
Need some help understanding a jail system call.
Hello

I found this code under a project called jailNG which has some system calls
for doing jail stuff. I'm still new to FreeBSD and I'm stumped on what this
code is actually doing. In the source from the project there are a few
function calls that look like they create and access the jail layer. Here is
an example:

#define JAIL_CREATE 1
#define JAIL_DESTROY 2
#define JAIL_JOIN 3

extern char *environ[];

static void
usage(void)
{
	fprintf(stderr, "usage:\n");
	fprintf(stderr, " jailctl create [jailname]\n");
	fprintf(stderr, " jailctl destroy [jailname]\n");
	fprintf(stderr, " jailctl join [jailname] [-c chrootpath] [path] "
	    "[cmd] [args...]\n");
	exit(-1);
}

static int
jail_create(int argc, char *argv[])
{
	int error;

	if (argc < 2)
		usage();
	error = syscall(375, JAIL_CREATE, argv[1]);
	if (error)
		perror("jailconf().create");
	return (error);
}

Nowhere in the code do I ever see any access to the jail.h type system
calls, so does the syscall(375, JAIL_CREATE, argv[1]); actually access the
jail subsystem and create a jail?

Here is the link I used to find this code
http://www.watson.org/~robert/freebsd/jailng/

Any help on this question is appreciated, thanks.
Re: Deprecating ps(1)s -w switch
Ivan Radovanovic writes:
> I think software should evolve to be better rather then to stick with
> something done the wrong way, even that has been done maybe 30 years
> ago - that is why behavior should be changed. It is never too late to
> do the right thing ;-)

Are you also going to rewrite 30 years' worth of scripts that expect
ps(1) to have a -w option which behaves in a particular manner?

DES
-- 
Dag-Erling Smørgrav - d...@des.no
Re: Deprecating ps(1)s -w switch
Tim Kientzle writes:
> The difference between "ps", "ps -w", and "ps -ww" is pretty
> significant for Java, in particular. Java command lines
> are typically enormous (thank you, CLASSPATH) which makes
> "ps -ww" often more annoying than it's worth.

Java command lines aren't necessarily enormous. If they are, it is
because whoever invoked Java didn't know that it respects the CLASSPATH
environment variable, and that setting -classpath on the command line
f*s up the user's preferences (e.g. the user may want to replace a
particular set of classes with an alternative implementation).

DES
-- 
Dag-Erling Smørgrav - d...@des.no
NMI running X with dual-monitor
Hi guys,

I'm running 7.2-STABLE on a Thinkpad T60. When connecting a second monitor
to my docking station sometimes my FreeBSD freezes.

kgdb on the vmcore-file says "non-maskable interrupt trap"

Some details:
X.Org 1.5.3 using the radeon-Driver

I think the problem appears when moving xterms from the first to the second
monitor (or back). The mouse cursor looks _very_ strange then and after some
minutes the whole system freezes.

Does anyone know about the problem? Is it a hardware-failure for sure?

Thanks a lot!
Re: AMD SB700 SMBus controller driver
on 26/08/2009 01:27 said the following:
> Could you please forward me the patch to make it work in polling mode ? I'd
> like to test it as I've been trying to make intpm work with a SB400 (which
> should be quite the same as yours) but system hangs when I try to force
> polling mode (didn't have the specs nor all the differences you just
> presented). And btw, I didn't find any implementation using interrupt
> neither but I'm ready to test your updated version.

[what charset/encoding was your email?]

Please see:
http://people.freebsd.org/~avg/ga-ma780g-ud3h/intpm.diff

The patch is work-in-progress and is not clean for this reason (style
violations, experimental hacks).

What the patch does:
1. redefine PCI_INTR_SMB_IRQ9 to 2 (bit 1)
2. disable writing to PCIR_INTLINE
3. add PCI id of my hardware
4. attempt to use IRQ mode with interrupt 20 - doesn't work
5. force polling mode - seems to work

-- 
Andriy Gapon
Re: GA-MA780G-UD3H motherboard
on 25/08/2009 21:34 Sam Fourman Jr. said the following:
>> Meanwhile, if you interested in any information about this motherboard - data
>> dumps, outputs from tools, etc - please let me know, I will try my best to
>> provide that.
>
> it would be interesting to see a dmesg as a starting point.

Please see http://people.freebsd.org/~avg/ga-ma780g-ud3h/

Replying to the other email - I use amd64 arch.

-- 
Andriy Gapon
Re: Partial kvm dumps
On Mon, 24 Aug 2009 10:45:58 +0300 Mikolaj Golub wrote:

> http://code.google.com/p/trociny/downloads/list
>
> I would like to hear what other people think about this. It looks
> very useful for me. At least as a first step it would be nice to
> extend KVM to work with partial dumps so the users could try this and
> see if it turned out to be useful.

Having recently been debugging core dump support in the base system
utilities I spotted what looks like a bug in your code: the 'execfile'
parameter to kvm_open or kvm_openfiles should be NULL if you want to
use the kernel from the running system; some people may not be running
a kernel from "/boot/kernel/kernel" by default.

-- 
Bruce
Re: Deprecating ps(1)s -w switch
On Tuesday 25 August 2009 22:51:43 Rick C. Petty wrote:
> On Tue, Aug 25, 2009 at 04:09:09PM +0200, Jonathan McKeown wrote:
> > I usually want to see ps(1) output in easily-read columns. Without width
> > limits, this can't be guaranteed.
> >
> > I would strongly object to the complete removal of any option to limit
> > the output width of ps(1) and make it easily human-readable.
> >
> > I'm also astonished at the suggestion that not using -ww is ``a
> > mistake''. I very seldom need to see the whole commandline for every
> > process.
>
> Then you must not use Java much. I almost always need the -ww option.
> I'm fine with the default being "fit into my terminal width", but I'd be
> for one option to specify limited width and another option (-w) to
> specify "as wide as possible".

As it happens, you're right: I don't use Java at all. Neither do I object
(much) to a change in the default behaviour such that wide output is the
norm and restricted-width an option.

In the original message, Brian Somers wrote:

> The suggestion is that ps's -w switch is a strange artifact that can
> be safely deprecated. ps goes to great lengths to implement width
> limitations, and any time I've seen people not using -ww has either
> been a mistake or doesn't matter. Using 'cut -c1-N' is also a great
> way of limiting widths if people really want that...
>
> I'd like to propose changing ps so that width limits are removed and
> '-w' is deprecated - ignored for now with a note in the man page
> saying that it will be removed in a future release.

The suggestion seems to be to remove the width-limiting code altogether, and
make people who want width-restricted output (for example to keep it in
columns which are easily scanned by eye) pipe the output through another
command. That I do object to.