Puzzling stack trace
I'm reposting this here since it's a pretty low-level discussion. Hopefully someone here can explain what's going on.

We had an app crash and the resulting core dump produced a very puzzling stack trace:

    #0 0x0008011d438c in thr_kill () from /lib/libc.so.7
    #1 0x0008012722bb in abort () from /lib/libc.so.7
    #2 0x0008011fb70c in malloc_usable_size () from /lib/libc.so.7
    #3 0x0008011fbb95 in malloc_usable_size () from /lib/libc.so.7
    #4 0x0008011fdaea in _malloc_thread_cleanup () from /lib/libc.so.7
    #5 0x0008011fdc86 in _malloc_thread_cleanup () from /lib/libc.so.7
    #6 0x0008011fc8e9 in malloc_usable_size () from /lib/libc.so.7
    #7 0x0008011fccc7 in malloc_usable_size () from /lib/libc.so.7
    #8 0x0008011ffe8f in malloc () from /lib/libc.so.7
    #9 0x00080127374b in memchr () from /lib/libc.so.7
    #10 0x00080125e6e9 in __srget () from /lib/libc.so.7
    #11 0x0008012352dd in vsscanf () from /lib/libc.so.7
    #12 0x000801220087 in fscanf () from /lib/libc.so.7

This trace resulted from a call to fscanf, as follows:

    char buffer[21];
    fscanf(in, "%20s", buffer);

We've verified that the data being read was correct, and clearly the buffer in which fscanf is storing the string it reads is valid (i.e., it's not NULL). So what would lead this fscanf() call into calling abort()? Everything seems to be in order.

What's more puzzling to us is that we've looked for calls to malloc_usable_size() in the libc sources and although the function is defined we can find no direct call to the function in our FBSD 8 sources:

    $ grep -R 'malloc_usable_size' * | grep -v .svn
    libc/stdlib/Symbol.map: malloc_usable_size;
    libc/stdlib/Makefile.inc: malloc.3 realloc.3 malloc.3 reallocf.3 malloc.3 malloc_usable_size.3
    libc/stdlib/malloc.c:malloc_usable_size(const void *ptr)

That's it. Nothing calls this function from what we can tell. Even if something did call it, we don't understand why it would call abort(). It has an assert:

    malloc_usable_size(const void *ptr)
    {
        assert(ptr != NULL);
        return (isalloc(ptr));
    }

but the pointer we pass to fscanf() is clearly not NULL, so what pointer would this function be testing? It's all very puzzling and we cannot reproduce this failure. We'd like to understand what happened though.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
RE: Puzzling stack trace
> Type frame 9 and see what it says. If the bug is easily reproducible, try reproducing it with a debugging version of libc (buildworld with DEBUG_FLAGS=-g)

This crash happened at a production customer site--we have the core and the matching binary and our logs for the application that crashed but that's all. We've never seen this particular crash before and cannot reproduce it. The fscanf() call that failed is repeated on a continual basis as part of a monitoring thread, so literally thousands of this exact same call have been made without incident.

The frame 9 command doesn't show anything useful:

    (gdb) frame 9
    #9 0x00080127374b in memchr () from /lib/libc.so.7

That's it. And yes, the stack trace appears to be wrong. Even the trace starting from the vsscanf call is wrong. It says that __srget() is the next function in the stack but vsscanf() doesn't call __srget():

    int
    vsscanf(const char * __restrict str, const char * __restrict fmt,
        __va_list ap)
    {
        FILE f;

        f._file = -1;
        f._flags = __SRD;
        f._bf._base = f._p = (unsigned char *)str;
        f._bf._size = f._r = strlen(str);
        f._read = eofread;
        f._ub._base = NULL;
        f._lb._base = NULL;
        f._orientation = 0;
        memset(&f._mbstate, 0, sizeof(mbstate_t));
        return (__svfscanf(&f, fmt, ap));
    }

So it seems our application went completely out to lunch. This is concerning.
RE: Puzzling stack trace
> Are you absolutely sure the machine you ran gdb on has the exact same libc etc. as the customer's machine?

I just connected to the customer's box and generated the stack trace directly on their box. It looks identical to the one I posted in my original message. Something's not right here...
RE: Puzzling stack trace
> Also, you should see if __svfscanf() calls __srget(). The __svfscanf() call frame may not show up in gdb if the compiler re-used the callframe from vsscanf for __svfscanf() as an optimization.

I just checked--it does not call __srget()...
RE: Puzzling stack trace
> As stated in an earlier message, this may help get the information you need. Just more of an automated approach to compiling these.

Thanks for the script; I'll definitely archive it. Unfortunately, our window for investigating this problem further is over as this customer is upgrading their systems today and the OS is getting wiped...
RE: ntpd hangs under FBSD 8
> So this is arguably a Python bug. Did you contact anybody who cares about Python?

I did not, mainly because this link:

    http://bugs.python.org/msg61870

seems to imply they are already aware of the problem. I agree it must be a Python bug though. It worked in 2.5.1 but not in 2.5.5 and later, so clearly they changed how processes are launched from threads in a way that has led to this problem. One should not be forced to make explicit calls to change the signal mask in order to launch an external app. Granted, we've only had this issue with ntpd--other apps launch fine--but there is clearly something wrong somewhere for even one app to hang when it is spawned from a thread.
RE: ntpd hangs under FBSD 8
> I think the problem is not in ntpd, since I use ntpdate. About 50% of the time, when it runs from the startup script, it hangs along with the kernel. Ctrl+C does not work and the kernel doesn't answer pings; it just freezes. The problem is somewhere in the kernel, maybe in the subsystems that set the new time, maybe in the network (UDP) parts. This problem doesn't affect other programs, so I think it is in the time handling code.

I think you may be describing a different problem. For one thing, we don't use ntpdate, we use the ntpd -g -q alternative. Secondly, for us ntpd is hanging 100% of the time when run via a Python thread class. The exception is Python 2.5.1; this succeeds 100% of the time.

> Peter, what platform do you use? I use MIPS BCM5354.

We have a variety of 1U and 3U boxes. They all hang the same way.
RE: ntpd hangs under FBSD 8
> Very wild guess, check the process signal mask of the child for both methods of spawning.

I'm running ntpd through Python. How do I check the process signal mask? I did some quick searches and it seems Python does not support sigprocmask(). In my searches I came across this link:

    http://bugs.python.org/msg61870

I think you might be right that this is related to the signal mask. In my scenario the select call is hanging indefinitely, just as discussed in that report.
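For readers on a current interpreter, here is a minimal sketch of how the mask could be inspected. It assumes Python 3.3 or later, where signal.pthread_sigmask() is available; that call did not exist in the 2.5/2.6 releases discussed in this thread, and the helper name blocked_signals is made up for illustration.

```python
import signal


def blocked_signals():
    """Return the set of signals currently blocked in the calling thread."""
    # SIG_BLOCK with an empty list changes nothing and returns the current mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, [])


if __name__ == "__main__":
    print(sorted(blocked_signals()))
```

Since a child inherits the mask of the thread that spawned it, calling this helper in the launching thread just before the spawn shows what the child will start with.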
RE: ntpd hangs under FBSD 8
> We'll likely go with this solution instead of downgrading Python and the related libraries.

In fact I came up with another solution. I realized that since the problem was related to the process signal mask, instead of calling ntpd directly I could wrap it up in a C app that resets the signal mask to something that works. I have the following code:

    sigset_t set, oset;

    sigemptyset(&set);
    pthread_sigmask(SIG_SETMASK, &set, &oset);
    system("/usr/sbin/ntpd -g -q");
    pthread_sigmask(SIG_SETMASK, &oset, NULL);

I wrapped this up into a standalone app and call this from Python instead of calling ntpd directly. This solved the problem--no more hang.

Thanks very much to Kostik Belousov for his wild guess that this was related to the process signal mask. His guess was dead on.
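The same idea can be sketched directly in Python on a current interpreter, without a C wrapper. This assumes Python 3.7 or later (signal.pthread_sigmask() and subprocess.run() with capture_output), neither of which existed in the 2.5/2.6 releases used in this thread; the function name run_with_empty_sigmask is hypothetical.

```python
import signal
import subprocess


def run_with_empty_sigmask(argv):
    """Run argv with no signals blocked, then restore the caller's mask."""
    # The mask is per-thread and inherited by the child, so clear it only
    # for the duration of the spawn.
    old_mask = signal.pthread_sigmask(signal.SIG_SETMASK, [])
    try:
        return subprocess.run(argv, capture_output=True, text=True)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

A call such as run_with_empty_sigmask(["/usr/sbin/ntpd", "-g", "-q"]) would mirror the C wrapper: the child starts with an empty mask and the calling thread gets its original mask back afterwards.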
RE: ntpd hangs under FBSD 8
> You're going to need a debug version of libc, too. gdb won't be able to find a backtrace out of a libc function without it.

What's the proper way to build a debug version of libc and the other libraries? I tried this:

    export CFLAGS=-O0
    make buildworld
    make installworld DESTDIR=/mydir

and then copied libc.so.7 from /mydir/lib to the /lib dir on my target system. I also replaced the ntpd binary with the debug version. I can see that -O0 is being used in the various cc commands that are generated, but libc still doesn't seem to be built properly. When I attach to a hung ntpd process, I get this:

    # gdb /usr/sbin/ntpd -p 2113
    GNU gdb 6.1.1 [FreeBSD]
    ...
    Attaching to program: /usr/sbin/ntpd, process 2113
    Reading symbols from /lib/libm.so.5...(no debugging symbols found)...done.
    ...
    Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
    Loaded symbols for /lib/libc.so.7
    Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
    ...
    [Switching to Thread 8012041c0 (LWP 100283)]
    0x000800dbeddc in select () from /lib/libc.so.7
    (gdb) bt
    #0 0x000800dbeddc in select () from /lib/libc.so.7
    #1 0x004335de in ntpdmain ()
    #2 0x0043310b in main ()

So I'm getting some symbols from ntpd but I still can't see into select(). It hangs in there forever so that's where I need to drill down further. How do I get libc built with full debug symbols?

In other testing I've narrowed the problem down to some kind of Python issue. If I run the Python code at the end of this email, where ntpd -g -q is launched as part of a Python thread class, the command hangs (the code assumes that ntpd is not already running). If I run the same ntpd command in a normal function (e.g. main) no hang occurs. I've tried subprocess.Popen and os.spawnv to run ntpd and these calls behave exactly the same way--when called from a thread the ntpd process hangs but it works fine when called from outside of a thread.

This is, of course, a reduction of our larger project to a simple test app. In our real code we cannot so easily eliminate this thread wrapper.

The same code BTW works fine on our FreeBSD 7 boxes, the main difference being we are running an older version of Python on those boxes (2.5.1 instead of 2.6.2). I tried installing the same 2.5.1 package on a FBSD 8 box and that solved the problem. Curiously a slightly newer FBSD 7 version of Python, 2.5.5, causes the same hang to occur. So only Python 2.5.1 built under FreeBSD 7 works to get around this issue with ntpd on FreeBSD 8. That means one potential solution is to downgrade to this 2.5.1, but we have other libraries targeted to work with Python 2.6 and we don't really want to downgrade all these associated libraries.

If anyone has any clues at all as to what is causing this issue, I'd appreciate the feedback. Here's the code that reproduces this behavior.

    #! /usr/bin/env python
    import os
    import threading

    class RunProc(threading.Thread):
        def __init__(self, cmd):
            threading.Thread.__init__(self)
            self.cmd = cmd

        def run(self):
            # Launch the external command from inside the thread.
            os.system(self.cmd)

    def main():
        RunProc("/usr/sbin/ntpd -g -q").start()

    if __name__ == "__main__":
        main()
RE: ntpd hangs under FBSD 8
>> What's the proper way to build a debug version of libc and the other libraries? I tried this:
>
> You can just do this:
>
>     cd /usr/src/lib/libc
>     make clean
>     make DEBUG_FLAGS=-g
>     make install

When I tried this the make actually failed with various errors. So I decided to do a full

    make buildworld DEBUG_FLAGS=-g

but in looking at the output being generated I see -O2 in the cc commands and this at least should be -O0. It doesn't look like DEBUG_FLAGS is having any effect.
RE: ntpd hangs under FBSD 8
>> How do I get libc built with full debug symbols?
>
> I haven't tried it by myself but think here is the way to go: put the following to /etc/make.conf and recompile needed libraries / ports.
>
>     WITH_DEBUG=yes
>     DEBUG_FLAGS=-g

That didn't seem to have any effect. I still see -O2 being used instead of -O0.

> Mmm... Do other daemons (sshd, lpd, ...) also fail when started through this script? Normal commands (ls, ps) seem not affected.

I tried a few other things and they all seemed to run correctly. We use this same general approach in the full version of this script to launch lots of applications. Its role in fact is a process launcher/monitor. I stripped it down to the bare minimum in order to isolate the cause of the problem. It seems that only ntpd hangs, but not if I use Python 2.5.1.
RE: ntpd hangs under FBSD 8
> I bet ntpd doesn't call select() in all that many places. Instead of going to all this trouble to build a debugging libc, you could just grep for select() and place breakpoints on all occurrences. (It might also be obvious from looking at them which one is the offender.)

I just checked--there are five calls to select. I might flag each one with a printf or something and recompile to see which one is the culprit.

> Also, since a system call is causing the trouble, you might learn something from truss or ktrace.

I'll check these out...
RE: ntpd hangs under FBSD 8
> make install should be done with DEBUG_FLAGS containing -g too, otherwise strip(1) is called on the installed binary.

Doh, yes. I did not do this; that's likely my problem. Thanks.
RE: ntpd hangs under FBSD 8
> Just out of curiosity, can you attach to the process via gdb and get a backtrace? This smells like a locked pthread_join I hit in my own code a few weeks ago

I'm not using the debug version of ntpd so the backtrace isn't too useful, but here's what I get:

    (gdb) bt
    #0 0x000800d52bfc in select () from /lib/libc.so.7
    #1 0x00425273 in ?? ()
    #2 0x0040540e in ?? ()
    #3 0x00080058 in ?? ()
    #4 0x in ?? ()

The trace continues for 700+ entries. The first entry is useful enough though. One of the parameters to select() is a timeout parameter. Every time I do the backtrace it's stuck on this select call so it seems they have an infinite timeout set. One of these was running all weekend in fact and it's still stuck.

Curiously, this problem only happens when we make the call from code via a system() call. If I run the same command interactively, it never hangs:

    # /usr/sbin/ntpd -g -q
    ntpd: time set +28845.997063s

The same code that runs this command does not hang when we run it on a BSD 7 box. I think I'm going to have to build the debug version of ntpd and try to debug it. Definitely something weird going on.
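To illustrate the timeout behaviour being described, here is a small sketch using Python's select module rather than ntpd's actual C code. The helper name wait_readable is hypothetical. Passing None as the timeout corresponds to the NULL (infinite) timeout suspected above, and would block forever on a descriptor that never becomes readable.

```python
import os
import select


def wait_readable(fd, timeout):
    """Return True if fd becomes readable within timeout seconds.

    timeout=None blocks indefinitely, like select() with a NULL timeout.
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    return bool(readable)


if __name__ == "__main__":
    r, w = os.pipe()
    print(wait_readable(r, 0.5))  # nothing written yet, so this times out
    os.write(w, b"x")
    print(wait_readable(r, 0.5))  # data is pending now
    os.close(r)
    os.close(w)
```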
RE: ntpd hangs under FBSD 8
> You're going to need a debug version of libc, too. gdb won't be able to find a backtrace out of a libc function without it.

Yeah, you're right. This is definitely an annoying bug...
ntpd hangs under FBSD 8
I posted this originally on the -questions list but did not make any headway.

We have an application where the user can change the date/time via a GUI. One of the options the user has is to specify that the time is to be synced using ntp. Our coding worked fine under BSD 7 but since we've moved to BSD 8 we've encountered a problem where the command that we initiate from the GUI:

    ntpd -g -q

to perform the initial time sync is hanging indefinitely. Logs we've captured do not give any clues. This is the log from a BSD 7 system produced when this ntpd command is run:

    17 Feb 06:35:36 ntpd[3578]: logging to file /var/log/ntpd.log
    17 Feb 06:35:36 ntpd[3578]: ntpd 4.2.0-a Sun Feb 24 09:12:07 UTC 2008 (1)
    17 Feb 06:35:36 ntpd[3578]: precision = 1.676 usec
    17 Feb 06:35:36 ntpd[3578]: kernel time sync status 2040
    17 Feb 06:35:36 ntpd[3578]: frequency initialized -10.706 PPM from /var/db/ntpd.drift
    17 Feb 06:35:45 ntpd[3578]: synchronized to 198.186.191.229, stratum=2
    17 Feb 06:35:45 ntpd[3578]: time slew +0.003648 s

and this is the log from a BSD 8 system:

    17 Feb 06:35:36 ntpd[2293]: logging to file /var/log/ntpd.log
    17 Feb 06:35:36 ntpd[2293]: precision = 1.676 usec
    17 Feb 06:35:36 ntpd[2293]: Listening on interface #0 wildcard, 0.0.0.0#123 Disabled
    17 Feb 06:35:36 ntpd[2293]: Listening on interface #1 wildcard, ::#123 Disabled
    17 Feb 06:35:36 ntpd[2293]: Listening on interface #2 nic0, fe80::2a0:d1ff:fee3:53cc#123 Enabled
    17 Feb 06:35:36 ntpd[2293]: Listening on interface #3 nic1, fe80::2a0:d1ff:fee3:53cd#123 Enabled
    17 Feb 06:35:36 ntpd[2293]: Listening on interface #4 lo0, fe80::1#123 Enabled
    17 Feb 06:35:36 ntpd[2293]: Listening on interface #5 lo0, ::1#123 Enabled
    17 Feb 06:35:36 ntpd[2293]: Listening on interface #6 lo0, 127.0.0.1#123 Enabled
    17 Feb 06:35:36 ntpd[2293]: Listening on interface #7 lagg0, 192.168.17.46#123 Enabled
    17 Feb 06:35:36 ntpd[2293]: Listening on routing socket on fd #29 for interface updates
    17 Feb 06:35:36 ntpd[2293]: kernel time sync status 2040
    17 Feb 06:35:36 ntpd[2293]: frequency initialized -10.706 PPM from /var/db/ntpd.drift

It never gets past this last log line and we have to do a kill -9 on the ntpd process. The ntp.conf file we're using is

    # General Configuration
    server 0.us.pool.ntp.org
    server 1.us.pool.ntp.org
    server 2.us.pool.ntp.org
    server 3.us.pool.ntp.org

    # Drift file
    driftfile /var/db/ntpd.drift

The versions of the two ntpd binaries are different--4.2.0-a for FBSD 7 and 4.2.4p5 for FBSD 8.

Someone suggested that I try the command:

    ntpq -pc rv localhost

But I'm not sure how to interpret the output. On a FBSD 7 system I get this:

    remote refid st t when poll reach delay offset jitter
    ==
    +169.229.70.183 169.229.128.214 3 u 40 512 377.9219.170 8.836
    *208.75.88.4 192.12.19.20 2 u 43 512 37 12.0498.224 8.168
    +217.160.254.116 209.51.161.238 2 u 38 512 37 55.111 -7.128 10.347
    +198.247.173.220 128.206.12.130 3 u 39 512 37 47.401 -1.149 3.659
    status=c624 sync_alarm, sync_ntp, 2 events, event_peer/strat_chg,
    version=ntpd 4.2.0-a Sun Feb 24 09:12:07 UTC 2008 (1),
    processor=amd64, system=FreeBSD/7.0-RELEASE-p9, leap=11, stratum=16,
    precision=-20, rootdelay=0.000, rootdispersion=8.340, peer=25349,
    refid=INIT, reftime=. Wed, Feb 6 2036 22:28:16.000, poll=4,
    clock=cf26c2d5.ea2b4541 Wed, Feb 17 2010 11:32:37.914, state=1,
    offset=0.000, frequency=-13.269, jitter=0.001, stability=0.000

and on a FBSD 8 system I get this:

    remote refid st t when poll reach delay offset jitter
    ==
    assID=0 status=c011 sync_alarm, sync_unspec, 1 event, event_restart,
    version=ntpd 4.2.4p5-a (1), processor=amd64,
    system=FreeBSD/8.0-CURRENT, leap=11, stratum=16, precision=-19,
    rootdelay=0.000, rootdispersion=0.000, peer=0, refid=INIT,
    reftime=. Wed, Feb 6 2036 22:28:16.000, poll=6,
    clock=cf26c4d1.d21b33f1 Wed, Feb 17 2010 11:41:05.820, state=1,
    offset=0.000, frequency=-14.299, jitter=0.002, noise=0.002,
    stability=0.000, tai=0
    169.229.70.183 .INIT. 16 u- 6400.0000.000 0.002
    208.75.88.4 .INIT. 16 u- 6400.0000.000 0.002
    217.160.254.116 .INIT. 16 u- 6400.0000.000 0.002
    198.137.202.16 .INIT. 16 u- 6400.0000.000 0.002

In the case of the FBSD8 output, I collected this while one of these hangs was happening. The most obvious difference is the .INIT. entries, but there also appear to be several 0.0 type of entries that look like the ntpd process is stuck in some kind of initialization state.

Anyone have any ideas what's going on here?
RE: How can I force boot from alternate drive with boot.config?
>>> So, more precisely, if I wanted to boot from drive 1, I'd use this?
>>>
>>>     1:ad(1p3)/boot/loader
>>
>> Yes, unless there are more bugs hiding. :-) I fixed a few in August last year.
>
> Well, I'll give it a try and let you know if I find new bugs... :-)

I just tried this and it works as advertised--thanks.

One question though: Why does this string list the device number twice? The man page describes it as

    bios_drive:interface(unit,[slice,]part)filename

where bios_drive is "the drive number as recognized by the BIOS. 0 for the first drive, 1 for the second drive, etc.", and unit is "the unit number of the drive on the interface being used. 0 for the first drive, 1 for the second drive, etc."

This sounds like it's describing the same thing, but not exactly, but I've always used the same value in both fields and it's always worked. Is there a case where these values might be different?

In the test I just did I booted from the fourth drive of a four drive system using

    3:ad(3p4)/boot/loader

I know my hardware and knew ad10 mapped to the fourth drive and would be referenced as drive 3 in this context. But how would I determine this generically? For example, given something like /dev/adN, how do I know what number I'd use for this drive in boot.config?
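As a small illustration of the GPT form of the string used in this thread, the sketch below only assembles the text from its fields. The helper boot_config_line and its parameter names are made up for this example, and it says nothing about how the loader maps BIOS drive numbers to units, which is the open question above.

```python
def boot_config_line(bios_drive, unit, gpt_index,
                     interface="ad", path="/boot/loader"):
    """Build a boot.config entry of the form 1:ad(1p3)/boot/loader.

    bios_drive and unit are kept as separate arguments because the man
    page describes them as distinct fields, even though the examples in
    this thread always use the same value for both.
    """
    return f"{bios_drive}:{interface}({unit}p{gpt_index}){path}"


if __name__ == "__main__":
    print(boot_config_line(1, 1, 3))
    print(boot_config_line(3, 3, 4))
```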
How can I force boot from alternate drive with boot.config?
I've asked this on the -questions list but haven't had any feedback.

I have a system configured with multiple identical drives each loaded with FreeBSD. When I was using MBR partitioning, I could create a boot.config to force the system to boot from a specific drive. For example, if I wanted to boot from the second drive, I'd create a boot.config with:

    1:ad(1,a)/boot/loader

We've switched to GPT partitioning and I can't seem to find a way to do this same trick. The boot loader only seems to recognize MBR partitions when it comes to this feature. I looked at the boot.c source code and there doesn't seem to be anything specifically related to GPT partitioning. I cannot for example say something like:

    1:ad(1,p3)/boot/loader

where p3 is the root partition in my GPT partitioned drives.

So I'm puzzled: If I have a two drive system with BSD loaded on both drives and the drives are configured with GPT partitions, how can I force the system to boot from the second drive using boot.config?
RE: How can I force boot from alternate drive with boot.config?
> I use:
>
>     ad(0p3)/boot/loader

So, more precisely, if I wanted to boot from drive 1, I'd use this?

    1:ad(1p3)/boot/loader
Converting a bootable USB stick in to bootable CD-ROM
I posted this on the -questions list but didn't get any replies.

I have a FreeBSD image that I install on USB sticks to build new systems. When the stick boots it automatically clones itself on the system's hard drive, creating partitions and other configuration parameters that are programmed into the stick's cloning logic.

I want to create a similar mechanism using a bootable CD-ROM. The biggest difference in the process of course is that the CD-ROM itself is read-only so clearly there needs to be an mfsroot involved in the process. I looked at how the FreeBSD Live CD is set up and the loader.conf file has these lines:

    mfsroot_load="YES"
    mfsroot_type="mfs_root"
    mfsroot_name="/boot/mfsroot"

along with the file /boot/mfsroot.gz and no /etc/fstab. The fstab on my USB stick version has root mounted as /dev/da0s1a and clearly that isn't going to work. I changed my core BSD image accordingly, duplicating the mfsroot settings in my loader.conf.

I used the command below to create the iso file from the BSD image I prepared.

    mkisofs -R -no-emul-boot -o /tmp/bsd.iso -b boot/cdboot /bsd

When this iso is copied to a CD, it does boot. However, it doesn't seem to be picking up the mfsroot config and complains that the system is running on a read-only file system, which of course is what I'm trying to avoid.

I assume I simply have the boot config set up wrong. I essentially want the same kind of thing that's done for BSD Live. Can anyone point me to the right info for setting up this kind of bootable BSD CD?
How to signal a time zone change?
We have a suite of applications with a Java GUI controlling everything. One of the actions the user can perform is to set the time zone. We do this through our Java application and update /etc/localtime as required. We also make an API call to tell the JVM that the time zone has changed, and from the perspective of the Java app, the time zone is changed correctly (the timestamps for example in our log files reflect the change). Likewise, after the user performs this action, running date on one of our systems shows that the time zone has been changed as requested.

The problem is with our C applications. They continue to operate with the old time zone, so things like timestamps in log files are not in sync with the timestamps in the Java app log files. If we stop and restart the C apps they pick up the time zone change. However, we don't want to take this extreme approach. We want the Java app to signal to the C applications that the time zone has changed.

However, I've experimented with the various time zone related calls and I cannot figure out what call is needed to make the C applications pick up the time zone change. I've tried setting the environment variable TZ to the new time zone and this doesn't seem to work, and I've tried calling tzset() and tzsetwall(). In each case after I make these calls the function localtime() does not return the same time base as the Java application.

Based on what I've read, I would think that the following steps would do the trick on the C side after the Java app changes the time zone and updates /etc/localtime:

    time_t date = time(NULL);
    unsetenv("TZ");
    tzset();
    printf("time zone is %s/%s", tzname[0], tzname[1]);
    struct tm *locTime = localtime(&date);
    printf("%02d:%02d:%02d", locTime->tm_hour, locTime->tm_min, locTime->tm_sec);

The time printed in this example however is still based on the old time zone. The tzname variable that is set by tzset() still shows for example EDT even if I have just changed the time zone to PDT. If I stop and restart the C app, the time is correct, and tzname is then PDT instead of EDT.

I'm very puzzled about what I'm supposed to do to kick start the time zone change in C. We do not want to have to restart our C apps for something as trivial as this.

I posted this originally to the questions list but didn't get much traction. I'm hoping someone on this list can point me in the right direction.
RE: How to signal a time zone change?
> You need to signal your app in some way. Assuming you have source for the app then you can monitor /etc/localtime (or /etc) for change with kevent.

Signaling our C apps isn't the problem. We have an IPC framework in place and we can easily tell the C apps when the user has changed the time zone via the GUI. The problem is I can't figure out what C calls are needed to apply the time zone change. Based on the documentation, I would think that tzset() would do the trick once /etc/localtime has been updated by the Java app, but this does not work. The only way I've discovered that works is to restart our C apps and we want to avoid that.
RE: How to signal a time zone change?
> What's the value of the TZ environment variable for the C apps? You may need to have them read the new value from somewhere, and then rerun tzset().

The default value of the TZ environment variable is null. I just tried passing the explicit time zone value to the C app and setting TZ to that value and that seemed to work. I would think that I should be able to retrieve that value from /etc/localtime as the docs imply. Guess not. If I have to pass the time zone to the C app, then I guess that's what I'll do...
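The approach that worked here (set TZ explicitly, then call tzset()) can be sketched with Python's wrappers around the same libc calls. This is an illustration rather than the poster's C code: switch_timezone is a hypothetical helper, time.tzset() is Unix-only, and the POSIX-style TZ strings in the demo are placeholders chosen so the example does not depend on an installed zoneinfo database.

```python
import os
import time


def switch_timezone(tz):
    """Point the running process at a new time zone without restarting it."""
    os.environ["TZ"] = tz   # pass the zone explicitly, as in the thread
    time.tzset()            # make the C library re-read TZ
    return time.tzname      # (standard name, daylight name)


if __name__ == "__main__":
    print(switch_timezone("EST5EDT,M3.2.0,M11.1.0"))
    print(time.strftime("%H:%M:%S", time.localtime()))
    print(switch_timezone("PST8PDT,M3.2.0,M11.1.0"))
    print(time.strftime("%H:%M:%S", time.localtime()))
```

On a system with zoneinfo installed, a name such as America/Los_Angeles could presumably be passed instead of a POSIX-style string.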
Number of open files per process
Is it possible to determine the number of open files per process? We want to monitor this via a separate process and issue an alarm if some threshold is crossed.
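One possible starting point, offered as a sketch only: a process can count its own descriptors by listing /dev/fd. This assumes /dev/fd reflects every open descriptor (on FreeBSD that, as far as I know, requires fdescfs to be mounted), and count_open_fds is a hypothetical helper name. For watching a different process from a monitor, fstat(1) with -p or procstat(1) with -f list the open files of a given pid, and their output lines could be counted instead.

```python
import os


def count_open_fds(fd_dir="/dev/fd"):
    """Count descriptors open in the calling process by listing fd_dir.

    The total includes the descriptor that the listing itself uses
    temporarily, so compare counts rather than trusting the absolute value.
    """
    return len(os.listdir(fd_dir))


if __name__ == "__main__":
    before = count_open_fds()
    handle = open(os.devnull)
    print(count_open_fds() - before)  # one more descriptor than before
    handle.close()
```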
WARNING: Expected rawoffset 0, found 63?
I posted this on the questions list but didn't get a lot of traction.

I've created GEOM mirrored file systems on two slices of my system's drives and everything seems to be working, but I get the warnings

    WARNING: Expected rawoffset 0, found 63
    WARNING: Expected rawoffset 0, found 50332464

when the mirrors are being created. These correspond to the offsets for these slices in the partition table:

    # fdisk -p ad4
    # /dev/ad4
    g c484521 h16 s63
    p 1 0xa5 63 50332401
    a 1
    p 2 0xa5 50332464 16778160
    p 3 0xa5 67110624 421285536

Partition three is not mirrored, just partitions 1 and 2. I use the following command to create the slice 1 mirror:

    gmirror label -v -n -b round-robin <mirror-name> <drive-name>s1

and a similar one for slice 2. Additional drives are added to this mirror after the data has been copied to the mirrored file systems.

The disks are set up with the required labels, including making sure the c partition is reduced in size by one sector. E.g.:

    # bsdlabel ad4s1
    # /dev/ad4s1:
    8 partitions:
    #        size   offset    fstype   [fsize bsize bps/cpg]
      a: 10485760       16    4.2BSD     2048 16384 28528
      c: 50332400        0    unused        0     0         # raw part, don't edit
      d:  8388608 10485776    4.2BSD     2048 16384 28528
      e: 31457280 18874384    4.2BSD     2048 16384 28528
    bsdlabel: partition c doesn't cover the whole unit!
    bsdlabel: An incorrect partition c may cause problems for standard system utilities

    # bsdlabel ad4s2
    # /dev/ad4s2:
    8 partitions:
    #        size   offset    fstype   [fsize bsize bps/cpg]
      b: 16778143       16      swap
      c: 16778159        0    unused        0     0         # raw part, don't edit
    bsdlabel: partition c doesn't cover the whole unit!
    bsdlabel: An incorrect partition c may cause problems for standard system utilities

So as far as I can tell I have everything configured the way it should be and everything appears to be working fine, but these warnings worry me. Should I be worried?
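The claim that the warning values are simply the slice start offsets can be checked with a little arithmetic on the fdisk -p output. The sketch below uses the numbers copied from this message; the names SLICES, WARNED_OFFSETS and check_contiguous are made up for the example, and it does not explain why GEOM emits the warning.

```python
# (slice number, start sector, size in sectors), copied from the fdisk -p output
SLICES = [
    (1, 63, 50332401),
    (2, 50332464, 16778160),
    (3, 67110624, 421285536),
]

# Values reported in the two "Expected rawoffset 0, found N" warnings
WARNED_OFFSETS = [63, 50332464]


def check_contiguous(slices):
    """Return True if every slice starts exactly where the previous one ends."""
    return all(prev[1] + prev[2] == cur[1]
               for prev, cur in zip(slices, slices[1:]))


if __name__ == "__main__":
    print(check_contiguous(SLICES))
    print([start for _, start, _ in SLICES[:2]] == WARNED_OFFSETS)
```

Both checks hold for these numbers: 63 + 50332401 = 50332464 and 50332464 + 16778160 = 67110624, and the two warnings name the start sectors of the two mirrored slices.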
Re: How to tear down a geom mirror?
> Or simply use the clear command, for example gmirror clear (also supported in other GEOM classes).

Can I do a gmirror clear without first doing a gmirror load? That's what I want to avoid since it can hang if the mirror is in a bad state.
Re: How to tear down a geom mirror?
> gmirror and various other geom modules store their metadata on the last sector(s) of the drive, so you need to wipe that too.

In our case the systems we are using aren't mirroring the whole drive, just certain slices. Some systems have a single slice mirrored (plus an unmirrored slice), and others have two slices mirrored (plus a third unmirrored slice). I need a way to destroy the existing mirrors, without doing a gmirror load, and ultimately without making any assumptions about the number or condition of mirrored slices on the drives I am about to install a new OS onto.
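Since the metadata sits in the last sector of each mirrored provider, for a mirrored slice that sector is far from the start of the disk. A minimal sketch of the arithmetic follows; the helper name is hypothetical and the numbers are purely illustrative (a slice starting at sector 63 that is 50332401 sectors long).

```shell
# Hypothetical helper: absolute sector number (on the whole disk) of the
# last sector of a slice, given the slice's start sector and its size.
slice_last_sector() {
    echo $(( $1 + $2 - 1 ))
}

# Illustrative values only; prints 50332463.
slice_last_sector 63 50332401
```

Zeroing only the first few dozen sectors of the disk would therefore leave that sector untouched.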
Re: How to tear down a geom mirror?
> Yes. The clear commands usually just zero-out the last sector of the underlying provider (doesn't matter if it's a drive, slice or something altogether different) so you don't have to do it manually.

So, as a generic solution then I could just iterate through all slices of all drives and run gmirror clear on each, and run dd to clear the first sectors. What btw is in these first sectors? I use this command because I saw it being done in one of the gmirror tutorials. I understand what the gmirror clear command does, but what is the dd command clearing?
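The iteration described here could be sketched as a dry run that only prints the commands. The function name and the provider names are placeholders, not taken from any real system; a real script would first have to enumerate the slices actually present.

```shell
# Dry-run sketch: print a `gmirror clear` command for every provider name
# given as an argument. Nothing is executed here; review the output before
# running any of it as root.
clear_cmds() {
    for prov in "$@"; do
        echo "gmirror clear $prov"
    done
}

# Placeholder provider names.
clear_cmds ad1s1 ad1s2 ad2s1 ad2s2
```

Printing first and executing later keeps the destructive step separate from the logic that decides which providers to touch.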
Re: How to tear down a geom mirror?
Okay, thanks everyone for their feedback. I think I have a workable solution now.

Peter

----- Original Message -----
From: Oliver Fromme o...@lurza.secnetix.de
To: freebsd-hackers@FreeBSD.ORG, pste...@maxiscale.com
Sent: Friday, March 6, 2009 11:15:11 AM GMT -08:00 US/Canada Pacific
Subject: Re: How to tear down a geom mirror?

Peter Steele wrote:
> > Yes. The clear commands usually just zero-out the last sector of the underlying provider (doesn't matter if it's a drive, slice or something altogether different) so you don't have to do it manually.
>
> So, as a generic solution then I could just iterate through all slices of all drives and run gmirror clear on each, and run dd to clear the first sectors. What btw is in these first sectors? I use this command because I saw it being done in one of the gmirror tutorials. I understand what the gmirror clear command does, but what is the dd command clearing?

It clears the MBR (slice table) and GPT or disklabel (partition table), if any. Depending on how many sectors you clear, it will also destroy the beginning of the file system, e.g. the first UFS superblock.

By the way, if you cannot use gmirror clear for any reason, you can also easily clear the last sector on any device using the information from diskinfo. For example:

    DEV=/dev/ad0s1a
    set -- $(diskinfo $DEV)
    # Field 2 is the sector size, field 4 the media size in sectors.
    BLOCKSIZE=$2
    MEDIASIZE=$4
    LASTSEC=$(( $MEDIASIZE - 1 ))
    dd if=/dev/zero of=$DEV bs=$BLOCKSIZE seek=$LASTSEC count=1

That's pretty much what gmirror clear /dev/ad0s1a does.

Best regards
Oliver
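The arithmetic in the diskinfo snippet quoted in this message can be checked without touching a disk by building the dd command as a string. This is only a sketch: the function name is hypothetical, the sample line is made up, and it assumes diskinfo prints the device name, sector size, media size in bytes and media size in sectors, in that order.

```shell
# Build (but do not run) the dd command that zeroes the last sector of a
# provider, from one line of `diskinfo` output.
# Assumed field order: name, sector size, size in bytes, size in sectors.
# Assumes diskinfo was given the full /dev path, so field 1 is usable as of=.
last_sector_dd() {
    set -- $1
    dev=$1 blocksize=$2 sectors=$4
    echo "dd if=/dev/zero of=$dev bs=$blocksize seek=$(( sectors - 1 )) count=1"
}

# Made-up diskinfo line for illustration (a 5 GiB provider, 512-byte sectors).
last_sector_dd "/dev/ad0s1a 512 5368709120 10485760"
```

Because `seek` counts blocks of size `bs`, the sector count (field 4) is the right input, not the byte count (field 3).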
How to tear down a geom mirror?
I posed this question in the questions list but didn't get any traction. Hopefully someone here will have an answer.

I've created a USB boot disk that is used to clone itself onto the system's hard drives, setting up mirrored file systems in the process. The main difficulty I'm having is reimaging a system with an existing OS whose drives are already configured in a mirror. I want of course to destroy the mirror and create a completely new one, but I can't find the right process to accomplish this reliably. I don't want to make any assumptions about what mirrors might exist already and I definitely don't want to do gmirror load before I get a chance to destroy any existing mirrors.

What I am doing is to clean the drive using dd. For example, assume my target system has two drives ad1 and ad2. I issue the following commands:

    dd if=/dev/zero of=/dev/ad1 bs=512 count=79
    dd if=/dev/zero of=/dev/ad2 bs=512 count=79

I'm assuming this is enough to destroy any existing mirrors on the target drives, and I do this before the geom driver is loaded. After this, I partition the drives as I want them, and then create the mirrored pair:

    gmirror load
    gmirror label -v -n -b round-robin gm0 ad1s1
    gmirror insert gm0 ad2s1

This process works exactly as I want it if the system that is being reimaged has no existing mirrors. However, if the drives were previously participating in a mirror, the label command fails, reporting the following error:

    gmirror: Can't store metadata on ad1s1: Operation not permitted.

If I make sure the existing mirrors are torn down first, doing a remove operation instead of using the dd method, this can solve the problem, but in some cases the mirror on the target system is in a suspect state and I've seen the gmirror load command hang indefinitely. So I don't want to do a load command before I destroy the old mirrors, but I can't seem to find a way to reliably destroy the old mirrors. Can anyone suggest a way to do this?
RE: FreeBSD boot menu is missing
> Mirroring the entire slice is far simpler. If you mirror individual partitions, you have to label them *before* you newfs them.

What we're really trying to accomplish is an automated install via a PXE boot server. Unfortunately gmirror isn't available in mfsroot at the point the file systems need to be set up. So what we've ended up doing is what amounts to a bootstrap install on the first disk, and then after the installCommit is done, gmirror is available and we have a post install script that runs gmirror on the other drives. Then the script copies the OS slice over to the gmirrored fs, reboots to this mirrored system, and finally adds the original disk to the mirror. It's fully automated and gives us a mirrored OS slice across four drives, and we even handle drives of different sizes.

> I would mirror the whole drive, though

We can't do that. The data on the non-mirrored portion is different on each drive and we don't want it mirrored.

> - and I would use ZFS, with which you can easily transition to larger drives (just replace them one by one and resilver in between - you can even do it online if your disks are hot-swappable)

FreeBSD doesn't handle hot swap very well, we've discovered, not unless you are using a RAID based backplane and drives. We cannot use RAID in our application, and don't in fact want to. We're still trying to figure out how to deal with drive removal in a live non-RAIDed system.

We plan to move to ZFS but we are too close to a release cycle to make the move now (QA would have to run through weeks of testing). ZFS will happen, though, sooner or later.
RE: FreeBSD boot menu is missing
> So what you do, instead, is make sure there is a little space left over at the end of the slice that you create in the first step. Then, once gmirror is available, you gmirror label the slice, then gmirror insert the corresponding slice on the other disk(s), and gmirror rebuild. No copying involved; gmirror takes care of it all. The key here is that 'gmirror label' is non-destructive as long as the last sector on the provider is unused.

The problem is I was unable to get multiple slices defined in a sysinstall config script. I tried many variations of parameters to pump into diskPartitionEditor and diskLabelEditor so that we could create three slices during the install but I couldn't find anything that worked. So I ended up having to create a single full disk slice to install the OS onto, and then in a post commit step slice the disks up as we want them and copy the OS over. I couldn't find a single example of how to create multiple slices in a sysinstall config file. If you know how to do this, I'd love to see it.

> It does, AFAIK, even on SATA, provided the controller supports it and is configured correctly.

With the proper controller and drive, yes, FreeBSD does support hot swap, to a point. Let's say for example that you have a file system mounted on a drive and that drive dies. You can pull it and put in a new one, but FreeBSD will not let you unmount the file system on the original drive. Even umount -f fails. We have to reboot to get the old mount point released, and we haven't found any way around this.
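The label and insert steps from the quoted advice could be sketched as a dry run like the one below. The function name, the mirror name gm0 and the provider names are all placeholders, and the sketch only prints commands; it does not check that the last sector of the first provider is really unused, which is the precondition the advice depends on.

```shell
# Dry-run sketch of the in-place conversion: print the commands, do not run them.
# $1 = mirror name, $2 = provider holding the live data, remaining = new providers.
mirror_in_place_cmds() {
    name=$1 first=$2
    shift 2
    echo "gmirror label -v -b round-robin $name $first"
    for prov in "$@"; do
        echo "gmirror insert $name $prov"
    done
}

# Placeholder names: gm0 for the mirror, ad0s1 and ad1s1 for the slices.
mirror_in_place_cmds gm0 ad0s1 ad1s1
```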
RE: FreeBSD boot menu is missing
> I wouldn't use a sysinstall script.

Yeah, I should probably have done it that way but I inherited the existing sysinstall framework from someone else and ended up extending it to use gmirror. I know more about this area now and I'd like to redo the whole thing, avoiding sysinstall. That will have to be a future project though.

> That's an entirely different matter... that's why you use gmirror or graid or zfs or whatever, so you can swap out the drive online.

RAID is not an option for us, at least not for this particular problem. Long story.
FreeBSD boot menu is missing
I have a procedure for converting a FreeBSD box to use a mirrored slice for the OS. Everything is working fine except that after I've made the conversion I am no longer getting the normal boot menu, the one that counts down 10 seconds waiting for the user to pick an option. I see a single line showing that the BTX 1.01 loader has been launched, but from there the system simply boots directly with no menu being displayed.

I'm obviously missing a step when using gmirror to convert a system over to use mirroring but I'm not sure what. My basic approach is to install the OS onto the first drive, setting it to use the standard boot manager, and then set up the second drive using gmirror and copy the file systems over to the mirror. I then set boot.config to boot off this drive and it comes up fine, there just isn't any boot menu.

Any advice on how to solve this would be appreciated. Thanks.
RE: FreeBSD boot menu is missing
> The phrase "and copy the file systems over to the mirror" worries me. Do you actually copy the file systems, or do you let the mirror system do it for you? In particular, are you mirroring file systems or the entire disk? Because the boot blocks aren't part of any file system, so you won't have copied them over, hence you'll be getting whatever boot software the second drive has installed.

I'm more or less using the approach described here:

http://people.freebsd.org/~rse/mirror/

This assumes you have an existing OS installed on one drive of a multi-drive system. You then use gmirror to create mirror devices on a second drive to match the partitions of the boot drive, transfer the data to the newly established mirror, adjust /etc/fstab on the mirrored root partition to mount the appropriate mirrored devices, then reboot, telling the boot loader to boot from the mirrored drive instead of the original boot drive (via an entry in boot.config). After it comes up, you can then add the original boot drive to the mirror (and any other drive if there are more than two drives that you want to mirror) using gmirror insert.

This all works fine, except I'm not getting the boot menu. I know this isn't part of the mirroring, but it is a step I need to perform as part of the whole process. The question is what do I need to do to make sure the appropriate boot loader is set up?

> My recommendation for gmirror is to set up one drive to boot from, then use gmirror label to create a gmirror device on each partition (excluding swap). Edit /etc/fstab to use the gmirror devices thus created, and reboot to make sure it's working properly. It will initially boot from the disk device (pretty much required until gmirror is started), then switch to the mirrored root partition. Now use gmirror insert to add the matching partitions on the second disk, and let gmirror update the bits on the second drive. You'll need to copy the boot blocks from the first drive to the second drive by hand if you want to boot off the second drive.

I think you are describing more or less the same process here.

> FWIW, these days I use ZFS on 64 bit systems in preference to UFS and gmirror.

We plan to switch our application over to ZFS, but not this close to a release.

> Final comment: if you didn't ask on -questions first, this would have been more appropriate there than here.

My bad. I'm new in this arena and didn't know the appropriate place to post. I'll use -questions in the future.
RE: FreeBSD boot menu is missing
> He had you install a stock MBR on the second disk. You never copied the boot loader from the first disk, so that's what you're going to use when you boot from the second disk. You need to install the boot block you want on the second disk. Which probably means boot0. boot0cfg will do that for you. You probably want
>
>     boot0cfg -B -s 1 diskdevice # The device - ad1, not the slice!

Okay, that makes sense. That's an easy change to my script.

> Um, no. He reduced the size of one partition because he's overly paranoid about gmirror failing to recognize the providers properly, which forces him to dump and restore one partition - which leads to doing them all to get them on one disk. If you don't need to resize the partitions, you can just label the disk you're already using. Once you've done that, you can gmirror insert the second drive into the mirror, and it will resilver the second drive while providing full access to the first one. No need to copy any data at all.

Man, I wish I'd known this. I built a whole automated framework around this, assuming you couldn't set up the initial mirror drive with a live file system. I'll have to try your solution; it is definitely the way to go. We are dealing with identical size drives as well so this shouldn't be a problem.

> His analysis of the choices is pretty shallow as well. He lets wanting to use different-sized disks dominate the analysis, which is great if you're building your mirror with disks from the parts bin. I tend to buy drives in pairs if I want to mirror them, so that's immaterial. Once that's gone, mirroring a full disk slice just doesn't make sense at all - either mirror the entire disk (to get the MBR), or mirror the partitions in the slice (for extra flexibility and less painful resilvering).

We don't want to mirror the whole drive, just the OS partitions. I decided to go with the full slice mirroring because of what was described in this link. If mirroring the partitions in the slice is the better way to go, then that's fine by me.

> Better instructions for getting a full-disk mirror can be found here: http://www.onlamp.com/pub/a/bsd/2005/11/10/FreeBSD_Basics.html

I look forward to reading this. Thanks for the help!
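Since boot0cfg wants the disk device rather than the slice, a script that only knows the slice name needs to derive the disk name first. A minimal sketch, assuming device names of the form ad1s1 or ad1s1a; the helper name `disk_of` is made up, and the boot0cfg command is printed rather than executed.

```shell
# Hypothetical helper: strip a trailing slice/partition suffix (s1, s1a, ...)
# from a device name, since boot0cfg wants the disk, not the slice.
disk_of() {
    echo "$1" | sed -E 's/s[0-9]+[a-h]?$//'
}

# Print (not run) the boot0cfg command for the disk that holds slice ad1s1.
echo "boot0cfg -B -s 1 $(disk_of ad1s1)"
```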
Hot swapping SATA drives
I've done some searches regarding FreeBSD 7's support for the hot swapping of SATA drives and the general consensus appears to be that it *is* supported, but not necessarily with all drive models/brands. In our own testing, we've discovered that our Seagate 250GB drives cannot be hot swapped in our servers. The system appears to sense when they are removed but not when they are reinserted, and we've had numerous panics experimenting with them.

We also have some Western Digital drives, and these fare much better. FreeBSD appears to recognize when these drives are removed and inserted. If we have a WD configured as part of a geom mirror, the geom driver automatically re-inserts a previously configured drive as soon as it is plugged in. It isn't even necessary to do an atacontrol attach/detach.

However, even with the Western Digital drive, there are issues. In particular, if there are any mounted file systems on a drive when it is removed, attempting to unmount the file systems after it has been removed usually leads to a kernel panic, not necessarily immediately but shortly afterwards. I've tried the latest 7.0 patch level, p6, and the panics appear to have been fixed, but there are still problems. If a drive dies on us, we want to be able to close existing file handles and allow the new drive to take over. But what we've experienced is that even a umount -f will not unmount a file system if the drive has been pulled. And as I type this, I have a system in the lab that is completely frozen after a drive pull test. No panic, no reboot, it's just hung up solid.

Why does FreeBSD panic/freeze instead of simply issuing an I/O error, and why is there no way to force open file handles to close when a drive is pulled? The implication is that if a drive was to suddenly die on a live system, even if we have gmirror configured for HA, the system will likely panic or freeze and we'll have to reboot. We have software that detects when a drive disappears, but if the system is going to end up having to be rebooted, our detection code isn't going to do us much good.

Is there any solution to this? Can a server be built around FreeBSD that supports hot swappable SATA drives?
RE: Hot swapping SATA drives
> Use a real hot-swappable drive plane, attached to a good SATA controller that handles hot-swap in hardware? :)
>
> Use ZFS, which seems to work better with drives being added/removed than ata(4)? :)
>
> Sorry, the few systems we have running FreeBSD either have single IDE drives, single SATA drives, or 12-24 SATA drives attached to a hardware hot-swappable drive-plane connected to 3Ware 9550/9650 RAID controllers. The single-drive systems obviously can't do swapping, and the rest work without issues.

I should have further clarified that we are running 4-drive systems, with drive sizes ranging from 250GB-1TB. These drives are not in a RAID cluster and we do not want them to be. We do need the drives to be hot swappable though. I'll contact 3Ware and go from there. Thanks for the reply.
RE: What are proper install.cfg for configuring multiple slices?
> I want to do an automated sysinstall through an install.cfg script and have the script partition the install disk into three slices. I've been going through various tests trying to figure out what the proper directives are but I haven't had much luck, and I can't find any good examples.

After a lot of experimenting, my impression is that sysinstall simply doesn't support multiple slice installations. It works to a point, but I get some unexpected errors, e.g.

    Unable to make device node for /dev/ad0s1a in /dev

and after the partitioning is complete, there are funky entries under /dev:

    /dev/ad0c
    /dev/ad0cs1
    /dev/ad0cs2
    /dev/ad0cs3
    /dev/ad0s1
    /dev/ad0s2
    /dev/ad0s3

There should be entries such as /dev/ad0s1a and so on, but these do not get created. I've been unable to find even one example of how to formulate multiple partitions in install.cfg, but I'm pretty sure I'm doing it right, based on the sysinstall docs. Does anyone have any experience with this?
RE: How can I add new binaries to the mfsroot image?
> I believe you modify /usr/src/release/${ARCH}/boot_crunch.conf to do this. I haven't actually tried though...
>
> I think it would be possible to have a 'GEOM' menu that you can run prior to fdisk, label, etc that would allow you to do some basic stuff like this. While the sysinstall code is a bit fugly it's not that difficult to hack on (speaking from limited experience :)

Hmmm. I hadn't planned on actually creating a custom sysinstall but I guess that's another way we could approach this. I have some research to do...
RE: How can I add new binaries to the mfsroot image?
> What I've done in the past is skip sysinstall altogether and just boot off an NFS root. Then use custom scripts for the slicing/partitioning/mirroring, copy a minimal system to disk and pkg_add the rest. Would be nice to do all this with install.cfg though. Please let me know when you get this working.

I thought of doing something like this as well. I'll have to investigate this as another option to this problem. Thanks for the feedback guys.
What are proper install.cfg for configuring multiple slices?
I want to do an automated sysinstall through an install.cfg script and have the script partition the install disk into three slices. I've been going through various tests trying to figure out what the proper directives are but I haven't had much luck, and I can't find any good examples. Here is a snippet of my config file:

    disk=ad0
    bootManager=standard
    partition=12582912
    diskPartitionEditor
    partition=2097152
    diskPartitionEditor
    partition=free
    diskPartitionEditor
    ad0s1-1=ufs 4194304 /
    ad0s1-2=ufs 4194304 /tmp
    ad0s1-3=ufs 4194304 /var
    ad0s2-1=swap 2097152 none
    ad0s3-1=ufs 4194304 none
    ad0s3-2=ufs 4194304 none
    ad0s3-3=ufs 0 none
    diskLabelEditor
    diskLabelCommit

My intent here is to create three slices: one 6GB in size, another 1GB in size, and the third sized to consume the remaining free space. When I run this through sysinstall, it complains that it can't find the space for the partitions. It even complains that it can't find any free space. Because the slices don't get created, the subsequent label assignments fail as well. What are the proper commands for creating multiple slices in install.cfg?

Another thing I'm having trouble with is partitioning more than one disk. I have four disks that I'd like to partition as part of the install.cfg script. In fact, I want to partition the four disks more or less identically (although only one should have an active root partition). Again though, if I try partitioning another disk after ad0, sysinstall complains about various things and the disk does not get partitioned. Can multiple disks be partitioned in this manner or does the step have to be done as a post-install operation?
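The sizes in the snippet can be sanity-checked against the stated intent (6GB and 1GB slices). A small sketch, assuming the values are counts of 512-byte sectors; that unit is an assumption consistent with the stated sizes, not something the config itself says, and the helper name is made up.

```shell
# Convert a sector count to whole GiB, assuming 512-byte sectors
# (2097152 sectors of 512 bytes make one GiB).
sectors_to_gib() {
    echo $(( $1 / 2097152 ))
}

# Values taken from the install.cfg snippet in this message.
for n in 12582912 2097152 4194304; do
    echo "$n sectors = $(sectors_to_gib "$n") GiB"
done
```

Under that assumption the three 4194304-sector partitions in the first slice add up to exactly the 12582912 sectors requested for it.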
RE: How can I add new binaries to the mfsroot image?
> You wouldn't have to do so - you could just run a shell script from sysinstall and do what you want.

That brings me back to my original problem. Yes, I can run a shell script from sysinstall, but gmirror isn't available in mfsroot, and adding gmirror to mfsroot isn't straightforward because it needs shared libraries. I think the best approach to use may very well be to have a custom boot that mounts root from an NFS disk. Then I can run whatever commands I need without having to actually add anything to mfsroot...
RE: How can I add new binaries to the mfsroot image?
> I'm not sure, but probably the installation CD doesn't carry shared libraries at all? All binaries in /stand are static-linked ones.

Yeah, that is absolutely the problem--no shared libraries are available when sysinstall is running.

> You could also try scripts from mfsbsd project: http://people.freebsd.org/~mm/mfsbsd/ These work fine for me for building custom installation CDs.

I'll have to check this out. I'm not getting anywhere with trying to customize mfsroot with my current approach...
RE: How can I add new binaries to the mfsroot image?
> I'll have to check this out. I'm not getting anywhere with trying to customize mfsroot with my current approach...

The goal we are trying to achieve btw is to make gmirror available during an install so that the file systems are mirrored right from the get-go, so that we can avoid having to go through the process of converting a system as a post operation. The standard slicing/partition commands of sysinstall do *not* support the creation of a mirrored file system though, so our idea was to run a script via install.cfg to take care of the fdisk/bsdlabel/gmirror phase, and then install the packages in the normal fashion via subsequent steps in install.cfg.

Is this something that can be done via sysinstall? If not, what's the best alternative? This whole process is targeted to be on a PXE boot server so we can configure our systems in a completely automated hands-off manner. We have 200+ FreeBSD systems and we definitely need an automated process. We already have it working fine, but without mirroring. We can upgrade dozens of systems at a time simply by making them boot from our PXE server. We now need to tweak this process so that we can establish the mirrored file systems as part of the automated install.
How can I add new binaries to the mfsroot image?
I want to make a custom FreeBSD install CD-ROM with additional commands available in the mfsroot image. Adding the new commands to the image is easy enough, and I've made an install.cfg file on the CD-ROM as well so that when the CD runs, the commands in install.cfg are automatically executed. This all works, except none of the new binaries I add to the mfsroot image run during the automated sysinstall session. If I reference one of the default commands (the ones stored in /stand) they run fine, but if I add a new FreeBSD binary to the /stand directory (e.g. gmirror), the command fails.

What's weird is that I can open a fixit shell after the install.cfg script fails and then run the same commands interactively and they work fine. Why would these commands work in an interactive fixit shell but not during the automated sysinstall session?
RE: How can I add new binaries to the mfsroot image?
> How does it fail?

There doesn't seem to be any error generated. Or at least I tried to capture stderr and got nothing.

> Is the binary you added statically linked?

The command I'm doing most of my testing with is gmirror. I pulled it from one of our operational FreeBSD boxes, and it appears to be referencing several shared libraries:

    # strings /stand/gmirror | grep '.so.'
    /libexec/ld-elf.so.1
    libgeom.so.4
    libsbuf.so.4
    libbsdxml.so.3
    libutil.so.7
    libc.so.7

> Wild guess: the shared libraries are present somewhere else on the CD, which perhaps is either not mounted or not pointed to by LD_LIBRARY_PATH or similar until the fixit shell is run.

All of these shared libraries exist under /dist, which is mounted as the FreeBSD CD. The first one is an absolute path that is in fact a symbolic link in the fixit shell that ends up pointing to a location under /dist. LD_LIBRARY_PATH is not set in the fixit shell, so I'm curious how these shared libraries are being located under /dist (the ones without the explicit path).

I think you are right though, it might be related to the shared libraries. I'll try setting LD_LIBRARY_PATH explicitly to see if that solves the problem.
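The quick grep used in this message leaves its dots unescaped, so '.so.' matches any character on either side of "so". A slightly stricter filter might look like the sketch below; the function name is made up and the pattern assumes library names end in a numeric version, as the ones listed here do.

```shell
# Filter a list of strings (one per line) down to names that look like
# versioned shared objects, e.g. libc.so.7. The dots are escaped here.
shared_libs() {
    grep -E '\.so\.[0-9]+$'
}

# Sample input taken from the strings output in this message, plus one
# non-library line to show that it is filtered out.
shared_libs <<'EOF'
/libexec/ld-elf.so.1
libgeom.so.4
not-a-library
libc.so.7
EOF
```

On a real binary this could be used as `strings /stand/gmirror | shared_libs`; an empty result would suggest, though not prove, that the binary is statically linked.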