Re: [OpenIndiana-discuss] Deleting zpool.cache on OI
Yes, I suggest you follow the guidance given by Jim. Once you have the system up and running, you may want to try importing the pool with an explicit zpool import command.

Especially older versions of ZFS had a problem when large files were deleted: on mount, the filesystem tries to complete any interrupted deletes that were started before the pool was mounted. Did you by any chance delete some large files? The pool may come up eventually, but it may take a long time (possibly days). Once you decouple the startup from the pool mount, you can either just destroy the problem pool, or let it finish what it is doing in the background.

Steve

- Original Message -
Yes, Steve, exactly. I'd like to save the rest of my installation, but I have a pool that, when mounted on any system, prevents a reboot.

___
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss
Re: [OpenIndiana-discuss] Deleting zpool.cache on OI
I could be wrong, but I think he wants to know how to do this on a system that hangs while trying to mount the pool. Your article should help. Nice job, btw.

Steve

- Original Message -
What do you want to achieve this way? http://wiki.openindiana.org/oi/Advanced+-+ZFS+Pools+as+SMF+services+and+iSCSI+loopback+mounts
Re: [OpenIndiana-discuss] RAM based devices as ZIL
For a fast (high ingest rate) system, 4G may not be enough. If your RAM device does not have enough space to hold all the in-flight ZIL blocks, it will fail over, and the ZIL will just redirect to your main data pool. This is hard to notice unless you have an idea of how much data should be flowing to your pool as you monitor it with zpool iostat; then you may notice the extra data being written to the data pool.

The calculation of how much ZIL space you need is not straightforward, because blocks are in general freed in a delayed manner. In other words, it is possible that some ZIL blocks are no longer needed because the transactions they represent have already committed, but the blocks have not made it back to available status because of the conservative nature of the freed-block recycling algorithm. Rule of thumb: 3 to 5 txgs' worth of ingest, depending on who you ask.

Dedup and compression make SLOG sizing harder, because the ZIL is neither compressed nor deduped. I would say if you dedup and/or compress, all bets are off.

/sG/

- Original Message -
Hello, does anyone have any real-world experience using a RAM based device like the DDRdrive X1 as a ZIL on 151a7? At 4GB they seem to be a little small, but with some txg commit interval tweaking it looks like it may work. The entire 4GB is saved to NAND in case of a power failure, so it seems like a pretty safe solution (the entire system is on UPS and generator anyway). Thanks, Wim
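[Editorial note] The 3-5 txg rule of thumb above can be sketched as simple arithmetic. This is a hypothetical sizing aid, not ZFS code; the ingest rate and txg interval are assumptions you would measure on your own pool (e.g. with zpool iostat):

```python
def slog_size_bytes(ingest_bytes_per_sec, txg_interval_sec=5, txgs=4):
    """Rule-of-thumb SLOG sizing: hold 3-5 txgs' worth of sync ingest.

    Both parameters are assumptions to measure for your own workload;
    5 s is a commonly cited default txg commit interval.
    """
    return ingest_bytes_per_sec * txg_interval_sec * txgs

# A hypothetical 200 MB/s sync ingest with a 5 s txg interval and 4 txgs:
need = slog_size_bytes(200 * 1024**2, txg_interval_sec=5, txgs=4)
print(need / 1024**3)  # 3.90625 GiB -- right at the edge of a 4 GB device
```

On these assumed numbers, a 4G DDRdrive would be borderline, which matches the caution above about fast-ingest systems.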
Re: [OpenIndiana-discuss] spontaneous reboot with record in fault management
This looks like an actual PCI device error to me. I would dig deeper and look at the errors with fmdump -v -e

Steve

- Original Message -
On occasion I have systems spontaneously rebooting. I can often find entries like this in fault management, but it is not particularly helpful. I suspect there is really nothing wrong and the software is generating a panic and rebooting. Is there a way to mask this from any type of action, or to figure out what the source of the issue is? In this particular case, I watched the system dump 96GB of RAM onto a dedicated dump device. However, I was unable to retrieve the data afterwards and received a message from savecore that read something like 'savecore: bad magic number b'. Any insights would be appreciated. Thanks, j.
root@db017:~# fmadm faulty
Re: [OpenIndiana-discuss] Inefficient zvol space usage on 4k drives
Hi Jim,

This looks to me more like a rounding-up problem, especially looking at the bug report quoted. The waste factor increases as the block size goes down; it looks like it fits the ratio of a block's nominal size vs. its minimal on-disk footprint.

For example, compressed blocks are variable size. If a block compresses to some small but non-zero size, it still takes up the size of the smallest on-disk allocation unit. For an 8K block, the smallest non-zero allocation could be 4K (vs. 512 bytes). A similar thing happens to small files taking up less than a single block's worth of bytes: ZFS alters the block size for these to closely match the actual bytes stored, so a one-byte file takes up merely a single sector. For small files, a 512-byte vs. 4K minimum size can make a big difference. If most of the blocks are compressed, or there are a lot of small files, the 8K vs. 512 or the 8K vs. 4K ratio pretty much predicts a doubling of the on-disk footprint at 8K block size.

I do not see how the sector size could cause a similarly significant increase in the on-disk footprint by making metadata storage inefficient. I presume that when you are talking about metadata, you mean the interior (indirect) nodes of files. If a file is <= 3 blocks in size, it will not have any interior nodes. Otherwise, the nodes are allocated one page at a time, as many as needed. Metadata pages currently contain 128 block pointer structs (128 * 128 bytes == 16K). This interior node page size is independent of the file system's user-changeable block size, and I do not believe these pages are variable size. So a rough guesstimate would be: one 16K metadata page for every 128 blocks in the file. (Technically, there could be multiple levels of interior node pages, but the 128x fanout is so aggressive that you can neglect those for an order-of-magnitude rough guess.) On average, metadata takes up less than 2% of the space needed by user payload.

I am planning on playing with 4K sectors to try to repeat the experiment mentioned; I am curious what the performance and space usage implications are when file size and compression are taken into consideration.

Steve

- Original Message -
Yes, I've had similar results on my rig and complained some time ago... yet the ZFS world moves forward with desiring ashift=12 as the default (and it may be inevitable ultimately). I think the main problem is that small userdata blocks involve a larger portion of metadata, which may come in small blocks which don't fully cover a sector (supposedly they should be aggregated into up-to-16k clusters, but evidently are not always so).
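[Editorial note] The rounding-up arithmetic described above can be sketched numerically. This is a simplified model for illustration, not actual ZFS allocation code:

```python
def on_disk_size(payload_bytes, sector_bytes):
    """Round a (possibly compressed) block up to whole sectors."""
    sectors = -(-payload_bytes // sector_bytes)  # ceiling division
    return sectors * sector_bytes

# A block that compresses to 600 bytes:
print(on_disk_size(600, 512))   # 1024 on 512-byte sectors
print(on_disk_size(600, 4096))  # 4096 on 4K sectors, 4x the footprint

# Rough metadata estimate from the text above: one 16K indirect page
# per 128 blocks of a file, i.e. 16384 / 128 = 128 bytes per block.
meta_fraction = (16384 / 128) / (8 * 1024)  # for an 8K block size
print(meta_fraction)  # 0.015625, i.e. under 2% of user payload
```

The model shows why compressed or tiny blocks, not metadata, dominate the extra footprint at ashift=12.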
Re: [OpenIndiana-discuss] COMSTAR qlt dropping link and resetting
At one time I posted a dtrace script to track txg open times. Look for it in the forum archives, or I can repost it. Some other folks posted similar scripts as well. I would not be surprised to find a txg being open for an unusually long time when the problem happens; that would indicate a problem in the ZFS disk I/O path. E.g., large file deletions with dedup turned on may cause an I/O storm, and that in turn may cause your problem.

Steve

On May 18, 2012, at 4:25 AM, Adrian Carpenter ta...@wbic.cam.ac.uk wrote:
We are using one port of each of a pair of Qlogic 2562 cards to act as an FC target for our Xen environment; we are running oi_151a4. The other port on each card is used as an initiator to attach to some FC storage (Nexsan). We use two Qlogic SanBoxes configured so that the Xen hypervisors have a redundant path to the FC target, which provides a Storage Repository from a ZVOL. Everything works as expected with good throughput, but randomly, once every couple of days, BOTH FC targets on the OpenIndiana box simultaneously reset their FC connections - this of course causes real problems in the Xen environment and is not helped by having redundant multipaths. We are at a bit of a loss; does anyone have any suggestions? Adrian
Re: [OpenIndiana-discuss] Access to /etc on a pool from another system?
I am reading your mail as: I connect the disk array from a dismantled OI 148 system to another computer and... /etc is typically on your root pool (i.e. rpool). Note that the rpool is often on a different device, perhaps a flash drive, which you may or may not have connected to the new motherboard.

Steve

- Original Message -
I'm trying to access the /etc files from another system on which I installed OI 148. I can import the pool as fpool and can access /mnt/fpool and /mnt/export. But for the life of me I can't figure out how to get to the /etc filesystem in fpool. All the examples google turns up point to things I already know how to do (e.g. access fpool/export). I *think* I've done this before, but don't find any notes in my logbook. Thanks, Reg
Re: [OpenIndiana-discuss] How do I debug this?
It is a long shot, but check how much space you have where your core dumps are supposed to go; your root pool may have limited space. Also, the visibility of core dumps has security implications: they could be inaccessible unless you are looking as root.

Steve

- Original Message -
Thanks. I'll give it a go and see if I get a core file. Interesting that I have per-process core dumps enabled but this one just didn't show up.
Re: [OpenIndiana-discuss] Replacing OI 151 ssh with OpenSSH 5.9?
Take a look at README.altprivsep in usr/src/cmd/ssh. It seems the Solaris team significantly changed how privilege separation works. Looking at the Illumos hg log (which contains the tail end of the OpenSolaris hg log), the Sun ssh code was periodically resynced with OpenSSH. The last resync visible is 2009/408 (presumably 2009, April 8). That would peg the Sun ssh version as last synced with OpenSSH 5.2. The current OpenSSH is 5.9.

Steve G

- Original Message -
They're needed so that sshd correctly uses Solaris's version of PAM and audit and other subsystems like that. Probably, but someone would have to do the work.
Re: [OpenIndiana-discuss] ZIL write cache performance
In the scenario you are describing, an SSD should be faster. The reason it is not cut and dried is that you are comparing the spare bandwidth of your array, which possibly has many spindles, to the bandwidth of a single device. So it depends on how fast your array is, how much spare bandwidth it has, and what the sustainable write rate of your SSD is. There is no guarantee that the SSD will come out on top, although in your case it would with the perf fix.

::sG::

On Jan 13, 2012, at 6:56 AM, Matt Connolly matt.connolly...@gmail.com wrote:
Yes, it is as you guess: comparing ZIL on the main pool vs. ZIL on an SSD. I understand that the ZIL on its own is more of an integrity function rather than a performance boost. However, I would have expected some performance boost by using an SSD log device, since writing to the dedicated log device reduces I/O load on the main pool (or is this wrong?). Thanks for the heads up about the bug and pending fix; I'll take a look. -Matt.

On 13/01/2012, at 9:34 AM, Steve Gonczi gon...@comcast.net wrote:
Hi Matt, the ZIL is not a performance enhancer. (This is a common misunderstanding; people sometimes view the ZIL as a write cache.) It is a way to simulate sync semantics on files where you really need that, instead of the coarser granularity guarantee that ZFS gives you without it (txg level, where the in-progress transaction group may roll back if you crash). If I am reading your post correctly, you are comparing 2 scenarios: 1) ZIL is enabled, and goes to the main storage pool. 2) ZIL is enabled, but it goes to a dedicated SSD instead. Please verify that this is indeed the case. Yes, this is the case. You should not expect an SSD-based ZIL to perform better than turning the ZIL off altogether. The latter of course will have better performance, but you have to live with the possibility of losing some data. Given cases (1) and (2), it all depends on how much performance headroom your pool has vs. the write performance of the SSD. A fast SSD (e.g. DRAM based, and preferably dedicated to the ZIL and not split) would work best. It does not have to be huge, just large enough to store (say) 5 seconds' worth of your planned peak data inflow. You need to be aware of a recently discovered performance regression pertaining to the ZIL (George Wilson has just posted the fix for review on the illumos dev list). This has been in Illumos for a while, so it is possible that it is biting you. Steve

- Original Message -
Hi, I've installed an SSD drive in my OI machine and have it partitioned (sliced) with a main slice to boot from and a smaller slice to use as a write cache (ZIL) for our data pool. I've noticed that for many tasks, using the ZIL actually slows down the task at hand (operation within a qemu-kvm virtual machine, mysql importing a dump file, etc). I know I bought a cheap SSD to play with, so I wasn't expecting the best performance, but I would have expected some improvement, not a slowdown. In one particular test, I have mysql running in a zone, and loading a test data set takes about 40 seconds without the ZIL and about 60 seconds with the ZIL. I certainly wasn't expecting a 50% slowdown. Is this to be expected? Are there any best practices for testing an SSD to see if it will actually improve performance of a zfs pool? Thanks, Matt
Re: [OpenIndiana-discuss] ZIL write cache performance
Hi Matt,

The ZIL is not a performance enhancer. (This is a common misunderstanding; people sometimes view the ZIL as a write cache.) It is a way to simulate sync semantics on files where you really need that, instead of the coarser granularity guarantee that ZFS gives you without it (txg level, where the in-progress transaction group may roll back if you crash).

If I am reading your post correctly, you are comparing 2 scenarios:
1) ZIL is enabled, and goes to the main storage pool.
2) ZIL is enabled, but it goes to a dedicated SSD instead.
Please verify that this is indeed the case.

You should not expect an SSD-based ZIL to perform better than turning the ZIL off altogether. The latter of course will have better performance, but you have to live with the possibility of losing some data. Given cases (1) and (2), it all depends on how much performance headroom your pool has vs. the write performance of the SSD. A fast SSD (e.g. DRAM based, and preferably dedicated to the ZIL and not split) would work best. It does not have to be huge, just large enough to store (say) 5 seconds' worth of your planned peak data inflow.

You need to be aware of a recently discovered performance regression pertaining to the ZIL (George Wilson has just posted the fix for review on the illumos dev list). This has been in Illumos for a while, so it is possible that it is biting you.

Steve

- Original Message -
Hi, I've installed an SSD drive in my OI machine and have it partitioned (sliced) with a main slice to boot from and a smaller slice to use as a write cache (ZIL) for our data pool. I've noticed that for many tasks, using the ZIL actually slows down the task at hand (operation within a qemu-kvm virtual machine, mysql importing a dump file, etc). I know I bought a cheap SSD to play with, so I wasn't expecting the best performance, but I would have expected some improvement, not a slowdown. In one particular test, I have mysql running in a zone, and loading a test data set takes about 40 seconds without the ZIL and about 60 seconds with the ZIL. I certainly wasn't expecting a 50% slowdown. Is this to be expected? Are there any best practices for testing an SSD to see if it will actually improve performance of a zfs pool? Thanks, Matt
Re: [OpenIndiana-discuss] ZFS stalls with oi_151?
Are you running with dedup enabled? If the box is still responsive, try to generate a thread stack listing, e.g.:

echo "::threadlist -v" | mdb -k > /tmp/threads.txt

Steve

On Oct 21, 2011, at 4:16, Tommy Eriksen t...@rackhosting.com wrote:
Hi guys, I've got a bit of a ZFS problem: all of a sudden, and it doesn't seem related to load or anything, the system will stop writing to the disks in my storage pool. No error messages are logged (that I can find, anyway): nothing in dmesg, messages or the like. ZFS stalls; a simple snapshot command (or the like) just hangs indefinitely and can't be stopped with ctrl+c or kill -9. Today, the stall happened after I had been running 2 VMs on each box (running on vSphere 5, connecting via iSCSI) running iozone -s 200G (just to generate a bunch of load). Happily, this morning I saw that they were still running without problems and stopped them. Then, when asking vSphere to delete the VMs, all write I/O stalled. A bit too much irony for me :) However, and this puzzled me, everything else seems to run perfectly, even up to ZFS writing new data on the L2ARC devices while data is read. Boxes (2 of the same) are: Supermicro based, 24-bay chassis, 2x X5645 Xeon, 48 GB of RAM, 3x LSI2008 controllers coupled to 20 Seagate Constellation ES 3TB SATA, 2 Intel 600GB SSD, 2 Intel 311 20GB SSD. 18 of the 3TB drives are set up in mirrored vdevs; the last 2 are spares. Running oi_151a (trying a downgrade to 148 today, I think, since I have 5 or so boxes running without problems on 148, but both my 151a boxes are playing up).
/etc/system variables:
set zfs:zfs_vdev_max_pending = 4
set zfs:l2arc_noprefetch = 0
set zfs:zfs_vdev_cache_size = 0

I can write to a (spare) disk on the same controller without errors, so I take it it's not a general I/O stall on the controller:

root@zfsnas3:/var/adm# dd if=/dev/zero of=/dev/rdsk/c8t5000C50035DE14FAd0s0 bs=1M
^C1640+0 records in
1640+0 records out
1719664640 bytes (1.7 GB) copied, 11.131 s, 154 MB/s

iostat reported - note no writes to any of the other drives. All writes just stall.

                 extended device statistics                              errors
   r/s    w/s    kr/s     kw/s  wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
3631.6  167.2 14505.5 152337.1   0.0  2.2    0.0    0.6   0 157   0   0   0   0 c8
 109.0    0.8   472.9      0.0   0.0  0.0    0.0    0.5   0   3   0   0   0   0 c8t5000C50035B922CCd0
 143.0    0.8   567.1      0.0   0.0  0.1    0.0    0.5   0   3   0   0   0   0 c8t5000C50035CA8A5Cd0
  89.6    0.8   414.1      0.0   0.0  0.1    0.0    0.6   0   2   0   0   0   0 c8t5000C50035CAB258d0
  95.8    0.8   443.3      0.0   0.0  0.0    0.0    0.5   0   2   0   0   0   0 c8t5000C50035DE3DEBd0
 144.8    0.8   626.4      0.0   0.0  0.1    0.0    0.6   0   4   0   0   0   0 c8t5000C50035BE1945d0
 134.0    0.8   505.7      0.0   0.0  0.0    0.0    0.4   0   3   0   0   0   0 c8t5000C50035DDB02Ed0
   1.0    0.4     3.4      0.0   0.0  0.0    0.0    0.0   0   0   0   0   0   0 c8t5000C50035DE0414d0
 107.8    0.8   461.6      0.0   0.0  0.0    0.0    0.3   0   2   0   0   0   0 c8t5000C50035D40D15d0
 117.2    0.8   516.5      0.0   0.0  0.1    0.0    0.5   0   3   0   0   0   0 c8t5000C50035DE0C86d0
  64.2    0.8   261.2      0.0   0.0  0.0    0.0    0.6   0   2   0   0   0   0 c8t5000C50035DD6044d0
   2.0    0.8     6.8      0.0   0.0  0.0    0.0    0.0   0   0   0   0   0   0 c8t5001517959582943d0
   2.0    0.8     6.8      0.0   0.0  0.0    0.0    0.0   0   0   0   0   0   0 c8t5001517959582691d0
 109.8    0.8   423.5      0.0   0.0  0.0    0.0    0.3   0   2   0   0   0   0 c8t5000C50035C13A6Bd0
 765.0    0.8  3070.9      0.0   0.0  0.2    0.0    0.2   0   7   0   0   0   0 c8t5001517959699FE0d0
   1.0  149.2     3.4 152337.1   0.0  1.0    0.0    6.5   0  97   0   0   0   0 c8t5000C50035DE14FAd0
 210.4    0.8   775.4      0.0   0.0  0.1    0.0    0.4   0   3   0   0   0   0 c8t5000C50035CA1E58d0
 689.4    0.8  2776.6      0.0   0.0  0.1    0.0    0.2   0   7   0   0   0   0 c8t50015179596A8717d0
 108.6    0.8   430.5      0.0   0.0  0.0    0.0    0.4   0   2   0   0   0   0 c8t5000C50035CBD12Ad0
 165.6    0.8   561.5      0.0   0.0  0.1    0.0    0.4   0   3   0   0   0   0 c8t5000C50035CA90DDd0
 164.4    0.8   578.5      0.0   0.0  0.1    0.0    0.4   0   4   0   0   0   0 c8t5000C50035DDFC34d0
 125.6    0.8   477.7      0.0   0.0  0.0    0.0    0.4   0   2   0   0   0   0 c8t5000C50035DE2AD3d0
  93.2    0.8   371.3      0.0   0.0  0.0    0.0    0.4   0   2   0   0   0   0 c8t5000C50035B94C40d0
 113.2    0.8   445.3      0.0   0.0  0.1    0.0    0.5   0   3   0   0   0   0 c8t5000C50035BA02AEd0
  75.4    0.8   304.8      0.0   0.0  0.0    0.0    0.4   0   2   0   0   0   0 c8t5000C50035DDA579d0

...Is anyone else seeing similar? Thanks a lot, Tommy
Re: [OpenIndiana-discuss] How to troubleshoot failing hardware causing hoot hangs
Hello,

Looking at the hald source (usr/src/cmd/hal/hald/hald.c): error 95 is coming from a script; it is just informing you that a fatal error occurred. The informative error code is the 2. This tells you that hald forked a child process and timed out waiting for the child process to write to a pipe. The child process hung or failed for some reason, and the parent decided to kill it. The child code could hang for a number of reasons.

One possible way to debug this is to load mdb so that it breaks early in the boot, set breakpoints on some of the processing steps, like hald_dbus_local_server_init, ospec_init, etc., and narrow down where it hangs.

I see from the source that hald has fairly detailed built-in logging that may help debug this. If the environment variables HALD_VERBOSE and HALD_USE_SYSLOG are defined, you should get detailed status messages. There is probably a man page somewhere on how to set these. Said log settings can also be modified via hald command line options (sorry, I have no idea what script or setup file you have to hack to specify these on startup):

static void
usage()
{
        fprintf(stderr, "\nusage : hald [--daemon=yes|no] [--verbose=yes|no] [--help]\n");
        fprintf(stderr,
                "\n"
                "        --daemon=yes|no      Become a daemon\n"
                "        --verbose=yes|no     Print out debug (overrides HALD_VERBOSE)\n"
                "        --use-syslog         Print out debug messages to syslog instead of stderr.\n"
                "                             Use this option to get debug messages if HAL runs as\n"
                "                             daemon.\n"
                "        --help               Show this information and exit\n"
                "        --version            Output version information and exit\n"
                "\n"
                "The HAL daemon detects devices present in the system and provides the\n"
                "org.freedesktop.Hal service through the system-wide message bus provided\n"
                "by D-BUS.\n");
}

Steve

- Original Message -
Hi, I'm about to RMA my motherboard, but before that I want to troubleshoot the issue further so that I can give more specific information on what's failing on the motherboard. What happens is that some hardware is failing on the motherboard, which causes OI to hang during boot. So my question is: how can I find out what hardware is failing? The problem is that when I reset the system it boots up just fine after the reset, and e.g. svcs -xv gives no information on failures on the last boot. These issues also don't happen every time I start up the system; it happens rather sporadically. Here's what I found out; when it freezes, the last lines of the console look like this:
Re: [OpenIndiana-discuss] How to troubleshoot failing hardware causing hoot hangs
Perhaps the focus should be on amping up hald logging, so that if and when the problem happens you have some info to look at. The hald man page has examples of how to do this via svccfg.

Steve

- Original Message -
Hi Steve, thanks a lot for your help! The problem is that the issues that occur are different at different bootups. Since the beginning of this year this computer/server has been started up and shut down a bit over 200 times, and this error occurred 5 times, including today. ---snip---
Re: [OpenIndiana-discuss] init 6/reboot reboots the OS but not the hardware
You need remote reset or remote power cycle capability. An ILOM console, if your hardware supports it, would provide this. ILOM is common on most server-class hardware (Sun servers certainly have it; so do Supermicro boards). Failing that, there are inexpensive remote power cycle outlets you can buy that allow remote power control of individual power plugs.

Steve

- Original Message -
Hi, I just upgraded an aged OpenSolaris 2009.6 to OpenIndiana 148 and I have a silly but annoying issue. When doing init 6 or even reboot, the OS shuts down and reboots immediately, but the machine does not actually reboot. That is, the system is shut down and the kernel reboots immediately, just like a zone reboot would: the machine doesn't power cycle, doesn't show the BIOS, doesn't show the POST, and, the really annoying thing in this case, doesn't show the GRUB menu. And this is critical to me. The machine is in a remote datacenter, but I have KVM access. What can I do? My GRUB entry is (just in case it is something related to the way I launch OI):

root@ns224064:~# cat /rpool/boot/grub/menu.lst
default 2
timeout 10
title opensolaris-2
bootfs rpool/ROOT/opensolaris-2
kernel$ /platform/i86pc/kernel/$ISADIR/unix -v -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =
title SOLARIS10
rootnoverify (hd2,0)
chainloader +1
title OpenIndiana-148
bootfs rpool/ROOT/OpenIndiana-148
kernel$ /platform/i86pc/kernel/$ISADIR/unix -v -B $ZFS-BOOTFS
module$ /platform/i86pc/$ISADIR/boot_archive
# End of LIBBE entry =

- Jesus Cea Avion - j...@jcea.es - http://www.jcea.es/
Re: [OpenIndiana-discuss] Kernel Panic installing openindiana on a HP BL460c G1
Hello,

This should be analyzed and root-caused. A ::stack would be useful to see the call parameters. Without disassembling zio_buf_alloc() I can only guess that the mutex_enter you see crashing is really in kmem_cache_alloc(). If that proves to be the case, I would verify the offset of cc_lock in ccp, to see if ccp was NULL or corrupt. The next step would be a trip to the kmem_cpu_cache related code, to see if KMEM_CPU_CACHE can return zero or a corrupt value in some cases. Again, my guess would be that this is a NULL pointer dereference. Does this system suffer from a severe out-of-memory condition?

Best Wishes,
Steve

void *
kmem_cache_alloc(kmem_cache_t *cp, int kmflag)
{
        kmem_cpu_cache_t *ccp = KMEM_CPU_CACHE(cp);
        kmem_magazine_t *fmp;
        void *buf;

        mutex_enter(&ccp->cc_lock);

/sG/

- Original Message -
Now with the pictures, hope this works:
http://imageshack.us/photo/my-images/402/oi151paniccdollarstatus.jpg/
http://img3.imageshack.us/i/oi151panicmsgbuf1.jpg/
http://img5.imageshack.us/i/oi151panicmsgbuf2.jpg/
http://img89.imageshack.us/i/oi151panicmsgbuf3.jpg/
http://img23.imageshack.us/i/oi151panicmsgbuf4.jpg/
http://img16.imageshack.us/i/oi151panicmsgbuf5.jpg/
http://img694.imageshack.us/i/oi151panicmsgbuf6.jpg/
Re: [OpenIndiana-discuss] oracle removes 32bit x86 cpu support for solaris 11 will OI do same?
For Intel CPUs, 32-bit code is certainly more compact, and in some cases arguably faster, than 64-bit code (say, comparing the same code on the same machine compiled 32- and 64-bit). But newer CPU silicon tends to make performance improvements in many ways (e.g. locating more supporting circuitry on the CPU's silicon, increasing L1/L2 cache sizes, etc.). Newer CPUs also tend to be more energy efficient; Intel made great strides towards energy efficiency, e.g. idling the CPU when not in use (deep C-states), gating off any circuitry that is not in use, and modulating the CPU clock rate (SpeedStep). So performance and energy efficiency depend more on which generation of CPU core design we have than on just the bitness.

The primary advantage of 64-bit per se (i.e. running a given CPU in 64-bit mode) is the increased addressable memory space. The current hardware limit set by the manufacturers is at 48 address bits (256 terabytes theoretical limit). Actual OS support cuts this in half, or less, and motherboard limitations further curtail it, but 48G motherboards are now commonplace. On 32-bit Intel (AMD) you are typically limited to 4G, which is split between kernel and userland depending on the OS and configuration (e.g. 1G kernel and 3G userland).

Steve

- Michael Stapleton michael.staple...@techsologic.com wrote:
While we are talking about 32 | 64 bit processes; which one is better? Faster? More efficient? Mike
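[Editorial note] The address-space figures quoted above work out as simple powers of two (plain arithmetic, not tied to any particular CPU model):

```python
# 48 physical/virtual address bits give 2**48 addressable bytes:
ADDR_BITS = 48
theoretical_limit_tb = 2**ADDR_BITS // 2**40
print(theoretical_limit_tb)  # 256 terabytes, the hardware limit mentioned

# A 32-bit address space is 2**32 bytes = 4 GiB total, which the OS
# splits between kernel and userland (e.g. 1 GiB kernel / 3 GiB userland):
total_32bit_gib = 2**32 // 2**30
print(total_32bit_gib)  # 4
```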
Re: [OpenIndiana-discuss] PANIC vmem_hash_delete(): bad free all versions svn_111
Hi Gabriel,

The immediate cause of this panic is an attempt to free an address == NULL. The interesting part (how this comes about) is hard to figure out without more info: at the very minimum a stack (and some amount of luck), ideally a crash dump, would be necessary.

This brings into focus another issue: the community would benefit from a server where people could upload crash dumps in cases like this. I am sure there are several people reading this list who may be able and inclined to take a quick look and provide a first-cut diagnosis on a volunteer basis.

Steve
/sG/

- Gabriel de la Cruz gabriel.delac...@gmail.com wrote:
Hi, could someone point out what is going on here? I have an IBM x3550 M3 panicking with any kernel version higher than svn_111... I tried upgrading to svn_134, installing svn_134 from live CD, and with the oi_148b live CD. The live CDs panic as well. Any ideas? Thanks! :D
Re: [OpenIndiana-discuss] PANIC vmem_hash_delete(): bad free all versions svn_111
man dumpadm. You have to enable crash dumps and select a location where you have some room to save them. When you run dumpadm with no arguments, it will tell you what your current settings are.

If you do not have crash dumps enabled, you may be able to save the last crash dump by running savecore /some/location/where/you/have/room as soon as possible after you come up. If you are unable to boot up (keep crashing), you could edit the grub menu entry for the current kernel and substitute -k -d for console=graphics, to crash into the debugger and look around.

/sG/
Re: [OpenIndiana-discuss] Monitoring OI with Zenoss
Check out chime.

-:::-sG-:::-

On Jan 27, 2011, at 23:28, WK openindi...@familyk.org wrote:
I am experimenting with using Zenoss to monitor OI 148, and I was wondering if anyone had any advice on configuring SNMP for this purpose? Zenoss is showing the uptime, but not much more. On my Linux machines, it shows network routers, file systems, load average, cpu utilization, memory utilization and I/O. I'm just trying to get a similar display for OI.