Re: [zfs-discuss] Re: system wont boot after zfs
> All with the same problem. I disabled the onboard nvidia nforce 410/430
> raid bios in the bios in all cases. Now whether it actually does not look
> for a signature, I do not know. I'm attempting to make this box into an
> iSCSI target for my ESX environments. I can put W3K and SanMelody on there,
> but it is not as interesting and I am attempting to help the Solaris
> community.
>
> I am simply making the business case that over three major vendors' boards,
> including the absolute latest (Gigabyte), the effect was the same.

Have you tried disabling the disks in the BIOS? Select the disk in the BIOS
and set the type to "NONE". That will prevent the disks from being accessed
by the BIOS. They will still be usable in Solaris.

> Given that PC vendors are not readily adopting EFI BIOS at this time, the
> millions of PCs out there are vulnerable to this. And if x86 Solaris is to
> be really viable, this community needs to be addressed. Now I was at Sun
> for a quarter of my entire life and I know the politics, but the PC area
> is different. If you tell the customer to go to the mobo vendor to fix
> the BIOS, they will have to find some guy in a bunker in Taiwan. Not
> likely.
>
> Now I'm at VMware actively working on consolidating companies onto x86
> platforms. The simple fact is that the holy war between AMD and Intel has
> created processors that are cheap enough and fast enough to cause
> disruption in the enterprise space. My new dual-core AMD processor is
> incredibly fast and the entire box cost me $500 to assemble.

There's a lot of hidden EFI out there, reportedly.

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Re: system wont boot after zfs
Dave,

Which BIOS manufacturers and revisions? That seems to be more of the
problem, as choices are typically limited across vendors .. and I take it
you're running 6/06 u2.

Jonathan

On Nov 30, 2006, at 12:46, David Elefante wrote:

Just as background: I attempted this process on the following:

1. Jetway amd socket 734 (vintage 2005)
2. Asus amd socket 939 (vintage 2005)
3. Gigabyte amd socket am2 (vintage 2006)

All with the same problem. I disabled the onboard nvidia nforce 410/430
raid bios in the bios in all cases. Now whether it actually does not look
for a signature, I do not know. I'm attempting to make this box into an
iSCSI target for my ESX environments. I can put W3K and SanMelody on there,
but it is not as interesting and I am attempting to help the Solaris
community.

I am simply making the business case that over three major vendors' boards,
including the absolute latest (Gigabyte), the effect was the same. As a
workaround I can make slice 0 one cylinder, make slice 1 the rest of the
disk (cylinders 1-x), put the zpool on that, and be fine. So for PC users
there should be a warning from zpool create that if they use the entire
disk, the resulting EFI label is likely to make the system unbootable.

I attempted to hotplug the sata drives after booting, and Nevada build 51
came up with scratch space errors and did not recognize the drive. In any
case I'm not hotplugging my drives every time.

Given that PC vendors are not readily adopting EFI BIOS at this time, the
millions of PCs out there are vulnerable to this. And if x86 Solaris is to
be really viable, this community needs to be addressed. Now I was at Sun
for a quarter of my entire life and I know the politics, but the PC area is
different. If you tell the customer to go to the mobo vendor to fix the
BIOS, they will have to find some guy in a bunker in Taiwan. Not likely.

Now I'm at VMware actively working on consolidating companies onto x86
platforms.
The simple fact is that the holy war between AMD and Intel has created
processors that are cheap enough and fast enough to cause disruption in the
enterprise space. My new dual-core AMD processor is incredibly fast and the
entire box cost me $500 to assemble.

The latest Solaris 10 documentation (thx Richard) has "use the entire disk"
all over it. I don't see any warning in there about EFI labels; in fact
these statements discourage putting ZFS in a slice:

  ZFS applies an EFI label when you create a storage pool with whole disks.
  Disks can be labeled with a traditional Solaris VTOC label when you
  create a storage pool with a disk slice. Slices should only be used under
  the following conditions:

  * The device name is nonstandard.
  * A single disk is shared between ZFS and another file system, such as
    UFS.
  * A disk is used as a swap or a dump device.

  Disks can be specified by using either the full path, such as
  /dev/dsk/c1t0d0, or a shorthand name that consists of the device name
  within the /dev/dsk directory, such as c1t0d0. For example, the following
  are valid disk names:

  * c1t0d0
  * /dev/dsk/c1t0d0
  * c0t0d6s2
  * /dev/foo/disk

  ZFS works best when given whole physical disks. Although constructing
  logical devices using a volume manager, such as Solaris Volume Manager
  (SVM), Veritas Volume Manager (VxVM), or a hardware volume manager (LUNs
  or hardware RAID) is possible, these configurations are not recommended.
  While ZFS functions properly on such devices, less-than-optimal
  performance might be the result.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Wednesday, November 29, 2006 1:24 PM
To: Jonathan Edwards
Cc: David Elefante; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Re: system wont boot after zfs

> I suspect a lack of an MBR could cause some BIOS implementations to
> barf ..

Why? Zeroed disks don't have that issue either.
What appears to be happening is more that RAID controllers attempt to
interpret the data in the EFI label as their proprietary "hardware raid"
labels. At least, it seems to be a problem with internal RAIDs only.

In my experience, removing the disks from the boot sequence was not enough;
you need to disable the disks in the BIOS. The SCSI disks with EFI labels
in the same system caused no issues at all, but the disks connected to the
on-board RAID did have issues.

So what you need to do is:

- remove the controllers from the probe sequence
- disable the disks

Casper
Re: [zfs-discuss] Re: system wont boot after zfs
On Nov 29, 2006, at 13:24, [EMAIL PROTECTED] wrote:

>> I suspect a lack of an MBR could cause some BIOS implementations to
>> barf ..
>
> Why? Zeroed disks don't have that issue either.

you're right - I was thinking that a lack of an MBR with a GPT could be
causing problems, but actually it looks like we do write a protective MBR
in efi_write() - so it's either going to be the GPT header at LBA1, or
backwards compatibility with the version 1.00 spec, that the BIOS vendors
aren't dealing with correctly. Proprietary BIOS RAID signatures do sound
quite plausible as a common cause of problems.

Digging a little deeper, I'm thinking some of our EFI code might be a
little old .. in efi_partition.h we've got the following defined for
dk_part and dk_gpt:

  /* Solaris library abstraction for EFI partitions */
  typedef struct dk_part {
          diskaddr_t      p_start;        /* starting LBA */
          diskaddr_t      p_size;         /* size in blocks */
          struct uuid     p_guid;         /* partition type GUID */
          ushort_t        p_tag;          /* converted to part'n type GUID */
          ushort_t        p_flag;         /* attributes */
          char            p_name[EFI_PART_NAME_LEN]; /* partition name */
          struct uuid     p_uguid;        /* unique partition GUID */
          uint_t          p_resv[8];      /* future use - set to zero */
  } dk_part_t;

  /* Solaris library abstraction for an EFI GPT */
  #define EFI_VERSION102          0x00010002
  #define EFI_VERSION100          0x00010000
  #define EFI_VERSION_CURRENT     EFI_VERSION100
  typedef struct dk_gpt {
          uint_t          efi_version;    /* set to EFI_VERSION_CURRENT */
          uint_t          efi_nparts;     /* number of partitions below */
          uint_t          efi_part_size;  /* size of each partition entry */
                                          /* efi_part_size is unused */
          uint_t          efi_lbasize;    /* size of block in bytes */
          diskaddr_t      efi_last_lba;   /* last block on the disk */
          diskaddr_t      efi_first_u_lba; /* first block after labels */
          diskaddr_t      efi_last_u_lba; /* last block before backup labels */
          struct uuid     efi_disk_uguid; /* unique disk GUID */
          uint_t          efi_flags;
          uint_t          efi_reserved[15]; /* future use - set to zero */
          struct dk_part  efi_parts[1];   /* array of partitions */
  } dk_gpt_t;

which looks like we're using the EFI version 1.00 spec. Looking at
cmd/zpool/zpool_vdev.c, we call efi_write(), which does the label and
writes the PMBR at LBA0 (the first 512B block) and the EFI header at LBA1,
and should reserve the next 16KB for the partition entry tables.

[now we really should be using EFI version 1.10 with the -001 addendum
(which is what 1.02 morphed into about 5 years back) or version 2.0 in the
UEFI space .. but that's a separate discussion, as the address boundaries
haven't really changed for device labels.]

in uts/common/fs/zfs/vdev_label.c we define the zfs boot block:

  /*
   * Initialize boot block header.
   */
  vb = zio_buf_alloc(sizeof (vdev_boot_header_t));
  bzero(vb, sizeof (vdev_boot_header_t));
  vb->vb_magic = VDEV_BOOT_MAGIC;
  vb->vb_version = VDEV_BOOT_VERSION;
  vb->vb_offset = VDEV_BOOT_OFFSET;
  vb->vb_size = VDEV_BOOT_SIZE;

which gets written down at the 8KB boundary after usable space starts at
LBA34:

  vtoc->efi_parts[0].p_start = vtoc->efi_first_u_lba;

[note: 17KB isn't typically well aligned for most logical volumes .. it
would probably be better to start writing data at LBA1024 so we stay well
aligned for logical volumes with stripe widths up to 512KB and avoid the
R/M/W misalignment that can occur there .. currently, with a 256KB vdev
label, I believe we start the data portion out at LBA546, which seems like
a problem]

and then we apparently store a backup vtoc right before the backup
partition table entries and backup GPT:

  vtoc->efi_parts[0].p_size = vtoc->efi_last_u_lba + 1 -
      vtoc->efi_first_u_lba - resv;

this next bit is interesting, since we should probably define a GUID for
ZFS partitions that points to the ZFS vdev label instead of using V_USR:

  /*
   * Why we use V_USR: V_BACKUP confuses users, and is considered
   * disposable by some EFI utilities (since EFI doesn't have a backup
   * slice). V_UNASSIGNED is supposed to be used only for zero size
   * partitions, and efi_write() will fail if we use it. V_ROOT, V_BOOT,
   * etc. were all pretty specific
Re: [zfs-discuss] Re: system wont boot after zfs
> I suspect a lack of an MBR could cause some BIOS implementations to
> barf ..

Why? Zeroed disks don't have that issue either.

What appears to be happening is more that RAID controllers attempt to
interpret the data in the EFI label as their proprietary "hardware raid"
labels. At least, it seems to be a problem with internal RAIDs only.

In my experience, removing the disks from the boot sequence was not enough;
you need to disable the disks in the BIOS. The SCSI disks with EFI labels
in the same system caused no issues at all, but the disks connected to the
on-board RAID did have issues.

So what you need to do is:

- remove the controllers from the probe sequence
- disable the disks

Casper
Re: [zfs-discuss] Re: system wont boot after zfs
On Nov 29, 2006, at 10:41, [EMAIL PROTECTED] wrote:

>> This is a problem since how can anyone use ZFS on a PC??? My motherboard
>> is a newly minted AM2 w/ all the latest firmware. I disabled boot
>> detection on the sata channels and it still refuses to boot. I had to
>> purchase an external SATA enclosure to fix the drives. This seems to me
>> to be a serious problem. I put build 47 and 50 on there with the same
>> issue.
>
> A serious problem *IN YOUR BIOS*.
>
> You will need to format the disks, putting ordinary PC (fdisk) labels on
> them, creating Solaris partitions on those, and giving those to ZFS.

take a look at the EFI/GPT discussion here (apple):

http://www.roughlydrafted.com/RD/Home/7CC25766-EF64-4D85-AD37-BCC39FBD2A4F.html

I suspect a lack of an MBR could cause some BIOS implementations to barf ..
does our fdisk put an MBR on the disk? if so, does the EFI vdev labeling
invalidate the MBR?

Jonathan
Re: [zfs-discuss] Re: system wont boot after zfs
> This is a problem since how can anyone use ZFS on a PC??? My motherboard
> is a newly minted AM2 w/ all the latest firmware. I disabled boot
> detection on the sata channels and it still refuses to boot. I had to
> purchase an external SATA enclosure to fix the drives. This seems to me
> to be a serious problem. I put build 47 and 50 on there with the same
> issue.

A serious problem *IN YOUR BIOS*.

You will need to format the disks, putting ordinary PC (fdisk) labels on
them, creating Solaris partitions on those, and giving those to ZFS.

Casper
Re: [zfs-discuss] Re: system wont boot after zfs
On 11/30/06, David Elefante <[EMAIL PROTECTED]> wrote:

> I had the same thing happen to me twice on my x86 box. I installed ZFS
> (RaidZ) on my enclosure with four drives and upon reboot the bios hangs
> upon detection of the newly EFI'd drives. I've already RMA'd 4 drives to
> Seagate and the new batch was frozen as well. I was suspecting my
> enclosure, but I was suspicious when it only went bye bye after
> installing ZFS.
>
> This is a problem since how can anyone use ZFS on a PC??? My motherboard
> is a newly minted AM2 w/ all the latest firmware. I disabled boot
> detection on the sata channels and it still refuses to boot. I had to
> purchase an external SATA enclosure to fix the drives. This seems to me
> to be a serious problem. I put build 47 and 50 on there with the same
> issue.

Yes, this is a serious problem. It's a problem with your motherboard BIOS,
which is clearly not up to date. The Sun Ultra-20 BIOS was updated with a
fix for this issue back in May.

Until you have updated your BIOS, you will need to destroy the EFI labels,
write SMI labels to the disks, and create slices on those disks of the size
that you want to devote to ZFS. Then you can specify the slice name when
you run your zpool create operation.

This has been covered on the ZFS discussion lists several times, and a
quick google search should have found the answer for you.

James C. McPherson
--
Solaris kernel software engineer, system admin and troubleshooter
http://www.jmcp.homeunix.com/blog
Find me on LinkedIn @ http://www.linkedin.com/in/jamescmcpherson
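[For readers following along, the relabel-and-use-a-slice workaround
described above looks roughly like this on Solaris. This is a hedged
sketch, not an exact transcript: c1t0d0 is a placeholder device name,
format(1M)'s menus are interactive, and the exact steps vary by release.]

```shell
# WARNING: this destroys any data on the disk. c1t0d0 is a placeholder.

# 1. Overwrite the EFI label with a default Solaris fdisk partition
#    spanning the whole disk, which puts an MBR back at sector 0:
fdisk -B /dev/rdsk/c1t0d0p0

# 2. Write an SMI (VTOC) label and size the slices as desired using the
#    interactive format(1M) utility:
format -e c1t0d0
#    -> label (choose the SMI label), then -> partition to size slice 0

# 3. Create the pool on the slice rather than the whole disk, so ZFS
#    keeps the VTOC label instead of relabeling the disk with EFI:
zpool create tank c1t0d0s0
```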
Re: [zfs-discuss] Re: system wont boot after zfs
On 29-Nov-06, at 9:30 AM, David Elefante wrote:

> I had the same thing happen to me twice on my x86 box. I installed ZFS
> (RaidZ) on my enclosure with four drives and upon reboot the bios hangs
> upon detection of the newly EFI'd drives. ... This seems to me to be a
> serious problem.

Indeed. Yay for PC BIOS.

--Toby
[zfs-discuss] Re: system wont boot after zfs
I had the same thing happen to me twice on my x86 box. I installed ZFS
(RaidZ) on my enclosure with four drives, and upon reboot the bios hangs
upon detection of the newly EFI'd drives. I've already RMA'd 4 drives to
Seagate and the new batch was frozen as well. I was suspecting my
enclosure, but I was suspicious when it only went bye bye after installing
ZFS.

This is a problem since how can anyone use ZFS on a PC??? My motherboard is
a newly minted AM2 w/ all the latest firmware. I disabled boot detection on
the sata channels and it still refuses to boot. I had to purchase an
external SATA enclosure to fix the drives. This seems to me to be a serious
problem. I put build 47 and 50 on there with the same issue.

This message posted from opensolaris.org