Re: [zfs-discuss] Re: system wont boot after zfs

2006-11-30 Thread Casper . Dik

>All with the same problem.  I disabled the onboard nvidia nforce 410/430
>RAID BIOS in the system BIOS in all cases.  Now whether it actually does not
>look for a signature, I do not know.  I'm attempting to make this box into an
>iSCSI target for my ESX environments.  I can put W3K and SanMelody on there,
>but it is not as interesting, and I am attempting to help the Solaris
>community.

>I am simply making the business case that across three major vendors' boards
>and the absolute latest (Gigabyte), the effect was the same.

Have you tried disabling the disks in the BIOS?

Select the disk in the BIOS and set the type to "NONE".

That will prevent the disks from being accessed by the BIOS.

They will still be usable in Solaris.
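
For example, a quick sanity check after the BIOS change (a rough sketch --
the device and pool names are placeholders, not taken from this thread):

  # list every disk Solaris can still enumerate
  format < /dev/null

  # any pool on those disks should still be importable / show ONLINE
  zpool import
  zpool status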

>Given the fact that PC vendors are not readily adopting EFI BIOS at this
>time, the millions of PCs out there are vulnerable to this.  And if x86
>Solaris is to be really viable, this community needs to be addressed.  Now I
>was at Sun for a quarter of my entire life and I know the politics, but the
>PC area is different.  If you tell the customer to go to the mobo vendor to
>fix the BIOS, they will have to find some guy in a bunker in Taiwan.  Not
>likely.  Now I'm at VMware, actively working on consolidating companies onto
>x86 platforms.  The simple fact is that the holy war between AMD and Intel
>has created processors that are cheap enough and fast enough to cause
>disruption in the enterprise space.  My new dual-core AMD processor is
>incredibly fast and the entire box cost me $500 to assemble.

There's a lot of hidden EFI out there, reportedly.

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: system wont boot after zfs

2006-11-30 Thread Jonathan Edwards

Dave

which BIOS manufacturers and revisions?  that seems to be more of the problem,
as choices are typically limited across vendors .. and I take it you're
running 6/06 u2?


Jonathan

On Nov 30, 2006, at 12:46, David Elefante wrote:


Just as background:

I attempted this process on the following:

1. Jetway AMD socket 734 (vintage 2005)
2. Asus AMD socket 939 (vintage 2005)
3. Gigabyte AMD socket AM2 (vintage 2006)

All with the same problem.  I disabled the onboard nvidia nforce 410/430
RAID BIOS in the system BIOS in all cases.  Now whether it actually does not
look for a signature, I do not know.  I'm attempting to make this box into an
iSCSI target for my ESX environments.  I can put W3K and SanMelody on there,
but it is not as interesting, and I am attempting to help the Solaris
community.

I am simply making the business case that across three major vendors' boards
and the absolute latest (Gigabyte), the effect was the same.

As a workaround I can make slice 0 one cylinder and slice 1 the rest (1-x),
put the zpool on the rest of the disk, and be fine with that.  So zpool create
should carry a warning for PC users that if they use the entire disk, the
resultant EFI label is likely to cause a lack of bootability.

I attempted to hotplug the SATA drives after booting, and Nevada build 51
came up with scratch space errors and did not recognize the drive.  In any
case I'm not hotplugging my drives every time.

Given the fact that PC vendors are not readily adopting EFI BIOS at this
time, the millions of PCs out there are vulnerable to this.  And if x86
Solaris is to be really viable, this community needs to be addressed.  Now I
was at Sun for a quarter of my entire life and I know the politics, but the
PC area is different.  If you tell the customer to go to the mobo vendor to
fix the BIOS, they will have to find some guy in a bunker in Taiwan.  Not
likely.

Now I'm at VMware, actively working on consolidating companies onto x86
platforms.  The simple fact is that the holy war between AMD and Intel has
created processors that are cheap enough and fast enough to cause disruption
in the enterprise space.  My new dual-core AMD processor is incredibly fast
and the entire box cost me $500 to assemble.

The latest Solaris 10 documentation (thx Richard) has "use the entire disk"
all over it.  I don't see any warning in there about EFI labels; in fact,
these statements discourage putting ZFS in a slice:


ZFS applies an EFI label when you create a storage pool with whole disks.
Disks can be labeled with a traditional Solaris VTOC label when you create a
storage pool with a disk slice.

Slices should only be used under the following conditions:

  * The device name is nonstandard.
  * A single disk is shared between ZFS and another file system, such as UFS.
  * A disk is used as a swap or a dump device.

Disks can be specified by using either the full path, such as
/dev/dsk/c1t0d0, or a shorthand name that consists of the device name within
the /dev/dsk directory, such as c1t0d0.  For example, the following are valid
disk names:

  * c1t0d0
  * /dev/dsk/c1t0d0
  * c0t0d6s2
  * /dev/foo/disk

ZFS works best when given whole physical disks.  Although constructing
logical devices using a volume manager, such as Solaris Volume Manager (SVM),
Veritas Volume Manager (VxVM), or a hardware volume manager (LUNs or hardware
RAID) is possible, these configurations are not recommended.  While ZFS
functions properly on such devices, less-than-optimal performance might be
the result.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, November 29, 2006 1:24 PM
To: Jonathan Edwards
Cc: David Elefante; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Re: system wont boot after zfs



>I suspect a lack of an MBR could cause some BIOS implementations to
>barf ..


Why?

Zeroed disks don't have that issue either.

What appears to be happening is more that raid controllers attempt
to interpret the data in the EFI label as the proprietary
"hardware raid" labels.  At least, it seems to be a problem
with internal RAIDs only.

In my experience, removing the disks from the boot sequence was
not enough; you need to disable the disks in the BIOS.

The SCSI disks with EFI labels in the same system caused no
issues at all; but the disks connected to the on-board RAID
did have issues.

So what you need to do is:

- remove the controllers from the probe sequence
- disable the disks

Casper


Re: [zfs-discuss] Re: system wont boot after zfs

2006-11-30 Thread Jonathan Edwards


On Nov 29, 2006, at 13:24, [EMAIL PROTECTED] wrote:

>> I suspect a lack of an MBR could cause some BIOS implementations to
>> barf ..
>
> Why?
>
> Zeroed disks don't have that issue either.

you're right - I was thinking that a lack of an MBR with a GPT could be
causing problems, but actually it looks like we do write a protective MBR in
efi_write() - so it's either going to be the GPT header at LBA 1 or backwards
compatibility with the version 1.00 spec that the BIOS vendors aren't dealing
with correctly.  Proprietary BIOS RAID signatures do sound quite plausible as
a common cause of problems.

Digging a little deeper, I'm thinking some of our EFI code might be a little
old ..


in efi_partition.h we've got the following defined for dk_gpt and dk_part:

161 /* Solaris library abstraction for EFI partitions */
162 typedef struct dk_part  {
163         diskaddr_t      p_start;        /* starting LBA */
164         diskaddr_t      p_size;         /* size in blocks */
165         struct uuid     p_guid;         /* partition type GUID */
166         ushort_t        p_tag;          /* converted to part'n type GUID */
167         ushort_t        p_flag;         /* attributes */
168         char            p_name[EFI_PART_NAME_LEN]; /* partition name */
169         struct uuid     p_uguid;        /* unique partition GUID */
170         uint_t          p_resv[8];      /* future use - set to zero */
171 } dk_part_t;
172
173 /* Solaris library abstraction for an EFI GPT */
174 #define EFI_VERSION102  0x00010002
175 #define EFI_VERSION100  0x00010000
176 #define EFI_VERSION_CURRENT EFI_VERSION100
177 typedef struct dk_gpt {
178         uint_t          efi_version;    /* set to EFI_VERSION_CURRENT */
179         uint_t          efi_nparts;     /* number of partitions below */
180         uint_t          efi_part_size;  /* size of each partition entry */
181                                         /* efi_part_size is unused */
182         uint_t          efi_lbasize;    /* size of block in bytes */
183         diskaddr_t      efi_last_lba;   /* last block on the disk */
184         diskaddr_t      efi_first_u_lba; /* first block after labels */
185         diskaddr_t      efi_last_u_lba; /* last block before backup labels */
186         struct uuid     efi_disk_uguid; /* unique disk GUID */
187         uint_t          efi_flags;
188         uint_t          efi_reserved[15]; /* future use - set to zero */
189         struct dk_part  efi_parts[1];   /* array of partitions */
190 } dk_gpt_t;

which looks like we're using the EFI version 1.00 spec, and looking at
cmd/zpool/zpool_vdev.c we call efi_write(), which does the label and writes
the PMBR at LBA 0 (the first 512B block), the EFI header at LBA 1, and should
reserve the next 16KB for other partition tables .. [now we really should be
using EFI version 1.10 with the -001 addendum (which is what 1.02 morphed
into about 5 years back) or version 2.0 in the UEFI space .. but that's a
separate discussion, as the address boundaries haven't really changed for
device labels.]
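
As a rough way to see that layout on a labeled disk (a sketch only; c1t0d0 is
a placeholder device name, and the raw p0 node is used so the whole disk,
including the PMBR, is visible):

  # LBA 0 should hold the protective MBR -- one type-0xEE partition with the
  # 0x55 0xaa signature in the last two bytes of the sector
  dd if=/dev/rdsk/c1t0d0p0 bs=512 count=1 2>/dev/null | od -c | tail -4

  # LBA 1 is the GPT header, which begins with the "EFI PART" signature
  dd if=/dev/rdsk/c1t0d0p0 bs=512 skip=1 count=1 2>/dev/null | od -c | head -2

  # prtvtoc shows the resulting partition map (sector units for EFI labels)
  prtvtoc /dev/rdsk/c1t0d0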


in uts/common/fs/zfs/vdev_label.c we define the zfs boot block
500
501 /*
502  * Initialize boot block header.
503  */
504 vb = zio_buf_alloc(sizeof (vdev_boot_header_t));
505 bzero(vb, sizeof (vdev_boot_header_t));
506 vb->vb_magic = VDEV_BOOT_MAGIC;
507 vb->vb_version = VDEV_BOOT_VERSION;
508 vb->vb_offset = VDEV_BOOT_OFFSET;
509 vb->vb_size = VDEV_BOOT_SIZE;

which gets written down at the 8KB boundary after we start usable space from
LBA 34:

857 vtoc->efi_parts[0].p_start = vtoc->efi_first_u_lba;

[note: 17KB isn't typically well aligned for most logical volumes .. it would
probably be better to start writing data at LBA 1024 so we stay well aligned
for logical volumes with stripe widths up to 512KB and avoid the R/M/W
misalignment that can occur there .. currently, with a 256KB vdev label, I
believe we start the data portion out on LBA 546, which seems like a problem]


and then we apparently store a backup vtoc right before the backup partition
table entries and backup GPT:

858 vtoc->efi_parts[0].p_size = vtoc->efi_last_u_lba + 1 -
859 vtoc->efi_first_u_lba - resv;

this next bit is interesting, since we should probably define a GUID for ZFS
partitions that points to the ZFS vdev label instead of using V_USR:

860
861         /*
862          * Why we use V_USR: V_BACKUP confuses users, and is considered
863          * disposable by some EFI utilities (since EFI doesn't have a backup
864          * slice).  V_UNASSIGNED is supposed to be used only for zero size
865          * partitions, and efi_write() will fail if we use it.  V_ROOT, V_BOOT,
866          * etc. were all pretty specific

Re: [zfs-discuss] Re: system wont boot after zfs

2006-11-29 Thread Casper . Dik

>I suspect a lack of an MBR could cause some BIOS implementations to  
>barf ..

Why?

Zeroed disks don't have that issue either.

What appears to be happening is more that raid controllers attempt
to interpret the data in the EFI label as the proprietary
"hardware raid" labels.  At least, it seems to be a problem
with internal RAIDs only.

In my experience, removing the disks from the boot sequence was
not enough; you need to disable the disks in the BIOS.

The SCSI disks with EFI labels in the same system caused no
issues at all; but the disks connected to the on-board RAID
did have issues.

So what you need to do is:

- remove the controllers from the probe sequence
- disable the disks

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: system wont boot after zfs

2006-11-29 Thread Jonathan Edwards

On Nov 29, 2006, at 10:41, [EMAIL PROTECTED] wrote:

>> This is a problem since how can anyone use ZFS on a PC???  My motherboard
>> is a newly minted AM2 w/ all the latest firmware.  I disabled boot detection
>> on the sata channels and it still refuses to boot.  I had to purchase an
>> external SATA enclosure to fix the drives.  This seems to me to be a serious
>> problem.  I put build 47 and 50 on there with the same issue.
>
> A serious problem *IN YOUR BIOS*.
>
> You will need to format the disks, add ordinary PC (fdisk) labels, and on
> those create Solaris partitions and give those to ZFS.


take a look at the EFI/GPT discussion here (apple):
http://www.roughlydrafted.com/RD/Home/7CC25766-EF64-4D85-AD37-BCC39FBD2A4F.html


I suspect a lack of an MBR could cause some BIOS implementations to barf ..


does our fdisk put an MBR on the disk?  if so, does the EFI vdev labeling
invalidate the MBR?
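
One quick way to check on a live system (a rough sketch -- the c1t0d0 device
name is just a placeholder):

  # any MBR (protective or fdisk-written) carries the 0x55 0xaa signature in
  # the last two bytes of LBA 0 on the p0 (whole-disk) node
  dd if=/dev/rdsk/c1t0d0p0 bs=512 count=1 2>/dev/null | od -c | tail -2

  # fdisk can also dump whatever partition table it finds on that device
  fdisk -W - /dev/rdsk/c1t0d0p0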


Jonathan
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: system wont boot after zfs

2006-11-29 Thread Casper . Dik
>This is a problem since how can anyone use ZFS on a PC???  My motherboard is
>a newly minted AM2 w/ all the latest firmware.  I disabled boot detection on
>the sata channels and it still refuses to boot.  I had to purchase an external
>SATA enclosure to fix the drives.  This seems to me to be a serious problem.
>I put build 47 and 50 on there with the same issue.

A serious problem *IN YOUR BIOS*.

You will need to format the disks, add ordinary PC (fdisk) labels, and on
those create Solaris partitions and give those to ZFS.
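
Roughly, that looks like this (a sketch only -- the disk and pool names are
placeholders, and writing new labels destroys existing data, so double-check
the target device first):

  # write a default fdisk label with a single Solaris partition
  fdisk -B /dev/rdsk/c1t0d0p0

  # then use format(1M) to lay out an SMI/VTOC label and size a slice
  # (say, slice 0) for ZFS via the partition and label menus
  format c1t0d0

  # finally, hand ZFS the slice rather than the whole disk
  zpool create tank c1t0d0s0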

Casper
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: system wont boot after zfs

2006-11-29 Thread James McPherson

On 11/30/06, David Elefante <[EMAIL PROTECTED]> wrote:

I had the same thing happen to me twice on my x86 box.  I
installed ZFS (RaidZ) on my enclosure with four drives and
upon reboot the bios hangs upon detection of the newly EFI'd
drives.  I've already RMA'd 4 drives to seagate and the new
batch was frozen as well.  I suspected my enclosure, but became
suspicious when it only went bye-bye after installing ZFS.

This is a problem since how can anyone use ZFS on a PC???
My motherboard is a newly minted AM2 w/ all the latest
firmware.  I disabled boot detection on the sata channels and
it still refuses to boot.  I had to purchase an external SATA
enclosure to fix the drives.  This seems to me to be a serious
problem.  I put build 47 and 50 on there with the same issue.




Yes, this is a serious problem. It's a problem with your motherboard
bios, which is clearly not up to date. The Sun Ultra-20 bios was
updated with a fix for this issue back in May.

Until you have updated your bios, you will need to destroy the
EFI labels, write SMI labels to the disks, and create slices on those
disks which are the size that you want to devote to ZFS. Then you
can specify the slice name when you run your zpool create operation.
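
If it helps, the sequence is roughly the following (a sketch only -- the pool
and device names are placeholders, and relabeling destroys any data on the
disk):

  # destroy the pool first if it still imports at all
  zpool destroy tank

  # format -e (expert mode) exposes the SMI/EFI choice under the "label"
  # command; pick the SMI label, then use "partition" to size a slice for ZFS
  # and label the disk again
  format -e c1t0d0

  # then give ZFS the slice rather than the whole disk
  zpool create tank c1t0d0s0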

This has been covered in the ZFS discussion lists several times, and
a quick google search should have found the answer for you.


James C. McPherson
--
Solaris kernel software engineer, system admin and troubleshooter
 http://www.jmcp.homeunix.com/blog
Find me on LinkedIn @ http://www.linkedin.com/in/jamescmcpherson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: system wont boot after zfs

2006-11-29 Thread Toby Thain


On 29-Nov-06, at 9:30 AM, David Elefante wrote:

I had the same thing happen to me twice on my x86 box.  I installed ZFS
(RaidZ) on my enclosure with four drives and upon reboot the bios hangs upon
detection of the newly EFI'd drives.  ...  This seems to me to be a serious
problem.


Indeed. Yay for PC BIOS.

--Toby

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: system wont boot after zfs

2006-11-29 Thread David Elefante
I had the same thing happen to me twice on my x86 box.  I installed ZFS (RaidZ) 
on my enclosure with four drives and upon reboot the bios hangs upon detection 
of the newly EFI'd drives.  I've already RMA'd 4 drives to seagate and the new 
batch was frozen as well.  I suspected my enclosure, but became suspicious
when it only went bye-bye after installing ZFS.

This is a problem since how can anyone use ZFS on a PC???  My motherboard is a 
newly minted AM2 w/ all the latest firmware.  I disabled boot detection on the 
sata channels and it still refuses to boot.  I had to purchase an external SATA 
enclosure to fix the drives.  This seems to me to be a serious problem.  I put 
build 47 and 50 on there with the same issue.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss