Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-08 Thread Robert Milkowski

 On 01/ 7/11 09:02 PM, Pawel Jakub Dawidek wrote:

On Fri, Jan 07, 2011 at 07:33:53PM +, Robert Milkowski wrote:


Now what if block B is a meta-data block?

Metadata is not deduplicated.


Good point, but then it depends on your perspective.
What if you are storing lots of VMDKs?
One corrupted block which is shared among hundreds of VMDKs will affect
all of them.

And it might be a block containing metadata within the VMDK itself...

Anyway, green or not, imho if in a given environment turning 
verification on still delivers acceptable performance then I would 
basically turn it on.


In other environments it is about risk assessment.

Best regards,
 Robert Milkowski
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] A few questions

2011-01-08 Thread Garrett D'Amore

On 01/ 6/11 05:28 AM, Edward Ned Harvey wrote:

From: Khushil Dep [mailto:khushil@gmail.com]

I've deployed large SAN's on both SuperMicro 825/826/846 and Dell
R610/R710's and I've not found any issues so far. I always make a point of
installing Intel chipset NIC's on the DELL's and disabling the Broadcom ones
but other than that it's always been plain sailing - hardware-wise anyway.
 

not found any issues, except the broadcom one which causes the system to crash 
regularly in the default factory configuration.

How did you learn about the broadcom issue for the first time?  I had to learn 
the hard way, and with all the involvement of both Dell and Oracle support 
teams, nobody could tell me what I needed to change.  We literally replaced 
every component of the server twice over a period of 1 year, and I spent 
man-days upgrading and downgrading firmware at random, trying to find a stable 
configuration.  I scoured the internet to find this little tidbit about 
replacing the broadcom NIC, and randomly guessed, and replaced my nic with an 
intel card to make the problem go away.

The same system doesn't have a problem running RHEL/centos.

What will be the new problem in the next line of servers?  Why, during my 
internet scouring, did I find a lot of other reports, of people who needed to 
disable c-states (didn't work for me) and lots of false leads indicating 
firmware downgrade would fix my broadcom issue?

See my point?  Next time I buy a server, I do not have confidence to simply 
expect solaris on dell to work reliably.  The same goes for solaris 
derivatives, and all non-sun hardware.  There simply is not an adequate 
qualification and/or support process.
   


When you purchase NexentaStor from a top-tier Nexenta Hardware Partner, 
you get a product that has been through a rigorous qualification process 
which includes the hardware and software configuration matched together, 
tested with an extensive battery.  You also can get a higher level of 
support than is offered to people who build their own systems.


Oracle is *not* the only company capable of performing in depth testing 
of Solaris.


I also know enough about the problems that Oracle customers (or rather 
Sun customers) faced with Solaris on Sun hardware -- such as the 
terrible nvidia ethernet problems on first generation U20 and U40, 
or the marvell SATA problems on Thumper -- to know that the real 
picture of Oracle isn't nearly as rosy as you believe.  Of course, 
I also lived (as a Sun employee) through the UltraSPARC-II ECC fiasco...


  - Garrett



Re: [zfs-discuss] A few questions

2011-01-08 Thread Fajar A. Nugraha
On Thu, Jan 6, 2011 at 11:36 PM, Garrett D'Amore garr...@nexenta.com wrote:
 On 01/ 6/11 05:28 AM, Edward Ned Harvey wrote:
 See my point?  Next time I buy a server, I do not have confidence to
 simply expect solaris on dell to work reliably.  The same goes for solaris
 derivatives, and all non-sun hardware.  There simply is not an adequate
 qualification and/or support process.


 When you purchase NexentaStor from a top-tier Nexenta Hardware Partner, you

Where is the list? Is this the one on
http://www.nexenta.com/corp/technology-partners-overview/certified-technology-partners
?

 get a product that has been through a rigorous qualification process which
 includes the hardware and software configuration matched together, tested
 with an extensive battery.  You also can get a higher level of support than
 is offered to people who build their own systems.

 Oracle is *not* the only company capable of performing in depth testing of
 Solaris.

Does this roughly mean I can expect similar (or even better) hardware
compatibility and support with NexentaStor on Supermicro as with Solaris on
Oracle/Sun hardware?

-- 
Fajar


Re: [zfs-discuss] A few questions

2011-01-08 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Garrett D'Amore
 
 When you purchase NexentaStor from a top-tier Nexenta Hardware Partner,
 you get a product that has been through a rigorous qualification process

How do I do this, exactly?  I am serious.  Before too long, I'm going to
need another server, and I would very seriously consider reprovisioning my
unstable Dell Solaris server to become a linux or some other stable machine.
The role it's currently fulfilling is the backup server, which basically
does nothing except zfs receive from the primary Sun solaris 10u9 file
server.  Since the role is just for backups, it's a perfect opportunity for
experimentation, hence the Dell hardware with solaris.  I'd be happy to put
some other configuration in there experimentally instead ... say ...
nexenta.  Assuming it will be just as good at zfs receive from the primary
server.

Is there some specific hardware configuration you guys sell?  Or recommend?
How about a Dell R510/R610/R710?  Buy the hardware separately and buy
NexentaStor as just a software product?  Or buy a somehow more certified
hardware and software bundle together?

If I do encounter a bug, where the only known fact is that the system keeps
crashing intermittently on an approximately weekly basis, and there is
absolutely no clue what's wrong in hardware or software...  How do you guys
handle it?

If you'd like to follow up offlist, that's fine.  Then just email me at the
email address:  nexenta at nedharvey.com
(I use disposable email addresses on mailing lists like this, so at any
random unknown time, I'll destroy my present alias and start using a new
one.)



Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-08 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Robert Milkowski
 
 What if you you are storing lots of VMDKs?
 One corrupted block which is shared among hundreds of VMDKs will affect
 all of them.
 And it might be a block containing meta-data information within vmdk...

Although the probability of a hash collision is astronomically small, if one
were to happen, the damage could be extensive and unrecoverable.  Even backups
could not protect you, because the corruption would be replicated undetected
into your backups too.  Just like other astronomical events (meteors,
supernovae, etc) which could destroy all your data, all your backups, and
kill everyone you know, if these risks are not acceptable to you, you need to
take precautions against them.  But you have to weigh the odds of damage
versus the cost of protection.  Admittedly, precautions against nuclear strike
are more costly to implement than precautions against hash collision
(enabling verification is a trivial task).  But that does not mean enabling
verification comes without cost.

Has anybody measured the cost of enabling or disabling verification?

The cost of disabling verification is an infinitesimally small number
multiplied by possibly all your data.  Basically lim->0 times lim->infinity.
This can only be evaluated on a case-by-case basis and there's no use in
making any more generalizations in favor of or against it.
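
To put a rough number on the "infinitesimally small" factor, here is a minimal
sketch using the standard birthday-bound approximation.  It assumes the hash
behaves like a uniformly random 256-bit function over distinct blocks (no
adversarial input), and the 128 KiB block size and 1 PiB pool size are
illustrative assumptions, not figures from this thread.  The function names
are hypothetical.

```python
import math


def blocks_in_pool(pool_bytes: int, block_bytes: int = 128 * 1024) -> int:
    """Number of fixed-size blocks in a pool (block size is an assumption)."""
    return pool_bytes // block_bytes


def collision_probability(num_blocks: int, hash_bits: int = 256) -> float:
    """Birthday-bound approximation: P ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes hash outputs are uniformly distributed over 2^hash_bits values.
    """
    exponent = num_blocks * (num_blocks - 1) / (2 ** (hash_bits + 1))
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    n = blocks_in_pool(1 << 50)  # 1 PiB of unique 128 KiB blocks
    print(f"unique blocks: {n}")
    print(f"P(any collision): {collision_probability(n):.3e}")
```

The other factor in the product, how much data a single collision would
damage, depends on how widely the colliding block is shared, which is why
the estimate still has to be made case by case.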

The benefit of disabling verification would presumably be faster
performance.  Has anybody got any measurements, or even calculations or
vague estimates or clueless guesses, to indicate how significant this is?
How much is there to gain by disabling verification?
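
As a way to reason about where the cost comes from (not a measurement), here
is a toy in-memory model of dedup with and without verification.  It is not
how ZFS implements dedup; the class name `ToyDedupStore`, the `hash_bytes`
truncation knob (used only to force collisions in a demo), and the
`verify_reads` counter are all hypothetical.  The model assumes that
verification costs one extra read of the existing block for every hash match.

```python
import hashlib


class ToyDedupStore:
    """Toy dedup table keyed by a (possibly truncated) SHA-256 digest."""

    def __init__(self, verify: bool, hash_bytes: int = 32):
        self.verify = verify
        self.hash_bytes = hash_bytes  # truncate digest to provoke collisions
        self.table = {}               # key -> list of distinct stored blocks
        self.verify_reads = 0         # extra reads caused by verification

    def _key(self, block: bytes) -> bytes:
        return hashlib.sha256(block).digest()[: self.hash_bytes]

    def write(self, block: bytes) -> bytes:
        """Store a block; return the data a later read would hand back."""
        candidates = self.table.setdefault(self._key(block), [])
        if not candidates:
            candidates.append(block)
            return block
        if not self.verify:
            # Trust the hash: share the existing block even if it differs.
            return candidates[0]
        for existing in candidates:
            self.verify_reads += 1    # one read-and-compare per candidate
            if existing == block:
                return existing
        candidates.append(block)      # same key, different data: keep both
        return block


if __name__ == "__main__":
    blocks = [i.to_bytes(4, "big") for i in range(1000)]
    for verify in (False, True):
        store = ToyDedupStore(verify=verify, hash_bytes=1)
        wrong = sum(store.write(b) != b for b in blocks)
        print(f"verify={verify}: wrong reads={wrong}, "
              f"extra reads={store.verify_reads}")
```

In this model the extra work scales with the number of dedup hits, so a
workload with a high dedup ratio pays for verification on most writes, while
a workload with few duplicates pays almost nothing.  Because the hash is
deterministic, the same pair of colliding blocks collides every time.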



Re: [zfs-discuss] A few questions

2011-01-08 Thread Stephan Budach

On 08.01.11 18:33, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Garrett D'Amore

When you purchase NexentaStor from a top-tier Nexenta Hardware Partner,
you get a product that has been through a rigorous qualification process

How do I do this, exactly?  I am serious.  Before too long, I'm going to
need another server, and I would very seriously consider reprovisioning my
unstable Dell Solaris server to become a linux or some other stable machine.
The role it's currently fulfilling is the backup server, which basically
does nothing except zfs receive from the primary Sun solaris 10u9 file
server.  Since the role is just for backups, it's a perfect opportunity for
experimentation, hence the Dell hardware with solaris.  I'd be happy to put
some other configuration in there experimentally instead ... say ...
nexenta.  Assuming it will be just as good at zfs receive from the primary
server.

Is there some specific hardware configuration you guys sell?  Or recommend?
How about a Dell R510/R610/R710?  Buy the hardware separately and buy
NexentaStor as just a software product?  Or buy a somehow more certified
hardware and software bundle together?

If I do encounter a bug, where the only known fact is that the system keeps
crashing intermittently on an approximately weekly basis, and there is
absolutely no clue what's wrong in hardware or software...  How do you guys
handle it?

If you'd like to follow up offlist, that's fine.  Then just email me at the
email address:  nexenta at nedharvey.com
(I use disposable email addresses on mailing lists like this, so at any
random unknown time, I'll destroy my present alias and start using a new
one.)

Hmm… that'd interest me as well - I have 4 Dell PE R610s that are 
running OSol or Sol11Expr. I actually bought a Sun Fire X4170 M2, since 
I couldn't get my R610s stable, just as Edward points out.


So, if you guys think that NexentaStor avoids these issues, then I'd 
seriously consider jumping ship - so either please don't continue 
offlist, or please include me in that conversation. ;)


Cheers,
budy



[zfs-discuss] pool metadata corrupted - any options?

2011-01-08 Thread David Stein
Running zpool status -x gives the results below.  Do I have any
options besides restoring from tape?

David

$ zpool status -x
  pool: home
 state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-72
 scrub: none requested
config:

NAME         STATE     READ WRITE CKSUM
home         FAULTED      0     0     1  corrupted data
  raidz2     ONLINE       0     0     6
    c0t10d0  ONLINE       0     0     0
    c0t11d0  ONLINE       0     0     0
    c0t12d0  ONLINE       0     0     0
    c0t13d0  ONLINE       0     0     0
    c0t14d0  ONLINE       0     0     0
    c0t15d0  ONLINE       0     0     0
    c0t16d0  ONLINE       0     0     0
    c0t17d0  ONLINE       0     0     0
    c0t18d0  ONLINE       0     0     0
    c0t19d0  ONLINE       0     0     0
    c0t20d0  ONLINE       0     0     0
    c0t21d0  ONLINE       0     0     1
    c0t22d0  ONLINE       0     0     0
    c0t23d0  ONLINE       0     0     0
    c0t2d0   ONLINE       0     0     0
    c0t3d0   ONLINE       0     0     0
    c0t4d0   ONLINE       0     0     0
    c0t5d0   ONLINE       0     0     0
    c0t6d0   ONLINE       0     0     0
    c0t7d0   ONLINE       0     0     0
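
With twenty disks in the listing it is easy to miss which rows carry non-zero
counters.  A minimal sketch for picking them out of `zpool status` text
follows; the names `VdevRow` and `parse_config` are hypothetical, and the
parser assumes plain integer counters as in the output above (rows with
abbreviated counts such as "1.2K" are skipped).  It only reads the report and
does not attempt any repair.

```python
from typing import NamedTuple


class VdevRow(NamedTuple):
    name: str
    state: str
    read: int
    write: int
    cksum: int


def parse_config(status_text: str) -> list[VdevRow]:
    """Extract NAME/STATE/READ/WRITE/CKSUM rows from `zpool status` text."""
    rows = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True          # rows follow the column header
            continue
        # Accept only rows whose three counters are plain integers.
        if in_config and len(fields) >= 5 and all(
            f.isdigit() for f in fields[2:5]
        ):
            rows.append(
                VdevRow(fields[0], fields[1], *map(int, fields[2:5]))
            )
    return rows


SAMPLE = """\
NAME STATE READ WRITE CKSUM
home   FAULTED  0 0 1  corrupted data
  raidz2 ONLINE   0 0 6
c0t10d0  ONLINE   0 0 0
c0t21d0  ONLINE   0 0 1
"""

if __name__ == "__main__":
    for row in parse_config(SAMPLE):
        if row.read or row.write or row.cksum:
            print(row)
```

On the excerpt in `SAMPLE` this reports the pool, the raidz2 vdev, and
c0t21d0, matching the rows with non-zero CKSUM counts in the listing.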


Re: [zfs-discuss] A few questions

2011-01-08 Thread Garrett D'Amore

On 01/ 8/11 10:43 AM, Stephan Budach wrote:

On 08.01.11 18:33, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Garrett D'Amore

When you purchase NexentaStor from a top-tier Nexenta Hardware Partner,
you get a product that has been through a rigorous qualification process

How do I do this, exactly?  I am serious.  Before too long, I'm going to
need another server, and I would very seriously consider reprovisioning my
unstable Dell Solaris server to become a linux or some other stable machine.
The role it's currently fulfilling is the backup server, which basically
does nothing except zfs receive from the primary Sun solaris 10u9 file
server.  Since the role is just for backups, it's a perfect opportunity for
experimentation, hence the Dell hardware with solaris.  I'd be happy to put
some other configuration in there experimentally instead ... say ...
nexenta.  Assuming it will be just as good at zfs receive from the primary
server.

Is there some specific hardware configuration you guys sell?  Or recommend?
How about a Dell R510/R610/R710?  Buy the hardware separately and buy
NexentaStor as just a software product?  Or buy a somehow more certified
hardware and software bundle together?

If I do encounter a bug, where the only known fact is that the system keeps
crashing intermittently on an approximately weekly basis, and there is
absolutely no clue what's wrong in hardware or software...  How do you guys
handle it?


Such problems are handled on a case by case basis.  Usually we can do 
some analysis from a crash dump, but not always.   My team includes 
several people who are experienced with such analysis, and when problems 
like this occur, we are called into action.


Ultimately this usually results in a patch, sometimes workaround 
suggestions, and sometimes even binary relief (which happens faster than 
a regular patch, but without the deeper QA.)


  - Garrett


Re: [zfs-discuss] (Fletcher+Verification) versus (Sha256+No Verification)

2011-01-08 Thread Bob Friesenhahn

On Thu, 6 Jan 2011, David Magda wrote:


If you're not worried about disk read errors (and/or are not experiencing
them), then you shouldn't be worried about hash collisions.


Except for the little problem that if there is a collision then there 
will always be a collision for the same data and it is unavoidable. 
:-)


Bit rot is a different sort of problem.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/