Re: [zfs-discuss] Large scale performance query

2011-07-26 Thread Rocky Shek
Phil,

 

Recently, we built a large configuration on a 4-way Xeon server with 8x 4U
24-bay JBODs. We are using 2x LSI 6160 SAS switches so we can easily expand
the storage in the future.

 

1)  If you are planning to expand your storage, you should consider
using an LSI SAS switch for easy future expansion.

2)  We carefully pick one HD from each JBOD to build each RAIDZ2 vdev, so we
can lose two JBODs at the same time while the data is still accessible (a
minimal layout sketch follows this list). It is good to know you have the same idea.

3)  Seq. read/write is currently limited by the 10G NIC. Local storage can
easily hit 1500MB/s+ with even a small number of HDs. Again, 10G is the bottleneck.


4)  I recommend you use native SAS HDs in a large-scale system if possible.
Native SAS HDs work better.

5)  We are using DSM to locate failed disks and monitor the FRUs of the JBODs:
http://dataonstorage.com/dsm.
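
For illustration only, here is a minimal sketch of that layout with hypothetical
device names, assuming controller cN maps to JBOD N and each RAIDZ2 vdev takes
one disk from every JBOD:

# each raidz2 vdev is 8 disks wide, one disk per JBOD, so any two whole
# JBODs can go offline without losing data (device names are made up)
zpool create tank \
  raidz2 c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0 c7t0d0 c8t0d0 \
  raidz2 c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0 c7t1d0 c8t1d0
# ...and so on for the remaining rows of disks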

 

I hope the above points help.

 

The configuration is similar to configuration 3 at the following link:

http://dataonstorage.com/dataon-solutions/lsi-6gb-sas-switch-sas6160-storage.html

 

Technical Specs:

DNS-4800 4-way Intel Xeon 7550 server with 256G RAM

2x LSI 9200-8E HBA

2x LSI 6160 SAS Switch

8x DNS-1600 4U 24-bay JBODs (dual IO with MPxIO) with 2TB Seagate SAS HDs in RAIDZ2

STEC Zeus RAM for ZIL

Intel 320 SSD for L2ARC   

10G NIC

 

Rocky 

 

From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Phil Harrison
Sent: Sunday, July 24, 2011 11:34 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Large scale performance query

 

Hi All,

 

Hoping to gain some insight from people who have built large-scale systems
before. I'm hoping to get some performance estimates, suggestions and/or
general discussion/feedback. I cannot discuss the exact specifics of the
purpose but will go into as much detail as I can.

 

Technical Specs:

216x 3TB 7k3000 HDDs

24x 9 drive RAIDZ3

4x JBOD Chassis (45 bay)

1x server (36 bay)

2x AMD 12 Core CPU

128GB ECC RAM

2x 480GB SSD Cache

10Gbit NIC

 

Workloads:

 

Mainly streaming compressed data. That is, pulling compressed data in a
sequential manner; however, there could be multiple streams happening at once,
making the access pattern somewhat random. We are hoping to have 5 clients
each pull 500Mbit sustained.
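
For rough sizing, 5 clients x 500Mbit is about 2.5Gbit/s, or roughly 312MB/s
aggregate, so the 10Gbit NIC (about 1250MB/s theoretical) should have headroom
as long as the pool can feed it.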

 

Considerations:

 

The main reason RAIDZ3 was chosen is so we can distribute the parity across
the JBOD enclosures. With this method, even if an entire JBOD enclosure is
taken offline, the data is still accessible.

 

Questions:

 

How do you manage the physical locations of such a vast number of drives? I have
read this
(http://blogs.oracle.com/eschrock/entry/external_storage_enclosures_in_solaris)
and am hoping someone can shed some light on whether the SES2 enclosure
identification has worked for them? (The enclosures are SES2.)

 

What kind of performance would you expect from this setup? I know we can
multiply the base IOPS by 24, but what about max sequential read/write?


Thanks, 

 

Phil

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-26 Thread Casper . Dik


Bullshit. I just got an OCZ Vertex 3, and the first fill was 450-500MB/s.
Second and subsequent fills are at half that speed. I'm quite confident
that it's due to the flash erase cycle that's needed, and if stuff can
be TRIMmed (and thus flash erased as well), speed would be regained.
Overwriting a previously used block requires a flash erase, and if that
can be done in the background when the timing is not critical, instead of
just before you can actually write the block you want, performance will
increase.

I think TRIM is needed both for flash (for speed) and for
thin provisioning; ZFS will dirty all of the volume even though only a 
small part of the volume is used at any particular time.  That makes ZFS 
more or less unusable with thin provisioning; support for TRIM would fix 
that if the underlying volume management supports TRIM.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Andrew Gabriel
Does anyone know if it's OK to do zfs send/receive between zpools with 
different ashift values?


--
Andrew Gabriel
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-26 Thread Fajar A. Nugraha
On Tue, Jul 26, 2011 at 3:28 PM,  casper@oracle.com wrote:


Bullshit. I just got an OCZ Vertex 3, and the first fill was 450-500MB/s.
Second and subsequent fills are at half that speed. I'm quite confident
that it's due to the flash erase cycle that's needed, and if stuff can
be TRIMmed (and thus flash erased as well), speed would be regained.
Overwriting a previously used block requires a flash erase, and if that
can be done in the background when the timing is not critical, instead of
just before you can actually write the block you want, performance will
increase.

 I think TRIM is needed both for flash (for speed) and for
 thin provisioning; ZFS will dirty all of the volume even though only a
 small part of the volume is used at any particular time.  That makes ZFS
 more or less unusable with thin provisioning; support for TRIM would fix
 that if the underlying volume management supports TRIM.

 Casper

Shouldn't modern SSD controllers be smart enough already that they:
- know that if there's a request to overwrite a sector, the old data on
that sector is no longer needed
- allocate a clean sector from the pool of available sectors (part of the
wear-leveling mechanism)
- clear the old sector and add it to the pool (possibly as a background
operation)

It seems to be the case with SandForce-based SSDs. That would pretty
much let the SSD work just fine even without TRIM (like when used
under HW RAID).

-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-26 Thread Casper . Dik


Shouldn't modern SSD controllers be smart enough already that they:
- know that if there's a request to overwrite a sector, the old data on
that sector is no longer needed
- allocate a clean sector from the pool of available sectors (part of the
wear-leveling mechanism)
- clear the old sector and add it to the pool (possibly as a background
operation)

It seems to be the case with SandForce-based SSDs. That would pretty
much let the SSD work just fine even without TRIM (like when used
under HW RAID).


That is possibly not sufficient.  If ZFS writes bytes to every sector,
even though the pool is not full, the controller cannot know which sectors
it can reclaim.  If it uses spare sectors then it can map them to the
new data and add the overwritten sectors to the free pool.

With TRIM, it gets more blocks to reuse and it has more time to
erase them, making the SSD faster.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Darren J Moffat

On 07/26/11 10:14, Andrew Gabriel wrote:

Does anyone know if it's OK to do zfs send/receive between zpools with
different ashift values?


The ZFS send stream is generated at the DMU layer; at this layer the data is
uncompressed and decrypted - i.e. exactly how the application wants it.

The ashift is a vdev-layer concept - i.e. below the DMU layer.

There is nothing in the send stream format that knows what an ashift
actually is.
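
If you want to double-check the ashift of the pools involved, one way (a rough
sketch; the pool name is hypothetical) is to look at the cached config with zdb:

# zdb -C tank | grep ashift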


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Fred Liu

 
 The ZFS send stream is generated at the DMU layer; at this layer the data is
 uncompressed and decrypted - i.e. exactly how the application wants it.
 

Even the data compressed/encrypted by ZFS will be decrypted? If that is true,
will there be any CPU overhead?
And does ZFS send/receive tunneled through ssh become the only way to encrypt the
data transmission?

Thanks.


Fred
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Darren J Moffat

On 07/26/11 11:28, Fred Liu wrote:

The ZFS send stream is generated at the DMU layer; at this layer the data is
uncompressed and decrypted - i.e. exactly how the application wants it.



Even the data compressed/encrypted by ZFS will be decrypted?


Yes, which is exactly what I said.

All data as seen by the DMU is decrypted and decompressed; the DMU layer
is what the ZPL layer is built on top of, so it has to be that way.


 If that is true, will there be any CPU overhead?

There is always some overhead for doing decryption and decompression;
the question is really whether you can detect it and, if you can, does it matter.
If you are running Solaris on processors with built-in support for AES
(e.g. SPARC T2, T3 or Intel with AES-NI) the overhead is reduced
significantly in many cases.


For many people getting the stuff from disk takes more time than doing 
the transform to get back your plaintext.


In some of the testing I did I found that gzip decompression can be more 
significant to a workload than doing the AES decryption.


So basically yes, of course, but does it actually matter?


And does ZFS send/receive tunneled through ssh become the only way to encrypt the
data transmission?


That isn't the only way.


--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Fred Liu
 
 Yes, which is exactly what I said.
 
 All data as seen by the DMU is decrypted and decompressed; the DMU layer
 is what the ZPL layer is built on top of, so it has to be that way.
 

Understand. Thank you. ;-)
 
 There is always some overhead for doing decryption and decompression;
 the question is really whether you can detect it and, if you can, does it matter.
 If you are running Solaris on processors with built-in support for AES
 (e.g. SPARC T2, T3 or Intel with AES-NI) the overhead is reduced
 significantly in many cases.
 
 For many people getting the stuff from disk takes more time than doing
 the transform to get back your plaintext.
 
 In some of the testing I did I found that gzip decompression can be more
 significant to a workload than doing the AES decryption.
 
 So basically yes, of course, but does it actually matter?
 

It depends on how big the delta is. It does matter if the data backup cannot
be finished within the required backup window when people use zfs send/receive
to do mass data backups.
BTW, a somewhat off-topic question -- will the NDMP protocol in Solaris do
decompression and decryption? Thanks.

  And does ZFS send/receive tunneled through ssh become the only way to encrypt
 the data transmission?
 
 That isn't the only way.
 
 
 --

Any alternatives, if you don't mind? ;-)

Thanks.

Fred
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Frank Van Damme
Op 26-07-11 12:56, Fred Liu schreef:
 Any alternatives, if you don't mind? ;-)

VPNs, openssl piped over netcat, a password-protected zip file, ... ;)

ssh would be the most practical, probably.
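
A couple of rough sketches (pool, snapshot and host names are hypothetical, and
exact netcat flags vary between implementations):

# over ssh
zfs send tank/data@snap1 | ssh backuphost zfs receive -F backup/data

# over openssl + netcat, if you want encryption in transit without ssh
# on the receiver:
nc -l 9999 | openssl enc -d -aes-256-cbc -pass file:/secret.key | zfs receive -F backup/data
# on the sender:
zfs send tank/data@snap1 | openssl enc -aes-256-cbc -pass file:/secret.key | nc backuphost 9999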

-- 
No part of this copyright message may be reproduced, read or seen,
dead or alive or by any means, including but not limited to telepathy
without the benevolence of the author.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive and ashift

2011-07-26 Thread Darren J Moffat

On 07/26/11 11:56, Fred Liu wrote:

It depends on how big the delta is. It does matter if the data backup cannot
be finished within the required backup window when people use zfs send/receive
to do mass data backups.


The only way you will know if decrypting and decompressing causes a
problem in that case is to try it on your systems.  I seriously
doubt it will, unless the system is already heavily CPU bound and your
backup window is already very tight.



BTW, a somewhat off-topic question -- will the NDMP protocol in Solaris do
decompression and decryption? Thanks.


My understanding of the NDMP protocol is that it would be a translator
that did that; it isn't part of the core protocol.


The way I would do it is to use a T1C tape drive and have it do the 
compression and encryption of the data.


http://www.oracle.com/us/products/servers-storage/storage/tape-storage/t1c-tape-drive-292151.html

The alternative is to have the node in your NDMP network that writes to the
tape do the compression and encryption of the data stream before putting it
on the tape.



And does ZFS send/receive tunneled through ssh become the only way to encrypt
the data transmission?

That isn't the only way.


--


Any alternatives, if you don't mind? ;-)


For starters, SSL/TLS (which is what the Oracle ZFSSA provides for
replication) or IPsec are possibilities as well; it depends on what risk
you are trying to protect against and what the transport layer is.


But basically it is not provided by ZFS itself; it is up to the person
building the system to secure the transport layer used for ZFS send.


It could also be to write directly to a T10k encrypting tape drive.

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-26 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Fajar A. Nugraha
 
 Shouldn't modern SSD controllers be smart enough already that they:
 - know that if there's a request to overwrite a sector, the old data on
 that sector is no longer needed

In the present state of the world, somebody I know in the trade describes
SSDs as the pimple on the butt of the elephant when it comes to flash
manufacturing.  In other words, mobile devices account for a huge majority
(something like 90%) of flash produced in the world, SSDs are something
like 4%, and for some reason (I don't know why) there's a benefit to
optimizing for 8k pages.  Which means no.  If you overwrite a sector of an
SSD, that does not mean you can erase the page, because you can only erase
the whole page, and the disk can only interact with the OS using 4k blocks
or smaller.  So there's a minimum of 2 logical blocks per page in the SSD.
When you trim a block, only half of the page gets marked as free.
Eventually the controller needs to read the used half of page A and the used
half of page B, write them both to a blank page C, and then erase pages A
and B.


 - allocate a clean sector from the pool of available sectors (part of the
 wear-leveling mechanism)
 - clear the old sector and add it to the pool (possibly as a background
 operation)

The complexity here is much larger...  In all the storage pages in the SSD,
some are marked used, some are marked unused, some are erased, and some are
not erased.  You can only write to a sector if it's both unused and erased.
Each sector takes half a page.  You can write to an individual sector, but
you cannot erase an individual sector.

At the OS interface, only sectors are logically addressed, but internally
the controller must map those to physical halves of pages.  So the
controller maintains a completely arbitrary lookup table, so any logical
sector can map to any physical half-page.  When the OS requests to overwrite
some sector, the controller will actually write to some formerly unused
sector, remap it, and mark the old one as unused.  Later, in the background,
if the other half of the page is also unused, the page will be erased.

Does that clarify anything?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Adding mirrors to an existing zfs-pool

2011-07-26 Thread Bernd W. Hennig
G'Day,

- zfs pool with 4 disks (from Clariion A)
- must migrate to Clariion B (so I created 4 disks with the same size,
  available for zfs)

The zfs pool has no mirrors; my idea was to attach the 4 new disks from
Clariion B to the 4 disks which are still in the pool - and later
remove the original 4 disks.

In all the examples I only found how to create a new pool with mirrors,
but no example of how to add a mirror disk for each disk to a pool that
has no mirrors.

- is it possible to attach a disk to each disk in the pool (they have different
  sizes, so I have to attach exactly the right disk from Clariion B to the
  matching original disk from Clariion A)
- can I later remove the disks from Clariion A while the pool stays intact and
  users can keep working with the pool?


??

Sorry for the beginner questions

Tnx for help
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Adding mirrors to an existing zfs-pool

2011-07-26 Thread Andrew Gabriel

Bernd W. Hennig wrote:

G'Day,

- zfs pool with 4 disks (from Clariion A)
- must migrate to Clariion B (so I created 4 disks with the same size,
  available for zfs)

The zfs pool has no mirrors; my idea was to attach the 4 new disks from
Clariion B to the 4 disks which are still in the pool - and later
remove the original 4 disks.

In all the examples I only found how to create a new pool with mirrors,
but no example of how to add a mirror disk for each disk to a pool that
has no mirrors.

- is it possible to attach a disk to each disk in the pool (they have different
  sizes, so I have to attach exactly the right disk from Clariion B to the
  matching original disk from Clariion A)

- can I later remove the disks from Clariion A while the pool stays intact and
  users can keep working with the pool?
  


Depends on a few things...

What OS are you running, and what release/update or build?

What's the RAID layout of your pool ('zpool status' output)?

--
Andrew Gabriel
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-26 Thread David Dyer-Bennet

On Mon, July 25, 2011 10:03, Orvar Korvar wrote:
 There is at least a common perception (misperception?) that devices
 cannot process TRIM requests while they are 100% busy processing other
 tasks.

 Just to confirm; SSD disks can do TRIM while processing other tasks?

Processing the request just means flagging the blocks, though, right?
And the actual benefits only accrue if the garbage collection / block
reshuffling background tasks get a chance to run?

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Ian D
Hi all-
We've been experiencing a very strange problem for two days now.  

We have three clients (Linux boxes) connected to a ZFS box (Nexenta) via iSCSI.
Every few seconds (seemingly at random), iostat shows the clients go from a normal
80K+ IOPS to zero.  It lasts up to a few seconds and then things are fine again.
When that happens, I/O on the local disks stops too, even the totally
unrelated ones. How can that be?  All three clients show the same pattern and
everything was fine prior to Sunday.  Nothing has changed on either the
clients or the server. The ZFS box is not even close to being saturated, nor is
the network.

We don't even know where to start... any advice?
Ian
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Adding mirrors to an existing zfs-pool

2011-07-26 Thread Cindy Swearingen


Subject: Re: [zfs-discuss] Adding mirrors to an existing zfs-pool
Date: Tue, 26 Jul 2011 08:54:38 -0600
From: Cindy Swearingen cindy.swearin...@oracle.com
To: Bernd W. Hennig consult...@hennig-consulting.com
References: 342994905.11311662049567.JavaMail.Twebapp@sf-app1

Hi Bernd,

If you are talking about attaching 4 new disks to a non-redundant pool
with 4 disks, and then you want to detach the previous disks, then yes,
this is possible and a good way to migrate to new disks.

The new disks must be equivalent in size to, or larger than, the original
disks.

See the hypothetical example below.

If you mean something else, then please provide your zpool status
output.

Thanks,

Cindy


# zpool status tank
 pool: tank
 state: ONLINE
 scan: resilvered 1018K in 0h0m with 0 errors on Fri Jul 22 15:54:52 2011
config:

NAME      STATE     READ WRITE CKSUM
tank      ONLINE       0     0     0
  c4t1d0  ONLINE       0     0     0
  c4t2d0  ONLINE       0     0     0
  c4t3d0  ONLINE       0     0     0
  c4t4d0  ONLINE       0     0     0


# zpool attach tank c4t1d0 c6t1d0
# zpool attach tank c4t2d0 c6t2d0
# zpool attach tank c4t3d0 c6t3d0
# zpool attach tank c4t4d0 c6t4d0

The above syntax will create 4 mirrored pairs of disks.

Attach each new disk, wait for it to resilver, attach the next disk,
resilver, and so on. I would scrub the pool after resilvering is
complete, and check fmdump to ensure all new devices are operational.
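
Something along these lines to check progress and health as you go (a sketch,
using the pool name from the example above):

# zpool status tank   (shows resilver progress and completion)
# zpool scrub tank
# fmdump -e           (lists recent device error events, if any)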

When all the disks are replaced and the pool is operational, detach
the original disks.

# zpool detach tank c4t1d0
# zpool detach tank c4t2d0
# zpool detach tank c4t3d0
# zpool detach tank c4t4d0


On 07/26/11 00:33, Bernd W. Hennig wrote:

G'Day,

- zfs pool with 4 disks (from Clariion A)
- must migrate to Clariion B (so I created 4 disks with the same size,
  available for zfs)

The zfs pool has no mirrors; my idea was to attach the 4 new disks from
Clariion B to the 4 disks which are still in the pool - and later
remove the original 4 disks.

In all the examples I only found how to create a new pool with mirrors,
but no example of how to add a mirror disk for each disk to a pool that
has no mirrors.

- is it possible to attach a disk to each disk in the pool (they have different
  sizes, so I have to attach exactly the right disk from Clariion B to the
  matching original disk from Clariion A)

- can I later remove the disks from Clariion A while the pool stays intact and
  users can keep working with the pool?



??

Sorry for the beginner questions

Tnx for help


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Ian D
To add to that... iostat on the client boxes shows the connection always at
around 98% util, topping out at 100% whenever it hangs.   The same clients are
connected to another ZFS server with much lower specs and a smaller number of
slower disks; it performs much better and rarely gets past 5% util.  They share
the same network.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Rocky Shek
Ian,

Did you enable DeDup? 

Rocky 


-Original Message-
From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ian D
Sent: Tuesday, July 26, 2011 7:52 AM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Entire client hangs every few seconds

Hi all-
We've been experiencing a very strange problem for two days now.  

We have three clients (Linux boxes) connected to a ZFS box (Nexenta) via
iSCSI.  Every few seconds (seemingly at random), iostat shows the clients go
from a normal 80K+ IOPS to zero.  It lasts up to a few seconds and then things
are fine again.  When that happens, I/O on the local disks stops too, even the
totally unrelated ones. How can that be?  All three clients show the same
pattern and everything was fine prior to Sunday.  Nothing has changed on
either the clients or the server. The ZFS box is not even close to being
saturated, nor is the network.

We don't even know where to start... any advice?
Ian
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Adding mirrors to an existing zfs-pool

2011-07-26 Thread hung-sheng tsao
Hi,
It is better to just create a new pool on array B (Clariion B),
then use cpio to copy the data.

On 7/26/11, Bernd W. Hennig consult...@hennig-consulting.com wrote:
 G'Day,

 - zfs pool with 4 disks (from Clariion A)
 - must migrate to Clariion B (so I created 4 disks with the same size,
   available for zfs)

 The zfs pool has no mirrors; my idea was to attach the 4 new disks from
 Clariion B to the 4 disks which are still in the pool - and later
 remove the original 4 disks.

 In all the examples I only found how to create a new pool with mirrors,
 but no example of how to add a mirror disk for each disk to a pool that
 has no mirrors.

 - is it possible to attach a disk to each disk in the pool (they have different
   sizes, so I have to attach exactly the right disk from Clariion B to the
   matching original disk from Clariion A)
 - can I later remove the disks from Clariion A while the pool stays intact and
   users can keep working with the pool?


 ??

 Sorry for the beginner questions

 Tnx for help
 --
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


-- 
Sent from my mobile device

Hung-Sheng Tsao, Ph.D. laot...@gmail.com
laot...@gmail.com
http://laotsao.wordpress.com
cell:9734950840
gvoice:8623970640
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Ian D
No dedup.  

The hiccups started around 2am on Sunday while (obviously) nobody was
interacting with either the clients or the server.  It's been running for
months (as is) without any problem.

My guess is that it's a defective hard drive that instead of totally failing, 
just stutters.  Or maybe it's the cache.  We disabled the SLOG with no effect, 
but we haven't tried with the L2ARC.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] recover zpool with a new installation

2011-07-26 Thread Roberto Scudeller
Hi all,

I lost my storage because rpool doesn't boot. I tried to recover it, but
OpenSolaris says to destroy and re-create it.
My rpool is installed on a flash drive, and my pool (with my data) is on
other disks.

My question is: is it possible to reinstall OpenSolaris on a new flash drive,
without touching my pool of disks, and then recover this pool?

Thanks.

Regards,
-- 
Roberto Scudeller
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-26 Thread Brandon High
On Tue, Jul 26, 2011 at 7:51 AM, David Dyer-Bennet d...@dd-b.net wrote:

 Processing the request just means flagging the blocks, though, right?
 And the actual benefits only acrue if the garbage collection / block
 reshuffling background tasks get a chance to run?


I think that's right. TRIM just gives hints to the garbage collector that
sectors are no longer in use. When the GC runs, it can more easily find flash
blocks that aren't used, or combine several mostly-empty blocks, and erase or
otherwise free them for reuse later.

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS resilvering loop from hell

2011-07-26 Thread Charles Stephens
I'm on S11E 150.0.1.9. I replaced one of the drives and the pool seems to be
stuck in a resilvering loop.  I performed a 'zpool clear' and 'zpool scrub', and
it just complains that the drives I didn't replace are degraded because of too
many errors.  Oddly, the replaced drive is reported as being fine.  The CKSUM
counts get up to about 108 or so by the time the resilver completes.

I'm now trying to evacuate the pool onto another pool; however, the zfs
send/receive dies about 380GB into sending the first dataset.

Here is some output.  Any help or insights will be helpful.  Thanks

cfs

  pool: dpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scan: resilver in progress since Tue Jul 26 15:03:32 2011
63.4G scanned out of 5.02T at 6.81M/s, 212h12m to go
15.1G resilvered, 1.23% done
config:

NAME          STATE     READ WRITE CKSUM
dpool         DEGRADED     0     0     6
  raidz1-0    DEGRADED     0     0    12
    c9t0d0    DEGRADED     0     0     0  too many errors
    c9t1d0    DEGRADED     0     0     0  too many errors
    c9t3d0    DEGRADED     0     0     0  too many errors
    c9t2d0    ONLINE       0     0     0  (resilvering)

errors: Permanent errors have been detected in the following files:

metadata:0x0
[redacted list of 20 files, mostly in the same directory]


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Garrett D'Amore
This is actually a recently known problem, and a fix for it is in the
3.1 version, which should be available any minute now, if it isn't
already available.

The problem has to do with some allocations which are sleeping, and jobs
in the ZFS subsystem get backed behind some other work.

If you have adequate system memory, you are less likely to see this
problem, I think.

 - Garrett


On Tue, 2011-07-26 at 08:29 -0700, Rocky Shek wrote:
 Ian,
 
 Did you enable DeDup? 
 
 Rocky 
 
 
 -Original Message-
 From: zfs-discuss-boun...@opensolaris.org
 [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ian D
 Sent: Tuesday, July 26, 2011 7:52 AM
 To: zfs-discuss@opensolaris.org
 Subject: [zfs-discuss] Entire client hangs every few seconds
 
 Hi all-
 We've been experiencing a very strange problem for two days now.  
 
 We have three clients (Linux boxes) connected to a ZFS box (Nexenta) via
 iSCSI.  Every few seconds (seemingly at random), iostat shows the clients go
 from a normal 80K+ IOPS to zero.  It lasts up to a few seconds and then things
 are fine again.  When that happens, I/O on the local disks stops too, even the
 totally unrelated ones. How can that be?  All three clients show the same
 pattern and everything was fine prior to Sunday.  Nothing has changed on
 either the clients or the server. The ZFS box is not even close to being
 saturated, nor is the network.

 We don't even know where to start... any advice?
 Ian


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Ian D
Hi Garrett-
Is it something that could happen at any time on a system that has been working
fine for a while?  That system has 256G of RAM, so I think adequate memory is
not a concern here :)

We'll try 3.1 as soon as we can download it.
Ian
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recover zpool with a new installation

2011-07-26 Thread Cindy Swearingen

Hi Roberto,

Yes, you can reinstall the OS on another disk and as long as the
OS install doesn't touch the other pool's disks, your
previous non-root pool should be intact. After the install
is complete, just import the pool.

Thanks,

Cindy



On 07/26/11 10:49, Roberto Scudeller wrote:

Hi all,

I lost my storage because rpool doesn't boot. I tried to recover it, but
OpenSolaris says to destroy and re-create it.
My rpool is installed on a flash drive, and my pool (with my data) is on
other disks.


My question is: is it possible to reinstall OpenSolaris on a new flash
drive, without touching my pool of disks, and then recover this pool?


Thanks.

Regards,
--
Roberto Scudeller





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Adding mirrors to an existing zfs-pool

2011-07-26 Thread Fajar A. Nugraha
On Tue, Jul 26, 2011 at 1:33 PM, Bernd W. Hennig
consult...@hennig-consulting.com wrote:
 G'Day,

 - zfs pool with 4 disks (from Clariion A)
 - must migrate to Clariion B (so I created 4 disks with the same size,
  available for zfs)

 The zfs pool has no mirrors; my idea was to attach the 4 new disks from
 Clariion B to the 4 disks which are still in the pool - and later
 remove the original 4 disks.

 In all the examples I only found how to create a new pool with mirrors,
 but no example of how to add a mirror disk for each disk to a pool that
 has no mirrors.

 - is it possible to attach a disk to each disk in the pool (they have different
  sizes, so I have to attach exactly the right disk from Clariion B to the
  matching original disk from Clariion A)

from man zpool

    zpool attach [-f] pool device new_device

        Attaches new_device to an existing zpool device. The existing device
        cannot be part of a raidz configuration. If device is not currently
        part of a mirrored configuration, device automatically transforms
        into a two-way mirror of device and new_device. If device is part of
        a two-way mirror, attaching new_device creates a three-way mirror,
        and so on. In either case, new_device begins to resilver immediately.

        -f    Forces use of new_device, even if it appears to be in use.
              Not all devices can be overridden in this manner.

 - can I later remove the disks from the Clariion A, pool is intact, user
  can work with the pool

    zpool detach pool device

        Detaches device from a mirror. The operation is refused if there are
        no other valid replicas of the data.



If you're using raidz, you can't use zpool attach. Your best bet in
this case is zpool replace.

    zpool replace [-f] pool old_device [new_device]

        Replaces old_device with new_device. This is equivalent to attaching
        new_device, waiting for it to resilver, and then detaching
        old_device.

        The size of new_device must be greater than or equal to the minimum
        size of all the devices in a mirror or raidz configuration.

        new_device is required if the pool is not redundant. If new_device
        is not specified, it defaults to old_device. This form of replacement
        is useful after an existing disk has failed and has been physically
        replaced. In this case, the new disk may have the same /dev path as
        the old device, even though it is actually a different disk. ZFS
        recognizes this.

        -f    Forces use of new_device, even if it appears to be in use.
              Not all devices can be overridden in this manner.
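
In practice that is one replace per disk, waiting for each resilver to finish
before starting the next (a sketch with hypothetical device names):

# zpool replace tank c4t1d0 c6t1d0
# zpool replace tank c4t2d0 c6t2d0
# ...and so on for the remaining disks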


-- 
Fajar
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Entire client hangs every few seconds

2011-07-26 Thread Gordon Ross
Are the disk active lights typically ON when this happens?

On Tue, Jul 26, 2011 at 3:27 PM, Garrett D'Amore garr...@damore.org wrote:
 This is actually a recently known problem, and a fix for it is in the
 3.1 version, which should be available any minute now, if it isn't
 already available.

 The problem has to do with some allocations which are sleeping, and jobs
 in the ZFS subsystem get backed behind some other work.

 If you have adequate system memory, you are less likely to see this
 problem, I think.

         - Garrett


 On Tue, 2011-07-26 at 08:29 -0700, Rocky Shek wrote:
 Ian,

 Did you enable DeDup?

 Rocky


 -Original Message-
 From: zfs-discuss-boun...@opensolaris.org
 [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ian D
 Sent: Tuesday, July 26, 2011 7:52 AM
 To: zfs-discuss@opensolaris.org
 Subject: [zfs-discuss] Entire client hangs every few seconds

 Hi all-
 We've been experiencing a very strange problem for two days now.

 We have three clients (Linux boxes) connected to a ZFS box (Nexenta) via
 iSCSI.  Every few seconds (seemingly at random), iostat shows the clients go
 from a normal 80K+ IOPS to zero.  It lasts up to a few seconds and then things
 are fine again.  When that happens, I/O on the local disks stops too, even the
 totally unrelated ones. How can that be?  All three clients show the same
 pattern and everything was fine prior to Sunday.  Nothing has changed on
 either the clients or the server. The ZFS box is not even close to being
 saturated, nor is the network.

 We don't even know where to start... any advice?
 Ian


 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] recover zpool with a new installation

2011-07-26 Thread Brandon High
On Tue, Jul 26, 2011 at 1:14 PM, Cindy Swearingen 
cindy.swearin...@oracle.com wrote:

 Yes, you can reinstall the OS on another disk and as long as the
 OS install doesn't touch the other pool's disks, your
 previous non-root pool should be intact. After the install
 is complete, just import the pool.


You can also use the Live CD or Live USB to access your pool or possibly fix
your existing installation.

You will have to force the zpool import with either a reinstall or a Live
boot.
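
Something along these lines (pool name hypothetical):

# zpool import          (lists pools found on the attached disks)
# zpool import -f tank  (force-imports a pool last used by the old install)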

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD vs hybrid drive - any advice?

2011-07-26 Thread Peter Jeremy
On 2011-Jul-26 17:24:05 +0800, Fajar A. Nugraha w...@fajar.net wrote:
Shouldn't modern SSD controllers be smart enough already that they:
- know that if there's a request to overwrite a sector, the old data on
that sector is no longer needed

ZFS never does update-in-place, and UFS only does update-in-place for
metadata and where the application forces update-in-place.  This means
there will generally (always, for ZFS) be a delay between when a
filesystem frees a sector (is no longer interested in its contents) and
when it overwrites that sector.  Without TRIM support, an overwrite is the
only signal an SSD gets that the old contents of a sector are not needed.
Which, in turn, means there is a pool of sectors that the FS knows are
unused but the SSD doesn't - and is therefore forced to preserve.

Since an overwrite almost never matches the erase page, this increases
wear on the SSD because it is forced to rewrite unwanted data in order
to free up pages for erasure to support external write requests.  It
also reduces performance for several reasons:
- The SSD has to unnecessarily copy data - which takes time.
- The space recovered by each erasure is effectively reduced by the
  amount of rewritten data so more time-consuming erasures are needed
  for a given external write load.
- The pools of "unused but not erased" and "erased (available)"
  sectors are smaller, increasing the probability that an external
  write will require a synchronous erase cycle to complete.

- allocate a clean sector from the pool of available sectors (part of the
wear-leveling mechanism)

As above, in the absence of TRIM, the pool will be smaller (and more
likely to be empty).

- clear the old sector and add it to the pool (possibly as a background
operation)

Otherwise a sector could never be rewritten.

It seems to be the case with SandForce-based SSDs. That would pretty
much let the SSD work just fine even without TRIM (like when used
under HW RAID).

Better SSDs mitigate the problem by having more "hidden" space
(keeping the available pool larger to reduce the probability of a
synchronous erase being needed) and higher performance (masking the
impact of the additional internal writes and erasures).

If TRIM support were available then the performance would still
improve.  This means you either get better system performance from
the same SSD, or you can get the same system performance from a
lower-performance (cheaper) SSD.

-- 
Peter Jeremy


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss