Hello Richard,
Wednesday, October 15, 2008, 6:39:49 PM, you wrote:
RE Archie Cowan wrote:
I just stumbled upon this thread somehow and thought I'd share my zfs over
iscsi experience.
We recently abandoned a similar configuration with several pairs of x4500s
exporting zvols as iscsi
On Thu, Oct 16, 2008 at 03:50:19PM +0800, Gray Carper wrote:
Sidenote: Today we made eight network/iSCSI related tweaks that, in
aggregate, have resulted in dramatic performance improvements (some I
just hadn't gotten around to yet, others suggested by Sun's Mertol
Ozyoney)...
Gray,
- disabling the Nagle algorithm on the head node
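On Solaris, the Nagle tweak above can be applied globally with ndd — a sketch; the setting does not persist across reboots, and the usual caveats about system-wide TCP tuning apply:

```shell
# Disable the Nagle algorithm for all TCP connections on the head node.
# tcp_naglim_def is the byte threshold below which small writes are
# coalesced; setting it to 1 effectively turns Nagle off.
# Not persistent: re-apply at boot (e.g. from an init script).
ndd -set /dev/tcp tcp_naglim_def 1
# Verify:
ndd -get /dev/tcp tcp_naglim_def
```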
Hey, Jim! Thanks so much for the excellent assist on this - much better than
I could have ever answered it!
I thought I'd add a little bit on the other four...
- raising ddi_msix_alloc_limit to 8
For PCI cards that use up to 8 interrupts, which our 10GbE adapters do. The
previous value of 2
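The ddi_msix_alloc_limit change is an /etc/system tunable; a sketch (requires a reboot to take effect):

```shell
# Raise the per-device MSI-X interrupt allocation limit from the default
# so the 10GbE NICs can claim all 8 of their interrupts.
# Append to /etc/system and reboot:
echo 'set ddi_msix_alloc_limit=8' >> /etc/system
```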
Some of that is very worrying Miles, do you have bug ID's for any of those
problems?
I'm guessing the problem of the device being reported ok after the reboot could
be this one:
http://bugs.opensolaris.org/view_bug.do?bug_id=6582549
And could the errors after the reboot be one of these?
Well obviously recovery scenarios need testing, but I still don't see it being
that bad. My thinking on this is:
1. Loss of a server is very much the worst case scenario. Disk errors are
much more likely, and with raid-z2 pools on the individual servers this should
not pose a problem. I
Howdy!
Very valuable advice here (and from Bob, who made similar comments - thanks,
Bob!). I think, then, we'll generally stick to 128K recordsizes. In the case
of databases, we'll stray as appropriate, and we may also stray with the HPC
compute cluster if we can demonstrate that it is worth
Miles makes a good point here, you really need to look at how this copes with
various failure modes.
Based on my experience, iSCSI is something that may cause you problems. When I
tested this kind of setup last year I found that the entire pool hung for 3
minutes any time an iSCSI volume went
Oops - one thing I meant to mention: We only plan to cross-site replicate
data for those folks who require it. The HPC data crunching would have no
use for it, so that filesystem wouldn't be replicated. In reality, we only
expect a select few users, with relatively small filesystems, to actually
r == Ross [EMAIL PROTECTED] writes:
r 1. Loss of a server is very much the worst case scenario.
r Disk errors are much more likely, and with raid-z2 pools on
r the individual servers
yeah, it kind of sucks that the slow resilvering speed enforces this
two-tier scheme.
Also if
[EMAIL PROTECTED] said:
It's interesting how the speed and optimisation of these maintenance
activities limit pool size. It's not just full scrubs. If the filesystem is
subject to corruption, you need a backup. If the filesystem takes two months
to back up / restore, then you need really
pNFS is NFS-centric of course, and it is not yet stable, is it? BTW,
what is the ETA for the pNFS putback?
On Thu, Oct 16, 2008 at 12:20:36PM -0700, Marion Hakanson wrote:
I'll chime in here with feeling uncomfortable with such a huge ZFS pool,
and also with my discomfort of the ZFS-over-ISCSI-on-ZFS approach. There
just seem to be too many moving parts depending on each other, any one of
which
[EMAIL PROTECTED] said:
In general, such tasks would be better served by T5220 (or the new T5440 :-)
and J4500s. This would change the data paths from:
client --net-- T5220 --net-- X4500 --SATA-- disks
to:
client --net-- T5440 --SAS-- disks
With the J4500 you get the same storage
nw == Nicolas Williams [EMAIL PROTECTED] writes:
nw But does it work well enough? It may be faster than NFS if
You're talking about different things. Gray is using NFS, period,
between the storage cluster and the compute cluster; no iSCSI.
Gray's (``does it work well enough''): iSCSI
nw == Nicolas Williams [EMAIL PROTECTED] writes:
mh == Marion Hakanson [EMAIL PROTECTED] writes:
nw I was replying to Marion's [...]
nw ZFS-over-iSCSI could certainly perform better than NFS,
better than what, ZFS-over-'mkfile'-files-on-NFS? No one was
suggesting that. Do you mean
On Oct 16, 2008, at 15:20, Marion Hakanson wrote:
For the stated usage of the original poster, I think I would aim toward
turning each of the Thumpers into an NFS server, configure the head-node
as a pNFS/NFSv4.1
It's a shame that Lustre isn't available on Solaris yet either.
[EMAIL PROTECTED] said:
but Marion's is not really possible at all, and won't be for a while with
other groups' choice of storage-consumer platform, so it'd have to be
GlusterFS or some other goofy fringe FUSEy thing or not-very-general crude
in-house hack.
Well, of course the magnitude of
On Wed, 15 Oct 2008, Gray Carper wrote:
be good to set different recordsize parameters for each one. Do you have any
suggestions on good starting sizes for each? I'd imagine filesharing might
benefit from a relatively small record size (64K?), image-based backup
targets might like a pretty
I just stumbled upon this thread somehow and thought I'd share my zfs over
iscsi experience.
We recently abandoned a similar configuration with several pairs of x4500s
exporting zvols as iscsi targets and mirroring them for high availability
with T5220s.
Initially, our performance was also
Howdy, Brent!
Thanks for your interest! We're pretty enthused about this project over here
and I'd be happy to share some details with you (and anyone else who cares
to peek). In this post I'll try to hit the major configuration
bullet-points, but I can also throw you command-line level specifics
Hi Gray,
You've got a nice setup going there, few comments:
1. Do not tune ZFS without a proven test-case to show otherwise, except...
2. For databases. Tune recordsize for that particular FS to match DB recordsize.
Few questions...
* How are you divvying up the space ?
* How are you taking
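Point 2 can be sketched as follows (the pool/filesystem name and the 8K figure are illustrative; match the value to your database's actual block size, and note that recordsize only affects newly written files):

```shell
# Set a small recordsize on the database filesystem only, leaving the
# rest of the pool at the 128K default.
zfs set recordsize=8K tank/db
zfs get recordsize tank/db
```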
Am I right in thinking your top level zpool is a raid-z pool consisting of six
28TB iSCSI volumes? If so that's a very nice setup, it's what we'd be doing if
we had that kind of cash :-)
gc == Gray Carper [EMAIL PROTECTED] writes:
gc 5. The NAS head node has wrangled up all six of the iSCSI
gc targets
are you using raidz on the head node? It sounds like simple striping,
which is probably dangerous with the current code. This kind of sucks
because with simple striping
r == Ross [EMAIL PROTECTED] writes:
r Am I right in thinking your top level zpool is a raid-z pool
r consisting of six 28TB iSCSI volumes? If so that's a very
r nice setup,
not if it scrubs at 400GB/day, and 'zfs send' is uselessly slow. Also
I am thinking the J4500 Richard
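The scrub-rate objection is easy to quantify; a back-of-the-envelope sketch using the ~150TB pool size and the 400GB/day figure from this thread (real scrub rates vary with pool activity):

```shell
# Days to scrub ~150TB at ~400GB/day.
pool_gb=$((150 * 1024))      # pool size in GB
echo $((pool_gb / 400))      # prints 384
```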
On Wed, 15 Oct 2008, Marcelo Leal wrote:
Are you talking about what he had in the logic of the configuration at top
level, or are you saying his top level pool is a raidz?
I would think his top level zpool is a raid0...
ZFS does not support RAID0 (simple striping).
Bob
On 15 October, 2008 - Bob Friesenhahn sent me these 0,6K bytes:
On Wed, 15 Oct 2008, Marcelo Leal wrote:
Are you talking about what he had in the logic of the configuration at top
level, or are you saying his top level pool is a raidz?
I would think his top level zpool is a raid0...
So, there is no raid10 in a solaris/zfs setup?
I'm talking about no redundancy...
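On the raid10 question: ZFS has no vdev type called RAID0 or RAID10, but it dynamically stripes across all top-level vdevs, so a pool built from mirror vdevs behaves like RAID 1+0. A sketch (device names are hypothetical):

```shell
# Two-way mirrors, striped across: the ZFS analogue of RAID 1+0.
zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0
# By contrast, a pool created from bare disks ("zpool create tank d1 d2 d3")
# gives the no-redundancy load-share layout discussed in this thread.
zpool status tank
```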
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
On Wed, 15 Oct 2008, Tomas Ögren wrote:
ZFS does not support RAID0 (simple striping).
zpool create mypool disk1 disk2 disk3
Sure it does.
This is load-share, not RAID0. Also, to answer the other fellow,
since ZFS does not support RAID0, it also does not support RAID 1+0
(10). :-)
With
Hey, all!
We've recently used six x4500 Thumpers, all publishing ~28TB iSCSI targets over
IP-multipathed 10Gb Ethernet, to build a ~150TB ZFS pool on an x4200 head node.
In trying to discover optimal ZFS pool construction settings, we've run a
number of iozone tests, so I thought I'd share
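For reference, an iozone run along these lines might look like the following — a sketch; the sizes and thread count are illustrative, and per-process file sizes should exceed RAM to defeat caching:

```shell
# Throughput mode: 4 concurrent processes, 8GB file each, 128K records,
# running the write/rewrite (-i 0) and read/reread (-i 1) tests.
iozone -i 0 -i 1 -r 128k -s 8g -t 4
```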
Howdy!
Sounds good. We'll upgrade to 1.1 (b101) as soon as it is released, re-run
our battery of tests, and see where we stand.
Thanks!
-Gray
On Tue, Oct 14, 2008 at 8:47 PM, James C. McPherson [EMAIL PROTECTED]
wrote:
Gray Carper wrote:
Hello again! (And hellos to Erast, who has been a
Hey there, James!
We're actually running NexentaStor v1.0.8, which is based on b85. We haven't
done any tuning ourselves, but I suppose it is possible that Nexenta did. If
there's something specific you have in mind, I'd be happy to look for it.
Thanks!
-Gray
On Tue, Oct 14, 2008 at 8:10 PM,
Gray Carper wrote:
Hey there, James!
We're actually running NexentaStor v1.0.8, which is based on b85. We
haven't done any tuning ourselves, but I suppose it is possible that
Nexenta did. If there's something specific you'd like me to look for,
I'd be happy to.
Hi Gray,
So build 85
Just a random spectator here, but I think the artifacts you're seeing here are not
due to file size, but rather due to record size.
What is the ZFS record size ?
On a personal note, I wouldn't do non-concurrent (?) benchmarks. They are at
best useless and at worst misleading for ZFS
- Akhilesh.
On Tue, 14 Oct 2008, Gray Carper wrote:
So, how concerned should we be about the low scores here and there?
Any suggestions on how to improve our configuration? And how excited
should we be about the 8GB tests? ;
The level of concern should depend on how you expect your storage pool
to
James, all serious ZFS bug fixes were back-ported to b85, as well as the marvell
and other sata drivers. Not everything is possible to back-port, of
course, but I would say all critical things are there. This includes ZFS
ARC optimization patches, for example.
On Tue, 2008-10-14 at 22:33 +1000, James C.
Hello again! (And hellos to Erast, who has been a huge help to me many, many
times! :)
As I understand it, Nexenta 1.1 should be released in a matter of weeks and
it'll be based on build 101. We are waiting for that with bated breath,
since it includes some very important Active Directory
Erast Benson wrote:
James, all serious ZFS bug fixes were back-ported to b85, as well as the marvell
and other sata drivers. Not everything is possible to back-port, of
course, but I would say all critical things are there. This includes ZFS
ARC optimization patches, for example.
Excellent!
James
--
Hey there, Bob!
Looks like you and Akhilesh (thanks, Akhilesh!) are driving at a similar,
very valid point. I'm currently using the default recordsize (128K) on all
of the ZFS pools (those of the iSCSI target nodes and the aggregate pool on
the head node).
I should've mentioned something about